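I use CloudStack a lot at work. Here are some troubleshooting notes I have collected along the way; I hope they are useful to others.
I. Adding a host fails
Symptom 1: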
[root@mgmt ~]# tail -f /var/log/cloudstack/management/management-server.log
2014-02-28 11:05:32,172 DEBUG [kvm.discoverer.LibvirtServerDiscoverer] (catalina-exec-22:null) Timeout, to wait for the host connecting to mgt svr, assuming it is failed
2014-02-28 11:05:32,205 WARN [cloud.resource.ResourceManagerImpl] (catalina-exec-22:null) Unable to find the server resources at http://192.168.150.250
2014-02-28 11:05:32,220 INFO [utils.exception.CSExceptionErrorCode] (catalina-exec-22:null) Could not find exception: com.cloud.exception.DiscoveryException in error code list for exceptions
2014-02-28 11:05:32,220 WARN [admin.host.AddHostCmd] (catalina-exec-22:null) Exception:
com.cloud.exception.DiscoveryException: Unable to add the host
at com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:798)
at com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:590)
at org.apache.cloudstack.api.command.admin.host.AddHostCmd.execute(AddHostCmd.java:143)
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
at com.cloud.api.ApiServer.queueCommand(ApiServer.java:514)
at com.cloud.api.ApiServer.handleRequest(ApiServer.java:372)
at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:305)
at com.cloud.api.ApiServlet.doPost(ApiServlet.java:71)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:721)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2268)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
2014-02-28 11:05:32,222 INFO [cloud.api.ApiServer] (catalina-exec-22:null) Unable to add the host
2014-02-28 11:05:32,224 DEBUG [cloud.api.ApiServlet] (catalina-exec-22:null) ===END=== 192.168.151.234 -- POST command=addHost&response=json&sessionkey=GEI3EIOONoV5RG9Mcs4xcdx31oc%3D
Symptom 2:
[root@kvm01 agent]# /etc/init.d/cloudstack-agent status    ## check the status of the cloudstack-agent service on the KVM host
cloudstack-agent dead but subsys locked
Symptom 3:
[root@kvm01 agent]# cat /var/log/cloudstack/agent/agent.log    ## look for exceptions in agent.log on the KVM host
ERROR [cloud.agent.AgentShell] (main:null) Unable to start agent: NO HVM support on this machine, please make sure: 1. VT/SVM is supported by your CPU, or is enabled in BIOS. 2. kvm modules are loaded (kvm, kvm_amd|kvm_intel)
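The first point in the agent's error message (VT/SVM support) can be checked by looking for the `vmx` (Intel) or `svm` (AMD) flag in /proc/cpuinfo. The following is a minimal sketch of that check; the function name `hvm_flags` is my own and is not part of CloudStack.

```python
def hvm_flags(cpuinfo_text):
    """Return the hardware-virtualization flags (vmx and/or svm) found in /proc/cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Each logical CPU has one "flags" line listing its feature flags.
        if key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found
```

On the KVM host you could call it as `hvm_flags(open("/proc/cpuinfo").read())`. An empty set suggests that the CPU lacks VT/SVM or that it is disabled in the BIOS.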
Solution:
1. Install the virtualization package groups:
[root@kvm01 agent]# yum -y groupinstall 'Virtualization' 'Virtualization Client' 'Virtualization Platform' 'Virtualization Tools'
2. Check whether the kvm modules are loaded; if they are not, load the one that matches your CPU:
[root@kvm01 ~]# lsmod | grep kvm
kvm_intel 52570 0
kvm 314739 1 kvm_intel
[root@kvm01 ~]# modprobe kvm_intel    ## Intel platform
[root@kvm01 ~]# modprobe kvm_amd    ## AMD platform
3. Add the host again.
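If you want to script the module check instead of reading the `lsmod` output by eye, a small sketch like the one below may help. It assumes the usual `lsmod` layout (module name in the first column); `kvm_modules_loaded` is a hypothetical helper name.

```python
def kvm_modules_loaded(lsmod_text):
    """Return True if kvm plus a vendor module (kvm_intel or kvm_amd) appear in lsmod output."""
    # The first whitespace-separated column of each lsmod line is the module name.
    names = {line.split()[0] for line in lsmod_text.splitlines() if line.strip()}
    return "kvm" in names and bool(names & {"kvm_intel", "kvm_amd"})
```

Feeding it the `lsmod | grep kvm` output shown above returns True. A False result means `modprobe kvm_intel` or `modprobe kvm_amd` is still needed.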
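Bonus:
Errors while adding a host come in every shape and size, and Java error output is, well... Here is a small trick:
When adding a host fails and the log gives no clear cause, you can run the add-host command by hand on the agent. The exact command can be found in the management server log: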
[root@localhost management]# cat /var/log/cloudstack/management/management-server.log | grep cloudstack-setup-agent
2014-03-13 09:56:17,758 DEBUG [utils.ssh.SSHCmdHelper] (catalina-exec-11:null) Executing cmd: cloudstack-setup-agent -m 192.168.153.28 -z 2 -p 2 -c 2 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0
2014-03-13 09:56:52,775 DEBUG [utils.ssh.SSHCmdHelper] (catalina-exec-11:null) cloudstack-setup-agent -m 192.168.153.28 -z 2 -p 2 -c 2 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0 output:CloudStack Agent setup is done!
2014-03-13 11:12:22,455 DEBUG [utils.ssh.SSHCmdHelper] (catalina-exec-12:null) Executing cmd: cloudstack-setup-agent -m 192.168.153.28 -z 3 -p 3 -c 3 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0
2014-03-13 11:12:57,267 DEBUG [utils.ssh.SSHCmdHelper] (catalina-exec-12:null) cloudstack-setup-agent -m 192.168.153.28 -z 3 -p 3 -c 3 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0 output:CloudStack Agent setup is done!
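The same lookup can be done programmatically. The sketch below pulls the most recent command out of the log text by matching the "Executing cmd:" marker visible in the log lines above; the function name `last_setup_command` is an assumption of mine, not a CloudStack API.

```python
import re

# Matches the "Executing cmd:" entries written by SSHCmdHelper, as seen in the log above.
SETUP_CMD = re.compile(r"Executing cmd: (cloudstack-setup-agent .*)$")


def last_setup_command(log_text):
    """Return the most recent cloudstack-setup-agent command in the log text, or None."""
    command = None
    for line in log_text.splitlines():
        match = SETUP_CMD.search(line)
        if match:
            command = match.group(1).strip()
    return command
```

Taking the last match rather than the first matters when the host has been added more than once, as in my log, where the zone, pod and cluster ids changed from 2 to 3.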
In my example above, this yields the following command, which I then run on the agent:
[root@kvm01 ~]# cloudstack-setup-agent -m 192.168.153.28 -z 3 -p 3 -c 3 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0
Starting to configure your system:
Configure Cgroup ... [OK]
Configure SElinux ... [OK]
Configure Network ... [OK]
Configure Libvirt ... [OK]
Configure Firewall ... [OK]
Configure Nfs ... [OK]
Configure cloudAgent ... [OK]
CloudStack Agent setup is done!
[root@kvm01 ~]#
If anything goes wrong during this process, it is easy to see which step failed.
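If you capture the output of cloudstack-setup-agent, the failing step can also be picked out automatically. The sketch below relies on the "Configure <step> ... [STATUS]" line format shown above and treats any status other than OK as a failure. The exact failure marker is not shown in this article, so the "[Failed]" used in the usage note is an assumption.

```python
import re

# One line per configuration step, e.g. "Configure Libvirt ... [OK]".
STEP_LINE = re.compile(r"^Configure\s+(\S+)\s+\.\.\.\s+\[(.+?)\]$")


def failed_steps(output_text):
    """Return the names of configuration steps whose status is anything other than OK."""
    failed = []
    for line in output_text.splitlines():
        match = STEP_LINE.match(line.strip())
        if match and match.group(2).strip().upper() != "OK":
            failed.append(match.group(1))
    return failed
```

For the successful run shown above this returns an empty list. For a hypothetical line such as "Configure Libvirt ... [Failed]" it would return ["Libvirt"].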
The options of the cloudstack-setup-agent command used above are listed below; adjust them to your own environment:
[root@dbserver ~]# cloudstack-setup-agent -h
Usage: cloudstack-setup-agent [options]
Options:
  -h, --help            show this help message and exit
  -a                    auto mode
  -m MGT, --host=MGT    Management server hostname or IP-Address
  -z ZONE, --zone=ZONE  zone id
  -p POD, --pod=POD     pod id
  -c CLUSTER, --cluster=CLUSTER
                        cluster id
  -g GUID, --guid=GUID  guid
  --pubNic=PUBNIC       Public traffic interface
  --prvNic=PRVNIC       Private traffic interface
  --guestNic=GUESTNIC   Guest traffic interface
The concrete values for these options can be taken from /etc/cloudstack/agent/agent.properties on the agent host:
[root@kvm01 ~]# cat /etc/cloudstack/agent/agent.properties
#Storage
#Thu Mar 13 11:23:48 CST 2014
guest.network.device=cloud0
workers=5
private.network.device=cloud0
port=8250
resource=com.cloud.hypervisor.kvm.resource.LibvirtComputingResource
pod=3
zone=3
guid=0d21492f-9565-329d-9a26-0c85f6d39d12
public.network.device=cloud0
cluster=3
local.storage.uuid=ac70655b-f452-4d14-a1a1-2a5eebc4bb01
domr.scripts.dir=scripts/network/domr/kvm
LibvirtComputingResource.id=0
host=192.168.153.28
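Putting the two listings together, the command line can be rebuilt from agent.properties. The sketch below is only an illustration: the mapping from property keys to options is inferred from the matching values in my example (host to -m, zone to -z, and so on), not taken from CloudStack documentation, and the function names are my own.

```python
import shlex

# Property key -> option; inferred from the example above (assumption).
SHORT_OPTIONS = [("host", "-m"), ("zone", "-z"), ("pod", "-p"),
                 ("cluster", "-c"), ("guid", "-g")]
NIC_OPTIONS = [("public.network.device", "--pubNic"),
               ("private.network.device", "--prvNic"),
               ("guest.network.device", "--guestNic")]


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def setup_agent_command(props):
    """Build the cloudstack-setup-agent command line from parsed properties."""
    args = ["cloudstack-setup-agent"]
    for key, flag in SHORT_OPTIONS:
        args += [flag, props[key]]
    args.append("-a")  # auto mode, as in the logged command
    for key, flag in NIC_OPTIONS:
        args.append(f"{flag}={props[key]}")
    return " ".join(shlex.quote(arg) for arg in args)
```

With the agent.properties shown above, `setup_agent_command(parse_properties(text))` reproduces the command I ran by hand. A missing key raises KeyError, which is a quick hint that the file is incomplete.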
To be continued...
This article comes from the "systems" blog; please keep this attribution: http://systems.blog.51cto.com/2500547/1375332