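CloudStack Troubleshooting Roundup (updated from time to time)

I use CloudStack a lot at work. Here are some of the problems I have run into and how I tracked them down; I hope they help.

I. Adding a Host Fails

Symptom 1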

[root@mgmt ~]# tail -f /var/log/cloudstack/management/management-server.log
2014-02-28 11:05:32,172 DEBUG [kvm.discoverer.LibvirtServerDiscoverer] (catalina-exec-22:null) Timeout, to wait for the host connecting to mgt svr, assuming it is failed
2014-02-28 11:05:32,205 WARN [cloud.resource.ResourceManagerImpl] (catalina-exec-22:null) Unable to find the server resources at http://192.168.150.250
2014-02-28 11:05:32,220 INFO [utils.exception.CSExceptionErrorCode] (catalina-exec-22:null) Could not find exception: com.cloud.exception.DiscoveryException in error code list for exceptions
2014-02-28 11:05:32,220 WARN [admin.host.AddHostCmd] (catalina-exec-22:null) Exception:
com.cloud.exception.DiscoveryException: Unable to add the host
  at com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:798)
  at com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:590)
  at org.apache.cloudstack.api.command.admin.host.AddHostCmd.execute(AddHostCmd.java:143)
  at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
  at com.cloud.api.ApiServer.queueCommand(ApiServer.java:514)
  at com.cloud.api.ApiServer.handleRequest(ApiServer.java:372)
  at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:305)
  at com.cloud.api.ApiServlet.doPost(ApiServlet.java:71)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
  at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
  at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
  at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
  at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
  at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
  at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
  at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
  at org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
  at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:721)
  at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2268)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:679)
2014-02-28 11:05:32,222 INFO [cloud.api.ApiServer] (catalina-exec-22:null) Unable to add the host
2014-02-28 11:05:32,224 DEBUG [cloud.api.ApiServlet] (catalina-exec-22:null) ===END=== 192.168.151.234 -- POST command=addHost&response=json&sessionkey=GEI3EIOONoV5RG9Mcs4xcdx31oc%3D

Symptom 2

[root@kvm01 agent]# /etc/init.d/cloudstack-agent status    ## check the cloudstack-agent service status on the KVM host
cloudstack-agent dead but subsys locked

 

Symptom 3

[root@kvm01 agent]# cat /var/log/cloudstack/agent/agent.log        ## look for errors in agent.log on the KVM host
ERROR [cloud.agent.AgentShell] (main:null) Unable to start agent: NO HVM support on this machine, please make sure: 1. VT/SVM is supported by your CPU, or is enabled in BIOS. 2. kvm modules are loaded (kvm, kvm_amd|kvm_intel)

 

Solution:

1. Install the virtualization package groups.
[root@kvm01 agent]# yum -y groupinstall 'Virtualization' 'Virtualization Client' 'Virtualization Platform' 'Virtualization Tools'
2. Confirm that the kvm modules are loaded correctly.
[root@kvm01 ~]# lsmod | grep kvm
kvm_intel 52570 0
kvm 314739 1 kvm_intel
If the command prints nothing, load the kvm module that matches your CPU:
[root@kvm01 ~]# modprobe kvm_intel     ## Intel platform
[root@kvm01 ~]# modprobe kvm_amd       ## AMD platform

3. Add the host again.
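
The two preconditions named in the agent error (CPU support for VT/SVM and a loaded kvm vendor module) can be checked before retrying. The following is only a rough sketch: the function names `has_hw_virt` and `has_kvm_module` are my own, not part of CloudStack, and a `vmx`/`svm` flag in /proc/cpuinfo does not by itself prove that virtualization is enabled in the BIOS.

```shell
#!/bin/sh
# Sketch: check the two preconditions from the "NO HVM support" error.
# Function names are hypothetical helpers, not CloudStack commands.

# Succeeds if the given cpuinfo file lists the vmx (Intel) or svm (AMD) flag.
# The file argument is optional and defaults to /proc/cpuinfo.
has_hw_virt() {
    cpuinfo="${1:-/proc/cpuinfo}"
    grep -Eqw 'vmx|svm' "$cpuinfo"
}

# Succeeds if the lsmod-style listing read from stdin contains
# kvm_intel or kvm_amd as a loaded module.
has_kvm_module() {
    grep -Eq '^kvm_(intel|amd)[[:space:]]'
}

# Usage on a real host:
#   has_hw_virt && echo "CPU virtualization flag present"
#   lsmod | has_kvm_module && echo "kvm vendor module loaded"
```

If the first check fails, look at the BIOS settings; if only the second fails, the modprobe commands above are the place to start.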

 

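Bonus Tip

Errors while adding a host come in every shape and size, and Java stack traces are rarely much help. Here is a small trick:

When adding a host fails and the log gives no clear cause, run the host setup command by hand on the agent. The exact command can be found in the management server log: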

[root@localhost management]# cat /var/log/cloudstack/management/management-server.log | grep cloudstack-setup-agent
2014-03-13 09:56:17,758 DEBUG [utils.ssh.SSHCmdHelper] (catalina-exec-11:null) Executing cmd: cloudstack-setup-agent -m 192.168.153.28 -z 2 -p 2 -c 2 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0
2014-03-13 09:56:52,775 DEBUG [utils.ssh.SSHCmdHelper] (catalina-exec-11:null) cloudstack-setup-agent -m 192.168.153.28 -z 2 -p 2 -c 2 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0 output:CloudStack Agent setup is done!
2014-03-13 11:12:22,455 DEBUG [utils.ssh.SSHCmdHelper] (catalina-exec-12:null) Executing cmd: cloudstack-setup-agent -m 192.168.153.28 -z 3 -p 3 -c 3 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0
2014-03-13 11:12:57,267 DEBUG [utils.ssh.SSHCmdHelper] (catalina-exec-12:null) cloudstack-setup-agent -m 192.168.153.28 -z 3 -p 3 -c 3 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0 output:CloudStack Agent setup is done!

In my example above, that gives the following command, which I then ran on the agent:

[root@kvm01 ~]# cloudstack-setup-agent -m 192.168.153.28 -z 3 -p 3 -c 3 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0
Starting to configure your system:
Configure Cgroup ...          [OK]
Configure SElinux ...         [OK]
Configure Network ...         [OK]
Configure Libvirt ...         [OK]
Configure Firewall ...        [OK]
Configure Nfs ...             [OK]
Configure cloudAgent ...      [OK]
CloudStack Agent setup is done!
[root@kvm01 ~]#

If anything fails along the way, the per-step output makes it easy to see which step caused the problem.
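
Pulling the most recent command out of the management log can be scripted. This is a minimal sketch that assumes the log lines look like the ones shown earlier (the "Executing cmd: " marker followed by the command); the function name `last_setup_agent_cmd` is my own.

```shell
#!/bin/sh
# Sketch: print the most recent cloudstack-setup-agent command recorded in
# the management server log. last_setup_agent_cmd is a hypothetical helper.
# The log path argument is optional and defaults to the path used above.
last_setup_agent_cmd() {
    log="${1:-/var/log/cloudstack/management/management-server.log}"
    # Keep only the "Executing cmd:" lines, take the newest one, and strip
    # everything up to and including the marker.
    grep 'Executing cmd: cloudstack-setup-agent' "$log" \
        | tail -n 1 \
        | sed 's/^.*Executing cmd: //'
}

# Usage on the management server:
#   last_setup_agent_cmd
```

The printed line can then be copied to the agent host and run there as root.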

The options of the cloudstack-setup-agent command are listed below; adjust them to match your own environment:

[root@dbserver ~]# cloudstack-setup-agent -h
Usage: cloudstack-setup-agent [options]
Options:
   -h, --help            show this help message and exit
   -a                    auto mode
   -m MGT, --host=MGT    Management server hostname or IP-Address
   -z ZONE, --zone=ZONE  zone id
   -p POD, --pod=POD     pod id
   -c CLUSTER, --cluster=CLUSTER
                         cluster id
   -g GUID, --guid=GUID  guid
   --pubNic=PUBNIC       Public traffic interface
   --prvNic=PRVNIC       Private traffic interface
   --guestNic=GUESTNIC   Guest traffic interface

The values for these options can be read from /etc/cloudstack/agent/agent.properties on the agent host:

[root@kvm01 ~]# cat /etc/cloudstack/agent/agent.properties
#Storage
#Thu Mar 13 11:23:48 CST 2014
guest.network.device=cloud0
workers=5
private.network.device=cloud0
port=8250
resource=com.cloud.hypervisor.kvm.resource.LibvirtComputingResource
pod=3
zone=3
guid=0d21492f-9565-329d-9a26-0c85f6d39d12
public.network.device=cloud0
cluster=3
local.storage.uuid=ac70655b-f452-4d14-a1a1-2a5eebc4bb01
domr.scripts.dir=scripts/network/domr/kvm
LibvirtComputingResource.id=0
host=192.168.153.28
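
Putting the properties file and the option list together, the command line can be rebuilt automatically. This is only a sketch: the helper names `prop` and `build_setup_agent_cmd` are my own, and the key-to-option mapping (host to -m, zone to -z, pod to -p, cluster to -c, guid to -g, and the three network devices to --pubNic, --prvNic and --guestNic) is inferred from the example values above rather than taken from any official documentation.

```shell
#!/bin/sh
# Sketch: rebuild the cloudstack-setup-agent command from agent.properties.
# prop and build_setup_agent_cmd are hypothetical helpers; the mapping of
# property keys to options is an assumption based on the example above.

# Print the value of one key from a properties file (last match wins).
# $1 = key, $2 = properties file
prop() {
    awk -F= -v k="$1" '$1 == k { sub(/^[^=]*=/, ""); v = $0 } END { print v }' "$2"
}

# Print (but do not run) the setup command for the given properties file.
build_setup_agent_cmd() {
    f="${1:-/etc/cloudstack/agent/agent.properties}"
    printf 'cloudstack-setup-agent -m %s -z %s -p %s -c %s -g %s -a --pubNic=%s --prvNic=%s --guestNic=%s\n' \
        "$(prop host "$f")" "$(prop zone "$f")" "$(prop pod "$f")" \
        "$(prop cluster "$f")" "$(prop guid "$f")" \
        "$(prop public.network.device "$f")" \
        "$(prop private.network.device "$f")" \
        "$(prop guest.network.device "$f")"
}

# Usage on the agent host:
#   build_setup_agent_cmd
```

The function only prints the command so that it can be reviewed first; a key missing from the file shows up as an empty value in the output.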

 

More to come...

This article is from the "systems" blog; please keep this attribution: http://systems.blog.51cto.com/2500547/1375332
