It was not until the Grizzly release that the quantum component became stable enough for real use. I spent a lot of time studying it and can now successfully deploy a multi-node environment. Below are some of the problems I ran into during deployment, covering both the Essex and Grizzly releases. There is very little material on this topic on Chinese sites; much of what I found came from foreign websites. Moreover, identical log messages often have quite different root causes, so you have to analyze the underlying mechanism carefully to pinpoint the real problem. Errors are nothing to fear: troubleshooting them deepens your understanding of the system, which is a good thing.
As for installation, there are automated deployment tools such as devstack and onestack that set everything up with one command. If you are a beginner, I do not recommend them: you will learn nothing from a one-click install. If it happens to work, congratulations for the moment; but as soon as something fails midway you will be completely lost, with no idea where it went wrong, and later maintenance becomes very difficult too. You may end up spending far more time troubleshooting, because you never learned which steps are involved or which settings are required. These tools are mostly meant for quickly standing up development environments; a real production deployment should be done step by step, so that when problems occur you can locate and fix them quickly.
This article simply summarizes some of the error messages I encountered during deployment and the fixes that worked. They were all hit, and solved, in my own environment; details may differ in yours, so treat them as reference only.
1. Check that the services are healthy:
root@control:~# nova-manage service list
Binary           Host     Zone      Status   State  Updated_At
nova-cert        control  internal  enabled  :-)    2013-04-26 02:29:44
nova-conductor   control  internal  enabled  :-)    2013-04-26 02:29:42
nova-consoleauth control  internal  enabled  :-)    2013-04-26 02:29:44
nova-scheduler   control  internal  enabled  :-)    2013-04-26 02:29:47
nova-compute     node-01  nova      enabled  :-)    2013-04-26 02:29:46
nova-compute     node-02  nova      enabled  :-)    2013-04-26 02:29:46
nova-compute     node-03  nova      enabled  :-)    2013-04-26 02:29:42
If every service shows a smiley, nova is healthy. If you see XXX instead, check that service's logs under /var/log/nova/; the cause can usually be worked out from the log.
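The log check above can be scripted; this is a minimal sketch, assuming the default log directory /var/log/nova and the ERROR/TRACE markers seen in this post's log excerpts.

```shell
# Sketch: list nova logs containing ERROR/TRACE lines and show the last few.
# /var/log/nova is the default log directory assumed here.
scan_logs() {  # scan_logs <log-dir>
    grep -l -E 'ERROR|TRACE' "$1"/*.log 2>/dev/null | while read -r f; do
        echo "== $f =="
        grep -E 'ERROR|TRACE' "$f" | tail -n 5
    done
}
scan_logs "${LOG_DIR:-/var/log/nova}"
```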
2. libvirt errors
python2.7/dist-packages/nova/virt/libvirt/connection.py", line 338, in _connect
2013-03-09 17:05:42 TRACE nova     return libvirt.openAuth(uri, auth, 0)
2013-03-09 17:05:42 TRACE nova   File "/usr/lib/python2.7/dist-packages/libvirt.py", line 102, in openAuth
2013-03-09 17:05:42 TRACE nova     if ret is None: raise libvirtError('virConnectOpenAuth() failed')
2013-03-09 17:05:42 TRACE nova libvirtError: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
2013-03-09 22:05:41.909+0000: 12466: info : libvirt version: 0.9.8
2013-03-09 22:05:41.909+0000: 12466: error : virNetServerMDNSStart:460 : internal error Failed to create mDNS client: Daemon not running
Solution:
For this error, first check /var/log/libvirt/libvirtd.log. The log will show: libvirt-bin service will not start without dbus installed.
Then run ps -e | grep dbus to confirm that dbus is running, and install lxc with apt-get install lxc.
3. Failed to add image
Error: Failed to add image. Got error: The request returned 500 Internal Server Error
Solution:
This is an environment-variable problem. Add the following to /etc/profile:
export OS_AUTH_KEY="openstack"
export OS_AUTH_URL="http://localhost:5000/v2.0/"
export OS_PASSWORD="openstack"
export OS_TENANT_NAME="admin"
export OS_USERNAME="admin"
Then run source /etc/profile. You can of course set these variables without touching profile, but they only last for the current session, which is a pain after a reboot; writing them into profile saves a lot of trouble.
4. Zombie instances
Zombie instances usually appear after nova or the underlying VM is shut down abnormally, or when an errored instance cannot be deleted. Use virsh list to check whether the underlying VM is still running; if so, stop it, then delete the record directly from the database.
Nova instance not found. Local file storage of the image files. Error:
2013-03-09 17:58:08 TRACE nova     raise exception.InstanceNotFound(instance_id=instance_name)
2013-03-09 17:58:08 TRACE nova InstanceNotFound: Instance instance-00000002 could not be found.
2013-03-09 17:58:08 TRACE nova
Solution:
Delete the zombie instances from the database, or drop and recreate the database:
a. Drop and recreate the database:
$ mysql -u root -p
DROP DATABASE nova;
CREATE DATABASE nova;
GRANT ALL PRIVILEGES ON nova.* TO 'novadbadmin'@'%' IDENTIFIED BY '<password>';
Then quit and resync the DB.
b. Delete the instance from the database:
#!/bin/bash
mysql -uroot -pmysql << _ESXU_
use nova;
DELETE a FROM nova.security_group_instance_association AS a
  INNER JOIN nova.instances AS b ON a.instance_uuid=b.id WHERE b.uuid='$1';
DELETE FROM nova.instance_info_caches WHERE instance_uuid='$1';
DELETE FROM nova.instances WHERE uuid='$1';
_ESXU_
Save the above as delete_instance.sh, then run sh delete_instance.sh <instance_uuid>, where the instance UUID can be found with nova list.
5. Keystone NoHandlers
root@openstack-dev-r910:/home/brent/openstack# ./keystone_data.sh
No handlers could be found for logger "keystoneclient.client"
Unable to authorize user
No handlers could be found for logger "keystoneclient.client"
Unable to authorize user
No handlers could be found for logger "keystoneclient.client"
Unable to authorize user
Solution:
Most of the time this error means keystone_data.sh is wrong: its admin_token must match the one in /etc/keystone/keystone.conf. Also confirm that keystone.conf contains:
driver = keystone.catalog.backends.templated.TemplatedCatalog
template_file = /etc/keystone/default_catalog.templates
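A quick way to catch the token mismatch described above is to compare the two values mechanically. A minimal sketch, assuming the data script carries the token in a SERVICE_TOKEN= assignment (that variable name is an assumption, not something shown in this post):

```shell
# Sketch: compare admin_token in keystone.conf with SERVICE_TOKEN in
# keystone_data.sh (SERVICE_TOKEN is an assumed variable name).
check_token() {  # check_token <keystone.conf> <keystone_data.sh>
    conf_token=$(sed -n 's/^[[:space:]]*admin_token[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1)
    script_token=$(sed -n 's/^[[:space:]]*SERVICE_TOKEN=//p' "$2" | head -n 1)
    if [ -n "$conf_token" ] && [ "$conf_token" = "$script_token" ]; then
        echo "tokens match"
    else
        echo "token mismatch: conf='$conf_token' script='$script_token'"
    fi
}
# e.g.: check_token /etc/keystone/keystone.conf ./keystone_data.sh
```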
6. Wiping the components for a clean reinstall:
#!/bin/bash
mysql -uroot -popenstack -e "drop database nova;"
mysql -uroot -popenstack -e "drop database glance;"
mysql -uroot -popenstack -e "drop database keystone;"
apt-get purge nova-api nova-cert nova-common nova-compute nova-compute-kvm \
  nova-doc nova-network nova-objectstore nova-scheduler nova-vncproxy \
  nova-volume python-nova python-novaclient
apt-get autoremove
rm -rf /var/lib/glance
rm -rf /var/lib/keystone/
rm -rf /var/lib/nova/
rm -rf /var/lib/mysql
Running the script above uninstalls the installed components and wipes the databases, saving you the trouble of reinstalling the whole OS.
7. Access denied for user 'keystone'@'localhost' (using password: YES)
# keystone-manage db_sync
File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 187, in __init__
    super(Connection, self).__init__(*args, **kwargs2)
sqlalchemy.exc.OperationalError: (OperationalError) (1045, "Access denied for user 'keystone'@'openstack1' (using password: YES)") None None
Solution:
Check the database connection string in keystone.conf. The correct form is:
[sql]
connection = mysql://keystone:openstack@localhost:3306/keystone
8. nova-compute going down and clock synchronization
You will often find nova-compute dead or misbehaving, with nova-manage showing its state as XXX.

This is usually because the nova-compute host's clock differs from the controller's. nova-compute periodically updates its timestamp in the services table in the database, using the compute host's local time.

The controller checks liveness by subtracting that update time from its own current time; if the difference exceeds a threshold (the exact value is in the code, around 15 seconds), it marks nova-compute as down.

At that point nova-manage shows nova-compute as XXX, and if you create a VM, nova-scheduler.log reports that no valid host was found. Other service nodes behave the same way; this is nova's heartbeat mechanism. Time synchronization across the nodes of a nova environment is therefore vital. Make sure the clocks are in sync!

If the nova-compute status in the dashboard flips between red and green, synchronize the clocks strictly, or find that threshold in the code and raise it.
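The liveness rule described above boils down to a single comparison. A minimal sketch (the threshold is configurable in nova; the post quotes roughly 15 seconds):

```shell
# Sketch of nova's liveness check: a service is considered alive when
# (now - last updated_at) does not exceed the threshold.
is_alive() {  # is_alive <updated_at_epoch> <now_epoch> <threshold_seconds>
    [ $(( $2 - $1 )) -le "$3" ]
}
if is_alive 1000 1010 15; then echo "alive"; fi   # heartbeat 10s old
if ! is_alive 1000 1030 15; then echo "down"; fi  # heartbeat 30s old
```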
9. noVNC cannot connect to instances
novnc causes quite a few problems, and there are many configuration write-ups online. The configuration itself is simple, just four parameters, and with them set correctly there is usually no trouble, but I still hit several issues during installation.
a. "Connection Refused"
The control node may be unable to resolve the compute node's hostname when it receives the VNC request, and so cannot connect to the instance on the compute node.

It may also be that the current browser is unsupported or cannot reach the host. Add the compute node's IP-to-hostname mapping to /etc/hosts on the control node.
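For example, on the control node (a sketch: the IP and hostname are placeholders for your environment, and HOSTS_FILE defaults to /etc/hosts):

```shell
# Sketch: map the compute node's hostname to its IP on the control node,
# skipping the entry if the hostname is already present.
HOSTS_FILE=${HOSTS_FILE:-/etc/hosts}
add_host() {  # add_host <ip> <hostname>
    grep -q "[[:space:]]$2\$" "$HOSTS_FILE" || echo "$1 $2" >> "$HOSTS_FILE"
}
# e.g.: add_host 192.168.80.22 node-01
```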
b. "failed connect to server"
This error has many causes, configuration mistakes among them. In our environment it happened because the network mirror had been updated and the installed package versions no longer matched, leaving the components unusable; the fix was to use a local mirror. Note also that novnc requires a browser with WebSocket and HTML5 support; Chrome is recommended.
10. cinder error: cannot log in to the dashboard
The following error appears:
TypeError at /admin/
hasattr(): attribute name must be string
Request Method: GET
Request URL: http://192.168.80.21/horizon/admin/
Django Version: 1.4.5
Exception Type: TypeError
Exception Value: hasattr(): attribute name must be string
Exception Location: /usr/lib/python2.7/dist-packages/cinderclient/client.py in __init__, line 78
Python Executable: /usr/bin/python
Python Version: 2.7.3
Server time: Fri, 29 Mar 2013 12:51:09 +0000
Solution:
Check the apache2 error log; it reports:
ERROR:django.request:Internal Server Error: /horizon/admin/
Traceback (most recent call last):
  ...
  File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/api/cinder.py", line 123, in tenant_quota_get
    c_client = cinderclient(request)
  File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/api/cinder.py", line 59, in cinderclient
    http_log_debug=settings.DEBUG)
  File "/usr/lib/python2.7/dist-packages/cinderclient/v1/client.py", line 69, in __init__
    cacert=cacert)
  File "/usr/lib/python2.7/dist-packages/cinderclient/client.py", line 78, in __init__
    if hasattr(requests, logging):
TypeError: hasattr(): attribute name must be string
The message points to line 78 of cinderclient's client.py: the attribute name passed to hasattr() must be a string.
Fix the code:
# vim /usr/lib/python2.7/dist-packages/cinderclient/client.py
78         if hasattr(requests, logging):    # change to: if hasattr(requests, 'logging'):
79             requests.logging.getLogger(requests.__name__).addHandler(ch)
Restart apache2:
/etc/init.d/apache2 restart
After that the dashboard loads without errors, and creating a volume works too.
11. Unable to attach cinder volume to VM
While testing the volume service, attaching an LVM volume to an instance failed. This is not actually a cinder bug; it is an iSCSI attach problem.
Here is the error from nova-compute.log on the compute node:
2012-07-24 14:33:08 TRACE nova.rpc.amqp ProcessExecutionError: Unexpected error while running command.
2012-07-24 14:33:08 TRACE nova.rpc.amqp Command: sudo nova-rootwrap iscsiadm -m node -T iqn.2010-10.org.openstack:volume-00000011 -p 192.168.0.23:3260 --rescan
2012-07-24 14:33:08 TRACE nova.rpc.amqp Exit code: 255
2012-07-24 14:33:08 TRACE nova.rpc.amqp Stdout: ''
2012-07-24 14:33:08 TRACE nova.rpc.amqp Stderr: 'iscsiadm: No portal found.\n'
The error means the storage exported by the iSCSI target was not found. Many OpenStack write-ups say to add these two options:
iscsi_ip_prefix=192.168.80      # the internal subnet of the OpenStack environment
iscsi_ip_address=192.168.80.22  # the internal IP of the volume host
That did not solve it. I later noticed that the attach failed precisely when iscsi_helper=tgtadm was set in nova.conf.
Testing and reading the logs showed why: iscsi_helper=tgtadm requires the tgt service, whereas the iscsitarget service requires iscsi_helper=ietadm.
In my test environment both tgt and iscsitarget were installed and running (installing nova-common pulls in tgt, which is easy to miss). With iscsi_helper=tgtadm set in nova.conf, port 3260 turned out to be held by iscsitarget, so the attach failed. Pick whichever target service suits your setup, but keep only one pairing: tgt with iscsi_helper=tgtadm, or iscsitarget with iscsi_helper=ietadm.
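The valid pairings can be expressed as a small lookup; a sketch of the rule stated above:

```shell
# Sketch: map the installed target service to the matching iscsi_helper
# value for nova.conf; keep only one of the two services running.
helper_for() {  # helper_for <tgt|iscsitarget>
    case "$1" in
        tgt)         echo "iscsi_helper=tgtadm" ;;
        iscsitarget) echo "iscsi_helper=ietadm" ;;
        *)           echo "unknown target service: $1" >&2; return 1 ;;
    esac
}
helper_for tgt
helper_for iscsitarget
```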
12. glance index error:

In the Grizzly release, running glance index gave me:

Authorization Failed: Unable to communicate with identity service: {"error": {"message": "An unexpected error prevented the server from fulfilling your request. Command 'openssl' returned non-zero exit status 3", "code": 500, "title": "Internal Server Error"}}. (HTTP 500)
The message says glance failed keystone authentication. The keystone log reports:
2013-03-04 12:40:58 ERROR [keystone.common.cms] Signing error: Error opening signer certificate /etc/keystone/ssl/certs/signing_cert.pem
139803495638688:error:02001002:system library:fopen:No such file or directory:bss_file.c:398:fopen('/etc/keystone/ssl/certs/signing_cert.pem','r')
139803495638688:error:20074002:BIO routines:FILE_CTRL:system lib:bss_file.c:400:
unable to load certificate
2013-03-04 12:40:58 ERROR [root] Command 'openssl' returned non-zero exit status 3
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/keystone/common/wsgi.py", line 231, in __call__
    result = method(context, **params)
  File "/usr/lib/python2.7/dist-packages/keystone/token/controllers.py", line 118, in authenticate
    CONF.signing.keyfile)
  File "/usr/lib/python2.7/dist-packages/keystone/common/cms.py", line 140, in cms_sign_token
    output = cms_sign_text(text, signing_cert_file_name, signing_key_file_name)
  File "/usr/lib/python2.7/dist-packages/keystone/common/cms.py", line 135, in cms_sign_text
    raise subprocess.CalledProcessError(retcode, "openssl")
CalledProcessError: Command 'openssl' returned non-zero exit status 3
In Grizzly, keystone's default token format is PKI, which needs a signing certificate; earlier releases used UUID. Change keystone.conf:
token_format = UUID
Try again and the error is gone.
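A sketch of the edit (the path is the Ubuntu default; restart keystone afterwards):

```shell
# Sketch: force token_format = UUID in keystone.conf, whether the line is
# currently commented out or set to PKI.
set_uuid_tokens() {  # set_uuid_tokens <keystone.conf>
    sed -i 's/^[#[:space:]]*token_format[[:space:]]*=.*/token_format = UUID/' "$1"
}
# e.g.: set_uuid_tokens /etc/keystone/keystone.conf && service keystone restart
```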
13. Building images
The tricky case is Windows images, because Windows involves loading extra drivers.
Download the virtio drivers: Windows does not support virtio out of the box, but VMs managed through OpenStack need it. Two drivers are required, one for the disk and one for the NIC: virtio-win-0.1-30.iso and virtio-win-1.1.16.vfd. Two steps deserve emphasis:
1. Create the image:
kvm -m 512 -boot d --drive file=win2003server.img,cache=writeback,if=virtio,boot=on -fda virtio-win-1.1.16.vfd -cdrom windows2003_x64.iso -vnc :10
2. Boot the system:
kvm -m 1024 --drive file=win2003server.img,if=virtio,boot=on -cdrom virtio-win-0.1-30.iso -net nic,model=virtio -net user -boot c -nographic -vnc :8
Note the options if=virtio,boot=on and -fda virtio-win-1.1.16.vfd, plus the virtio-win-0.1-30.iso used when booting: these supply the disk and NIC drivers respectively. Without them the installer will not find a hard disk, and instances created from the finished image will have no NIC driver. So after the image is installed, boot it again and update the NIC driver to virtio.
14. Deleting zombie volumes
When the cinder service is unhealthy, creating volumes can leave zombie volumes behind. If they cannot be deleted from horizon, remove them by hand on the server:
Command: lvremove /dev/nova-volumes/volume-000002
Be sure to use the full path, or the removal fails. If removal reports "Can't remove open logical volume", try stopping the related services and removing again. After removal, also clear the corresponding record from the volumes table in the cinder database.
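The database step can mirror the instance-deletion script from section 4. A sketch that only generates the SQL so it can be inspected before piping into mysql (the table name is from the Grizzly-era cinder schema; the UUID is a placeholder):

```shell
# Sketch: emit the DELETE statement for a zombie volume record; pipe the
# output into mysql, e.g.: volume_delete_sql <uuid> | mysql -uroot -p
volume_delete_sql() {  # volume_delete_sql <volume-uuid>
    printf "DELETE FROM cinder.volumes WHERE id='%s';\n" "$1"
}
volume_delete_sql 7c3c1b0e-0000-0000-0000-000000000002
```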
With Neutron in GRE mode and default settings, VM egress performance is extremely poor. Bug report: https://bugs.launchpad.net/neutron/+bug/1252900
keystone stores token data in the token table and never deletes expired rows, so the table grows abnormally large. See https://bugs.launchpad.net/ubuntu/+source/keystone/+bug/1032633
2013-12-05 19:13:11.732 2625 WARNING keystone.common.controller [-] RBAC: Invalid token
2013-12-05 19:13:11.732 2625 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 192.168.1.165
http://www.sebastien-han.fr/blog/2012/12/12/cleanup-keystone-tokens/
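The cleanup the linked post describes can be a one-line periodic job. A sketch that generates the purge statement (table and column names are from the Grizzly-era keystone schema):

```shell
# Sketch: generate SQL that purges expired rows from keystone's token table;
# run it periodically, e.g. daily from cron:
#   expired_token_sql | mysql -uroot -p
expired_token_sql() {
    echo "DELETE FROM keystone.token WHERE expires <= NOW();"
}
expired_token_sql
```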
** (process:11739): WARNING **: Error connecting to bus: org.freedesktop.DBus.Error.FileNotFound: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
process 11739: arguments to dbus_connection_get_data() were incorrect, assertion "connection != NULL" failed in file dbus-connection.c line 5804.
Restart the messagebus service:
[root@compute1 ~]# /etc/init.d/messagebus start
Starting system message bus: [ OK ]
2014-01-26 00:58:07.074 29610 TRACE neutron File "/usr/lib/python2.6/site-packages/neutron/agent/linux/ip_lib.py", line 81, in _execute
2014-01-26 00:58:07.074 29610 TRACE neutron root_helper=root_helper)
2014-01-26 00:58:07.074 29610 TRACE neutron File "/usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py", line 62, in execute
2014-01-26 00:58:07.074 29610 TRACE neutron raise RuntimeError(m)
2014-01-26 00:58:07.074 29610 TRACE neutron RuntimeError:
2014-01-26 00:58:07.074 29610 TRACE neutron Command: ['ip', '-o', 'link', 'show', 'br-int']
2014-01-26 00:58:07.074 29610 TRACE neutron Exit code: 255
2014-01-26 00:58:07.074 29610 TRACE neutron Stdout: ''
2014-01-26 00:58:07.074 29610 TRACE neutron Stderr: 'Device "br-int" does not exist.\n'
2014-01-26 00:58:07.074 29610 TRACE neutron
Add br-int:
[root@controller1 neutron]# ovs-vsctl add-br br-int
[root@controller1 neutron]# ovs-vsctl show
acb40cab-1fa0-48a0-a48c-56c89e1acfcd
Bridge br-int
Port br-int
Interface br-int
type: internal
ovs_version: "1.10.2"
[root@controller1 neutron]# ip -o link show br-int
5: br-int: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN \ link/ether 3e:d6:38:4e:28:43 brd ff:ff:ff:ff:ff:ff
A spice configuration problem: the listen parameter on the nova-compute node is wrong!
2014-03-13 18:01:11.312 12413 WARNING nova.virt.disk.api [req-4cb3d0ef-d70b-4383-a122-c070a62f757f 6965226966304bd5a3ae07587d5ef958 d2390e6dd4ce4b48866be0d3d1417c01] Ignoring error injecting data into image (Error mounting /share/instances/be363098-6749-42ea-84e0-824fdb1c8e59/disk with libguestfs (command failed: LC_ALL=C '/usr/libexec/qemu-kvm' -nographic -help
errno: File exists
[root@controller1 ~]# ln -s /usr/bin/qemu-kvm /usr/libexec/qemu-kvm
[root@controller1 ~]# ls -l /usr/libexec/qemu-kvm
lrwxrwxrwx 1 root root 17 Mar 27 17:15 /usr/libexec/qemu-kvm -> /usr/bin/qemu-kvm
[root@compute2 /data/nova/instances/0071e60b-a0b6-41fa-b484-ede5448d87b9]#guestmount -a disk -i --ro /mnt/
guestmount: no operating system was found on this disk
libguestfs-winsupport is not installed; install it:
[root@compute2 /root]# yum install libguestfs-winsupport
Modify the corresponding code in nova/virt/libvirt/driver.py (Havana 2013.2.1): at the top of pre_live_migration, add an instance_dir variable:

def pre_live_migration(self, context, instance, block_device_info,
    # add this variable inside the function:
    instance_dir = None
Modify the corresponding code: at the top of the function, add a network_name variable:

def _nw_info_build_network(self, port, networks, subnets):
    network_name = None    # add this variable
Modify /usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py, at around line 1307, changing False to True here:

metadata = {'is_public': False,    # change False to True
[root@node1 ~]# virsh start vm01
error: Failed to start domain vm01
error: internal error process exited while connecting to monitor: Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
No accelerator found!
The message above means QEMU could not find the kvm kernel module during initialization. Make sure the kernel has the KVM module and that CPU VT is enabled in the firmware.
[root@node1 ~]# modprobe kvm    # load the kvm module
Reboot, enter the BIOS, and set the Virtualization option under the advanced (CPU) section to Enabled.
[root@node1 ~]# lsmod | grep kvm    # list loaded modules
kvm_intel 54394 3
kvm 317536 1 kvm_intel
[root@node1 ~]# virsh migrate --live 1 qemu+tcp://node2 --p2p --tunnelled --unsafe
error: operation failed: Failed to connect to remote libvirt URI qemu+tcp://node2
Append /system to the URI; 'system' grants access equivalent to the root user.
[root@node1 ~]# virsh migrate --live 2 qemu+tcp://node2/system --p2p --tunnelled
error: Unsafe migration: Migration may lead to data corruption if disks use cache != none
Add the --unsafe option to the migration.
[root@node1 ~]# virsh migrate --live 2 qemu+tcp://192.168.0.121/system --p2p --tunnelled --unsafe
error: Timed out during operation: cannot acquire state change lock
This error sometimes also occurs when starting a VM; restart the libvirtd process.
[root@node1 ~]# virsh migrate 5 --live qemu+tcp://node2/system
error: Unable to read from monitor: Connection reset by peer
Check whether vncserver_listen in OpenStack's nova.conf is set correctly.
error: internal error Attempt to migrate guest to the same host 00020003-0004-0005-0006-000700080009
Check whether the two nodes have the same system-uuid; if so, the libvirt configuration needs to be changed. View it with:
[root@controller1 ~]# dmidecode -s system-uuid
63897446-817B-0010-B604-089E01B33744
In /etc/libvirt/libvirtd.conf the host_uuid line is commented out; uncomment it and change its value.
On each of the two machines, replace the host_uuid value with the output of cat /proc/sys/kernel/random/uuid.
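A sketch of that edit (run on each node, then restart libvirtd; /proc/sys/kernel/random/uuid is the Linux kernel's random-UUID source mentioned above):

```shell
# Sketch: write a fresh random UUID into host_uuid in libvirtd.conf,
# uncommenting the line if needed.
set_host_uuid() {  # set_host_uuid <libvirtd.conf>
    uuid=$(cat /proc/sys/kernel/random/uuid)
    sed -i "s|^[#[:space:]]*host_uuid[[:space:]]*=.*|host_uuid = \"$uuid\"|" "$1"
}
# e.g.: set_host_uuid /etc/libvirt/libvirtd.conf && service libvirtd restart
```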
On CentOS, ssh-agent does not start automatically; you can start it from a startup script at /etc/profile.d/ssh-agent.sh.
[root@test ~]# vim /etc/profile.d/ssh-agent.sh
#!/bin/sh
if [ -f ~/.agent.env ]; then
. ~/.agent.env >/dev/null
if ! kill -0 $SSH_AGENT_PID >/dev/null 2>&1; then
echo "Stale agent file found. Spawning new agent…"
eval `ssh-agent |tee ~/.agent.env`
ssh-add
fi
else
echo "Starting ssh-agent…"
eval `ssh-agent |tee ~/.agent.env`
ssh-add
fi
Clean up the record in the nova database's services table that matches the binary field (note the foreign-key constraint: set the deleted field to 1 in both the services and compute_nodes tables).
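A sketch of the SQL, generated so it can be inspected before piping into mysql (column names follow the Grizzly/Havana-era nova schema; binary is backtick-quoted because BINARY is a MySQL reserved word):

```shell
# Sketch: soft-delete a service record and its compute_nodes rows
# (compute_nodes.service_id references services.id).
service_cleanup_sql() {  # service_cleanup_sql <host> <binary>
    cat <<EOF
UPDATE nova.compute_nodes SET deleted=1 WHERE service_id IN
  (SELECT id FROM nova.services WHERE host='$1' AND \`binary\`='$2');
UPDATE nova.services SET deleted=1 WHERE host='$1' AND \`binary\`='$2';
EOF
}
service_cleanup_sql node-01 nova-compute
# once the statements look right:
#   service_cleanup_sql node-01 nova-compute | mysql -uroot -p
```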