An attempt to configure Ceph as the backend for nova, cinder-volume/cinder-backup, and glance.
Reference: the Ceph documentation: http://docs.ceph.com/docs/master/rbd/rbd-openstack/
The configuration is done first in a Kilo environment installed with devstack.
1. Create pools
[root@controller-1 ~]# ceph osd pool create volumes-maqi-kilo 128
pool 'volumes-maqi-kilo' created
[root@controller-1 ~]# ceph osd pool create backups-maqi-kilo 128
pool 'backups-maqi-kilo' created
[root@controller-1 ~]# ceph osd pool create images-maqi-kilo 128
pool 'images-maqi-kilo' created
[root@controller-1 ~]# ceph osd pool create vms-maqi-kilo 128
pool 'vms-maqi-kilo' created
[root@controller-1 ~]# ceph osd lspools
0 rbd,1 volumes-maqi-kilo,2 backups-maqi-kilo,3 images-maqi-kilo,4 vms-maqi-kilo,
128 is the PG (Placement Group) number; for clusters with fewer than 5 OSDs, 128 per pool is the commonly recommended value.
Update 2015/11/16: that PG number turned out to be a poor choice:
admin@maqi-kilo:~|⇒ ceph -s
cluster d3752df9-221d-43c7-8cf5-f39061a630da
health HEALTH_WARN
too many PGs per OSD (576 > max 300)
monmap e1: 1 mons at {controller-1=10.134.1.3:6789/0}
election epoch 2, quorum 0 controller-1
osdmap e18: 2 osds: 2 up, 2 in
pgmap v48: 576 pgs, 5 pools, 394 bytes data, 4 objects
20567 MB used, 36839 GB / 36860 GB avail
576 active+clean
The 4 new pools have 128 PGs each and the default rbd pool has 64, for a total of 128*4+64=576 PGs, all placed on just two OSDs.
The warning says 576 > max 300: with a replica size of 2 and only 2 OSDs, every PG has a copy on each OSD, so each OSD carries all 576 PGs, far above the 300-PGs-per-OSD warning threshold.
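For reference, the usual sizing rule of thumb is roughly 100 PGs per OSD, divided by the replica count and spread across the pools (rounded to a power of two). A tiny calculation with the numbers of this environment (a sketch, not part of the original setup) shows why 128 PGs per pool is far too many here:
# Rough PG sizing sketch based on the common "~100 PGs per OSD" rule of thumb.
num_osds = 2
replica_size = 2     # osd_pool_default_size in ceph.conf below
num_pools = 5        # the default rbd pool plus the four pools created here

total_pgs = num_osds * 100 // replica_size   # ~100 PGs for the whole cluster
pgs_per_pool = total_pgs // num_pools        # ~20 per pool
print(total_pgs, pgs_per_pool)               # rounding to a power of two suggests 16 or 32 per pool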
2. Copy the Ceph node's SSH public key to the OpenStack node
[root@controller-1 ~]# ssh-copy-id [email protected]
3. Create the ceph directory on the OpenStack node
admin@maqi-kilo:~|⇒ sudo mkdir /etc/ceph
4. Configure the OpenStack nodes as Ceph clients
Every node that runs cinder-volume, cinder-backup, nova-compute, or one of the glance services is a client of the Ceph cluster and needs the ceph.conf configuration file:
[root@controller-1 ~]# ssh [email protected] sudo tee /etc/ceph/ceph.conf < /etc/ceph/ceph.conf
[global]
fsid = 1c9f72d3-3ebc-465b-97a4-2784f2db1db3
mon_initial_members = controller-1
mon_host = 10.254.4.3
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_pool_default_size = 2
public_network = 10.254.4.3/24
5. Install the Ceph packages on the OpenStack node
admin@maqi-kilo:~|⇒ sudo apt-get install ceph-common
6. Set up Ceph client authentication
Create users for cinder-volume, cinder-backup, and glance:
[root@controller-1 ~]# ceph auth get-or-create client.cinder-maqi-kilo mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes-maqi-kilo , allow rwx pool=vms-maqi-kilo, allow rx pool=images-maqi-kilo'
[client.cinder-maqi-kilo]
key = AQDJYkhWwv4uKRAAI/JPWK2H4qV+DqMSkkliOQ==
[root@controller-1 ~]# ceph auth get-or-create client.glance-maqi-kilo mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=images-maqi-kilo'
[client.glance-maqi-kilo]
key = AQAPY0hW+1YQOBAA3aRlTVGkfzTA4ZfaBEmM8Q==
[root@controller-1 ~]# ceph auth get-or-create client.cinder-backup-maqi-kilo mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=backups-maqi-kilo'
[client.cinder-backup-maqi-kilo]
key = AQA7Y0hWxegCChAAhTHc7abrE9bGON97bSsLgw==
Copy the keys to the OpenStack node and fix their ownership. (Note: the keyring file names used in this step are wrong; see the first problem in the Problems section below.)
[root@controller-1 ~]# ceph auth get-or-create client.cinder-maqi-kilo | ssh [email protected] sudo tee /etc/ceph/ceph.client.cinder.keyring
[client.cinder-maqi-kilo]
key = AQDJYkhWwv4uKRAAI/JPWK2H4qV+DqMSkkliOQ==
[root@controller-1 ~]# ssh [email protected] sudo chown admin:admin /etc/ceph/ceph.client.cinder.keyring
[root@controller-1 ~]# ceph auth get-or-create client.cinder-backup-maqi-kilo | ssh [email protected] sudo tee /etc/ceph/ceph.client.cinder-backup.keyring
[client.cinder-backup-maqi-kilo]
key = AQA7Y0hWxegCChAAhTHc7abrE9bGON97bSsLgw==
[root@controller-1 ~]# ssh [email protected] sudo chown admin:admin /etc/ceph/ceph.client.cinder-backup.keyring
[root@controller-1 ~]# ceph auth get-or-create client.glance-maqi-kilo | ssh [email protected] sudo tee /etc/ceph/ceph.client.glance.keyring
[client.glance-maqi-kilo]
key = AQAPY0hW+1YQOBAA3aRlTVGkfzTA4ZfaBEmM8Q==
[root@controller-1 ~]# ssh [email protected] sudo chown admin:admin /etc/ceph/ceph.client.glance.keyring
Note:
- The owner here is admin because the devstack installation was run as the admin user, so admin is sufficient. In a manually installed environment the owner:group should be set to cinder:cinder.
Nodes running nova-compute also need ceph.client.cinder.keyring first:
[root@controller-1 ~]# ceph auth get-or-create client.cinder-maqi-kilo | ssh [email protected] sudo tee /etc/ceph/ceph.client.cinder.keyring
libvirt also needs this key, and defining it goes through a temporary key file.
On the Ceph node:
[root@controller-1 ~]# ceph auth get-key client.cinder-maqi-kilo | ssh [email protected] tee client.cinder.key
AQDJYkhWwv4uKRAAI/JPWK2H4qV+DqMSkkliOQ==
On the nova-compute node:
admin@maqi-kilo:~|⇒ uuidgen
57ac147c-199b-4b1c-a3d8-70be795c4d07
admin@maqi-kilo:~|⇒ cat > secret.xml <<EOF
heredoc> <secret ephemeral='no' private='no'>
heredoc>   <uuid>57ac147c-199b-4b1c-a3d8-70be795c4d07</uuid>
heredoc>   <usage type='ceph'>
heredoc>     <name>client.cinder-maqi-kilo secret</name>
heredoc>   </usage>
heredoc> </secret>
heredoc> EOF
admin@maqi-kilo:~|⇒ sudo virsh secret-define --file secret.xml
Secret 57ac147c-199b-4b1c-a3d8-70be795c4d07 created
admin@maqi-kilo:~|⇒ ls client.cinder.key secret.xml
client.cinder.key  secret.xml
admin@maqi-kilo:~|⇒ sudo virsh secret-set-value --secret 57ac147c-199b-4b1c-a3d8-70be795c4d07 --base64 $(cat client.cinder.key) && rm client.cinder.key secret.xml
Secret value set
Note: it is best to keep the UUID identical across all nova-compute nodes.
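To double-check that libvirt has the secret, it can be queried through the libvirt Python bindings (a minimal sketch; it assumes the bindings are installed on the compute node and that nova uses the qemu:///system URI):
import libvirt

SECRET_UUID = '57ac147c-199b-4b1c-a3d8-70be795c4d07'

conn = libvirt.open('qemu:///system')
try:
    secret = conn.secretLookupByUUIDString(SECRET_UUID)
    # usageID() should print "client.cinder-maqi-kilo secret",
    # i.e. the <name> element from secret.xml above.
    print(secret.UUIDString(), secret.usageID())
finally:
    conn.close()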
7. Configure glance
/etc/glance/glance-api.conf
[DEFAULT]
default_store = rbd
show_image_direct_url = True
[glance_store]
stores = rbd
rbd_store_pool = images-maqi-kilo
rbd_store_user = glance-maqi-kilo
rbd_store_ceph_conf = /etc/ceph/ceph.conf
rbd_store_chunk_size = 8
[paste_deploy]
flavor = keystone
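Before restarting glance-api, the glance credentials and the images pool can be checked from Python by pointing librados at the keyring copied in step 6 (a sketch; all names are the ones used above):
import rados
import rbd

# Open the images pool as the glance client, passing the keyring path explicitly.
cluster = rados.Rados(name='client.glance-maqi-kilo',
                      conffile='/etc/ceph/ceph.conf',
                      conf=dict(keyring='/etc/ceph/ceph.client.glance.keyring'))
cluster.connect()
ioctx = cluster.open_ioctx('images-maqi-kilo')
print(rbd.RBD().list(ioctx))   # names of images glance has stored, if any
ioctx.close()
cluster.shutdown()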
8. Configure cinder
/etc/cinder/cinder.conf
[DEFAULT]
enabled_backends = ceph
glance_api_version = 2
backup_driver = cinder.backup.drivers.ceph
backup_ceph_conf = /etc/ceph/ceph.conf
backup_ceph_user = cinder-backup-maqi-kilo
#backup_ceph_chunk_size = 134217728
backup_ceph_pool = backups-maqi-kilo
backup_ceph_stripe_unit = 0
backup_ceph_stripe_count = 0
restore_discard_excess_bytes = true
[ceph]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes-maqi-kilo
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_flatten_volume_from_snapshot = false
rbd_max_clone_depth = 5
#rbd_store_chunk_size = 4
#rados_connect_timeout = -1
rbd_user = cinder-maqi-kilo
rbd_secret_uuid = 57ac147c-199b-4b1c-a3d8-70be795c4d07
volume_backend_name = ceph
9. Configure nova
nova.conf
[libvirt]
images_rbd_ceph_conf = /etc/ceph/ceph.conf
images_rbd_pool = vms-maqi-kilo
images_type = rbd
disk_cachemodes = network=writeback
inject_key = false
rbd_secret_uuid = 57ac147c-199b-4b1c-a3d8-70be795c4d07
rbd_user = cinder-maqi-kilo
Problems
- After the configuration, none of the clients could connect to the Ceph cluster. For example, cinder-volume reported:
2015-11-15 11:45:15.998 21172 ERROR cinder.volume.drivers.rbd [req-b016ccb5-1544-4b5d-9ec9-1431457f4679 - - - - -] Error connecting to ceph cluster.
2015-11-15 11:45:15.998 21172 TRACE cinder.volume.drivers.rbd Traceback (most recent call last):
2015-11-15 11:45:15.998 21172 TRACE cinder.volume.drivers.rbd File "/home/openstack/workspace/cinder/cinder/volume/drivers/rbd.py", line 314, in _connect_to_rados
2015-11-15 11:45:15.998 21172 TRACE cinder.volume.drivers.rbd client.connect()
2015-11-15 11:45:15.998 21172 TRACE cinder.volume.drivers.rbd File "/usr/lib/python2.7/dist-packages/rados.py", line 417, in connect
2015-11-15 11:45:15.998 21172 TRACE cinder.volume.drivers.rbd raise make_ex(ret, "error calling connect")
2015-11-15 11:45:15.998 21172 TRACE cinder.volume.drivers.rbd ObjectNotFound: error calling connect
2015-11-15 11:45:15.998 21172 TRACE cinder.volume.drivers.rbd
2015-11-15 11:45:15.999 21165 DEBUG cinder.openstack.common.service [req-b3ad30cb-853a-4d68-846d-d5500f8cd0dc - - - - -] glance_ca_certificates_file = None log_opt_values /usr/local/lib/python2.7/dist-packages/oslo_config/cfg.py:2187
2015-11-15 11:45:16.000 21172 ERROR cinder.volume.manager [req-b016ccb5-1544-4b5d-9ec9-1431457f4679 - - - - -] Error encountered during initialization of driver: RBDDriver
。。。。。。
。。。。。。
2015-11-15 11:45:16.000 21172 ERROR cinder.volume.manager [req-b016ccb5-1544-4b5d-9ec9-1431457f4679 - - - - -] Bad or unexpected response from the storage volume backend API: Error connecting to ceph cluster.
2015-11-15 11:45:16.000 21172 TRACE cinder.volume.manager Traceback (most recent call last):
2015-11-15 11:45:16.000 21172 TRACE cinder.volume.manager File "/home/openstack/workspace/cinder/cinder/volume/manager.py", line 302, in init_host
2015-11-15 11:45:16.000 21172 TRACE cinder.volume.manager self.driver.check_for_setup_error()
2015-11-15 11:45:16.000 21172 TRACE cinder.volume.manager File "/usr/local/lib/python2.7/dist-packages/osprofiler/profiler.py", line 105, in wrapper
2015-11-15 11:45:16.000 21172 TRACE cinder.volume.manager return f(*args, **kwargs)
2015-11-15 11:45:16.000 21172 TRACE cinder.volume.manager File "/home/openstack/workspace/cinder/cinder/volume/drivers/rbd.py", line 287, in check_for_setup_error
2015-11-15 11:45:16.000 21172 TRACE cinder.volume.manager with RADOSClient(self):
2015-11-15 11:45:16.000 21172 TRACE cinder.volume.manager File "/home/openstack/workspace/cinder/cinder/volume/drivers/rbd.py", line 242, in __init__
2015-11-15 11:45:16.000 21172 TRACE cinder.volume.manager self.cluster, self.ioctx = driver._connect_to_rados(pool)
2015-11-15 11:45:16.000 21172 TRACE cinder.volume.manager File "/home/openstack/workspace/cinder/cinder/volume/drivers/rbd.py", line 322, in _connect_to_rados
2015-11-15 11:45:16.000 21172 TRACE cinder.volume.manager raise exception.VolumeBackendAPIException(data=msg)
2015-11-15 11:45:16.000 21172 TRACE cinder.volume.manager VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: Error connecting to ceph cluster.
Analysis:
cinder connects to the Ceph cluster through librados. Before digging into how that library is used, first check whether the existing configuration files allow a connection from the command line:
admin@maqi-kilo:/etc/ceph|⇒ ls
ceph.client.admin.keyring ceph.client.cinder-backup.keyring ceph.client.cinder.keyring ceph.client.glance.keyring ceph.conf
admin@maqi-kilo:/etc/ceph|⇒ ceph osd lspools
0 rbd,1 volumes-maqi-kilo,2 backups-maqi-kilo,3 images-maqi-kilo,4 vms-maqi-kilo,
admin@maqi-kilo:/etc/ceph|⇒ rbd ls volumes-maqi-kilo --keyring ceph.client.cinder.keyring --id cinder-maqi-kilo
No errors, so these settings should also work for cinder-volume.
cinder.conf looks fine as well:
admin@maqi-kilo:/etc/ceph|⇒ grep cinder-maqi-kilo /etc/cinder/cinder.conf
rbd_user = cinder-maqi-kilo
admin@maqi-kilo:/etc/ceph|⇒ grep volumes-maqi-kilo /etc/cinder/cinder.conf
rbd_pool = volumes-maqi-kilo
The odd part: when cinder-volume starts, a packet capture on ceph-mon shows no packets from cinder-volume at all, yet packets do appear when running ceph osd lspools directly on the cinder-volume node. It looks as if cinder-volume never picks up the mon IP address from ceph.conf?
Try connecting to the Ceph cluster with rados.py/rbd.py (see the documentation):
>>> import rados
>>> import rbd
>>> cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>>> cluster.connect()
>>> ioctx = cluster.open_ioctx('rbd')
>>> rbd_inst = rbd.RBD()
>>> size = 4 * 1024**3
>>> rbd_inst.create(ioctx, 'testimage', size)
>>> rbd_inst.create(ioctx, 'testimage2', size)
That works too:
admin@maqi-kilo:/etc/ceph|⇒ rbd ls
myimage
testimage
testimage2
Update 2015/11/16:
The cause is now roughly clear: cinder-volume connects to Ceph as the client cinder-maqi-kilo, but no keyring file with a matching name exists under /etc/ceph/. By default librados looks for /etc/ceph/$cluster.$name.keyring, i.e. /etc/ceph/ceph.client.cinder-maqi-kilo.keyring here, while the earlier step named the file ceph.client.cinder.keyring.
Looking at the Rados class's initialization code: if no client name is given, the default client.admin is used, and /etc/ceph/ceph.client.admin.keyring does exist. If that keyring is moved away, the connection fails as well:
>>> cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>>> cluster.connect()
Traceback (most recent call last):
File "" , line 1, in
File "/usr/lib/python2.7/dist-packages/rados.py", line 417, in connect
raise make_ex(ret, "error calling connect")
rados.ObjectNotFound: error calling connect
The final fix:
Make a copy of the keyring used by cinder-volume under the name librados expects:
admin@maqi-kilo:/etc/ceph|⇒ cp -p ceph.client.cinder.keyring ceph.client.cinder-maqi-kilo.keyring
admin@maqi-kilo:/etc/ceph|⇒ ll
total 24K
-rw-r--r-- 1 root root 66 Nov 15 10:44 ceph.client.admin.keyring
-rw-r--r-- 1 admin admin 81 Nov 15 10:52 ceph.client.cinder-backup.keyring
-rw-r--r-- 1 admin admin 74 Nov 15 10:51 ceph.client.cinder.keyring
-rw-r--r-- 1 admin admin 74 Nov 15 10:51 ceph.client.cinder-maqi-kilo.keyring
-rw-r--r-- 1 admin admin 74 Nov 15 10:53 ceph.client.glance.keyring
-rw-r--r-- 1 root root 289 Nov 15 10:39 ceph.conf
>>> cluster = rados.Rados(name='client.cinder-maqi-kilo', conffile='')
>>> cluster.connect()
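Alternatively, when testing from Python the keyring path can be passed explicitly instead of copying the file (a sketch; the Kilo RBD driver builds its Rados client from rbd_user and rbd_ceph_conf only, so for cinder-volume itself the copy above remains the simpler fix):
>>> import rados
>>> cluster = rados.Rados(name='client.cinder-maqi-kilo',
...                       conffile='/etc/ceph/ceph.conf',
...                       conf=dict(keyring='/etc/ceph/ceph.client.cinder.keyring'))
>>> cluster.connect()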