The Ceph installation itself is not covered here; for a walkthrough see: http://www.vpsee.com/2015/07/install-ceph-on-centos-7/
Official Ceph installation docs: http://docs.ceph.com/ceph-deploy/docs/install.html
Common Ceph commands: http://zhanguo1110.blog.51cto.com/5750817/1543032
Ceph operations handbook: https://lihaijing.gitbooks.io/ceph-handbook/
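Ceph has a handy command that shows the configuration currently in effect (run it on a monitor node; it queries the daemon through its admin socket):
ceph daemon /var/run/ceph/ceph-mon*.asok config show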
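When adding OSDs to expand the cluster, you can set the following flags first so that data movement does not destabilize the cluster:
ceph osd set nobackfill; ceph osd set norebalance; ceph osd set norecover
To undo this afterwards, unset the same flags:
ceph osd unset nobackfill; ceph osd unset norebalance; ceph osd unset norecover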
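The set/unset pair can be wrapped in a small helper so that both directions always use the same list of flags. This is only a sketch: the function name osd_flags and the DRY_RUN variable are conventions invented for this example, not part of the ceph CLI.

```shell
#!/usr/bin/env bash
# Flags taken from the commands above.
FLAGS="nobackfill norebalance norecover"

# osd_flags ACTION: ACTION is "set" or "unset".
# With DRY_RUN=1 the commands are printed instead of executed
# (DRY_RUN is a convention of this sketch, not a ceph option).
osd_flags() {
    local action=$1 flag
    case "$action" in
        set|unset) ;;
        *) echo "usage: osd_flags set|unset" >&2; return 1 ;;
    esac
    for flag in $FLAGS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ceph osd $action $flag"
        else
            ceph osd "$action" "$flag"
        fi
    done
}

# Usage:
#   osd_flags set      # before adding OSDs
#   osd_flags unset    # once the new OSDs are in place
```

Running it with DRY_RUN=1 first is a cheap way to confirm which commands would be sent to the cluster.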
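Installing the Ceph client
# Create a pool (choose the PG count to suit your cluster; a ceph-deploy install creates an rbd pool for you automatically)
# Here glance, cinder and nova share one pool; a production environment would normally use separate pools (see the summary)
[root@ceph01 ~(keystone_admin)]# ceph osd pool create rbd 128

# Install the Ceph client packages on the glance-api, nova-compute, cinder-backup and cinder-volume nodes
[root@ceph01 ~(keystone_admin)]# yum install python-rbd ceph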
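The PG count passed to "ceph osd pool create" depends on the cluster. A commonly quoted rule of thumb is (number of OSDs x 100) / replica count, rounded up to the next power of two, as the total for the cluster. The sketch below only does that arithmetic; the function name suggest_pg_num is made up for this example, and the result is a starting point to check against the Ceph placement group guidance, not a definitive value.

```shell
#!/usr/bin/env bash
# suggest_pg_num OSDS REPLICAS
# Rule of thumb (an assumption of this sketch): OSDS * 100 / REPLICAS,
# rounded up to the next power of two.
suggest_pg_num() {
    local osds=$1 replicas=$2
    local target=$(( osds * 100 / replicas ))
    local pg=1
    while [ "$pg" -lt "$target" ]; do
        pg=$(( pg * 2 ))
    done
    echo "$pg"
}

# Usage: a 3-OSD cluster with 3 replicas
#   suggest_pg_num 3 3
```

For a small cluster with 3 OSDs and 3 replicas this yields 128, the value used in the pool creation command above.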
Setting up Ceph client authentication
# Create the Ceph user; here glance, cinder and nova share one Ceph user
# The official docs recommend creating separate users for nova, cinder and glance
[root@ceph01 ~(keystone_admin)]# ceph auth get-or-create client.rbd mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=rbd'

# Show the keyring generated for the rbd user
[root@ceph01 ~(keystone_admin)]# ceph auth get-or-create client.rbd
[client.rbd]
key = AQBKGHBWzJCYORAAABHki+tWoOFgiTZL8FNnaA==

# Run the next two steps on the glance-api, cinder-volume, cinder-backup and nova-compute nodes
# Create the keyring file with the following content
[root@ceph01 ~(keystone_admin)]# vim /etc/ceph/ceph.client.rbd.keyring
[client.rbd]
key = AQBKGHBWzJCYORAAABHki+tWoOFgiTZL8FNnaA==

# Because nova, cinder and glance share one user, the file mode is set to 777
# (the services only need to read the file, so a tighter mode such as 644 should also work)
[root@ceph01 ~(keystone_admin)]# ll /etc/ceph/ceph.client.rbd.keyring
-rwxrwxrwx 1 root root 61 Dec 15 21:44 /etc/ceph/ceph.client.rbd.keyring

# Configure the libvirt secret: the libvirt process needs the cinder keyring (here, client.rbd)
# so that it can access the Ceph cluster and attach block devices
# Use tee to create a temporary copy of the key; repeat for every compute node
[root@ceph01 ~(keystone_admin)]# ceph auth get-key client.rbd | ssh {your-compute-node} tee client.rbd.key

# Run the following on every compute node
# Generate a random UUID
[root@ceph01 ~(keystone_admin)]# uuidgen
aa03e7e8-6fcc-443f-94aa-ac169bfd0fd5

cat > secret.xml <<EOF
<secret ephemeral='no' private='no'>
  <uuid>aa03e7e8-6fcc-443f-94aa-ac169bfd0fd5</uuid>
  <usage type='ceph'>
    <name>client.rbd secret</name>
  </usage>
</secret>
EOF

sudo virsh secret-define --file secret.xml
Secret aa03e7e8-6fcc-443f-94aa-ac169bfd0fd5 created
sudo virsh secret-set-value --secret aa03e7e8-6fcc-443f-94aa-ac169bfd0fd5 --base64 $(cat client.rbd.key) && rm client.rbd.key secret.xml

# The UUID does not actually have to be identical on every compute node; keeping it the same is purely for platform consistency.
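Because the same secret definition is needed on every compute node, it can help to generate the XML from a function rather than retyping it. The sketch below assumes the usual libvirt layout for a Ceph secret (a secret element containing a uuid and a usage of type ceph); the function name make_secret_xml is hypothetical.

```shell
#!/usr/bin/env bash
# make_secret_xml UUID NAME
# Print a libvirt secret definition for a Ceph key on stdout.
# The element layout is the usual libvirt form for type='ceph' secrets.
make_secret_xml() {
    local uuid=$1 name=$2
    cat <<EOF
<secret ephemeral='no' private='no'>
  <uuid>${uuid}</uuid>
  <usage type='ceph'>
    <name>${name}</name>
  </usage>
</secret>
EOF
}

# Usage on a compute node, with the UUID and user from this article:
#   make_secret_xml aa03e7e8-6fcc-443f-94aa-ac169bfd0fd5 'client.rbd secret' > secret.xml
#   sudo virsh secret-define --file secret.xml
```

Writing to stdout keeps the function free of side effects, so the output can be inspected before it is handed to virsh.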
OpenStack RBD configuration
# glance RBD configuration
[root@ceph01 ~(keystone_admin)]# vim /etc/glance/glance-api.conf
[DEFAULT]
# enable copy-on-write cloning of images
show_image_direct_url = True

[glance_store]
default_store = rbd
stores = rbd
filesystem_store_datadir=/var/lib/glance/images/
rbd_store_pool = rbd
rbd_store_user = rbd
rbd_store_ceph_conf = /etc/ceph/ceph.conf
rbd_store_chunk_size = 8

[paste_deploy]
# disable glance cache management; if yours is flavor = keystone+cachemanagement, change it
flavor = keystone

# cinder RBD configuration
[root@ceph01 ~(keystone_admin)]# vim /etc/cinder/cinder.conf
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = rbd
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_flatten_volume_from_snapshot = false
rbd_max_clone_depth = 5
rbd_store_chunk_size = 4
rados_connect_timeout = -1
# the official docs say this is required if you configure multiple cinder back ends (it must then be in the [DEFAULT] section)
glance_api_version = 2
rbd_user = rbd
rbd_secret_uuid = aa03e7e8-6fcc-443f-94aa-ac169bfd0fd5

# nova RBD configuration
[libvirt]
# file injection is not supported when an instance boots from a volume, so disable all three
inject_password = False
inject_key = False
inject_partition = -2
virt_type = kvm
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST
images_type = rbd
images_rbd_pool = rbd
images_rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = rbd
rbd_secret_uuid = aa03e7e8-6fcc-443f-94aa-ac169bfd0fd5
disk_cachemodes="network=writeback"

# On every compute node, edit the Ceph configuration file
[root@ceph01 ~(keystone_admin)]# vim /etc/ceph/ceph.conf
# enabling the admin socket helps with troubleshooting
[client]
rbd cache = true
rbd cache writethrough until flush = true
admin socket = /var/run/ceph/guests/$cluster-$type.$id.$pid.$cctid.asok
log file = /var/log/qemu/qemu-guest-$pid.log
rbd concurrent management ops = 20

[root@ceph01 ~(keystone_admin)]# mkdir -p /var/run/ceph/guests/ /var/log/qemu/
[root@ceph01 ~(keystone_admin)]# chown qemu:qemu /var/run/ceph/guests /var/log/qemu/

OpenStack configuration best practices (reposted from: http://www.wzxue.com/openstack-ceph-kilo/)
Ceph.conf:
[client]
rbd cache = true
rbd cache writethrough until flush = true
rbd concurrent management ops = 20
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
log file = {{ rbd_client_log_file }}

GLANCE
Disable local cache:
s/flavor = keystone+cachemanagement/flavor = keystone/
Expose images URL:
show_image_direct_url = True
Image properties:
hw_scsi_model=virtio-scsi # for discard and perf
hw_disk_bus=scsi

Nova:
hw_disk_discard = unmap # enable discard support (be careful of perf)
inject_password = false # disable password injection
inject_key = false # disable key injection
inject_partition = -2 # disable partition injection
disk_cachemodes = "network=writeback" # make QEMU aware so caching works
live_migration_flag="VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST"

Cinder:
glance_api_version = 2

# Finally, restart the services
[root@ceph01 ~(keystone_admin)]# service openstack-glance-api restart
[root@ceph01 ~(keystone_admin)]# service openstack-nova-compute restart
[root@ceph01 ~(keystone_admin)]# service openstack-cinder-volume restart
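The "disable local cache" substitution listed under GLANCE can be applied with sed. The sketch below wraps it in a function so it can be tried on a copy of the file first; the function name disable_cachemanagement is invented for this example, and it assumes the flavor line looks like the one quoted above.

```shell
#!/usr/bin/env bash
# disable_cachemanagement FILE
# Rewrite "flavor = keystone+cachemanagement" to "flavor = keystone".
# Goes through a temporary file instead of "sed -i" so that it does not
# depend on GNU sed; the original file keeps its owner and mode.
disable_cachemanagement() {
    local conf=$1 tmp
    tmp=$(mktemp) || return 1
    sed 's/^flavor *= *keystone+cachemanagement *$/flavor = keystone/' "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"
    local rc=$?
    rm -f "$tmp"
    return $rc
}

# Usage (try it on a copy before touching the live file):
#   cp /etc/glance/glance-api.conf /tmp/glance-api.conf
#   disable_cachemanagement /tmp/glance-api.conf
#   diff /etc/glance/glance-api.conf /tmp/glance-api.conf
```

The glance-api service still has to be restarted afterwards for the change to take effect.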
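Troubleshooting
About ceph-disk: ceph-disk installs udev rules to deal with drive-letter drift (device names changing between boots or after hot-plugging).
Leftover xfs kernel threads after pulling a disk: summary and fixes
I. Stop the OSD service and umount the OSD mount directory first, then pull the disk (no xfs threads are left behind)
1. Check the OSD partition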
[root@controller-21 openstack-deploy]# df -h
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/os-root       100G   15G   86G  15% /
devtmpfs                  7.8G     0  7.8G   0% /dev
tmpfs                     7.8G   71M  7.7G   1% /dev/shm
tmpfs                     7.8G  430M  7.4G   6% /run
tmpfs                     7.8G     0  7.8G   0% /sys/fs/cgroup
/dev/mapper/os-glusterfs  823G   33M  823G   1% /gfs
/dev/sda2                 197M  127M   71M  65% /boot
tmpfs                     1.6G     0  1.6G   0% /run/user/993
tmpfs                     1.6G     0  1.6G   0% /run/user/0
/dev/sdb1                 1.9T  6.8G  1.9T   1% /var/lib/ceph/osd/ceph-0
2. Check the xfs threads for sdb1
[root@controller-21 openstack-deploy]# ps axu | grep sdb1
root 26705 0.0 0.0 0 0 ? S< 11:42 0:00 [xfs-buf/sdb1]
root 26706 0.0 0.0 0 0 ? S< 11:42 0:00 [xfs-data/sdb1]
root 26707 0.0 0.0 0 0 ? S< 11:42 0:00 [xfs-conv/sdb1]
root 26708 0.0 0.0 0 0 ? S< 11:42 0:00 [xfs-cil/sdb1]
root 26709 0.0 0.0 0 0 ? S 11:42 0:01 [xfsaild/sdb1]
3. Stop the osd.0 process
[root@controller-21 openstack-deploy]# /etc/init.d/ceph stop osd.0
=== osd.0 ===
Stopping Ceph osd.0 on controller-21...kill 26952...kill 26952...done
4. As long as the osd.0 mount directory has not been unmounted, the xfs threads are still there
[root@controller-21 openstack-deploy]# ps axu | grep sdb1
root 26705 0.0 0.0 0 0 ? S< 11:42 0:00 [xfs-buf/sdb1]
root 26706 0.0 0.0 0 0 ? S< 11:42 0:00 [xfs-data/sdb1]
root 26707 0.0 0.0 0 0 ? S< 11:42 0:00 [xfs-conv/sdb1]
root 26708 0.0 0.0 0 0 ? S< 11:42 0:00 [xfs-cil/sdb1]
root 26709 0.0 0.0 0 0 ? S 11:42 0:01 [xfsaild/sdb1]
root 27797 0.0 0.0 112652 968 pts/0 S+ 12:48 0:00 grep --color=auto sdb1
5. umount the osd.0 mount directory
[root@controller-21 openstack-deploy]# umount /var/lib/ceph/osd/ceph-0
6. The xfs threads are gone
[root@controller-21 openstack-deploy]# ps axu | grep sdb1
root 27846 0.0 0.0 112648 964 pts/0 S+ 12:48 0:00 grep --color=auto sdb1
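II. Pull the disk directly without stopping the OSD service (xfs threads are left behind)
How to clear the leftover xfs threads:
1. umount /var/lib/ceph/osd/ceph-0 (if the umount fails, go on to step 2)
2. /etc/init.d/ceph restart osd (restart the OSD service)
3. ps aux | grep sdb (if xfs threads are still left, go on to step 4)
4. systemctl restart systemd-udevd (restart udevd; the xfs threads disappear)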
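The "ps | grep sdb" checks above also match the grep process itself, which makes it easy to misread the result. A small filter that matches only the bracketed kernel thread names avoids that. This is a sketch; the function name xfs_threads is made up here, and it assumes the threads are named as in the listings above (for example [xfs-buf/sdb1] and [xfsaild/sdb1]).

```shell
#!/usr/bin/env bash
# xfs_threads DEV
# Read "ps axu" output on stdin and print only the lines for XFS kernel
# threads bound to partition DEV (for example sdb1). Matching on the
# bracketed thread name keeps the grep process itself out of the result.
xfs_threads() {
    grep -E "\[xfs[^]]*/$1\]"
}

# Usage:
#   ps axu | xfs_threads sdb1
# No output means no xfs threads are left for that partition.
```

An empty result after the umount in step 1 means steps 2 to 4 are not needed.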
Bringing the pulled disk back as an OSD:
1. ceph-disk activate /dev/sdb1 (the OSD data partition)
2. ceph -s (check the cluster status)
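Summary
1. Typically nova and glance share one pool, and cinder uses a pool of its own.
2. A snapshot of a nova instance booted from a volume is actually an RBD snapshot.
3. A snapshot of a nova instance booted from an image is still taken the traditional way; the latest release, Mitaka, already supports RBD instance snapshots.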
ref: https://review.openstack.org/#/c/205282/42
ref: https://review.openstack.org/#/c/188244/
Useful commands
1. QEMU access to RBD: http://docs.ceph.com/docs/master/rbd/qemu-rbd/
2. Show how much space a Ceph block device actually occupies
rbd diff volumes/volume-19cc992e-d66d-4141-a05b-5b12ab74727b | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'
3. Test whether a user can connect to a Ceph pool
rbd -c /etc/ceph/ceph.conf -p volumes --id cinder --keyring /etc/ceph/ceph.client.cinder.keyring ls
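The awk summation from item 2 can be kept as a reusable filter. The sketch below reads "rbd diff" output on stdin, assuming the columns are offset, length and type as in the one-liner above, and skips any line whose second field is not a number (such as a header). The function name used_mb is hypothetical.

```shell
#!/usr/bin/env bash
# used_mb
# Sum the length column (field 2, in bytes) of "rbd diff" output read
# from stdin and print the total in MB with one decimal place.
# Lines whose second field is not a plain number are ignored.
used_mb() {
    awk '$2 ~ /^[0-9]+$/ { sum += $2 }
         END { printf "%.1f MB\n", sum / 1024 / 1024 }'
}

# Usage, with the volume name from item 2:
#   rbd diff volumes/volume-19cc992e-d66d-4141-a05b-5b12ab74727b | used_mb
```

Reading from stdin means the same filter works for any pool or image without changes.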
References
http://docs.ceph.com/docs/master/rbd/rbd-openstack/
http://my.oschina.net/JerryBaby/blog/376580?fromerr=wNPJrqPP#OSC_h2_1
http://bbs.ceph.org.cn/question/363 (how to deal with OSD full/nearfull)