Some Applications of Ceph in OpenStack

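Ceph is a distributed storage system with ten years of development behind it, and it already runs in many production systems. Serving as the storage backend for OpenStack is currently one of its most popular and most mature use cases. After using it for a while, I have written up a short summary based on my own environment.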

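In my current OpenStack environment, Ceph backs only Cinder and part of Nova. Depending on requirements, some virtual machines run on the local file system and others live in Ceph. Data volumes are all served by Ceph as block storage.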

Integrating with OpenStack

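The official Ceph site already documents the integration thoroughly, so I won't go into detail here and will simply post my own steps and configuration.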

  • Install the Ceph client
$ yum install -y centos-release-ceph-jewel
$ yum clean all && yum makecache
$ yum install ceph python-rbd
  • Configure client authentication
$ ceph auth get-or-create client.openstack mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=instances' -o /etc/ceph/ceph.client.openstack.keyring

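Here I grant everything to a single openstack account; the official recommendation is to create one account per service. Copy the two files ceph.conf and ceph.client.openstack.keyring to the /etc/ceph directory on every nova-compute and cinder-volume node that needs to connect to Ceph.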
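
The copy step can be sketched as a small loop. The host names below are hypothetical placeholders, not hosts from this article, and the function only prints the scp commands so they can be reviewed before being run.

```shell
# Print the scp commands that would copy the Ceph client files to each node.
# Arguments: one or more host names (placeholders in this example).
print_copy_commands() {
    for node in "$@"; do
        echo "scp /etc/ceph/ceph.conf /etc/ceph/ceph.client.openstack.keyring ${node}:/etc/ceph/"
    done
}

print_copy_commands compute01 cinder01
```

Piping the output to `sh` would run the copies, assuming SSH access to each node is already set up.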

  • Configure libvirt

For libvirt, see my earlier article "Ceph in Libvirt and Kubernetes". To keep the platform consistent, it is best to use the same secret UUID on all compute nodes.
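
As a rough sketch of what keeping the UUID in sync involves, the function below renders the same libvirt secret definition for every node. The function name is made up for illustration; the XML layout and the virsh commands in the comments follow the procedure in the Ceph documentation, and the user name assumes the client.openstack account created earlier.

```shell
# Print a libvirt secret definition for a Ceph client key.
# $1 = secret UUID, $2 = Ceph user name (without the "client." prefix)
render_ceph_secret_xml() {
    cat <<EOF
<secret ephemeral='no' private='no'>
  <uuid>$1</uuid>
  <usage type='ceph'>
    <name>client.$2 secret</name>
  </usage>
</secret>
EOF
}

render_ceph_secret_xml 881e72de-961f-4556-9e8c-0b909408186b openstack

# On each compute node, the rendered file would then be loaded with:
#   virsh secret-define --file secret.xml
#   virsh secret-set-value --secret 881e72de-961f-4556-9e8c-0b909408186b \
#       --base64 "$(ceph auth get-key client.openstack)"
```

Because the UUID is passed in rather than generated, every node ends up with an identical secret, which is what the rbd_secret_uuid settings later rely on.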

Configuring Cinder

cinder

Append the following configuration to the end of cinder.conf

[ceph]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_flatten_volume_from_snapshot = false
rbd_max_clone_depth = 5
rbd_store_chunk_size = 4
rados_connect_timeout = -1
glance_api_version = 2
volume_clear_size = 100
rbd_user = openstack
# UUID of the libvirt secret
rbd_secret_uuid = 881e72de-961f-4556-9e8c-0b909408186b

Also change the following in the [DEFAULT] section of cinder.conf

enabled_backends = ceph

nova

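To attach Cinder block devices (either ordinary block devices or boot volumes), Nova must be told which user and UUID to use when attaching the device. libvirt uses this user to connect to and authenticate with the Ceph cluster.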

rbd_user = openstack
rbd_secret_uuid = 881e72de-961f-4556-9e8c-0b909408186b
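
Since the same UUID has to appear in both services' configuration, a quick consistency check can catch copy-and-paste mistakes. This is a minimal sketch: the helper names are invented for illustration, the parser ignores section headers, and it does not strip trailing inline comments.

```shell
# Print the value of a "key = value" setting from a config file.
# If the key appears more than once, the last occurrence wins.
# $1 = file, $2 = key
get_conf_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Succeed only if both files define the same, non-empty rbd_secret_uuid.
# $1 = first file, $2 = second file
same_secret_uuid() {
    uuid_a=$(get_conf_value "$1" rbd_secret_uuid)
    uuid_b=$(get_conf_value "$2" rbd_secret_uuid)
    [ -n "$uuid_a" ] && [ "$uuid_a" = "$uuid_b" ]
}

# Example (these are the usual default paths and may differ on your system):
#   same_secret_uuid /etc/cinder/cinder.conf /etc/nova/nova.conf && echo "UUIDs match"
```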

Configuring Nova

ceph

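Enable the RBD cache and the admin socket in the client configuration file ceph.conf. This helps a great deal with troubleshooting: giving each virtual machine that uses a Ceph block device its own socket makes it easier to investigate performance problems and abnormal behavior.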

[client]
        rbd cache = true
        rbd cache writethrough until flush = true
        admin socket = /var/run/ceph/guests/$cluster-$type.$id.$pid.$cctid.asok
        log file = /var/log/qemu/qemu-guest-$pid.log
        rbd concurrent management ops = 20

Adjust the directory permissions

mkdir -p /var/run/ceph/guests/ /var/log/qemu/
chown qemu:libvirtd /var/run/ceph/guests /var/log/qemu/
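
Once guests are running, each one leaves a socket in that directory. A minimal sketch for finding them is shown below; the function name is made up for illustration, and the `ceph --admin-daemon` calls in the comments are the standard way to query a socket.

```shell
# List the admin sockets that guests have created in a directory.
# $1 = socket directory (defaults to the path used in ceph.conf above)
list_guest_sockets() {
    for sock in "${1:-/var/run/ceph/guests}"/*.asok; do
        if [ -e "$sock" ]; then
            echo "$sock"
        fi
    done
}

# Each socket can then be queried, for example:
#   ceph --admin-daemon /var/run/ceph/guests/<socket>.asok perf dump
#   ceph --admin-daemon /var/run/ceph/guests/<socket>.asok config show
list_guest_sockets
```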

nova

Adjust the following settings in the [libvirt] section of nova.conf

# Use the following flags so that live migration works properly
live_migration_flag="VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST,VIR_MIGRATE_TUNNELLED"
# Disable file injection
inject_password = false
inject_key = false
inject_partition = -2
disk_cachemodes ="network=writeback"
images_type=rbd
images_rbd_pool=instances
images_rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user=openstack
rbd_secret_uuid=881e72de-961f-4556-9e8c-0b909408186b

Usage

  • Create a Cinder volume from an image
usage: cinder create [--consisgroup-id ]
                     [--snapshot-id ]
                     [--source-volid ]
                     [--source-replica ]
                     [--image-id ] [--image ] [--name ]
                     [--description ]
                     [--volume-type ]
                     [--availability-zone ]
                     [--metadata [ [ ...]]]
                     [--hint ] [--allow-multiattach]
                     []


$ cinder create --image-id a02c0829-b198-4650-a9c6-7cc6b0b94018 --name jcloud.v1.0 60 
+--------------------------------------+-------------+--------------------+------+-------------+----------+--------------------------------------+
|                  ID                  |    Status   |        Name        | Size | Volume Type | Bootable |             Attached to              |
+--------------------------------------+-------------+--------------------+------+-------------+----------+--------------------------------------+
| ed63922f-f88c-4dab-9e8d-9670db8ee7b2 | downloading |    jcloud.v1.0     |  60  |      -      |  false   |                                      |
+--------------------------------------+-------------+--------------------+------+-------------+----------+--------------------------------------+
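As shown, the newly created volume first downloads the image from Glance and imports it into an RBD image. Once the import finishes, the volume becomes bootable, as shown below: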
+--------------------------------------+-----------+--------------------+------+-------------+----------+--------------------------------------+
|                  ID                  |   Status  |        Name        | Size | Volume Type | Bootable |             Attached to              |
+--------------------------------------+-----------+--------------------+------+-------------+----------+--------------------------------------+
| ed63922f-f88c-4dab-9e8d-9670db8ee7b2 | available |    jcloud.v1.0     |  60  |      -      |   true   |                                      |
+--------------------------------------+-----------+--------------------+------+-------------+----------+--------------------------------------+
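
To wait for the downloading to available transition in a script, the Status column can be pulled out of the table. This is a sketch that assumes the table layout shown above, with ID in the first column and Status in the second; the function name is made up for illustration.

```shell
# Read `cinder list` style table output on stdin and print the Status
# column of the row whose ID column contains $1.
volume_status() {
    awk -F'|' -v id="$1" '
        index($2, id) > 0 {
            status = $3
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            print status
        }'
}

# Example with the row shown above:
echo "| ed63922f-f88c-4dab-9e8d-9670db8ee7b2 | available |    jcloud.v1.0     |  60  |      -      |   true   |                                      |" \
    | volume_status ed63922f-f88c-4dab-9e8d-9670db8ee7b2
```

In practice the input would come from `cinder list`, polled in a loop until the function prints available.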
  • Take a snapshot of the image volume
usage: cinder snapshot-create [--force []] [--name ]
                              [--description ]
                              [--metadata [ [ ...]]]
                              

$  cinder snapshot-create --name jcloud.v1.0_snapshot ed63922f-f88c-4dab-9e8d-9670db8ee7b2
+-------------+--------------------------------------+
|   Property  |                Value                 |
+-------------+--------------------------------------+
|  created_at |      2017-03-22T07:52:28.887130      |
| description |                 None                 |
|      id     | 6be98fec-bfee-4550-99a6-358e2b8b6609 |
|   metadata  |                  {}                  |
|     name    |         jcloud.v1.0_snapshot         |
|     size    |                  60                  |
|    status   |               creating               |
|  updated_at |                 None                 |
|  volume_id  | ed63922f-f88c-4dab-9e8d-9670db8ee7b2 |
+-------------+--------------------------------------+

  • Create a virtual machine from the snapshot
usage: nova boot [--flavor ] [--image ]
                 [--image-with ] [--boot-volume ]
                 [--snapshot ] [--min-count ]
                 [--max-count ] [--meta ]
                 [--file ] [--key-name ]
                 [--user-data ]
                 [--availability-zone ]
                 [--security-groups ]
                 [--block-device-mapping ]
                 [--block-device key1=value1[,key2=value2...]]
                 [--swap ]
                 [--ephemeral size=[,format=]]
                 [--hint ]
                 [--nic ]
                 [--config-drive ] [--poll] [--admin-pass ]
                 [--access-ip-v4 ] [--access-ip-v6 ]
                 

--block-device parameters

  • source=image|snapshot|volume|blank

  • dest=volume|local

  • id=XXXXXX (a volume|image|snapshot UUID if using source=volume|snapshot|image)

  • format=swap|ext4|...|none (to format the image/volume/ephemeral file; defaults to 'none' if omitted)

  • bus=ide|usb|virtio|scsi (hypervisor driver chooses a suitable default if omitted)

  • device=the desired device name (e.g. /dev/vda, /dev/xda, ...)

  • type=disk|cdrom|floppy|mmc (defaults to 'disk' if omitted)

  • bootindex=N (where N is any number >= 0, controls the order in which disks are looked at for booting)

  • size=NN (where NN is number of GB to create type=ephemeral image, or the size to re-size to for type=glance|cinder)

  • shutdown=preserve|remove

Of these, only source and id are required (id does not apply to source=blank); everything else has a default. For example:

  • --block-device source=image,dest=volume,id=XXXXXXX,bus=ide,bootindex=2

  • --block-device source=volume,dest=volume,id=XXXXXXX,bus=ide,type=cdrom,bootindex=1

  • --block-device source=blank,dest=local,format=swap,size=50,bus=ide,type=floppy

dest specifies the destination of the source: either local storage (local) or a Cinder volume (volume).

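dest   | source   | Description | Shortcut
volume | volume   | Attached directly to the compute node | Equivalent to --boot-volume when bootindex=0
volume | snapshot | Cinder creates a new volume from the snapshot and attaches it to the compute node | Equivalent to --snapshot when bootindex=0
volume | image    | Cinder creates a new volume from the image and attaches it to the compute node | Equivalent to --image when bootindex=0 (Boot from image (creates a new volume))
volume | blank    | Cinder creates an empty volume of the given size and attaches it to the compute node | -
local  | image    | An ephemeral partition is created on the hypervisor, the image is copied into it, and the virtual machine boots from it | Equivalent to an ordinary Boot from image
local  | blank    | A swap partition is created if format=swap; otherwise an ephemeral partition is created | Equivalent to --swap when bootindex=-1, shutdown=remove, format=swap; equivalent to --ephemeral when bootindex=-1, shutdown=remove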
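
The shortcut column can be written out as helpers that produce the long form. This is only a sketch of the first two rows of the table; the function names are invented for illustration.

```shell
# Build a --block-device value that boots from a new volume created
# from a Cinder snapshot (the long form of --snapshot).
# $1 = snapshot UUID
block_device_from_snapshot() {
    echo "source=snapshot,dest=volume,id=$1,bootindex=0"
}

# Build a --block-device value that boots from an existing volume
# (the long form of --boot-volume).
# $1 = volume UUID
block_device_from_volume() {
    echo "source=volume,dest=volume,id=$1,bootindex=0"
}

block_device_from_snapshot 6be98fec-bfee-4550-99a6-358e2b8b6609
```

The printed string is what would follow `--block-device` on the `nova boot` command line.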

Creating a virtual machine from the snapshot

nova boot --flavor ff29c42b-754d-4230-9e1f-9bdaba800f5e --snapshot 6be98fec-bfee-4550-99a6-358e2b8b6609 --security-groups default --nic net-id=163df3b0-13f2-4f2e-8401-e82088e8dc07 test

# or

nova boot --flavor ff29c42b-754d-4230-9e1f-9bdaba800f5e --block-device source=snapshot,dest=volume,id=6be98fec-bfee-4550-99a6-358e2b8b6609,bootindex=0 --security-groups default --nic net-id=163df3b0-13f2-4f2e-8401-e82088e8dc07  test

Tracking the virtual machine's state shows the following

# Mapping block devices
+--------------------------------------+-------+--------+----------------------+-------------+--------------------------+
| ID                                   | Name  | Status | Task State           | Power State | Networks                 |
+--------------------------------------+-------+--------+----------------------+-------------+--------------------------+
| 52369db0-f7db-4eb0-a708-d77e250e3ecc | test  | BUILD  | block_device_mapping | NOSTATE     | privite01=192.168.17.251 |
+--------------------------------------+-------+--------+----------------------+-------------+--------------------------+

# Spawning
+--------------------------------------+-------+--------+------------+-------------+--------------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks                 |
+--------------------------------------+-------+--------+------------+-------------+--------------------------+
| 52369db0-f7db-4eb0-a708-d77e250e3ecc | test  | BUILD  | spawning   | NOSTATE     | privite01=192.168.17.251 |
+--------------------------------------+-------+--------+------------+-------------+--------------------------+

# Booted successfully
+--------------------------------------+-------+--------+------------+-------------+--------------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks                 |
+--------------------------------------+-------+--------+------------+-------------+--------------------------+
| 52369db0-f7db-4eb0-a708-d77e250e3ecc | test  | ACTIVE | -          | Running     | privite01=192.168.17.251 |
+--------------------------------------+-------+--------+------------+-------------+--------------------------+

Once the virtual machine has been created from the snapshot, we can use the rbd command to check the state of the volumes in the Ceph pool.

  • First, look at how the Cinder volume is attached
+--------------------------------------+-----------+--------------------+------+-------------+----------+--------------------------------------+
|                  ID                  |   Status  |        Name        | Size | Volume Type | Bootable |             Attached to              |
+--------------------------------------+-----------+--------------------+------+-------------+----------+--------------------------------------+
| 7398e12b-333b-4610-b75c-e237d164781d |   in-use  |                    |  60  |      -      |   true   | 52369db0-f7db-4eb0-a708-d77e250e3ecc |
+--------------------------------------+-----------+--------------------+------+-------------+----------+--------------------------------------+

7398e12b-333b-4610-b75c-e237d164781d is the volume used as the virtual machine's system disk. With this UUID we can locate the corresponding RBD image in Ceph.
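
The mapping from a Cinder UUID to an RBD name follows a fixed pattern, which matches the rbd output in this article. The sketch below assumes Cinder's default name templates (volume-<UUID> and snapshot-<UUID>); the function names are made up for illustration.

```shell
# Name of the RBD image that backs a Cinder volume.
# $1 = pool name, $2 = Cinder volume UUID
rbd_image_for_volume() {
    echo "$1/volume-$2"
}

# Name of the RBD snapshot that backs a Cinder snapshot.
# $1 = pool name, $2 = source volume UUID, $3 = Cinder snapshot UUID
rbd_snap_for_snapshot() {
    echo "$1/volume-$2@snapshot-$3"
}

rbd_image_for_volume volumes 7398e12b-333b-4610-b75c-e237d164781d
```

The result can be passed directly to `rbd info`.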

  • Look up the system volume's information
# Find the RBD image by the volume's UUID
$ rbd ls volumes --name client.libvirt |grep 7398e12b-333b-4610-b75c-e237d164781d
volume-7398e12b-333b-4610-b75c-e237d164781d

# Show the image's details
$ rbd  info volumes/volume-7398e12b-333b-4610-b75c-e237d164781d --name client.libvirt
rbd image 'volume-7398e12b-333b-4610-b75c-e237d164781d':
    size 61440 MB in 15360 objects
    order 22 (4096 kB objects)
    block_name_prefix: rbd_data.872a41ece775
    format: 2
    features: layering, striping
    flags: 
    parent: volumes/volume-ed63922f-f88c-4dab-9e8d-9670db8ee7b2@snapshot-6be98fec-bfee-4550-99a6-358e2b8b6609
    overlap: 61440 MB
    stripe unit: 4096 kB
    stripe count: 1

# Snapshot information
$ rbd  info volumes/volume-ed63922f-f88c-4dab-9e8d-9670db8ee7b2@snapshot-6be98fec-bfee-4550-99a6-358e2b8b6609 --name client.libvirt
rbd image 'volume-ed63922f-f88c-4dab-9e8d-9670db8ee7b2':
    size 61440 MB in 15360 objects
    order 22 (4096 kB objects)
    block_name_prefix: rbd_data.86c4238e1f29
    format: 2
    features: layering
    flags: 
    protected: True
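
The size and order fields in the output above are related by simple arithmetic: order 22 means each object is 2^22 bytes (4096 kB), so a 61440 MB image spans 15360 objects. A small sketch of that calculation follows; the function name is made up for illustration.

```shell
# Number of RADOS objects needed to hold an RBD image.
# $1 = image size in MB, $2 = order (object size is 2^order bytes)
rbd_object_count() {
    # Round up so a partially filled last object is still counted.
    echo $(( ($1 * 1024 * 1024 + (1 << $2) - 1) / (1 << $2) ))
}

rbd_object_count 61440 22   # prints 15360
```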

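Here we can see that volume-7398e12b-333b-4610-b75c-e237d164781d was in fact cloned from the snapshot volumes/volume-ed63922f-f88c-4dab-9e8d-9670db8ee7b2@snapshot-6be98fec-bfee-4550-99a6-358e2b8b6609. The cinder-volume service had already marked the snapshot as protected before working with it. Given this behavior, later virtual machines, whether created one at a time or in batches, can all be cloned from this snapshot. I tested this and it does work.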
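
At the rbd level, this behavior corresponds roughly to three commands: create a snapshot, protect it, and clone it. The sketch below only prints them. Cinder performs the equivalent operations through its RBD driver rather than the command line, so the exact sequence here is an assumption made for illustration.

```shell
# Print the rbd commands that snapshot, protect, and clone an image.
# $1 = pool, $2 = parent image, $3 = snapshot name, $4 = child image
print_clone_steps() {
    echo "rbd snap create $1/$2@$3"
    echo "rbd snap protect $1/$2@$3"
    echo "rbd clone $1/$2@$3 $1/$4"
}

print_clone_steps volumes volume-ed63922f-f88c-4dab-9e8d-9670db8ee7b2 \
    snapshot-6be98fec-bfee-4550-99a6-358e2b8b6609 \
    volume-7398e12b-333b-4610-b75c-e237d164781d
```

Running `rbd children` on the protected snapshot lists every clone that depends on it, which is a convenient way to confirm that a batch of instances shares one parent.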

Creating virtual machines in batches from the snapshot

$ nova boot --flavor ff29c42b-754d-4230-9e1f-9bdaba800f5e --block-device source=snapshot,dest=volume,id=6be98fec-bfee-4550-99a6-358e2b8b6609,bootindex=0 --security-groups default --nic net-id=163df3b0-13f2-4f2e-8401-e82088e8dc07 --min-count 2 --max-count 3  test

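The logic for --min-count and --max-count is as follows: when creating virtual machines in batches, nova first computes the quota required for --max-count instances. If that exceeds the quota, it computes the allocation using --min-count instead; if that still exceeds the quota, the request fails.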

After the instances are created successfully, the state looks like this

# nova virtual machine state
+--------------------------------------+--------+--------+------------+-------------+--------------------------+
| ID                                   | Name   | Status | Task State | Power State | Networks                 |
+--------------------------------------+--------+--------+------------+-------------+--------------------------+
| 66e5bf31-6ad0-496b-9dbc-77a1c2b8fa32 | test-1 | ACTIVE | -          | Running     | privite01=192.168.17.246 |
| 6ebad2a4-b844-4662-9b1b-862a726c1873 | test-2 | ACTIVE | -          | Running     | privite01=192.168.17.248 |
| 5f5d7088-cfb3-4e8a-9b38-7e128b4fc8eb | test-3 | ACTIVE | -          | Running     | privite01=192.168.17.249 |
+--------------------------------------+--------+--------+------------+-------------+--------------------------+

# cinder volume state
+--------------------------------------+-----------+--------------------+------+-------------+----------+--------------------------------------+
|                  ID                  |   Status  |        Name        | Size | Volume Type | Bootable |             Attached to              |
+--------------------------------------+-----------+--------------------+------+-------------+----------+--------------------------------------+
| 77835304-1324-4063-9d25-211065fe406f |   in-use  |                    |  60  |      -      |   true   | 5f5d7088-cfb3-4e8a-9b38-7e128b4fc8eb |
| 85f66862-1fad-4634-b96f-cdb51558c422 |   in-use  |                    |  60  |      -      |   true   | 6ebad2a4-b844-4662-9b1b-862a726c1873 |
| b457befb-490a-402e-8b6f-215945ab6248 |   in-use  |                    |  60  |      -      |   true   | 66e5bf31-6ad0-496b-9dbc-77a1c2b8fa32 |
| ed63922f-f88c-4dab-9e8d-9670db8ee7b2 | available |    jcloud.v1.0     |  60  |      -      |   true   |                                      |
+--------------------------------------+-----------+--------------------+------+-------------+----------+--------------------------------------+

# Ceph volume information
$ rbd ls volumes --name client.libvirt
volume-77835304-1324-4063-9d25-211065fe406f
volume-85f66862-1fad-4634-b96f-cdb51558c422
volume-b457befb-490a-402e-8b6f-215945ab6248
volume-ed63922f-f88c-4dab-9e8d-9670db8ee7b2

References:
http://docs.ceph.org.cn/
http://www.cnblogs.com/sammyliu/p/4462718.html
