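The latest upstream Kolla release has dropped support for deploying Ceph, but earlier releases still support it. The last release that can deploy Ceph is OpenStack Train, so this deployment uses the Train Ceph images.
This deployment uses BlueStore as the OSD storage backend. FileStore and BlueStore differ mainly in how many times data is written to disk. FileStore writes every piece of data to a journal first and then to the data store through a file system, so each write lands twice, and performance is poor on clusters with sustained writes. BlueStore writes data directly to the raw device, with no file system in between. BlueStore can also spread one OSD across three kinds of space:
slow: the slow space, which stores the data itself; usually on HDDs.
db: the fast space, which stores metadata; usually on SSDs.
wal: the fastest space, which stores the BlueStore write-ahead log; usually on NVMe SSDs, NVRAM, or other low-latency SSDs.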
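The steps are as follows.
Note: the labeling step is the most important one. If the labels are wrong, the deployment that follows is likely to fail.
The cluster is simulated with three virtual machines. Besides the system disk, each machine has five disks: three for slow space, one for db space, and one for wal space.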
[root@node01 ~]# lsblk
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0  100G  0 disk
├─sda1            8:1    0    1G  0 part /boot
├─sda2            8:2    0   49G  0 part
│ ├─centos-root 253:0    0   97G  0 lvm  /
│ └─centos-swap 253:1    0    2G  0 lvm
└─sda3            8:3    0   50G  0 part
  └─centos-root 253:0    0   97G  0 lvm  /
sdb               8:16   0   20G  0 disk
sdc               8:32   0   20G  0 disk
sdd               8:48   0   20G  0 disk
sde               8:64   0   20G  0 disk
sdf               8:80   0   20G  0 disk
sr0              11:0    1 1024M  0 rom
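Use Ansible to label the disks on all three machines.
This step sets up the slow space.
First write a GPT partition table to each slow disk and create its first partition (1 MB to 100 MB), which carries the base bootstrap label.
### Disk sdb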
[root@node01 ~]# ansible all -m shell -a "parted /dev/sdb -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO1 1 100"
/usr/lib/python2.7/site-packages/ansible/parsing/vault/__init__.py:44: CryptographyDeprecationWarning: Python 2 is no longer supported by the Python core team. Support for it is now deprecated in cryptography, and will be removed in the next release.
from cryptography.exceptions import InvalidSignature
node01 | SUCCESS | rc=0 >>
node03 | SUCCESS | rc=0 >>
node02 | SUCCESS | rc=0 >>
### Disk sdc
[root@node01 ~]# ansible all -m shell -a "parted /dev/sdc -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO2 1 100"
node03 | SUCCESS | rc=0 >>
node01 | SUCCESS | rc=0 >>
node02 | SUCCESS | rc=0 >>
### Disk sdd
[root@node01 ~]# ansible all -m shell -a "parted /dev/sdd -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO3 1 100"
node01 | SUCCESS | rc=0 >>
node02 | SUCCESS | rc=0 >>
node03 | SUCCESS | rc=0 >>
Then assign all remaining space on these three disks to the slow space, in a partition labeled with the _B suffix.
### Disk sdb
[root@node01 ~]# ansible all -m shell -a "parted /dev/sdb -s mkpart KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO1_B 101 100%"
node01 | SUCCESS | rc=0 >>
node02 | SUCCESS | rc=0 >>
node03 | SUCCESS | rc=0 >>
### Disk sdc
[root@node01 ~]# ansible all -m shell -a "parted /dev/sdc -s mkpart KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO2_B 101 100%"
node01 | SUCCESS | rc=0 >>
node03 | SUCCESS | rc=0 >>
node02 | SUCCESS | rc=0 >>
### Disk sdd
[root@node01 ~]# ansible all -m shell -a "parted /dev/sdd -s mkpart KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO3_B 101 100%"
node03 | SUCCESS | rc=0 >>
node01 | SUCCESS | rc=0 >>
node02 | SUCCESS | rc=0 >>
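The six commands above follow the same pattern for every disk, so they can be generated in a loop. The following is only a sketch: the function name gen_slow_cmds is made up for illustration, and it prints the commands for review instead of running them. The disk names and FOO<n> suffixes mirror the ones used in this article.

```shell
# gen_slow_cmds is a hypothetical helper, not part of Kolla or Ansible.
# It prints (does not run) the two labeling commands for each slow disk.
gen_slow_cmds() {
    local prefix="KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO"
    local n=1
    local disk
    for disk in "$@"; do
        # New GPT label plus the small first partition (1 MB to 100 MB).
        printf '%s\n' "ansible all -m shell -a \"parted /dev/${disk} -s -- mklabel gpt mkpart ${prefix}${n} 1 100\""
        # The rest of the disk, labeled with the _B suffix.
        printf '%s\n' "ansible all -m shell -a \"parted /dev/${disk} -s mkpart ${prefix}${n}_B 101 100%\""
        n=$((n + 1))
    done
}

# Print the commands for the three slow disks used here.
gen_slow_cmds sdb sdc sdd
```

Reviewing the printed lines before pasting them into a shell keeps the destructive mklabel step under manual control.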
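This step sets up the db space.
The number of db partitions must match the number of slow spaces: a node with 3 slow spaces needs 3 db partitions.
The upper bound for each db partition is size/{slow-num}. In this cluster each disk is 20 GB and there are 3 slow spaces, so each db partition can be at most 20 GB / 3, a little over 6 GB. I allocated 5 GB to each.
The wal space is sized the same way.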
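The sizing rule can be checked with a little shell arithmetic. This is a sketch under a simplifying assumption: 1 GB is treated as 1000 MB, and the helper names per_osd_mb and fits are hypothetical.

```shell
# per_osd_mb is a hypothetical helper: whole megabytes available to each OSD
# when one device is shared evenly (1 GB is treated as 1000 MB here).
per_osd_mb() {
    local device_gb=$1 osd_count=$2
    echo $(( device_gb * 1000 / osd_count ))
}

# fits prints "yes" if osd_count partitions of part_mb fit on the device.
fits() {
    local device_gb=$1 osd_count=$2 part_mb=$3
    if [ "$part_mb" -le "$(per_osd_mb "$device_gb" "$osd_count")" ]; then
        echo yes
    else
        echo no
    fi
}

per_osd_mb 20 3    # prints 6666, the upper bound per db partition in MB
fits 20 3 5000     # prints yes, so the 5 GB partitions used here fit
```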
First write a GPT partition table to the db disk and create the first db partition.
[root@node01 ~]# ansible all -m shell -a "parted /dev/sde -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO1_D 1 5001"
node01 | SUCCESS | rc=0 >>
node02 | SUCCESS | rc=0 >>
node03 | SUCCESS | rc=0 >>
Then create two more db partitions in the remaining space on this disk.
[root@node01 ~]# ansible all -m shell -a "parted /dev/sde -s mkpart KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO2_D 5002 10001"
node01 | SUCCESS | rc=0 >>
node02 | SUCCESS | rc=0 >>
node03 | SUCCESS | rc=0 >>
[root@node01 ~]# ansible all -m shell -a "parted /dev/sde -s mkpart KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO3_D 10002 15002"
node01 | SUCCESS | rc=0 >>
node03 | SUCCESS | rc=0 >>
node02 | SUCCESS | rc=0 >>
This step sets up the wal space.
First write a GPT partition table to the wal disk and create the first wal partition.
[root@node01 ~]# ansible all -m shell -a "parted /dev/sdf -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO1_W 1 5001"
node01 | SUCCESS | rc=0 >>
node03 | SUCCESS | rc=0 >>
node02 | SUCCESS | rc=0 >>
Then create two more wal partitions in the remaining space on this disk.
[root@node01 ~]# ansible all -m shell -a "parted /dev/sdf -s mkpart KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO2_W 5002 10001"
node02 | SUCCESS | rc=0 >>
node01 | SUCCESS | rc=0 >>
node03 | SUCCESS | rc=0 >>
[root@node01 ~]# ansible all -m shell -a "parted /dev/sdf -s mkpart KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO3_W 10002 15002"
node02 | SUCCESS | rc=0 >>
node01 | SUCCESS | rc=0 >>
node03 | SUCCESS | rc=0 >>
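The db and wal commands differ only in the disk, the label suffix (_D or _W), and the start and end offsets, so the offsets can be computed instead of typed by hand. The sketch below uses a hypothetical helper, gen_shared_cmds, that prints the parted commands without running them. Its first command matches the one used above; the later boundaries come out 1 MB different from the hand-typed ones because every partition is given exactly the same size.

```shell
# gen_shared_cmds DISK SUFFIX COUNT SIZE_MB  (hypothetical helper)
# Prints parted commands that split one shared disk into COUNT labeled
# partitions of SIZE_MB each, leaving a 1 MB gap between partitions.
gen_shared_cmds() {
    local disk=$1 suffix=$2 count=$3 size_mb=$4
    local prefix="KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO"
    local i=1 start=1 end
    while [ "$i" -le "$count" ]; do
        end=$((start + size_mb))
        if [ "$i" -eq 1 ]; then
            # Only the first command writes a new GPT label (this wipes the disk).
            printf '%s\n' "parted /dev/${disk} -s -- mklabel gpt mkpart ${prefix}${i}_${suffix} ${start} ${end}"
        else
            printf '%s\n' "parted /dev/${disk} -s mkpart ${prefix}${i}_${suffix} ${start} ${end}"
        fi
        start=$((end + 1))
        i=$((i + 1))
    done
}

gen_shared_cmds sde D 3 5000   # db partitions
gen_shared_cmds sdf W 3 5000   # wal partitions
```

Each printed line can then be passed to ansible all -m shell -a "..." in the same way as the commands above.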
Once the labels above are in place, it is best to reboot all machines so that every system picks up the new partition tables.
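Since a wrong label breaks the later deployment, it may be worth counting the labels on each node after the reboot. The sketch below assumes the lsblk on the nodes supports the PARTLABEL column; count_kolla_labels is a hypothetical helper that only filters text, so it can be tried on sample input first.

```shell
# count_kolla_labels is a hypothetical helper: it reads "NAME PARTLABEL" lines
# on stdin and prints how many partitions carry a Kolla bootstrap label.
count_kolla_labels() {
    # grep -c exits non-zero when the count is 0, so mask the exit status.
    grep -c 'KOLLA_CEPH_OSD_BOOTSTRAP_BS_' || true
}

# Sample input in the shape produced by: lsblk -rno NAME,PARTLABEL
printf '%s\n' \
    'sda1 ' \
    'sdb1 KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO1' \
    'sdb2 KOLLA_CEPH_OSD_BOOTSTRAP_BS_FOO1_B' | count_kolla_labels

# On a real node (not run here):
#   lsblk -rno NAME,PARTLABEL | count_kolla_labels
# With the layout in this article, each node should report 12:
# 3 slow disks x 2 partitions, plus 3 db and 3 wal partitions.
```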
The following assumes Kolla has already been configured. For the Kolla configuration itself, see my other articles.
[root@node01 config]# pwd
/etc/kolla/config
[root@node01 config]# cat ceph.conf
[global]
osd pool default size = 3
osd pool default min size = 2
mon_clock_drift_allowed = 2
mon_clock_drift_warn_backoff = 30
osd_deep_scrub_randomize_ratio = 0.0
[mon]
mon_allow_pool_delete = true
[osd]
osd_max_write_size = 1024
osd_recovery_op_priority = 1
osd_recovery_max_active = 1
osd_recovery_max_single_start = 1
osd_recovery_max_chunk = 1048576
osd_recovery_threads = 1
osd_max_backfills = 1
osd_scrub_begin_hour = 22
osd_scrub_end_hour = 7
osd_recovery_sleep = 0
osd_crush_update_on_start = false
The globals.yml configuration is as follows:
[root@node01 kolla]# cat globals.yml | grep -v ^$ | grep -v ^#
---
kolla_base_distro: "centos"
kolla_install_type: "source"
openstack_release: "rocky"
kolla_internal_vip_address: "192.168.122.253"
docker_registry: "registry.example.com:4000"
docker_namespace: "openstackbl"
network_interface: "ens33"
storage_interface: "ens38"
cluster_interface: "ens39"
neutron_external_interface: "ens37"
keepalived_virtual_router_id: "77"
enable_ceph: "yes"
enable_ceph_rgw: "yes"
enable_cinder: "yes"
enable_cinder_backup: "yes"
enable_grafana: "yes"
enable_prometheus: "yes"
enable_ceph_rgw_keystone: "yes"
glance_backend_ceph: "yes"
glance_enable_rolling_upgrade: "no"
cinder_backend_ceph: "{{ enable_ceph }}"
cinder_backup_driver: "ceph"
nova_backend_ceph: "{{ enable_ceph }}"
ironic_dnsmasq_dhcp_range:
tempest_image_id:
tempest_flavor_ref_id:
tempest_public_network_id:
tempest_floating_network_name:
enable_prometheus_ceph_mgr_exporter: "{{ enable_prometheus | bool and enable_ceph | bool }}"
ceph_osd_store_type: "bluestore"
ceph_osd_wipe_disk: "yes-i-really-really-mean-it"
[root@node01 kolla]#
Adding -t ceph to the kolla-ansible command restricts the deployment to Ceph only.
[root@node01 ~]# kolla-ansible -i /usr/share/kolla-ansible/ansible/inventory/multinode -t ceph deploy
PLAY RECAP *********************************************************************************************
localhost : ok=2 changed=0 unreachable=0 failed=0
node01 : ok=76 changed=34 unreachable=0 failed=1
node02 : ok=62 changed=32 unreachable=0 failed=0
node03 : ok=62 changed=32 unreachable=0 failed=0
One error is reported:
TASK [ceph : Creating the Swift service and endpoint] **************************************************
failed: [node01] (item={u'interface': u'admin', u'url': u'http://192.168.122.253:6780/swift/v1'}) => {"changed": true, "item": {"interface": "admin", "url": "http://192.168.122.253:6780/swift/v1"}, "msg": "'Traceback (most recent call last):\\n File \"/tmp/ansible_vwPYil/ansible_module_kolla_keystone_service.py\", line 55, in main\\n for _service in cloud.keystone_client.services.list():\\n File \"/opt/ansible/lib/python2.7/site-packages/shade/_legacy_clients.py\", line 95, in keystone_client\\n self._identity_client\\n File \"/opt/ansible/lib/python2.7/site-packages/shade/openstackcloud.py\", line 616, in _identity_client\\n \\'identity\\', min_version=2, max_version=\\'3.latest\\')\\n File \"/opt/ansible/lib/python2.7/site-packages/shade/openstackcloud.py\", line 506, in _get_versioned_client\\n if adapter.get_endpoint():\\n File \"/opt/ansible/lib/python2.7/site-packages/keystoneauth1/adapter.py\", line 282, in get_endpoint\\n return self.session.get_endpoint(auth or self.auth, **kwargs)\\n File \"/opt/ansible/lib/python2.7/site-packages/keystoneauth1/session.py\", line 1200, in get_endpoint\\n return auth.get_endpoint(self, **kwargs)\\n File \"/opt/ansible/lib/python2.7/site-packages/keystoneauth1/identity/base.py\", line 380, in get_endpoint\\n allow_version_hack=allow_version_hack, **kwargs)\\n File \"/opt/ansible/lib/python2.7/site-packages/keystoneauth1/identity/base.py\", line 271, in get_endpoint_data\\n service_catalog = self.get_access(session).service_catalog\\n File \"/opt/ansible/lib/python2.7/site-packages/keystoneauth1/identity/base.py\", line 134, in get_access\\n self.auth_ref = self.get_auth_ref(session)\\n File \"/opt/ansible/lib/python2.7/site-packages/keystoneauth1/identity/generic/base.py\", line 206, in get_auth_ref\\n self._plugin = self._do_create_plugin(session)\\n File \"/opt/ansible/lib/python2.7/site-packages/keystoneauth1/identity/generic/base.py\", line 161, in _do_create_plugin\\n \\'auth_url is correct. %s\\' % e)\\nDiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. Unable to establish connection to http://192.168.122.253:35357: HTTPConnectionPool(host=\\'192.168.122.253\\', port=35357): Max retries exceeded with url: / (Caused by NewConnectionError(\\': Failed to establish a new connection: [Errno 113] No route to host\\',))\\n'"}
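The failure occurs while creating the Swift service and endpoint. Keystone was not deployed in this run, so its endpoint at 192.168.122.253:35357 is unreachable and the user, service, and endpoint cannot be created. This step has no effect on the Ceph cluster itself and can be ignored.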
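To confirm that the Swift task is the only failure, the PLAY RECAP can be filtered for hosts whose failed counter is not zero. This is a small sketch; failed_hosts is a hypothetical helper that reads recap lines on stdin.

```shell
# failed_hosts is a hypothetical helper: it reads Ansible PLAY RECAP lines on
# stdin and prints the name of every host whose failed counter is not 0.
failed_hosts() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^failed=[0-9]+$/ && $i != "failed=0")
                print $1
    }'
}

# The recap lines from the run above.
printf '%s\n' \
    'localhost : ok=2 changed=0 unreachable=0 failed=0' \
    'node01 : ok=76 changed=34 unreachable=0 failed=1' \
    'node02 : ok=62 changed=32 unreachable=0 failed=0' \
    'node03 : ok=62 changed=32 unreachable=0 failed=0' | failed_hosts
# prints node01
```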
Check the cluster status with the ceph command:
[root@node01 ~]# docker exec -it ceph_mon ceph -s
  cluster:
    id:     107d8048-2640-4db8-a27b-503f6b100773
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum 192.168.123.4,192.168.123.5,192.168.123.6 (age 5m)
    mgr: node01(active, since 4m), standbys: node02, node03
    osd: 9 osds: 9 up (since 2m), 9 in (since 2m)
    rgw: 1 daemon active (radosgw.gateway)

  task status:

  data:
    pools:   4 pools, 128 pgs
    objects: 187 objects, 1.2 KiB
    usage:   51 GiB used, 170 GiB / 221 GiB avail
    pgs:     128 active+clean
[root@node01 ~]#
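For scripted checks, the health value can be pulled out of the status output. The sketch below uses a hypothetical helper, ceph_health, that reads ceph -s output on stdin; it assumes the health: line has the shape shown above.

```shell
# ceph_health is a hypothetical helper: it reads `ceph -s` output on stdin and
# prints the value of the "health:" line (for example HEALTH_OK).
ceph_health() {
    # Strip a trailing carriage return in case the output came through a TTY.
    awk '{ sub(/\r$/, "") } $1 == "health:" { print $2; exit }'
}

# Sample input taken from the status output above.
printf '%s\n' \
    'cluster:' \
    'id: 107d8048-2640-4db8-a27b-503f6b100773' \
    'health: HEALTH_OK' | ceph_health
# prints HEALTH_OK

# On the deployment node (not run here; -it is dropped because no terminal
# is needed when the output is piped):
#   docker exec ceph_mon ceph -s | ceph_health
```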
At this point the Ceph cluster deployment is complete.