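Notes:
- This method also works for a single-node deployment with only one Monitor (at the cost of a single point of failure). The minimum requirement is 2 OSDs created on two partitions (because the default minimum replica count is 2). If you do not need CephFS, you can skip the MDS service; if you do not use object storage, you can skip the RGW service.
- Ceph introduced the Manager service in 11.x (kraken), where it is optional; from 12.x (luminous) onward it is required.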
System Environment
- DNS names and IP addresses of the 3 nodes (each hostname matches its DNS name):
$ cat /etc/hosts
...
172.29.101.166 osdev01
172.29.101.167 osdev02
172.29.101.168 osdev03
...
- Kernel and distribution versions:
$ uname -r
3.10.0-862.11.6.el7.x86_64
$ cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)
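- All 3 nodes use sdb as the OSD disk. Use dd to wipe any partition information that may exist on it (this destroys data on the disk, so proceed with caution):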
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 222.6G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 221.6G 0 part /
sdb 8:16 0 7.3T 0 disk
$ dd if=/dev/zero of=/dev/sdb bs=512 count=1024
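Because the dd step above is destructive, it may help to wrap it in a guard. The following is only a sketch and not part of the original procedure: wipe_disk_header is a hypothetical helper name, and the rule that /dev/sda is off limits is an assumption based on the lsblk output above, where sda holds /boot and /. By default the helper only prints the command; setting DRY_RUN=0 makes it run dd for real.

```shell
# Hypothetical helper: print (or run) the dd command that zeroes the first
# 512 KiB (512 bytes x 1024 blocks) of a whole disk.
wipe_disk_header() {
    dev="$1"
    # Accept only whole-disk paths such as /dev/sdb, never partitions like /dev/sdb1.
    case "$dev" in
        /dev/sd[a-z]) ;;
        *) echo "refusing: $dev is not a whole-disk /dev/sdX path" >&2; return 1 ;;
    esac
    # Assumption: sda is the system disk, as in the lsblk listing above.
    if [ "$dev" = "/dev/sda" ]; then
        echo "refusing: /dev/sda holds the system partitions" >&2
        return 1
    fi
    cmd="dd if=/dev/zero of=$dev bs=512 count=1024"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"     # dry run: show the command only
    else
        $cmd            # real run: overwrite the start of the disk
    fi
}
```

Calling wipe_disk_header /dev/sdb prints the same dd command used above, so it can be reviewed before anything is overwritten.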
System Configuration
Yum Configuration
- Install the epel repository:
$ yum install -y epel-release
- Install the yum priorities plugin:
$ yum install -y yum-plugin-priorities --enablerepo=rhel-7-server-optional-rpms
System Configuration
- Install and enable the NTP service:
$ yum install -y ntp ntpdate ntp-doc
$ systemctl enable ntpd.service && systemctl start ntpd.service && systemctl status ntpd.service
- Add the osdev user and grant it sudo privileges (you can also use the root user directly; this step is only a security precaution):
$ useradd -d /home/osdev -m osdev
$ passwd osdev
$ echo "osdev ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/osdev
$ chmod 0440 /etc/sudoers.d/osdev
- Disable the firewall:
$ systemctl stop firewalld && systemctl disable firewalld && systemctl status firewalld
- Disable SELinux:
$ sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config && cat /etc/selinux/config
# setenforce 0 && sestatus
$ reboot
$ sestatus
SELinux status: disabled
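Before editing /etc/selinux/config in place, the effect of the sed expression above can be previewed. This is a small sketch; preview_selinux_disabled is a hypothetical helper name. It applies the same substitution but writes the result to standard output and leaves the file untouched.

```shell
# Hypothetical helper: show what the config would look like after the
# substitution, without modifying the file.
preview_selinux_disabled() {
    # ^SELINUX= matches only the setting itself; comment lines that start
    # with "#" and the separate SELINUXTYPE= line are left as they are.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}
```

For example, preview_selinux_disabled /etc/selinux/config | diff /etc/selinux/config - shows exactly which line would change.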
SSH Configuration
- Install the SSH server package:
$ yum install -y openssh-server
- Set up passwordless SSH login:
$ ssh-keygen
$ ssh-copy-id osdev@osdev01
$ ssh-copy-id osdev@osdev02
$ ssh-copy-id osdev@osdev03
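- Configure the default SSH user, or pass --username to ceph-deploy to specify the user name. (This setting also makes Kolla-Ansible use that user by default, which then fails with insufficient permissions. To avoid this, apply the following configuration under the osdev user and run Kolla-Ansible as root.):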
$ vi ~/.ssh/config
Host osdev01
Hostname osdev01
User osdev
Host osdev02
Hostname osdev02
User osdev
Host osdev03
Hostname osdev03
User osdev
- Verify that passwordless login works:
[root@osdev01 ~]# ssh osdev01
Last login: Wed Aug 22 16:53:56 2018 from osdev01
[osdev@osdev01 ~]$ exit
登出
Connection to osdev01 closed.
[root@osdev01 ~]# ssh osdev02
Last login: Wed Aug 22 16:55:06 2018 from osdev01
[osdev@osdev02 ~]$ exit
登出
Connection to osdev02 closed.
[root@osdev01 ~]# ssh osdev03
Last login: Wed Aug 22 16:55:35 2018 from osdev01
[osdev@osdev03 ~]$ exit
登出
Connection to osdev03 closed.
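The per-host stanzas in ~/.ssh/config above are identical apart from the host name, so they can be generated instead of typed. This is a sketch only; gen_ssh_config is a hypothetical helper name, and the 4-space indentation is a readability choice (ssh_config does not require it).

```shell
# Hypothetical helper: print one "Host" stanza per host name.
# Usage: gen_ssh_config <user> <host> [<host> ...]
gen_ssh_config() {
    user="$1"
    shift
    for host in "$@"; do
        printf 'Host %s\n    Hostname %s\n    User %s\n' "$host" "$host" "$user"
    done
}

# Example (commented out so that sourcing this file changes nothing):
#   gen_ssh_config osdev osdev01 osdev02 osdev03 >> ~/.ssh/config
# A non-interactive login check for each node could then look like:
#   for h in osdev01 osdev02 osdev03; do ssh -o BatchMode=yes "$h" true && echo "$h ok"; done
```

BatchMode=yes makes ssh fail rather than prompt for a password, which is what a passwordless-login check needs.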
Deployment
Initialize the System
- Install ceph-deploy:
$ yum install -y ceph-deploy
- Create the ceph-deploy configuration directory:
$ su - osdev
$ mkdir -pv /opt/ceph/deploy && cd /opt/ceph/deploy
- Create a Ceph cluster with osdev01, osdev02, and osdev03 as the Monitor nodes:
$ ceph-deploy new osdev01 osdev02 osdev03
- Inspect the generated configuration files:
$ ls
ceph.conf ceph-deploy-ceph.log ceph.mon.keyring
$ cat ceph.conf
[global]
fsid = 42ded78e-211b-4095-b795-a33f116727fc
mon_initial_members = osdev01, osdev02, osdev03
mon_host = 172.29.101.166,172.29.101.167,172.29.101.168
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
- Edit the Ceph cluster configuration:
$ vi ceph.conf
public_network = 172.29.101.0/24
cluster_network = 172.29.101.0/24
osd_pool_default_size = 3
osd_pool_default_min_size = 1
osd_pool_default_pg_num = 8
osd_pool_default_pgp_num = 8
osd_crush_chooseleaf_type = 1
[mon]
mon_clock_drift_allowed = 0.5
[osd]
osd_mkfs_type = xfs
osd_mkfs_options_xfs = -f
filestore_max_sync_interval = 5
filestore_min_sync_interval = 0.1
filestore_fd_cache_size = 655350
filestore_omap_header_cache_size = 655350
filestore_fd_cache_random = true
osd op threads = 8
osd disk threads = 4
filestore op threads = 8
max_open_files = 655350
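The configuration above sets osd_pool_default_pg_num and osd_pool_default_pgp_num to 8, which is the default for each newly created pool. For sizing the cluster as a whole, a rule of thumb commonly cited in older Ceph documentation is (number of OSDs x 100) / replica count, rounded up to a power of two. The sketch below only does that arithmetic; suggest_total_pgs is a hypothetical name, and the result is a starting point rather than a recommendation for this particular cluster.

```shell
# Hypothetical helper: total placement groups suggested by the
# "(OSDs * 100) / replicas, rounded up to a power of two" rule of thumb.
# Usage: suggest_total_pgs <osd-count> <replica-count>
suggest_total_pgs() {
    osds="$1"
    replicas="$2"
    target=$(( osds * 100 / replicas ))
    pgs=1
    # Double until the power of two reaches or exceeds the target.
    while [ "$pgs" -lt "$target" ]; do
        pgs=$(( pgs * 2 ))
    done
    echo "$pgs"
}
```

With the 3 OSDs and osd_pool_default_size = 3 used here, the target is 100, which rounds up to 128 placement groups spread across all pools.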
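Install Packages
- Install the Ceph packages on the 3 nodes (if an error occurs, first remove the ceph-release package on each of the 3 nodes, as in the commented-out command, then retry):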
# sudo yum remove -y ceph-release
$ ceph-deploy install osdev01 osdev02 osdev03
Deploy Monitors
- Deploy the initial Monitors:
$ ceph-deploy mon create-initial
- Inspect the generated configuration and keyring files:
$ ls
ceph.bootstrap-mds.keyring ceph.bootstrap-mgr.keyring ceph.bootstrap-osd.keyring ceph.bootstrap-rgw.keyring ceph.client.admin.keyring ceph.conf ceph-deploy-ceph.log ceph.mon.keyring
$ sudo chmod a+r /etc/ceph/ceph.client.admin.keyring
- Copy the configuration and keyring files to the specified nodes:
$ ceph-deploy --overwrite-conf admin osdev01 osdev02 osdev03
- Adjust the warning threshold (a percentage) for the remaining available data space of the Monitor on osdev01:
$ ceph -s
cluster:
id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
health: HEALTH_WARN
mon osdev01 is low on available space
services:
mon: 3 daemons, quorum osdev01,osdev02,osdev03
mgr: osdev03(active), standbys: osdev02, osdev01
osd: 3 osds: 3 up, 3 in
rgw: 3 daemons active
data:
pools: 10 pools, 176 pgs
objects: 578 objects, 477 MiB
usage: 4.0 GiB used, 22 TiB / 22 TiB avail
pgs: 176 active+clean
$ ceph daemon mon.osdev01 config get mon_data_avail_warn
{
"mon_data_avail_warn": "30"
}
$ ceph daemon mon.osdev01 config set mon_data_avail_warn 10
{
"success": "mon_data_avail_warn = '10' (not observed, change may require restart) "
}
$ vi /etc/ceph/ceph.conf
[mon]
mon_clock_drift_allowed = 0.5
mon allow pool delete = true
mon_data_avail_warn = 10
$ systemctl restart ceph-mon@osdev01.service
$ ceph daemon mon.osdev01 config get mon_data_avail_warn
{
"mon_data_avail_warn": "10"
}
$ ceph -s
cluster:
id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
health: HEALTH_OK
services:
mon: 3 daemons, quorum osdev01,osdev02,osdev03
mgr: osdev03(active), standbys: osdev02, osdev01
osd: 3 osds: 3 up, 3 in
rgw: 3 daemons active
data:
pools: 10 pools, 176 pgs
objects: 578 objects, 477 MiB
usage: 4.0 GiB used, 22 TiB / 22 TiB avail
pgs: 176 active+clean
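The "low on available space" warning above depends on how much of the Monitor's data filesystem is still free, expressed as a percentage and compared against mon_data_avail_warn (30 in the first config get output, 10 after the change). To see the current figure on a node, something like the following sketch can be used. avail_percent is a hypothetical helper name, and it assumes the standard POSIX column layout of df -P (size in column 2, available space in column 4).

```shell
# Hypothetical helper: read `df -P` output on stdin and print the available
# space of the first listed filesystem as a whole-number percentage of its size.
avail_percent() {
    # Line 1 is the header; line 2 is the filesystem entry.
    awk 'NR == 2 { printf "%d\n", $4 * 100 / $2 }'
}

# Example (commented out; the path assumes the default monitor data location):
#   df -P /var/lib/ceph/mon | avail_percent
```

If the printed number is close to the configured threshold, lowering the threshold only hides the warning; freeing space on that filesystem addresses the cause.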
Remove a Monitor
- Remove the Monitor service on osdev01:
$ ceph-deploy mon destroy osdev01
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon destroy osdev01
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : destroy
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf :
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] mon : ['osdev01']
[ceph_deploy.cli][INFO ] func :
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mon][DEBUG ] Removing mon from osdev01
[osdev01][DEBUG ] connected to host: osdev01
[osdev01][DEBUG ] detect platform information from remote host
[osdev01][DEBUG ] detect machine type
[osdev01][DEBUG ] find the location of an executable
[osdev01][DEBUG ] get remote short hostname
[osdev01][INFO ] Running command: ceph --cluster=ceph -n mon. -k /var/lib/ceph/mon/ceph-osdev01/keyring mon remove osdev01
[osdev01][WARNIN] removing mon.osdev01 at 172.29.101.166:6789/0, there will be 2 monitors
[osdev01][INFO ] polling the daemon to verify it stopped
[osdev01][INFO ] Running command: systemctl stop ceph-mon@osdev01.service
[osdev01][INFO ] Running command: mkdir -p /var/lib/ceph/mon-removed
[osdev01][DEBUG ] move old monitor data
- Re-add the Monitor service on osdev01:
$ ceph-deploy mon add osdev01
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon add osdev01
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : add
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf :
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] mon : ['osdev01']
[ceph_deploy.cli][INFO ] func :
[ceph_deploy.cli][INFO ] address : None
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mon][INFO ] ensuring configuration of new mon host: osdev01
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to osdev01
[osdev01][DEBUG ] connected to host: osdev01
[osdev01][DEBUG ] detect platform information from remote host
[osdev01][DEBUG ] detect machine type
[osdev01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host osdev01
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 172.29.101.166
[ceph_deploy.mon][DEBUG ] detecting platform for host osdev01 ...
[osdev01][DEBUG ] connected to host: osdev01
[osdev01][DEBUG ] detect platform information from remote host
[osdev01][DEBUG ] detect machine type
[osdev01][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: CentOS Linux 7.5.1804 Core
[osdev01][DEBUG ] determining if provided host has same hostname in remote
[osdev01][DEBUG ] get remote short hostname
[osdev01][DEBUG ] adding mon to osdev01
[osdev01][DEBUG ] get remote short hostname
[osdev01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[osdev01][DEBUG ] create the mon path if it does not exist
[osdev01][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-osdev01/done
[osdev01][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-osdev01/done
[osdev01][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-osdev01.mon.keyring
[osdev01][DEBUG ] create the monitor keyring file
[osdev01][INFO ] Running command: ceph --cluster ceph mon getmap -o /var/lib/ceph/tmp/ceph.osdev01.monmap
[osdev01][WARNIN] got monmap epoch 3
[osdev01][INFO ] Running command: ceph-mon --cluster ceph --mkfs -i osdev01 --monmap /var/lib/ceph/tmp/ceph.osdev01.monmap --keyring /var/lib/ceph/tmp/ceph-osdev01.mon.keyring --setuser 167 --setgroup 167
[osdev01][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-osdev01.mon.keyring
[osdev01][DEBUG ] create a done file to avoid re-doing the mon deployment
[osdev01][DEBUG ] create the init path if it does not exist
[osdev01][INFO ] Running command: systemctl enable ceph.target
[osdev01][INFO ] Running command: systemctl enable ceph-mon@osdev01
[osdev01][INFO ] Running command: systemctl start ceph-mon@osdev01
[osdev01][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.osdev01.asok mon_status
[osdev01][WARNIN] monitor osdev01 does not exist in monmap
[osdev01][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.osdev01.asok mon_status
[osdev01][DEBUG ] ********************************************************************************
[osdev01][DEBUG ] status for monitor: mon.osdev01
[osdev01][DEBUG ] {
[osdev01][DEBUG ] "election_epoch": 0,
[osdev01][DEBUG ] "extra_probe_peers": [],
[osdev01][DEBUG ] "feature_map": {
[osdev01][DEBUG ] "client": [
[osdev01][DEBUG ] {
[osdev01][DEBUG ] "features": "0x1ffddff8eea4fffb",
[osdev01][DEBUG ] "num": 1,
[osdev01][DEBUG ] "release": "luminous"
[osdev01][DEBUG ] },
[osdev01][DEBUG ] {
[osdev01][DEBUG ] "features": "0x3ffddff8ffa4fffb",
[osdev01][DEBUG ] "num": 1,
[osdev01][DEBUG ] "release": "luminous"
[osdev01][DEBUG ] }
[osdev01][DEBUG ] ],
[osdev01][DEBUG ] "mds": [
[osdev01][DEBUG ] {
[osdev01][DEBUG ] "features": "0x3ffddff8ffa4fffb",
[osdev01][DEBUG ] "num": 2,
[osdev01][DEBUG ] "release": "luminous"
[osdev01][DEBUG ] }
[osdev01][DEBUG ] ],
[osdev01][DEBUG ] "mgr": [
[osdev01][DEBUG ] {
[osdev01][DEBUG ] "features": "0x3ffddff8ffa4fffb",
[osdev01][DEBUG ] "num": 3,
[osdev01][DEBUG ] "release": "luminous"
[osdev01][DEBUG ] }
[osdev01][DEBUG ] ],
[osdev01][DEBUG ] "mon": [
[osdev01][DEBUG ] {
[osdev01][DEBUG ] "features": "0x3ffddff8ffa4fffb",
[osdev01][DEBUG ] "num": 1,
[osdev01][DEBUG ] "release": "luminous"
[osdev01][DEBUG ] }
[osdev01][DEBUG ] ],
[osdev01][DEBUG ] "osd": [
[osdev01][DEBUG ] {
[osdev01][DEBUG ] "features": "0x3ffddff8ffa4fffb",
[osdev01][DEBUG ] "num": 2,
[osdev01][DEBUG ] "release": "luminous"
[osdev01][DEBUG ] }
[osdev01][DEBUG ] ]
[osdev01][DEBUG ] },
[osdev01][DEBUG ] "features": {
[osdev01][DEBUG ] "quorum_con": "0",
[osdev01][DEBUG ] "quorum_mon": [],
[osdev01][DEBUG ] "required_con": "144115188346404864",
[osdev01][DEBUG ] "required_mon": [
[osdev01][DEBUG ] "kraken",
[osdev01][DEBUG ] "luminous",
[osdev01][DEBUG ] "mimic",
[osdev01][DEBUG ] "osdmap-prune"
[osdev01][DEBUG ] ]
[osdev01][DEBUG ] },
[osdev01][DEBUG ] "monmap": {
[osdev01][DEBUG ] "created": "2018-08-23 10:55:27.755434",
[osdev01][DEBUG ] "epoch": 3,
[osdev01][DEBUG ] "features": {
[osdev01][DEBUG ] "optional": [],
[osdev01][DEBUG ] "persistent": [
[osdev01][DEBUG ] "kraken",
[osdev01][DEBUG ] "luminous",
[osdev01][DEBUG ] "mimic",
[osdev01][DEBUG ] "osdmap-prune"
[osdev01][DEBUG ] ]
[osdev01][DEBUG ] },
[osdev01][DEBUG ] "fsid": "383237bd-becf-49d5-9bd6-deb0bc35ab2a",
[osdev01][DEBUG ] "modified": "2018-09-19 14:57:08.984472",
[osdev01][DEBUG ] "mons": [
[osdev01][DEBUG ] {
[osdev01][DEBUG ] "addr": "172.29.101.167:6789/0",
[osdev01][DEBUG ] "name": "osdev02",
[osdev01][DEBUG ] "public_addr": "172.29.101.167:6789/0",
[osdev01][DEBUG ] "rank": 0
[osdev01][DEBUG ] },
[osdev01][DEBUG ] {
[osdev01][DEBUG ] "addr": "172.29.101.168:6789/0",
[osdev01][DEBUG ] "name": "osdev03",
[osdev01][DEBUG ] "public_addr": "172.29.101.168:6789/0",
[osdev01][DEBUG ] "rank": 1
[osdev01][DEBUG ] }
[osdev01][DEBUG ] ]
[osdev01][DEBUG ] },
[osdev01][DEBUG ] "name": "osdev01",
[osdev01][DEBUG ] "outside_quorum": [],
[osdev01][DEBUG ] "quorum": [],
[osdev01][DEBUG ] "rank": -1,
[osdev01][DEBUG ] "state": "probing",
[osdev01][DEBUG ] "sync_provider": []
[osdev01][DEBUG ] }
[osdev01][DEBUG ] ********************************************************************************
[osdev01][INFO ] monitor: mon.osdev01 is currently at the state of probing
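While osdev01 is removed, the cluster runs on 2 monitors (the destroy log says "there will be 2 monitors"). Monitors need a strict majority to form a quorum, so it is worth checking how many failures each cluster size tolerates before removing one. The two function names below are hypothetical; the arithmetic is the usual majority rule.

```shell
# Monitors required for a quorum: a strict majority of the monitor count.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Monitors that can fail while a quorum is still possible.
failures_tolerated() {
    echo $(( $1 - ($1 / 2 + 1) ))
}
```

With 3 monitors, 2 are needed and 1 may fail. With 2 monitors, both are needed and none may fail, so the window between mon destroy and mon add is a period with no monitor redundancy.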
Deploy Managers
- Install the Manager service on the 3 nodes (this service was added in kraken and is required from luminous onward):
$ ceph-deploy mgr create osdev01 osdev02 osdev03
- Check the cluster status; only one of the 3 Managers is active:
$ sudo ceph -s
cluster:
id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
health: HEALTH_WARN
mon osdev01 is low on available space
services:
mon: 3 daemons, quorum osdev01,osdev02,osdev03
mgr: osdev01(active), standbys: osdev03, osdev02
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
- Check the cluster's current quorum status:
$ sudo ceph quorum_status --format json-pretty
{
"election_epoch": 8,
"quorum": [
0,
1,
2
],
"quorum_names": [
"osdev01",
"osdev02",
"osdev03"
],
"quorum_leader_name": "osdev01",
"monmap": {
"epoch": 2,
"fsid": "383237bd-becf-49d5-9bd6-deb0bc35ab2a",
"modified": "2018-08-23 10:55:53.598952",
"created": "2018-08-23 10:55:27.755434",
"features": {
"persistent": [
"kraken",
"luminous",
"mimic",
"osdmap-prune"
],
"optional": []
},
"mons": [
{
"rank": 0,
"name": "osdev01",
"addr": "172.29.101.166:6789/0",
"public_addr": "172.29.101.166:6789/0"
},
{
"rank": 1,
"name": "osdev02",
"addr": "172.29.101.167:6789/0",
"public_addr": "172.29.101.167:6789/0"
},
{
"rank": 2,
"name": "osdev03",
"addr": "172.29.101.168:6789/0",
"public_addr": "172.29.101.168:6789/0"
}
]
}
}
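For scripting, a single field can be pulled out of the quorum_status output above. The sketch below extracts quorum_leader_name with sed; quorum_leader is a hypothetical helper name. Text matching on JSON is fragile, so a real JSON parser such as jq would be the more robust choice where it is installed.

```shell
# Hypothetical helper: read quorum_status JSON on stdin and print the value
# of "quorum_leader_name". Relies on the key appearing exactly once.
quorum_leader() {
    sed -n 's/.*"quorum_leader_name": *"\([^"]*\)".*/\1/p'
}

# Example (commented out; needs a running cluster):
#   sudo ceph quorum_status --format json-pretty | quorum_leader
```

Against the output shown above, this prints osdev01.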
Deploy OSDs
- If OSDs were deployed on these disks before, remove their LVM volumes first:
$ sudo lvs | awk 'NR!=1 {if($1~"osd-block-") print $2 "/" $1}' | xargs -I {} sudo lvremove -y {}
- Wipe the disk data (this can be skipped if the disks were already wiped with dd and hold no LVM volumes):
$ ceph-deploy disk zap osdev01 /dev/sdb
$ ceph-deploy disk zap osdev02 /dev/sdb
$ ceph-deploy disk zap osdev03 /dev/sdb
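The LVM cleanup pipeline above feeds every matching volume straight into lvremove -y, so it may be worth listing the matches first. The sketch below isolates the awk filter; list_osd_lvs is a hypothetical helper name. One deliberate difference from the original: the pattern is anchored (^osd-block-), so only volumes whose name starts with that prefix are selected.

```shell
# Hypothetical helper: read `lvs` output on stdin and print "VG/LV" for every
# logical volume whose name starts with "osd-block-".
list_osd_lvs() {
    # NR != 1 skips the header line; $1 is the LV name and $2 the VG name.
    awk 'NR != 1 && $1 ~ /^osd-block-/ { print $2 "/" $1 }'
}

# Example (commented out): review first, then remove.
#   sudo lvs | list_osd_lvs
#   sudo lvs | list_osd_lvs | xargs -I {} sudo lvremove -y {}
```

Each printed line has the VG/LV form that lvremove accepts.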
- Deploy the OSD service on the 3 nodes. BlueStore is used by default, with no journal or block_db:
$ ceph-deploy osd create --data /dev/sdb osdev01
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/osdev/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy osd create --data /dev/sdb osdev01
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] bluestore : None
[ceph_deploy.cli][INFO ] cd_conf :
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] fs_type : xfs
[ceph_deploy.cli][INFO ] block_wal : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] journal : None
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] host : osdev01
[ceph_deploy.cli][INFO ] filestore : None
[ceph_deploy.cli][INFO ] func :
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] zap_disk : False
[ceph_deploy.cli][INFO ] data : /dev/sdb
[ceph_deploy.cli][INFO ] block_db : None
[ceph_deploy.cli][INFO ] dmcrypt : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] dmcrypt_key_dir : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] debug : False
[ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device /dev/sdb
[osdev01][DEBUG ] connection detected need for sudo
[osdev01][DEBUG ] connected to host: osdev01
[osdev01][DEBUG ] detect platform information from remote host
[osdev01][DEBUG ] detect machine type
[osdev01][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.5.1804 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to osdev01
[osdev01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[osdev01][DEBUG ] find the location of an executable
[osdev01][INFO ] Running command: sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb
[osdev01][DEBUG ] Running command: /bin/ceph-authtool --gen-print-key
[osdev01][DEBUG ] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 3c3d6c5a-c82e-4318-a8fb-134de5444ca7
[osdev01][DEBUG ] Running command: /usr/sbin/vgcreate --force --yes ceph-95b94aa4-22df-401c-822b-dd62f82f6b08 /dev/sdb
[osdev01][DEBUG ] stdout: Physical volume "/dev/sdb" successfully created.
[osdev01][DEBUG ] stdout: Volume group "ceph-95b94aa4-22df-401c-822b-dd62f82f6b08" successfully created
[osdev01][DEBUG ] Running command: /usr/sbin/lvcreate --yes -l 100%FREE -n osd-block-3c3d6c5a-c82e-4318-a8fb-134de5444ca7 ceph-95b94aa4-22df-401c-822b-dd62f82f6b08
[osdev01][DEBUG ] stdout: Logical volume "osd-block-3c3d6c5a-c82e-4318-a8fb-134de5444ca7" created.
[osdev01][DEBUG ] Running command: /bin/ceph-authtool --gen-print-key
[osdev01][DEBUG ] Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-1
[osdev01][DEBUG ] Running command: /bin/chown -h ceph:ceph /dev/ceph-95b94aa4-22df-401c-822b-dd62f82f6b08/osd-block-3c3d6c5a-c82e-4318-a8fb-134de5444ca7
[osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /dev/dm-0
[osdev01][DEBUG ] Running command: /bin/ln -s /dev/ceph-95b94aa4-22df-401c-822b-dd62f82f6b08/osd-block-3c3d6c5a-c82e-4318-a8fb-134de5444ca7 /var/lib/ceph/osd/ceph-1/block
[osdev01][DEBUG ] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-1/activate.monmap
[osdev01][DEBUG ] stderr: got monmap epoch 1
[osdev01][DEBUG ] Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-1/keyring --create-keyring --name osd.1 --add-key AQDxF35bOAdNHBAAelXgl7laeMnVsGAlHl0dxQ==
[osdev01][DEBUG ] stdout: creating /var/lib/ceph/osd/ceph-1/keyring
[osdev01][DEBUG ] added entity osd.1 auth auth(auid = 18446744073709551615 key=AQDxF35bOAdNHBAAelXgl7laeMnVsGAlHl0dxQ== with 0 caps)
[osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1/keyring
[osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1/
[osdev01][DEBUG ] Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 1 --monmap /var/lib/ceph/osd/ceph-1/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-1/ --osd-uuid 3c3d6c5a-c82e-4318-a8fb-134de5444ca7 --setuser ceph --setgroup ceph
[osdev01][DEBUG ] --> ceph-volume lvm prepare successful for: /dev/sdb
[osdev01][DEBUG ] Running command: /bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-95b94aa4-22df-401c-822b-dd62f82f6b08/osd-block-3c3d6c5a-c82e-4318-a8fb-134de5444ca7 --path /var/lib/ceph/osd/ceph-1 --no-mon-config
[osdev01][DEBUG ] Running command: /bin/ln -snf /dev/ceph-95b94aa4-22df-401c-822b-dd62f82f6b08/osd-block-3c3d6c5a-c82e-4318-a8fb-134de5444ca7 /var/lib/ceph/osd/ceph-1/block
[osdev01][DEBUG ] Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-1/block
[osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /dev/dm-0
[osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
[osdev01][DEBUG ] Running command: /bin/systemctl enable ceph-volume@lvm-1-3c3d6c5a-c82e-4318-a8fb-134de5444ca7
[osdev01][DEBUG ] stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-1-3c3d6c5a-c82e-4318-a8fb-134de5444ca7.service to /usr/lib/systemd/system/ceph-volume@.service.
[osdev01][DEBUG ] Running command: /bin/systemctl start ceph-osd@1
[osdev01][DEBUG ] --> ceph-volume lvm activate successful for osd ID: 1
[osdev01][DEBUG ] --> ceph-volume lvm create successful for: /dev/sdb
[osdev01][INFO ] checking OSD status...
[osdev01][DEBUG ] find the location of an executable
[osdev01][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[osdev01][WARNIN] there is 1 OSD down
[osdev01][WARNIN] there is 1 OSD out
[ceph_deploy.osd][DEBUG ] Host osdev01 is now ready for osd use.
$ ceph-deploy osd create --data /dev/sdb osdev02
$ ceph-deploy osd create --data /dev/sdb osdev03
- Check how the OSD's storage is laid out; newer Ceph versions use BlueStore by default:
$ ceph-deploy osd list osdev01
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy osd list osdev01
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] debug : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : list
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf :
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] host : ['osdev01']
[ceph_deploy.cli][INFO ] func :
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[osdev01][DEBUG ] connected to host: osdev01
[osdev01][DEBUG ] detect platform information from remote host
[osdev01][DEBUG ] detect machine type
[osdev01][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.5.1804 Core
[ceph_deploy.osd][DEBUG ] Listing disks on osdev01...
[osdev01][DEBUG ] find the location of an executable
[osdev01][INFO ] Running command: /usr/sbin/ceph-volume lvm list
[osdev01][DEBUG ]
[osdev01][DEBUG ]
[osdev01][DEBUG ] ====== osd.0 =======
[osdev01][DEBUG ]
[osdev01][DEBUG ] [block] /dev/ceph-a2130090-fb78-4b65-838f-7496c63fa025/osd-block-2cb30e7c-7b98-4a6c-816a-2de7201a7669
[osdev01][DEBUG ]
[osdev01][DEBUG ] type block
[osdev01][DEBUG ] osd id 0
[osdev01][DEBUG ] cluster fsid 383237bd-becf-49d5-9bd6-deb0bc35ab2a
[osdev01][DEBUG ] cluster name ceph
[osdev01][DEBUG ] osd fsid 2cb30e7c-7b98-4a6c-816a-2de7201a7669
[osdev01][DEBUG ] encrypted 0
[osdev01][DEBUG ] cephx lockbox secret
[osdev01][DEBUG ] block uuid AL5bfk-acAQ-9guP-tl61-A4Jf-RQOF-nFnE9o
[osdev01][DEBUG ] block device /dev/ceph-a2130090-fb78-4b65-838f-7496c63fa025/osd-block-2cb30e7c-7b98-4a6c-816a-2de7201a7669
[osdev01][DEBUG ] vdo 0
[osdev01][DEBUG ] crush device class None
[osdev01][DEBUG ] devices /dev/sdb
$ lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
osd-block-2cb30e7c-7b98-4a6c-816a-2de7201a7669 ceph-a2130090-fb78-4b65-838f-7496c63fa025 -wi-ao---- <7.28t
$ pvs
PV VG Fmt Attr PSize PFree
/dev/sdb ceph-a2130090-fb78-4b65-838f-7496c63fa025 lvm2 a-- <7.28t 0
# osdev01
$ df -h | grep ceph
tmpfs 189G 24K 189G 1% /var/lib/ceph/osd/ceph-0
$ ll /var/lib/ceph/osd/ceph-0
总用量 24
lrwxrwxrwx 1 ceph ceph 93 8月 29 15:15 block -> /dev/ceph-a2130090-fb78-4b65-838f-7496c63fa025/osd-block-2cb30e7c-7b98-4a6c-816a-2de7201a7669
-rw------- 1 ceph ceph 37 8月 29 15:15 ceph_fsid
-rw------- 1 ceph ceph 37 8月 29 15:15 fsid
-rw------- 1 ceph ceph 55 8月 29 15:15 keyring
-rw------- 1 ceph ceph 6 8月 29 15:15 ready
-rw------- 1 ceph ceph 10 8月 29 15:15 type
-rw------- 1 ceph ceph 2 8月 29 15:15 whoami
$ cat /var/lib/ceph/osd/ceph-0/whoami
0
$ cat /var/lib/ceph/osd/ceph-0/type
bluestore
$ cat /var/lib/ceph/osd/ceph-0/ready
ready
$ cat /var/lib/ceph/osd/ceph-0/fsid
2cb30e7c-7b98-4a6c-816a-2de7201a7669
# osdev02
$ df -h | grep ceph
tmpfs 189G 48K 189G 1% /var/lib/ceph/osd/ceph-1
- Check the cluster status:
$ sudo ceph health
HEALTH_WARN mon osdev01 is low on available space
$ sudo ceph -s
cluster:
id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
health: HEALTH_WARN
mon osdev01 is low on available space
services:
mon: 3 daemons, quorum osdev01,osdev02,osdev03
mgr: osdev01(active), standbys: osdev03, osdev02
osd: 3 osds: 3 up, 3 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 3.0 GiB used, 22 TiB / 22 TiB avail
pgs:
- Check the OSD status:
$ sudo ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 21.83066 root default
-3 7.27689 host osdev01
0 hdd 7.27689 osd.0 up 1.00000 1.00000
-5 7.27689 host osdev02
1 hdd 7.27689 osd.1 up 1.00000 1.00000
-7 7.27689 host osdev03
2 hdd 7.27689 osd.2 up 1.00000 1.00000
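In the ceph osd tree output above, all three OSDs are up. For a quick scripted check, the sketch below prints the name of every OSD whose status is anything other than up. osds_not_up is a hypothetical helper name. It looks for the osd.N field rather than a fixed column, because rows without a CLASS value have one column fewer.

```shell
# Hypothetical helper: read `ceph osd tree` output on stdin and print the
# names of OSDs whose STATUS column is not "up".
osds_not_up() {
    awk '{
        # Find the NAME field (osd.N); the field right after it is STATUS.
        for (i = 1; i < NF; i++)
            if ($i ~ /^osd\.[0-9]+$/ && $(i + 1) != "up")
                print $i
    }'
}

# Example (commented out; needs a running cluster):
#   sudo ceph osd tree | osds_not_up
```

Empty output means every listed OSD is up.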
Remove an OSD
- Mark the OSD out of the cluster (the first step of removing it):
$ ceph osd out 0
marked out osd.0.
- Watch the data migration:
$ ceph -w
- Stop the OSD service on the corresponding node:
$ systemctl stop ceph-osd@0
- Remove the OSD from the CRUSH map:
$ ceph osd crush remove osd.0
removed item id 0 name 'osd.0' from crush map
- Delete the OSD's authentication key:
$ ceph auth del osd.0
updated
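The removal steps so far can be collected into one reviewable plan. The sketch below only prints commands and runs nothing; osd_removal_plan is a hypothetical helper name. Note that the final ceph osd rm line is an addition and does not appear in the original steps, which stop at ceph auth del. The stray "osd.0 down" row in the later ceph osd tree output is consistent with that entry never having been removed from the OSD map, although that is an inference.

```shell
# Hypothetical helper: print, in order, the commands for taking one OSD out
# of the cluster. Nothing is executed.
# Usage: osd_removal_plan <numeric-osd-id>
osd_removal_plan() {
    id="$1"
    # Accept only a plain number such as 0, not a name such as osd.0.
    case "$id" in
        ''|*[!0-9]*) echo "usage: osd_removal_plan <numeric-osd-id>" >&2; return 1 ;;
    esac
    echo "ceph osd out $id"                 # stop placing data on the OSD
    echo "systemctl stop ceph-osd@$id"      # run on the node hosting the OSD
    echo "ceph osd crush remove osd.$id"    # drop it from the CRUSH map
    echo "ceph auth del osd.$id"            # delete its cephx key
    echo "ceph osd rm $id"                  # addition: remove it from the OSD map
}
```

Between the first and second commands, wait for the data migration (visible in ceph -w) to finish before stopping the daemon.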
- Clean up the OSD's disk:
$ sudo lvs | awk 'NR!=1 {if($1~"osd-block-") print $2 "/" $1}' | xargs -I {} sudo lvremove -y {}
Logical volume "osd-block-2cb30e7c-7b98-4a6c-816a-2de7201a7669" successfully removed
$ ceph-deploy disk zap osdev01 /dev/sdb
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy disk zap osdev01 /dev/sdb
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] debug : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : zap
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf :
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] host : osdev01
[ceph_deploy.cli][INFO ] func :
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] disk : ['/dev/sdb']
[ceph_deploy.osd][DEBUG ] zapping /dev/sdb on osdev01
[osdev01][DEBUG ] connected to host: osdev01
[osdev01][DEBUG ] detect platform information from remote host
[osdev01][DEBUG ] detect machine type
[osdev01][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.5.1804 Core
[osdev01][DEBUG ] zeroing last few blocks of device
[osdev01][DEBUG ] find the location of an executable
[osdev01][INFO ] Running command: /usr/sbin/ceph-volume lvm zap /dev/sdb
[osdev01][DEBUG ] --> Zapping: /dev/sdb
[osdev01][DEBUG ] Running command: /usr/sbin/cryptsetup status /dev/mapper/
[osdev01][DEBUG ] stdout: /dev/mapper/ is inactive.
[osdev01][DEBUG ] Running command: /usr/sbin/wipefs --all /dev/sdb
[osdev01][DEBUG ] stdout: /dev/sdb:8 个字节已擦除,位置偏移为 0x00000218 (LVM2_member):4c 56 4d 32 20 30 30 31
[osdev01][DEBUG ] Running command: /bin/dd if=/dev/zero of=/dev/sdb bs=1M count=10
[osdev01][DEBUG ] stderr: 记录了10+0 的读入
[osdev01][DEBUG ] 记录了10+0 的写出
[osdev01][DEBUG ] 10485760字节(10 MB)已复制
[osdev01][DEBUG ] stderr: ,0.0131341 秒,798 MB/秒
[osdev01][DEBUG ] --> Zapping successful for: /dev/sdb
- Re-add the OSD:
$ ceph-deploy osd create --data /dev/sdb osdev01
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy osd create --data /dev/sdb osdev01
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] bluestore : None
[ceph_deploy.cli][INFO ] cd_conf :
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] fs_type : xfs
[ceph_deploy.cli][INFO ] block_wal : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] journal : None
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] host : osdev01
[ceph_deploy.cli][INFO ] filestore : None
[ceph_deploy.cli][INFO ] func :
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] zap_disk : False
[ceph_deploy.cli][INFO ] data : /dev/sdb
[ceph_deploy.cli][INFO ] block_db : None
[ceph_deploy.cli][INFO ] dmcrypt : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] dmcrypt_key_dir : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] debug : False
[ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device /dev/sdb
[osdev01][DEBUG ] connected to host: osdev01
[osdev01][DEBUG ] detect platform information from remote host
[osdev01][DEBUG ] detect machine type
[osdev01][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.5.1804 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to osdev01
[osdev01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[osdev01][DEBUG ] find the location of an executable
[osdev01][INFO ] Running command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb
[osdev01][DEBUG ] Running command: /bin/ceph-authtool --gen-print-key
[osdev01][DEBUG ] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new df124d5a-122a-48b4-9173-87088c6e6aac
[osdev01][DEBUG ] Running command: /usr/sbin/vgcreate --force --yes ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320 /dev/sdb
[osdev01][DEBUG ] stdout: Physical volume "/dev/sdb" successfully created.
[osdev01][DEBUG ] stdout: Volume group "ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320" successfully created
[osdev01][DEBUG ] Running command: /usr/sbin/lvcreate --yes -l 100%FREE -n osd-block-df124d5a-122a-48b4-9173-87088c6e6aac ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320
[osdev01][DEBUG ] stdout: Logical volume "osd-block-df124d5a-122a-48b4-9173-87088c6e6aac" created.
[osdev01][DEBUG ] Running command: /bin/ceph-authtool --gen-print-key
[osdev01][DEBUG ] Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-3
[osdev01][DEBUG ] Running command: /bin/chown -h ceph:ceph /dev/ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320/osd-block-df124d5a-122a-48b4-9173-87088c6e6aac
[osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /dev/dm-0
[osdev01][DEBUG ] Running command: /bin/ln -s /dev/ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320/osd-block-df124d5a-122a-48b4-9173-87088c6e6aac /var/lib/ceph/osd/ceph-3/block
[osdev01][DEBUG ] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-3/activate.monmap
[osdev01][DEBUG ] stderr: got monmap epoch 4
[osdev01][DEBUG ] Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-3/keyring --create-keyring --name osd.3 --add-key AQDP9qFbXoYRERAAMMz5EHjYAdlveVdDe1uAYg==
[osdev01][DEBUG ] stdout: creating /var/lib/ceph/osd/ceph-3/keyring
[osdev01][DEBUG ] stdout: added entity osd.3 auth auth(auid = 18446744073709551615 key=AQDP9qFbXoYRERAAMMz5EHjYAdlveVdDe1uAYg== with 0 caps)
[osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/keyring
[osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/
[osdev01][DEBUG ] Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 3 --monmap /var/lib/ceph/osd/ceph-3/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-3/ --osd-uuid df124d5a-122a-48b4-9173-87088c6e6aac --setuser ceph --setgroup ceph
[osdev01][DEBUG ] --> ceph-volume lvm prepare successful for: /dev/sdb
[osdev01][DEBUG ] Running command: /bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320/osd-block-df124d5a-122a-48b4-9173-87088c6e6aac --path /var/lib/ceph/osd/ceph-3 --no-mon-config
[osdev01][DEBUG ] Running command: /bin/ln -snf /dev/ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320/osd-block-df124d5a-122a-48b4-9173-87088c6e6aac /var/lib/ceph/osd/ceph-3/block
[osdev01][DEBUG ] Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-3/block
[osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /dev/dm-0
[osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-3
[osdev01][DEBUG ] Running command: /bin/systemctl enable ceph-volume@lvm-3-df124d5a-122a-48b4-9173-87088c6e6aac
[osdev01][DEBUG ] stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-3-df124d5a-122a-48b4-9173-87088c6e6aac.service to /usr/lib/systemd/system/ceph-volume@.service.
[osdev01][DEBUG ] Running command: /bin/systemctl start ceph-osd@3
[osdev01][DEBUG ] --> ceph-volume lvm activate successful for osd ID: 3
[osdev01][DEBUG ] --> ceph-volume lvm create successful for: /dev/sdb
[osdev01][INFO ] checking OSD status...
[osdev01][DEBUG ] find the location of an executable
[osdev01][INFO ] Running command: /bin/ceph --cluster=ceph osd stat --format=json
[osdev01][WARNIN] there is 1 OSD down
[osdev01][WARNIN] there is 1 OSD out
[ceph_deploy.osd][DEBUG ] Host osdev01 is now ready for osd use.
- Check the OSD status:
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 21.83066 root default
-3 7.27689 host osdev01
3 hdd 7.27689 osd.3 up 1.00000 1.00000
-5 7.27689 host osdev02
1 hdd 7.27689 osd.1 up 1.00000 1.00000
-7 7.27689 host osdev03
2 hdd 7.27689 osd.2 up 1.00000 1.00000
0 0 osd.0 down 0 1.00000
$ ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-3/block
{
"/var/lib/ceph/osd/ceph-3/block": {
"osd_uuid": "df124d5a-122a-48b4-9173-87088c6e6aac",
"size": 8000995590144,
"btime": "2018-09-19 15:12:17.376253",
"description": "main",
"bluefs": "1",
"ceph_fsid": "383237bd-becf-49d5-9bd6-deb0bc35ab2a",
"kv_backend": "rocksdb",
"magic": "ceph osd volume v026",
"mkfs_done": "yes",
"osd_key": "AQDP9qFbXoYRERAAMMz5EHjYAdlveVdDe1uAYg==",
"ready": "ready",
"whoami": "3"
}
}
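The tree above still lists `osd.0` as `down` with a weight of 0, which is consistent with the "1 OSD down" and "1 OSD out" warnings printed by `ceph-deploy`. The following is a minimal sketch for spotting such entries, assuming the column layout shown above; the helper name `down_osds` is hypothetical.

```shell
# down_osds: read `ceph osd tree` text on stdin and print the name of
# every OSD whose STATUS column is "down".
# Hypothetical helper; assumes the column layout shown above.
down_osds() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i ~ /^osd\.[0-9]+$/ && $(i + 1) == "down")
                print $i
    }'
}

# Usage on a live cluster (not run here):
#   ceph osd tree | down_osds
```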
- Check the data migration status:
$ ceph -w
cluster:
id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
health: HEALTH_WARN
Degraded data redundancy: 4825/16156 objects degraded (29.865%), 83 pgs degraded, 63 pgs undersized
clock skew detected on mon.osdev02
mon osdev01 is low on available space
services:
mon: 3 daemons, quorum osdev01,osdev02,osdev03
mgr: osdev03(active), standbys: osdev02, osdev01
osd: 4 osds: 3 up, 3 in; 63 remapped pgs
rgw: 3 daemons active
data:
pools: 10 pools, 176 pgs
objects: 5.39 k objects, 19 GiB
usage: 43 GiB used, 22 TiB / 22 TiB avail
pgs: 4825/16156 objects degraded (29.865%)
88 active+clean
48 active+undersized+degraded+remapped+backfill_wait
19 active+recovery_wait+degraded
15 active+recovery_wait+undersized+degraded+remapped
5 active+recovery_wait
1 active+recovering+degraded
io:
recovery: 15 MiB/s, 3 objects/s
2018-09-19 15:14:35.149958 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4825/16156 objects degraded (29.865%), 83 pgs degraded, 63 pgs undersized (PG_DEGRADED)
2018-09-19 15:14:40.154936 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4802/16156 objects degraded (29.723%), 83 pgs degraded, 63 pgs undersized (PG_DEGRADED)
2018-09-19 15:14:45.155511 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4785/16156 objects degraded (29.617%), 72 pgs degraded, 63 pgs undersized (PG_DEGRADED)
2018-09-19 15:14:50.156258 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4761/16156 objects degraded (29.469%), 70 pgs degraded, 63 pgs undersized (PG_DEGRADED)
2018-09-19 15:14:55.157259 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4736/16156 objects degraded (29.314%), 66 pgs degraded, 63 pgs undersized (PG_DEGRADED)
2018-09-19 15:15:00.157805 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4715/16156 objects degraded (29.184%), 66 pgs degraded, 63 pgs undersized (PG_DEGRADED)
2018-09-19 15:15:05.159788 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4700/16156 objects degraded (29.091%), 65 pgs degraded, 62 pgs undersized (PG_DEGRADED)
2018-09-19 15:15:10.160347 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4687/16156 objects degraded (29.011%), 65 pgs degraded, 62 pgs undersized (PG_DEGRADED)
2018-09-19 15:15:15.161346 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4663/16156 objects degraded (28.862%), 65 pgs degraded, 62 pgs undersized (PG_DEGRADED)
2018-09-19 15:15:20.163878 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4639/16156 objects degraded (28.714%), 64 pgs degraded, 62 pgs undersized (PG_DEGRADED)
2018-09-19 15:15:25.166626 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4634/16156 objects degraded (28.683%), 64 pgs degraded, 62 pgs undersized (PG_DEGRADED)
2018-09-19 15:15:30.168933 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4612/16156 objects degraded (28.547%), 62 pgs degraded, 61 pgs undersized (PG_DEGRADED)
2018-09-19 15:15:35.170116 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4590/16156 objects degraded (28.410%), 62 pgs degraded, 61 pgs undersized (PG_DEGRADED)
2018-09-19 15:15:35.310448 mon.osdev01 [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)
2018-09-19 15:15:40.170608 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4578/16156 objects degraded (28.336%), 60 pgs degraded, 60 pgs undersized (PG_DEGRADED)
2018-09-19 15:15:41.314443 mon.osdev01 [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 1 pg inactive, 1 pg peering)
2018-09-19 15:15:45.171537 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4564/16156 objects degraded (28.250%), 60 pgs degraded, 60 pgs undersized (PG_DEGRADED)
2018-09-19 15:15:50.172340 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4546/16156 objects degraded (28.138%), 59 pgs degraded, 59 pgs undersized (PG_DEGRADED)
2018-09-19 15:15:55.173243 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4536/16156 objects degraded (28.076%), 59 pgs degraded, 59 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:00.174125 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4514/16156 objects degraded (27.940%), 59 pgs degraded, 59 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:05.176502 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4496/16156 objects degraded (27.829%), 58 pgs degraded, 58 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:10.177113 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4486/16156 objects degraded (27.767%), 58 pgs degraded, 58 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:15.178024 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4464/16156 objects degraded (27.631%), 58 pgs degraded, 58 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:20.178774 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4457/16156 objects degraded (27.587%), 57 pgs degraded, 57 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:25.179609 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4436/16156 objects degraded (27.457%), 57 pgs degraded, 57 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:30.180333 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4426/16156 objects degraded (27.395%), 56 pgs degraded, 56 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:35.180850 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4404/16156 objects degraded (27.259%), 56 pgs degraded, 56 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:37.760009 mon.osdev01 [WRN] mon.1 172.29.101.167:6789/0 clock skew 1.47964s > max 0.5s
2018-09-19 15:16:40.181520 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4383/16156 objects degraded (27.129%), 55 pgs degraded, 55 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:45.183101 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4373/16156 objects degraded (27.067%), 55 pgs degraded, 55 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:50.184008 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4351/16156 objects degraded (26.931%), 55 pgs degraded, 55 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:51.434708 mon.osdev01 [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)
2018-09-19 15:16:55.184869 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4336/16156 objects degraded (26.838%), 54 pgs degraded, 54 pgs undersized (PG_DEGRADED)
2018-09-19 15:16:56.238863 mon.osdev01 [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 1 pg inactive, 1 pg peering)
2018-09-19 15:17:00.185629 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4318/16156 objects degraded (26.727%), 54 pgs degraded, 54 pgs undersized (PG_DEGRADED)
2018-09-19 15:17:05.186503 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4296/16156 objects degraded (26.591%), 54 pgs degraded, 54 pgs undersized (PG_DEGRADED)
2018-09-19 15:17:10.187331 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4283/16156 objects degraded (26.510%), 52 pgs degraded, 52 pgs undersized (PG_DEGRADED)
2018-09-19 15:17:15.188170 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4261/16156 objects degraded (26.374%), 52 pgs degraded, 52 pgs undersized (PG_DEGRADED)
2018-09-19 15:17:20.189922 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4243/16156 objects degraded (26.263%), 51 pgs degraded, 51 pgs undersized (PG_DEGRADED)
2018-09-19 15:17:25.190843 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4227/16156 objects degraded (26.164%), 51 pgs degraded, 51 pgs undersized (PG_DEGRADED)
2018-09-19 15:17:30.191813 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4205/16156 objects degraded (26.027%), 51 pgs degraded, 51 pgs undersized (PG_DEGRADED)
2018-09-19 15:17:32.348305 mon.osdev01 [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)
...
$ watch -n1 ceph -s
Every 1.0s: ceph -s Wed Sep 19 15:21:12 2018
cluster:
id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
health: HEALTH_WARN
Degraded data redundancy: 3372/16156 objects degraded (20.872%), 36 pgs degraded, 36 pgs undersized
clock skew detected on mon.osdev02
mon osdev01 is low on available space
services:
mon: 3 daemons, quorum osdev01,osdev02,osdev03
mgr: osdev03(active), standbys: osdev02, osdev01
osd: 4 osds: 3 up, 3 in; 36 remapped pgs
rgw: 3 daemons active
data:
pools: 10 pools, 176 pgs
objects: 5.39 k objects, 19 GiB
usage: 48 GiB used, 22 TiB / 22 TiB avail
pgs: 3372/16156 objects degraded (20.872%)
140 active+clean
35 active+undersized+degraded+remapped+backfill_wait
1 active+undersized+degraded+remapped+backfilling
io:
recovery: 17 MiB/s, 4 objects/s
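The percentage in the status output is simply the degraded object count divided by the total object count. As a rough sketch (the helper name `degraded_pct` is an assumption, not a Ceph tool), the ratio can be recomputed from the text so that recovery progress can be logged or compared over time:

```shell
# degraded_pct: read `ceph -s` text on stdin, find the first
# "<degraded>/<total> objects degraded" pair, and print the percentage.
# Hypothetical helper; assumes the wording shown in the output above.
degraded_pct() {
    awk '/objects degraded/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+\/[0-9]+$/) {
                split($i, n, "/")
                printf "%.3f\n", n[1] / n[2] * 100
                exit
            }
    }'
}

# Usage on a live cluster (not run here):
#   ceph -s | degraded_pct
```

Fed the first status line above (4825/16156), it should print the same 29.865 that Ceph reports.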
Deploy MDS
- Deploy the MDS service on all 3 nodes:
$ ceph-deploy mds create osdev01 osdev02 osdev03
Deploy RGW
- Deploy the RGW service on all 3 nodes:
$ ceph-deploy rgw create osdev01 osdev02 osdev03
- Check the cluster status:
$ sudo ceph -s
cluster:
id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
health: HEALTH_WARN
too few PGs per OSD (22 < min 30)
services:
mon: 3 daemons, quorum osdev01,osdev02,osdev03
mgr: osdev01(active), standbys: osdev03, osdev02
osd: 3 osds: 3 up, 3 in
rgw: 1 daemon active
data:
pools: 4 pools, 32 pgs
objects: 16 objects, 3.2 KiB
usage: 3.0 GiB used, 22 TiB / 22 TiB avail
pgs: 31.250% pgs unknown
3.125% pgs not active
21 active+clean
10 unknown
1 creating+peering
io:
client: 2.4 KiB/s rd, 731 B/s wr, 3 op/s rd, 0 op/s wr
Uninstall Ceph
- Uninstall the deployed Ceph, including its packages and configuration:
# destroy and uninstall all packages
$ ceph-deploy purge osdev01 osdev02 osdev03
# destroy data
$ ceph-deploy purgedata osdev01 osdev02 osdev03
$ ceph-deploy forgetkeys
# remove all keys
$ rm -rfv ceph.*
Testing and Usage
Create a Pool
- List the current pools; several default pools created by the RGW gateway are already present:
$ rados lspools
.rgw.root
default.rgw.control
default.rgw.meta
default.rgw.log
$ rados -p .rgw.root ls
zone_info.4741b9cf-cc27-43d8-9bbc-59eee875b4db
zone_info.c775c6a6-036a-43ab-b558-ab0df40c3ad2
zonegroup_info.df77b60a-8423-4570-b9ae-ae4ef06a13a2
zone_info.0e5daa99-3863-4411-8d75-7d14a3f9a014
zonegroup_info.f652f53f-94bb-4599-a1c1-737f792a9510
zonegroup_info.5a4fb515-ef63-4ddc-85e0-5cf8339d9472
zone_names.default
zonegroups_names.default
$ ceph osd pool get .rgw.root pg_num
pg_num: 8
$ ceph osd dump
epoch 25
fsid 383237bd-becf-49d5-9bd6-deb0bc35ab2a
created 2018-08-23 10:55:49.409542
modified 2018-08-23 16:23:00.574710
flags sortbitwise,recovery_deletes,purged_snapdirs
crush_version 7
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client jewel
min_compat_client jewel
require_osd_release mimic
pool 1 '.rgw.root' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 17 flags hashpspool stripe_width 0 application rgw
pool 2 'default.rgw.control' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 20 flags hashpspool stripe_width 0 application rgw
pool 3 'default.rgw.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 22 flags hashpspool stripe_width 0 application rgw
pool 4 'default.rgw.log' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 24 flags hashpspool stripe_width 0 application rgw
max_osd 3
osd.0 up in weight 1 up_from 5 up_thru 23 down_at 0 last_clean_interval [0,0) 172.29.101.166:6801/719880 172.29.101.166:6802/719880 172.29.101.166:6803/719880 172.29.101.166:6804/719880 exists,up 2cb30e7c-7b98-4a6c-816a-2de7201a7669
osd.1 up in weight 1 up_from 15 up_thru 23 down_at 14 last_clean_interval [9,14) 172.29.101.167:6800/189449 172.29.101.167:6804/1189449 172.29.101.167:6805/1189449 172.29.101.167:6806/1189449 exists,up 9d3bafa9-9ea0-401c-ad67-a08ef7c2d9f7
osd.2 up in weight 1 up_from 13 up_thru 23 down_at 0 last_clean_interval [0,0) 172.29.101.168:6800/188591 172.29.101.168:6801/188591 172.29.101.168:6802/188591 172.29.101.168:6803/188591 exists,up a41fa4e0-c80b-4091-95cc-b58af291f387
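The earlier warning "too few PGs per OSD" depends on how many PG replicas land on each OSD, which can be estimated from the dump above as the sum of `pg_num * size` over all pools divided by the number of OSDs. The sketch below assumes the `pool ...` line format shown above; `pgs_per_osd` is a hypothetical helper name.

```shell
# pgs_per_osd: read `ceph osd dump` text on stdin and print the average
# number of PG replicas per OSD, i.e. sum(pg_num * size) / OSD count.
# $1 = number of OSDs (3 in this cluster).
# Hypothetical helper; assumes the "pool ..." line format shown above.
pgs_per_osd() {
    awk -v osds="$1" '$1 == "pool" {
        size = 0; pgs = 0
        for (i = 1; i < NF; i++) {
            if ($i == "size")   size = $(i + 1)
            if ($i == "pg_num") pgs  = $(i + 1)
        }
        total += size * pgs
    }
    END { printf "%d\n", total / osds }'
}

# Usage on a live cluster (not run here):
#   ceph osd dump | pgs_per_osd 3
```

For the four pools above (8 PGs each, size 3) on 3 OSDs this works out to 32. The lower figure of 22 in the earlier warning was presumably measured while some PGs were still being created.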
- Create a pool:
$ ceph osd pool create glance 32 32
pool 'glance' created
- Try to delete the pool; the deletion is refused:
$ ceph osd pool delete glance
Error EPERM: WARNING: this will *PERMANENTLY DESTROY* all data stored in pool glance. If you are *ABSOLUTELY CERTAIN* that is what you want, pass the pool name *twice*, followed by --yes-i-really-really-mean-it.
$ ceph osd pool delete glance glance --yes-i-really-really-mean-it
Error EPERM: pool deletion is disabled; you must first set the mon_allow_pool_delete config option to true before you can destroy a pool
- Configure the monitors to allow pool deletion:
$ vi /etc/ceph/ceph.conf
[mon]
mon allow pool delete = true
$ systemctl restart ceph-mon.target
- Delete the pool again:
$ ceph osd pool delete glance glance --yes-i-really-really-mean-it
pool 'glance' removed
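Before restarting the monitors it can help to confirm that the option really is present in the configuration file. The following is a small sketch with a hypothetical helper name, `pool_delete_enabled`; it only inspects the file on disk, not the value held by the running monitors.

```shell
# pool_delete_enabled: succeed if the given ceph.conf-style file sets
# "mon allow pool delete" (written with spaces or underscores) to true.
# Hypothetical helper; checks the file only, not the running daemons.
pool_delete_enabled() {
    grep -Eq '^[[:space:]]*mon[ _]allow[ _]pool[ _]delete[[:space:]]*=[[:space:]]*true' "$1"
}

# Usage (not run here):
#   pool_delete_enabled /etc/ceph/ceph.conf && echo "pool deletion allowed"
```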
Create an Object
- Create a test pool and set its replica count to 3:
$ ceph osd pool create test-pool 128 128
$ ceph osd lspools
1 .rgw.root
2 default.rgw.control
3 default.rgw.meta
4 default.rgw.log
5 test-pool
$ ceph osd dump | grep pool
pool 1 '.rgw.root' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 17 flags hashpspool stripe_width 0 application rgw
pool 2 'default.rgw.control' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 20 flags hashpspool stripe_width 0 application rgw
pool 3 'default.rgw.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 22 flags hashpspool stripe_width 0 application rgw
pool 4 'default.rgw.log' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 24 flags hashpspool stripe_width 0 application rgw
pool 5 'test-pool' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 26 flags hashpspool stripe_width 0
$ rados lspools
.rgw.root
default.rgw.control
default.rgw.meta
default.rgw.log
test-pool
# set replicated size
$ ceph osd pool set test-pool size 3
set pool 5 size to 3
$ rados -p test-pool ls
- Create a test file:
$ echo "Hello Ceph, You are Awesome like MJ" > hello_ceph
- Create an object:
$ rados -p test-pool put object1 hello_ceph
- Look up the object in the OSD map, which shows its name, the PG and OSDs it belongs to, and their state:
$ ceph osd map test-pool object1
osdmap e29 pool 'test-pool' (5) object 'object1' -> pg 5.bac5debc (5.3c) -> up ([0,1,2], p0) acting ([0,1,2], p0)
$ rados -p test-pool ls
object1
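In the `ceph osd map` output above, `5.bac5debc` is the pool ID plus the object's hash, and `5.3c` is the PG that hash falls into. When `pg_num` is a power of two (128 for `test-pool`), the PG number is the low bits of the hash. The sketch below reproduces that step; `pg_for_hash` is a hypothetical helper, and the power-of-two assumption is stated in the comment.

```shell
# pg_for_hash: print "<pool>.<pg>" for an object hash.
# $1 = pool id, $2 = object hash in hex (as printed by `ceph osd map`),
# $3 = pg_num of the pool (assumed here to be a power of two).
# Hypothetical helper for illustration only.
pg_for_hash() {
    printf '%d.%x\n' "$1" "$(( 0x$2 & ($3 - 1) ))"
}

pg_for_hash 5 bac5debc 128   # object1 in test-pool, prints 5.3c
```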
Create an RBD
- Create an RBD pool:
$ ceph osd pool create rbd 8 8
$ rbd pool init rbd
- Create an RBD image:
$ rbd create rbd_test --size 10240
- Check what changed in RADOS and in the OSD map; the newly created RBD image adds 3 objects:
$ rbd ls
rbd_test
$ rados -p rbd ls
rbd_directory
rbd_header.11856b8b4567
rbd_info
rbd_object_map.11856b8b4567
rbd_id.rbd_test
$ ceph osd dump | grep pool
pool 1 '.rgw.root' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 17 flags hashpspool stripe_width 0 application rgw
pool 2 'default.rgw.control' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 20 flags hashpspool stripe_width 0 application rgw
pool 3 'default.rgw.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 22 flags hashpspool stripe_width 0 application rgw
pool 4 'default.rgw.log' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 24 flags hashpspool stripe_width 0 application rgw
pool 5 'test-pool' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 29 flags hashpspool stripe_width 0
pool 6 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 35 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
Map the RBD
- Load the RBD kernel module:
$ uname -r
3.10.0-862.11.6.el7.x86_64
$ modprobe rbd
$ lsmod | grep rbd
rbd 83728 0
libceph 301687 1 rbd
- Map the RBD block device; the mapping fails because the kernel is too old to support some of the image's features:
$ rbd map rbd_test
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable rbd_test object-map fast-diff deep-flatten".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address
$ dmesg | tail
[150078.190941] Key type dns_resolver registered
[150078.231155] Key type ceph registered
[150078.231538] libceph: loaded (mon/osd proto 15/24)
[150078.239110] rbd: loaded
[152620.392095] libceph: mon1 172.29.101.167:6789 session established
[152620.392821] libceph: client4522 fsid 383237bd-becf-49d5-9bd6-deb0bc35ab2a
[152620.646943] rbd: image rbd_test: image uses unsupported features: 0x38
[152648.322295] libceph: mon0 172.29.101.166:6789 session established
[152648.322845] libceph: client4530 fsid 383237bd-becf-49d5-9bd6-deb0bc35ab2a
[152648.357522] rbd: image rbd_test: image uses unsupported features: 0x38
- Check the features of the RBD block device:
$ rbd info rbd_test
rbd image 'rbd_test':
size 10 GiB in 2560 objects
order 22 (4 MiB objects)
id: 11856b8b4567
block_name_prefix: rbd_data.11856b8b4567
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
op_features:
flags:
create_timestamp: Fri Aug 24 10:21:11 2018
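layering: supports layering (cloning)
striping: supports striping v2
exclusive-lock: supports exclusive locking
object-map: supports object maps (requires exclusive-lock)
fast-diff: fast diff calculation (requires object-map)
deep-flatten: supports flattening of snapshots
journaling: journals I/O operations (requires exclusive-lock)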
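The kernel message `image uses unsupported features: 0x38` is a bitmask over these features, with layering=1, striping=2, exclusive-lock=4, object-map=8, fast-diff=16, deep-flatten=32 and journaling=64. The following sketch decodes such a mask; `rbd_feature_names` is a hypothetical helper, not part of the `rbd` CLI.

```shell
# rbd_feature_names: print the feature names encoded in a feature bitmask.
# Bit values: layering=1, striping=2, exclusive-lock=4, object-map=8,
# fast-diff=16, deep-flatten=32, journaling=64.
# Hypothetical helper; accepts decimal or 0x-prefixed hex.
rbd_feature_names() {
    mask=$(( $1 ))
    out=""
    bit=1
    for name in layering striping exclusive-lock object-map fast-diff deep-flatten journaling; do
        if [ $(( mask & bit )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
        bit=$(( bit * 2 ))
    done
    printf '%s\n' "$out"
}

rbd_feature_names 0x38   # prints: object-map fast-diff deep-flatten
rbd_feature_names 1      # prints: layering
```

The first call yields exactly the three features that the `rbd map` error message suggests disabling; the second corresponds to a default feature set of 1.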
- One fix is to change Ceph's default RBD features:
$ vi /etc/ceph/ceph.conf
rbd_default_features = 1
$ ceph --show-config | grep rbd | grep features
rbd_default_features = 1
- Or specify the features when creating the RBD image:
$ rbd create rbd_test --size 10G --image-format 2 --image-feature layering
- Or disable the features that the kernel does not support:
$ rbd feature disable rbd_test object-map fast-diff deep-flatten
$ rbd info rbd_test
rbd image 'rbd_test':
size 10 GiB in 2560 objects
order 22 (4 MiB objects)
id: 11856b8b4567
block_name_prefix: rbd_data.11856b8b4567
format: 2
features: layering, exclusive-lock
op_features:
flags:
create_timestamp: Fri Aug 24 10:21:11 2018
- Map the RBD again:
# rbd map rbd/rbd_test
$ rbd map rbd_test
/dev/rbd0
$ rbd showmapped
id pool image snap device
0 rbd rbd_test - /dev/rbd0
$ lsblk | grep rbd0
rbd0 252:0 0 10.2G 0 disk
Use the RBD
- Create a file system:
$ mkfs.xfs /dev/rbd0
meta-data=/dev/rbd0 isize=512 agcount=16, agsize=167936 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0, sparse=0
data = bsize=4096 blocks=2682880, imaxpct=25
= sunit=1024 swidth=1024 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal log bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=8 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
- Mount the RBD and write some data:
$ mkdir -pv /mnt/rbd_test
mkdir: created directory '/mnt/rbd_test'
$ mount /dev/rbd0 /mnt/rbd_test
$ dd if=/dev/zero of=/mnt/rbd_test/fi1e1 count=100 bs=1M
- Check what changed in RADOS; an RBD image is split into many small objects:
$ ll -h /mnt/rbd_test/
total 100M
-rw-r--r-- 1 root root 100M Aug 24 11:35 fi1e1
$ rados -p rbd ls | grep 1185
rbd_data.11856b8b4567.0000000000000003
rbd_data.11856b8b4567.00000000000003d8
rbd_data.11856b8b4567.0000000000000d74
rbd_data.11856b8b4567.0000000000001294
rbd_data.11856b8b4567.0000000000000522
rbd_data.11856b8b4567.0000000000000007
rbd_data.11856b8b4567.0000000000001338
rbd_data.11856b8b4567.0000000000000018
rbd_data.11856b8b4567.000000000000000d
rbd_data.11856b8b4567.0000000000000148
rbd_data.11856b8b4567.00000000000000a4
rbd_data.11856b8b4567.00000000000013dc
rbd_data.11856b8b4567.0000000000000013
rbd_header.11856b8b4567
rbd_data.11856b8b4567.0000000000000000
rbd_data.11856b8b4567.0000000000000a40
rbd_data.11856b8b4567.000000000000114c
rbd_data.11856b8b4567.0000000000000008
rbd_data.11856b8b4567.0000000000000b88
rbd_data.11856b8b4567.0000000000000009
rbd_data.11856b8b4567.0000000000000521
rbd_data.11856b8b4567.0000000000000010
rbd_data.11856b8b4567.00000000000008f8
rbd_data.11856b8b4567.0000000000000012
rbd_data.11856b8b4567.0000000000000016
rbd_data.11856b8b4567.0000000000000014
rbd_data.11856b8b4567.000000000000001a
rbd_data.11856b8b4567.0000000000000854
rbd_data.11856b8b4567.000000000000000c
rbd_data.11856b8b4567.0000000000000ae4
rbd_data.11856b8b4567.000000000000047c
rbd_data.11856b8b4567.0000000000000005
rbd_data.11856b8b4567.0000000000000e18
rbd_data.11856b8b4567.000000000000000f
rbd_data.11856b8b4567.0000000000000cd0
rbd_data.11856b8b4567.00000000000001ec
rbd_data.11856b8b4567.0000000000000017
rbd_data.11856b8b4567.0000000000000a3b
rbd_data.11856b8b4567.0000000000000011
rbd_data.11856b8b4567.000000000000070c
rbd_data.11856b8b4567.0000000000000520
rbd_data.11856b8b4567.00000000000010a8
rbd_data.11856b8b4567.0000000000000015
rbd_data.11856b8b4567.0000000000000004
rbd_data.11856b8b4567.000000000000099c
rbd_data.11856b8b4567.0000000000000001
rbd_data.11856b8b4567.000000000000000b
rbd_data.11856b8b4567.0000000000000c2c
rbd_data.11856b8b4567.0000000000000334
rbd_data.11856b8b4567.00000000000005c4
rbd_data.11856b8b4567.000000000000000a
rbd_data.11856b8b4567.0000000000000006
rbd_data.11856b8b4567.0000000000000668
rbd_data.11856b8b4567.0000000000001004
rbd_data.11856b8b4567.0000000000000019
rbd_data.11856b8b4567.00000000000011f0
rbd_data.11856b8b4567.000000000000000e
rbd_data.11856b8b4567.0000000000000f60
rbd_data.11856b8b4567.00000000000007b0
rbd_data.11856b8b4567.0000000000000290
rbd_data.11856b8b4567.0000000000000ebc
rbd_data.11856b8b4567.0000000000000002
$ rados -p rbd ls | grep 1185 | wc -l
62
- Write data again and check the change; as more data is written, the number of objects grows:
$ dd if=/dev/zero of=/mnt/rbd_test/fi1e1 count=200 bs=1M
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 0.441176 s, 475 MB/s
$ rados -p rbd ls | grep 1185 | wc -l
87
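The object names follow from the `rbd info` output: `order 22` means each object covers 2^22 bytes (4 MiB), and each data object is named `<block_name_prefix>.<index>` with the index written as 16 hex digits. Objects only appear once the corresponding range has been written, which is why the count grows with the data. A minimal sketch of that arithmetic, with hypothetical helper names:

```shell
# rbd_object_count: maximum number of data objects backing an image.
# $1 = image size in MiB, $2 = order (object size is 2^order bytes).
# Hypothetical helper for illustration only.
rbd_object_count() {
    bytes=$(( $1 * 1048576 ))
    echo $(( (bytes + (1 << $2) - 1) >> $2 ))
}

# rbd_object_name: name of the data object that holds a byte offset.
# $1 = block_name_prefix, $2 = byte offset into the image, $3 = order.
rbd_object_name() {
    printf '%s.%016x\n' "$1" "$(( $2 >> $3 ))"
}

rbd_object_count 10240 22    # prints 2560, as in "10 GiB in 2560 objects"
rbd_object_name rbd_data.11856b8b4567 $(( 100 * 1048576 )) 22
# prints rbd_data.11856b8b4567.0000000000000019
```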
Resize the RBD
- Resize the RBD image:
$ rbd resize rbd_test --size 20480
Resizing image: 100% complete...done.
- Grow the file system to match:
$ xfs_growfs -d /mnt/rbd_test/
meta-data=/dev/rbd0 isize=512 agcount=16, agsize=167936 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0 spinodes=0
data = bsize=4096 blocks=2682880, imaxpct=25
= sunit=1024 swidth=1024 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=8 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
data blocks changed from 2682880 to 5242880
- Check the resulting RBD changes:
$ rbd info rbd_test
rbd image 'rbd_test':
size 20 GiB in 5120 objects
order 22 (4 MiB objects)
id: 11856b8b4567
block_name_prefix: rbd_data.11856b8b4567
format: 2
features: layering, exclusive-lock
op_features:
flags:
create_timestamp: Fri Aug 24 10:21:11 2018
$ lsblk | grep rbd0
rbd0 252:0 0 20G 0 disk /mnt/rbd_test
$ df -h | grep rbd
/dev/rbd0 20G 234M 20G 2% /mnt/rbd_test
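The `xfs_growfs` line "data blocks changed from 2682880 to 5242880" can be cross-checked against the sizes above by multiplying the block count by `bsize=4096`. A small sketch, with a hypothetical helper name:

```shell
# blocks_to_gib: convert a file system block count to GiB.
# $1 = number of blocks, $2 = block size in bytes.
# Hypothetical helper for illustration only.
blocks_to_gib() {
    awk -v n="$1" -v b="$2" 'BEGIN { printf "%.2f\n", n * b / (1024 * 1024 * 1024) }'
}

blocks_to_gib 2682880 4096   # before growing, prints 10.23
blocks_to_gib 5242880 4096   # after growing, prints 20.00
```

The second value agrees with the 20G reported by `lsblk` and `df` after the resize.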
Snapshot the RBD
- Create a test file:
$ echo "Hello Ceph This is snapshot test" > /mnt/rbd_test/file2
$ ls -lh /mnt/rbd_test/
total 201M
-rw-r--r-- 1 root root 200M Aug 24 15:46 fi1e1
-rw-r--r-- 1 root root 33 Aug 24 15:51 file2
$ cat /mnt/rbd_test/file2
Hello Ceph This is snapshot test
- Create an RBD snapshot:
$ rbd snap create rbd_test@snap1
$ rbd snap ls rbd_test
SNAPID NAME SIZE TIMESTAMP
4 snap1 20 GiB Fri Aug 24 15:52:49 2018
- Delete the file:
$ rm -rfv /mnt/rbd_test/file2
removed '/mnt/rbd_test/file2'
$ ls -lh /mnt/rbd_test/
total 200M
-rw-r--r-- 1 root root 200M Aug 24 15:46 fi1e1
- Unmount the file system and unmap the RBD:
$ umount /mnt/rbd_test
$ rbd unmap rbd_test
- Roll the RBD back to the snapshot:
$ rbd snap rollback rbd_test@snap1
Rolling back to snapshot: 100% complete...done.
- Map and mount the RBD again, then check the files:
$ rbd map rbd_test
/dev/rbd0
$ mount /dev/rbd0 /mnt/rbd_test
$ ls -lh /mnt/rbd_test/
total 201M
-rw-r--r-- 1 root root 200M Aug 24 15:46 fi1e1
-rw-r--r-- 1 root root 33 Aug 24 15:51 file2
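The rollback above is order-sensitive: the file system is unmounted and the image unmapped before `rbd snap rollback`, and only then mapped and mounted again. As a hedged sketch, the sequence can be generated for review before running it; `rollback_commands` is a hypothetical helper, and the device path defaults to `/dev/rbd0` only because that is what `rbd map` returned above.

```shell
# rollback_commands: print, in order, the commands used above to roll an
# RBD image back to a snapshot. Nothing is executed.
# $1 = image, $2 = snapshot, $3 = mount point,
# $4 = block device (optional; defaults to /dev/rbd0 as seen above).
# Hypothetical helper for illustration only.
rollback_commands() {
    printf 'umount %s\n' "$3"
    printf 'rbd unmap %s\n' "$1"
    printf 'rbd snap rollback %s@%s\n' "$1" "$2"
    printf 'rbd map %s\n' "$1"
    printf 'mount %s %s\n' "${4:-/dev/rbd0}" "$3"
}

rollback_commands rbd_test snap1 /mnt/rbd_test
```

After checking the printed lines, they could be piped to `sh -e` so that the sequence stops at the first failing step.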
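Observe PGs
- Look up a few objects from the rbd pool in the OSD map. The OSD order within each PG is not always the same, and for objects in the same pool the number before the decimal point in the PG ID (the pool ID) is identical: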
$ ceph osd map rbd rbd_info
osdmap e74 pool 'rbd' (6) object 'rbd_info' -> pg 6.ac0e573a (6.2) -> up ([1,0,2], p1) acting ([1,0,2], p1)
$ ceph osd map rbd rbd_directory
osdmap e74 pool 'rbd' (6) object 'rbd_directory' -> pg 6.30a98c1c (6.4) -> up ([0,1,2], p0) acting ([0,1,2], p0)
$ ceph osd map rbd rbd_id.rbd_test
osdmap e74 pool 'rbd' (6) object 'rbd_id.rbd_test' -> pg 6.818788b3 (6.3) -> up ([1,2,0], p1) acting ([1,2,0], p1)
$ ceph osd map rbd rbd_data.11856b8b4567.0000000000000022
osdmap e74 pool 'rbd' (6) object 'rbd_data.11856b8b4567.0000000000000022' -> pg 6.deee7c73 (6.3) -> up ([1,2,0], p1) acting ([1,2,0], p1)
$ ceph osd map rbd rbd_data.11856b8b4567.000000000000000a
osdmap e74 pool 'rbd' (6) object 'rbd_data.11856b8b4567.000000000000000a' -> pg 6.561c344b (6.3) -> up ([1,2,0], p1) acting ([1,2,0], p1)
$ ceph osd map rbd rbd_data.11856b8b4567.00000000000007b0
osdmap e74 pool 'rbd' (6) object 'rbd_data.11856b8b4567.00000000000007b0' -> pg 6.a603e1f (6.7) -> up ([1,0,2], p1) acting ([1,0,2], p1)
- Create a pool with two replicas; the PGs of objects in the same pool may also use different OSDs:
$ ceph osd pool create pg_test 8 8
pool 'pg_test' created
$ ceph osd dump | grep pg_test
pool 12 'pg_test' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 75 flags hashpspool stripe_width 0
$ ceph osd pool set pg_test size 2
set pool 12 size to 2
$ ceph osd dump | grep pg_test
pool 12 'pg_test' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 78 flags hashpspool stripe_width 0
$ rados -p pg_test put object1 /etc/hosts
$ rados -p pg_test put object2 /etc/hosts
$ rados -p pg_test put object3 /etc/hosts
$ rados -p pg_test put object4 /etc/hosts
$ rados -p pg_test put object5 /etc/hosts
$ rados -p pg_test ls
object1
object2
object3
object4
object5
$ ceph osd map pg_test object1
osdmap e79 pool 'pg_test' (12) object 'object1' -> pg 12.bac5debc (12.4) -> up ([2,0], p2) acting ([2,0], p2)
$ ceph osd map pg_test object2
osdmap e79 pool 'pg_test' (12) object 'object2' -> pg 12.f85a416a (12.2) -> up ([2,0], p2) acting ([2,0], p2)
$ ceph osd map pg_test object3
osdmap e79 pool 'pg_test' (12) object 'object3' -> pg 12.f877ac20 (12.0) -> up ([1,0], p1) acting ([1,0], p1)
$ ceph osd map pg_test object4
osdmap e79 pool 'pg_test' (12) object 'object4' -> pg 12.9d9216ab (12.3) -> up ([2,1], p2) acting ([2,1], p2)
$ ceph osd map pg_test object5
osdmap e79 pool 'pg_test' (12) object 'object5' -> pg 12.e1acd6d (12.5) -> up ([1,2], p1) acting ([1,2], p1)
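To compare placements side by side, the relevant fields can be pulled out of each `ceph osd map` line. This is a rough sketch that assumes the line format shown above; `acting_sets` is a hypothetical helper name.

```shell
# acting_sets: read `ceph osd map` output lines on stdin and print
# "<object> <pg> <acting set>" for each line.
# Hypothetical helper; assumes the line format shown above.
acting_sets() {
    sed -n "s/.*object '\([^']*\)' -> pg [^ ]* (\([^)]*\)) -> .*acting (\(\[[^]]*]\).*/\1 \2 \3/p"
}

# Usage on a live cluster (not run here):
#   for o in object1 object2 object3 object4 object5; do
#       ceph osd map pg_test "$o"
#   done | acting_sets
```

For the five objects above this would give one short line per object, making it easy to see that the two-replica pool uses the OSD pairs [2,0], [1,0], [2,1] and [1,2].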
Test Performance
- Write performance test:
$ rados bench -p test-pool 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_osdev01_1827771
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 31 15 59.8716 60 0.388146 0.666288
2 16 49 33 65.9176 72 0.62486 0.824162
3 16 65 49 65.2595 64 1.18038 0.834558
4 16 86 70 69.8978 84 0.657194 0.834779
5 16 107 91 72.7115 84 0.594541 0.829814
6 16 125 109 72.5838 72 0.371435 0.796664
7 16 149 133 75.8989 96 1.17764 0.803259
8 16 165 149 74.4101 64 0.568129 0.797091
9 16 185 169 75.01 80 0.813372 0.81463
10 16 203 187 74.7085 72 0.728715 0.812529
Total time run: 10.3161
Total writes made: 203
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 78.7122
Stddev Bandwidth: 11.1634
Max bandwidth (MB/sec): 96
Min bandwidth (MB/sec): 60
Average IOPS: 19
Stddev IOPS: 2
Max IOPS: 24
Min IOPS: 15
Average Latency(s): 0.80954
Stddev Latency(s): 0.293645
Max latency(s): 1.77366
Min latency(s): 0.240024
- Sequential read performance test:
$ rados bench -p test-pool 10 seq
hints = 1
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 72 56 223.808 224 0.0519066 0.217292
2 16 111 95 189.736 156 0.658876 0.289657
3 16 160 144 191.663 196 0.0658452 0.301259
4 16 203 187 186.745 172 0.210803 0.297584
Total time run: 4.43386
Total reads made: 203
Read size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 183.136
Average IOPS: 45
Stddev IOPS: 7
Max IOPS: 56
Min IOPS: 39
Average Latency(s): 0.346754
Max latency(s): 1.37891
Min latency(s): 0.0249563
- Random read performance test:
$ rados bench -p test-pool 10 rand
hints = 1
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 59 43 171.94 172 0.271225 0.222279
2 16 108 92 183.95 196 1.06429 0.275433
3 16 153 137 182.618 180 0.00350975 0.304582
4 16 224 208 207.951 284 0.0678476 0.278888
5 16 267 251 200.757 172 0.00386545 0.289519
6 16 319 303 201.955 208 0.866646 0.294983
7 16 360 344 196.529 164 0.00428517 0.30615
8 16 405 389 194.458 180 0.903073 0.311316
9 16 455 439 195.071 200 0.00368576 0.316057
10 16 517 501 200.36 248 0.621325 0.309242
Total time run: 10.5614
Total reads made: 518
Read size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 196.187
Average IOPS: 49
Stddev IOPS: 9
Max IOPS: 71
Min IOPS: 41
Average Latency(s): 0.321834
Max latency(s): 1.16304
Min latency(s): 0.0026629
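The "Bandwidth (MB/sec)" figures in the three summaries can be recomputed from the operation count, the object size and the total run time, which is a quick sanity check when comparing runs. A minimal sketch with a hypothetical helper name; one MB is taken as 2^20 bytes here, which matches the reported numbers.

```shell
# bench_bandwidth: MB/s from a rados bench summary (1 MB = 2^20 bytes here).
# $1 = operations completed, $2 = object size in bytes,
# $3 = total run time in seconds.
# Hypothetical helper for illustration only.
bench_bandwidth() {
    awk -v ops="$1" -v size="$2" -v secs="$3" \
        'BEGIN { printf "%.1f\n", ops * size / 1048576 / secs }'
}

bench_bandwidth 203 4194304 10.3161   # write: prints 78.7
bench_bandwidth 203 4194304 4.43386   # seq:   prints 183.1
bench_bandwidth 518 4194304 10.5614   # rand:  prints 196.2
```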
- Test with fio:
$ yum install -y fio "*librbd*"
$ rbd create fio_test --size 20480
$ vi write.fio
[global]
description="write test with block size of 4M"
ioengine=rbd
clustername=ceph
clientname=admin
pool=rbd
rbdname=fio_test
iodepth=32
runtime=120
rw=write
bs=4M
[logging]
write_iops_log=write_iops_log
write_bw_log=write_bw_log
write_lat_log=write_lat_log
$ fio write.fio
logging: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=rbd, iodepth=32
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
logging: (groupid=0, jobs=1): err= 0: pid=161962: Wed Aug 29 19:17:17 2018
Description : ["write test with block size of 4M"]
write: IOPS=15, BW=60.4MiB/s (63.3MB/s)(7252MiB/120085msec)
slat (usec): min=665, max=14535, avg=1584.29, stdev=860.28
clat (msec): min=1828, max=4353, avg=2092.28, stdev=180.12
lat (msec): min=1829, max=4354, avg=2093.87, stdev=180.15
clat percentiles (msec):
| 1.00th=[ 1838], 5.00th=[ 1938], 10.00th=[ 1989], 20.00th=[ 2022],
| 30.00th=[ 2039], 40.00th=[ 2056], 50.00th=[ 2072], 60.00th=[ 2106],
| 70.00th=[ 2123], 80.00th=[ 2165], 90.00th=[ 2198], 95.00th=[ 2232],
| 99.00th=[ 2333], 99.50th=[ 3977], 99.90th=[ 4111], 99.95th=[ 4329],
| 99.99th=[ 4329]
bw ( KiB/s): min= 963, max= 2294, per=3.26%, avg=2013.72, stdev=117.50, samples=1813
iops : min= 1, max= 1, avg= 1.00, stdev= 0.00, samples=1813
lat (msec) : 2000=13.40%, >=2000=86.60%
cpu : usr=1.94%, sys=0.40%, ctx=157, majf=0, minf=157364
IO depths : 1=2.3%, 2=6.0%, 4=12.6%, 8=25.2%, 16=50.3%, 32=3.6%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=97.0%, 8=0.0%, 16=0.0%, 32=3.0%, 64=0.0%, >=64=0.0%
issued rwt: total=0,1813,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: bw=60.4MiB/s (63.3MB/s), 60.4MiB/s-60.4MiB/s (63.3MB/s-63.3MB/s), io=7252MiB (7604MB), run=120085-120085msec
Disk stats (read/write):
sda: ios=5/653, merge=0/6, ticks=6/2818, in_queue=2824, util=0.17%
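The roughly 2-second latencies in this run follow from the queue depth rather than from slow individual writes: by Little's law, average latency is about the number of outstanding requests divided by throughput. The sketch below applies that to the fio numbers above (`iodepth=32`, 1813 I/Os in 120085 ms); the helper name is hypothetical and the result is only an approximation, since the queue was not full for the whole run.

```shell
# expected_latency_s: average latency implied by Little's law,
# latency = outstanding requests / throughput.
# $1 = iodepth, $2 = I/Os completed, $3 = run time in milliseconds.
# Hypothetical helper for illustration only.
expected_latency_s() {
    awk -v depth="$1" -v ios="$2" -v ms="$3" \
        'BEGIN { printf "%.2f\n", depth / (ios / (ms / 1000)) }'
}

expected_latency_s 32 1813 120085   # prints 2.12
```

That is close to the average `lat` of about 2.09 s reported by fio.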
References
- INSTALLATION (CEPH-DEPLOY)