If the data in your Ceph cluster exists only on the OSDs of this node, removing the node's OSDs will cause data loss. If the cluster is configured with redundancy (replication or erasure coding), PG repair/recovery will be required afterwards. For the sake of data safety, please make absolutely, absolutely sure to back up the data on the OSDs you are about to remove.
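Before removing anything, it is also worth confirming how the pools are protected. A minimal sketch using standard ceph commands (output omitted; check the "size" / "min_size" values and any erasure-code profile):

# Show each pool's replication size or EC profile
ceph osd pool ls detail

# Overall cluster health and data redundancy status
ceph -s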
There are three articles here describing how to remove all OSDs of a node from a Ceph cluster. The first two approaches are unsafe (they are written up only because other blogs on CSDN describe them without mentioning their safety or feasibility):
Removing all OSDs of a node from a Ceph cluster (1 of 3)
Removing all OSDs of a node from a Ceph cluster (2 of 3)
1. View the cluster's current OSD list
yjiang2@ubuntu-sebre:~$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.06798 root default
-3 0.02899 host ubuntu
0 ssd 0.00999 osd.0 up 1.00000 1.00000
1 ssd 0.01900 osd.1 up 1.00000 1.00000
-5 0.03899 host ubuntu-sebre
2 hdd 0.01900 osd.2 up 1.00000 1.00000
3 hdd 0.01900 osd.3 up 1.00000 1.00000
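Before purging, it can also help to know which placement groups currently map to the OSDs that are about to disappear. A sketch with standard ceph commands (Luminous and later), assuming osd.0 and osd.1 as in the listing above:

# List PGs that have a replica on osd.0 / osd.1
ceph pg ls-by-osd 0
ceph pg ls-by-osd 1

# The safer route would be to mark the OSDs out first and let data rebalance;
# the procedure below does NOT do this, which is part of why it is unsafe.
ceph osd out 0
ceph osd out 1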
2. Use the command ceph-deploy purge ubuntu to remove all OSDs on the ubuntu node
yjiang2@ubuntu-sebre:~$ ceph-deploy purge ubuntu
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/yjiang2/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.38): /usr/bin/ceph-deploy purge ubuntu
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf :
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] host : ['ubuntu']
[ceph_deploy.cli][INFO ] func :
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.install][INFO ] note that some dependencies *will not* be removed because they can cause issues with qemu-kvm
[ceph_deploy.install][INFO ] like: librbd1 and librados2
[ceph_deploy.install][DEBUG ] Purging on cluster ceph hosts ubuntu
[ceph_deploy.install][DEBUG ] Detecting platform for host ubuntu ...
[ubuntu][DEBUG ] connection detected need for sudo
[ubuntu][DEBUG ] connected to host: ubuntu
[ubuntu][DEBUG ] detect platform information from remote host
[ubuntu][DEBUG ] detect machine type
[ceph_deploy.install][INFO ] Distro info: Ubuntu 18.04 bionic
[ubuntu][INFO ] Purging Ceph on ubuntu
[ubuntu][INFO ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q -f --force-yes remove --purge ceph ceph-mds ceph-common ceph-fs-common radosgw
[ubuntu][DEBUG ] Reading package lists...
[ubuntu][DEBUG ] Building dependency tree...
[ubuntu][DEBUG ] Reading state information...
[ubuntu][DEBUG ] Package 'ceph-fs-common' is not installed, so not removed
[ubuntu][DEBUG ] The following packages were automatically installed and are no longer required:
[ubuntu][DEBUG ] formencode-i18n libavahi-core7 libbabeltrace1 libcephfs2 libdaemon0
[ubuntu][DEBUG ] libgoogle-perftools4 libleveldb1v5 libradosstriper1 librgw2
[ubuntu][DEBUG ] libtcmalloc-minimal4 libvncclient1 libvncserver1 python-bs4 python-cephfs
[ubuntu][DEBUG ] python-cherrypy3 python-dbus python-dnspython python-formencode
[ubuntu][DEBUG ] python-jinja2 python-logutils python-mako python-markupsafe python-paste
[ubuntu][DEBUG ] python-pastedeploy python-pastedeploy-tpl python-pecan python-rados
[ubuntu][DEBUG ] python-rbd python-simplegeneric python-simplejson python-singledispatch
[ubuntu][DEBUG ] python-tempita python-waitress python-webob python-webtest python-werkzeug
[ubuntu][DEBUG ] x11vnc-data
[ubuntu][DEBUG ] Use 'sudo apt autoremove' to remove them.
[ubuntu][DEBUG ] The following packages will be REMOVED:
[ubuntu][DEBUG ] ceph* ceph-base* ceph-common* ceph-mds* ceph-mgr* ceph-mon* ceph-osd*
[ubuntu][DEBUG ] radosgw*
[ubuntu][DEBUG ] 0 upgraded, 0 newly installed, 8 to remove and 251 not upgraded.
[ubuntu][DEBUG ] After this operation, 159 MB disk space will be freed.
(Reading database ... 240145 files and directories currently installed.)
[ubuntu][DEBUG ] Removing ceph-mds (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Removing ceph (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Removing ceph-osd (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Removing ceph-mgr (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Removing radosgw (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Removing ceph-mon (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Removing ceph-base (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Removing ceph-common (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Processing triggers for libc-bin (2.27-3ubuntu1) ...
[ubuntu][DEBUG ] Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
(Reading database ... 239649 files and directories currently installed.)
[ubuntu][DEBUG ] Purging configuration files for ceph-osd (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] dpkg: warning: while removing ceph-osd, directory '/var/lib/ceph/osd' not empty so not removed
[ubuntu][DEBUG ] Purging configuration files for ceph-mds (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Purging configuration files for ceph (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Purging configuration files for ceph-mon (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Purging configuration files for ceph-base (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] dpkg: warning: while removing ceph-base, directory '/var/lib/ceph/tmp' not empty so not removed
[ubuntu][DEBUG ] dpkg: warning: while removing ceph-base, directory '/var/lib/ceph/bootstrap-osd' not empty so not removed
[ubuntu][DEBUG ] Purging configuration files for ceph-mgr (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Purging configuration files for ceph-common (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Purging configuration files for radosgw (12.2.11-0ubuntu0.18.04.2) ...
[ubuntu][DEBUG ] Processing triggers for ureadahead (0.100.0-21) ...
[ubuntu][DEBUG ] Processing triggers for systemd (237-3ubuntu10.25) ...
At this point, the OSDs on the ubuntu node all show a status of down:
yjiang2@ubuntu-sebre:~$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.06798 root default
-3 0.02899 host ubuntu
0 ssd 0.00999 osd.0 down 1.00000 1.00000
1 ssd 0.01900 osd.1 down 1.00000 1.00000
-5 0.03899 host ubuntu-sebre
2 hdd 0.01900 osd.2 up 1.00000 1.00000
3 hdd 0.01900 osd.3 up 1.00000 1.00000
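With osd.0 and osd.1 down, the cluster will normally report degraded or undersized PGs. A quick way to check (standard commands, output omitted):

# Cluster status, including degraded/undersized PG counts
ceph -s

# Details of anything that is not active+clean
ceph health detail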
3. Run the command ceph-deploy purgedata ubuntu to delete the data
yjiang2@ubuntu-sebre:~$ ceph-deploy purgedata ubuntu
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/yjiang2/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.38): /usr/bin/ceph-deploy purgedata ubuntu
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf :
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] host : ['ubuntu']
[ceph_deploy.cli][INFO ] func :
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.install][DEBUG ] Purging data from cluster ceph hosts ubuntu
[ubuntu][DEBUG ] connection detected need for sudo
[ubuntu][DEBUG ] connected to host: ubuntu
[ubuntu][DEBUG ] detect platform information from remote host
[ubuntu][DEBUG ] detect machine type
[ubuntu][DEBUG ] find the location of an executable
[ubuntu][DEBUG ] connection detected need for sudo
[ubuntu][DEBUG ] connected to host: ubuntu
[ubuntu][DEBUG ] detect platform information from remote host
[ubuntu][DEBUG ] detect machine type
[ceph_deploy.install][INFO ] Distro info: Ubuntu 18.04 bionic
[ubuntu][INFO ] purging data on ubuntu
[ubuntu][INFO ] Running command: sudo rm -rf --one-file-system -- /var/lib/ceph
[ubuntu][INFO ] Running command: sudo rm -rf --one-file-system -- /etc/ceph/
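purgedata only deletes the local directories on the remote host. A hypothetical way to confirm (the host name follows this article's setup; note that directories which were separate mount points are skipped by rm's --one-file-system and may remain):

ssh ubuntu 'ls -ld /var/lib/ceph /etc/ceph'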
Check the OSD list again:
yjiang2@ubuntu-sebre:~$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.06798 root default
-3 0.02899 host ubuntu
0 ssd 0.00999 osd.0 down 1.00000 1.00000
1 ssd 0.01900 osd.1 down 1.00000 1.00000
-5 0.03899 host ubuntu-sebre
2 hdd 0.01900 osd.2 up 1.00000 1.00000
3 hdd 0.01900 osd.3 up 1.00000 1.00000
yjiang2@ubuntu-sebre:~$
1. The help text of ceph-deploy -h describes the two commands as follows:
purge Remove Ceph packages from remote hosts and purge all data.
purgedata Purge (delete, destroy, discard, shred) any Ceph data from /var/lib/ceph
That is, they are supposed to remove the Ceph packages and all data from the remote host, but in practice they do not remove the corresponding entries from the CRUSH map, so ceph osd tree still shows the OSDs of the purged host (a sketch of the typical cleanup commands is given after this list).
2. Therefore, another method is still needed to remove all OSDs on a host more thoroughly; see the follow-up articles.
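For reference, the leftover CRUSH and auth entries are usually cleaned up with the standard commands sketched below, run from a node that still has the admin keyring; the OSD ids and the host bucket name follow this article's example. The follow-up articles discuss when it is safe to do this.

# Classic three-step removal for each dead OSD
ceph osd crush remove osd.0
ceph auth del osd.0
ceph osd rm osd.0

# On Luminous and later the three steps can be replaced by a single purge
ceph osd purge 1 --yes-i-really-mean-it

# Finally remove the now-empty host bucket from the CRUSH map
ceph osd crush remove ubuntu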
Removing all OSDs of a node from a Ceph cluster (2 of 3)