Official documentation: https://docs.ceph.com/en/latest/cephadm/
There were originally four 2U servers available as Ceph nodes, but Node3 was missing a RAID card, so it was excluded from the Ceph cluster and kept as a test machine. Fortunately, three nodes are enough to deploy a working Ceph cluster.
The public network and the cluster network of the Ceph cluster share the same subnet.
Hostname | Management IP (interface) | Public/Cluster IP (interface) | OS | HDD | SSD | NVMe |
---|---|---|---|---|---|---|
Node1 | 192.168.40.31/24(enp5s0f0) | 172.18.0.31/24(bond0) | rocky9.1 | 18TB*5 | 480GB*1 | 800GB*1 |
Node2 | 192.168.40.32/24(enp5s0f0) | 172.18.0.32/24(bond0) | rocky9.1 | 18TB*5 | 480GB*1 | 800GB*1 |
Node4 | 192.168.40.34/24(enp5s0f0) | 172.18.0.34/24(bond0) | rocky9.1 | 18TB*5 | 480GB*1 | 800GB*1 |
Boot the installer from the Rocky 9.1 minimal image in UEFI mode, choose English as the system language and minimal-environment as the package set. Use manual partitioning: delete the old system on the system disk, click automatic partitioning, remove the /home mount point, resize the swap partition to 32 GB, and give all remaining space to /.
Configure the network according to the plan above. A network configuration script was used here, so the details are omitted; a rough sketch is shown below.
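The gist of that script is roughly the following nmcli commands (a sketch for Node1 only; the bond member interfaces enp5s0f1/enp6s0f0 and the 802.3ad bonding mode are assumptions, adjust them to the actual hardware):
# create the LACP bond carrying the Public/Cluster network
nmcli con add type bond ifname bond0 con-name bond0 bond.options "mode=802.3ad,miimon=100"
nmcli con add type ethernet ifname enp5s0f1 con-name bond0-port1 master bond0
nmcli con add type ethernet ifname enp6s0f0 con-name bond0-port2 master bond0
nmcli con mod bond0 ipv4.method manual ipv4.addresses 172.18.0.31/24
nmcli con up bond0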
dnf install vim net-tools wget lsof python3 -y
systemctl disable --now firewalld
sed -i '/^SELINUX=/c SELINUX=disabled' /etc/selinux/config
setenforce 0
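A quick sanity check (setenforce 0 only switches SELinux to permissive for the running system; it reports Disabled after a reboot with the config change above):
getenforce                    # expect Permissive now, Disabled after reboot
systemctl is-active firewalld # expect inactive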
cat << EOF > /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.18.0.31 Node1
172.18.0.32 Node2
172.18.0.34 Node4
EOF
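The /etc/hosts entries assume each machine's hostname matches (Node1/Node2/Node4). If the hostnames were not set during installation, set them on each node first:
hostnamectl set-hostname Node1   # Node2 / Node4 on the other machines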
# Kernel parameters: allow binding to non-local addresses and enable IP forwarding
cat << EOF > /etc/sysctl.conf
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
EOF
cat << EOF > /etc/sysctl.d/ceph.conf
kernel.pid_max = 4194303
vm.swappiness = 0
EOF
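Reload the kernel parameters so they take effect without a reboot:
sysctl --system   # loads /etc/sysctl.conf and /etc/sysctl.d/*.conf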
systemctl status chronyd.service
chronyc sources -v
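If chronyd is not running, or the nodes should sync against an internal NTP server instead of the default pool, something along these lines works (ntp.example.internal is a placeholder):
systemctl enable --now chronyd
echo "server ntp.example.internal iburst" >> /etc/chrony.conf   # placeholder NTP server
systemctl restart chronyd
chronyc makestep   # step the clock once if it is far off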
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce
cat << EOF > /etc/docker/daemon.json
{
"registry-mirrors": ["https://ud6340vz.mirror.aliyuncs.com"],
"bip": "172.24.16.1/24"
}
EOF
systemctl enable --now docker
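A quick check that the daemon picked up the mirror configuration:
docker info | grep -i -A1 "registry mirrors"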
It turned out later that cephadm uses podman as its container runtime by default, and Rocky 9.1 already ships with podman, so the Docker-related steps above are optional.
dnf search release-ceph
dnf install --assumeyes centos-release-ceph-reef
dnf install --assumeyes cephadm
which cephadm
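To double-check which Ceph release this cephadm will deploy:
cephadm version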
# Run cephadm bootstrap on Node1
[root@Node1 ~]# cephadm bootstrap --mon-ip 172.18.0.31
This command does the following for us:

- creates a monitor and a manager daemon for the new cluster on Node1;
- generates a new SSH key for the cluster, adds the public key to root's /root/.ssh/authorized_keys, and writes a copy of it to /etc/ceph/ceph.pub;
- writes a minimal configuration file to /etc/ceph/ceph.conf;
- writes a copy of the client.admin administrative key to /etc/ceph/ceph.client.admin.keyring;
- adds the _admin label to the bootstrap host.

The output looks like this:
[root@Node1 ~]# cephadm bootstrap --mon-ip 172.18.0.31
Creating directory /etc/ceph for ceph.conf
Verifying podman|docker is present...
# part of the output omitted
......
Ceph Dashboard is now available at:
URL: https://Node1:8443/
User: admin
Password: fxwkmeq45z
Enabling client.admin keyring and conf on hosts with "admin" label
Saving cluster configuration to /var/lib/ceph/a8d27998-3683-11ee-a1d3-0cc47a34f238/config directory
Enabling autotune for osd_memory_target
You can access the Ceph CLI as following in case of multi-cluster or non-default config:
sudo /usr/sbin/cephadm shell --fsid a8d27998-3683-11ee-a1d3-0cc47a34f238 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
Or, if you are only running a single cluster on this host:
sudo /usr/sbin/cephadm shell
Please consider enabling telemetry to help improve Ceph:
ceph telemetry on
For more information see:
https://docs.ceph.com/en/latest/mgr/telemetry/
Bootstrap complete.
[root@Node1 ~]#
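At this point the bootstrap daemons are already running as containers on Node1 and can be listed with podman:
podman ps --format "{{.Names}}"   # expect at least the mon and mgr containers for this cluster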
Log in to https://192.168.40.31:8443 with the account and password shown above, then change the password to one of your own so it is not forgotten.
To run ceph commands, wrap them in cephadm shell:
cephadm shell -- ceph -s
Or install the ceph-common package on every node that needs it:
cephadm add-repo --release reef
# The liburing package has to be downloaded manually. Its download URL can be found through rpmfind (https://rpmfind.net/linux/rpm2html/search.php):
wget https://rpmfind.net/linux/centos-stream/9-stream/AppStream/x86_64/os/Packages/liburing-2.3-2.el9.x86_64.rpm
rpm -ivh liburing-2.3-2.el9.x86_64.rpm
# Install the ceph-common package
dnf install -y librbd1 ceph-common
# Check that ceph-common was installed correctly
ceph -v
Run the following commands on Node1:
ssh-copy-id -f -i /etc/ceph/ceph.pub root@Node2 # the root password must be entered manually
ssh-copy-id -f -i /etc/ceph/ceph.pub root@Node4 # the root password must be entered manually
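Optionally, let cephadm verify that it can reach the new hosts and that they meet the prerequisites before adding them (run from a node holding the admin keyring):
ceph cephadm check-host Node2
ceph cephadm check-host Node4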
# Add the Ceph hosts
ceph orch host add Node2 172.18.0.32
ceph orch host add Node4 172.18.0.34
# Check the current host list
ceph orch host ls
# Adding the _admin label allows the host to run the Ceph CLI (e.g. the cephadm shell command)
ceph orch host label add Node2 _admin
ceph orch host label add Node4 _admin
# Set up the mon daemons
ceph orch apply mon 3
ceph orch apply mon Node1,Node2,Node4
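The two commands above first set the mon count and then pin the daemons to explicit hosts; the same intent can also be expressed in a single command:
ceph orch apply mon --placement="Node1 Node2 Node4"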
# Show mon details
[root@Node1 ~]# ceph mon dump
epoch 3
fsid a8d27998-3683-11ee-a1d3-0cc47a34f238
last_changed 2023-08-09T07:26:38.393240+0000
created 2023-08-09T07:10:49.128856+0000
min_mon_release 18 (reef)
election_strategy: 1
0: [v2:172.18.0.31:3300/0,v1:172.18.0.31:6789/0] mon.Node1
1: [v2:172.18.0.32:3300/0,v1:172.18.0.32:6789/0] mon.Node2
2: [v2:172.18.0.34:3300/0,v1:172.18.0.34:6789/0] mon.Node4
dumped monmap epoch 3
# Set up the mgr daemons
ceph orch apply mgr 3
ceph orch apply mgr Node1,Node2,Node4
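Confirm that the mon and mgr services have converged to the requested placement:
ceph orch ls   # mon and mgr should show 3/3 running
ceph orch ps   # per-daemon view across all hosts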
[root@Node1 ~]# ceph -s
cluster:
id: a8d27998-3683-11ee-a1d3-0cc47a34f238
health: HEALTH_WARN
OSD count 0 <