CephFS
Ceph Filesystem: the file system provided by Ceph, mainly used for shared file access, similar to NFS.
MDS: Metadata Server (metadata service). CephFS depends on MDS to run; its daemon is ceph-mds.
What ceph-mds does:
manages the ceph-mds process itself
stores the metadata of the files kept on CephFS
coordinates access to the Ceph storage cluster
Deploy the MDS service
MDS can be deployed on the mgr or mon nodes; here ceph-mds is installed on ceph-mgr1:
ceph@ceph-mgr1:~$ sudo apt -y install ceph-mds
On the ceph-deploy node, create the MDS on ceph-mgr1:
ceph@ceph-deploy:~/ceph-cluster$ ceph-deploy mds create ceph-mgr1
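A quick sanity check (not part of the original session) is to confirm that the mds daemon is now running on ceph-mgr1:
ceph@ceph-mgr1:~$ sudo systemctl status ceph-mds@ceph-mgr1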
Create a metadata pool and a data pool; both are required to create a CephFS. Below, a metadata pool named cephfs-metadata and a data pool named cephfs-data are created. The two trailing numbers on each command are pg_num (the number of placement groups) and pgp_num (the number of placement groups used for placement).
ceph@ceph-deploy:~/ceph-cluster$ ceph osd pool create cephfs-metadata 32 32
ceph@ceph-deploy:~/ceph-cluster$ ceph osd pool create cephfs-data 64 64
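To double-check that both pools exist, they can be listed (an extra verification step, not in the original log):
ceph@ceph-deploy:~/ceph-cluster$ ceph osd pool ls | grep cephfs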
Check the Ceph cluster status:
ceph@ceph-deploy:~/ceph-cluster$ ceph -s
cluster:
id: 98762d01-8474-493a-806e-fcb0dfc5fdb2
health: HEALTH_WARN
1 pool(s) do not have an application enabled
services:
mon: 1 daemons, quorum ceph-mon1 (age 9d)
mgr: ceph-mgr1(active, since 9d)
mds: 1/1 daemons up
osd: 11 osds: 11 up (since 9d), 11 in (since 11d)
rgw: 1 daemon active (1 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 10 pools, 329 pgs
objects: 650 objects, 1.4 GiB
usage: 8.5 GiB used, 211 GiB / 220 GiB avail
pgs: 329 active+clean
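The HEALTH_WARN about a pool without an enabled application comes from an earlier pool (likely one of the RBD pools), not from CephFS; ceph fs new tags its own pools automatically. If desired, the warning can be cleared by tagging the offending pool, roughly like this (the pool name is a placeholder):
ceph@ceph-deploy:~/ceph-cluster$ ceph osd pool application enable <pool-name> rbd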
Create the CephFS
ceph@ceph-deploy:~/ceph-cluster$ ceph fs new mycephfs cephfs-metadata cephfs-data
Check the file system status:
ceph@ceph-deploy:~/ceph-cluster$ ceph fs ls
name: mycephfs, metadata pool: cephfs-metadata, data pools: [cephfs-data ]
ceph@ceph-deploy:~/ceph-cluster$ ceph fs status mycephfs
mycephfs - 1 clients
========
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 active ceph-mgr1 Reqs: 0 /s 65 43 21 30
POOL TYPE USED AVAIL
cephfs-metadata metadata 1776k 66.3G
cephfs-data data 1364M 66.3G
MDS version: ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)
Verify the MDS state; it is active:
ceph@ceph-deploy:~/ceph-cluster$ ceph mds stat
mycephfs:1 {0=ceph-mgr1=up:active}
Create an account with CephFS permissions (note: the pool name is mistyped as cephfs-dada in this command; it is corrected to cephfs-data later):
ceph@ceph-deploy:~/ceph-cluster$ ceph auth add client.huahaulincephfs mon "allow rw" osd "allow rwx pool=cephfs-dada"
added key for client.huahaulincephfs
Verify:
ceph@ceph-deploy:~/ceph-cluster$ ceph auth get client.huahaulincephfs
[client.huahaulincephfs]
key = AQDtrzJhUzNSOBAAnepZKifX1VAGoj31qAfjbw==
caps mon = "allow rw"
caps osd = "allow rwx pool=cephfs-dada"
exported keyring for client.huahaulincephfs
Create the keyring file:
ceph@ceph-deploy:~/ceph-cluster$ ceph auth get client.huahaulincephfs -o ceph.client.huahaulincephfs.keyring
exported keyring for client.huahaulincephfs
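As an aside, if the account did not already exist, the caps and keyring could be produced in one step with ceph auth get-or-create; a sketch of that alternative, not what was done here:
ceph@ceph-deploy:~/ceph-cluster$ ceph auth get-or-create client.huahaulincephfs mon "allow r" mds "allow rw" osd "allow rwx pool=cephfs-data" -o ceph.client.huahaulincephfs.keyring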
Create the key file:
[root@ceph-client1 ceph]# ceph auth print-key client.huahaulincephfs > huahaulincephfs.key
Verify the keyring file:
ceph@ceph-deploy:~/ceph-cluster$ cat ceph.client.huahaulincephfs.keyring
[client.huahaulincephfs]
key = AQDtrzJhUzNSOBAAnepZKifX1VAGoj31qAfjbw==
caps mon = "allow rw"
caps osd = "allow rwx pool=cephfs-dada"
Mount mycephfs on the client
Install the client tool ceph-common; the matching yum repository must be configured first. CentOS 7 is used as the client here.
yum install ceph-common -y
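A quick check that the client tooling is in place (not part of the original session):
[root@ceph-client1 ceph]# ceph --version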
Authorization
From the ceph-deploy node, copy the huahaulincephfs credentials just created over to the client:
ceph@ceph-deploy:~/ceph-cluster$ sudo scp ceph.conf ceph.client.huahaulincephfs.keyring huahaulincephfs.key root@ceph-client1:/etc/ceph/
Verify the client's permissions:
[root@ceph-client1 ceph]# ceph --user huahaulincephfs -s
cluster:
id: 98762d01-8474-493a-806e-fcb0dfc5fdb2
health: HEALTH_WARN
1 pool(s) do not have an application enabled
services:
mon: 1 daemons, quorum ceph-mon1 (age 9d)
mgr: ceph-mgr1(active, since 9d)
mds: 1/1 daemons up
osd: 11 osds: 11 up (since 9d), 11 in (since 11d)
rgw: 1 daemon active (1 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 10 pools, 329 pgs
objects: 650 objects, 1.4 GiB
usage: 8.5 GiB used, 211 GiB / 220 GiB avail
pgs: 329 active+clean
Mount the CephFS
There are two ways to mount CephFS: a kernel-space mount and a user-space (FUSE) mount. The kernel-space mount is recommended; it requires the ceph kernel module, while the user-space mount requires ceph-fuse. Use the kernel mount unless the kernel is old and lacks the ceph module, in which case ceph-fuse can be installed and used instead.
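For reference only, a user-space mount with ceph-fuse would look roughly like the sketch below; it assumes the ceph-fuse package is installed and the keyring has been copied to /etc/ceph, and it is not used in this walkthrough:
[root@ceph-client1 ceph]# yum install ceph-fuse -y
[root@ceph-client1 ceph]# ceph-fuse -n client.huahaulincephfs -m 192.168.241.12:6789 /datafs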
The kernel-space mount is demonstrated below. It can authenticate in two ways: by passing the secret directly on the command line, or by pointing at a key file.
#Mounting CephFS goes through port 6789 on the mon nodes, and only mon nodes that have joined the cluster can be specified as mount targets. After the mon nodes were added to the cluster, the first mount attempt fails:
[root@ceph-client1 ceph]# mount -t ceph 192.168.241.12:6789,192.168.241.13:6789,192.168.241.14:6789:/ /datafs -o name=huahaulincephfs,secret=AQDtrzJhUzNSOBAAnepZKifX1VAGoj31qAfjbw==
mount error 13 = Permission denied
The cause is that the client has no mds capability (and its osd cap references the mistyped pool cephfs-dada). Update the caps to add mds permission and the correct pool name cephfs-data:
[root@ceph-client1 ceph]# ceph auth get client.huahaulincephfs
exported keyring for client.huahaulincephfs
[client.huahaulincephfs]
key = AQDtrzJhUzNSOBAAnepZKifX1VAGoj31qAfjbw==
caps mon = "allow rw"
caps osd = "allow rwx pool=cephfs-dada"
[root@ceph-client1 ceph]# ceph auth caps client.huahaulincephfs mon "allow r" mds "allow rw" osd "allow rwx pool=cephfs-data"
updated caps for client.huahaulincephfs
[root@ceph-client1 ceph]# ceph auth get client.huahaulincephfs
exported keyring for client.huahaulincephfs
[client.huahaulincephfs]
key = AQDtrzJhUzNSOBAAnepZKifX1VAGoj31qAfjbw==
caps mds = "allow rw"
caps mon = "allow r"
caps osd = "allow rwx pool=cephfs-data"
[root@ceph-client1 /]# mount -t ceph 192.168.241.12:6789,192.168.241.13:6789,192.168.241.14:6789:/ /datafs/ -o name=huahaulincephfs,secret=AQDtrzJhUzNSOBAAnepZKifX1VAGoj31qAfjbw==
Or mount using the key file:
mount -t ceph 192.168.241.12:6789,192.168.241.13:6789,192.168.241.14:6789:/ /datafs/ -o name=huahaulincephfs,secretfile=/etc/ceph/huahaulincephfs.key
The mount succeeds!
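A quick way to confirm the mount (extra verification, not in the original log):
[root@ceph-client1 /]# df -h /datafs
[root@ceph-client1 /]# mount | grep ceph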
#Verify with data. Before writing, cephfs-data already uses 1.6 GiB:
[root@ceph-client1 /]# ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 220 GiB 206 GiB 14 GiB 14 GiB 6.41
TOTAL 220 GiB 206 GiB 14 GiB 14 GiB 6.41
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
device_health_metrics 1 1 0 B 0 0 B 0 64 GiB
mypool 2 32 1.2 MiB 1 3.5 MiB 0 64 GiB
.rgw.root 3 32 1.3 KiB 4 48 KiB 0 64 GiB
default.rgw.log 4 32 3.6 KiB 209 408 KiB 0 64 GiB
default.rgw.control 5 32 0 B 8 0 B 0 64 GiB
default.rgw.meta 6 8 0 B 0 0 B 0 64 GiB
myrbd1 7 64 829 MiB 223 2.4 GiB 1.25 64 GiB
cephfs-metadata 8 32 640 KiB 23 2.0 MiB 0 64 GiB
cephfs-data 9 64 563 MiB 179 1.6 GiB 0.85 64 GiB
rbd1-data 10 32 538 MiB 158 1.6 GiB 0.81 64 GiB
#Write 200 MB of data:
[root@ceph-client1 /]# dd if=/dev/zero of=/datafs/test bs=1M count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 1.03247 s, 203 MB/s
#After writing 200 MB, cephfs-data grows to 1.8 GiB:
[root@ceph-client1 /]# ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 220 GiB 205 GiB 15 GiB 15 GiB 6.93
TOTAL 220 GiB 205 GiB 15 GiB 15 GiB 6.93
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
device_health_metrics 1 1 0 B 0 0 B 0 63 GiB
mypool 2 32 1.2 MiB 1 3.5 MiB 0 63 GiB
.rgw.root 3 32 1.3 KiB 4 48 KiB 0 63 GiB
default.rgw.log 4 32 3.6 KiB 209 408 KiB 0 63 GiB
default.rgw.control 5 32 0 B 8 0 B 0 63 GiB
default.rgw.meta 6 8 0 B 0 0 B 0 63 GiB
myrbd1 7 64 829 MiB 223 2.4 GiB 1.26 63 GiB
cephfs-metadata 8 32 667 KiB 23 2.0 MiB 0 63 GiB
cephfs-data 9 64 627 MiB 179 1.8 GiB 0.96 63 GiB
rbd1-data 10 32 538 MiB 158 1.6 GiB 0.82 63 GiB
#Check the status of the mount point /datafs:
[root@ceph-client1 ceph]# stat -f /datafs/
File: "/datafs/"
ID: b1d1181888b4b15b Namelen: 255 Type: ceph
Block size: 4194304 Fundamental block size: 4194304
Blocks: Total: 16354 Free: 16191 Available: 16191
Inodes: Total: 179 Free: -1
#Configure mounting at boot:
[root@ceph-client1 ceph]# vi /etc/fstab
192.168.241.12:6789,192.168.241.13:6789,192.168.241.14:6789:/ /datafs ceph defaults,name=huahaulincephfs,secretfile=/etc/ceph/huahaulincephfs.key,_netdev 0 0
[root@ceph-client1 ceph]# mount -a
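mount -a returns silently on success; the fstab-based mount can be confirmed with, for example:
[root@ceph-client1 ceph]# findmnt /datafs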
MDS High Availability
#Check the MDS status; it is currently a single node:
[root@ceph-client1 /]# ceph mds stat
mycephfs:1 {0=ceph-mgr1=up:active}
#Add more MDS roles. ceph-mgr1 is currently the only MDS; next, ceph-mgr2, ceph-mon2 and ceph-mon3 are added as MDS roles to build a two-active / two-standby layout for higher availability and performance.
Install the ceph-mds package on ceph-mgr2, ceph-mon2 and ceph-mon3, running the install command on each node:
dyl@ceph-mgr2:~$ sudo apt install -y ceph-mds
dyl@ceph-mon2:~$ sudo apt -y install ceph-mds
dyl@ceph-mon3:/etc/ceph$ sudo apt -y install ceph-mds
#Add the MDS daemons; run on the ceph-deploy node:
ceph@ceph-deploy:~/ceph-cluster$ ceph-deploy mds create ceph-mgr2
ceph@ceph-deploy:~/ceph-cluster$ ceph-deploy mds create ceph-mon2
ceph@ceph-deploy:~/ceph-cluster$ ceph-deploy mds create ceph-mon3
#Check the MDS status; four standby daemons are now up in addition to the active one:
ceph@ceph-deploy:~/ceph-cluster$ ceph mds stat
mycephfs:1 {0=ceph-mgr1=up:active} 4 up:standby
#Verify the MDS cluster state: one active and four standby. ceph-mon1 was probably added as an MDS earlier, which is why it also appears in the standby list.
ceph@ceph-deploy:~/ceph-cluster$ ceph fs status
mycephfs - 2 clients
========
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 active ceph-mgr1 Reqs: 0 /s 66 44 21 42
POOL TYPE USED AVAIL
cephfs-metadata metadata 2108k 63.2G
cephfs-data data 1964M 63.2G
STANDBY MDS
ceph-mgr2
ceph-mon1
ceph-mon2
ceph-mon3
MDS version: ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)
#Current file system state:
ceph@ceph-deploy:~/ceph-cluster$ ceph fs get mycephfs
Filesystem 'mycephfs' (1)
fs_name mycephfs
epoch 4
flags 12
created 2021-08-25T08:46:24.916762-0700
modified 2021-08-25T08:46:25.923608-0700
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
required_client_features {}
last_failure 0
last_failure_osd_epoch 0
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in 0
up {0=14145}
failed
damaged
stopped
data_pools [9]
metadata_pool 8
inline_data disabled
balancer
standby_count_wanted 1
[mds.ceph-mgr1{0:14145} state up:active seq 68 addr [v2:192.168.241.15:6802/203148310,v1:192.168.241.15:6803/203148310]]
#To get two active and two standby, set the number of active MDS daemons to 2. The MDS daemons to keep are ceph-mgr1, ceph-mgr2, ceph-mon2 and ceph-mon3.
ceph@ceph-deploy:~/ceph-cluster$ ceph fs set mycephfs max_mds 2
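To confirm the setting took effect, max_mds can be read back (an extra check, not in the original log):
ceph@ceph-deploy:~/ceph-cluster$ ceph fs get mycephfs | grep max_mds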
#Verify that there are now 2 active MDS daemons; ceph-mgr1 and ceph-mon3 are active:
ceph@ceph-deploy:~/ceph-cluster$ ceph fs status
mycephfs - 2 clients
========
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 active ceph-mgr1 Reqs: 0 /s 66 44 21 42
1 active ceph-mon3 Reqs: 0 /s 10 13 11 0
POOL TYPE USED AVAIL
cephfs-metadata metadata 2180k 63.2G
cephfs-data data 1964M 63.2G
STANDBY MDS
ceph-mgr2
ceph-mon1
ceph-mon2
MDS version: ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)
#Configure MDS high availability. ceph-mgr1 and ceph-mon3 are active; ceph-mgr2, ceph-mon1 and ceph-mon2 are standby (set ceph-mon1 aside for now). The next step is to assign a dedicated standby to each active MDS, so that every active has its own backup.
Since ceph-mon1 is not wanted in the MDS cluster, kick it out by stopping its mds service; run on ceph-mon1:
ceph@ceph-mon1:/etc/ceph$ sudo systemctl stop ceph-mds@ceph-mon1.service
After this, ceph-mon1 no longer appears in the output of ceph fs status.
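If ceph-mon1 should also stay out of the MDS cluster after a reboot, its unit can additionally be disabled (an optional step, not done in the original session):
ceph@ceph-mon1:/etc/ceph$ sudo systemctl disable ceph-mds@ceph-mon1.service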
#Configure MDS high availability on the ceph-deploy node
ceph@ceph-deploy:~/ceph-cluster$ cd /var/lib/ceph/ceph-cluster
#Append to ceph.conf: ceph-mon2 as the standby for ceph-mon3, and ceph-mgr2 as the standby for ceph-mgr1:
ceph@ceph-deploy:~/ceph-cluster$ vi ceph.conf
[mds.ceph-mon2]
mds_standby_for_name = ceph-mon3
mds_standby_replay = true
[mds.ceph-mgr2]
mds_standby_for_name = ceph-mgr1
mds_standby_replay = true
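Note that in recent releases (including the Pacific version used here) the per-daemon mds_standby_for_name / mds_standby_replay options are considered legacy; standby-replay is normally enabled per file system. If the ceph.conf approach has no visible effect, the file-system-level switch can be used instead, roughly:
ceph@ceph-deploy:~/ceph-cluster$ ceph fs set mycephfs allow_standby_replay true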
#Push the configuration to all MDS nodes to keep it consistent:
ceph@ceph-deploy:~/ceph-cluster$ ceph-deploy --overwrite-conf config push ceph-mgr1
ceph@ceph-deploy:~/ceph-cluster$ ceph-deploy --overwrite-conf config push ceph-mgr2
ceph@ceph-deploy:~/ceph-cluster$ ceph-deploy --overwrite-conf config push ceph-mon2
ceph@ceph-deploy:~/ceph-cluster$ ceph-deploy --overwrite-conf config push ceph-mon3
#Restart the mds service on each MDS node:
ceph@ceph-mgr1:~$ sudo systemctl restart ceph-mds@ceph-mgr1.service
ceph@ceph-mgr2:/etc/ceph$ sudo systemctl restart ceph-mds@ceph-mgr2.service
ceph@ceph-mon2:/etc/ceph$ sudo systemctl restart ceph-mds@ceph-mon2.service
ceph@ceph-mon3:/etc/ceph$ sudo systemctl restart ceph-mds@ceph-mon3.service
#Check the CephFS state again: the MDS cluster now has two active and two standby daemons. Note that the active members have changed; the services were probably restarted in a different order, so while an active MDS was restarting, a standby was automatically promoted to active.
ceph@ceph-deploy:~/ceph-cluster$ ceph fs status
mycephfs - 2 clients
========
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 active ceph-mgr2 Reqs: 0 /s 91 44 21 2
1 active ceph-mon2 Reqs: 0 /s 10 13 11 0
POOL TYPE USED AVAIL
cephfs-metadata metadata 2228k 63.2G
cephfs-data data 1964M 63.2G
STANDBY MDS
ceph-mgr1
ceph-mon3
MDS version: ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)
#Test and verify high availability
Stop the mds service on any active MDS node, then check the state.
Currently the active daemons are ceph-mgr2 and ceph-mon2, and their respective standbys are ceph-mgr1 and ceph-mon3.
ceph@ceph-deploy:~/ceph-cluster$ ceph fs status
mycephfs - 2 clients
========
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 active ceph-mgr2 Reqs: 0 /s 91 44 21 2
1 active ceph-mon2 Reqs: 0 /s 10 13 11 0
POOL TYPE USED AVAIL
cephfs-metadata metadata 2228k 63.2G
cephfs-data data 1964M 63.2G
STANDBY MDS
ceph-mon3
ceph-mgr1
MDS version: ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)
Stop the mds service on ceph-mon2 and see whether ceph-mon3 takes over as active; run on ceph-mon2:
ceph@ceph-mon2:/etc/ceph$ sudo systemctl stop ceph-mds@ceph-mon2.service
Check the fs cluster state: ceph-mon3 was not promoted; surprisingly, ceph-mgr1 was promoted to active instead, and ceph-mon2 is no longer in the cluster.
ceph@ceph-deploy:~/ceph-cluster$ ceph fs status
mycephfs - 2 clients
========
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 active ceph-mgr2 Reqs: 0 /s 91 44 21 2
1 active ceph-mgr1 Reqs: 0 /s 10 13 11 0
POOL TYPE USED AVAIL
cephfs-metadata metadata 2228k 63.2G
cephfs-data data 1964M 63.2G
STANDBY MDS
ceph-mon3
MDS version: ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)
Start the mds service on ceph-mon2 again; it rejoins the cluster, but now in the standby state.
ceph@ceph-mon2:/etc/ceph$ sudo systemctl start ceph-mds@ceph-mon2.service
ceph@ceph-deploy:~/ceph-cluster$ ceph fs status
mycephfs - 2 clients
========
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 active ceph-mgr2 Reqs: 0 /s 91 44 21 2
1 active ceph-mgr1 Reqs: 0 /s 10 13 11 0
POOL TYPE USED AVAIL
cephfs-metadata metadata 2228k 63.2G
cephfs-data data 1964M 63.2G
STANDBY MDS
ceph-mon3
ceph-mon2
MDS version: ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)
Now stop the mds on ceph-mgr2 and see what happens; this time ceph-mon2 is promoted back to active.
ceph@ceph-mgr2:/etc/ceph$ sudo systemctl stop ceph-mds@ceph-mgr2.service
ceph@ceph-deploy:~/ceph-cluster$ ceph fs status
mycephfs - 2 clients
========
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 active ceph-mon2 Reqs: 0 /s 91 44 21 2
1 active ceph-mgr1 Reqs: 0 /s 10 13 11 0
POOL TYPE USED AVAIL
cephfs-metadata metadata 2264k 63.2G
cephfs-data data 1964M 63.2G
STANDBY MDS
ceph-mon3
MDS version: ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)