Table of Contents
Rook Installation
Download the Rook Installation Files
Configuration Changes
Deploy Rook
Create the Ceph Cluster
Configuration Changes
Create the Ceph Cluster
*Install the Ceph Snapshot Controller
Install the Ceph Client Tools
Deploy ceph-tools
Check the mgr Services
Ceph Dashboard
Expose the Service
Log In
Use Cases for Block, Object, and File Storage
Using Ceph Block Storage
Create a StorageClass and a Ceph Storage Pool
Mount Test
StatefulSet volumeClaimTemplates
Download a Specific Version of Rook
git clone --single-branch --branch v1.8.2 https://github.com.cnpmjs.org/rook/rook.git
cd rook/deploy/examples
Modify the Rook CSI image addresses. The defaults point to gcr.io, which cannot be reached from mainland China, so the gcr images need to be synced to an Aliyun registry first; they can then be changed directly as follows:
vi operator.yaml
Locate the CSI image entries and change them to:
ROOK_CSI_REGISTRAR_IMAGE: "registry.cn-beijing.aliyuncs.com/dotbalo/csi-node-driver-registrar:v2.0.1"
ROOK_CSI_RESIZER_IMAGE: "registry.cn-beijing.aliyuncs.com/dotbalo/csi-resizer:v1.0.1"
ROOK_CSI_PROVISIONER_IMAGE: "registry.cn-beijing.aliyuncs.com/dotbalo/csi-provisioner:v2.0.4"
ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.cn-beijing.aliyuncs.com/dotbalo/csi-snapshotter:v4.0.0"
ROOK_CSI_ATTACHER_IMAGE: "registry.cn-beijing.aliyuncs.com/dotbalo/csi-attacher:v3.0.2"
Pay attention to the whitespace indentation.
If you are using a different Rook version, you need to sync the matching images yourself; articles describing how to do this can be found online, for example:
https://blog.csdn.net/sinat_35543900/article/details/103290782
Still in the operator file: newer Rook releases disable the device-discovery DaemonSet by default. If you do not explicitly list the OSD devices for each node, this must be turned back on so that the discover pods can find the block devices on the nodes. Locate ROOK_ENABLE_DISCOVERY_DAEMON and set it to true:
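In the v1.8 operator.yaml this key lives in the rook-ceph-operator-config ConfigMap (the same ConfigMap that holds the CSI image settings above); after the change the entry should look roughly like this:
data:
  # enable the discover DaemonSet so Rook can detect raw block devices on the nodes
  ROOK_ENABLE_DISCOVERY_DAEMON: "true"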
The deployment steps are as follows:
cd rook/deploy/examples
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
Wait for the operator and discover pods to start:
[root@k8s-172-16-90-71 examples]# kubectl -n rook-ceph get pod
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-7c7d8846f4-fsv9f 1/1 Running 0 25h
rook-discover-qw2ln 1/1 Running 0 28h
rook-discover-wf8t7 1/1 Running 0 28h
rook-discover-z6dhq 1/1 Running 0 28h
Only once all of them are 1/1 Running can the Ceph cluster be created.
vi cluster.yaml
The main change is the section describing where the OSD devices live:
The name field must be the node's hostname; an IP address will not work.
Note: newer versions require raw disks, i.e. unformatted ones. Here k8s-master03, k8s-node01 and k8s-node02 each have one newly added disk; run lsblk -f to find the name of the new disk. At least three nodes are recommended, otherwise the later experiments may run into problems.
The dashboard's SSL option also needs to be turned off, otherwise the dashboard may not be reachable through a domain name.
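A minimal sketch of the relevant cluster.yaml changes (the device name sdb is only an assumption for this environment; use whatever lsblk -f reports on your nodes):
spec:
  dashboard:
    enabled: true
    ssl: false
  storage:
    useAllNodes: false
    useAllDevices: false
    nodes:
    - name: "k8s-master03"   # must be the hostname, not an IP
      devices:
      - name: "sdb"          # assumed device name; confirm with lsblk -f
    - name: "k8s-node01"
      devices:
      - name: "sdb"
    - name: "k8s-node02"
      devices:
      - name: "sdb"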
kubectl create -f cluster.yaml
Once the cluster has been created, check the pod status:
[root@k8s-172-16-90-71 examples]# kubectl -n rook-ceph get pod
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-6nwbp 3/3 Running 0 25h
csi-cephfsplugin-b7h6k 3/3 Running 0 25h
csi-cephfsplugin-provisioner-785798bc8f-78tvr 6/6 Running 0 25h
csi-cephfsplugin-provisioner-785798bc8f-krsdx 6/6 Running 0 25h
csi-rbdplugin-2mmmj 3/3 Running 0 25h
csi-rbdplugin-85sbg 3/3 Running 0 25h
csi-rbdplugin-provisioner-75cdf8cd6d-ghl8f 6/6 Running 0 25h
csi-rbdplugin-provisioner-75cdf8cd6d-wf6h8 6/6 Running 0 25h
rook-ceph-crashcollector-k8s-master03-64c56d8d8b-9vqrk 1/1 Running 0 28h
rook-ceph-crashcollector-k8s-node01-7fc9b79798-6r2rn 1/1 Running 0 28h
rook-ceph-crashcollector-k8s-node02-6954497cb9-pqll7 1/1 Running 0 28h
rook-ceph-mgr-a-dd4bf8445-scsrt 1/1 Running 0 25h
rook-ceph-mon-a-856779ddfd-8v7s2 1/1 Running 0 28h
rook-ceph-mon-b-6c94bddf8c-wb69x 1/1 Running 0 28h
rook-ceph-mon-c-5659bcb5c9-pjn5f 1/1 Running 0 28h
rook-ceph-operator-7c7d8846f4-fsv9f 1/1 Running 0 25h
rook-ceph-osd-0-7c6cdb8546-kkcts 1/1 Running 0 28h
rook-ceph-osd-1-f8b598d47-qjnwl 1/1 Running 0 28h
rook-ceph-osd-2-55846dbcd9-jvm62 1/1 Running 0 28h
rook-ceph-osd-prepare-k8s-master03-h8lt2 0/1 Completed 0 5h1m
rook-ceph-osd-prepare-k8s-node01-jqz7x 0/1 Completed 0 5h1m
rook-ceph-osd-prepare-k8s-node02-hm8lc 0/1 Completed 0 5h1m
rook-discover-qw2ln 1/1 Running 0 29h
rook-discover-wf8t7 1/1 Running 0 29h
rook-discover-z6dhq 1/1 Running 0 29h
Note that the osd-x pods must exist and be healthy. If all of the pods above are normal, the cluster installation is considered successful.
More configuration options: https://rook.io/docs/rook/v1.8/ceph-cluster-crd.html
On Kubernetes 1.19 and above, a snapshot controller has to be installed separately for PVC snapshots to work, so install it here in advance; versions below 1.19 do not need it.
The snapshot controller manifests are part of the k8s-ha-install project used during cluster installation; switch to the 1.20.x branch.
Detailed documentation: https://rook.io/docs/rook/v1.8/ceph-csi-snapshot.html
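Alternatively, the controller can be installed from the upstream kubernetes-csi/external-snapshotter project; a sketch based on that project's README (pick a release branch that matches your CSI driver versions):
git clone https://github.com/kubernetes-csi/external-snapshotter.git
cd external-snapshotter
# install the VolumeSnapshot CRDs
kubectl kustomize client/config/crd | kubectl create -f -
# install the common snapshot controller into kube-system
kubectl -n kube-system kustomize deploy/kubernetes/snapshot-controller | kubectl create -f -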
[root@k8s-172-16-90-71 examples]# pwd
/root/rook/deploy/examples
kubectl create -f toolbox.yaml -n rook-ceph
Once the container is Running, the relevant commands can be executed:
[root@k8s-172-16-90-71 examples]# kubectl get po -n rook-ceph -l app=rook-ceph-tools
NAME READY STATUS RESTARTS AGE
rook-ceph-tools-6f7467bb4d-qzsvg 1/1 Running 0
[root@k8s-172-16-90-71 examples]# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
[rook@rook-ceph-tools-6f56cdd85d-9hfg5 /]$ ceph status
cluster:
id: 3e98e41e-9f6a-49ab-b4ff-63d926d8e860
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,d (age 4h)
mgr: a(active, since 117m)
osd: 3 osds: 3 up (since 4h), 3 in (since 4h)
data:
pools: 2 pools, 33 pgs
objects: 262 objects, 806 MiB
usage: 2.4 GiB used, 148 GiB / 150 GiB avail
pgs: 33 active+clean
[rook@rook-ceph-tools-6f56cdd85d-9hfg5 /]$ ceph osd status
ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 k8s-172-16-90-72 822M 49.1G 0 0 0 0 exists,up
1 k8s-172-16-90-74 826M 49.1G 0 0 0 0 exists,up
2 k8s-172-16-90-73 826M 49.1G 0 0 0 0 exists,up
[rook@rook-ceph-tools-6f56cdd85d-9hfg5 /]$ ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 150 GiB 148 GiB 2.4 GiB 2.4 GiB 1.61
TOTAL 150 GiB 148 GiB 2.4 GiB 2.4 GiB 1.61
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
device_health_metrics 1 1 0 B 0 0 B 0 47 GiB
replicapool 2 32 795 MiB 262 2.3 GiB 1.64 47 GiB
[rook@rook-ceph-tools-6f56cdd85d-9hfg5 /]$ ceph mgr services
{
"dashboard": "http://100.112.95.51:7000/",
"prometheus": "http://100.112.95.51:9283/"
}
This gives the Ceph dashboard endpoint.
A pitfall worth recording: in cluster.yaml the dashboard port setting is commented out with a default of 8443, so you might assume the port is 8443, but during the actual deployment the Service port was observed to drift from 8443 to 7000, so the real port has to be looked up through ceph-tools.
The port reported by the tool is 7000; based on this ceph mgr services result, dashboard access will be configured with that port in the following steps.
By default the Ceph dashboard is enabled, but the Service created by the official operator is ClusterIP only, so it cannot be reached from outside the cluster:
[root@k8s-172-16-90-71 examples]# kubectl get svc rook-ceph-mgr-dashboard -n rook-ceph
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rook-ceph-mgr-dashboard   ClusterIP   10.96.1.172   <none>   7000/TCP   5h9m
A NodePort-type Service can be created to expose it:
vim dashboard-np.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: rook-ceph-mgr
    ceph_daemon_id: a
    rook_cluster: rook-ceph
  name: rook-ceph-mgr-dashboard-np
  namespace: rook-ceph
spec:
  ports:
  - name: http-dashboard
    port: 7000
    protocol: TCP
    targetPort: 7000
  selector:
    app: rook-ceph-mgr
    ceph_daemon_id: a
    rook_cluster: rook-ceph
  sessionAffinity: None
  type: NodePort
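Apply the manifest (using the file name assumed above):
kubectl create -f dashboard-np.yaml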
Once applied, a node port is allocated; the dashboard can then be reached via any Kubernetes node's IP plus that port:
[root@k8s-172-16-90-71 examples]# kubectl get svc -n rook-ceph rook-ceph-mgr-dashboard-np
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rook-ceph-mgr-dashboard-np   NodePort   10.96.0.4   <none>   7000:55553/TCP   3h14m
The username is admin. To retrieve the password:
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
For resolving health warnings, see: https://docs.ceph.com/en/octopus/rados/operations/health-checks/
Block storage is typically used by a single Pod mounting a single volume, much like attaching a new disk to a server for one application's exclusive use.
Reference: https://rook.io/docs/rook/v1.8/ceph-block.html
[root@k8s-172-16-90-71 examples]# pwd
/root/rook/deploy/examples
[root@k8s-172-16-90-71 examples]# vi csi/rbd/storageclass.yaml
Since this is a test environment, the replica count is set to 2 (it cannot be set to 1); in production it should be at least 3, and no greater than the number of OSDs.
This file contains two resources: a CephBlockPool and a StorageClass.
A StorageClass has no namespace isolation, but a CephBlockPool is namespaced and needs to be created in the namespace the application lives in.
Here the CephBlockPool is deployed into the default namespace, so containers in that namespace can obtain persistent storage through PVCs.
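For reference, the relevant parts of csi/rbd/storageclass.yaml look roughly like this after lowering the replica count (abridged; the CSI secret parameters from the original file are omitted, and the pool and class names match the listings below):
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
spec:
  failureDomain: host
  replicated:
    size: 2               # test environment; use at least 3 in production
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph    # namespace where the Rook operator/cluster runs
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true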
Create the StorageClass and the storage pool:
[root@k8s-172-16-90-71 examples]# kubectl create -f csi/rbd/storageclass.yaml -n default
cephblockpool.ceph.rook.io/replicapool created
storageclass.storage.k8s.io/rook-ceph-block created
Check the created CephBlockPool and StorageClass:
[root@k8s-172-16-90-71 examples]# kubectl get cephblockpool
NAME AGE
replicapool 146m
[root@k8s-172-16-90-71 examples]# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
rook-ceph-block rook-ceph.rbd.csi.ceph.com Delete Immediate true 136m
At this point the pool should be visible in the Ceph dashboard; if it does not show up, the creation did not succeed.
Create a MySQL service
[root@k8s-172-16-90-71 examples]# pwd
/root/rook/deploy/examples
kubectl create -f mysql.yaml
kubectl create -f wordpress.yaml
The file contains a PVC definition.
The PVC references the StorageClass created earlier; a PV is then dynamically provisioned and backed by a corresponding image in Ceph.
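Sketched from the upstream example (the claim name, size, access mode and StorageClass all match the pvc/pv listings further below), the PVC section of mysql.yaml looks roughly like this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  storageClassName: rook-ceph-block
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi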
From now on, any PVC only needs to set storageClassName to the name of this StorageClass to be provisioned from Rook's Ceph. For a StatefulSet, set the storageClassName inside volumeClaimTemplates to the StorageClass name and each Pod will get its own dynamically created volume (see the example at the end of this section).
The MySQL Deployment's volumes section then mounts this PVC:
claimName is the name of the PVC.
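For illustration, the corresponding fragment of the Deployment in the upstream mysql.yaml looks roughly like this (the volume name mysql-persistent-storage and the mount path are taken from that example, not from this document):
        volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
      volumes:
      - name: mysql-persistent-storage
        persistentVolumeClaim:
          claimName: mysql-pv-claim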
Because multiple MySQL instances cannot share the same storage for their data, block storage is generally the only option; it is equivalent to attaching a new disk for MySQL's exclusive use.
After creation, view the resulting PVCs and PVs:
[root@k8s-172-16-90-71 examples]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-3d26c684-db32-4680-8133-edfcba2e9e47 20Gi RWO Delete Bound default/wp-pv-claim rook-ceph-block 146m
pvc-760c1dbf-ce4a-49c5-89ce-d845c65a5fbe 20Gi RWO Delete Bound default/mysql-pv-claim rook-ceph-block 146m
[root@k8s-172-16-90-71 examples]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
mysql-pv-claim Bound pvc-760c1dbf-ce4a-49c5-89ce-d845c65a5fbe 20Gi RWO rook-ceph-block 146m
wp-pv-claim Bound pvc-3d26c684-db32-4680-8133-edfcba2e9e47 20Gi RWO rook-ceph-block 146m
At this point the corresponding image can also be seen in the Ceph dashboard.
StatefulSet volumeClaimTemplates
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "rook-ceph-block"
      resources:
        requests:
          storage: 1Gi
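After applying this, each replica gets its own PVC created from the template, named www-web-0, www-web-1 and www-web-2, each bound to a dynamically provisioned volume from rook-ceph-block. A quick check (the file name sts.yaml is only assumed here):
kubectl create -f sts.yaml
kubectl get pvc
# expect www-web-0, www-web-1 and www-web-2 to show STATUS Bound with STORAGECLASS rook-ceph-block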