Introduction to Using Rook (Ceph)

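1. Introduction

Rook is a self-managing distributed storage orchestration system that provides convenient storage solutions for Kubernetes. Rook does not provide storage itself; it acts as an adaptation layer between Kubernetes and the storage system, simplifying the deployment and maintenance of that storage system. At the time of writing, the storage systems Rook supports are Ceph, CockroachDB, Cassandra, EdgeFS, Minio, and NFS; Ceph is in Stable status and the rest are Alpha. This article covers only Ceph.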

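Rook consists of two parts, the Operator and the Cluster:

  • Operator: consists of a set of CRDs and an all-in-one image, and contains all the functionality for starting and monitoring the storage system.
  • Cluster: responsible for creating the CRD objects and specifying the relevant parameters, including the Ceph image, the metadata persistence location, disk locations, the dashboard, and so on.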

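The figure below shows the Rook architecture. After the Operator starts, it first creates the Agent and Discover containers, which monitor and manage the storage resources on each node. The Cluster is then created; Cluster is a CRD defined when the Operator is created. Based on the Cluster configuration, the Operator starts the Ceph containers. Once the storage cluster is running, Kubernetes primitives are used to create PVCs for application containers.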

[Figure 1: Rook architecture diagram]

2. Deployment

2.1. Environment

[root@master-0 ~]# kubectl get nodes -o wide
NAME       STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME
master-0   Ready    master   24m   v1.13.0   172.16.7.11   <none>        CentOS Linux 7 (Core)   3.10.0-862.el7.x86_64   docker://18.9.2
worker-0   Ready    worker   23m   v1.13.0   172.16.7.12   <none>        CentOS Linux 7 (Core)   3.10.0-862.el7.x86_64   docker://18.9.2
worker-1   Ready    worker   23m   v1.13.0   172.16.7.13   <none>        CentOS Linux 7 (Core)   3.10.0-862.el7.x86_64   docker://18.9.2

[root@master-0 ~]# lsblk 
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0   20G  0 disk 
├─sda1            8:1    0    1G  0 part /boot
└─sda2            8:2    0   19G  0 part 
  ├─centos-root 253:0    0   17G  0 lvm  /
  └─centos-swap 253:1    0    2G  0 lvm  
sdb               8:16   0   20G  0 disk 
sr0              11:0    1 1024M  0 rom

Each node has two disks; the second one, sdb, is used as the Ceph data disk.
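
Since the data disk is handed to Rook unpartitioned, it can be worth confirming that state before deploying. The following is a minimal sketch, assuming `lsblk` from util-linux; the helper name `is_raw_disk` is hypothetical, and the check only approximates "no partitions and no filesystem signature" rather than reproducing Rook's own discovery logic.

```shell
# Sketch: succeed only if the device has no partitions and no filesystem.
# is_raw_disk is a hypothetical helper, not part of Rook.
is_raw_disk() {
  dev="$1"
  # With -n there is no header: one line per device or partition.
  # More than one line means partitions exist; a second field means
  # lsblk reported a filesystem type for that line.
  lsblk -n -o NAME,FSTYPE "$dev" |
    awk 'NR > 1 || NF > 1 { bad = 1 } END { exit (bad || NR == 0) }'
}

# Usage (run on each storage node):
#   is_raw_disk /dev/sdb && echo "sdb looks unused" || echo "sdb is in use"
```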

2.2. Deploying the Operator

[root@master-0 ~]# kubectl apply -f https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/ceph/operator.yaml

[root@master-0 ~]# kubectl get pods --all-namespaces 
NAMESPACE          NAME                                                   READY   STATUS    RESTARTS   AGE
<snip>
rook-ceph-system   rook-ceph-agent-4tf9h                                  1/1     Running     0          17m
rook-ceph-system   rook-ceph-agent-4zg9t                                  1/1     Running     0          17m
rook-ceph-system   rook-ceph-agent-r82n7                                  1/1     Running     0          17m
rook-ceph-system   rook-ceph-operator-b996864dd-zbn29                     1/1     Running     0          18m
rook-ceph-system   rook-discover-88zkc                                    1/1     Running     0          17m
rook-ceph-system   rook-discover-ffsns                                    1/1     Running     0          17m
rook-ceph-system   rook-discover-wt942                                    1/1     Running     0          17m

[root@master-0 ~]# kubectl get ds --all-namespaces 
NAMESPACE          NAME                             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                     AGE
<snip>
rook-ceph-system   rook-ceph-agent                  3         3         3       3            3           <none>                            17m
rook-ceph-system   rook-discover                    3         3         3       3            3           <none>                            17m

The key thing to watch is the Deployment rook-ceph-operator: once it is running, it launches the Agent and Discover pods as DaemonSets.
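
Instead of polling `kubectl get pods` by hand, the wait can be scripted. This is a minimal sketch using `kubectl rollout status` on the operator Deployment and then listing the DaemonSets; the helper name `wait_for_rook_operator` is an assumption, while the namespace and resource names are the ones shown in the output above.

```shell
# Sketch: block until the operator Deployment is rolled out, then show
# the DaemonSets it is expected to create.
# wait_for_rook_operator is a hypothetical helper name.
wait_for_rook_operator() {
  ns="${1:-rook-ceph-system}"
  kubectl -n "$ns" rollout status deployment/rook-ceph-operator &&
    kubectl -n "$ns" get daemonset
}

# Usage:
#   wait_for_rook_operator
```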

2.3. Deploying the Cluster

### set labels to mark which nodes run which components ###
[root@master-0 ~]# kubectl label node master-0 role=storage-node
node/master-0 labeled
[root@master-0 ~]# kubectl label node worker-0 role=storage-node
node/worker-0 labeled
[root@master-0 ~]# kubectl label node worker-1 role=storage-node
node/worker-1 labeled
[root@master-0 ~]# kubectl get nodes -L role
NAME       STATUS   ROLES    AGE   VERSION   ROLE
master-0   Ready    master   26m   v1.13.0   storage-node
worker-0   Ready    worker   25m   v1.13.0   storage-node
worker-1   Ready    worker   25m   v1.13.0   storage-node

Only one label is set here, so Ceph's mon, osd, and mgr all run on the nodes carrying it; you can of course set separate labels for each.

### cluster.yml ###
apiVersion: v1
kind: Namespace
metadata:
  name: rook-ceph
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-ceph-osd
  namespace: rook-ceph
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-ceph-mgr
  namespace: rook-ceph
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-osd
  namespace: rook-ceph
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: [ "get", "list", "watch", "create", "update", "delete" ]
---
# Aspects of ceph-mgr that require access to the system namespace
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-mgr-system
  namespace: rook-ceph-system
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - get
  - list
  - watch
---
# Aspects of ceph-mgr that operate within the cluster's namespace
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-mgr
  namespace: rook-ceph
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - services
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - jobs
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - delete
- apiGroups:
  - ceph.rook.io
  resources:
  - "*"
  verbs:
  - "*"
---
# Allow the operator to create resources in this cluster's namespace
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-cluster-mgmt
  namespace: rook-ceph
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rook-ceph-cluster-mgmt
subjects:
- kind: ServiceAccount
  name: rook-ceph-system
  namespace: rook-ceph-system
---
# Allow the osd pods in this namespace to work with configmaps
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-osd
  namespace: rook-ceph
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rook-ceph-osd
subjects:
- kind: ServiceAccount
  name: rook-ceph-osd
  namespace: rook-ceph
---
# Allow the ceph mgr to access the cluster-specific resources necessary for the mgr modules
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-mgr
  namespace: rook-ceph
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rook-ceph-mgr
subjects:
- kind: ServiceAccount
  name: rook-ceph-mgr
  namespace: rook-ceph
---
# Allow the ceph mgr to access the rook system resources necessary for the mgr modules
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-mgr-system
  namespace: rook-ceph-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rook-ceph-mgr-system
subjects:
- kind: ServiceAccount
  name: rook-ceph-mgr
  namespace: rook-ceph
---
# Allow the ceph mgr to access cluster-wide resources necessary for the mgr modules
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-mgr-cluster
  namespace: rook-ceph
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rook-ceph-mgr-cluster
subjects:
- kind: ServiceAccount
  name: rook-ceph-mgr
  namespace: rook-ceph
---
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    # The container image used to launch the Ceph daemon pods (mon, mgr, osd, mds, rgw).
    # v12 is luminous, v13 is mimic, and v14 is nautilus.
    # RECOMMENDATION: In production, use a specific version tag instead of the general v13 flag, which pulls the latest release and could result in different
    # versions running within the cluster. See tags available at https://hub.docker.com/r/ceph/ceph/tags/.
    image: 192.168.101.88:5000/ceph/ceph:v13.1 
    # Whether to allow unsupported versions of Ceph. Currently only luminous and mimic are supported.
    # After nautilus is released, Rook will be updated to support nautilus.
    # Do not set to true in production.
    allowUnsupported: false
  # The path on the host where configuration files will be persisted. If not specified, a kubernetes emptyDir will be created (not recommended).
  # Important: if you reinstall the cluster, make sure you delete this directory from each host or else the mons will fail to start on the new cluster.
  # In Minikube, the '/data' directory is configured to persist across reboots. Use "/data/rook" in Minikube environment.
  dataDirHostPath: /var/lib/rook
  # set the amount of mons to be started
  mon:
    count: 3
    allowMultiplePerNode: true
  # enable the ceph dashboard for viewing cluster status
  dashboard:
    enabled: true
    # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
    # urlPrefix: /ceph-dashboard
    # serve the dashboard at the given port.
    # port: 8443
    # serve the dashboard using SSL
    # ssl: true
  network:
    # toggle to use hostNetwork
    hostNetwork: false
  rbdMirroring:
    # The number of daemons that will perform the rbd mirroring.
    # rbd mirroring must be configured with "rbd mirror" from the rook toolbox.
    workers: 0
  # To control where various services will be scheduled by kubernetes, use the placement configuration sections below.
  # The example under 'all' would have all services scheduled on kubernetes nodes labeled with 'role=storage-node' and
  # tolerate taints with a key of 'storage-node'.
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: role
                  operator: In
                  values:
                    - storage-node
#      podAffinity:
#      podAntiAffinity:
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
# The above placement information can also be specified for mon, osd, and mgr components
#    mon:
#    osd:
#    mgr:
  resources:
# The requests and limits set here, allow the mgr pod to use half of one CPU core and 1 gigabyte of memory
#    mgr:
#      limits:
#        cpu: "500m"
#        memory: "1024Mi"
#      requests:
#        cpu: "500m"
#        memory: "1024Mi"
# The above example requests/limits can also be added to the mon and osd components
#    mon:
#    osd:
  storage: # cluster level storage configuration and selection
    useAllNodes: false
    useAllDevices: false
    deviceFilter:
    location:
    config:
      # The default and recommended storeType is dynamically set to bluestore for devices and filestore for directories.
      # Set the storeType explicitly only if it is required not to use the default.
      # storeType: bluestore
      databaseSizeMB: "1024" # this value can be removed for environments with normal sized disks (100 GB or larger)
      # journalSizeMB: "1024"  # this value can be removed for environments with normal sized disks (20 GB or larger)
      osdsPerDevice: "1" # this value can be overridden at the node or device level
# Cluster level list of directories to use for storage. These values will be set for all nodes that have no `directories` set.
#    directories:
#    - path: /rook/storage-dir
# Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named
# nodes below will be used as storage resources.  Each node's 'name' field should match their 'kubernetes.io/hostname' label.
    nodes:
      - name: "master-0"
        devices:
          - name: "sdb"
      - name: "worker-0"
        devices:
          - name: "sdb"
      - name: "worker-1"
        devices:
          - name: "sdb"
#    - name: "172.17.4.101"
#      directories: # specific directories to use for storage can be specified for each node
#      - path: "/rook/storage-dir"
#      resources:
#        limits:
#          cpu: "500m"
#          memory: "1024Mi"
#        requests:
#          cpu: "500m"
#          memory: "1024Mi"
#    - name: "172.17.4.201"
#      devices: # specific devices to use for storage can be specified for each node
#      - name: "sdb"
#      - name: "nvme01" # multiple osds can be created on high performance devices
#        config:
#          osdsPerDevice: "5"
#      config: # configuration can be specified at the node level which overrides the cluster level config
#        storeType: filestore
#    - name: "172.17.4.301"
#      deviceFilter: "^sd."

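A few things in CephCluster deserve particular attention:

  • spec.dataDirHostPath: stores Rook metadata so that the cluster keeps working after a server reboot; when redeploying, the leftover files must be deleted manually
  • spec.storage.useAllNodes: whether to use all nodes for storage according to the configuration; it must be set to false if nodes is specified
  • spec.storage.config: depending on the actual disk size, some settings under config can be removed
  • spec.storage.nodes: sets the storage location for each node individually, which can be a disk or a directory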
[root@master-0 ~]# kubectl apply -f cluster.yml 
namespace/rook-ceph created
serviceaccount/rook-ceph-osd created
serviceaccount/rook-ceph-mgr created
role.rbac.authorization.k8s.io/rook-ceph-osd created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-system created
role.rbac.authorization.k8s.io/rook-ceph-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
cephcluster.ceph.rook.io/rook-ceph created

[root@master-0 ~]# kubectl get pods --all-namespaces 
<snip>
rook-ceph-system   rook-ceph-agent-4tf9h                                  1/1     Running     0          19m
rook-ceph-system   rook-ceph-agent-4zg9t                                  1/1     Running     0          19m
rook-ceph-system   rook-ceph-agent-r82n7                                  1/1     Running     0          19m
rook-ceph-system   rook-ceph-operator-b996864dd-zbn29                     1/1     Running     0          20m
rook-ceph-system   rook-discover-88zkc                                    1/1     Running     0          19m
rook-ceph-system   rook-discover-ffsns                                    1/1     Running     0          19m
rook-ceph-system   rook-discover-wt942                                    1/1     Running     0          19m
rook-ceph          rook-ceph-mgr-a-7b9667498-j4bdx                        1/1     Running     0          16m
rook-ceph          rook-ceph-mon-a-749779c884-bqm9b                       1/1     Running     0          17m
rook-ceph          rook-ceph-mon-b-b97f6cbdb-hmrln                        1/1     Running     0          17m
rook-ceph          rook-ceph-mon-c-67d7dcc89f-pcjpn                       1/1     Running     0          16m
rook-ceph          rook-ceph-osd-0-86dff67f75-mvjkc                       1/1     Running     0          15m
rook-ceph          rook-ceph-osd-1-6cdd46dcdc-p62zq                       1/1     Running     0          15m
rook-ceph          rook-ceph-osd-2-c7b97f7bf-65r6d                        1/1     Running     0          15m
rook-ceph          rook-ceph-osd-prepare-master-0-vvp6c                   0/2     Completed   0          16m
rook-ceph          rook-ceph-osd-prepare-worker-0-spfbn                   0/2     Completed   0          16m
rook-ceph          rook-ceph-osd-prepare-worker-1-blmsz                   0/2     Completed   0          16m

If this step fails, check the operator logs.
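
One way to pull those logs is through the Deployment rather than a specific pod name, which changes on every restart. A minimal sketch follows; the helper name `rook_operator_logs` and the default of 100 lines are assumptions.

```shell
# Sketch: print the most recent operator log lines.
# rook_operator_logs is a hypothetical helper; the first argument is the
# number of lines to show (default 100).
rook_operator_logs() {
  kubectl -n rook-ceph-system logs deployment/rook-ceph-operator --tail="${1:-100}"
}

# Usage:
#   rook_operator_logs 200 | grep -i error
```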

### rook partitions the disk automatically ###
[root@master-0 ~]# lsblk 
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0   20G  0 disk 
├─sda1            8:1    0    1G  0 part /boot
└─sda2            8:2    0   19G  0 part 
  ├─centos-root 253:0    0   17G  0 lvm  /
  └─centos-swap 253:1    0    2G  0 lvm  
sdb               8:16   0   20G  0 disk 
├─sdb1            8:17   0  576M  0 part 
├─sdb2            8:18   0    1G  0 part 
└─sdb3            8:19   0 18.4G  0 part 
sr0              11:0    1 1024M  0 rom
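  • Do not partition sdb yourself beforehand; Rook partitions it on its own
  • Judging from the sample configuration, Rook can use a directory for storage as well as a device; that is not tried here.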
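
As noted for spec.dataDirHostPath, leftover state has to be removed by hand before a redeploy. The sketch below covers only that directory; the helper name `clean_rook_data_dir` is hypothetical, the default path is the dataDirHostPath value from cluster.yml, and wiping the partitions Rook created on sdb is a separate, destructive step that is deliberately left out.

```shell
# Sketch: remove Rook's persisted state on one host before redeploying.
# clean_rook_data_dir is a hypothetical helper; run it on every storage
# node after the old cluster has been deleted.
clean_rook_data_dir() {
  dir="${1:-/var/lib/rook}"
  # Refuse an obviously dangerous target.
  if [ "$dir" = "/" ]; then
    echo "refusing to remove '$dir'" >&2
    return 1
  fi
  rm -rf -- "$dir"
}

# Usage:
#   clean_rook_data_dir            # removes /var/lib/rook
#   clean_rook_data_dir /data/rook # e.g. the Minikube path
```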

3. Testing

3.1. Accessing the dashboard

[root@master-0 ~]# kubectl get svc --all-namespaces
NAMESPACE       NAME                                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
<snip>
rook-ceph       rook-ceph-mgr                            ClusterIP   10.100.35.42     <none>        9283/TCP         19m
rook-ceph       rook-ceph-mgr-dashboard                  NodePort    10.109.82.52     <none>        8443/TCP   19m
rook-ceph       rook-ceph-mon-a                          ClusterIP   10.107.76.183    <none>        6789/TCP         20m
rook-ceph       rook-ceph-mon-b                          ClusterIP   10.99.3.203      <none>        6789/TCP         20m
rook-ceph       rook-ceph-mon-c                          ClusterIP   10.97.73.46      <none>        6789/TCP         19m
### change the dashboard service type to NodePort ###
[root@master-0 ~]# kubectl edit svc -n rook-ceph rook-ceph-mgr-dashboard
[root@master-0 ~]# kubectl get svc -n rook-ceph rook-ceph-mgr-dashboard
NAME                      TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
rook-ceph-mgr-dashboard   NodePort   10.109.82.52   <none>        8443:30372/TCP   20m
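
The interactive `kubectl edit` step can also be done non-interactively. This is a minimal sketch using `kubectl patch` to set the Service type; the helper name `expose_dashboard_nodeport` is an assumption, and the node port is still assigned by Kubernetes, so it will generally differ from 30372.

```shell
# Sketch: switch the dashboard Service to NodePort without opening an editor.
# expose_dashboard_nodeport is a hypothetical helper name.
expose_dashboard_nodeport() {
  kubectl -n rook-ceph patch svc rook-ceph-mgr-dashboard \
    -p '{"spec":{"type":"NodePort"}}'
}

# Usage:
#   expose_dashboard_nodeport
#   kubectl get svc -n rook-ceph rook-ceph-mgr-dashboard
```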

### get the dashboard login password; the user is admin ###
[root@master-0 ~]# kubectl get secrets -n rook-ceph rook-ceph-dashboard-password -o jsonpath='{.data.password}' | base64 -d
eXnScTg7nm

Log in from a browser: https://172.16.7.11:30372

3.2. Creating a CephBlockPool and StorageClass

### block pool ###
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3

### sc ###
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: ceph.rook.io/block
parameters:
  blockPool: replicapool
  # The value of "clusterNamespace" MUST be the same as the one in which your rook cluster exist
  clusterNamespace: rook-ceph
  # Specify the filesystem type of the volume. If not specified, it will use `ext4`.
  fstype: xfs
# Optional, default reclaimPolicy is "Delete". The other option is "Retain", as documented in https://kubernetes.io/docs/concepts/storage/storage-classes/
reclaimPolicy: Retain
### test with busybox ###
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-block-volume
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: busybox
  name: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
        - name: busybox
          image: busybox
          command: ["sh", "-c", "sleep 3600"]
          volumeMounts:
            - name: volume
              mountPath: /volume
      volumes:
        - name: volume
          persistentVolumeClaim:
            claimName: ceph-block-volume
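
The manifests above still have to be applied, and the claim takes a moment to bind. The following is a minimal sketch that polls the PVC phase until it reports Bound; the helper name `wait_pvc_bound`, the file names in the usage note, and the retry count are all assumptions.

```shell
# Sketch: poll a PVC until its phase is Bound.
# wait_pvc_bound is a hypothetical helper; $1 is the PVC name and $2 the
# maximum number of attempts (default 30, two seconds apart).
wait_pvc_bound() {
  pvc="$1"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    phase="$(kubectl get pvc "$pvc" -o jsonpath='{.status.phase}')"
    if [ "$phase" = "Bound" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  return 1
}

# Usage (file names are placeholders for wherever the YAML was saved):
#   kubectl apply -f pool.yml -f storageclass.yml -f busybox.yml
#   wait_pvc_bound ceph-block-volume && echo "claim is bound"
```
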
[root@master-0 ~]# kubectl get pvc
NAME                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
ceph-block-volume   Bound    pvc-066408f1-5059-11e9-95ad-005056260373   1Gi        RWO            rook-ceph-block   23s
[root@master-0 ~]# kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
busybox-578db44c4-vwszl   1/1     Running   0          25s
[root@master-0 ~]# kubectl exec -ti busybox-578db44c4-vwszl sh
/ # df -h
Filesystem                Size      Used Available Use% Mounted on
overlay                  17.0G      4.3G     12.6G  26% /
tmpfs                    64.0M         0     64.0M   0% /dev
tmpfs                     1.8G         0      1.8G   0% /sys/fs/cgroup
/dev/rbd0              1014.0M     32.3M    981.7M   3% /volume
<snip>

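3.3. File storage and object storage

File storage and object storage work much like the above: create the CRD object first, then use it. However, the file system does not yet support StorageClass, so it is awkward to use when RWX is needed; a later option is to export an NFS service through ganesha to provide RWX.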

4. References

https://rook.io/docs/rook/v0.9/
