For a basic walkthrough of using Ceph with Kubernetes, the following two articles give detailed steps and the points to watch out for:
https://github.com/kubernetes/kubernetes/tree/master/examples/volumes/cephfs
http://tonybai.com/2017/05/08/mount-cephfs-acrossing-nodes-in-kubernetes-cluster/
- When generating the ceph-secret, the key in /etc/ceph/admin.secret has to be base64-encoded.
- The rbd image has to be created manually in advance.
- If the node kernel is too old to support some rbd image features, the features can be restricted when the image is created, or the defaults can be changed in the Ceph config file (see the sketch below).
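A minimal sketch of that manual image creation, assuming a pool named rbd and an image named foo (both placeholder names) and a 4 GiB size:
# Create the image with only the layering feature, which older kernels can still map
rbd create rbd/foo --size 4096 --image-format 2 --image-feature layering
# Or strip the newer features from an already existing image
rbd feature disable rbd/foo exclusive-lock object-map fast-diff deep-flatten
# Or make layering-only the default for new images in /etc/ceph/ceph.conf:
#   rbd_default_features = 1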
For a pod created following the documents above, the rbd it mounts does not disappear when the pod is deleted. For example, say POD A on node 1 mounts rbd B; on node 1 the mount can be observed with the mount command. After POD A is deleted, rbd B is still mounted on node 1. If a new POD C is then created that also mounts rbd B, and POD C gets scheduled onto node 2, rbd B can be seen moving from node 1 to node 2, and the data written earlier is still there.
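To watch this happen, the mapping can be checked on each node; a quick sketch (device paths will differ per node):
# List the rbd devices currently mapped on this node
rbd showmapped
# Show where the kubelet has mounted the mapped device
mount | grep rbd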
StorageClass dynamic provisioning
Static provisioning means creating the PV resources by hand ahead of time. Dynamic provisioning is one of the newer Kubernetes features that "allows storage volumes to be created on-demand"; with it, you no longer have to create a PV manually every time. Dynamic provisioning requires a StorageClass to be configured, and the StorageClass has to be referenced when the PVC is created.
The PVC spec has a storageClassName field. It holds the name of a StorageClass; if it is set to "", dynamic provisioning is disabled for that claim. Leaving storageClassName out entirely and writing storageClassName: "" are completely different things. In the first case Kubernetes uses the default StorageClass to dynamically provision a volume for the claim (this requires some StorageClass to have been marked as the DefaultStorageClass). In the second case dynamic provisioning is disabled outright. If no default StorageClass has been defined, the two behave the same.
This is covered in detail at https://kubernetes.io/docs/concepts/storage/persistent-volumes/#dynamic.
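For reference, a StorageClass becomes the default by carrying the is-default-class annotation; a sketch, with the caveat that the exact annotation key has changed between Kubernetes releases:
# Mark the StorageClass named "axiba" as the cluster default
kubectl patch storageclass axiba -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
# The default class is flagged with "(default)" in the listing
kubectl get storageclass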
Following the official documentation, the steps are:
1. Create the ceph-secret:
[root@walker-2 ~]# cat /etc/ceph/admin.secret | base64
QVFEdmRsTlpTSGJ0QUJBQUprUXh4SEV1ZGZ5VGNVa1U5cmdWdHc9PQo=
[root@walker-2 ~]# kubectl create secret generic ceph-secret --type="kubernetes.io/rbd" --from-literal=key='QVFEdmRsTlpTSGJ0QUJBQUprUXh4SEV1ZGZ5VGNVa1U5cmdWdHc9PQo=' --namespace=kube-system
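Before referencing it from a StorageClass, the secret can be double-checked like this (purely a verification step):
# Confirm the secret exists with the expected type and a "key" data field
kubectl get secret ceph-secret --namespace=kube-system -o yaml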
2. Create the StorageClass:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: axiba
provisioner: kubernetes.io/rbd
parameters:
  monitors: 172.16.18.5:6789,172.16.18.6:6789,172.16.18.7:6789
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: hehe
  pool: rbd
  userId: admin
  userSecretName: ceph-secret
  fsType: ext4
  imageFormat: "2"
  imageFeatures: "layering"
The parameters mean the following:
- adminId: Ceph client ID that is able to create images in the pool; defaults to admin
- adminSecretName: name of the Ceph client secret (the secret created in step 1)
- adminSecretNamespace: namespace the secret lives in
- pool: the Ceph pool to use; pools can be listed with ceph osd pool ls
- userId: Ceph client ID used to map the rbd image; defaults to admin
- userSecretName: same as adminSecretName
- imageFormat: Ceph RBD image format, "1" or "2"; defaults to "1". Format "2" supports more rbd features
- imageFeatures: only takes effect when imageFormat is "2"; defaults to "" (empty)
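Assuming the StorageClass above is saved to storageclass.yaml (a placeholder filename), it is created and verified with:
kubectl create -f storageclass.yaml
kubectl get storageclass axiba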
3. Create the PVC:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: axiba
  namespace: hehe
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi
  storageClassName: axiba
4. Create a pod that uses the PVC:
apiVersion: v1
kind: Pod
metadata:
  name: axiba
  namespace: hehe
spec:
  containers:
  - image: nginx:latest
    imagePullPolicy: IfNotPresent
    name: nginx
    resources: {}
    volumeMounts:
    - name: axiba
      mountPath: /usr/share/nginx/html
  volumes:
  - name: axiba
    persistentVolumeClaim:
      claimName: axiba
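Assuming the PVC and Pod manifests are saved as pvc.yaml and pod.yaml (placeholder filenames), they can be created and watched like this:
kubectl create -f pvc.yaml -f pod.yaml
# Watch whether the claim gets bound and the pod starts
kubectl get pvc,po --namespace=hehe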
After creating the pod with the configuration above, the PVC stays stuck in the Pending state:
[root@walker-1 ~]# kubectl describe pvc axiba --namespace=hehe
Name: axiba
Namespace: hehe
StorageClass: axiba
Status: Pending
Volume:
Labels:
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"axiba","namespace":"hehe"},"spec":{"accessModes":["ReadWriteOnce...
volume.beta.kubernetes.io/storage-provisioner=kubernetes.io/rbd
Capacity:
Access Modes:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning ProvisioningFailed 32s (x1701 over 7h) persistentvolume-controller Failed to provision volume with StorageClass "axiba": failed to create rbd image: executable file not found in $PATH, command output:
The event says the rbd executable cannot be found, even though ceph is installed on every node:
[root@walker-1 ~]# which rbd
/usr/bin/rbd
Some googling led to kubernetes issue #38923:
- Volume Provisioning: Currently, if you want dynamic provisioning, RBD provisioner in controller-manager needs to access rbd binary to create new image in ceph cluster for your PVC. external-storage plans to move volume provisioners from in-tree to out-of-tree, there will be a separated RBD provisioner container image with rbd utility included (kubernetes-incubator/external-storage#200), then controller-manager do not need access rbd binary anymore.
- Volume Attach/Detach: kubelet needs to access rbd binary to attach (rbd map) and detach (rbd unmap) RBD image on node. If kubelet is running on the host, host needs to install rbd utility (install ceph-common package on most Linux distributions).
In short: the controller-manager container, which is the one responsible for dynamically creating the RBD image, does not have ceph installed, so the rbd command is not available to it.
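If the controller-manager runs as a static pod (as on kubeadm-style clusters), the missing binary can be confirmed from inside its container; a rough sketch with a placeholder pod name, and note that the tools available inside the image vary:
# Find the controller-manager pod on this cluster
kubectl get pods --namespace=kube-system | grep controller-manager
# Look for the rbd binary inside the container; a "No such file" error confirms it is missing
kubectl exec --namespace=kube-system kube-controller-manager-xxx -- ls /usr/bin/rbd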
According to https://github.com/kubernetes-incubator/external-storage/issues/200, support for the RBD provisioner is being moved from in-tree to out-of-tree, so an out-of-tree solution has to be used (see also https://github.com/kubernetes/kubernetes/issues/38923).
First, create a Deployment for the rbd-provisioner:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: rbd-provisioner
  namespace: kube-system
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: rbd-provisioner
    spec:
      serviceAccountName: rbd-provision
      containers:
      - name: rbd-provisioner
        image: "quay.io/external_storage/rbd-provisioner:v0.1.1"
        env:
        - name: PROVISIONER_NAME
          value: ceph.com/rbd
As of 2017/11/09 the latest image tag is v0.1.1; newer tags can be checked at https://quay.io/repository/external_storage/rbd-provisioner.
I made a couple of small changes: the rbd-provisioner is placed in the kube-system namespace, and a rbd-provision ServiceAccount is created for it. Without that, once the Deployment is up, kubectl logs rbd-provisioner-xxx -f keeps reporting:
Failed to list *v1.PersistentVolumeClaim: User "system:serviceaccount:kube-system:default" cannot list persistentvolumeclaims at the cluster scope. (get persistentvolumeclaims)
i.e. permission problems of this kind.
The RBAC configuration is as follows:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-provision
subjects:
- kind: ServiceAccount
  name: rbd-provision
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:controller:persistent-volume-binder
  apiGroup: rbac.authorization.k8s.io
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: rbd-provision
  namespace: kube-system
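With both manifests in place (say rbac.yaml and rbd-provisioner.yaml, placeholder filenames), the provisioner is rolled out and checked like this, where rbd-provisioner-xxx stands for the real pod name:
kubectl create -f rbac.yaml -f rbd-provisioner.yaml
kubectl get pods --namespace=kube-system -l app=rbd-provisioner
kubectl logs rbd-provisioner-xxx -f --namespace=kube-system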
At the same time, update the StorageClass configuration:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: axiba
provisioner: ceph.com/rbd
parameters:
  monitors: 172.16.18.5:6789,172.16.18.6:6789,172.16.18.7:6789
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: hehe
  pool: rbd
  userId: admin
  userSecretName: ceph-secret
  imageFormat: "2"
  imageFeatures: "layering"
"No need to and do not add fsType: ext4 to storage class." Those are the provisioner author's own words; I don't know the exact reason. The volume is formatted and mounted as ext4 by default.
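One thing to keep in mind: the provisioner and parameters of an existing StorageClass cannot be updated in place, so switching to ceph.com/rbd means recreating the object; a sketch, assuming the updated definition is in storageclass.yaml:
# StorageClass provisioner/parameters are immutable, so delete and recreate
kubectl delete storageclass axiba
kubectl create -f storageclass.yaml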
Once everything is in place, an rbd image and a PV get created automatically:
[root@walker-1 hehe]# kubectl describe pvc axiba --namespace=hehe
Name: axiba
Namespace: hehe
StorageClass: axiba
Status: Bound
Volume: pvc-61785500-c434-11e7-96a0-fa163e028b17
Labels:
Annotations: control-plane.alpha.kubernetes.io/leader={"holderIdentity":"8444d083-c453-11e7-9d04-b2667a809e20","leaseDurationSeconds":15,"acquireTime":"2017-11-08T07:07:43Z","renewTime":"2017-11-08T07:08:14Z","lea...
kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"axiba","namespace":"hehe"},"spec":{"accessModes":["ReadWriteOnce...
pv.kubernetes.io/bind-completed=yes
pv.kubernetes.io/bound-by-controller=yes
volume.beta.kubernetes.io/storage-provisioner=ceph.com/rbd
Capacity: 4Gi
Access Modes: RWO
Events:
[root@walker-1 hehe]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-61785500-c434-11e7-96a0-fa163e028b17 4Gi RWO Delete Bound hehe/axiba axiba 20h
[root@walker-1 hehe]# rbd ls
kubernetes-dynamic-pvc-844efd73-c453-11e7-9d04-b2667a809e20
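Out of curiosity, the automatically created image can be inspected as well; with the imageFeatures above it should only have the layering feature:
rbd info kubernetes-dynamic-pvc-844efd73-c453-11e7-9d04-b2667a809e20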
However, when the pod mounts the volume, it first reports a timeout:
[root@walker-1 hehe]# kubectl describe po axiba --namespace=hehe
Name: axiba
Namespace: hehe
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 11m default-scheduler Successfully assigned aaa to walker-4.novalocal
Normal SuccessfulMountVolume 11m kubelet, walker-4.novalocal MountVolume.SetUp succeeded for volume "default-token-sx9pb"
Warning FailedMount 7m (x2 over 9m) kubelet, walker-4.novalocal Unable to mount volumes for pod "aaa_default(e34ec7f7-c528-11e7-96a0-fa163e028b17)": timeout expired waiting for volumes to attach/mount for pod "default"/"aaa". list of unattached/unmounted volumes=[aaa]
Warning FailedSync 7m (x2 over 9m) kubelet, walker-4.novalocal Error syncing pod
Normal SuccessfulMountVolume 6m (x2 over 6m) kubelet, walker-4.novalocal MountVolume.SetUp succeeded for volume "pvc-597c8a48-c503-11e7-96a0-fa163e028b17"
Normal Pulled 6m kubelet, walker-4.novalocal Container image "nginx:latest" already present on machine
Normal Created 6m kubelet, walker-4.novalocal Created container
Normal Started 6m kubelet, walker-4.novalocal Started container
I retried several times and kept hitting the same error, and could not find an explanation in the issues. What I can rule out is a Ceph misconfiguration: rbd images can be listed and mapped by hand, and the fact that the image was created automatically also shows the configuration is fine. Strangely, after a while things recovered on their own and the pod mounted the rbd successfully. Anyway, I'm keeping an eye on it.
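Once the pod is Running, the mount can be verified from inside it:
# The rbd-backed volume should show up at the nginx html path
kubectl exec axiba --namespace=hehe -- df -h /usr/share/nginx/html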