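provision, the volume has been allocated successfully
attach, the volume is attached to the corresponding worker node
mount, the volume is mounted as a file system and mapped into the corresponding Pod
umount, the volume has been unmapped on the corresponding worker node and unmounted from the file system
detach, the volume has been detached from the worker node
recycle, the volume has been reclaimed
pv controller, creates and reclaims volumes
attach detach controller, attaches and detaches volumes
volume manager, mounts and unmounts volumes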
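The two tables above can be read together as a small state machine. The following is a minimal sketch, not Kubernetes code: the Phase type, Owner, allowedNext and ValidOrder are hypothetical names introduced here, and the transition table is an assumption derived from the flows this article describes (remount on the same node after umount, re-attach after detach).

```go
package main

import "fmt"

// Phase is one step of the volume lifecycle from the first table.
type Phase int

const (
	Provision Phase = iota
	Attach
	Mount
	Umount
	Detach
	Recycle
)

var phaseNames = [...]string{"provision", "attach", "mount", "umount", "detach", "recycle"}

func (p Phase) String() string { return phaseNames[p] }

// Owner returns the component responsible for a phase, per the second table.
func Owner(p Phase) string {
	switch p {
	case Provision, Recycle:
		return "pv controller"
	case Attach, Detach:
		return "attach detach controller"
	default:
		return "volume manager"
	}
}

// allowedNext is an assumed transition table: a volume may be remounted
// after umount, or re-attached after detach.
var allowedNext = map[Phase][]Phase{
	Provision: {Attach},
	Attach:    {Mount},
	Mount:     {Umount},
	Umount:    {Mount, Detach},
	Detach:    {Attach, Recycle},
}

// ValidOrder reports whether every consecutive pair of phases is an allowed transition.
func ValidOrder(phases []Phase) bool {
	for i := 1; i < len(phases); i++ {
		ok := false
		for _, n := range allowedNext[phases[i-1]] {
			if n == phases[i] {
				ok = true
				break
			}
		}
		if !ok {
			return false
		}
	}
	return true
}

func main() {
	for p := Provision; p <= Recycle; p++ {
		fmt.Printf("%-9s handled by %s\n", p, Owner(p))
	}
	fmt.Println(ValidOrder([]Phase{Attach, Mount, Umount, Detach}))
	fmt.Println(ValidOrder([]Phase{Attach, Detach, Mount}))
}
```

The point of the sketch is that no single component sees the whole sequence: each owner only handles its own phases, so the ordering has to be enforced through shared state.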
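// Reconcile loop: repeatedly drive the current state toward the desired state.
for {
    desired := getDesiredState()
    current := getCurrentState()
    // Apply only the changes needed to close the gap between the two.
    makeChanges(desired, current)
}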
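Taking these three dimensions together, Kubernetes must keep the volume lifecycle in the correct order even though volume management is spread across different controllers. Using a Pod that consumes a volume as the example, let's see how Kubernetes achieves this.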
[root@10-10-88-152 ~]# kubectl get nodes 10-10-88-113 -o yaml
apiVersion: v1
kind: Node
....
  volumesAttached:
  - devicePath: csi-add9fc778d9593d01818d65ccde7013e87327d9f675b47df42a34b860c581711
    name: kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-4faa18f5bbbd11e8-1365
  - devicePath: csi-5dd249387138238e8e2209eb471450a072dd6543adde7a6769c8461943c789ca
    name: kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-4fa9b764bbbd11e8-1366
  - devicePath: csi-bc9b81e32d84e8890d17568964c1e01af97b0c175e0b73d4bf30bba54e3f1a1e
    name: kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-4fa94533bbbd11e8-1364
  volumesInUse:
  - kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-4fa94533bbbd11e8-1364
  - kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-4fa9b764bbbd11e8-1366
  - kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-4faa18f5bbbd11e8-1365
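The Node status shown above is the shared state that the components read. As a rough sketch of how it could be consulted, the Go below models only the two fields in the output. The types NodeStatus and AttachedVolume are local stand-ins written for this example, and the helpers DevicePathFor and SafeToDetach are hypothetical, not Kubernetes functions.

```go
package main

import "fmt"

// AttachedVolume mirrors one entry of volumesAttached in the output above.
type AttachedVolume struct {
	Name       string
	DevicePath string
}

// NodeStatus holds only the two volume-related fields shown above.
type NodeStatus struct {
	VolumesAttached []AttachedVolume
	VolumesInUse    []string
}

// DevicePathFor returns the device path recorded for an attached volume.
// The second result is false when the volume is not listed as attached.
func DevicePathFor(s NodeStatus, name string) (string, bool) {
	for _, v := range s.VolumesAttached {
		if v.Name == name {
			return v.DevicePath, true
		}
	}
	return "", false
}

// SafeToDetach reports whether the volume is absent from volumesInUse,
// that is, whether the node no longer reports it as in use.
func SafeToDetach(s NodeStatus, name string) bool {
	for _, n := range s.VolumesInUse {
		if n == name {
			return false
		}
	}
	return true
}

func main() {
	vol := "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-4faa18f5bbbd11e8-1365"
	status := NodeStatus{
		VolumesAttached: []AttachedVolume{{
			Name:       vol,
			DevicePath: "csi-add9fc778d9593d01818d65ccde7013e87327d9f675b47df42a34b860c581711",
		}},
		VolumesInUse: []string{vol},
	}
	path, ok := DevicePathFor(status, vol)
	fmt.Println(path, ok)
	fmt.Println(SafeToDetach(status, vol))
}
```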
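The volume is first mounted at a global path on the node, for example /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-3ecd68c7b7d211e8/globalmount.
It is then mapped to the Pod's own path, for example /var/lib/kubelet/pods/49a5fede-b811-11e8-844f-fa7378845e00/volumes/kubernetes.io~csi/pvc-3ecd68c7b7d211e8/mount.
The volume is marked as successfully mounted in actualStateOfWorld.
The Pod is removed from the cached information in desiredStateOfWorld.
The volume is still mounted in actualStateOfWorld, but desiredStateOfWorld shows that the Pod should no longer mount it, so UnmountVolume is executed: the mapping between the Pod and the volume is removed, and the Pod is dropped from the volume's entry in actualStateOfWorld.
If the volume is now associated with no Pod in the actual state, it can be fully separated from the node: UnmountDevice is executed first to unmount the volume's global path, and on the next reconcile MarkVolumeAsDetached removes the volume from the actual state entirely.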
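The mount and unmount steps above can be sketched as a single reconcile pass over a desired and an actual state. This is a deliberately simplified model, not the kubelet implementation: the Desired, Actual and actualVolume types and the Reconcile function are hypothetical, attach is assumed to have already happened, and operations are recorded as strings instead of being executed.

```go
package main

import "fmt"

// actualVolume is a simplified stand-in for an entry in actualStateOfWorld.
type actualVolume struct {
	pods            map[string]bool // pods the volume is currently mounted for
	globallyMounted bool            // whether the node-level global path is mounted
}

// Desired maps volume name -> set of pods that should mount it.
type Desired map[string]map[string]bool

// Actual maps volume name -> what is really mounted on the node.
type Actual map[string]*actualVolume

// Reconcile performs one pass and returns the operations it carried out.
func Reconcile(desired Desired, actual Actual) []string {
	var ops []string
	// 1. Unmount volumes from pods that are no longer desired.
	for vol, av := range actual {
		for pod := range av.pods {
			if !desired[vol][pod] {
				delete(av.pods, pod)
				ops = append(ops, "UnmountVolume "+vol+" "+pod)
			}
		}
	}
	// 2. Mount volumes for desired pods: global path first, then the pod path.
	for vol, pods := range desired {
		if len(pods) == 0 {
			continue
		}
		av, ok := actual[vol]
		if !ok {
			av = &actualVolume{pods: map[string]bool{}}
			actual[vol] = av
		}
		for pod := range pods {
			if av.pods[pod] {
				continue
			}
			if !av.globallyMounted {
				av.globallyMounted = true
				ops = append(ops, "MountDevice "+vol)
			}
			av.pods[pod] = true
			ops = append(ops, "MountVolume "+vol+" "+pod)
		}
	}
	// 3. Volumes with no pods: unmount the global path in this pass,
	//    and only drop the volume from the actual state in a later pass.
	for vol, av := range actual {
		if len(av.pods) > 0 {
			continue
		}
		if av.globallyMounted {
			av.globallyMounted = false
			ops = append(ops, "UnmountDevice "+vol)
		} else {
			delete(actual, vol)
			ops = append(ops, "MarkVolumeAsDetached "+vol)
		}
	}
	return ops
}

func main() {
	desired := Desired{"pvc-1": {"pod-a": true}}
	actual := Actual{}
	fmt.Println(Reconcile(desired, actual)) // Pod created
	delete(desired, "pvc-1")                // Pod deleted
	fmt.Println(Reconcile(desired, actual))
	fmt.Println(Reconcile(desired, actual))
}
```

In this model, deleting the Pod takes two passes to finish: the first runs UnmountVolume and UnmountDevice, and only the second removes the volume from the actual state. A new Pod that shows up between those two passes sees a volume that is still tracked but no longer globally mounted.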
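Components cooperate through resource status: the attach detach controller relies on the PVC-to-PV binding state, and the volume manager relies on the volume attached state in the node status.
Components converge on the desired state by reconciling, and reaching it may take several reconcile passes; for example, after a Pod is removed, the volume is only eventually separated from the node.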
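volume manager sees that the Pod has been deleted and runs umount
StatefulSet sees that the Pod has been deleted and immediately creates a new Pod
scheduler sees the new Pod and schedules it
volume manager sees that the existing volume must be bound to the Pod and runs mount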
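volume manager sees that the Pod has been deleted and runs umount/UnmountDevice/MarkVolumeAsDetached (over several reconcile passes)
attach detach controller sees that the volume is no longer in use on the node and runs detach
scheduler sees the new Pod and schedules it
attach detach controller sees that the volume needs to be attached and runs attach
volume manager mounts the volume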
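StatefulSet sees that the Pod has been deleted and immediately creates a new Pod
volume manager sees that the Pod has been deleted and runs umount/UnmountDevice (over several reconcile passes); note that devicePath and deviceMountPath are both empty at this point
scheduler sees the new Pod and schedules it
volume manager sees that the existing volume must be bound to the Pod and runs mount while devicePath and deviceMountPath are both empty, and the problem appears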
Sep 14 19:28:33 10-10-40-16 kubelet: I0914 19:28:33.174310 1953 operation_generator.go:1168] Controller attach succeeded for volume "pvc-3ecd68c7b7d211e8" (UniqueName: "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338") pod "yoooo-416ea0-0" (UID: "49a5fede-b811-11e8-844f-fa7378845e00") device path: "csi-eb93736e654600786d95eaffa7cd5d616f11a90bdc109e0df575e8646c250eb2"
Sep 14 19:28:33 10-10-40-16 kubelet: I0914 19:28:33.273344 1953 operation_generator.go:486] MountVolume.WaitForAttach entering for volume "pvc-3ecd68c7b7d211e8" (UniqueName: "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338") pod "yoooo-416ea0-0" (UID: "49a5fede-b811-11e8-844f-fa7378845e00") DevicePath "csi-eb93736e654600786d95eaffa7cd5d616f11a90bdc109e0df575e8646c250eb2"
Sep 14 19:28:33 10-10-40-16 kubelet: I0914 19:28:33.318275 1953 operation_generator.go:495] MountVolume.WaitForAttach succeeded for volume "pvc-3ecd68c7b7d211e8" (UniqueName: "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338") pod "yoooo-416ea0-0" (UID: "49a5fede-b811-11e8-844f-fa7378845e00") DevicePath "csi-eb93736e654600786d95eaffa7cd5d616f11a90bdc109e0df575e8646c250eb2"
Sep 14 19:28:33 10-10-40-16 kubelet: I0914 19:28:33.319345 1953 operation_generator.go:514] MountVolume.MountDevice succeeded for volume "pvc-3ecd68c7b7d211e8" (UniqueName: "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338") pod "yoooo-416ea0-0" (UID: "49a5fede-b811-11e8-844f-fa7378845e00") device mount path "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-3ecd68c7b7d211e8/globalmount"
Sep 14 19:29:12 10-10-40-16 kubelet: I0914 19:29:12.826916 1953 operation_generator.go:486] MountVolume.WaitForAttach entering for volume "pvc-3ecd68c7b7d211e8" (UniqueName: "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338") pod "yoooo-416ea0-0" (UID: "67f223dc-b811-11e8-844f-fa7378845e00") DevicePath "csi-eb93736e654600786d95eaffa7cd5d616f11a90bdc109e0df575e8646c250eb2"
Sep 14 19:29:14 10-10-40-16 kubelet: I0914 19:29:14.465225 1953 operation_generator.go:495] MountVolume.WaitForAttach succeeded for volume "pvc-3ecd68c7b7d211e8" (UniqueName: "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338") pod "yoooo-416ea0-0" (UID: "67f223dc-b811-11e8-844f-fa7378845e00") DevicePath "csi-eb93736e654600786d95eaffa7cd5d616f11a90bdc109e0df575e8646c250eb2"
Sep 14 19:29:14 10-10-40-16 kubelet: I0914 19:29:14.466483 1953 operation_generator.go:514] MountVolume.MountDevice succeeded for volume "pvc-3ecd68c7b7d211e8" (UniqueName: "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338") pod "yoooo-416ea0-0" (UID: "67f223dc-b811-11e8-844f-fa7378845e00") device mount path "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-3ecd68c7b7d211e8/globalmount"
Sep 14 19:29:15 10-10-40-16 kubelet: W0914 19:29:15.491424 1953 csi_mounter.go:354] kubernetes.io/csi: skipping mount dir removal, path does not exist [/var/lib/kubelet/pods/49a5fede-b811-11e8-844f-fa7378845e00/volumes/kubernetes.io~csi/pvc-3ecd68c7b7d211e8/mount]
Sep 14 19:29:15 10-10-40-16 kubelet: I0914 19:29:15.491450 1953 operation_generator.go:686] UnmountVolume.TearDown succeeded for volume "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338" (OuterVolumeSpecName: "data") pod "49a5fede-b811-11e8-844f-fa7378845e00" (UID: "49a5fede-b811-11e8-844f-fa7378845e00"). InnerVolumeSpecName "pvc-3ecd68c7b7d211e8". PluginName "kubernetes.io/csi", VolumeGidValue ""
Sep 14 19:29:44 10-10-40-16 kubelet: W0914 19:29:44.896387 1953 csi_mounter.go:354] kubernetes.io/csi: skipping mount dir removal, path does not exist [/var/lib/kubelet/pods/67f223dc-b811-11e8-844f-fa7378845e00/volumes/kubernetes.io~csi/pvc-3ecd68c7b7d211e8/mount]
Sep 14 19:29:44 10-10-40-16 kubelet: I0914 19:29:44.896403 1953 operation_generator.go:686] UnmountVolume.TearDown succeeded for volume "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338" (OuterVolumeSpecName: "data") pod "67f223dc-b811-11e8-844f-fa7378845e00" (UID: "67f223dc-b811-11e8-844f-fa7378845e00"). InnerVolumeSpecName "pvc-3ecd68c7b7d211e8". PluginName "kubernetes.io/csi", VolumeGidValue ""
Sep 14 19:29:44 10-10-40-16 kubelet: I0914 19:29:44.917540 1953 reconciler.go:278] operationExecutor.UnmountDevice started for volume "pvc-3ecd68c7b7d211e8" (UniqueName: "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338") on node "10-10-40-16"
Sep 14 19:29:44 10-10-40-16 kubelet: W0914 19:29:44.919231 1953 mount_linux.go:179] could not determine device for path: "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-3ecd68c7b7d211e8/globalmount"
Sep 14 19:29:45 10-10-40-16 kubelet: I0914 19:29:45.609605 1953 operation_generator.go:760] UnmountDevice succeeded for volume "pvc-3ecd68c7b7d211e8" %!(EXTRA string=UnmountDevice succeeded for volume "pvc-3ecd68c7b7d211e8" (UniqueName: "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338") on node "10-10-40-16" )
Sep 14 19:29:45 10-10-40-16 kubelet: I0914 19:29:45.624963 1953 operation_generator.go:486] MountVolume.WaitForAttach entering for volume "pvc-3ecd68c7b7d211e8" (UniqueName: "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338") pod "yoooo-416ea0-0" (UID: "77b8caf7-b811-11e8-844f-fa7378845e00") DevicePath ""
Sep 14 19:29:46 10-10-40-16 kubelet: E0914 19:29:46.006612 1953 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338\"" failed. No retries permitted until 2018-09-14 19:29:46.506583596 +0800 CST m=+105572.978439381 (durationBeforeRetry 500ms). Error: "MountVolume.WaitForAttach failed for volume \"pvc-3ecd68c7b7d211e8\" (UniqueName: \"kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338\") pod \"yoooo-416ea0-0\" (UID: \"77b8caf7-b811-11e8-844f-fa7378845e00\") : resource name may not be empty"
Sep 14 19:29:46 10-10-40-16 kubelet: I0914 19:29:46.533962 1953 operation_generator.go:486] MountVolume.WaitForAttach entering for volume "pvc-3ecd68c7b7d211e8" (UniqueName: "kubernetes.io/csi/csi-qcfsplugin^csi-qcfs-volume-3ecd68c7b7d211e8-338") pod "yoooo-416ea0-0" (UID: "77b8caf7-b811-11e8-844f-fa7378845e00") DevicePath ""
At Sep 14 19:29:14 and earlier, DevicePath is non-empty
At Sep 14 19:29:45 and later, DevicePath is empty
Sep 14 19:29:14 …… MountVolume.MountDevice ……
Sep 14 19:29:15 ….. UnmountVolume.TearDown ……
Sep 14 19:29:44 …… UnmountVolume.TearDown ……
Sep 14 19:29:44 …… operationExecutor.UnmountDevice ……
Sep 14 19:29:44 …… could not determine device for path ….
Step 4 calls functions that set this state:
The key function is SetVolumeGloballyMounted, reached through the call chain UnmountDevice->GenerateUnmountDeviceFunc->actualStateOfWorld.MarkDeviceAsUnmounted->asw.SetVolumeGloballyMounted
asw.SetVolumeGloballyMounted(volumeName, false /* globallyMounted */, /* devicePath */"", /* deviceMountPath */"")
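To make the effect of that call concrete, here is a minimal sketch of what unconditionally overwriting the paths does to a later mount. It is a model for illustration only: attachedVolume is a cut-down stand-in for the kubelet's record, setGloballyMounted and waitForAttach only mimic the behavior seen in the call and the log above, and setGloballyMountedGuarded is a hypothetical variant shown for contrast, not a claim about how upstream fixed the issue.

```go
package main

import (
	"errors"
	"fmt"
)

// attachedVolume is a simplified stand-in for the kubelet's per-volume
// record in actualStateOfWorld; the real struct has more fields.
type attachedVolume struct {
	globallyMounted bool
	devicePath      string
	deviceMountPath string
}

// setGloballyMounted mimics the call shown above: both paths are overwritten
// unconditionally, so passing "" erases the devicePath learned at attach time.
func setGloballyMounted(v *attachedVolume, mounted bool, devicePath, deviceMountPath string) {
	v.globallyMounted = mounted
	v.devicePath = devicePath
	v.deviceMountPath = deviceMountPath
}

// setGloballyMountedGuarded is a hypothetical variant that keeps the
// existing devicePath when the caller passes an empty one.
func setGloballyMountedGuarded(v *attachedVolume, mounted bool, devicePath, deviceMountPath string) {
	v.globallyMounted = mounted
	if devicePath != "" {
		v.devicePath = devicePath
	}
	v.deviceMountPath = deviceMountPath
}

// waitForAttach mimics the failure in the log: an empty devicePath is rejected.
func waitForAttach(v *attachedVolume) error {
	if v.devicePath == "" {
		return errors.New("resource name may not be empty")
	}
	return nil
}

func main() {
	// The volume stays attached to the node, but its device is unmounted
	// before the replacement Pod tries to mount it again.
	v := &attachedVolume{globallyMounted: true, devicePath: "csi-eb93", deviceMountPath: "/globalmount"}
	setGloballyMounted(v, false, "", "")
	fmt.Println("overwrite:", waitForAttach(v))

	w := &attachedVolume{globallyMounted: true, devicePath: "csi-eb93", deviceMountPath: "/globalmount"}
	setGloballyMountedGuarded(w, false, "", "")
	fmt.Println("guarded:  ", waitForAttach(w))
}
```

Because the volume is never detached in this scenario, nothing repopulates devicePath before the next mount attempt, which matches the repeated WaitForAttach failures in the log.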
Summary
https://github.com/kubernetes/kubernetes/blob/release-1.10/pkg/controller/volume/persistentvolume/pv_controller.go#L301
https://github.com/kubernetes/kubernetes/blob/release-1.10/pkg/controller/volume/attachdetach/populator/desired_state_of_world_populator.go#L88
https://github.com/kubernetes/kubernetes/blob/release-1.10/pkg/controller/volume/attachdetach/reconciler/reconciler.go#L251
https://github.com/kubernetes/kubernetes/blob/release-1.10/pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go#L152
https://github.com/kubernetes/kubernetes/blob/release-1.10/pkg/kubelet/volumemanager/reconciler/reconciler.go#L160
https://github.com/kubernetes/kubernetes/blob/release-1.10/pkg/kubelet/volumemanager/reconciler/reconciler.go#L238