Kubernetes Pod imagePullPolicy Explained

Background

Under what circumstances does Kubernetes re-pull the image of a container it manages?

Concepts

Official documentation: https://v1-12.docs.kubernetes.io/docs/concepts/containers/images/#updating-images

For a Pod, the spec.containers[].imagePullPolicy field controls how the container image is pulled; the valid values are IfNotPresent, Always, and Never.

The default imagePullPolicy is IfNotPresent: with an image pinned to a specific tag, such as image: nginx:1.12.5, the kubelet skips the image pull step when the image already exists on the host.

If you want the image to be pulled from the registry every time a container starts, any one of the following configurations will do:

  1. Set imagePullPolicy to Always;
  2. Leave imagePullPolicy unset and use the latest tag for the image, i.e. image: nginx:latest;
  3. Leave imagePullPolicy unset and omit the image tag as well, e.g. image: nginx.

Experiment

Test setup

nginx-deployment.yaml, with imagePullPolicy: Always configured on the container:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  minReadySeconds: 30
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.15.4
        imagePullPolicy: Always
        ports:
        - containerPort: 80

Test method

  1. Create the Deployment and watch the events of one of its Pods;
  2. Without operating on the Pod object itself, make the container inside it exit by some means (simulating a scenario such as an OOM kill);
  3. Observe how the Pod metadata changes and whether the Docker image is pulled again.

Test procedure

root@kmaster135:/home/chenjiaxi01/yaml/controllers# kubectl get pods -o wide| grep nginx
nginx-deployment-57f495d87b-k6g48   1/1     Running     2          173m   10.244.2.158   dnode137   
nginx-deployment-57f495d87b-rcvlg   1/1     Running     0          173m   10.244.2.156   dnode137   
nginx-deployment-57f495d87b-tnmrq   1/1     Running     0          173m   10.244.1.141   dnode136   

Locate the Pod on its host node:

root@dnode137:~# docker ps | grep nginx-deployment-57f495d87b-rcvlg
9947530bc668        nginx@sha256:e8ab8d42e0c34c104ac60b43ba60b19af08e19a0e6d50396bdfd4cef0347ba83   "nginx -g 'daemon ..."   2 hours ago         Up 2 hours                              k8s_nginx_nginx-deployment-57f495d87b-rcvlg_default_40cc107c-ee33-11e9-8ba3-000c290b4cc5_0
6436b335a958        k8s.gcr.io/pause:3.1                                                            "/pause"                 2 hours ago         Up 2 hours                              k8s_POD_nginx-deployment-57f495d87b-rcvlg_default_40cc107c-ee33-11e9-8ba3-000c290b4cc5_0

Stop the Nginx container 9947530bc668:

root@dnode137:~# docker stop 9947530bc668
9947530bc668

Observe the Pod:

root@kmaster135:/home/chenjiaxi01/yaml/controllers# kubectl describe pods nginx-deployment-57f495d87b-rcvlg
Name:               nginx-deployment-57f495d87b-rcvlg
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               dnode137/192.168.77.137
Start Time:         Sun, 13 Oct 2019 20:32:30 -0700
Labels:             app=nginx
                    pod-template-hash=57f495d87b
Annotations:        <none>
Status:             Running
IP:                 10.244.2.156
Controlled By:      ReplicaSet/nginx-deployment-57f495d87b
Containers:
  nginx:
    Container ID:   docker://29a227a4831cce1f02bd83512dfcc377f703f5663ed5e6409a3db2f0884ba374
    Image:          nginx:1.15.4
    Image ID:       docker-pullable://nginx@sha256:e8ab8d42e0c34c104ac60b43ba60b19af08e19a0e6d50396bdfd4cef0347ba83
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sun, 13 Oct 2019 23:27:29 -0700
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sun, 13 Oct 2019 20:32:46 -0700
      Finished:     Sun, 13 Oct 2019 23:27:23 -0700
    Ready:          True
    Restart Count:  1
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zh48z (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
...
Events:
  Type    Reason   Age                 From               Message
  ----    ------   ----                ----               -------
  Normal  Pulling  32s (x2 over 175m)  kubelet, dnode137  pulling image "nginx:1.15.4"
  Normal  Pulled   27s (x2 over 175m)  kubelet, dnode137  Successfully pulled image "nginx:1.15.4"
  Normal  Created  27s (x2 over 175m)  kubelet, dnode137  Created container
  Normal  Started  27s (x2 over 175m)  kubelet, dnode137  Started container

Test conclusions

  1. The Pod's Start Time stays the same, so this is a container restart inside the Pod, not a delete-and-recreate of the Pod;
  2. The Node does not change, i.e. the Pod is not rescheduled;
  3. Does the IP change? It did not change when only the app container was stopped. What if the infra (pause) container were stopped instead? Presumably it could change, because the IP is assigned by the CNI plugin when the sandbox is recreated;
  4. The Containers section changes because the container was recreated: State describes the current container run, while Last State keeps the basic information of the previous run;
  5. The container's Restart Count goes from 0 to 1, counting how many times the container has been restarted; the same number shows up in kubectl get pod (see the client-go sketch after this list);
  6. The Events show kubelet, dnode137 pulling image "nginx:1.15.4", i.e. the image was pulled again to start the new container. With the default imagePullPolicy: IfNotPresent, the event would instead read Container image "nginx:1.15.4" already present on machine whenever the image already exists locally.
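
As a cross-check on points 4 and 5, the same fields can be read programmatically. The sketch below is not part of the original test: it assumes a reasonably recent client-go (where Get takes a context) and a kubeconfig at /root/.kube/config (adjust the path and pod name for your cluster), and prints the Pod start time, each container's restart count, and the reason recorded in its last terminated state:

package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumed kubeconfig location; adjust for your environment.
	config, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// One of the Pods from the test above.
	pod, err := clientset.CoreV1().Pods("default").Get(context.TODO(),
		"nginx-deployment-57f495d87b-rcvlg", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	// Start Time stays the same across container restarts.
	fmt.Println("Start Time:", pod.Status.StartTime)
	for _, cs := range pod.Status.ContainerStatuses {
		fmt.Printf("container=%s restartCount=%d\n", cs.Name, cs.RestartCount)
		// Last State of the previous container run, if any.
		if cs.LastTerminationState.Terminated != nil {
			fmt.Println("  last state reason:", cs.LastTerminationState.Terminated.Reason)
		}
	}
}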

Source code analysis

In the kubelet code, the image pull happens inside the pod sync routine SyncPod; the relevant call chain is traced below:

  1. func SyncPod(): pkg/kubelet/kuberuntime/kuberuntime_manager.go:578
// SyncPod syncs the running pod into the desired pod by executing following steps:
//
//  1. Compute sandbox and container changes.
//  2. Kill pod sandbox if necessary.
//  3. Kill any containers that should not be running.
//  4. Create sandbox if necessary.
//  5. Create init containers.
//  6. Create normal containers.
func (m *kubeGenericRuntimeManager) SyncPod(pod *v1.Pod, _ v1.PodStatus, podStatus *kubecontainer.PodStatus, pullSecrets []v1.Secret, backOff *flowcontrol.Backoff) (result kubecontainer.PodSyncResult) {
...
        glog.V(4).Infof("Creating init container %+v in pod %v", container, format.Pod(pod))
        if msg, err := m.startContainer(podSandboxID, podSandboxConfig, container, pod, podStatus, pullSecrets, podIP, kubecontainer.ContainerTypeInit); err != nil {
            startContainerResult.Fail(err, msg)
            utilruntime.HandleError(fmt.Errorf("init container start failed: %v: %s", err, msg))
            return
        }
...
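    // Step 6, elided in the "..." above, creates the normal app containers through
    // the same m.startContainer call, which is where the image pull happens.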
}
  2. func startContainer: pkg/kubelet/kuberuntime/kuberuntime_container.go:89
// startContainer starts a container and returns a message indicates why it is failed on error.
// It starts the container through the following steps:
// * pull the image
// * create the container
// * start the container
// * run the post start lifecycle hooks (if applicable)
func (m *kubeGenericRuntimeManager) startContainer(podSandboxID string, podSandboxConfig *runtimeapi.PodSandboxConfig, container *v1.Container, pod *v1.Pod, podStatus *kubecontainer.PodStatus, pullSecrets []v1.Secret, podIP string, containerType kubecontainer.ContainerType) (string, error) {
    // Step 1: pull the image.
    imageRef, msg, err := m.imagePuller.EnsureImageExists(pod, container, pullSecrets)
    if err != nil {
        m.recordContainerEvent(pod, container, "", v1.EventTypeWarning, events.FailedToCreateContainer, "Error: %v", grpc.ErrorDesc(err))
        return msg, err
    }
    ...
}
  3. type ImageManager interface: pkg/kubelet/images/types.go:50
// ImageManager provides an interface to manage the lifecycle of images.
// Implementations of this interface are expected to deal with pulling (downloading),
// managing, and deleting container images.
// Implementations are expected to abstract the underlying runtimes.
// Implementations are expected to be thread safe.
type ImageManager interface {
    // EnsureImageExists ensures that image specified in `container` exists.
    EnsureImageExists(pod *v1.Pod, container *v1.Container, pullSecrets []v1.Secret) (string, string, error)

    // TODO(ronl): consolidating image managing and deleting operation in this interface
}
  4. func EnsureImageExists: pkg/kubelet/images/image_manager.go:86
// EnsureImageExists pulls the image for the specified pod and container, and returns
// (imageRef, error message, error).
func (m *imageManager) EnsureImageExists(pod *v1.Pod, container *v1.Container, pullSecrets []v1.Secret) (string, string, error) {
    ...
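    // imageRef is looked up from the container runtime in the elided code above;
    // it is non-empty when the image is already present on the node.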
    present := imageRef != ""
    if !shouldPullImage(container, present) {
        if present {
            msg := fmt.Sprintf("Container image %q already present on machine", container.Image)
            m.logIt(ref, v1.EventTypeNormal, events.PulledImage, logPrefix, msg, glog.Info)
            return imageRef, "", nil
        } else {
            msg := fmt.Sprintf("Container image %q is not present with pull policy of Never", container.Image)
            m.logIt(ref, v1.EventTypeWarning, events.ErrImageNeverPullPolicy, logPrefix, msg, glog.Warning)
            return "", msg, ErrImageNeverPull
        }
    }
    ...
}
  5. func shouldPullImage: pkg/kubelet/images/image_manager.go:62
// shouldPullImage returns whether we should pull an image according to
// the presence and pull policy of the image.
func shouldPullImage(container *v1.Container, imagePresent bool) bool {
    if container.ImagePullPolicy == v1.PullNever {
        return false
    }

    if container.ImagePullPolicy == v1.PullAlways ||
        (container.ImagePullPolicy == v1.PullIfNotPresent && (!imagePresent)) {
        return true
    }

    return false
}
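
To make the decision table explicit, here is a small standalone sketch. It is not kubelet code: it copies the logic above into a runnable program using the public k8s.io/api/core/v1 constants, and prints whether a pull would happen for each policy/presence combination:

package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// shouldPull mirrors the kubelet's shouldPullImage logic quoted above.
func shouldPull(policy v1.PullPolicy, imagePresent bool) bool {
	if policy == v1.PullNever {
		return false
	}
	return policy == v1.PullAlways ||
		(policy == v1.PullIfNotPresent && !imagePresent)
}

func main() {
	// Print the pull decision for every policy / image-presence combination.
	for _, policy := range []v1.PullPolicy{v1.PullAlways, v1.PullIfNotPresent, v1.PullNever} {
		for _, present := range []bool{true, false} {
			fmt.Printf("policy=%-12s imagePresent=%-5v -> pull=%v\n",
				policy, present, shouldPull(policy, present))
		}
	}
}

This matches the test result: with Always the image is pulled again even though it is already on dnode137, while IfNotPresent would skip the pull.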

A follow-up question: when the container image is given as image: nginx:latest, imagePullPolicy defaults to Always. Where does this defaulting happen?

pkg/apis/core/v1/defaults.go:77:

func SetDefaults_Container(obj *v1.Container) {
    if obj.ImagePullPolicy == "" {
        // Ignore error and assume it has been validated elsewhere
        _, tag, _, _ := parsers.ParseImageName(obj.Image)

        // Check image tag
        if tag == "latest" {
            obj.ImagePullPolicy = v1.PullAlways
        } else {
            obj.ImagePullPolicy = v1.PullIfNotPresent
        }
    }
    if obj.TerminationMessagePath == "" {
        obj.TerminationMessagePath = v1.TerminationMessagePathDefault
    }
    if obj.TerminationMessagePolicy == "" {
        obj.TerminationMessagePolicy = v1.TerminationMessageReadFile
    }
}
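
To see the rule play out for a few image strings, here is a short sketch. It is not the kubelet's implementation: it approximates parsers.ParseImageName with a simplified tag parser (it ignores digest references such as nginx@sha256:...) and prints the default policy the API server would assign:

package main

import (
	"fmt"
	"strings"

	v1 "k8s.io/api/core/v1"
)

// defaultPullPolicy approximates SetDefaults_Container above: a "latest" tag
// (or a missing tag, which ParseImageName treats as "latest") yields PullAlways,
// any other tag yields PullIfNotPresent.
func defaultPullPolicy(image string) v1.PullPolicy {
	last := image[strings.LastIndex(image, "/")+1:] // drop registry/repository path
	tag := "latest"
	if i := strings.Index(last, ":"); i >= 0 {
		tag = last[i+1:]
	}
	if tag == "latest" {
		return v1.PullAlways
	}
	return v1.PullIfNotPresent
}

func main() {
	for _, image := range []string{"nginx", "nginx:latest", "nginx:1.15.4"} {
		fmt.Printf("image=%-14s default imagePullPolicy=%s\n", image, defaultPullPolicy(image))
	}
}

Note that ParseImageName falls back to the default tag latest when no tag is given at all, which is why option 3 in the list at the top (image: nginx with no tag) also ends up with imagePullPolicy: Always.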
