While checking cluster status, the coredns pod turned up in CrashLoopBackOff. Whatever the underlying cause, the first step is to look at the pod's details and its logs:

[root@k8s-master coredns]# kubectl get pod,svc,deployment,rc -n kube-system
NAME                           READY   STATUS             RESTARTS   AGE
pod/coredns-5bd5f9dbd9-h22lf   0/1     CrashLoopBackOff   106        20h

NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
service/kube-dns   ClusterIP   10.0.0.2     <none>        53/UDP,53/TCP   18h

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.extensions/coredns   0/1     1            0           20h
[root@k8s-master coredns]# kubectl describe pod coredns-5bd5f9dbd9-h22lf -n kube-system
Name:           coredns-5bd5f9dbd9-h22lf
Namespace:      kube-system
Priority:       0
Node:           192.168.30.23/192.168.30.23
Start Time:     Tue, 06 Aug 2019 13:47:23 +0800
Labels:         k8s-app=kube-dns
                pod-template-hash=5bd5f9dbd9
Annotations:    seccomp.security.alpha.kubernetes.io/pod: docker/default
Status:         Running
IP:             172.17.87.2
Controlled By:  ReplicaSet/coredns-5bd5f9dbd9
Containers:
  coredns:
    Container ID:  docker://c02395208c9763d5061e3478def972f60274fd2a98f5fdd3bd6fbe0c542f39bb
    Image:         coredns/coredns:1.2.2
    Image ID:      docker-pullable://coredns
    Ports:         53/UDP, 53/TCP, 9153/TCP
    Host Ports:    0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Wed, 07 Aug 2019 04:38:29 +0800
      Finished:     Wed, 07 Aug 2019 04:40:48 +0800
    Ready:          False
    Restart Count:  106
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-sv6tq (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
  coredns-token-sv6tq:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  coredns-token-sv6tq
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     CriticalAddonsOnly
                 node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                     From                    Message
  ----     ------     ----                    ----                    -------
  Normal   Pulled     6h31m (x89 over 13h)    kubelet, 192.168.30.23  Container image "coredns/coredns:1.2.2" already present on machine
  Warning  Unhealthy  5h21m (x515 over 13h)   kubelet, 192.168.30.23  Liveness probe failed: HTTP probe failed with statuscode: 503
  Warning  BackOff    5h11m (x1223 over 13h)  kubelet, 192.168.30.23  Back-off restarting failed container
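Alongside `describe`, the container's own logs usually reveal the actual crash reason. A sketch, using the pod name from the listing above; the `--previous` flag pulls logs from the last crashed instance, which is often the more informative one:

```shell
# Logs from the current container instance
kubectl logs coredns-5bd5f9dbd9-h22lf -n kube-system

# Logs from the previous (crashed) instance
kubectl logs coredns-5bd5f9dbd9-h22lf -n kube-system --previous
```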

From here you can troubleshoot according to your own situation, but in practice it usually comes down to one of two causes: either the container cannot reach itself over the lo loopback interface, or the DNS configured in the host's /etc/resolv.conf is the problem.

[root@k8s-master coredns]# vim /etc/resolv.conf

Here I set it to `nameserver 8.8.8.8`.
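For reference, a resolv.conf that points at a loopback address is the typical trigger: CoreDNS inherits the host's resolver config and ends up forwarding queries to itself. A minimal sketch of the bad and good configurations (8.8.8.8 is just the upstream I used; any real resolver works):

```
# /etc/resolv.conf
# BAD: a loopback resolver makes CoreDNS forward queries back to itself
# nameserver 127.0.0.1

# GOOD: point at a real upstream resolver
nameserver 8.8.8.8
```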

If CoreDNS picks up a local loopback interface from the host, it can also end up in the CrashLoopBackOff state; if you hit other problems, we can discuss them further. After fixing resolv.conf, restart Docker so the change takes effect:

[root@k8s-master coredns]# systemctl restart docker
[root@k8s-master coredns]# kubectl get pod,svc,deployment,rc -n kube-system
NAME                           READY   STATUS    RESTARTS   AGE
pod/coredns-5bd5f9dbd9-h22lf   1/1     Running   107        20h

NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
service/kube-dns   ClusterIP   10.0.0.2     <none>        53/UDP,53/TCP   18h

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.extensions/coredns   1/1     1            1           20h

If nothing else works, this last approach has always solved it in my experience:

[root@k8s-master storage]# kubectl get pod -n kube-system
NAME                                 READY   STATUS              RESTARTS   AGE
coredns-bccdc95cf-l5sf7              0/1     ContainerCreating   0          2s
coredns-bccdc95cf-mvxnb              0/1     CrashLoopBackOff    28         25h
etcd-k8s-master                      1/1     Running             4          25h
kube-apiserver-k8s-master            1/1     Running             4          25h
kube-controller-manager-k8s-master   1/1     Running             3          25h
kube-flannel-ds-amd64-84244          1/1     Running             5          25h
kube-flannel-ds-amd64-hcbtq          1/1     Running             1          25h
kube-flannel-ds-amd64-jxlnm          1/1     Running             1          25h
kube-proxy-fxrz4                     1/1     Running             3          25h
kube-proxy-s48qj                     1/1     Running             1          25h
kube-proxy-t79nx                     1/1     Running             1          25h
kube-scheduler-k8s-master            1/1     Running             3          25h

Here we can make the pods pull and start fresh: force-delete them, and the controller will spin up replacements. Delete all the coredns pods, and Kubernetes rebuilds them automatically; this is its built-in self-healing behavior:

[root@k8s-master storage]# kubectl delete pod coredns-bccdc95cf-mvxnb  --grace-period=0 --force -n kube-system 
pod "coredns-bccdc95cf-mvxnb" force deleted

[root@k8s-master storage]# kubectl delete pod coredns-bccdc95cf-l5sf7 --grace-period=0 --force -n kube-system
pod "coredns-bccdc95cf-l5sf7" force deleted
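Instead of deleting the pods one by one, the same thing can be done in a single command using the `k8s-app=kube-dns` label seen in the pod description earlier (a sketch, assuming that label selector matches all of your CoreDNS pods):

```shell
# Force-delete every CoreDNS pod at once; the Deployment recreates them
kubectl delete pod -n kube-system -l k8s-app=kube-dns --grace-period=0 --force
```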

[root@k8s-master storage]# kubectl get pod -n kube-system
NAME                                 READY   STATUS    RESTARTS   AGE
coredns-bccdc95cf-5spkt              1/1     Running   0          79s
coredns-bccdc95cf-f499q              0/1     Running   0          2s
etcd-k8s-master                      1/1     Running   4          25h
kube-apiserver-k8s-master            1/1     Running   4          25h
kube-controller-manager-k8s-master   1/1     Running   3          25h
kube-flannel-ds-amd64-84244          1/1     Running   5          25h
kube-flannel-ds-amd64-hcbtq          1/1     Running   1          25h
kube-flannel-ds-amd64-jxlnm          1/1     Running   1          25h
kube-proxy-fxrz4                     1/1     Running   3          25h
kube-proxy-s48qj                     1/1     Running   1          25h
kube-proxy-t79nx                     1/1     Running   1          25h
kube-scheduler-k8s-master            1/1     Running   3          25h

The problem is now fully resolved.
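To confirm that cluster DNS actually works after the fix, a quick resolution test from a throwaway pod is worth running. A sketch, assuming the busybox image is pullable in your environment (busybox:1.28 is used because nslookup in newer busybox releases is known to be unreliable):

```shell
# Run nslookup against the cluster DNS from a temporary pod
kubectl run dns-test --image=busybox:1.28 --rm -it --restart=Never \
  -- nslookup kubernetes.default
```

If the lookup returns the kube-dns service IP (10.0.0.2 here) and resolves kubernetes.default, CoreDNS is healthy.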