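While checking cluster status, I found the coredns pod stuck in CrashLoopBackOff. Whatever the cause turns out to be, the first troubleshooting step is to look at the coredns pod's details and its logs.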
[root@k8s-master coredns]# kubectl get pod,svc,deployment,rc -n kube-system
NAME READY STATUS RESTARTS AGE
pod/coredns-5bd5f9dbd9-h22lf 0/1 CrashLoopBackOff 106 20h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-dns ClusterIP 10.0.0.2 53/UDP,53/TCP 18h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.extensions/coredns 0/1 1 0 20h
[root@k8s-master coredns]# kubectl describe pod coredns-5bd5f9dbd9-h22lf -n kube-system
Name: coredns-5bd5f9dbd9-h22lf
Namespace: kube-system
Priority: 0
Node: 192.168.30.23/192.168.30.23
Start Time: Tue, 06 Aug 2019 13:47:23 +0800
Labels: k8s-app=kube-dns
pod-template-hash=5bd5f9dbd9
Annotations: seccomp.security.alpha.kubernetes.io/pod: docker/default
Status: Running
IP: 172.17.87.2
Controlled By: ReplicaSet/coredns-5bd5f9dbd9
Containers:
coredns:
Container ID: docker://c02395208c9763d5061e3478def972f60274fd2a98f5fdd3bd6fbe0c542f39bb
Image: coredns/coredns:1.2.2
Image ID: docker-pullable://coredns
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Wed, 07 Aug 2019 04:38:29 +0800
Finished: Wed, 07 Aug 2019 04:40:48 +0800
Ready: False
Restart Count: 106
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Environment:
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-sv6tq (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-sv6tq:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-sv6tq
Optional: false
QoS Class: Burstable
Node-Selectors:
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulled 6h31m (x89 over 13h) kubelet, 192.168.30.23 Container image "coredns/coredns:1.2.2" already present on machine
Warning Unhealthy 5h21m (x515 over 13h) kubelet, 192.168.30.23 Liveness probe failed: HTTP probe failed with statuscode: 503
Warning BackOff 5h11m (x1223 over 13h) kubelet, 192.168.30.23 Back-off restarting failed container
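The describe output above covers the pod details. Exit Code 137 is 128 + 9, which usually means the container was killed with SIGKILL; that is consistent with the kubelet restarting it after the failed liveness probes. For the logs, here is a minimal sketch. The helper name `pod_logs` is hypothetical and not part of the original setup; `--previous` and `--tail` are standard `kubectl logs` flags. It prints the logs of both the current container and the previous (crashed) one, since the useful output of a crash-looping container is often in the previous instance.

```shell
# Hypothetical helper: show recent logs of a pod's current and previous container.
# Usage: pod_logs <pod-name> [namespace]   (namespace defaults to kube-system)
pod_logs() {
    pod="$1"
    ns="${2:-kube-system}"
    # Logs of the container that is running (or trying to start) right now
    kubectl logs "$pod" -n "$ns" --tail=50
    # Logs of the previous container instance, i.e. the one that crashed
    kubectl logs "$pod" -n "$ns" --previous --tail=50
}
```

For the pod shown above this would be called as `pod_logs coredns-5bd5f9dbd9-h22lf kube-system`.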
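How you fix this depends on the cause in your own environment. In my experience it usually comes down to one of two things: a loopback (lo) problem, where the container cannot find itself through the loopback interface, or the DNS server configured in the node's local /etc/resolv.conf.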
[root@k8s-master coredns]# vim /etc/resolv.conf
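Here I set it to nameserver 8.8.8.8.
If the file points at a local (loopback) address instead, coredns can also end up in the BackOff state. If you run into other problems, we can keep discussing them.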
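To check whether a resolver file points at a loopback address, something like the following sketch could be used. The function name `loopback_nameservers` is hypothetical; it only reads the file and prints any `nameserver` entries in 127.0.0.0/8 or `::1`, so empty output means no loopback resolver was found.

```shell
# Hypothetical helper: list loopback nameserver entries in a resolv.conf-style file.
# Usage: loopback_nameservers [file]   (file defaults to /etc/resolv.conf)
loopback_nameservers() {
    file="${1:-/etc/resolv.conf}"
    # Print the address of every "nameserver" line that is 127.x.x.x or ::1
    awk '$1 == "nameserver" && ($2 ~ /^127\./ || $2 == "::1") { print $2 }' "$file"
}
```

Running `loopback_nameservers` with no argument checks the node's own /etc/resolv.conf; if it prints anything, replace that entry with a reachable upstream resolver (8.8.8.8 in this post) before restarting Docker.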
[root@k8s-master coredns]# systemctl restart docker
[root@k8s-master coredns]# kubectl get pod,svc,deployment,rc -n kube-system
NAME READY STATUS RESTARTS AGE
pod/coredns-5bd5f9dbd9-h22lf 1/1 Running 107 20h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-dns ClusterIP 10.0.0.2 53/UDP,53/TCP 18h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.extensions/coredns 1/1 1 1 20h
If that does not help, the last resort below has worked for me; I tried it myself.
[root@k8s-master storage]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-bccdc95cf-l5sf7 0/1 ContainerCreating 0 2s
coredns-bccdc95cf-mvxnb 0/1 CrashLoopBackOff 28 25h
etcd-k8s-master 1/1 Running 4 25h
kube-apiserver-k8s-master 1/1 Running 4 25h
kube-controller-manager-k8s-master 1/1 Running 3 25h
kube-flannel-ds-amd64-84244 1/1 Running 5 25h
kube-flannel-ds-amd64-hcbtq 1/1 Running 1 25h
kube-flannel-ds-amd64-jxlnm 1/1 Running 1 25h
kube-proxy-fxrz4 1/1 Running 3 25h
kube-proxy-s48qj 1/1 Running 1 25h
kube-proxy-t79nx 1/1 Running 1 25h
kube-scheduler-k8s-master 1/1 Running 3 25h
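Here we can make the pod start from scratch by deleting it, forcibly. Once it is deleted, a replacement is started. Delete every coredns pod and they are all recreated automatically; this is built-in Kubernetes behavior, since the ReplicaSet that controls the pods restores the desired replica count.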
[root@k8s-master storage]# kubectl delete pod coredns-bccdc95cf-mvxnb --grace-period=0 --force -n kube-system
pod "coredns-bccdc95cf-mvxnb" force deleted
[root@k8s-master storage]# kubectl delete pod coredns-bccdc95cf-l5sf7 --grace-period=0 --force -n kube-system
pod "coredns-bccdc95cf-l5sf7" force deleted
[root@k8s-master storage]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-bccdc95cf-5spkt 1/1 Running 0 79s
coredns-bccdc95cf-f499q 0/1 Running 0 2s
etcd-k8s-master 1/1 Running 4 25h
kube-apiserver-k8s-master 1/1 Running 4 25h
kube-controller-manager-k8s-master 1/1 Running 3 25h
kube-flannel-ds-amd64-84244 1/1 Running 5 25h
kube-flannel-ds-amd64-hcbtq 1/1 Running 1 25h
kube-flannel-ds-amd64-jxlnm 1/1 Running 1 25h
kube-proxy-fxrz4 1/1 Running 3 25h
kube-proxy-s48qj 1/1 Running 1 25h
kube-proxy-t79nx 1/1 Running 1 25h
kube-scheduler-k8s-master 1/1 Running 3 25h
The problem is now fully resolved.
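The per-pod deletes above can also be done in one step by selecting on the pod label, which saves copying the generated pod names. This is a sketch under assumptions: the helper name `recreate_pods_by_label` is hypothetical, and the label `k8s-app=kube-dns` is taken from the describe output earlier in this post, so verify it in your own cluster. It uses a normal delete; `--grace-period=0 --force` can be appended as in the commands above if the pods hang in Terminating.

```shell
# Hypothetical helper: delete all pods matching a label, then list what replaced them.
# Usage: recreate_pods_by_label <label-selector> [namespace]
recreate_pods_by_label() {
    selector="$1"
    ns="${2:-kube-system}"
    # Delete every pod carrying the label; the controller recreates them
    kubectl delete pod -n "$ns" -l "$selector"
    # Show the replacement pods and their current status
    kubectl get pod -n "$ns" -l "$selector"
}
```

For coredns this would be `recreate_pods_by_label k8s-app=kube-dns kube-system`.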