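This article covers Kubernetes (k8s) troubleshooting. It stays fairly shallow, since I have only recently started with k8s. It mainly covers three things: how to view the current runtime information of k8s objects; how to diagnose problems with services and containers; and how to investigate more complex problems such as pod scheduling.
1. Check the system Events
When a resource object (pod, service, RC, node, namespace, deployment, and so on) misbehaves, for example a pod that fails to run after being created, first check the object's current runtime information, especially the Events associated with it. These events record the subject involved, when the event first occurred, when it last occurred, how many times it occurred, and the reason for the event.
k8s provides the following commands to view an object's runtime state: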
kubectl describe pod xxxx
kubectl describe node xxxx
The output looks like this:
[root@centos ~]# kubectl get pod
NAME                    READY   STATUS    RESTARTS   AGE
curl-5f8bff6547-rb4qk   1/1     Running   2          3d14h
redis-master-7j8cm      1/1     Running   2          3d14h
webapp-j7gd2            1/1     Running   3          3d21h
webapp-kzrn7            1/1     Running   3          3d14h
[root@centos ~]# kubectl describe pod webapp-j7gd2
Name:               webapp-j7gd2
Namespace:          default
Priority:           0
PriorityClassName:
Node:               node3/192.168.195.138
Start Time:         Mon, 08 Apr 2019 13:19:25 +0800
Labels:             app=webapp
Annotations:
Status:             Running
IP:                 10.244.1.35
Controlled By:      ReplicationController/webapp
Containers:
  webapp:
    Container ID:   docker://e4dd5ec51e4d05456bd1605459a252085ad092c6be26e2becd5301114a470a33
    Image:          tomcat:9-jre8-alpine
    Image ID:       docker-pullable://tomcat@sha256:67fc2a0a54f9dfa7abda85a2900d721a55115dcae8ca7da560e65d15ca4c8aa7
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Thu, 11 Apr 2019 09:26:42 +0800
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Mon, 08 Apr 2019 21:52:27 +0800
      Finished:     Thu, 11 Apr 2019 09:25:55 +0800
    Ready:          True
    Restart Count:  3
    Environment:
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-nx72w (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-nx72w:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-nx72w
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
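The Events section on the last line is the most important part. My pod is healthy, so nothing is listed there; if your pod is in an abnormal state, error messages will show up here. The messages are usually self-explanatory: typical causes are an image that cannot be pulled, or no node being available. If your pod lives in a namespace other than default, specify the namespace with the following command:
kubectl describe pod xxx -n <your-namespace>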
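As a complement to kubectl describe, the events that reference a single pod can also be listed directly. The following is a minimal sketch, assuming a Bash shell; the function name pod_events and the KUBECTL override variable are hypothetical helpers introduced here for illustration, not part of kubectl itself:

```shell
#!/usr/bin/env bash
# KUBECTL may be overridden (for example for a dry run); it defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: list the events that reference one pod,
# sorted by the time they were last seen.
pod_events() {
  local pod="$1"
  local ns="${2:-default}"   # the namespace defaults to "default"
  "$KUBECTL" get events -n "$ns" \
    --field-selector "involvedObject.name=${pod}" \
    --sort-by=.lastTimestamp
}
```

For example, pod_events webapp-j7gd2 would list the events for the pod shown above, and pod_events xxx my-namespace would do the same for a pod in another namespace.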
2. Check the container logs
To inspect the logs produced by the application inside a container, use kubectl logs:
[root@centos ~]# kubectl get pod
NAME                    READY   STATUS    RESTARTS   AGE
curl-5f8bff6547-rb4qk   1/1     Running   2          3d14h
redis-master-7j8cm      1/1     Running   2          3d14h
webapp-j7gd2            1/1     Running   3          3d21h
webapp-kzrn7            1/1     Running   3          3d14h
[root@centos ~]# kubectl logs webapp-j7gd2
11-Apr-2019 01:26:45.108 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version name: Apache Tomcat/9.0.17
11-Apr-2019 01:26:45.145 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server built: Mar 13 2019 15:55:27 UTC
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version number: 9.0.17.0
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Name: Linux
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Version: 3.10.0-957.el7.x86_64
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Architecture: amd64
11-Apr-2019 01:26:45.146 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Java Home: /usr/lib/jvm/java-1.8-openjdk/jre
11-Apr-2019 01:26:45.147 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Version: 1.8.0_201-b08
11-Apr-2019 01:26:45.147 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Vendor: Oracle Corporation
11-Apr-2019 01:26:45.147 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_BASE: /usr/local/tomcat
11-Apr-2019 01:26:45.147 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_HOME: /usr/local/tomcat
11-Apr-2019 01:26:45.148 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties
11-Apr-2019 01:26:45.148 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
11-Apr-2019 01:26:45.148 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djdk.tls.ephemeralDHKeySize=2048
11-Apr-2019 01:26:45.149 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.protocol.handler.pkgs=org.apache.catalina.webresources
11-Apr-2019 01:26:45.149 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dorg.apache.catalina.security.SecurityListener.UMASK=0027
11-Apr-2019 01:26:45.150 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dignore.endorsed.dirs=
11-Apr-2019 01:26:45.150 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.base=/usr/local/tomcat
11-Apr-2019 01:26:45.150 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.home=/usr/local/tomcat
11-Apr-2019 01:26:45.150 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.io.tmpdir=/usr/local/tomcat/temp
11-Apr-2019 01:26:45.151 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent Loaded APR based Apache Tomcat Native library [1.2.21] using APR version [1.6.5].
11-Apr-2019 01:26:45.151 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true].
11-Apr-2019 01:26:45.151 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR/OpenSSL configuration: useAprConnector [false], useOpenSSL [true]
11-Apr-2019 01:26:45.160 INFO [main] org.apache.catalina.core.AprLifecycleListener.initializeSSL OpenSSL successfully initialized [OpenSSL 1.1.1b 26 Feb 2019]
11-Apr-2019 01:26:45.606 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-nio-8080"]
11-Apr-2019 01:26:45.678 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["ajp-nio-8009"]
11-Apr-2019 01:26:45.689 INFO [main] org.apache.catalina.startup.Catalina.load Server initialization in [2,071] milliseconds
11-Apr-2019 01:26:45.755 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Catalina]
11-Apr-2019 01:26:45.755 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet engine: [Apache Tomcat/9.0.17]
11-Apr-2019 01:26:45.777 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/ROOT]
11-Apr-2019 01:26:46.985 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/ROOT] has finished in [1,202] ms
11-Apr-2019 01:26:46.986 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/docs]
11-Apr-2019 01:26:47.071 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/docs] has finished in [86] ms
11-Apr-2019 01:26:47.080 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/examples]
11-Apr-2019 01:26:48.100 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/examples] has finished in [1,020] ms
11-Apr-2019 01:26:48.104 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/host-manager]
11-Apr-2019 01:26:48.169 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/host-manager] has finished in [65] ms
11-Apr-2019 01:26:48.169 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/manager]
11-Apr-2019 01:26:48.227 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/manager] has finished in [58] ms
11-Apr-2019 01:26:48.235 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
11-Apr-2019 01:26:48.302 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["ajp-nio-8009"]
11-Apr-2019 01:26:48.323 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [2,633] milliseconds
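If a pod contains multiple containers, use the -c flag to specify the name of the container whose logs you want, for example:
kubectl logs <pod-name> -c <container-name>
Of course, you can also run docker logs directly on the node where the container is running: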
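When a container has restarted (the webapp pod above shows Restart Count: 3 and a Last State of Terminated), the logs of the previous instance are often more useful than the current ones. A small sketch, assuming Bash; previous_logs and KUBECTL are hypothetical names introduced here, while -c, --previous and --tail are standard kubectl logs flags:

```shell
#!/usr/bin/env bash
# KUBECTL may be overridden (for example for a dry run); it defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: print the last N log lines written by the
# previous (terminated) instance of one container in a pod.
previous_logs() {
  local pod="$1"
  local container="$2"
  local lines="${3:-100}"    # number of trailing lines, 100 by default
  "$KUBECTL" logs "$pod" -c "$container" --previous --tail="$lines"
}
```

For the pod above this would be previous_logs webapp-j7gd2 webapp, which should show what the container printed before it exited with code 255.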
[root@node2 ~]# docker ps | grep web
6041a63c30ea 6097ab3c4283 "catalina.sh run" 25 hours ago Up 25 hours k8s_webapp_webapp-kzrn7_default_7c476613-59f4-11e9-9a41-000c29f1f0e4_3
974390ced06b k8s.gcr.io/pause:3.1 "/pause" 25 hours ago Up 25 hours k8s_POD_webapp-kzrn7_default_7c476613-59f4-11e9-9a41-000c29f1f0e4_7
[root@node2 ~]# docker logs 6041a63c30ea
11-Apr-2019 01:26:33.432 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version name: Apache Tomcat/9.0.17
11-Apr-2019 01:26:33.526 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server built: Mar 13 2019 15:55:27 UTC
11-Apr-2019 01:26:33.526 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Server version number: 9.0.17.0
11-Apr-2019 01:26:33.526 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Name: Linux
11-Apr-2019 01:26:33.527 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log OS Version: 3.10.0-957.el7.x86_64
11-Apr-2019 01:26:33.527 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Architecture: amd64
11-Apr-2019 01:26:33.527 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Java Home: /usr/lib/jvm/java-1.8-openjdk/jre
11-Apr-2019 01:26:33.527 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Version: 1.8.0_201-b08
11-Apr-2019 01:26:33.528 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log JVM Vendor: Oracle Corporation
11-Apr-2019 01:26:33.528 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_BASE: /usr/local/tomcat
11-Apr-2019 01:26:33.528 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log CATALINA_HOME: /usr/local/tomcat
11-Apr-2019 01:26:33.529 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties
11-Apr-2019 01:26:33.529 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
11-Apr-2019 01:26:33.529 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djdk.tls.ephemeralDHKeySize=2048
11-Apr-2019 01:26:33.529 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.protocol.handler.pkgs=org.apache.catalina.webresources
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dorg.apache.catalina.security.SecurityListener.UMASK=0027
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dignore.endorsed.dirs=
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.base=/usr/local/tomcat
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Dcatalina.home=/usr/local/tomcat
11-Apr-2019 01:26:33.530 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.io.tmpdir=/usr/local/tomcat/temp
11-Apr-2019 01:26:33.531 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent Loaded APR based Apache Tomcat Native library [1.2.21] using APR version [1.6.5].
11-Apr-2019 01:26:33.539 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true].
11-Apr-2019 01:26:33.540 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent APR/OpenSSL configuration: useAprConnector [false], useOpenSSL [true]
11-Apr-2019 01:26:33.565 INFO [main] org.apache.catalina.core.AprLifecycleListener.initializeSSL OpenSSL successfully initialized [OpenSSL 1.1.1b 26 Feb 2019]
11-Apr-2019 01:26:34.291 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-nio-8080"]
11-Apr-2019 01:26:34.374 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["ajp-nio-8009"]
11-Apr-2019 01:26:34.378 INFO [main] org.apache.catalina.startup.Catalina.load Server initialization in [3,215] milliseconds
11-Apr-2019 01:26:34.467 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Catalina]
11-Apr-2019 01:26:34.468 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet engine: [Apache Tomcat/9.0.17]
11-Apr-2019 01:26:34.507 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/ROOT]
11-Apr-2019 01:26:36.293 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/ROOT] has finished in [1,786] ms
11-Apr-2019 01:26:36.294 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/docs]
11-Apr-2019 01:26:36.368 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/docs] has finished in [73] ms
11-Apr-2019 01:26:36.377 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/examples]
11-Apr-2019 01:26:37.797 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/examples] has finished in [1,420] ms
11-Apr-2019 01:26:37.802 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/host-manager]
11-Apr-2019 01:26:38.031 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/host-manager] has finished in [228] ms
11-Apr-2019 01:26:38.032 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/usr/local/tomcat/webapps/manager]
11-Apr-2019 01:26:38.161 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/usr/local/tomcat/webapps/manager] has finished in [128] ms
11-Apr-2019 01:26:38.183 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["http-nio-8080"]
11-Apr-2019 01:26:38.244 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["ajp-nio-8009"]
11-Apr-2019 01:26:38.290 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [3,911] milliseconds
3. Check the k8s service logs
If k8s is installed on Linux and its services are managed by systemd, the systemd journal captures the output of those services. You can view the service logs with systemctl status or journalctl:
[root@node2 ~]# systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Thu 2019-04-11 09:25:36 CST; 1 day 1h ago
Docs: https://kubernetes.io/docs/
Main PID: 7793 (kubelet)
Tasks: 19
Memory: 112.4M
CGroup: /system.slice/kubelet.service
└─7793 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/v...
Apr 12 09:56:44 node2 kubelet[7793]: W0412 09:56:44.886746 7793 reflector.go:270] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMa... (562273)
Apr 12 09:57:46 node2 kubelet[7793]: W0412 09:57:46.933029 7793 reflector.go:270] object-"kube-system"/"kube-flannel-cfg": watch of *v1.Co... (562359)
Apr 12 10:04:45 node2 kubelet[7793]: W0412 10:04:45.828641 7793 reflector.go:270] object-"kube-system"/"coredns": watch of *v1.ConfigMap e... (562964)
Apr 12 10:11:04 node2 kubelet[7793]: W0412 10:11:04.635497 7793 reflector.go:270] object-"kube-system"/"kube-flannel-cfg": watch of *v1.Co... (563510)
Apr 12 10:12:23 node2 kubelet[7793]: W0412 10:12:23.593624 7793 reflector.go:270] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMa... (563619)
Apr 12 10:24:09 node2 kubelet[7793]: W0412 10:24:09.875061 7793 reflector.go:270] object-"kube-system"/"coredns": watch of *v1.ConfigMap e... (564637)
Apr 12 10:26:55 node2 kubelet[7793]: W0412 10:26:55.642788 7793 reflector.go:270] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMa... (564886)
Apr 12 10:28:14 node2 kubelet[7793]: W0412 10:28:14.693489 7793 reflector.go:270] object-"kube-system"/"kube-flannel-cfg": watch of *v1.Co... (564992)
Apr 12 10:43:12 node2 kubelet[7793]: W0412 10:43:12.893306 7793 reflector.go:270] object-"kube-system"/"coredns": watch of *v1.ConfigMap e... (566287)
Apr 12 10:43:37 node2 kubelet[7793]: W0412 10:43:37.662130 7793 reflector.go:270] object-"kube-system"/"kube-proxy": watch of *v1.ConfigMa... (566320)
Hint: Some lines were ellipsized, use -l to show in full.
or
[root@centos ~]# journalctl -xeu kubelet
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.510165 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.610691 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.711008 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.811468 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: I0412 10:46:53.883382 9787 kubelet_node_status.go:278] Setting node annotation to enable volume controlle
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.912065 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:53 centos.master kubelet[9787]: I0412 10:46:53.914043 9787 kubelet_node_status.go:72] Attempting to register node centos.master
Apr 12 10:46:53 centos.master kubelet[9787]: E0412 10:46:53.916659 9787 kubelet_node_status.go:94] Unable to register node "centos.master" with API se
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.012363 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.113003 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: I0412 10:46:54.147210 9787 kubelet_node_status.go:278] Setting node annotation to enable volume controlle
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.213291 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.313616 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.413970 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.514292 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.615167 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.715863 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.816154 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:54 centos.master kubelet[9787]: E0412 10:46:54.916432 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.017040 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.117863 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.218694 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.319663 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.420254 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.521053 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.621575 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.722435 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.823464 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:55 centos.master kubelet[9787]: E0412 10:46:55.924273 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.024392 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.125129 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: I0412 10:46:56.146767 9787 kubelet_node_status.go:278] Setting node annotation to enable volume controlle
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.225839 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.326354 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.427552 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.528289 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.628843 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.729056 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.829340 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:56 centos.master kubelet[9787]: E0412 10:46:56.929690 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.030373 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.131158 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.232373 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.333084 9787 kubelet.go:2266] node "centos.master" not found
Apr 12 10:46:57 centos.master kubelet[9787]: E0412 10:46:57.433269 9787 kubelet.go:2266] node "centos.master" not found
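The kubelet service log above tells me that the node centos.master cannot be found.
That covers the three basic techniques. They are simple and only good for basic troubleshooting.
If you are checking the system service logs because a particular k8s object has a problem, you can search the logs using the object's name as the keyword. In most cases the problems we run into are related to pods: a pod that cannot be created, a pod that stops right after starting, or pod replicas that cannot be scaled up. In such cases, first determine which node the pod is on, log in to that node, pull the pod's complete log entries from the kubelet log, and investigate from there. For problems related to pod scaling or to an RC, the key clues are most likely in the logs of kube-controller-manager and kube-scheduler.
Also, kube-proxy is often overlooked: even when it is stopped, pod status still looks normal, but access to some services will fail.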
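The keyword search described here can be sketched as a small filter over the kubelet journal. This assumes Bash and the log line layout seen in the journalctl output above; the function name kubelet_errors_for is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read kubelet journal lines on stdin and keep only
# error-level entries (severity marker starting with "E") that mention a keyword.
kubelet_errors_for() {
  local keyword="$1"
  # Lines look like: "... kubelet[PID]: E0412 10:46:53.510165 ..."
  grep -E 'kubelet\[[0-9]+\]: E[0-9]{4} ' | grep -F -- "$keyword"
}

# Typical use on the node that runs the pod (requires systemd):
#   journalctl -u kubelet --no-pager --since "1 hour ago" | kubelet_errors_for webapp-j7gd2
```

Reading from stdin keeps the filter usable on a saved log file as well as on live journalctl output.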
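A quick way to rule kube-proxy in or out is to check that its pods are running and that the affected service still has endpoints. A sketch, assuming Bash and a kubeadm-style cluster where kube-proxy runs in kube-system with the label k8s-app=kube-proxy (other installations may label it differently); the function names and the KUBECTL variable are hypothetical:

```shell
#!/usr/bin/env bash
# KUBECTL may be overridden (for example for a dry run); it defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: show the kube-proxy pods and the node each one runs on.
# Assumption: the pods carry the label k8s-app=kube-proxy (kubeadm default).
kube_proxy_status() {
  "$KUBECTL" get pods -n kube-system -l k8s-app=kube-proxy -o wide
}

# Hypothetical helper: show the endpoints behind a service,
# to confirm that it still has backend pods.
service_endpoints() {
  local svc="$1"
  local ns="${2:-default}"   # the namespace defaults to "default"
  "$KUBECTL" get endpoints "$svc" -n "$ns"
}
```

If kube_proxy_status shows a missing or non-Running pod on the node where access fails, that node's kube-proxy is a likely suspect.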