Kubernetes worker nodes NotReady with ResourceExhausted: too many leftover containers on the worker nodes

rpc error: code = ResourceExhausted desc = grpc: received message larger than max (4196772 vs. 4194304)
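
The two numbers in this error can be checked with a little arithmetic. The sketch below only uses the values printed in the message; the variable names are illustrative. It shows that the limit is exactly 4 MiB, which matches gRPC's usual default maximum receive size, and that the response was only slightly over it.

```shell
# Values copied from the error message above.
received=4196772
limit=4194304

# 4 MiB expressed in bytes.
four_mib=$((4 * 1024 * 1024))

# How far the response overshot the limit.
overshoot=$((received - limit))

echo "4 MiB = ${four_mib} bytes"
echo "overshoot = ${overshoot} bytes"
```

The limit equals 4 MiB (4194304 bytes) and the overshoot is 2468 bytes, so removing even a modest number of containers should bring the response back under the limit.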

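Symptom: the Kubernetes cluster is unavailable and all worker nodes are offline.

Diagnosis:

  • Running kubectl get node showed that every worker node was in the NotReady state
  • Logging in to a worker node and checking the logs showed: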
Nov  1 10:32:34 izwz9a75ak59utsbrrj9crz kubelet: E1101 10:32:34.119157    1669 kuberuntime_container.go:323] getKubeletContainers failed: rpc error: code = ResourceExhausted desc = grpc: received message larger than max (4196772 vs. 4194304)
Nov  1 10:32:34 izwz9a75ak59utsbrrj9crz kubelet: E1101 10:32:34.119174    1669 generic.go:197] GenericPLEG: Unable to retrieve pods: rpc error: code = ResourceExhausted desc = grpc: received message larger than max (4196772 vs. 4194304)
  • /var/lib/docker/containers held more than ten thousand container entries
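
The last check can be reproduced with a short script. This is a minimal sketch, assuming Docker's default data root (/var/lib/docker) and a find that supports -mindepth/-maxdepth; count_container_dirs is a hypothetical helper name, not part of Docker or Kubernetes.

```shell
# Count the immediate subdirectories of a directory. Under Docker's
# default data root there is one such directory per container.
count_container_dirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d 2>/dev/null | wc -l | tr -d ' '
}

# Assumed default location; adjust if the daemon uses another data-root.
containers_dir="/var/lib/docker/containers"
if [ -d "$containers_dir" ]; then
    echo "container directories: $(count_container_dirs "$containers_dir")"
fi

# If the Docker CLI is available, also count containers by state.
if command -v docker >/dev/null 2>&1; then
    echo "all containers:    $(docker ps -a -q 2>/dev/null | wc -l | tr -d ' ')"
    echo "exited containers: $(docker ps -a -q --filter status=exited 2>/dev/null | wc -l | tr -d ' ')"
fi
```

A large gap between the number of running pods and the number of exited containers suggests that stopped containers are piling up on the node.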

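The problem is caused by a Kubernetes bug: https://github.com/kubernetes/kubernetes/issues/63858
With this many containers on a node, the container list that kubelet requests from the runtime over gRPC appears to exceed the 4194304-byte (4 MiB) message limit, so kubelet cannot retrieve pods and the node is reported as NotReady.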

Solution:

  • Log in to each worker node and remove the leftover containers that are no longer used
    docker system prune

  • Restart docker and kubelet
    service docker restart && service kubelet restart
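
The two steps above can be wrapped in a small script. This is a sketch under stated assumptions, not a definitive procedure: run and cleanup_node are hypothetical helper names, and the DRY_RUN switch is an addition that is not in the original steps. The -f flag skips the confirmation prompt of docker system prune. Note that docker system prune removes stopped containers as well as unused networks, dangling images and build cache, so review the dry-run output first.

```shell
# DRY_RUN=1 (the default here) only prints the commands.
# Set DRY_RUN=0 to actually execute them on the worker node.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Same order as the steps above; each step runs only if the previous one succeeded.
cleanup_node() {
    run docker system prune -f &&
    run service docker restart &&
    run service kubelet restart
}

cleanup_node
```

Once the dry-run output looks right, rerun the script with DRY_RUN=0 on each affected worker node, then confirm with kubectl get node that the nodes return to Ready.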
