2022-01-21 Notes on a K8S cluster crash

1. kubectl get pods -A
The connection to the server lb.kubesphere.local:6443 was refused - did you specify the right host or port?
2. Check the API server pod logs:
W0121 01:20:45.584859 1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {https://192.168.27.136:2379 0 }. Err :connection error: desc = "transport: Error while dialing dial tcp 192.168.27.136:2379: connect: connection refused". Reconnecting...
The API server cannot reach etcd.
3. Check the etcd pod: it stays in the Created state indefinitely and produces no logs.
4. systemctl status polkit
Authorization not available. Check if polkit service is running or see debug message for more information.
5. I have only run into this problem on CentOS 7 virtual machines. The fix:

getent group polkitd >/dev/null && echo -e "\e[1;32mpolkitd group already exists\e[0m" || { groupadd -r polkitd && echo -e "\e[1;33mAdded missing polkitd group\e[0m" || echo -e "\e[1;31mAdding polkitd group FAILED\e[0m"; }
getent passwd polkitd >/dev/null && echo -e "\e[1;32mpolkitd user already exists\e[0m" || { useradd -r -g polkitd -d / -s /sbin/nologin -c "User for polkitd" polkitd && echo -e "\e[1;33mAdded missing polkitd user\e[0m" || echo -e "\e[1;31mAdding polkitd user FAILED\e[0m"; }
rpm -Va 'polkit*' && echo -e "\e[1;32mpolkit* rpm verification passed\e[0m" || { echo -e "\e[1;33mResetting polkit* rpm user/group ownership & perms\e[0m"; rpm --setugids polkit polkit-pkla-compat; rpm --setperms polkit polkit-pkla-compat; }
shutdown -r now
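
After the reboot it may be worth confirming that each symptom from steps 1 to 4 is gone. The following is only a minimal sketch, not part of the original fix: the `report` helper is a hypothetical name introduced here, and the unit name `polkit` and the `kubectl get pods -A` call are taken from the steps above. It assumes `getent`, `systemctl`, and `kubectl` are on the PATH of the node where it runs.

```shell
#!/usr/bin/env bash
# report (hypothetical helper): run a check command quietly and print
# a one-line result. It always returns 0, so one failed check does not
# stop the checks that follow it.
report() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "$label: ok"
    else
        echo "$label: FAILED"
    fi
}

# The fix recreates these two accounts if they were missing.
report "polkitd group" getent group polkitd
report "polkitd user" getent passwd polkitd

# polkit should be running again (step 4).
report "polkit service" systemctl is-active --quiet polkit

# The API server should answer again (step 1); the timeout keeps the
# check from hanging if it is still down.
report "API server" kubectl --request-timeout=5s get pods -A
```

If any line prints FAILED, that step is the place to resume debugging. The etcd and API server pods may need a short while after boot before the last check passes.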
