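Two years ago I built a Kubernetes cluster on virtual machines for learning. When I recently booted those VMs again, Kubernetes would not start.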
The connection to the server xxxx:6443 was refused - did you specify the right host or port?
Checking the logs with sudo journalctl -xefu kubelet shows the error "bootstrap-kubelet.conf: no such file or directory", along with a message that a certificate has expired.
Oct 11 23:55:51 master.k8s systemd[1]: kubelet.service failed.
Oct 11 23:56:01 master.k8s systemd[1]: kubelet.service holdoff time over, scheduling restart.
Oct 11 23:56:01 master.k8s systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kubelet.service has finished shutting down.
Oct 11 23:56:01 master.k8s systemd[1]: Started kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kubelet.service has finished starting up.
--
-- The start-up result is done.
Oct 11 23:56:01 master.k8s kubelet[2931]: I1011 23:56:01.532681 2931 server.go:411] Version: v1.19.4
Oct 11 23:56:01 master.k8s kubelet[2931]: I1011 23:56:01.533162 2931 server.go:831] Client rotation is on, will bootstrap in background
Oct 11 23:56:01 master.k8s kubelet[2931]: E1011 23:56:01.534673 2931 bootstrap.go:265] part of the existing bootstrap client certificate is expired: 2021-12-06 17:48:28 +0000 UTC
Oct 11 23:56:01 master.k8s kubelet[2931]: F1011 23:56:01.534777 2931 server.go:265] failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory
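The expiry timestamp is easy to miss in a long journal. As a minimal sketch (the function name kubelet_expired_dates is made up for this post, and the pattern assumes the message wording shown in the log above), a small filter can pull it out:

```shell
# Read kubelet journal lines on stdin and print the expiry timestamp
# from every "certificate is expired" message, one per line.
kubelet_expired_dates() {
  sed -n 's/.*certificate is expired: \(.*\)$/\1/p'
}

# Example (on the affected node):
#   journalctl -u kubelet --no-pager | kubelet_expired_dates | sort -u
```

Lines that do not contain the message produce no output, so an empty result means the kubelet did not report an expired certificate in that wording.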
Check the service status with systemctl status kubelet.service:
[root@master kubernetes]# systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Wed 2022-10-12 00:26:38 CST; 7s ago
Docs: https://kubernetes.io/docs/
Process: 6437 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
Main PID: 6437 (code=exited, status=255)
Oct 12 00:26:38 master.k8s kubelet[6437]: k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Forever(...f200)
Oct 12 00:26:38 master.k8s systemd[1]: kubelet.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
kubeadm alpha certs check-expiration shows that the certificates expired on 2021-12-06:
[root@master pahu]# kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration
W1012 17:44:18.201290 3636 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Dec 06, 2021 17:48 UTC                                           no
apiserver                  Dec 06, 2021 17:48 UTC                   ca                      no
apiserver-etcd-client      Dec 06, 2021 17:48 UTC                   etcd-ca                 no
apiserver-kubelet-client   Dec 06, 2021 17:48 UTC                   ca                      no
controller-manager.conf    Dec 06, 2021 17:48 UTC                                           no
etcd-healthcheck-client    Dec 06, 2021 17:48 UTC                   etcd-ca                 no
etcd-peer                  Dec 06, 2021 17:48 UTC                   etcd-ca                 no
etcd-server                Dec 06, 2021 17:48 UTC                   etcd-ca                 no
front-proxy-client         Dec 06, 2021 17:48 UTC                   front-proxy-ca          no
scheduler.conf             Dec 06, 2021 17:48 UTC                                           no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Dec 04, 2030 17:48 UTC   8y              no
etcd-ca                 Dec 04, 2030 17:48 UTC   8y              no
front-proxy-ca          Dec 04, 2030 17:48 UTC   8y              no
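To put a number on how stale the certificates are, the expiry date can be compared with the current time. This is only a sketch: days_between is a hypothetical helper, and it assumes a date command that accepts -d (GNU coreutils; BSD/macOS date uses different options).

```shell
# Whole days between two UTC timestamps given as "YYYY-MM-DD HH:MM:SS".
# Assumes a `date` that accepts -d (GNU coreutils).
days_between() {
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( (end - start) / 86400 ))
}

# Example: how many days ago did the certificates expire?
#   days_between "2021-12-06 17:48:28" "$(date -u '+%Y-%m-%d %H:%M:%S')"
```

A negative result would mean the first timestamp is still in the future, that is, the certificate has not expired yet.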
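PKI certificates and requirements
Kubernetes requires PKI certificates for authentication over TLS. If you install Kubernetes with kubeadm, the certificates the cluster needs are generated automatically. You can also generate your own certificates, for example to keep your private keys more secure by not storing them on the API server. This page explains the certificates a cluster requires.
How certificates are used by the cluster
Kubernetes requires PKI for the following operations:
Client certificates for the kubelet to authenticate to the API server
Kubelet server certificates for the API server to talk to the kubelets
Server certificate for the API server endpoint
Client certificates for cluster administrators to authenticate to the API server
Client certificates for the API server to talk to the kubelets
Client certificate for the API server to talk to etcd
Client certificate/kubeconfig for the controller manager to talk to the API server
Client certificate/kubeconfig for the scheduler to talk to the API server
Client and server certificates for the front proxy
Note: front-proxy certificates are required only if you run kube-proxy to support an extension API server.
etcd also implements mutual TLS to authenticate clients and peers.
Where certificates are stored
If you install Kubernetes with kubeadm, most certificates are stored in /etc/kubernetes/pki. All paths in this document are relative to that directory, except for user account certificates, which kubeadm places in /etc/kubernetes.
List the certificate directory: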
[root@master pahu]# ls -l /etc/kubernetes/pki
total 56
-rw-r--r--. 1 root root 1269 Dec 7 2020 apiserver.crt
-rw-r--r--. 1 root root 1135 Dec 7 2020 apiserver-etcd-client.crt
-rw-------. 1 root root 1675 Dec 7 2020 apiserver-etcd-client.key
-rw-------. 1 root root 1679 Dec 7 2020 apiserver.key
-rw-r--r--. 1 root root 1143 Dec 7 2020 apiserver-kubelet-client.crt
-rw-------. 1 root root 1679 Dec 7 2020 apiserver-kubelet-client.key
-rw-r--r--. 1 root root 1066 Dec 7 2020 ca.crt
-rw-------. 1 root root 1675 Dec 7 2020 ca.key
drwxr-xr-x. 2 root root 162 Dec 7 2020 etcd
-rw-r--r--. 1 root root 1078 Dec 7 2020 front-proxy-ca.crt
-rw-------. 1 root root 1675 Dec 7 2020 front-proxy-ca.key
-rw-r--r--. 1 root root 1103 Dec 7 2020 front-proxy-client.crt
-rw-------. 1 root root 1675 Dec 7 2020 front-proxy-client.key
-rw-------. 1 root root 1679 Dec 7 2020 sa.key
-rw-------. 1 root root 451 Dec 7 2020 sa.pub
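The files in this directory can also be inspected one by one. The following is a rough sketch, not part of kubeadm: pki_cert_files and pki_cert_expiry are names invented here, and the second function assumes the openssl CLI is installed.

```shell
# List every *.crt file below a PKI directory (default: /etc/kubernetes/pki).
pki_cert_files() {
  find "${1:-/etc/kubernetes/pki}" -type f -name '*.crt' | sort
}

# Print "<file>: <notAfter date>" for each certificate found.
# Requires the openssl CLI.
pki_cert_expiry() {
  pki_cert_files "$1" | while IFS= read -r crt; do
    printf '%s: %s\n' "$crt" "$(openssl x509 -in "$crt" -noout -enddate | cut -d= -f2)"
  done
}

# Example (as root on the master):
#   pki_cert_expiry /etc/kubernetes/pki
```

This only covers the .crt files. The client certificates embedded in admin.conf, controller-manager.conf and scheduler.conf are not checked by this sketch.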
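Automatic renewal means that kubeadm renews all certificates whenever the control plane is upgraded with kubeadm.
If you have no special requirements for certificate renewal and upgrade Kubernetes regularly, with less than one year between upgrades, the certificates are renewed as part of each upgrade. Upgrading the cluster frequently is the best practice for staying secure.
If you do not want certificates renewed during an upgrade, pass --certificate-renewal=false to kubeadm upgrade apply or kubeadm upgrade node.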
Renew all certificates manually with kubeadm alpha certs renew all:
[root@master pahu]# kubeadm alpha certs renew all
[renew] Reading configuration from the cluster...
[renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[renew] Error reading configuration from the Cluster. Falling back to default configuration
W1012 18:30:31.234419 8414 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
certificate the apiserver uses to access etcd renewed
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healthcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed
The kubelet still fails and still reports an expired client certificate: "existing bootstrap client certificate is expired"
Oct 12 10:57:39 master.k8s kubelet[3264]: I1012 10:57:39.060116 3264 server.go:411] Version: v1.19.4
Oct 12 10:57:39 master.k8s kubelet[3264]: I1012 10:57:39.060783 3264 server.go:831] Client rotation is on, will bootstrap in background
Oct 12 10:57:39 master.k8s kubelet[3264]: E1012 10:57:39.063255 3264 bootstrap.go:265] part of the existing bootstrap client certificate is expired: 2021-12-06 17:48:28 +0000 UTC <--------
Oct 12 10:57:39 master.k8s kubelet[3264]: F1012 10:57:39.063338 3264 server.go:265] failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory
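/etc/kubernetes/kubelet.conf points at kubelet-client-current.pem, which is a symlink to the certificate issued on 2020-12-07. kubeadm alpha certs renew all did not replace it.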
[root@master pahu]# cat /etc/kubernetes/kubelet.conf
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: xxxxxxxxxxx
server: https://192.168.x.x:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: system:node:master.k8s
name: system:node:master.k8s@kubernetes
current-context: system:node:master.k8s@kubernetes
kind: Config
preferences: {}
users:
- name: system:node:master.k8s
user:
client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem <--------
client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
[root@master pahu]# ls -l /var/lib/kubelet/pki
total 12
-rw-------. 1 root root 2810 Dec 7 2020 kubelet-client-2020-12-07-01-48-28.pem
lrwxrwxrwx. 1 root root 59 Dec 7 2020 kubelet-client-current.pem -> /var/lib/kubelet/pki/kubelet-client-2020-12-07-01-48-28.pem
-rw-r--r--. 1 root root 2266 Dec 7 2020 kubelet.crt
-rw-------. 1 root root 1679 Dec 7 2020 kubelet.key
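The issue time of the kubelet client certificate can be read straight from the symlink target. A minimal sketch, assuming the kubelet-client-YYYY-MM-DD-HH-MM-SS.pem naming seen in the listing above (kubelet_client_issued is a hypothetical helper name):

```shell
# Print the issue timestamp encoded in the file name that
# kubelet-client-current.pem points to, as "YYYY-MM-DD HH:MM:SS".
# $1: path to the kubelet-client-current.pem symlink
kubelet_client_issued() {
  target=$(readlink "$1") || return 1
  basename "$target" .pem | sed -n \
    's/^kubelet-client-\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)$/\1 \2:\3:\4/p'
}

# Example:
#   kubelet_client_issued /var/lib/kubelet/pki/kubelet-client-current.pem
```

On this machine the file name says 2020-12-07 01:48:28, which appears to be local time (CST, UTC+8): it lines up with the expiry of 2021-12-06 17:48:28 UTC reported by the kubelet, exactly one year later.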
Recreate the certificates and configuration files on the master node
Back up the certificates and configuration files
# cp -rp /etc/kubernetes /etc/kubernetes.bak
# mkdir /root/backconf
# mv /etc/kubernetes/*.conf /root/backconf/
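One caveat with a fixed backup name: if /etc/kubernetes.bak already exists, cp -r copies the source into it as a subdirectory instead of replacing it. A timestamped copy avoids that. This is just a sketch, and backup_dir is a helper name made up here:

```shell
# Copy a directory to "<dir>.bak-<timestamp>" and print the backup path.
# -p keeps permissions, ownership and timestamps.
backup_dir() {
  src=${1%/}
  dest="${src}.bak-$(date +%Y%m%d%H%M%S)"
  cp -rp "$src" "$dest" && printf '%s\n' "$dest"
}

# Example:
#   backup_dir /etc/kubernetes
```

The function prints the path of the copy, so it can be noted down before any files are moved or deleted.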
Generate new certificates and configuration files
# kubeadm alpha certs renew all
# kubeadm init phase kubeconfig all
Replace $HOME/.kube/config with the newly generated admin.conf:
mv $HOME/.kube/config $HOME/.kube/config.old
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
chmod 644 $HOME/.kube/config
Restart, then check the node status
[root@master pahu]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master.k8s Ready master 674d v1.19.4
node1.k8s NotReady worker 674d v1.19.4
node2.k8s NotReady worker 674d v1.19.4
Create a new token so the worker nodes can rejoin
[root@master pahu]# kubeadm token create --print-join-command
W1012 11:32:41.299172 5785 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
kubeadm join 192.168.157.129:6443 --token usn64n.v8vuy2csmgaj0hgu --discovery-token-ca-cert-hash sha256:f1eaf2cb88154f182e19d6f4a4f9a8ae2e23fd3f38f2e14a413fb68903727bc1
[root@master pahu]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
usn64n.v8vuy2csmgaj0hgu 23h 2022-10-13T11:32:41+08:00 authentication,signing system:bootstrappers:kubeadm:default-node-token
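When copying the join command to another machine it is easy to truncate the token. Bootstrap tokens have a fixed shape: six characters, a dot, then sixteen characters, all lowercase letters or digits. The helpers below are a sketch with invented names (join_token, is_bootstrap_token) that extract the token from a join command and check that shape:

```shell
# Pull the value of --token out of a "kubeadm join ..." command line on stdin.
join_token() {
  sed -n 's/.*--token \([^ ]*\).*/\1/p'
}

# Succeed if $1 has the bootstrap-token shape: 6 chars, a dot, 16 chars,
# all lowercase letters or digits.
is_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Example (on the master):
#   tok=$(kubeadm token create --print-join-command | join_token)
#   is_bootstrap_token "$tok" && echo "token looks well-formed"
```

This only checks the format. Whether the token is still valid is what kubeadm token list shows in the TTL and EXPIRES columns.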
On the worker node, back up and delete the old configuration files
#cp -r /etc/kubernetes /etc/kubernetes.bak
#rm -rf /etc/kubernetes/kubelet.conf
#rm -rf /etc/kubernetes/pki/ca.crt
#rm -rf /etc/kubernetes/bootstrap-kubelet.conf
Join the cluster
[root@node1 pahu]# kubeadm join 192.168.x.x:6443 --token usn64n.v8vuy2csmgaj0hgu --discovery-token-ca-cert-hash sha256:f1eaf2cb88154f182e19d6f4a4f9a8ae2e23fd3f38f2e14a413fb68903727bc1
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Verify that the node status is Ready.
[root@master pahu]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master.k8s Ready master 674d v1.19.4
node1.k8s Ready worker 674d v1.19.4
node2.k8s NotReady worker 674d v1.19.4
Apply the same procedure to node 2, and the cluster is back to normal.
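To confirm that nothing is left behind, the kubectl get nodes output can be filtered for nodes that are not Ready. A small sketch, where not_ready_nodes is a hypothetical helper that assumes the default column layout shown above (NAME first, STATUS second):

```shell
# Read `kubectl get nodes` output on stdin and print the names of nodes
# whose STATUS is not Ready. "Ready,SchedulingDisabled" counts as Ready.
not_ready_nodes() {
  awk 'NR > 1 && $2 !~ /^Ready(,|$)/ { print $1 }'
}

# Example (on the master):
#   kubectl get nodes | not_ready_nodes
```

An empty result means every node reports Ready.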