How to Fix Expired Kubernetes (K8S) Certificates

Two years ago I built a K8S cluster on virtual machines for learning. When I recently started those VMs again, K8S would no longer come up:

The connection to the server xxxx:6443 was refused - did you specify the right host or port?

1. Diagnosis

Check the kubelet logs with sudo journalctl -xefu kubelet. The log reports "bootstrap-kubelet.conf: no such file or directory" and notes that the client certificate has expired:

Oct 11 23:55:51 master.k8s systemd[1]: kubelet.service failed.
Oct 11 23:56:01 master.k8s systemd[1]: kubelet.service holdoff time over, scheduling restart.
Oct 11 23:56:01 master.k8s systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kubelet.service has finished shutting down.
Oct 11 23:56:01 master.k8s systemd[1]: Started kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kubelet.service has finished starting up.
-- 
-- The start-up result is done.
Oct 11 23:56:01 master.k8s kubelet[2931]: I1011 23:56:01.532681    2931 server.go:411] Version: v1.19.4
Oct 11 23:56:01 master.k8s kubelet[2931]: I1011 23:56:01.533162    2931 server.go:831] Client rotation is on, will bootstrap in background
Oct 11 23:56:01 master.k8s kubelet[2931]: E1011 23:56:01.534673    2931 bootstrap.go:265] part of the existing bootstrap client certificate is expired: 2021-12-06 17:48:28 +0000 UTC
Oct 11 23:56:01 master.k8s kubelet[2931]: F1011 23:56:01.534777    2931 server.go:265] failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory

Check the service status with systemctl status kubelet.service:

[root@master kubernetes]# systemctl status kubelet.service 
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: activating (auto-restart) (Result: exit-code) since Wed 2022-10-12 00:26:38 CST; 7s ago
     Docs: https://kubernetes.io/docs/
  Process: 6437 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
 Main PID: 6437 (code=exited, status=255)

Oct 12 00:26:38 master.k8s kubelet[6437]: k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Forever(...f200)
Oct 12 00:26:38 master.k8s systemd[1]: kubelet.service failed.
Hint: Some lines were ellipsized, use -l to show in full.

2. Check the Certificates

Run kubeadm alpha certs check-expiration. All of the leaf certificates expired on Dec 06, 2021:

[root@master pahu]# kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration

W1012 17:44:18.201290    3636 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Dec 06, 2021 17:48 UTC                                  no      
apiserver                  Dec 06, 2021 17:48 UTC          ca                      no      
apiserver-etcd-client      Dec 06, 2021 17:48 UTC          etcd-ca                 no      
apiserver-kubelet-client   Dec 06, 2021 17:48 UTC          ca                      no      
controller-manager.conf    Dec 06, 2021 17:48 UTC                                  no      
etcd-healthcheck-client    Dec 06, 2021 17:48 UTC          etcd-ca                 no      
etcd-peer                  Dec 06, 2021 17:48 UTC          etcd-ca                 no      
etcd-server                Dec 06, 2021 17:48 UTC          etcd-ca                 no      
front-proxy-client         Dec 06, 2021 17:48 UTC          front-proxy-ca          no      
scheduler.conf             Dec 06, 2021 17:48 UTC                                  no      

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Dec 04, 2030 17:48 UTC   8y              no      
etcd-ca                 Dec 04, 2030 17:48 UTC   8y              no      
front-proxy-ca          Dec 04, 2030 17:48 UTC   8y              no 

2.1 About the Certificates

PKI certificates and requirements
Kubernetes requires PKI certificates for authentication over TLS. If you install Kubernetes with kubeadm, the certificates that your cluster requires are automatically generated. You can also generate your own certificates; for example, not storing private keys on the API server keeps them more secure. This section explains the certificates that a cluster requires.

How certificates are used by your cluster
Kubernetes requires PKI for the following operations:

Client certificates for the kubelet to authenticate to the API server
Kubelet server certificates for the API server to talk to the kubelets
Server certificate for the API server endpoint
Client certificates for administrators of the cluster to authenticate to the API server
Client certificates for the API server to talk to the kubelets
Client certificate for the API server to talk to etcd
Client certificate/kubeconfig for the controller manager to talk to the API server
Client certificate/kubeconfig for the scheduler to talk to the API server
Client and server certificates for the front-proxy
Note: front-proxy certificates are required only if you run kube-proxy to support an extension API server.
etcd also implements mutual TLS to authenticate clients and other peers.

Where certificates are stored
If you install Kubernetes with kubeadm, most certificates are stored in /etc/kubernetes/pki. All paths here are relative to that directory, with the exception of user account certificates, which kubeadm places in /etc/kubernetes.

Check where the certificates live:

[root@master pahu]# ls -l /etc/kubernetes/pki
total 56
-rw-r--r--. 1 root root 1269 Dec  7  2020 apiserver.crt
-rw-r--r--. 1 root root 1135 Dec  7  2020 apiserver-etcd-client.crt
-rw-------. 1 root root 1675 Dec  7  2020 apiserver-etcd-client.key
-rw-------. 1 root root 1679 Dec  7  2020 apiserver.key
-rw-r--r--. 1 root root 1143 Dec  7  2020 apiserver-kubelet-client.crt
-rw-------. 1 root root 1679 Dec  7  2020 apiserver-kubelet-client.key
-rw-r--r--. 1 root root 1066 Dec  7  2020 ca.crt
-rw-------. 1 root root 1675 Dec  7  2020 ca.key
drwxr-xr-x. 2 root root  162 Dec  7  2020 etcd
-rw-r--r--. 1 root root 1078 Dec  7  2020 front-proxy-ca.crt
-rw-------. 1 root root 1675 Dec  7  2020 front-proxy-ca.key
-rw-r--r--. 1 root root 1103 Dec  7  2020 front-proxy-client.crt
-rw-------. 1 root root 1675 Dec  7  2020 front-proxy-client.key
-rw-------. 1 root root 1679 Dec  7  2020 sa.key
-rw-------. 1 root root  451 Dec  7  2020 sa.pub
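
In addition to kubeadm alpha certs check-expiration, the validity window of any single certificate can be read with openssl, for example (an extra check, not part of the original session):

# openssl x509 -noout -dates -in /etc/kubernetes/pki/apiserver.crt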

2.2 Automatic Certificate Renewal

Automatic renewal means that kubeadm renews all certificates when it upgrades the control plane.

If you have no special requirements for certificate renewal and upgrade your Kubernetes version regularly, with less than a year between upgrades, the certificates never get a chance to expire; upgrading the cluster frequently is the recommended practice for staying secure anyway.

If you do not want the certificates to be renewed during a cluster upgrade, pass --certificate-renewal=false to kubeadm upgrade apply or kubeadm upgrade node, as sketched below.
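
A minimal sketch of upgrade invocations that skip certificate renewal (the version string is only a placeholder):

# kubeadm upgrade apply v1.19.x --certificate-renewal=false
# kubeadm upgrade node --certificate-renewal=false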

2.3 Manual Certificate Renewal

Run kubeadm alpha certs renew all (on this v1.19 cluster the command still lives under kubeadm alpha; in newer releases the same operation is kubeadm certs renew all):

[root@master pahu]# kubeadm alpha certs renew all
[renew] Reading configuration from the cluster...
[renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[renew] Error reading configuration from the Cluster. Falling back to default configuration

W1012 18:30:31.234419    8414 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
certificate the apiserver uses to access etcd renewed
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healthcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed

The kubelet still reports an expired client certificate ("existing bootstrap client certificate is expired"): renewing the certificates regenerates everything under /etc/kubernetes/pki and the kubeconfig-embedded certs, but it does not rotate the kubelet's own client certificate under /var/lib/kubelet/pki.

Oct 12 10:57:39 master.k8s kubelet[3264]: I1012 10:57:39.060116    3264 server.go:411] Version: v1.19.4
Oct 12 10:57:39 master.k8s kubelet[3264]: I1012 10:57:39.060783    3264 server.go:831] Client rotation is on, will bootstrap in background
Oct 12 10:57:39 master.k8s kubelet[3264]: E1012 10:57:39.063255    3264 bootstrap.go:265] part of the existing bootstrap client certificate is expired: 2021-12-06 17:48:28 +0000 UTC  <---------------
Oct 12 10:57:39 master.k8s kubelet[3264]: F1012 10:57:39.063338    3264 server.go:265] failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory

2.4 Inspect the Client Certificate Referenced by kubelet.conf

kubelet.conf points at kubelet-client-current.pem, which is still a symlink to the client certificate issued on 2020-12-07, i.e. the expired one:

[root@master pahu]# cat /etc/kubernetes/kubelet.conf 
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: xxxxxxxxxxx
    server: https://192.168.x.x:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: system:node:master.k8s
  name: system:node:master.k8s@kubernetes
current-context: system:node:master.k8s@kubernetes
kind: Config
preferences: {}
users:
- name: system:node:master.k8s
  user:
    client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem  <--------
    client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
[root@master pahu]# ls -l /var/lib/kubelet/pki
total 12
-rw-------. 1 root root 2810 Dec  7  2020 kubelet-client-2020-12-07-01-48-28.pem
lrwxrwxrwx. 1 root root   59 Dec  7  2020 kubelet-client-current.pem -> /var/lib/kubelet/pki/kubelet-client-2020-12-07-01-48-28.pem
-rw-r--r--. 1 root root 2266 Dec  7  2020 kubelet.crt
-rw-------. 1 root root 1679 Dec  7  2020 kubelet.key
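
To confirm that this is the expired certificate, its end date can be printed with openssl (an optional check, not part of the original session):

# openssl x509 -noout -enddate -in /var/lib/kubelet/pki/kubelet-client-current.pem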

3. Solution

Recreate the certificates and kubeconfig files on the master node, then rejoin the worker nodes.

3.1 Master Node

Back up the existing certificates and config files:

# cp -rp /etc/kubernetes /etc/kubernetes.bak
# mkdir  /root/backconf
# mv /etc/kubernetes/*.conf    /root/backconf/

Generate new certificates and kubeconfig files:

# kubeadm alpha certs renew all
# kubeadm init phase kubeconfig all
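
If only a single kubeconfig needs to be regenerated, kubeadm can also produce them one at a time; a sketch of the per-file phases (see kubeadm init phase kubeconfig --help for the full list):

# kubeadm init phase kubeconfig admin
# kubeadm init phase kubeconfig kubelet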

Overwrite $HOME/.kube/config with the newly generated admin.conf:

mv $HOME/.kube/config $HOME/.kube/config.old
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
chmod 644 $HOME/.kube/config

Restart so that the control-plane components and the kubelet pick up the new certificates (a restart sketch follows below), then check the node status:
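
A minimal sketch of the restart, assuming Docker as the container runtime and a systemd-managed kubelet (adjust to your environment):

# systemctl restart docker
# systemctl restart kubelet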

[root@master pahu]# kubectl get nodes
NAME         STATUS     ROLES    AGE    VERSION
master.k8s   Ready      master   674d   v1.19.4
node1.k8s    NotReady   worker   674d   v1.19.4
node2.k8s    NotReady   worker   674d   v1.19.4

Create a new bootstrap token so the worker nodes can rejoin:

[root@master pahu]# kubeadm token create --print-join-command
W1012 11:32:41.299172    5785 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
kubeadm join 192.168.157.129:6443 --token usn64n.v8vuy2csmgaj0hgu     --discovery-token-ca-cert-hash sha256:f1eaf2cb88154f182e19d6f4a4f9a8ae2e23fd3f38f2e14a413fb68903727bc1 

[root@master pahu]# kubeadm token list
TOKEN                     TTL         EXPIRES                     USAGES                   DESCRIPTION                                                EXTRA GROUPS
usn64n.v8vuy2csmgaj0hgu   23h         2022-10-13T11:32:41+08:00   authentication,signing                                                        system:bootstrappers:kubeadm:default-node-token
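
If a valid token already exists and only the CA hash is needed, it can also be computed directly from ca.crt, as described in the standard kubeadm join documentation:

# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'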

3.2 Worker Nodes

Back up and then remove the stale config files:

# cp -r /etc/kubernetes /etc/kubernetes.bak
# rm -rf /etc/kubernetes/kubelet.conf
# rm -rf /etc/kubernetes/pki/ca.crt
# rm -rf /etc/kubernetes/bootstrap-kubelet.conf

Rejoin the cluster with the join command printed above:

[root@node1 pahu]# kubeadm join 192.168.x.x:6443 --token usn64n.v8vuy2csmgaj0hgu     --discovery-token-ca-cert-hash sha256:f1eaf2cb88154f182e19d6f4a4f9a8ae2e23fd3f38f2e14a413fb68903727bc1
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Verify that the node is now Ready:

[root@master pahu]# kubectl get nodes
NAME         STATUS     ROLES    AGE    VERSION
master.k8s   Ready      master   674d   v1.19.4
node1.k8s    Ready      worker   674d   v1.19.4
node2.k8s    NotReady   worker   674d   v1.19.4

Apply the same steps to node2, and the cluster is back to normal.

Note: the steps above were done on my own learning cluster and are for reference only; the approach may not be entirely rigorous. For a production system, follow the official documentation and test carefully.
