Fixing kubeadm commands that fail because kubeadm-config cannot be found

The symptom: any kubeadm command, such as kubeadm upgrade plan, fails with an error like the following:

[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[upgrade/config] In order to upgrade, a ConfigMap called "kubeadm-config" in the kube-system namespace must exist.
[upgrade/config] Without this information, 'kubeadm upgrade' won't know how to configure your upgraded cluster.

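In other words, kubeadm cannot find the kubeadm-config information. Yet running 'kubectl -n kube-system get cm kubeadm-config -o yaml' as the message suggests does return content. My original workaround was to save the ClusterConfiguration part of that output to a file named kubeadm-config.yaml and run kubeadm with the extra argument --config=kubeadm-config.yaml. That works, and both plan and upgrade run fine, but it is inconvenient and does not fix the underlying problem.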
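
The export step in that workaround can be scripted. The following is only a minimal sketch: it assumes kubectl is already configured for the cluster and that the ConfigMap keeps the configuration under the data key ClusterConfiguration. The function name export_cluster_config is made up for illustration.

```shell
# Hypothetical helper: save the ClusterConfiguration part of the
# kubeadm-config ConfigMap to a local file (default: kubeadm-config.yaml).
export_cluster_config() {
  out_file="${1:-kubeadm-config.yaml}"
  kubectl -n kube-system get cm kubeadm-config \
    -o jsonpath='{.data.ClusterConfiguration}' > "$out_file"
}
```

Usage, under the same assumptions: run export_cluster_config kubeadm-config.yaml, then kubeadm upgrade plan --config=kubeadm-config.yaml.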

Recently I had some time to dig into it again. Adding -v=6 to the kubeadm command produces a much more detailed execution log:

I0630 17:01:31.992724    3697 plan.go:251] [upgrade/plan] verifying health of cluster
I0630 17:01:31.992785    3697 plan.go:252] [upgrade/plan] retrieving configuration from cluster
I0630 17:01:31.993390    3697 loader.go:374] Config loaded from file:  /etc/kubernetes/admin.conf
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
I0630 17:01:32.000747    3697 round_trippers.go:553] GET https://172.16.5.141:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s 200 OK in 6 milliseconds
I0630 17:01:32.001625    3697 loader.go:374] Config loaded from file:  /etc/kubernetes/kubelet.conf
I0630 17:01:32.002761    3697 round_trippers.go:553] GET https://172.16.5.141:6443/api/v1/nodes/kubernetes-admin?timeout=10s 404 Not Found in 1 milliseconds
[upgrade/config] In order to upgrade, a ConfigMap called "kubeadm-config" in the kube-system namespace must exist.
[upgrade/config] Without this information, 'kubeadm upgrade' won't know how to configure your upgraded cluster.

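The log shows that nodes/kubernetes-admin does not exist (404 Not Found), and calling the API directly with curl confirmed this. So why was kubeadm requesting a node that does not exist? From what I found online, the cause is a broken kubelet.conf, specifically wrong CN/O values in its client certificate. I rechecked the file: the users entry is name: system:node:k8s-m, but decoding the certificate in client-certificate-data gives (Subject: O = system:masters, CN = kubernetes-admin). The command to print it is: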

echo -n "YOUR-client-certificate-data" | base64 --decode | openssl x509 -noout -text
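
To avoid copying the base64 string by hand, the same check can be run straight against a kubeconfig file. This is a sketch under two assumptions: the file embeds the certificate as client-certificate-data (as the kubelet.conf here does) rather than pointing to a file on disk, and it contains a single user entry. The function name kubeconfig_cert_subject is hypothetical.

```shell
# Hypothetical helper: print the Subject of the client certificate
# embedded in a kubeconfig file.
kubeconfig_cert_subject() {
  kubeconfig="$1"
  # Take the first client-certificate-data value, decode it, and let
  # openssl print only the subject line.
  awk '$1 == "client-certificate-data:" {print $2; exit}' "$kubeconfig" \
    | base64 --decode \
    | openssl x509 -noout -subject
}
```

Usage: kubeconfig_cert_subject /etc/kubernetes/kubelet.conf, and the same against /etc/kubernetes/admin.conf for comparison.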

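So the certificate does not match the user configured in kubelet.conf. Thinking back, the last time kubelet failed to start and I renewed the certificates, I must have accidentally copied the certificate from admin.conf, because the user configured in admin.conf is exactly (name: kubernetes-admin).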
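
This also accounts for the 404 in the log. kubeadm appears to take the node name from the CN of the client certificate in kubelet.conf, dropping the system:node: prefix, and then looks that node up. The sketch below only illustrates that mapping as inferred from the log above; it is not kubeadm's actual code, and the function name node_name_from_cn is hypothetical.

```shell
# Hypothetical helper: derive the node name kubeadm would look up
# from a certificate CN by stripping the "system:node:" prefix.
node_name_from_cn() {
  printf '%s\n' "${1#system:node:}"
}
```

With a correct kubelet certificate, node_name_from_cn system:node:k8s-m gives k8s-m. With the admin certificate copied in by mistake, the CN kubernetes-admin has no prefix to strip, so the lookup goes to nodes/kubernetes-admin, which does not exist.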

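Once the cause is known, the fix is fairly simple: delete /etc/kubernetes/kubelet.conf and regenerate it with the command below. The kubeadm-config.yaml argument can be exported as described at the beginning of this article (only the ClusterConfiguration part is needed).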

kubeadm init phase kubeconfig kubelet --config=kubeadm-config.yaml
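
A slightly more cautious variant of the same step moves the old file aside instead of deleting it. This is a sketch, not the exact procedure used above: the backup is an addition, the function name regenerate_kubelet_conf is hypothetical, and the second parameter assumes the default kubeconfig directory /etc/kubernetes.

```shell
# Hypothetical helper: move the old kubelet.conf aside, then regenerate it.
#   $1: exported ClusterConfiguration file
#   $2: kubeconfig directory (assumed default: /etc/kubernetes)
regenerate_kubelet_conf() {
  config_file="$1"
  kubeconfig_dir="${2:-/etc/kubernetes}"
  kubelet_conf="$kubeconfig_dir/kubelet.conf"

  # Keep a backup instead of deleting outright (an addition of this sketch).
  if [ -f "$kubelet_conf" ]; then
    mv "$kubelet_conf" "$kubelet_conf.bak" || return 1
  fi

  # Same command as above, with the output directory made explicit.
  kubeadm init phase kubeconfig kubelet \
    --config="$config_file" \
    --kubeconfig-dir="$kubeconfig_dir"
}
```

Usage, as root: regenerate_kubelet_conf kubeadm-config.yaml. Afterwards the certificate embedded in kubelet.conf should carry a subject of the form O = system:nodes, CN = system:node:<node name>.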

With the file regenerated, kubeadm commands no longer need the --config argument. Problem solved, though probably only clumsy users like me ever run into it :(
