(2) K8s Pitfall Notes

Table of Contents

  • 1 kubeadm join error
  • 2 Unable to connect to the server: x509
  • 3 coredns stuck in ContainerCreating

1 kubeadm join error

While deploying the services, the master node was rebooted after initialization. The node then failed to join the cluster, with errors hinting at problems such as an expired certificate. The error output:

[root@node01 ~]# kubeadm join 10.0.0.100:6443 --token qxl5b3.5b78nwu3gm1r4u6o --discovery-token-ca-cert-hash sha256:3e20fa8054cbc9000cf3d3586a05a01d8af5721b577856e93c7e243877393d21 --ignore-preflight-errors=Swap
[preflight] Running pre-flight checks
	[WARNING Swap]: running with swap on is not supported. Please disable swap
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.0. Latest validated version: 18.06
[discovery] Trying to connect to API Server "10.0.0.100:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.0.0.100:6443"
[discovery] Failed to request cluster info, will try again: [Get https://10.0.0.100:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: dial tcp 10.0.0.100:6443: connect: connection refused]
[discovery] Failed to request cluster info, will try again: [Get https://10.0.0.100:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: dial tcp 10.0.0.100:6443: connect: connection refused]

Possible causes:

  1. The clocks on the nodes may be out of sync
  2. The master was rebooted after initialization, which triggers this error on join
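Before resorting to a reset, the first cause is easy to rule out. A minimal check (no NTP tooling assumed) is to print the UTC time on the master and on each node and compare:

```shell
# Print the current UTC time in a fixed format; run this on the master and
# on each node, then compare the outputs (they should agree within a
# second or two).
date -u +"%Y-%m-%dT%H:%M:%SZ"
```

If the clocks drift noticeably, synchronize them (for example with chronyd or ntpdate) before retrying kubeadm join.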

Solution

  • After digging through many resources without finding a working fix, kubeadm reset solved it completely
  • Reference post: k8s踩坑记 - kubeadm join 之 token 失效

Re-initialize after the reset

[root@master1 ~]# kubeadm reset
[root@master1 ~]# kubeadm init --kubernetes-version=v1.13.0 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 --token-ttl=0 --ignore-preflight-errors=Swap

Create the required files

[root@master1 ~]# mkdir -p $HOME/.kube
[root@master1 ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
cp: overwrite ‘/root/.kube/config’? y
[root@master1 ~]# chown $(id -u):$(id -g) $HOME/.kube/config

Check the nodes

[root@master1 ~]# kubectl get nodes
NAME              STATUS     ROLES    AGE     VERSION
master1.rsq.com   NotReady   master   2m49s   v1.13.0

2 Unable to connect to the server: x509

This error occurs when the required kubeconfig file was not set up after initialization:

[root@master1 ~]# kubectl get nodes
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

Creating the required files fixes it

[root@master1 ~]# mkdir -p $HOME/.kube
[root@master1 ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
cp: overwrite ‘/root/.kube/config’? y
[root@master1 ~]# chown $(id -u):$(id -g) $HOME/.kube/config
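As an alternative to copying admin.conf, kubectl can be pointed at it directly through the standard KUBECONFIG environment variable; this affects only the current shell session:

```shell
# Use the admin kubeconfig for this shell session only (run as root on the
# master). This avoids overwriting an existing ~/.kube/config.
export KUBECONFIG=/etc/kubernetes/admin.conf
# kubectl get nodes   # should now connect without the x509 error
```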

Check the cluster nodes

[root@master1 ~]# kubectl get nodes
NAME              STATUS     ROLES    AGE     VERSION
master1.rsq.com   NotReady   master   2m49s   v1.13.0

3 coredns stuck in ContainerCreating

After joining the cluster, the nodes stayed in the NotReady state. Checking the kube-system namespace showed both coredns pods stuck in ContainerCreating:

[root@master1 ~]# kubectl get nodes
NAME              STATUS     ROLES    AGE     VERSION
master1.rsq.com   NotReady   master   16h     v1.13.0
node01.rsq.com    NotReady   <none>   16h     v1.13.0
node02.rsq.com    NotReady   <none>   8m39s   v1.13.0

[root@master1 ~]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                      READY   STATUS              RESTARTS   AGE
kube-system   coredns-86c58d9df4-fzs9l                  0/1     ContainerCreating   0          16h
kube-system   coredns-86c58d9df4-hrwvk                  0/1     ContainerCreating   0          16h
kube-system   etcd-master1.rsq.com                      1/1     Running             1          16h
kube-system   kube-apiserver-master1.rsq.com            1/1     Running             1          16h
kube-system   kube-controller-manager-master1.rsq.com   1/1     Running             1          16h
kube-system   kube-proxy-28cmg                          1/1     Running             1          16h
kube-system   kube-proxy-5xqvf                          1/1     Running             1          26m
kube-system   kube-proxy-s4tsz                          1/1     Running             1          16h
kube-system   kube-scheduler-master1.rsq.com            1/1     Running             1          16h

Check the kubelet service status; the errors are in the last few lines

[root@master1 ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Wed 2018-12-12 09:24:44 CST; 6min ago
     Docs: https://kubernetes.io/docs/
 Main PID: 123631 (kubelet)
   Memory: 35.2M
   CGroup: /system.slice/kubelet.service
           └─123631 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgrou...

Dec 12 09:30:57 master1.rsq.com kubelet[123631]: E1212 09:30:57.187292  123631 pod_workers.go:190] Error syncing pod cea84a11-fd24-11e8-a282-000c291e37c2 ("coredns-86c58d9df4-fzs9l_kube-...
Dec 12 09:30:57 master1.rsq.com kubelet[123631]: E1212 09:30:57.187480  123631 pod_workers.go:190] Error syncing pod cea7ebef-fd24-11e8-a282-000c291e37c2 ("coredns-86c58d9df4-hrwvk_kube-...
Dec 12 09:30:59 master1.rsq.com kubelet[123631]: E1212 09:30:59.187419  123631 pod_workers.go:190] Error syncing pod cea84a11-fd24-11e8-a282-000c291e37c2 ("coredns-86c58d9df4-fzs9l_kube-...
Dec 12 09:30:59 master1.rsq.com kubelet[123631]: E1212 09:30:59.187607  123631 pod_workers.go:190] Error syncing pod cea7ebef-fd24-11e8-a282-000c291e37c2 ("coredns-86c58d9df4-hrwvk_kube-...
Dec 12 09:31:00 master1.rsq.com kubelet[123631]: W1212 09:31:00.454147  123631 cni.go:203] Unable to update cni config: No networks found in /etc/cni/net.d
Dec 12 09:31:00 master1.rsq.com kubelet[123631]: E1212 09:31:00.454242  123631 kubelet.go:2192] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNot...initialized
Dec 12 09:31:01 master1.rsq.com kubelet[123631]: E1212 09:31:01.188877  123631 pod_workers.go:190] Error syncing pod cea7ebef-fd24-11e8-a282-000c291e37c2 ("coredns-86c58d9df4-hrwvk_kube-...
Dec 12 09:31:01 master1.rsq.com kubelet[123631]: E1212 09:31:01.189259  123631 pod_workers.go:190] Error syncing pod cea84a11-fd24-11e8-a282-000c291e37c2 ("coredns-86c58d9df4-fzs9l_kube-...
Dec 12 09:31:03 master1.rsq.com kubelet[123631]: E1212 09:31:03.187200  123631 pod_workers.go:190] Error syncing pod cea84a11-fd24-11e8-a282-000c291e37c2 ("coredns-86c58d9df4-fzs9l_kube-...
Dec 12 09:31:03 master1.rsq.com kubelet[123631]: E1212 09:31:03.187730  123631 pod_workers.go:190] Error syncing pod cea7ebef-fd24-11e8-a282-000c291e37c2 ("coredns-86c58d9df4-hrwvk_kube-...
Hint: Some lines were ellipsized, use -l to show in full.
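The decisive line above is "Unable to update cni config: No networks found in /etc/cni/net.d": the kubelet loads CNI network configuration from that directory, and it stays empty until a network plugin such as flannel is deployed. A quick way to confirm (the path is taken straight from the log):

```shell
# Inspect the directory the kubelet complained about. An empty or missing
# directory means no CNI plugin has written its network config yet.
CNI_DIR=/etc/cni/net.d
if [ -n "$(ls -A "$CNI_DIR" 2>/dev/null)" ]; then
    echo "CNI config present in $CNI_DIR"
else
    echo "no CNI config found in $CNI_DIR"
fi
```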

Cause

  • The kubelet kept reporting the network as not ready, which pointed to a problem with the flannel component; some searching turned up a fix
  • Reference post: coreDNS一直处于创建中解决

Solution:

# Run on all nodes (I ran it only on the master first and that already fixed it)
rm -rf /var/lib/cni/flannel/* && rm -rf /var/lib/cni/networks/cbr0/* && ip link delete cni0
rm -rf /var/lib/cni/networks/cni0/*

# Remove the flannel image and re-deploy it
[root@master1 ~]# docker rmi quay.io/coreos/flannel:v0.10.0-amd64
[root@master1 ~]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

# The nodes are now in the Ready state
[root@master1 ~]# kubectl get nodes
NAME              STATUS   ROLES    AGE   VERSION
master1.rsq.com   Ready    master   16h   v1.13.0
node01.rsq.com    Ready    <none>   16h   v1.13.0
node02.rsq.com    Ready    <none>   39m   v1.13.0
