Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in cri-o/containerd using crictl:
- 'crictl --runtime-endpoint /run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint /run/containerd/containerd.sock logs CONTAINERID'
couldn't initialize a Kubernetes cluster
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase
Following the hint, journalctl -xeu kubelet was used to inspect the logs; the errors in the log file fall into four main kinds:
Error getting node" err="node \"k8s-master\" not found
May 31 09:04:45 k8s-master kubelet[12906]: E0531 09:04:45.363423 12906 remote_runtime.go:198] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"k8s.gcr.io/pause:3.6\": failed to pull image \"k8s.gcr.io/pause:3.6\": failed to pull and unpack image \"k8s.gcr.io/pause:3.6\": failed to resolve reference \"k8s.gcr.io/pause:3.6\": failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.6\": dial tcp 142.250.157.82:443: i/o timeout"
May 31 09:04:45 k8s-master kubelet[12906]: E0531 09:04:45.363556 12906 kuberuntime_sandbox.go:70] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to get sandbox image \"k8s.gcr.io/pause:3.6\": failed to pull image \"k8s.gcr.io/pause:3.6\": failed to pull and unpack image \"k8s.gcr.io/pause:3.6\": failed to resolve reference \"k8s.gcr.io/pause:3.6\": failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.6\": dial tcp 142.250.157.82:443: i/o timeout" pod="kube-system/kube-controller-manager-k8s-master"
May 31 09:04:45 k8s-master kubelet[12906]: E0531 09:04:45.363628 12906 kuberuntime_manager.go:833] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"k8s.gcr.io/pause:3.6\": failed to pull image \"k8s.gcr.io/pause:3.6\": failed to pull and unpack image \"k8s.gcr.io/pause:3.6\": failed to resolve reference \"k8s.gcr.io/pause:3.6\": failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.6\": dial tcp 142.250.157.82:443: i/o timeout" pod="kube-system/kube-controller-manager-k8s-master"
May 31 09:04:45 k8s-master kubelet[12906]: E0531 09:04:45.363775 12906 pod_workers.go:949] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"kube-controller-manager-k8s-master_kube-system(a7773f029975563a22f260af603bc174)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"kube-controller-manager-k8s-master_kube-system(a7773f029975563a22f260af603bc174)\\\": rpc error: code = Unknown desc = failed to get sandbox image \\\"k8s.gcr.io/pause:3.6\\\": failed to pull image \\\"k8s.gcr.io/pause:3.6\\\": failed to pull and unpack image \\\"k8s.gcr.io/pause:3.6\\\": failed to resolve reference \\\"k8s.gcr.io/pause:3.6\\\": failed to do request: Head \\\"https://k8s.gcr.io/v2/pause/manifests/3.6\\\": dial tcp 142.250.157.82:443: i/o timeout\"" pod="kube-system/kube-controller-manager-k8s-master" podUID=a7773f029975563a22f260af603bc174
May 31 09:04:54 k8s-master kubelet[12906]: E0531 09:04:54.426474 12906 kubelet.go:2347] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
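Before attempting a fix, it is worth confirming which pause image the node actually has cached locally (a quick check; in this case only the mirror's pause:3.7 turned out to be present, as shown further below):

# list locally cached images and filter for the pause/sandbox image
crictl img | grep pause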
First attempted fix: point the kubelet at the Aliyun mirror's pause image via KUBELET_EXTRA_ARGS:
tee /etc/sysconfig/kubelet <<-EOF
KUBELET_EXTRA_ARGS="--pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.6"
EOF
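For the new flag to take effect, the kubelet has to be restarted (a sketch, assuming a systemd-managed kubelet as in the troubleshooting hints above):

# reload unit files and restart the kubelet so KUBELET_EXTRA_ARGS is picked up
systemctl daemon-reload
systemctl restart kubelet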
Yet it still tried to pull the k8s.gcr.io/pause:3.6 image. The containerd logs from the same period show:
May 31 09:11:10 k8s-master containerd[9807]: time="2022-05-31T09:11:10.355757461+08:00" level=error msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-apiserver-k8s-master,Uid:4209d27a0268bc5305037fe9024040af,Namespace:kube-system,Attempt:0,} failed, error" error="failed to get sandbox image \"k8s.gcr.io/pause:3.6\": failed to pull image \"k8s.gcr.io/pause:3.6\": failed to pull and unpack image \"k8s.gcr.io/pause:3.6\": failed to resolve reference \"k8s.gcr.io/pause:3.6\": failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.6\": dial tcp 64.233.189.82:443: i/o timeout"
May 31 09:11:12 k8s-master containerd[9807]: time="2022-05-31T09:11:12.355948935+08:00" level=info msg="trying next host" error="failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.6\": dial tcp 64.233.189.82:443: i/o timeout" host=k8s.gcr.io
May 31 09:11:12 k8s-master containerd[9807]: time="2022-05-31T09:11:12.361334669+08:00" level=error msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-controller-manager-k8s-master,Uid:a7773f029975563a22f260af603bc174,Namespace:kube-system,Attempt:0,} failed, error" error="failed to get sandbox image \"k8s.gcr.io/pause:3.6\": failed to pull image \"k8s.gcr.io/pause:3.6\": failed to pull and unpack image \"k8s.gcr.io/pause:3.6\": failed to resolve reference \"k8s.gcr.io/pause:3.6\": failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.6\": dial tcp 64.233.189.82:443: i/o timeout"
May 31 09:11:23 k8s-master containerd[9807]: time="2022-05-31T09:11:23.353642109+08:00" level=info msg="trying next host" error="failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.6\": dial tcp 64.233.189.82:443: i/o timeout" host=k8s.gcr.io
May 31 09:11:23 k8s-master containerd[9807]: time="2022-05-31T09:11:23.357821141+08:00" level=error msg="RunPodSandbox for &PodSandboxMetadata{Name:etcd-k8s-master,Uid:267aa25988340cd5f9ebe7bf0bc5b507,Namespace:kube-system,Attempt:0,} failed, error" error="failed to get sandbox image \"k8s.gcr.io/pause:3.6\": failed to pull image \"k8s.gcr.io/pause:3.6\": failed to pull and unpack image \"k8s.gcr.io/pause:3.6\": failed to resolve reference \"k8s.gcr.io/pause:3.6\": failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.6\": dial tcp 64.233.189.82:443: i/o timeout"
May 31 09:11:25 k8s-master containerd[9807]: time="2022-05-31T09:11:25.345054250+08:00" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-apiserver-k8s-master,Uid:4209d27a0268bc5305037fe9024040af,Namespace:kube-system,Attempt:0,}"
This shows that api-server, controller-manager, and etcd never got their containers started because the pause image could not be pulled, which is what produced errors 1 and 2 above. Listing images with crictl img showed only registry.aliyuncs.com/google_containers/pause:3.7.
Manually pulling registry.aliyuncs.com/google_containers/pause:3.6, re-tagging it, and re-running the cluster creation command then succeeded.
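The pull step can be done with ctr (a sketch; -n k8s.io targets the containerd namespace that Kubernetes uses, matching the tag command below):

# pull the pause 3.6 image from the Aliyun mirror into containerd's k8s.io namespace
ctr -n k8s.io i pull registry.aliyuncs.com/google_containers/pause:3.6
# then re-tag it under the name the control plane components are actually asking for: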
ctr -n k8s.io i tag registry.aliyuncs.com/google_containers/pause:3.6 k8s.gcr.io/pause:3.6
The root cause of this pitfall is that Google's image registry cannot be reached directly from mainland China. Although the init command included the mirror-registry parameter --image-repository, it is not fully effective (the sandbox image is still requested from k8s.gcr.io); hopefully a newer version will fix this.
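A more direct fix, sketched below on the assumption that containerd is the CRI runtime (as in the logs above): with containerd, the sandbox image is selected by containerd's own configuration rather than by the kubelet flag, so /etc/containerd/config.toml can point it at the mirror:

# /etc/containerd/config.toml -- CRI plugin section
[plugins."io.containerd.grpc.v1.cri"]
  # pull the pause/sandbox image from the Aliyun mirror instead of k8s.gcr.io
  sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"

After editing the file, restart containerd (systemctl restart containerd) so the new sandbox_image takes effect.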
--pod-network-cidr=10.244.0.0/16 corresponds to the following in the kubeadm config file:
networking:
podSubnet: 10.244.0.0/16
kubeadm config print init-defaults > kubeadm_config.yaml  # print the default config to see the corresponding fields
Reference: K8S: convert "kubeadm init" command-line arguments to "--config" YAML-file equivalent
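Putting the flags used here into --config form, a minimal kubeadm_config.yaml might look like the sketch below (written against the v1beta3 kubeadm API; the imageRepository line mirrors the --image-repository flag discussed above):

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
# mirror registry, equivalent of --image-repository
imageRepository: registry.aliyuncs.com/google_containers
networking:
  # equivalent of --pod-network-cidr=10.244.0.0/16
  podSubnet: 10.244.0.0/16

It would then be passed as kubeadm init --config kubeadm_config.yaml.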
After kubeadm init succeeds, set up kubectl access for the current user as the init output suggests:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
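A quick sanity check afterwards:

# the master node will report NotReady until a CNI plugin (e.g. flannel, matching 10.244.0.0/16) is installed
kubectl get nodes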