Building a virtual k8s cluster with kind
At work I recently ran into unacceptably long latency when creating and deleting pods at scale: creating three thousand containers took two hours. I am trying to improve efficiency in this scenario by starting from how the kube-apiserver and kube-scheduler components work, preferring parameter tuning over code changes wherever possible. Once a tuning plan for kube-scheduler has been produced, it needs to be validated with real measurements on a cluster. kind is used here to build a virtual k8s cluster, which works around the shortage of node machines in the test environment. Few companies can afford a test cluster with thousands of nodes, which makes kind very practical.
How kind builds a cluster
kind, short for Kubernetes IN Docker, builds a virtual k8s cluster on a single machine. The k8s components actually run inside docker containers, so each docker container plays the role of a node.
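Once a cluster has been created (see below), this is easy to verify from the host: every kind "node" is just a container running the kindest/node image. A quick check, assuming the default cluster:
# each kind node shows up as a docker container on the host
docker ps --format 'table {{.Names}}\t{{.Image}}' --filter name=control-plane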
Installing kind
Choose a kind version according to your needs; each kind release corresponds to specific k8s versions, see the release notes for details.
wget -O /usr/local/bin/kind https://github.com/kubernetes-sigs/kind/releases/download/v0.8.1/kind-linux-amd64 && chmod +x /usr/local/bin/kind
In addition, make sure that (a quick sanity check follows the list):
- docker is running on the machine
- the kubectl binary is available as the command-line tool
- go 1.14 or greater (needed when building node images from source)
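A minimal check that all prerequisites are in place, using each tool's standard version subcommand:
# verify docker, kubectl, go and kind are all available
docker version --format '{{.Server.Version}}'
kubectl version --client --short
go version
kind version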
Creating a kind cluster
Default cluster
Here we create the simplest possible single-node kind cluster. Since kind is at v0.8.1, it uses k8s v1.18.2 by default. As shown below, the default cluster is named kind and has a single master node.
The pods suffixed with control-plane correspond to the master components (apiserver, scheduler, kube-controller-manager and etcd); kindnet is kind's default CNI plugin. kubelet itself does not run as a pod but as a process inside each node container.
root@unknown:/home/zourui# kind create cluster
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.18.2)
✓ Preparing nodes
✓ Writing configuration
✓ Starting control-plane
✓ Installing CNI
✓ Installing StorageClass
Set kubectl context to "kind-kind"
You can now use your cluster with:
kubectl cluster-info --context kind-kind
Have a question, bug, or feature request? Let us know! https://kind.sigs.k8s.io/#community
root@unknown:/home/zourui# kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-66bff467f8-rxxdc 1/1 Running 0 8m13s
kube-system coredns-66bff467f8-xx6wv 1/1 Running 0 8m13s
kube-system etcd-kind-control-plane 1/1 Running 0 8m21s
kube-system kindnet-z6wjx 1/1 Running 0 8m13s
kube-system kube-apiserver-kind-control-plane 1/1 Running 0 8m22s
kube-system kube-controller-manager-kind-control-plane 1/1 Running 0 8m22s
kube-system kube-proxy-v4z6x 1/1 Running 0 8m13s
kube-system kube-scheduler-kind-control-plane 1/1 Running 0 8m21s
local-path-storage local-path-provisioner-bd4bb6b75-rl5dm 1/1 Running 0 8m13s
root@cld-unknown23400:/home/zourui# kubectl get node
NAME STATUS ROLES AGE VERSION
kind-control-plane Ready master 34s v1.18.2
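As noted above, kubelet does not appear in the pod list; it runs as a systemd service inside the node container. A quick way to check, assuming the default node name kind-control-plane:
# kubelet runs as a systemd unit inside the node container, not as a pod
docker exec kind-control-plane systemctl status kubelet --no-pager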
Building a custom cluster
Most companies do not run the upstream community version of k8s. To use an internal build, you have to start by building a node image.
Make sure the kubernetes source tree sits under $GOPATH/src/k8s.io, otherwise the project will not be found when the image is built.
Build an image from the local kubernetes source
export SYM_K8S_VERSION=****
kind build node-image --image kindest/node:1.12.4-original
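The resulting node image should then be visible in the host's local images under the tag passed via --image above:
# confirm the freshly built node image exists locally
docker images kindest/node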
Build a multi-node cluster (1 master + 10 workers) from the local image
root@unknown:/home/zourui# cat kind-ten-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
- role: worker
- role: worker
- role: worker
- role: worker
- role: worker
- role: worker
- role: worker
kind create cluster --name ten --image kindest/node:1.12.4-original --config kind-ten-config.yaml
A total of 11 kubelets are running (1 master + 10 workers):
root@cld-unknown23400:/home/zourui# ps axu|grep kubelet
root 85447 3.9 0.1 4194200 81196 ? Ssl 18:26 0:02 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --fail-swap-on=false --node-ip=172.19.0.4 --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false
root 87746 51.6 0.5 419220 349704 ? Ssl 18:26 0:26 kube-apiserver --authorization-mode=Node,RBAC --advertise-address=172.19.0.4 --allow-privileged=true --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key --etcd-servers=https://127.0.0.1:2379 --insecure-port=0 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-key-file=/etc/kubernetes/pki/sa.pub --service-cluster-ip-range=10.96.0.0/12 --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
root 97363 10.1 0.1 3822916 79032 ? Ssl 18:27 0:02 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --fail-swap-on=false --node-ip=172.19.0.6 --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false
root 97422 8.6 0.1 3158816 77976 ? Ssl 18:27 0:01 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --fail-swap-on=false --node-ip=172.19.0.12 --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false
root 97428 10.0 0.1 3748416 77848 ? Ssl 18:27 0:02 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --fail-swap-on=false --node-ip=172.19.0.5 --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false
root 97430 9.8 0.1 4192088 76116 ? Ssl 18:27 0:01 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --fail-swap-on=false --node-ip=172.19.0.8 --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false
root 97450 8.6 0.1 3601720 78160 ? Ssl 18:27 0:01 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --fail-swap-on=false --node-ip=172.19.0.2 --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false
root 97512 10.3 0.1 4043856 78560 ? Ssl 18:27 0:01 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --fail-swap-on=false --node-ip=172.19.0.11 --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false
root 97551 8.4 0.1 3600404 78608 ? Ssl 18:27 0:01 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --fail-swap-on=false --node-ip=172.19.0.10 --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false
root 97553 9.1 0.1 3822724 76980 ? Ssl 18:27 0:01 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --fail-swap-on=false --node-ip=172.19.0.3 --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false
root 97554 8.2 0.1 3823460 79168 ? Ssl 18:27 0:01 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --fail-swap-on=false --node-ip=172.19.0.7 --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false
root 97562 8.6 0.1 3823428 77156 ? Ssl 18:27 0:01 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --fail-swap-on=false --node-ip=172.19.0.9 --container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false
root 100700 0.0 0.0 13516 888 pts/0 S+ 18:27 0:00 grep kubelet
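Besides inspecting the kubelet processes on the host, the nodes can also be listed through kubectl; kind names the context kind-<cluster name>, so kind-ten here:
# list both the node containers and the registered k8s nodes
docker ps --filter name=ten- --format '{{.Names}}'
kubectl get nodes --context kind-ten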
Image pull failures
By default, the cluster built above cannot start any containers, because image pulls fail. I spent a long time on this at first: I used docker pull to fetch the image onto the host, then set imagePullPolicy: IfNotPresent in the yaml, but neither helped.
In fact, when we try to start a container on the virtual cluster, the image is not looked up in the host's local images but inside the virtual node, that is, the docker container. By default these node containers have no access to the external network.
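This can be confirmed by listing the images inside a node container: kind node images ship with crictl, and pod images live in the node's containerd rather than in the host docker daemon (the node name below comes from the cluster created above):
# images are stored in the node container's containerd, not the host docker daemon
docker exec -it ten-control-plane crictl images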
There are two solutions:
- use kind's load command to import the image into the cluster
- run a local docker registry and make it available to the kind cluster
Both approaches are demonstrated below.
Solution 1
root@unknown:/home/zourui# kind load docker-image nginx:1.7.9 --name ten
Image: "nginx:1.7.9" with ID "sha256:84581e99d807a703c9c03bd1a31cd9621815155ac72a7365fd02311264512656" not yet present on node "ten-worker7", loading...
Image: "nginx:1.7.9" with ID "sha256:84581e99d807a703c9c03bd1a31cd9621815155ac72a7365fd02311264512656" not yet present on node "ten-worker", loading...
Image: "nginx:1.7.9" with ID "sha256:84581e99d807a703c9c03bd1a31cd9621815155ac72a7365fd02311264512656" not yet present on node "ten-worker3", loading...
Image: "nginx:1.7.9" with ID "sha256:84581e99d807a703c9c03bd1a31cd9621815155ac72a7365fd02311264512656" not yet present on node "ten-control-plane", loading...
Image: "nginx:1.7.9" with ID "sha256:84581e99d807a703c9c03bd1a31cd9621815155ac72a7365fd02311264512656" not yet present on node "ten-worker5", loading...
Image: "nginx:1.7.9" with ID "sha256:84581e99d807a703c9c03bd1a31cd9621815155ac72a7365fd02311264512656" not yet present on node "ten-worker8", loading...
Image: "nginx:1.7.9" with ID "sha256:84581e99d807a703c9c03bd1a31cd9621815155ac72a7365fd02311264512656" not yet present on node "ten-worker2", loading...
Image: "nginx:1.7.9" with ID "sha256:84581e99d807a703c9c03bd1a31cd9621815155ac72a7365fd02311264512656" not yet present on node "ten-worker9", loading...
Image: "nginx:1.7.9" with ID "sha256:84581e99d807a703c9c03bd1a31cd9621815155ac72a7365fd02311264512656" not yet present on node "ten-worker6", loading...
Image: "nginx:1.7.9" with ID "sha256:84581e99d807a703c9c03bd1a31cd9621815155ac72a7365fd02311264512656" not yet present on node "ten-worker10", loading...
Image: "nginx:1.7.9" with ID "sha256:84581e99d807a703c9c03bd1a31cd9621815155ac72a7365fd02311264512656" not yet present on node "ten-worker4", loading...
Solution 2
root@cld-unknown23400:/home/zourui# cat kind-with-registry.sh
#!/bin/sh
set -o errexit

# create registry container unless it already exists
reg_name='kind-registry'
reg_port='5000'
running="$(docker inspect -f '{{.State.Running}}' "${reg_name}" 2>/dev/null || true)"
if [ "${running}" != 'true' ]; then
  docker run \
    -d --restart=always -p "127.0.0.1:${reg_port}:5000" --name "${reg_name}" \
    registry:2
fi

# create a cluster with the local registry enabled in containerd
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:${reg_port}"]
    endpoint = ["http://${reg_name}:${reg_port}"]
EOF

# connect the registry container to the cluster network so the nodes can reach it
docker network connect "kind" "${reg_name}" || true
After running the script, there is no longer any need to manually load local images into the kind cluster; instead, push the image to the localhost:5000 registry:
docker pull gcr.io/google-samples/hello-app:1.0
docker tag gcr.io/google-samples/hello-app:1.0 localhost:5000/hello-app:1.0
docker push localhost:5000/hello-app:1.0
kubectl create deployment hello-server --image=localhost:5000/hello-app:1.0
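The pushed image can be confirmed through the registry's catalog endpoint, which registry:2 serves as part of the Docker Registry HTTP API:
# the local registry should now list the pushed repository
curl -s http://localhost:5000/v2/_catalog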