[K8S Part 1] Deploying a Kubernetes Cluster Quickly with kubeadm (Single Master)

This article covers installing and deploying a single-Master cluster. If you need a highly available Master cluster, also refer to "[K8S Part 5] Deploying a Kubernetes Cluster Quickly with kubeadm (Highly Available Masters)".

Contents

Pre-installation configuration

Installing dependencies

Prerequisites for enabling IPVS in kube-proxy (ALL NODES)

Disabling the firewall

Editing /etc/hosts

Installing Docker (ALL NODES)

Installing kubeadm (ALL NODES)

Initializing the master node (MASTER NODE)

Installing the network add-on: Flannel (MASTER NODE)

Joining worker nodes to the cluster (WORKER NODE)

Joining the cluster after the token has expired

Scheduling containers through Kubernetes

Uninstalling a node

FAQ

Pre-installation configuration

Role          IP address
Master node   192.168.11.8
Worker node   192.168.11.10

Installing dependencies

yum -y install iproute-tc conntrack ipset ipvsadm

Prerequisites for enabling IPVS in kube-proxy (ALL NODES)

modprobe -a br_netfilter ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4

Note: on kernel 4.18.0-240 and later there is no nf_conntrack_ipv4; use modprobe nf_conntrack instead.

modprobe nf_conntrack_ipv4
modprobe br_netfilter
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
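
To make these modules load automatically after a reboot, a minimal sketch using systemd's modules-load.d (use nf_conntrack_ipv4 instead of nf_conntrack on older kernels):

cat > /etc/modules-load.d/k8s.conf << EOF
br_netfilter
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF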

Disabling the firewall

If you do not disable the firewall, you will have to add quite a few ports as exceptions, otherwise you will run into plenty of pitfalls to dig yourself out of. For example:

A warning during kubeadm init:

        [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly

systemctl stop firewalld
systemctl disable firewalld
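
If you prefer to keep firewalld running instead, a rough sketch of opening the ports from the warning above plus the ones a single-Master cluster typically needs (the list is not exhaustive, and the NodePort range only matters if you use NodePort Services):

firewall-cmd --permanent --add-port=6443/tcp        # kube-apiserver
firewall-cmd --permanent --add-port=2379-2380/tcp   # etcd (master only)
firewall-cmd --permanent --add-port=10250/tcp       # kubelet
firewall-cmd --permanent --add-port=8472/udp        # flannel vxlan
firewall-cmd --permanent --add-port=30000-32767/tcp # NodePort Services
firewall-cmd --reload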

Editing /etc/hosts

Add the hostname-to-IP mapping of every node to /etc/hosts, otherwise the preflight phase of kubeadm init will warn:

        [WARNING Hostname]: hostname "rhel-8-11.8" could not be reached
        [WARNING Hostname]: hostname "rhel-8-11.8": lookup rhel-8-11.8 on 223.5.5.5:53: no such host

In addition: disable swap, disable SELinux, set the time zone and time synchronization, mount a dedicated partition for Docker, and tune kernel parameters for Kubernetes.

Configure these as appropriate for your environment; a sketch of the typical commands is shown below.
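
A minimal sketch of those steps, assuming RHEL/CentOS 8 with chrony for time synchronization (the time zone is only an example):

# disable swap
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab

# disable SELinux
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config

# time zone and time synchronization
timedatectl set-timezone Asia/Shanghai
systemctl enable --now chronyd

# kernel parameters commonly set for Kubernetes
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system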

Installing Docker (ALL NODES)

yum -y install yum-utils
yum-config-manager     --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum clean all
yum makecache
yum list docker-ce --showduplicates | sort -r

yum list |grep docker-ce
yum -y install docker-ce

systemctl start docker.service  # Docker's default cgroup driver is cgroupfs, while Kubernetes recommends systemd instead. If you do not start Docker at this point, create the /etc/docker directory manually.
vi /etc/docker/daemon.json

{
  "registry-mirrors": ["https://cn7-k8s-min-51.com"],
  "insecure-registries": ["cn7-k8s-min-51.com"], 
  "ipv6": true,
  "fixed-cidr-v6": "fd00:db8:1::/48",
  "experimental": true,
  "ip6tables": true,
  "log-level":"warn",
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ],
  "default-address-pools" : [
    {
          "base" : "172.103.0.0/16",
          "size" : 24
    }
  ]
}
systemctl restart docker.service   # exec-opts and storage-driver changes do not take effect on a reload, so restart
systemctl enable docker.service
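
After restarting Docker you can verify that the cgroup driver switch took effect; it should print systemd:

docker info --format '{{.CgroupDriver}}'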

Installing kubeadm (ALL NODES)

Although the mirror repository targets EL7, it works on EL8 as well.
vi /etc/yum.repos.d/kubernetes.repo

[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
       http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
yum makecache
yum -y install kubelet kubeadm kubectl --disableexcludes=kubernetes 

## Because kubeadm depends on kubectl, kubectl ends up installed on the worker nodes as well; even the command below will pull kubectl onto the worker node as a dependency.

yum -y install kubelet kubeadm --disableexcludes=kubernetes
systemctl enable kubelet
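
If you want to pin a specific release instead of installing the latest (for example to match the v1.19.x outputs later in this article; the exact version string is only illustrative):

yum -y install kubelet-1.19.3 kubeadm-1.19.3 kubectl-1.19.3 --disableexcludes=kubernetes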

Initializing the master node (MASTER NODE)

kubeadm config print init-defaults > kubeadm-init.default.yaml

#  The configuration below targets v1.19.0; for newer versions treat it only as a reference. The configuration of each component can be printed with the following command, here using KubeProxyConfiguration as an example:

kubeadm config print init-defaults --component-configs KubeProxyConfiguration > kubeadm-config.yaml


# Edit the init configuration file

apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: liuyll.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.11.8
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: master-01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.20.0
networking:
  dnsDomain: cluster.local
  podSubnet: 172.254.0.0/16
  serviceSubnet: 10.254.0.0/16
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
iptables:
  masqueradeAll: false
ipvs:
  minSyncPeriod: 0s
  scheduler: "rr"
kind: KubeProxyConfiguration
mode: "ipvs"

kubeadm config images pull --config=kubeadm-init.default.yaml
kubeadm init --config=kubeadm-init.default.yaml |tee kubeadm-init.log

# kubeadm config images pull --config=kubeadm-init.default.yaml (pulling the images ahead of time shortens master initialization; you can also just run the init command below directly)

[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.24.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.24.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.24.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.24.0
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.7
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.3-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.8.6

Note: after installing Docker or containerd, the CRI plugin is disabled by default in /etc/containerd/config.toml. Comment out disabled_plugins = ["cri"] (and restart containerd), otherwise kubeadm will fail during deployment with:

output: E0614 09:35:20.617482   19686 remote_image.go:218] "PullImage from image service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService" image="registry.aliyuncs.com/google_containers/kube-apiserver:v1.24.0"
time="2022-06-14T09:35:20+08:00" level=fatal msg="pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService"

# kubeadm init --config=kubeadm-init.default.yaml |tee kubeadm-init.log

W1110 16:12:18.114927   84873 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.19.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local rhel-8-11.8] and IPs [10.254.0.1 192.168.11.8]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost rhel-8-11.8] and IPs [192.168.11.8 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost rhel-8-11.8] and IPs [192.168.11.8 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 17.003026 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.19" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node rhel-8-11.8 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node rhel-8-11.8 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.11.8:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:c6f619461d7aaa470249e11520888cbc22110bca8ed8b2bb0f2b626164d685d4 

At this point:
kubectl get cs shows scheduler and controller-manager as Unhealthy   --- --- --port is disabled
kubectl get node shows the node as NotReady                          --- --- the network add-on is not installed yet
kubectl get po shows the coredns pods as Pending                     --- --- the network add-on is not installed yet

Fix: the scheduler and controller-manager disable --port (the insecure port) by default. Comment out the flag in their static Pod manifests and run kubectl get cs again; both report Healthy. You can also leave this alone, it does not affect normal operation. (See the sketch below.)
#    - --port=0
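
A sketch of the edit, done in both static Pod manifests; the kubelet notices the changed files and recreates the Pods automatically:

vi /etc/kubernetes/manifests/kube-scheduler.yaml          # comment out the line "- --port=0"
vi /etc/kubernetes/manifests/kube-controller-manager.yaml # comment out the line "- --port=0"
kubectl get cs                                            # after the Pods restart, both should report Healthy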

kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok                  
controller-manager   Healthy   ok                  
etcd-0               Healthy   {"health":"true"}   

kubectl get node
NAME             STATUS   ROLES    AGE   VERSION
kubeadmin-node   Ready    <none>   51m   v1.19.3
master-01        Ready    master   60m   v1.19.3

Installing the network add-on: Flannel (MASTER NODE)

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Edit:
  net-conf.json: |
    {
      "Network": "172.254.0.0/16",
      "Backend": {
        "Type": "vxlan",
        "Directrouting": true
      }
    }
kubectl apply -f kube-flannel.yml

kubectl get po -o wide -n kube-system
NAME                                  READY   STATUS    RESTARTS   AGE   IP              NODE             NOMINATED NODE   READINESS GATES
coredns-6d56c8448f-7lsmz              1/1     Running   0          60m   172.254.0.3     rhel-8-11.8      <none>           <none>
coredns-6d56c8448f-fk2pz              1/1     Running   0          60m   172.254.0.2     rhel-8-11.8      <none>           <none>
etcd-rhel-8-11.8                      1/1     Running   0          61m   192.168.11.8    rhel-8-11.8      <none>           <none>
kube-apiserver-rhel-8-11.8            1/1     Running   0          61m   192.168.11.8    rhel-8-11.8      <none>           <none>
kube-controller-manager-rhel-8-11.8   1/1     Running   1          56m   192.168.11.8    rhel-8-11.8      <none>           <none>
kube-flannel-ds-6zrvq                 1/1     Running   0          52m   192.168.11.10   kubeadmin-node   <none>           <none>
kube-flannel-ds-jvskr                 1/1     Running   0          54m   192.168.11.8    rhel-8-11.8      <none>           <none>
kube-proxy-f994g                      1/1     Running   0          60m   192.168.11.8    rhel-8-11.8      <none>           <none>
kube-proxy-qwsgl                      1/1     Running   0          52m   192.168.11.10   kubeadmin-node   <none>           <none>
kube-scheduler-rhel-8-11.8            1/1     Running   0          56m   192.168.11.8    rhel-8-11.8      <none>           <none>

Joining worker nodes to the cluster (WORKER NODE)

On each worker node, complete the steps up through "Installing kubeadm (ALL NODES)", then run the steps below to join the Kubernetes cluster.

# kubeadm config print join-defaults  > kubeadm-join.default.yaml

Either append the name kube-apiserver to the master's IP entry in /etc/hosts, or change apiServerEndpoint here to the master's IP address or hostname.

# cat kubeadm-join.default.yaml 
apiVersion: kubeadm.k8s.io/v1beta2
caCertPath: /etc/kubernetes/pki/ca.crt
discovery:
  bootstrapToken:
    apiServerEndpoint: kube-apiserver:6443
    token: abcdef.0123456789abcdef
    unsafeSkipCAVerification: true
  timeout: 5m0s
  tlsBootstrapToken: abcdef.0123456789abcdef
kind: JoinConfiguration
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-02
  taints: null 

# kubeadm join --config kubeadm-join.default.yaml |tee kubeadm-join.log 
or run it directly with command-line arguments:
# kubeadm join 192.168.35.8:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:2c5ea47a17346ee5265ccdf6a5c7d3df8916c0e2ca94df3836935ea17d8aef4a

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Joining the cluster after the token has expired

Once the token has expired, kubeadm join can no longer add a node to the Kubernetes cluster. Run the following command to check whether the token has expired:
# kubectl get configmap cluster-info --namespace=kube-public -o yaml
apiVersion: v1
data:
  jws-kubeconfig-gdtrlm: eyJhbGciOiJIUzI1NiIsImtpZCI6ImdkdHJsbSJ9..IK_YSH-y_oSd1jdW8kPOl1Pe-5AQeJ7aTptqDEjWNq0
  kubeconfig: |
      apiVersion: v1
If the token has expired, the output above no longer contains the jws-kubeconfig-<token-id> line, and # kubeadm token list returns nothing as well.

Create a new token: kubeadm token create --print-join-command --ttl 0
--print-join-command prints the full kubeadm join command; running it joins the node to the cluster.
--ttl 0              makes the generated token never expire; if omitted, the TTL defaults to 24 hours.
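
If you also need the --discovery-token-ca-cert-hash value (for example when not using --print-join-command), it can be recomputed on the master from the cluster CA certificate:

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'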

Scheduling containers through Kubernetes

 kubectl apply -f mysql-rc.yaml 
 kubectl apply -f mysql-svc.yaml 
 kubectl apply -f tomcat-deploy.yaml 
 kubectl apply -f tomcat-svc.yaml 

kubectl get po 
NAME                    READY   STATUS    RESTARTS   AGE
mysql-zq2l5             1/1     Running   0          45m
myweb-594787496-jjp2l   1/1     Running   0          43m
myweb-594787496-xhr72   1/1     Running   0          43m


YAML configuration files:

# mysql-rc.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: mysql
spec:
  replicas: 1
  selector:
    app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:5.6.43
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "123456"

# mysql-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  selector:
    app: mysql
  ports:
    - port: 3306

# tomcat-deploy.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myweb
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myweb
  template:
    metadata:
      labels:
        app: myweb
    spec:
      containers:
      - name: myweb
        image: kubeguide/tomcat-app:v1
        ports:
        - containerPort: 8080
        env:
        - name: MYSQL_SERVICE_HOST
          value: 'mysql'
        volumeMounts:
        - name: shared
          mountPath: /mydata-data
      volumes:
      - name: shared
        hostPath:
          path: "/tmp/"

# tomcat-svc.yaml 
apiVersion: v1
kind: Service
metadata:
  name: myweb
spec:
  type: NodePort
  ports:
    - port: 8080
      nodePort: 30080
  selector:
    app: myweb
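
Once the myweb Service is up, the Tomcat application should be reachable on the NodePort of any node, for example (using the worker node IP from the table at the top of this article):

curl -I http://192.168.11.10:30080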

Uninstalling a node

kubeadm reset && rm -rf /etc/cni/net.d && ipvsadm --clear && rm -rf $HOME/.kube && rm -rf /etc/kubernetes/*
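
Before resetting a worker node, it is usually cleaner to evict its Pods and remove it from the cluster first, then run the reset above on the node itself (a sketch; replace kubeadmin-node with the actual node name):

kubectl drain kubeadmin-node --ignore-daemonsets
kubectl delete node kubeadmin-node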

FAQ

Issue: starting the api-server with --anonymous-auth=false causes it to restart repeatedly:

    Liveness:     http-get https://192.168.11.8:6443/livez delay=10s timeout=15s period=10s #success=1 #failure=8
    Readiness:    http-get https://192.168.11.8:6443/readyz delay=0s timeout=15s period=1s #success=1 #failure=3
    Startup:      http-get https://192.168.11.8:6443/livez delay=10s timeout=15s period=10s #success=1 #failure=24

Fix: (unresolved)

Issue: the kubelet service log on a worker node keeps printing the error:
 kubelet[623018]: E1231 18:05:30.083549  623018 file_linux.go:60] Unable to read config path "/etc/kubernetes/manifests": path does not exist, ignoring

Fix: mkdir -p /etc/kubernetes/manifests

Issue: when initializing a node or pre-pulling images (kubeadm config images pull --config=kubeadm-config.new.yaml), the command reports the warning: error unmarshaling JSON: while decoding JSON: json: unknown field SupportIPVSProxyMode
Fix: this comes from a pasted-in KubeProxyConfiguration section; even though the YAML validates, the error keeps appearing. Generate the kubeadm init configuration with --component-configs KubeProxyConfiguration and then trim the definitions in that file:
kubeadm config print init-defaults --component-configs KubeProxyConfiguration > kubeadm-kubeproxy.yaml

Issue: kubeadm join hangs at [preflight] Running pre-flight checks; after adding -v=5, the error becomes visible: [discovery] The cluster-info ConfigMap does not yet contain a JWS signature for token ID "liuyll", will try again.
Fix: the error above is caused by an expired token. The default token TTL is 24 hours; if nodes still need to join the Kubernetes cluster after that, create a new token: # kubeadm token create --print-join-command

Issue: kubectl get cs shows the scheduler and controller-manager components as Unhealthy
$ kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused   
controller-manager   Unhealthy   Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused   
etcd-2               Healthy     {"health":"true"}                                                                             
etcd-1               Healthy     {"health":"true"}                                                                             
etcd-0               Healthy     {"health":"true"} 
Fix: 1. No action is needed; it does not affect normal operation. 2. If you do want to fix it, go to /etc/kubernetes/manifests/, edit the YAML files of the affected components, and comment out --port=0.

Issue: kubectl get node shows a node as NotReady.
On the affected node the kubelet reports [failed to find plugin "flannel" in path [/opt/cni/bin] failed to find plugin "portmap" in path [/opt/cni/bin]], and /opt/ turns out to be completely empty. All files under /opt/cni come from the kubernetes-cni package (check with rpm -ql kubernetes-cni), yet rpm -q kubernetes-cni shows the package is still installed: strangely, the package is there but its files are gone.
How to fix it? Reinstalling kubernetes-cni fails with:
Error: 
 Problem: problem with installed package kubelet-1.20.4-0.x86_64
  - cannot install both kubelet-1.18.4-0.x86_64 and kubelet-1.20.4-0.x86_64
  - cannot install both kubelet-1.20.4-0.x86_64 and kubelet-1.18.4-0.x86_64
Short of uninstalling and reinstalling kubelet and the rest entirely, the cost is too high.
Fix: just copy /opt/cni/* from another healthy node to the affected one (a sketch follows).
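
A rough sketch, assuming the healthy node is the master at 192.168.11.8 and SSH access is available:

scp -r root@192.168.11.8:/opt/cni /opt/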

If Docker is not installed beforehand, running "kubeadm config print init-defaults" prints the warning:
W0511 10:54:30.386986    5061 kubelet.go:210] cannot automatically set CgroupDriver when starting the Kubelet: cannot execute 'docker info -f {{.CgroupDriver}}': executable file not found in $PATH 

If kubeadm config images pull --config=kubeadm-init-default.yaml reports the warning below, it is caused by a bad whitespace character (a non-breaking space) in the appended KubeProxyConfiguration; delete the whitespace and retype it:
W0511 11:16:06.091946    5715 strict.go:54] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeproxy.config.k8s.io", Version:"v1alpha1", Kind:"KubeProxyConfiguration"}: error unmarshaling JSON: while decoding JSON: json: unknown field "\u00a0 masqueradeAll" 

Issue: running crictl images reports errors
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead. 
ERRO[0000] unable to determine image API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory
E0705 15:23:06.260611   13731 remote_image.go:121] "ListImages with filter from image service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService" filter="&ImageFilter{Image:&ImageSpec{Image:,Annotations:map[string]string{},},}"
FATA[0000] listing images: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService 

Fix:

Edit the crictl configuration file; after saving it, run the crictl command again and the problem is gone.

vi /etc/crictl.yaml 
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
timeout: 10
debug: false

Issue: kubeadm config images pull --config=kubeadm-init.default.yaml reports an invalid IPv6 subnet mask
podSubnet: Invalid value: "fa00::/112": the size of pod subnet with mask 112 is smaller than the size of node subnet with mask 64
To see the stack trace of this error execute with --v=5 or higher

Fix:

Add a controllerManager section under kind: ClusterConfiguration to customize the IPv4 and IPv6 node CIDR mask sizes (the defaults are 24 and 64 respectively):

controllerManager:
  extraArgs:
    "node-cidr-mask-size-ipv4": "24"
    "node-cidr-mask-size-ipv6": "120"

Issue: kubeadm init --config=kubeadm-init.default.yaml hangs at the following stage

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s

[kubelet-check] Initial timeout of 40s passed.

Diagnosis:

The kubeadm init output itself does not reveal much; watching the container runtime log with journalctl -f -u containerd shows:

failed, error" error="failed to get sandbox image \"k8s.gcr.io/pause:3.6\": failed to pull image \"k8s.gcr.io/pause:3.6\": failed to pull and unpack image \"k8s.gcr.io/pause:3.6\": failed to resolve reference \"k8s.gcr.io/pause:3.6\": failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.6\": dial tcp 108.177.125.82:443: i/o timeout"
Jul 05 19:08:30 k8s-testing01-190 containerd[13788]: time="2022-07-05T19:08:30.696324518+08:00" level=info msg="trying next host" error="failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.6\": dial tcp 108.177.125.82:443: i/o timeout" host=k8s.gcr.io
(The kubelet configuration is ignored here; specifying the infra image, e.g. --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.7, has no effect, and kubeadm still uses k8s.gcr.io/pause:3.6 when bringing up the control plane.)

Fix:

ctr -n k8s.io images tag registry.aliyuncs.com/google_containers/pause:3.7 k8s.gcr.io/pause:3.6  
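
A more durable alternative (so the workaround survives image garbage collection) is to point containerd's CRI plugin at a reachable pause image in /etc/containerd/config.toml and restart containerd; the image tag below is only an example:

vi /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"

systemctl restart containerd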

Issue: crictl pull from an insecure registry fails with HTTPS 443 connection refused (same as docker pull)
E0707 10:29:23.733163    3389 remote_image.go:238] "PullImage from image service failed" err="rpc error: code = Unknown desc = failed to pull and unpack image \"192.168.11.101/library/cni:v3.22.2\": failed to resolve reference \"192.168.11.101/library/cni:v3.22.2\": failed to do request: Head \"https://192.168.11.101/v2/library/cni/manifests/v3.22.2\": dial tcp 192.168.11.101:443: connect: connection refused" image="192.168.11.101/library/cni:v3.22.2"
FATA[0000] pulling image: rpc error: code = Unknown desc = failed to pull and unpack image "192.168.11.101/library/cni:v3.22.2": failed to resolve reference "192.168.11.101/library/cni:v3.22.2": failed to do request: Head "https://192.168.11.101/v2/library/cni/manifests/v3.22.2": dial tcp 192.168.11.101:443: connect: connection refused 

Diagnosis:

For docker pull from an insecure registry, we all know configuring daemon.json is enough; but what about crictl? It requires configuring containerd's configuration file /etc/containerd/config.toml instead.

Fix:

(Generate the default configuration with containerd config default > /etc/containerd/config.toml)

vi /etc/containerd/config.toml

root = "/var/lib/containerd"
state = "/run/containerd"
temp = ""
version = 2      
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    [plugins."io.containerd.grpc.v1.cri".registry]
      config_path = ""

      [plugins."io.containerd.grpc.v1.cri".registry.auths]

      [plugins."io.containerd.grpc.v1.cri".registry.configs]

      [plugins."io.containerd.grpc.v1.cri".registry.headers]

      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."192.168.11.101"]
          endpoint = ["http://192.168.11.101"]
