K8S Notes - Changing the protocol and address the kubelet uses on port 10250

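1. Problem description

While deploying a k8s cluster with kubeadm, some step went wrong (I am not sure which), and the kubelet ended up listening on port 10250 with the wrong protocol and address, as shown below: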

# netstat -ntpl | grep 10250
tcp        0      0 127.0.0.1:10250         0.0.0.0:*               LISTEN      52577/kubelet

The kubelet service status also shows it bound to 127.0.0.1 (note the --address flag on the process command line):

# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf, 20-etcd-service-manager.conf
   Active: active (running) since Tue 2022-10-04 15:04:47 CST; 4 days ago
     Docs: https://kubernetes.io/docs/
 Main PID: 52577 (kubelet)
    Tasks: 16
   Memory: 51.4M
   CGroup: /system.slice/kubelet.service
           └─52577 /usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd --network-plugin=cni --pod-infra-cont...

Oct 05 08:46:43 k8s-slave2 kubelet[52577]: I1005 08:46:43.352978   52577 topology_manager.go:200] "Topology Admit Handler"
Oct 05 08:46:43 k8s-slave2 kubelet[52577]: I1005 08:46:43.514029   52577 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
Oct 05 08:46:44 k8s-slave2 kubelet[52577]: map[string]interface {}{"cniVersion":"0.3.1", "hairpinMode":true, "ipMasq":false, "ipam":map[string]interface {}{"rang...
Oct 05 11:13:17 k8s-slave2 kubelet[52577]: {"cniVersion":"0.3.1","hairpinMode":true,"ipMasq":false,"ipam":{"ranges":[[{"subnet":"10.244.1.0/24"}]],"routes":[{"ds...
Oct 05 11:13:17 k8s-slave2 kubelet[52577]: I1005 11:13:17.399294   52577 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
Oct 05 11:13:17 k8s-slave2 kubelet[52577]: I1005 11:13:17.979964   52577 pod_container_deletor.go:79] "Container not found in pod's containers" contain...d0cda5a59"
Oct 05 11:13:18 k8s-slave2 kubelet[52577]: map[string]interface {}{"cniVersion":"0.3.1", "hairpinMode":true, "ipMasq":false, "ipam":map[string]interface {}{"rang...
Oct 07 09:51:53 k8s-slave2 kubelet[52577]: {"cniVersion":"0.3.1","hairpinMode":true,"ipMasq":false,"ipam":{"ranges":[[{"subnet":"10.244.1.0/24"}]],"rou...go:187] fa
Oct 07 09:51:56 k8s-slave2 kubelet[52577]: E1007 09:51:56.391455   52577 kubelet_node_status.go:460] "Error updating node status, will retry" err="error getting ...
Oct 07 09:51:57 k8s-slave2 kubelet[52577]: E1007 09:51:57.185843   52577 controller.go:187] failed to update lease, error: Operation cannot be fulfille... try again
Hint: Some lines were ellipsized, use -l to show in full.

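On a correctly deployed cluster, the kubelet's port 10250 should instead be:

  • listed as tcp6 rather than tcp
  • bound to :: (the wildcard address) rather than 127.0.0.1

On Linux a tcp6 socket bound to :: normally accepts IPv4 connections as well, so this one entry covers every local address. For example: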

tcp6       0      0 :::10250                :::*                    LISTEN      3272/kubelet
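
The comparison above can be turned into a quick check. The following is a minimal sketch, not part of the original troubleshooting: the function name classify_10250 and its verdict labels are made up for illustration. It reads `netstat -ntpl` output on stdin and reports how the listener on port 10250 is bound.

```shell
# Hypothetical helper: classify the bind address of whatever listens on 10250.
# Reads `netstat -ntpl` lines on stdin; column 4 is the local address.
classify_10250() {
    awk '$4 ~ /:10250$/ {
        addr = $4
        sub(/:10250$/, "", addr)      # strip the port, keep the bind address
        if (addr == "127.0.0.1" || addr == "::1")
            print addr " loopback-only"
        else if (addr == "0.0.0.0" || addr == "::")
            print addr " all-interfaces"
        else
            print addr " single-address"
    }'
}
```

Typical use would be `netstat -ntpl | classify_10250`; a verdict of loopback-only corresponds to the broken state shown in the problem description.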

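2. About kubelet port 10250

A quick word on what port 10250 is for:
Port 10250 is where the kubelet serves its HTTPS API, and it is the port the apiserver uses to reach the kubelet on each node. The kubelet learns which pods it should run by connecting out to the apiserver, not through 10250; traffic on 10250 flows the other way, with the apiserver calling the kubelet to read node and pod state. Viewing pod logs and running commands in a pod with kubectl (kubectl logs, kubectl exec) both go through the kubelet's port 10250.

If the kubelet is not serving port 10250 correctly, fetching logs fails like this: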

# kubectl logs kube-flannel-ds-9tfc8 -n kube-system
Error from server: Get "https://192.168.100.22:10250/containerLogs/kube-system/kube-flannel-ds-9tfc8/kube-flannel": dial tcp 192.168.100.22:10250: connect: connection refused

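As the error shows, the request is dialed to the node's own IP (192.168.100.22), so a kubelet that listens on port 10250 only on 127.0.0.1 cannot serve it.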
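
When several nodes are involved, it helps to pull the dialed address out of the error so you know which node and port to inspect. This is a small sketch that assumes the message has the same shape as the one above; the function name dial_target is hypothetical.

```shell
# Hypothetical helper: extract the host:port that was refused from a
# kubectl "connection refused" error read on stdin.
dial_target() {
    sed -n 's/.*dial tcp \([^ ]*\): connect: connection refused.*/\1/p'
}
```

For example, `kubectl logs kube-flannel-ds-9tfc8 -n kube-system 2>&1 | dial_target` would print the node address and port to check with netstat on that node.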

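3. Fixing how port 10250 is served

How do we get port 10250 back to its normal state?
The idea: go through the kubelet's configuration and find where the IP 127.0.0.1 is set.

The kubelet's configuration may live in one or more of the following files and paths: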

  • /etc/kubernetes/kubelet.conf
  • /var/lib/kubelet/
  • /usr/lib/systemd/system/kubelet.service
  • /usr/lib/systemd/system/kubelet.service.d/

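The simplest way is to run systemctl status kubelet and see which configuration files the service actually loads (the Drop-In lines in its output). That confirmed the relevant kubelet configuration file is: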

/usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
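
Another way to locate the setting is to search the candidate paths for the flag itself. The sketch below assumes the address is set through the --address flag, as it is here; the function name find_kubelet_flag is made up for illustration.

```shell
# Hypothetical helper: list files under the given paths that mention a flag.
# Usage: find_kubelet_flag FLAG PATH...
find_kubelet_flag() {
    flag=$1
    shift
    # "--" stops option parsing so a pattern starting with "--" is accepted.
    grep -rl -- "$flag" "$@" 2>/dev/null
}
```

For this case the call would be `find_kubelet_flag --address= /etc/kubernetes/kubelet.conf /var/lib/kubelet/ /usr/lib/systemd/system/kubelet.service /usr/lib/systemd/system/kubelet.service.d/`. Running `systemctl cat kubelet` is also useful, since it prints the unit file together with every drop-in that is loaded.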

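Its contents are shown below; the --address=127.0.0.1 flag is what binds the kubelet to 127.0.0.1. The file name and contents appear to come from the kubeadm guide for setting up an external etcd cluster, where the drop-in is meant for hosts that only run etcd, not for regular cluster nodes:

# cat /usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
[Service]
ExecStart=
# Replace "systemd" below with the cgroup driver used by your container runtime.
# The kubelet's default is "cgroupfs".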
ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.2
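
Note the empty `ExecStart=` line: in a systemd drop-in it clears the command inherited from earlier files, so the line after it replaces the kubelet command entirely instead of adding to it. To read the bind address out of such a file without eyeballing it, something like the following sketch can help. The function name kubelet_bind_address is hypothetical, and it only looks at the last ExecStart line on the assumption that this is the one that takes effect.

```shell
# Hypothetical helper: print the value of --address from the last
# ExecStart= line of a unit or drop-in file read on stdin.
kubelet_bind_address() {
    awk '/^ExecStart=/ { last = $0 }   # remember the most recent ExecStart line
         END {
             n = split(last, parts, " ")
             for (i = 1; i <= n; i++)
                 if (parts[i] ~ /^--address=/) {
                     sub(/^--address=/, "", parts[i])
                     print parts[i]
                 }
         }'
}
```

With the file above, `kubelet_bind_address < /usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf` prints 127.0.0.1; no output means the flag is not set in that file.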

I first tried changing the address to the host's own IP, 192.168.100.22, restarted the kubelet, and checked port 10250 again:

# netstat -ntpl | grep kubelet
tcp        0      0 127.0.0.1:38362         0.0.0.0:*               LISTEN      16424/kubelet       
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN      16424/kubelet       
tcp        0      0 192.168.100.22:10250    0.0.0.0:*               LISTEN      16424/kubelet

This is still not right: the port is listed as tcp rather than tcp6, and 10250 is no longer served on 127.0.0.1, which needs it too.
The fix that finally worked was to disable the drop-in /usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf altogether by renaming it (systemd only loads drop-ins whose names end in .conf), then reload systemd and restart the kubelet. The commands below are run from inside the drop-in directory; afterwards the port is back to normal:

# mv 20-etcd-service-manager.conf 20-etcd-service-manager.conf.bak
# systemctl daemon-reload
# systemctl restart kubelet
# netstat -ntpl | grep kubelet
tcp        0      0 127.0.0.1:39386         0.0.0.0:*               LISTEN      18890/kubelet       
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN      18890/kubelet       
tcp6       0      0 :::10250                :::*                    LISTEN      18890/kubelet
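
The rename step can be wrapped in a small helper that refuses to act when the file is missing. This is only a sketch: the function name disable_dropin is made up, and the .bak suffix simply mirrors the command used above.

```shell
# Hypothetical helper: disable a systemd drop-in by renaming it to *.bak,
# so it no longer ends in .conf and systemd stops loading it.
disable_dropin() {
    dropin=$1
    if [ ! -f "$dropin" ]; then
        echo "no such drop-in: $dropin" >&2
        return 1
    fi
    mv -- "$dropin" "$dropin.bak"
}
```

It would be used as `disable_dropin /usr/lib/systemd/system/kubelet.service.d/20-etcd-service-manager.conf && systemctl daemon-reload && systemctl restart kubelet`. Renaming the .bak file back and reloading restores the previous behaviour.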

The kubelet service status no longer shows 127.0.0.1 either:

# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Sun 2022-10-09 13:59:45 CST; 35min ago
     Docs: https://kubernetes.io/docs/
 Main PID: 18890 (kubelet)
    Tasks: 15
   Memory: 51.0M
   CGroup: /system.slice/kubelet.service
           └─18890 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubel...

Oct 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143058   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
Oct 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143072   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
Oct 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143086   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
Oct 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143100   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
Oct 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143113   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
Oct 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143129   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
Oct 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143143   18890 reconciler.go:221] "operationExecutor.VerifyControllerAttachedVolume started for volume ...
Oct 09 13:59:47 k8s-slave2 kubelet[18890]: I1009 13:59:47.143152   18890 reconciler.go:157] "Reconciler: start to sync state"
Oct 09 13:59:48 k8s-slave2 kubelet[18890]: I1009 13:59:48.317492   18890 request.go:665] Waited for 1.071720304s due to client-side throttling, not pri...roxy/token
Oct 09 14:27:07 k8s-slave2 kubelet[18890]: I1009 14:27:07.198435   18890 log.go:184] http: superfluous response.WriteHeader call from k8s.io/kubernetes...se.go:220)
Hint: Some lines were ellipsized, use -l to show in full.

Checking the logs of a pod on this node again now works:

# kubectl logs kube-flannel-ds-8mwsd -n kube-system
I1003 11:37:08.049753       1 main.go:207] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[ens33] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
W1003 11:37:08.050009       1 client_config.go:614] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1003 11:37:08.451790       1 kube.go:121] Waiting 10m0s for node controller to sync
I1003 11:37:08.451939       1 kube.go:402] Starting kube subnet manager
I1003 11:37:09.452166       1 kube.go:128] Node controller sync successful
I1003 11:37:09.452199       1 main.go:227] Created subnet manager: Kubernetes Subnet Manager - k8s-slave2
I1003 11:37:09.452206       1 main.go:230] Installing signal handlers
I1003 11:37:09.452354       1 main.go:463] Found network config - Backend type: vxlan
I1003 11:37:09.452652       1 match.go:248] Using interface with name ens33 and address 192.168.100.22
I1003 11:37:09.452676       1 match.go:270] Defaulting external address to interface address (192.168.100.22)
I1003 11:37:09.452733       1 vxlan.go:138] VXLAN config: VNI=1 Port=0 GBP=false Learning=false DirectRouting=false
I1003 11:37:09.472457       1 kube.go:351] Setting NodeNetworkUnavailable
I1003 11:37:09.481646       1 main.go:412] Current network or subnet (10.244.0.0/16, 10.244.2.0/24) is not equal to previous one (0.0.0.0/0, 0.0.0.0/0), trying to recycle old iptables rules
I1003 11:37:09.746603       1 iptables.go:255] Deleting iptables rule: -s 0.0.0.0/0 -d 0.0.0.0/0 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.747961       1 iptables.go:255] Deleting iptables rule: -s 0.0.0.0/0 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:09.748691       1 iptables.go:255] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.749351       1 iptables.go:255] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:09.750317       1 main.go:341] Setting up masking rules
I1003 11:37:09.750945       1 main.go:362] Changing default FORWARD chain policy to ACCEPT
I1003 11:37:09.750995       1 main.go:375] Wrote subnet file to /run/flannel/subnet.env
I1003 11:37:09.751000       1 main.go:379] Running backend.
I1003 11:37:09.845326       1 vxlan_network.go:61] watching for new subnet leases
I1003 11:37:09.846711       1 main.go:400] Waiting for all goroutines to exit
I1003 11:37:09.847083       1 iptables.go:231] Some iptables rules are missing; deleting and recreating rules
I1003 11:37:09.847088       1 iptables.go:255] Deleting iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.943307       1 iptables.go:231] Some iptables rules are missing; deleting and recreating rules
I1003 11:37:09.943324       1 iptables.go:255] Deleting iptables rule: -s 10.244.0.0/16 -m comment --comment flanneld forward -j ACCEPT
I1003 11:37:09.943450       1 iptables.go:255] Deleting iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:09.944459       1 iptables.go:255] Deleting iptables rule: -d 10.244.0.0/16 -m comment --comment flanneld forward -j ACCEPT
I1003 11:37:09.945214       1 iptables.go:255] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.2.0/24 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.945335       1 iptables.go:243] Adding iptables rule: -s 10.244.0.0/16 -m comment --comment flanneld forward -j ACCEPT
I1003 11:37:09.946028       1 iptables.go:255] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:09.947474       1 iptables.go:243] Adding iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -m comment --comment flanneld masq -j RETURN
I1003 11:37:09.948330       1 iptables.go:243] Adding iptables rule: -d 10.244.0.0/16 -m comment --comment flanneld forward -j ACCEPT
I1003 11:37:10.044998       1 iptables.go:243] Adding iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -m comment --comment flanneld masq -j MASQUERADE --random-fully
I1003 11:37:10.047373       1 iptables.go:243] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.2.0/24 -m comment --comment flanneld masq -j RETURN
I1003 11:37:10.049161       1 iptables.go:243] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -m comment --comment flanneld masq -j MASQUERADE --random-fully
