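Fixing pods that cannot ping each other across nodes in a Kubernetes cluster built with the flannel CNI on Vagrant

This problem took me two days to solve. I searched through a lot of material online without success, and in the end found the answer in flannel's official documentation. It was a good reminder of how much it pays to check the official docs first.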

Environment

Vagrant + VirtualBox virtual machines

1 master + 2 worker nodes

CentOS 7.8.2003

Kubernetes 1.26

flannel 0.22.0

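Symptoms:

A node can ping the pods running on it, and pods on the same node can ping each other.

Pods on different nodes cannot ping each other, and a node cannot ping pods on another node either.

Troubleshooting:

Check the routing tables: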

[vagrant@k8s-master ~]$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         gateway         0.0.0.0         UG    100    0        0 eth0
10.0.2.0        0.0.0.0         255.255.255.0   U     100    0        0 eth0
10.244.0.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0
10.244.1.0      10.244.1.0      255.255.255.0   UG    0      0        0 flannel.1
10.244.2.0      10.244.2.0      255.255.255.0   UG    0      0        0 flannel.1
192.168.111.0   0.0.0.0         255.255.255.0   U     101    0        0 eth1

[vagrant@k8s-work01 ~]$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         gateway         0.0.0.0         UG    100    0        0 eth0
10.0.2.0        0.0.0.0         255.255.255.0   U     100    0        0 eth0
10.244.0.0      10.244.0.0      255.255.255.0   UG    0      0        0 flannel.1
10.244.1.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0
10.244.2.0      10.244.2.0      255.255.255.0   UG    0      0        0 flannel.1
192.168.111.0   0.0.0.0         255.255.255.0   U     101    0        0 eth1

[vagrant@k8s-work02 ~]$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         gateway         0.0.0.0         UG    100    0        0 eth0
10.0.2.0        0.0.0.0         255.255.255.0   U     100    0        0 eth0
10.244.0.0      10.244.0.0      255.255.255.0   UG    0      0        0 flannel.1
10.244.1.0      10.244.1.0      255.255.255.0   UG    0      0        0 flannel.1
10.244.2.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0
192.168.111.0   0.0.0.0         255.255.255.0   U     101    0        0 eth1

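The routes look complete: on every node the local pod subnet goes through cni0 and the other two pod subnets go through flannel.1.

Check the IP addresses: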

[vagrant@k8s-master ~]$ ip a
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:4d:77:d3 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic eth0
       valid_lft 69366sec preferred_lft 69366sec
    inet6 fe80::5054:ff:fe4d:77d3/64 scope link
       valid_lft forever preferred_lft forever
3: eth1:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:e3:7f:f5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.111.11/24 brd 192.168.111.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fee3:7ff5/64 scope link
       valid_lft forever preferred_lft forever
4: flannel.1:  mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 12:5f:90:dd:6f:6b brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::105f:90ff:fedd:6f6b/64 scope link
       valid_lft forever preferred_lft forever
5: cni0:  mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 92:e6:95:bc:58:de brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.1/24 brd 10.244.0.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::90e6:95ff:febc:58de/64 scope link
       valid_lft forever preferred_lft forever

[root@k8s-work01 ~]# ip a
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:4d:77:d3 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic eth0
       valid_lft 68465sec preferred_lft 68465sec
    inet6 fe80::5054:ff:fe4d:77d3/64 scope link
       valid_lft forever preferred_lft forever
3: eth1:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:b7:04:43 brd ff:ff:ff:ff:ff:ff
    inet 192.168.111.12/24 brd 192.168.111.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:feb7:443/64 scope link
       valid_lft forever preferred_lft forever
4: flannel.1:  mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 22:8c:46:c5:45:8b brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::208c:46ff:fec5:458b/64 scope link
       valid_lft forever preferred_lft forever
5: cni0:  mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 2e:91:a8:b7:74:fc brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.1/24 brd 10.244.1.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::2c91:a8ff:feb7:74fc/64 scope link
       valid_lft forever preferred_lft forever

ip a on k8s-work02:
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:4d:77:d3 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic eth0
       valid_lft 81235sec preferred_lft 81235sec
    inet6 fe80::5054:ff:fe4d:77d3/64 scope link
       valid_lft forever preferred_lft forever
3: eth1:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:a8:13:55 brd ff:ff:ff:ff:ff:ff
    inet 192.168.111.13/24 brd 192.168.111.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fea8:1355/64 scope link
       valid_lft forever preferred_lft forever
4: flannel.1:  mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether fa:89:d5:3c:c6:50 brd ff:ff:ff:ff:ff:ff
    inet 10.244.2.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::f889:d5ff:fe3c:c650/64 scope link
       valid_lft forever preferred_lft forever
5: cni0:  mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 36:e8:c7:3d:31:28 brd ff:ff:ff:ff:ff:ff
    inet 10.244.2.1/24 brd 10.244.2.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::34e8:c7ff:fe3d:3128/64 scope link
       valid_lft forever preferred_lft forever

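The flannel.1 and cni0 interfaces are also present on every node, as expected.

Check the pods (kgp is a shell alias for kubectl get pods -o wide):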

[vagrant@k8s-master ~]$ kgp -A
NAMESPACE      NAME                                 READY   STATUS    RESTARTS   AGE     IP               NODE         NOMINATED NODE   READINESS GATES
...
kube-flannel   kube-flannel-ds-cw6hh                1/1     Running   0          115m    192.168.111.11   k8s-master   <none>           <none>
kube-flannel   kube-flannel-ds-g68xs                1/1     Running   0          115m    192.168.111.13   k8s-work02   <none>           <none>
kube-flannel   kube-flannel-ds-pmmgd                1/1     Running   0          115m    192.168.111.12   k8s-work01   <none>           <none>
kube-system    coredns-7645794859-8st6v             1/1     Running   0          8h      10.244.1.27      k8s-work01   <none>           <none>
kube-system    coredns-7645794859-t7d8t             1/1     Running   0          8h      10.244.0.6       k8s-master   <none>           <none>
kube-system    etcd-k8s-master                      1/1     Running   1          2d15h   192.168.111.11   k8s-master   <none>           <none>
...

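The flannel pods are also running normally on all three nodes.

Solution:

I tried reinstalling flannel and resetting the nodes, and neither helped. Finally, while reading the official flannel documentation, I found the following: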

Vagrant

Vagrant typically assigns two interfaces to all VMs. The first, for which all hosts are assigned the IP address 10.0.2.15, is for external traffic that gets NATed.

This may lead to problems with flannel. By default, flannel selects the first interface on a host. This leads to all hosts thinking they have the same public IP address. To prevent this issue, pass the --iface=eth1 flag to flannel so that the second interface is chosen.

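Following advice I had found online, I had originally added the --iface=eth0 argument to kube-flannel.yaml, but it had no effect, because eth0 has the IP address 10.0.2.15 on every node. I had overlooked this when inspecting the IP addresses earlier.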
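
A clash like this is easy to miss when reading ip a output one node at a time. Below is a minimal Python sketch of an automated check. It assumes the ip a text from each node has already been collected (for example over vagrant ssh), and the helper names ipv4_by_interface and shared_addresses are hypothetical, introduced only for illustration. It reports every non-loopback IPv4 address that appears on more than one node.

```python
import re
from collections import defaultdict

# Interface header line, e.g. "2: eth0: <...> mtu 1500 ..."
IFACE_RE = re.compile(r"^\d+:\s+([^:@\s]+)")
# IPv4 address line, e.g. "    inet 10.0.2.15/24 brd ..." (does not match "inet6")
INET_RE = re.compile(r"^\s+inet\s+(\d+\.\d+\.\d+\.\d+)/\d+")


def ipv4_by_interface(ip_a_output):
    """Map interface name -> list of IPv4 addresses found in `ip a` text."""
    result = {}
    current = None
    for line in ip_a_output.splitlines():
        header = IFACE_RE.match(line)
        if header:
            current = header.group(1)
            result[current] = []
            continue
        inet = INET_RE.match(line)
        if inet and current is not None:
            result[current].append(inet.group(1))
    return result


def shared_addresses(outputs_by_node):
    """Return {(interface, ip): [nodes]} for addresses seen on 2+ nodes."""
    seen = defaultdict(list)
    for node, text in outputs_by_node.items():
        for iface, addrs in ipv4_by_interface(text).items():
            if iface == "lo":  # 127.0.0.1 is expected on every node
                continue
            for ip in addrs:
                seen[(iface, ip)].append(node)
    return {key: nodes for key, nodes in seen.items() if len(nodes) > 1}


def _sample(eth1_ip):
    # Abbreviated `ip a` text modelled on the output shown in this post.
    return (
        "1: lo: mtu 65536\n"
        "    inet 127.0.0.1/8 scope host lo\n"
        "2: eth0: mtu 1500\n"
        "    inet 10.0.2.15/24 brd 10.0.2.255 scope global eth0\n"
        "    inet6 fe80::5054:ff:fe4d:77d3/64 scope link\n"
        "3: eth1: mtu 1500\n"
        f"    inet {eth1_ip}/24 brd 192.168.111.255 scope global eth1\n"
    )


SAMPLE = {
    "k8s-master": _sample("192.168.111.11"),
    "k8s-work01": _sample("192.168.111.12"),
    "k8s-work02": _sample("192.168.111.13"),
}

print(shared_addresses(SAMPLE))
# {('eth0', '10.0.2.15'): ['k8s-master', 'k8s-work01', 'k8s-work02']}
```

Any interface that shows up in this result cannot be used to tell the nodes apart, so it is a poor choice for --iface.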

In kube-flannel.yaml, change the argument to --iface=eth1, as shown here:

containers:
- args:
  - --ip-masq
  - --kube-subnet-mgr
  - --iface=eth1
  command:
  - /opt/bin/flanneld
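
If the manifest is patched by a provisioning script instead of by hand, the change comes down to replacing any existing --iface argument in the args list. Here is a small Python sketch of that step. The helper set_iface is hypothetical (it is not part of flannel), and it only handles the --iface=NAME form used in this post.

```python
def set_iface(args, iface):
    """Return a copy of the flanneld args with a single --iface=<iface> entry."""
    # Drop any earlier --iface=... entry (e.g. the ineffective --iface=eth0).
    kept = [arg for arg in args if not arg.startswith("--iface=")]
    return kept + [f"--iface={iface}"]


old_args = ["--ip-masq", "--kube-subnet-mgr", "--iface=eth0"]
print(set_iface(old_args, "eth1"))
# ['--ip-masq', '--kube-subnet-mgr', '--iface=eth1']
```

The function returns a new list and leaves the input untouched, so it can be applied safely to args loaded from the YAML file before writing the result back.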

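After reinstalling flannel with this change, pods on different nodes could ping each other and the problem was solved.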
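
One way to confirm that the change took effect is to check which address flannel has recorded for each node. As far as I know, when flannel runs with --kube-subnet-mgr it stores that address in the node annotation flannel.alpha.coreos.com/public-ip. The Python sketch below assumes this, and takes the JSON printed by kubectl get nodes -o json as input. The function names public_ips and duplicated are hypothetical, and the two sample documents are illustrative, not captured from the cluster.

```python
import json
from collections import defaultdict

# Assumption: annotation written by flannel's kube subnet manager.
PUBLIC_IP_KEY = "flannel.alpha.coreos.com/public-ip"


def public_ips(nodes_json):
    """Map node name -> flannel public-ip annotation (None if missing)."""
    doc = json.loads(nodes_json)
    result = {}
    for item in doc.get("items", []):
        meta = item.get("metadata", {})
        result[meta.get("name")] = meta.get("annotations", {}).get(PUBLIC_IP_KEY)
    return result


def duplicated(ips_by_node):
    """Return {ip: [nodes]} for every address claimed by more than one node."""
    by_ip = defaultdict(list)
    for node, ip in ips_by_node.items():
        if ip is not None:
            by_ip[ip].append(node)
    return {ip: nodes for ip, nodes in by_ip.items() if len(nodes) > 1}


def _nodes_json(ips_by_node):
    # Build a minimal stand-in for `kubectl get nodes -o json` output.
    items = [
        {"metadata": {"name": name, "annotations": {PUBLIC_IP_KEY: ip}}}
        for name, ip in ips_by_node.items()
    ]
    return json.dumps({"items": items})


# Illustrative data: every node on the NAT address vs. one address per node.
BEFORE = _nodes_json({
    "k8s-master": "10.0.2.15",
    "k8s-work01": "10.0.2.15",
    "k8s-work02": "10.0.2.15",
})
AFTER = _nodes_json({
    "k8s-master": "192.168.111.11",
    "k8s-work01": "192.168.111.12",
    "k8s-work02": "192.168.111.13",
})

print(duplicated(public_ips(BEFORE)))
print(duplicated(public_ips(AFTER)))
```

With the sample data, the first call reports 10.0.2.15 as shared by all three nodes and the second returns an empty dict, which is the state to expect once flannel is bound to eth1.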
