Environment
Three VMs run on a single physical server. Each VM uses the following operating system:
root@master:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 19.10
Release: 19.10
Codename: eoan
One VM acts as the master and the other two as workers:
root@master:~# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready master 112d v1.17.3 192.168.122.20 Ubuntu 19.10 5.3.0-55-generic docker://19.3.2
node1 Ready 112d v1.17.3 192.168.122.21 Ubuntu 19.10 5.3.0-55-generic docker://19.3.2
node2 Ready 112d v1.17.3 192.168.122.22 Ubuntu 19.10 5.3.0-55-generic docker://19.3.2
Installing Calico
wget https://docs.projectcalico.org/manifests/calico.yaml
kubectl apply -f calico.yaml
root@master:~/calico# kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-5b644bc49c-94g6h 1/1 Running 0 82s 10.24.104.2 node2
calico-node-75kns 1/1 Running 0 82s 192.168.122.20 master
calico-node-fh969 1/1 Running 0 82s 192.168.122.22 node2
calico-node-lbbd9 1/1 Running 0 82s 192.168.122.21 node1
coredns-9d85f5447-5s8k9 0/1 Running 3 112d 10.24.219.65 master
coredns-9d85f5447-zbc8m 1/1 Running 2 112d 10.24.219.66 master
etcd-master 1/1 Running 2 112d 192.168.122.20 master
kube-apiserver-master 1/1 Running 2 112d 192.168.122.20 master
kube-controller-manager-master 1/1 Running 2 112d 192.168.122.20 master
kube-proxy-l4wn7 1/1 Running 2 112d 192.168.122.22 node2
kube-proxy-prhcm 1/1 Running 2 112d 192.168.122.21 node1
kube-proxy-psxqt 1/1 Running 2 112d 192.168.122.20 master
kube-scheduler-master 1/1 Running 2 112d 192.168.122.20 master
calicoctl is the Calico command-line client. It can be used to view and modify the Calico configuration.
wget https://github.com/projectcalico/calicoctl/releases/download/v3.5.4/calicoctl -O /usr/bin/calicoctl
chmod +x /usr/bin/calicoctl
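Network modes
Calico supports three network modes, which are selected by editing calico.yaml:
- overlay with IPIP
- overlay with VXLAN
- underlay with BGP
Each mode is configured and verified below, and its traffic flow is analyzed.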
overlay -- ipip
configure
After Calico is installed, IPIP is the default mode.
The nodes peer with each other in a full BGP mesh.
root@master:~/calico# calicoctl node status
Calico process is running.
IPv4 BGP status
+----------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+----------------+-------------------+-------+----------+-------------+
| 192.168.122.21 | node-to-node mesh | up | 17:37:27 | Established |
| 192.168.122.22 | node-to-node mesh | up | 17:37:28 | Established |
+----------------+-------------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
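Enter a calico-node pod and look at the running processes.
- felix programs the direct routes for local pods and manages the interfaces
- bird picks up those pod routes and advertises them to the other nodes over BGP
- confd dynamically regenerates bird's configuration files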
root@master:~/calico# kubectl exec -it calico-node-lbbd9 -n kube-system bash
[root@node1 /]# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 17:37 ? 00:00:00 /usr/local/bin/runsvdir -P /etc/service/enabled
root 44 1 0 17:37 ? 00:00:00 runsv felix
root 45 1 0 17:37 ? 00:00:00 runsv bird6
root 46 1 0 17:37 ? 00:00:00 runsv bird
root 47 1 0 17:37 ? 00:00:00 runsv confd
root 51 47 0 17:37 ? 00:00:00 calico-node -confd
root 148 45 0 17:37 ? 00:00:00 bird6 -R -s /var/run/calico/bird6.ctl -d -c /etc/calico/confd/config/bird6.cfg
root 149 46 0 17:37 ? 00:00:00 bird -R -s /var/run/calico/bird.ctl -d -c /etc/calico/confd/config/bird.cfg
root 163 44 2 17:37 ? 00:00:06 calico-node -felix
root 866 0 0 17:40 pts/0 00:00:00 bash
root 1263 866 0 17:42 pts/0 00:00:00 ps -ef
Each node also gains a new network interface, tunl0, which encapsulates and decapsulates IPIP packets.
11: tunl0@NONE: mtu 1440 qdisc noqueue state UNKNOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
inet 10.24.166.129/32 brd 10.24.166.129 scope global tunl0
valid_lft forever preferred_lft forever
verify
Deploy two pods with the YAML files below to verify network connectivity.
nginx.yaml
1nginx.yaml -- a copy of nginx.yaml with the names changed to nginx1
root@master:~# cat nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      name: nginx
  template:
    metadata:
      labels:
        name: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        imagePullPolicy: Always
---
kind: Service
apiVersion: v1
metadata:
  name: nginx
spec:
  type: ClusterIP
  ports:
  - name: nginx
    port: 3306
    targetPort: 80
    protocol: TCP
  selector:
    name: nginx
root@master:~# kubectl apply -f nginx.yaml
deployment.apps/nginx unchanged
service/nginx unchanged
root@master:~# kubectl apply -f 1nginx.yaml
deployment.apps/nginx1 unchanged
service/nginx1 unchanged
root@master:~# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-677dc4d96-vrbp5 1/1 Running 0 18s 10.24.104.3 node2
nginx1-677dc4d96-8bjvv 1/1 Running 0 21s 10.24.166.130 node1
The two pods are scheduled on different workers.
From inside one pod, the other pod can be pinged:
root@master:~# kubectl exec -it nginx1-677dc4d96-8bjvv bash
root@nginx1-677dc4d96-8bjvv:/# ping 10.24.104.3 -c1
PING 10.24.104.3 (10.24.104.3): 48 data bytes
56 bytes from 10.24.104.3: icmp_seq=0 ttl=62 time=2.369 ms
--- 10.24.104.3 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 2.369/2.369/2.369/0.000 ms
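The two pod addresses belong to different per-node address blocks, which is why the host routing tables shown later contain /26 routes. The following is a minimal sketch of that mapping, not Calico's own code. It assumes a block size of /26 (Calico's default for IPv4 pools, and consistent with the routes in this article), and `block_of` is a hypothetical helper introduced only for illustration. The pool CIDR 10.24.0.0/16 is taken from the IPPool shown later.

```python
import ipaddress

# Pool CIDR as reported by the IPPool resource in this article.
POOL = ipaddress.ip_network("10.24.0.0/16")
# Assumption: blocks are /26, Calico's default IPv4 block size.
BLOCK_PREFIX = 26


def block_of(pod_ip: str) -> ipaddress.IPv4Network:
    """Return the /26 block that contains pod_ip (hypothetical helper)."""
    addr = ipaddress.ip_address(pod_ip)
    if addr not in POOL:
        raise ValueError(f"{pod_ip} is outside pool {POOL}")
    # strict=False masks off the host bits to get the enclosing block.
    return ipaddress.ip_network(f"{pod_ip}/{BLOCK_PREFIX}", strict=False)


if __name__ == "__main__":
    for pod_ip in ("10.24.166.130", "10.24.104.3"):
        print(pod_ip, "->", block_of(pod_ip))
```

The resulting blocks, 10.24.166.128/26 and 10.24.104.0/26, are the prefixes that appear in the `ip r` output on node1.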
traffic flow
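Taking a ping from 10.24.166.130 to 10.24.104.3 as an example:
- The pod's routing table sends the packet to the default gateway 169.254.1.1.
The pod sends an ARP request for the MAC address of 169.254.1.1. The request reaches caliadb5d6cab6f, the host end of the pod's veth pair. Proxy ARP is enabled on this device, so it replies with its own MAC address. (The ARP request and reply can be captured on caliadb5d6cab6f.)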
root@node1:~# cat /proc/sys/net/ipv4/conf/caliadb5d6cab6f/proxy_arp
1
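- After learning the MAC address, the pod sends the ICMP echo request.
- veth_xmit, the transmit function of the eth0 driver, points skb->dev at eth0's peer device caliadb5d6cab6f and then calls netif_rx, so the packet enters the host protocol stack for a route lookup.
The packet can be captured on caliadb5d6cab6f.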
18:17:50.003013 0a:65:aa:2b:ef:d1 > ee:ee:ee:ee:ee:ee, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 64, id 47525, offset 0, flags [DF], proto ICMP (1), length 76)
10.24.166.130 > 10.24.104.3: ICMP echo request, id 7168, seq 0, length 56
- The ICMP request arrives on caliadb5d6cab6f. The host routing table shows that the next hop is 192.168.122.22 (node2's IP) and that the packet must be tunneled through tunl0.
root@node1:~# ip r
default via 192.168.122.1 dev ens3 proto static
10.24.104.0/26 via 192.168.122.22 dev tunl0 proto bird onlink
blackhole 10.24.166.128/26 proto bird
10.24.166.130 dev caliadb5d6cab6f scope link
10.24.219.64/26 via 192.168.122.20 dev tunl0 proto bird onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.21
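When the packet reaches tunl0 it looks as follows. The source and destination IPs are unchanged, and because tunl0 is an IPIP device (link-type RAW) there is no Ethernet header, so no MAC addresses appear.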
root@node1:~# tcpdump -vne -i tunl0
tcpdump: listening on tunl0, link-type RAW (Raw IP), capture size 262144 bytes
18:20:45.293856 ip: (tos 0x0, ttl 63, id 52265, offset 0, flags [DF], proto ICMP (1), length 76)
10.24.166.130 > 10.24.104.3: ICMP echo request, id 7424, seq 0, length 56
18:20:45.294975 ip: (tos 0x0, ttl 63, id 57896, offset 0, flags [none], proto ICMP (1), length 76)
10.24.104.3 > 10.24.166.130: ICMP echo reply, id 7424, seq 0, length 56
After IPIP encapsulation, the host routing table is consulted again with the outer IP, and the packet is sent out through ens3:
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.21
- The encapsulated packet leaves through ens3.
The final ICMP request, wrapped in IPIP, can be captured on ens3:
root@node1:~# tcpdump -vne -i ens3 host 192.168.122.22
tcpdump: listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
18:31:17.809729 52:54:00:74:ac:0d > 52:54:00:f3:3a:90, ethertype IPv4 (0x0800), length 110: (tos 0x0, ttl 63, id 2590, offset 0, flags [DF], proto IPIP (4), length 96)
192.168.122.21 > 192.168.122.22: (tos 0x0, ttl 63, id 61416, offset 0, flags [DF], proto ICMP (1), length 76)
10.24.166.130 > 10.24.104.3: ICMP echo request, id 7680, seq 0, length 56
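- When the encapsulated packet reaches node2, the outer destination IP is local, so node2 accepts the packet and passes it up the protocol stack.
After decapsulation the packet is delivered to tunl0, where the ICMP request can be captured: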
root@node2:~# tcpdump -vne -i tunl0
tcpdump: listening on tunl0, link-type RAW (Raw IP), capture size 262144 bytes
18:38:56.717329 ip: (tos 0x0, ttl 63, id 19824, offset 0, flags [DF], proto ICMP (1), length 76)
10.24.166.130 > 10.24.104.3: ICMP echo request, id 7936, seq 0, length 56
- A further lookup in the host routing table shows that destination 10.24.104.3 is reached through calie935ef337bb:
10.24.104.3 dev calie935ef337bb scope link
- The packet crosses the veth pair and reaches the pod.
- The ICMP reply is handled in the same way in the opposite direction.
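The route lookups in the steps above follow longest-prefix matching, which is why 10.24.166.130 is delivered to its cali interface even though the blackhole route for 10.24.166.128/26 also covers it. The sketch below models this with node1's routes copied from the `ip r` output. It is a simplified illustration, not how the kernel implements its FIB, and `lookup` is a hypothetical helper.

```python
import ipaddress

# node1 routes in IPIP mode, copied from the `ip r` output in this article.
# Each entry is (prefix, what the kernel does with a matching packet).
ROUTES = [
    ("0.0.0.0/0", "via 192.168.122.1 dev ens3"),
    ("10.24.104.0/26", "via 192.168.122.22 dev tunl0"),
    ("10.24.166.128/26", "blackhole"),
    ("10.24.166.130/32", "dev caliadb5d6cab6f"),
    ("10.24.219.64/26", "via 192.168.122.20 dev tunl0"),
    ("172.17.0.0/16", "dev docker0"),
    ("192.168.122.0/24", "dev ens3"),
]


def lookup(dst: str) -> str:
    """Return the most specific matching route for dst (hypothetical helper)."""
    addr = ipaddress.ip_address(dst)
    matches = []
    for prefix, action in ROUTES:
        net = ipaddress.ip_network(prefix)
        if addr in net:
            matches.append((net, action))
    # The default route always matches, so `matches` is never empty.
    net, action = max(matches, key=lambda m: m[0].prefixlen)
    return f"{net} {action}"


if __name__ == "__main__":
    for dst in ("10.24.104.3", "10.24.166.130", "10.24.166.131", "192.168.122.22"):
        print(dst, "->", lookup(dst))
```

An address such as 10.24.166.131, which lies in node1's block but has no pod behind it, matches only the blackhole route and is dropped instead of following the default route.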
overlay -- vxlan
configure
Reference: https://docs.projectcalico.org/getting-started/kubernetes/installation/config-options : Switching from IP-in-IP to VXLAN
Edit calico.yaml:
- Replace environment variable name CALICO_IPV4POOL_IPIP with CALICO_IPV4POOL_VXLAN. Leave the value of the new variable as “Always”.
- Optionally, (to save some resources if you’re running a VXLAN-only cluster) completely disable Calico’s BGP-based networking:
Replace calico_backend: "bird" with calico_backend: "vxlan". This disables BIRD.
Comment out the line - -bird-ready and - -bird-live from the calico/node readiness/liveness check (otherwise disabling BIRD will cause the readiness/liveness check to fail on every node):
livenessProbe:
  exec:
    command:
    - /bin/calico-node
    - -felix-live
    # - -bird-live
readinessProbe:
  exec:
    command:
    - /bin/calico-node
    # - -bird-ready
    - -felix-ready
Re-apply calico.yaml:
kubectl apply -f ./calico.yaml
Looking at the processes in a calico-node pod, bird and the other BGP-related processes are gone.
root@master:~/calico# kubectl exec -it calico-node-9lh84 -n kube-system bash
[root@node1 /]# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 10:37 ? 00:00:00 /usr/local/bin/runsvdir -P /etc/service/enabled
root 37 1 0 10:38 ? 00:00:00 runsv felix
root 38 37 1 10:38 ? 00:02:08 calico-node -felix
root 2128 0 1 12:45 pts/0 00:00:00 bash
root 2148 2128 0 12:45 pts/0 00:00:00 ps -ef
calicoctl node status no longer reports any BGP information either:
root@master:~# calicoctl node status
Calico process is running.
None of the BGP backend processes (BIRD or GoBGP) are running.
Each node also gains a new network interface:
7: vxlan.calico: mtu 1410 qdisc noqueue state UNKNOWN group default
link/ether 66:f9:37:c3:7e:94 brd ff:ff:ff:ff:ff:ff
inet 10.24.166.128/32 brd 10.24.166.128 scope global vxlan.calico
valid_lft forever preferred_lft forever
inet6 fe80::64f9:37ff:fec3:7e94/64 scope link
valid_lft forever preferred_lft forever
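The interface MTUs seen so far (1440 on tunl0, 1410 on vxlan.calico) differ by exactly the difference in encapsulation overhead. The sketch below works through that arithmetic using the standard header sizes for IPv4, UDP, VXLAN and Ethernet. The MTU of ens3 is not shown in this article, so the result is only the smallest underlay MTU these tunnel MTUs would fit into, not a statement about this cluster's actual configuration.

```python
# Standard header sizes in bytes (IPv4 without options).
IPV4_HEADER = 20
UDP_HEADER = 8
VXLAN_HEADER = 8
ETHERNET_HEADER = 14

# IPIP prepends only an outer IPv4 header.
IPIP_OVERHEAD = IPV4_HEADER
# VXLAN carries the inner Ethernet frame inside outer IPv4 + UDP + VXLAN headers.
VXLAN_OVERHEAD = IPV4_HEADER + UDP_HEADER + VXLAN_HEADER + ETHERNET_HEADER


def min_underlay_mtu(tunnel_mtu: int, overhead: int) -> int:
    """Smallest underlay MTU that carries a full-size tunnel packet unfragmented."""
    return tunnel_mtu + overhead


if __name__ == "__main__":
    print("IPIP overhead :", IPIP_OVERHEAD)
    print("VXLAN overhead:", VXLAN_OVERHEAD)
    print("tunl0 (1440) needs underlay MTU >=", min_underlay_mtu(1440, IPIP_OVERHEAD))
    print("vxlan.calico (1410) needs underlay MTU >=", min_underlay_mtu(1410, VXLAN_OVERHEAD))
```

Both tunnel MTUs lead to the same minimum underlay MTU, which suggests the two defaults were derived from a common assumption about the underlying network.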
verify
As in the IPIP verification, create two pods.
From inside one pod, the other pod can be pinged:
root@master:~# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-677dc4d96-xysui 1/1 Running 0 18s 10.24.104.2 node2
nginx-677dc4d96-wkkcn 1/1 Running 0 21s 10.24.166.130 node1
root@master:~# kubectl exec -it nginx-677dc4d96-wkkcn bash
root@nginx-677dc4d96-wkkcn:/# ping 10.24.104.2 -c1
PING 10.24.104.2 (10.24.104.2): 48 data bytes
56 bytes from 10.24.104.2: icmp_seq=0 ttl=62 time=2.519 ms
--- 10.24.104.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 2.519/2.519/2.519/0.000 ms
traffic flow
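Taking a ping from 10.24.166.130 to 10.24.104.2 as an example:
- The pod's routing table sends the packet to the default gateway 169.254.1.1.
The pod's neighbor table already holds a MAC address for 169.254.1.1. The entry is STALE rather than PERMANENT, which suggests it was learned earlier through proxy ARP, as in IPIP mode, rather than configured statically by Calico.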
root@nginx-677dc4d96-wkkcn:/# ip neigh
169.254.1.1 dev eth0 lladdr ee:ee:ee:ee:ee:ee STALE
192.168.122.21 dev eth0 lladdr ee:ee:ee:ee:ee:ee STALE
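The pod therefore sends the ICMP request directly; it can be captured on eth0.
- veth_xmit, the transmit function of the eth0 driver, points skb->dev at eth0's peer device caliea5b03f12b8 and then calls netif_rx, so the packet enters the host protocol stack for a route lookup.
The packet can be captured on caliea5b03f12b8.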
16:39:36.406630 4e:78:56:5f:78:5d > ee:ee:ee:ee:ee:ee, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 64, id 59734, offset 0, flags [DF], proto ICMP (1), length 76)
10.24.166.130 > 10.24.104.2: ICMP echo request, id 21248, seq 0, length 56
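- The ICMP request arrives on caliea5b03f12b8. The host routing table shows that the next hop is 10.24.104.0 (the IP of node2's vxlan.calico device) and that the packet must be tunneled through vxlan.calico.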
root@node1:~# ip r
default via 192.168.122.1 dev ens3 proto static
10.24.104.0/26 via 10.24.104.0 dev vxlan.calico onlink
10.24.166.130 dev caliea5b03f12b8 scope link
10.24.219.64/26 via 10.24.219.64 dev vxlan.calico onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.21
The neighbor table shows that 10.24.104.0 maps to MAC address 66:2d:bf:44:a6:8b:
root@node1:~# ip neigh
192.168.122.22 dev ens3 lladdr 52:54:00:f3:3a:90 STALE
10.24.219.64 dev vxlan.calico lladdr 66:4f:26:ae:af:db PERMANENT
10.24.166.130 dev caliea5b03f12b8 lladdr 4e:78:56:5f:78:5d STALE
192.168.122.1 dev ens3 lladdr 52:54:00:32:63:2e REACHABLE
10.24.104.0 dev vxlan.calico lladdr 66:2d:bf:44:a6:8b PERMANENT
192.168.122.20 dev ens3 lladdr 52:54:00:d9:d7:07 REACHABLE
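When the packet reaches vxlan.calico it looks as follows. The source and destination IPs are unchanged, but the destination MAC is now the MAC of 10.24.104.0 and the source MAC is that of the local vxlan.calico device.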
13:44:39.560217 66:f9:37:c3:7e:94 > 66:2d:bf:44:a6:8b, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 63, id 48899, offset 0, flags [DF], proto ICMP (1), length 76)
10.24.166.130 > 10.24.104.2: ICMP echo request, id 16128, seq 0, length 56
vxlan_xmit calls vxlan_find_mac to look up the FDB entry for the destination MAC.
The FDB shows that MAC 66:2d:bf:44:a6:8b maps to IP 192.168.122.22.
This IP becomes the outer destination IP of the VXLAN packet.
root@node1:~# bridge fdb show dev vxlan.calico
66:2d:bf:44:a6:8b dst 192.168.122.22 self permanent
66:4f:26:ae:af:db dst 192.168.122.20 self permanent
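After VXLAN encapsulation, the host routing table is consulted again with the outer IP and the packet is sent out through ens3. The neighbor table supplies node2's MAC address: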
192.168.122.22 dev ens3 lladdr 52:54:00:f3:3a:90 STALE
- The encapsulated packet leaves through ens3.
The final encapsulated ICMP request can be captured on ens3:
192.168.122.21.44936 > 192.168.122.22.4789: VXLAN, flags [I] (0x08), vni 4096
66:f9:37:c3:7e:94 > 66:2d:bf:44:a6:8b, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 63, id 1065, offset 0, flags [DF], proto ICMP (1), length 76)
10.24.166.130 > 10.24.104.2: ICMP echo request, id 15616, seq 0, length 56
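- When the encapsulated packet reaches node2, the outer destination IP is local, so node2 accepts the packet and passes it up the protocol stack.
node2 listens on UDP port 4789 (the socket is added by vxlan_sock_add when vxlan.calico is created). Packets arriving on this port are handed to vxlan_rcv for VXLAN processing.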
root@node2:~# netstat -nap | grep 4789
udp 0 0 0.0.0.0:4789 0.0.0.0:* -
After decapsulation the packet is delivered to vxlan.calico, where it can be captured:
13:44:25.320094 66:f9:37:c3:7e:94 > 66:2d:bf:44:a6:8b, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 63, id 47307, offset 0, flags [DF], proto ICMP (1), length 76)
10.24.166.130 > 10.24.104.2: ICMP echo request, id 15872, seq 0, length 56
- A further lookup in the host routing table shows that destination 10.24.104.2 is reached through cali82cc91000b8:
10.24.104.2 dev cali82cc91000b8 scope link
- The packet crosses the veth pair and reaches the pod.
- The ICMP reply is handled in the same way in the opposite direction.
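The VXLAN path on node1 chains three lookups: route (pod IP to remote VTEP IP), neighbor table (VTEP IP to VTEP MAC) and FDB (VTEP MAC to the remote node's underlay IP). The sketch below replays that chain with the table contents copied from the `ip r`, `ip neigh` and `bridge fdb` output in this section. It is a simplified model for illustration only, and `resolve_vtep` is a hypothetical helper, not a kernel or Calico function.

```python
import ipaddress

# node1 state in VXLAN mode, copied from the command output in this article.
ROUTES = {  # pod block -> next hop (the remote vxlan.calico address)
    "10.24.104.0/26": "10.24.104.0",
    "10.24.219.64/26": "10.24.219.64",
}
NEIGH = {  # next hop -> MAC of the remote vxlan.calico device (PERMANENT entries)
    "10.24.104.0": "66:2d:bf:44:a6:8b",
    "10.24.219.64": "66:4f:26:ae:af:db",
}
FDB = {  # remote VTEP MAC -> underlay IP used as the outer destination
    "66:2d:bf:44:a6:8b": "192.168.122.22",
    "66:4f:26:ae:af:db": "192.168.122.20",
}


def resolve_vtep(pod_ip: str) -> tuple:
    """Return (next_hop, inner_dst_mac, outer_dst_ip) for a remote pod IP."""
    addr = ipaddress.ip_address(pod_ip)
    for prefix, next_hop in ROUTES.items():
        if addr in ipaddress.ip_network(prefix):
            mac = NEIGH[next_hop]      # sets the inner destination MAC
            outer_ip = FDB[mac]        # sets the outer destination IP
            return next_hop, mac, outer_ip
    raise LookupError(f"no VXLAN route for {pod_ip}")


if __name__ == "__main__":
    print(resolve_vtep("10.24.104.2"))
```

For 10.24.104.2 the chain ends at 192.168.122.22, which matches the outer destination seen in the ens3 capture.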
underlay -- BGP
configure
Edit calico.yaml and set the value of CALICO_IPV4POOL_IPIP to Never:
# Enable IPIP
- name: CALICO_IPV4POOL_IPIP
  value: "Never"
Re-apply calico.yaml:
kubectl apply -f calico.yaml
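calicoctl node status and the processes in the calico-node pod look the same as in IPIP mode. The difference is in the workers' routing tables: cross-node traffic no longer goes through tunl0.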
root@master:~/calico# calicoctl node status
Calico process is running.
IPv4 BGP status
+----------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+----------------+-------------------+-------+----------+-------------+
| 192.168.122.21 | node-to-node mesh | up | 19:56:08 | Established |
| 192.168.122.22 | node-to-node mesh | up | 19:56:09 | Established |
+----------------+-------------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
root@master:~/calico# kubectl exec -it -n kube-system calico-node-czhnn bash
[root@node1 /]# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 19:56 ? 00:00:00 /usr/local/bin/runsvdir -P /etc/service/enabled
root 42 1 0 19:56 ? 00:00:00 runsv felix
root 43 1 0 19:56 ? 00:00:00 runsv bird6
root 44 1 0 19:56 ? 00:00:00 runsv bird
root 45 1 0 19:56 ? 00:00:00 runsv confd
root 47 42 2 19:56 ? 00:00:02 calico-node -felix
root 48 45 0 19:56 ? 00:00:00 calico-node -confd
root 144 44 0 19:56 ? 00:00:00 bird -R -s /var/run/calico/bird.ctl -d -c /etc/calico/confd/config/bird.cfg
root 145 43 0 19:56 ? 00:00:00 bird6 -R -s /var/run/calico/bird6.ctl -d -c /etc/calico/confd/config/bird6.cfg
root 493 0 1 19:57 pts/0 00:00:00 bash
root 518 493 0 19:57 pts/0 00:00:00 ps -ef
Alternatively, switch from IPIP to pure BGP mode at runtime as follows:
root@master:~# calicoctl get ipPool --export -o yaml > pool.yaml
Set ipipMode to Never:
root@master:~# cat pool.yaml
apiVersion: projectcalico.org/v3
items:
- apiVersion: projectcalico.org/v3
  kind: IPPool
  metadata:
    creationTimestamp: 2020-05-30T18:27:41Z
    name: default-ipv4-ippool
    resourceVersion: "4950731"
    uid: 79dac11f-309c-423a-ad5c-8235aafd08ea
  spec:
    cidr: 10.24.0.0/16
    ipipMode: Never
    natOutgoing: true
kind: IPPoolList
metadata:
  resourceVersion: "4950758"
Apply the change:
root@master:~# calicoctl replace -f pool.yaml
Successfully replaced 1 'IPPool' resource(s)
verify
As in the IPIP verification, create two pods.
From inside one pod, the other pod can be pinged:
root@master:~# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-677dc4d96-c6mxz 1/1 Running 0 14s 10.24.104.1 node2
nginx1-677dc4d96-bjnw9 1/1 Running 0 17s 10.24.166.128 node1
root@master:~# kubectl exec -it nginx1-677dc4d96-bjnw9 bash
root@nginx1-677dc4d96-bjnw9:/# ping 10.24.104.1 -c1
PING 10.24.104.1 (10.24.104.1): 48 data bytes
56 bytes from 10.24.104.1: icmp_seq=0 ttl=63 time=4.949 ms
--- 10.24.104.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 4.949/4.949/4.949/0.000 ms
traffic flow
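Taking a ping from 10.24.166.128 to 10.24.104.1 as an example:
- The pod's routing table sends the packet to the default gateway 169.254.1.1.
The pod sends an ARP request for the MAC address of 169.254.1.1. The request reaches cali5a1d2678510, the host end of the pod's veth pair. Proxy ARP is enabled on this device, so it replies with its own MAC address. (The ARP request and reply can be captured on cali5a1d2678510.)
- After learning the MAC address, the pod sends the ICMP echo request.
- veth_xmit, the transmit function of the eth0 driver, points skb->dev at eth0's peer device cali5a1d2678510 and then calls netif_rx, so the packet enters the host protocol stack for a route lookup.
The packet can be captured on cali5a1d2678510.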
20:11:15.035450 7a:17:c4:cf:73:81 > ee:ee:ee:ee:ee:ee, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 64, id 57736, offset 0, flags [DF], proto ICMP (1), length 76)
10.24.166.128 > 10.24.104.1: ICMP echo request, id 6400, seq 0, length 56
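- The ICMP request arrives on cali5a1d2678510. The host routing table shows that the next hop is 192.168.122.22 (node2's IP) and the outgoing interface is ens3, with no encapsulation at all.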
root@node1:~# ip r
default via 192.168.122.1 dev ens3 proto static
10.24.104.0/26 via 192.168.122.22 dev ens3 proto bird
10.24.166.128 dev cali5a1d2678510 scope link
blackhole 10.24.166.128/26 proto bird
10.24.219.65 via 192.168.122.20 dev ens3 proto bird
10.24.219.66 via 192.168.122.20 dev ens3 proto bird
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.21
- The ICMP request leaves through ens3 with the pod IPs as its source and destination:
20:13:48.448931 52:54:00:74:ac:0d > 52:54:00:f3:3a:90, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 63, id 2546, offset 0, flags [DF], proto ICMP (1), length 76)
10.24.166.128 > 10.24.104.1: ICMP echo request, id 6912, seq 0, length 56
- When the request reaches node2, its routing table shows that destination 10.24.104.1 is reached through cali06f028cd84e:
root@node2:~# ip r
default via 192.168.122.1 dev ens3 proto static
10.24.104.0 dev cali1cd7c4c9ed9 scope link
blackhole 10.24.104.0/26 proto bird
10.24.104.1 dev cali06f028cd84e scope link
10.24.166.128/26 via 192.168.122.21 dev ens3 proto bird
10.24.219.65 via 192.168.122.20 dev ens3 proto bird
10.24.219.66 via 192.168.122.20 dev ens3 proto bird
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.22
- The packet crosses the veth pair and reaches the pod.
- The ICMP reply is handled in the same way in the opposite direction.
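The on-wire frame lengths in the ens3 captures can be cross-checked against the encapsulation each mode adds. The sketch below starts from the 76-byte inner IP packet reported by tcpdump and adds the standard header sizes. The BGP and IPIP results can be compared with the `length 90` and `length 110` values in the captures above. The VXLAN figure is computed only, since the outer frame length is not shown in this article's VXLAN capture. `frame_len` is a hypothetical helper.

```python
# Standard header sizes in bytes (IPv4 without options, untagged Ethernet).
ETHERNET_HEADER = 14
IPV4_HEADER = 20
UDP_HEADER = 8
VXLAN_HEADER = 8

# Inner IP packet length reported by tcpdump for the ping in this article:
# 48 data bytes + 8 ICMP header + 20 IPv4 header.
INNER_IP_PACKET = 48 + 8 + IPV4_HEADER


def frame_len(mode: str) -> int:
    """Ethernet frame length on ens3 for one echo request (hypothetical helper)."""
    if mode == "bgp":
        # No encapsulation: the pod packet is simply routed.
        return ETHERNET_HEADER + INNER_IP_PACKET
    if mode == "ipip":
        # One extra outer IPv4 header.
        return ETHERNET_HEADER + IPV4_HEADER + INNER_IP_PACKET
    if mode == "vxlan":
        # Outer IPv4 + UDP + VXLAN, then the inner Ethernet frame.
        return (ETHERNET_HEADER + IPV4_HEADER + UDP_HEADER + VXLAN_HEADER
                + ETHERNET_HEADER + INNER_IP_PACKET)
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for mode in ("bgp", "ipip", "vxlan"):
        print(mode, frame_len(mode))
```

The per-packet cost therefore grows from no overhead in BGP mode to 20 bytes with IPIP and 50 bytes with VXLAN.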
Q&A
Quoted from https://docs.projectcalico.org/reference/faq
Why does my container have a route to 169.254.1.1?
In a Calico network, each host acts as a gateway router for the workloads that it hosts. In container deployments, Calico uses 169.254.1.1 as the address for the Calico router. By using a link-local address, Calico saves precious IP addresses and avoids burdening the user with configuring a suitable address.
While the routing table may look a little odd to someone who is used to configuring LAN networking, using explicit routes rather than subnet-local gateways is fairly common in WAN networking.
Why can’t I see the 169.254.1.1 address mentioned above on my host?
Calico tries hard to avoid interfering with any other configuration on the host. Rather than adding the gateway address to the host side of each workload interface, Calico sets the proxy_arp flag on the interface. This makes the host behave like a gateway, responding to ARPs for 169.254.1.1 without having to actually allocate the IP address to the interface.
Why do all cali* interfaces have the MAC address ee:ee:ee:ee:ee:ee?
In some setups the kernel is unable to generate a persistent MAC address and so Calico assigns a MAC address itself. Since Calico uses point-to-point routed interfaces, traffic does not reach the data link layer so the MAC Address is never used and can therefore be the same for all the cali* interfaces.