Background reading:
For VXLAN networking basics: https://www.cnblogs.com/shuiguizi/p/10923841.html
For how Docker networking works: https://www.cnblogs.com/shuiguizi/p/10922049.html
Preparation:
The images used are centos and nginx. For convenience, install a few tools into the images pulled from the official registry, then commit the result as new images:
yum install net-tools
yum install iputils
yum install iproute
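A minimal sketch of that prepare-and-commit workflow, assuming the custom tag centos:wxy used later in this post (the nginx image can be treated the same way):
# run a throwaway container from the stock image and install the tools inside it
docker run -ti --name=tmp centos /bin/bash
#   (inside the container) yum install -y net-tools iputils iproute; exit
# save the modified container as a new image, then clean up
docker commit tmp centos:wxy
docker rm tmp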
Configuration steps:
0. Install and start etcd (setup steps are easy to find online).
Here etcd is deployed on both hosts; the cluster elects a leader automatically, and either member's address can be used to access the etcd database.
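Before pointing Docker at it, it is worth confirming the cluster is actually healthy. A hedged check with the v2 etcdctl (the endpoint addresses are the masked ones used below):
etcdctl --endpoints=http://188.x.x.113:2379,http://106.y.y.3:2379 cluster-health
etcdctl --endpoints=http://188.x.x.113:2379 member list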
1. Reconfigure the docker daemon
vi /usr/lib/systemd/system/docker.service
On one host append:
--cluster-store=etcd://106.y.y.3:2379 \
--cluster-advertise=106.y.y.31:2375 \
and on the other:
--cluster-store=etcd://188.x.x.113:2379 \
--cluster-advertise=188.x.x.113:2375 \
systemctl daemon-reload
systemctl restart docker.service
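For reference, a hedged sketch of what the edited ExecStart line might look like on the 188.x.x.113 host (the daemon binary path and any pre-existing flags depend on your distro's packaging):
ExecStart=/usr/bin/dockerd \
          --cluster-store=etcd://188.x.x.113:2379 \
          --cluster-advertise=188.x.x.113:2375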
2. Create the docker network
#docker network create ov_net2 -d overlay
[root@master ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
731d1b63b387 ov_net2 overlay global
3. Create one docker container on each of the two nodes
master:
docker run -ti -d --network=ov_net2 --name=centos21 centos:wxy /bin/sh
minion:
docker run -d --name=nginx22 --network=ov_net2 nginx
Now the analysis.
1. Details of this overlay network
[root@master ~]# docker network inspect ov_net2
[
    {
        "Name": "ov_net2",
        "Id": "731d1b63b38768022160534b619d09d2e0fb139a7504070bf370a7706ed8ee9e",
        "Created": "2019-05-14T20:08:29.045284861+08:00",
        "Scope": "global",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "10.0.1.0/24",
                    "Gateway": "10.0.1.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {
            "d7dc5bf71ccb2e5fb1f6b98dd47b4f79f44a8b73e9536f3f15b036ba0c94f55d": {
                "Name": "centos21",
                "EndpointID": "c7771dca216130e46b60cb921d4488eb82f9a5e1e168ec4d7a9d91f183e82ea6",
                "MacAddress": "02:42:0a:00:01:02",
                "IPv4Address": "10.0.1.2/24",
                "IPv6Address": ""
            },
            "ep-611f45864389f90630fa70340dddd4c76b16ac070c49f60aa1679c753b41db7f": {
                "Name": "nginx22",
                "EndpointID": "611f45864389f90630fa70340dddd4c76b16ac070c49f60aa1679c753b41db7f",
                "MacAddress": "02:42:0a:00:01:03",
                "IPv4Address": "10.0.1.3/24",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]
2. How do containers on different hosts learn each other's addresses? Answer: they read them from etcd.
[root@minion ~]# etcdctl get /docker/network/v1.0/endpoint/731d1b63b38768022160534b619d09d2e0fb139a7504070bf370a7706ed8ee9e/611f45864389f90630fa70340dddd4c76b16ac070c49f60aa1679c753b41db7f
{"anonymous":false,"disableResolution":false,"ep_iface":{"addr":"10.0.1.3/24","dstPrefix":"eth","mac":"02:42:0a:00:01:03","routes":null,"srcName":"vethb5c341b","v4PoolID":"GlobalDefault/10.0.1.0/24","v6PoolID":""},"exposed_ports":[{"Proto":6,"Port":80}],"generic":{"com.docker.network.endpoint.exposedports":[{"Proto":6,"Port":80}],"com.docker.network.portmap":[]},"id":"611f45864389f90630fa70340dddd4c76b16ac070c49f60aa1679c753b41db7f","ingressPorts":null,"joinInfo":{"StaticRoutes":null,"disableGatewayService":false},"locator":"106.13.146.31","myAliases":["42c3eff8768d"],"name":"nginx22","sandbox":"f7a0ce169bd7690b45887a462efc169953150311dbb03e4bb2ccaf17ab75add8","svcAliases":null,"svcID":"","svcName":"","virtualIP":"\u003cnil\u003e"}
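To see everything libnetwork keeps in the store, the v2 etcdctl can walk the key tree (a hedged sketch; the key prefix matches the endpoint key above):
etcdctl ls --recursive /docker/network/v1.0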
3. Verify connectivity
[root@master ~]# ip netns exec a740da7c2043 ping 10.0.1.3 -c 2
PING 10.0.1.3 (10.0.1.3) 56(84) bytes of data.
From 10.0.1.2 icmp_seq=1 Destination Host Unreachable
From 10.0.1.2 icmp_seq=2 Destination Host Unreachable
4. As anyone who has read up on Docker internals knows, a container is essentially a set of namespaces; in addition, for an overlay network Docker creates one extra namespace per host that holds the vxlan plumbing.
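A note on the `ip netns exec` calls used here: `ip netns` only looks at /var/run/netns, while Docker keeps its sandbox namespaces under /var/run/docker/netns, so a common trick (an assumption about how these commands were made to work) is to symlink one into the other:
ln -s /var/run/docker/netns /var/run/netns
ip netns    # now lists e.g. 1-731d1b63b3 and the containers' sandbox namespaces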
[root@master ~]# ip netns exec 1-731d1b63b3 tcpdump -i vxlan1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vxlan1, link-type EN10MB (Ethernet), capture size 262144 bytes
20:36:47.283514 ARP, Request who-has 10.0.1.3 tell 10.0.1.2, length 28
20:36:48.304014 ARP, Request who-has 10.0.1.3 tell 10.0.1.2, length 28
20:36:49.328182 ARP, Request who-has 10.0.1.3 tell 10.0.1.2, length 28
[root@master ~]# ip netns exec 1-731d1b63b3 netstat -i
Kernel Interface table
Iface      MTU  RX-OK RX-ERR RX-DRP RX-OVR  TX-OK TX-ERR TX-DRP TX-OVR Flg
br0       1450     35      0      0      0     14      0      0      0 BMRU
lo       65536      0      0      0      0      0      0      0      0 LRU
veth2     1450     35      0      0      0     25      0      0      0 BMRU
vxlan1    1450      0      0      0      0      0      0     39      0 BMRU
Analysis:
While tcpdump is running, the vxlan interface appears to emit packets,
yet in the netstat counters veth2's RX grows while vxlan1's TX-OK stays at 0 (note its TX-DRP is 39).
What does that tell us?
A quick recap of how a bridge forwards:
Interfaces are added as ports of the bridge. When one port receives a frame, the bridge decides from the destination address whether to hand it up the local protocol stack or forward it; here both the ARP and the ICMP frames fall into the forward case.
First, the frame goes through the bridge's internal processing and is sent out the other bridge ports, including vxlan1. This code runs in the kernel's L2 layer, and if tcpdump is listening on an interface it gets a copy of the frame at this point.
Then the frame enters the driver, the vxlan driver here, which sits between the L2 link layer and the (virtual) device; `netstat -i` counts packets at the driver level.
So, back to the question: from the above we know the packets are being dropped inside the vxlan driver. And what does that driver do? In short, it looks up the FDB table to decide where to tunnel the frame, as the following shows:
[root@master ~]# ip netns exec 1-731d1b63b3 bridge fdb
12:54:57:62:92:74 dev vxlan1 vlan 1 master br0 permanent
12:54:57:62:92:74 dev vxlan1 master br0 permanent
To interpret: 12:54:57:62:92:74 is the MAC address of vxlan1, which is also a port of the br0 bridge.
The first entry says this destination MAC is associated with the vxlan1 interface (you can simply regard the MAC as belonging to that interface); the rest qualifies vxlan1: it is in vlan 1 and enslaved to br0.
Compare this with the FDB in the namespace of a vxlan network I built by hand earlier:
[root@minion ~]# ip netns exec ns200 bridge fdb
f2:4d:be:62:09:50 dev vxlan20 vlan 1 master br-vx2 permanent
f2:4d:be:62:09:50 dev vxlan20 master br-vx2 permanent
00:00:00:00:00:00 dev vxlan20 dst 188.131.210.113 via ifindex 2 link-netnsid 0 self permanent
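For context, a hedged reconstruction of how such a hand-built namespace could be wired up (the names ns200/vxlan20/br-vx2 come from the output above; the VNI, underlay device, and remote address are assumptions):
# create the vxlan device in the root namespace so its underlay stays there
ip link add vxlan20 type vxlan id 20 dstport 4789 dev eth0
ip netns add ns200
ip link set vxlan20 netns ns200
# bridge it inside the namespace
ip netns exec ns200 ip link add br-vx2 type bridge
ip netns exec ns200 ip link set vxlan20 master br-vx2
ip netns exec ns200 ip link set vxlan20 up
ip netns exec ns200 ip link set br-vx2 up
# the all-zeros entry: flood frames with unknown destination MACs to the remote VTEP
ip netns exec ns200 bridge fdb append 00:00:00:00:00:00 dev vxlan20 dst 188.131.210.113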
Round 1: manually add a default entry to the FDB
Compared with the hand-built vxlan tunnel, this FDB is missing the default (all-zeros) entry, so the traffic does not know where to leave; it ought to go out through the real physical interface eth0. So I tried adding the entry by hand, on both nodes:
ip netns exec 1-731d1b63b3 bridge fdb add 00:00:00:00:00:00 dev vxlan1 dst 106.13.146.3
Result:
Still no good; the traffic still has nowhere to go. The reason is that no egress interface is specified, and when adding an FDB entry you cannot reference an interface that lives outside the current namespace. Sigh...
Round 2: create a new namespace to replace the one Docker made for the vxlan connection, then attach the containers' veth interfaces to the bridge in that namespace
---------------------------unfinished, to be continued--------------------------------------------
============================================
Only with swarm enabled does the gossip protocol start, which handles node discovery.
vi /usr/lib/systemd/system/docker.service
systemctl daemon-reload
systemctl restart docker
[root@master ~]# docker swarm init --advertise-addr=188.131.210.113:2377
Swarm initialized: current node (m5jpgmwxow5ec256vw8bpgxi9) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-5vpumm8bssl8tsa7aq9bc9ytpgxdob5rp5dm0y4b8zed3ef5e9-eckhrx62u9vb0o95xxam98qjc \
188.131.210.113:2377
Pitfall 1: the port had previously been set to 2375, so when the node joins the master this happens:
[root@minion ~]# docker swarm join --token SWMTKN-1-4dt7opomolsz9q2kdykheknj2cbmj8sgydxcljl99h07ob9dtj-0acex2ahyx98wm8fuc6rt0k4i 188.131.210.113:2375
Error response from daemon: rpc error: code = 14 desc = grpc: the connection is unavailable
The reason, from the Docker documentation:
For this test, you need two different Docker hosts that can communicate with each other. Each host must have Docker 17.06 or higher with the following ports open between the two Docker hosts:
TCP port 2377 for cluster management communications
TCP and UDP port 7946 for communication among nodes
UDP port 4789 for overlay network traffic
Here is how these ports are used:
Port 2377
Used for cluster-management communication; for example, when a node wants to join the cluster it sends the join request to this port on the master.
Port 7946
Used for communication among cluster nodes. A node first contacts its peer over tcp:7946; a correct response means the path is clear and contact can be maintained.
It then uses udp:7946 to push out the network information its node needs to share (bulk sync).
This is how a network created on the master becomes visible to the other nodes.
--wxy: this is the so-called network management (control) plane; presumably it is what replaces an external k-v store such as etcd.
Port 4789
Looks familiar? Right, it is the default VXLAN port, used to carry the overlay network's data traffic: the VTEPs at the two ends of the tunnel talk on it, and containers on different hosts reach each other through it.
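On hosts with a restrictive firewall, all four ports must be reachable between the two Docker hosts; a hedged sketch with iptables (rule ordering and persistence depend on your setup):
iptables -A INPUT -p tcp --dport 2377 -j ACCEPT   # cluster management
iptables -A INPUT -p tcp --dport 7946 -j ACCEPT   # node-to-node control, TCP
iptables -A INPUT -p udp --dport 7946 -j ACCEPT   # node-to-node bulk sync, UDP
iptables -A INPUT -p udp --dport 4789 -j ACCEPT   # VXLAN data plane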
Fix: change the port to 2377.
[root@master ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
m5jpgmwxow5ec256vw8bpgxi9 * master.wxy Ready Active Leader
z50ektieet4esj1gwlfaisdc4 minion.wxy Ready Active
[root@master ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
8808070a27e1 bridge bridge local
e0407f6da9d8 docker_gwbridge bridge local
e866d30f43bf host host local
7j887qknji9s ingress overlay swarm
424fac469906 none null local
[root@master ~]#
[root@master ~]# docker network create -d overlay nginx-net
s8o2cknp3hc8l9cs5uqm4gskd
[root@master ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
8808070a27e1 bridge bridge local
e0407f6da9d8 docker_gwbridge bridge local
e866d30f43bf host host local
7j887qknji9s ingress overlay swarm
s8o2cknp3hc8 nginx-net overlay swarm
424fac469906 none null local
docker network create --driver=overlay --attachable ov-test
[root@master 5.0.10-1.el7.elrepo.x86_64]# ip netns exec 1-po3p8i2id3 bridge fdb show dev vxlan1
02:42:0a:00:00:04 master br0
9e:7e:18:eb:71:77 vlan 1 master br0 permanent
9e:7e:18:eb:71:77 master br0 permanent
02:42:0a:00:00:04 dst 172.16.0.4 link-netnsid 0 self permanent    --- important
Explanation:
The peer's MAC address has already been learned via the network control plane, but the destination address is wrong: it cannot be the peer's private address (172.16.0.4 is exactly minion's private LAN IP). We have to change that address somehow. Could the address I used when joining the swarm be wrong? Let's change it.
[root@master ~]# docker network inspect ov-test
[
{
"Name": "ov-test",
"Id": "po3p8i2id3f9thvd7b8qiuu1o",
"Created": "2019-05-19T19:12:30.411636993+08:00",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.0.0.0/24",
"Gateway": "10.0.0.1"
}
]
},
"Internal": false,
"Attachable": true,
"Containers": {
"18d6f2f74b8d615f67a2c270f846102bc043d9ee63514a1edaa1087964e0486f": {
"Name": "centos",
"EndpointID": "65e8f927972704b27e19647040d5abb999d7142f327a26234ff01775fdf95991",
"MacAddress": "02:42:0a:00:00:02",
"IPv4Address": "10.0.0.2/24",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "4097"
},
"Labels": {},
"Peers": [
{
"Name": "master.wxy-61c0f37910dd",
"IP": "188.131.210.113"
},
{
"Name": "minion.wxy-362fc1004972",
"IP": "172.16.0.4"
}
]
}
]
[root@minion ~]# docker network inspect ov-test
[
{
"Name": "ov-test",
"Id": "po3p8i2id3f9thvd7b8qiuu1o",
"Created": "2019-05-19T19:13:41.541163575+08:00",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.0.0.0/24",
"Gateway": "10.0.0.1"
}
]
},
"Internal": false,
"Attachable": true,
"Containers": {
"12459f10b7f68e5eba856a446c46a24bce7d2a7f75b7db0cbd896bc72772455b": {
"Name": "nginx",
"EndpointID": "8ace966fe08b40b75f1e67d7c7a77e0e99abb3b5f9dc56f6df5c45b695698a85",
"MacAddress": "02:42:0a:00:00:04",
"IPv4Address": "10.0.0.4/24",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "4097"
},
"Labels": {},
"Peers": [
{
"Name": "master.wxy-61c0f37910dd",
"IP": "188.131.210.113"
},
{
"Name": "minion.wxy-362fc1004972",
"IP": "172.16.0.4"
}
]
}
]
After the fix:
docker swarm init --advertise-addr=188.131.210.113:2377
docker swarm join \
--token SWMTKN-1-4m6s7rs3hkhh4bhdv78gymjc7bbq663cfw6qy4yh9ysja2nuvb-bknjkrcu52tvo50dl9ermvxnr \
--advertise-addr 106.13.146.3:2377 \
188.131.210.113:2377
docker swarm join \
--token SWMTKN-1-1qvdgad5v0q5jafdw2ym20lcffi2cd0d4fsuxlr9p5dg8ogqrp-48ti1h5gccibyqx0phzboanq5 \
--advertise-addr 106.13.146.3:2377 \
188.131.210.113:2377
docker network create --driver=overlay --attachable ov-test
[root@master ~]# docker run -d -ti --network=ov-test2 --name=centos2 centos:wxy /bin/sh
[root@minion ~]# docker run -d --network=ov-test2 --name=nginx2 nginx
journalctl -u docker.service
1. First error: the node's IP was configured wrong
level=error msg="periodic bulk sync failure for network 5s2fbdra2g5m9qt14wmiveir8: bulk sync failed on node minion.wxy-611e1a90a99f: failed to send a TCP message during bulk sync: dial tcp 106.13.146.31:7946: connect: connection refused"
2. Second error: the node did not pass an advertise IP when joining the cluster, so the master tried the node's private LAN IP during sync
level=error msg="Error in responding to bulk sync from node 172.16.0.4: failed to send a TCP message during bulk sync: dial tcp 172.16.0.4:7946: i/o timeout"
3. Problem 2: networks are not shared
Troubleshooting: the master sends tcp:7946 and gets a reply, then sends udp:7946 which the node also receives, but the payload carried is small, so the guess is that the network information never got packed into the sync.
The likely culprit: `docker info` reports Server Version: 1.13.1, below the 17.06 minimum quoted above.
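A quick way to compare the daemon version on both hosts (the --format selector is standard docker CLI):
docker version --format '{{.Server.Version}}'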
iptables -A INPUT -p udp --dport 4789 -j ACCEPT
================
Enable swarm on the minion node
docker swarm init --advertise-addr=106.13.146.3:2377
If an iptables rule conflicts with the container port, try:
sudo iptables -t nat -L -n --line-numbers | grep 7946
sudo iptables -t nat -D DOCKER 6926
=== Dissecting the namespaces =======================
[root@minion ~]# ip addr
3: docker_gwbridge:
link/ether 02:42:93:81:c9:8d brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 scope global docker_gwbridge
valid_lft forever preferred_lft forever
inet6 fe80::42:93ff:fe81:c98d/64 scope link
valid_lft forever preferred_lft forever
4: docker0:
link/ether 02:42:d5:cc:f2:db brd ff:ff:ff:ff:ff:ff
inet 10.0.78.1/24 scope global docker0
valid_lft forever preferred_lft forever
10: vethd59e050@if9:
link/ether 22:8d:34:2b:9a:c6 brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet6 fe80::208d:34ff:fe2b:9ac6/64 scope link
valid_lft forever preferred_lft forever
[root@minion ~]# ip netns exec ingress_sbox ip addr
7: eth0@if8:
inet 10.255.0.4/16 scope global eth0    --- the sandbox container (network namespace)
9: eth1@if10:
inet 172.17.0.2/16 scope global eth1
[root@minion ~]# ip netns exec 1-ystd5xxiui ip addr
2: br0:
link/ether aa:cb:0d:a4:a6:b9 brd ff:ff:ff:ff:ff:ff
inet 10.255.0.1/16 scope global br0
6: vxlan1:
link/ether aa:cb:0d:a4:a6:b9 brd ff:ff:ff:ff:ff:ff link-netnsid 0
8: veth2@if7:
link/ether d2:04:e3:eb:fc:52 brd ff:ff:ff:ff:ff:ff link-netnsid 1
[root@minion ~]# ip netns
1-ystd5xxiui (id: 0)
ingress_sbox (id: 1)
(the master side looks the same)
[root@minion ~]# docker network inspect ingress
[
{
"Name": "ingress",
"Id": "ystd5xxiuiuqjbpn76bbzcaws",
"Created": "2019-05-20T18:35:25.014017659+08:00",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.255.0.0/16",
"Gateway": "10.255.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Containers": {
"ingress-sbox": {
"Name": "ingress-endpoint",
"EndpointID": "081aaac8d188d9cb4c190bbb5863be933dcc2b98cde071dce9b035c4ea6df957",
"MacAddress": "02:42:0a:ff:00:04",
"IPv4Address": "10.255.0.4/16", -----注意这个ip
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "4096"
},
"Labels": {},
"Peers": [
{
"Name": "master.wxy-b33a341ba33b",
"IP": "188.131.210.113"
},
{
"Name": "minion.wxy-47481bf33513",
"IP": "106.13.146.3"
}
]
}
]
[root@master ~]# docker network inspect ingress
[
{
"Name": "ingress",
"Id": "ystd5xxiuiuqjbpn76bbzcaws",
"Created": "2019-05-20T18:34:27.512773865+08:00",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.255.0.0/16",
"Gateway": "10.255.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Containers": {
"ingress-sbox": {
"Name": "ingress-endpoint",
"EndpointID": "77cc852efe15111e141ae78ee9c29ccf88637599068044677d64f0107f5db78a",
"MacAddress": "02:42:0a:ff:00:03",
"IPv4Address": "10.255.0.3/16",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "4096"
},
"Labels": {},
"Peers": [
{
"Name": "master.wxy-b33a341ba33b",
"IP": "188.131.210.113"
},
{
"Name": "minion.wxy-47481bf33513",
"IP": "106.13.146.3"
}
]
}
]
Explanation: ingress-sbox is in fact a container, call it the sandbox container; at a deeper level it is just a namespace, and that namespace is named ingress_sbox. Containers that later join this network effectively share this (network-type) namespace. In addition, every sandbox container is paired with one more namespace that carries the vxlan plumbing, i.e. the namespace that implements the overlay network's communication.
Let's verify whether endpoints on this network can reach each other.
master's sandbox: 10.255.0.3
minion's sandbox: 10.255.0.4
[root@minion ~]# ip netns exec ingress_sbox ping 10.255.0.3
PING 10.255.0.3 (10.255.0.3) 56(84) bytes of data.
64 bytes from 10.255.0.3: icmp_seq=1 ttl=64 time=9.64 ms
64 bytes from 10.255.0.3: icmp_seq=2 ttl=64 time=9.55 ms
64 bytes from 10.255.0.3: icmp_seq=3 ttl=64 time=9.75 ms
[root@minion ~]# ip netns exec 1-ystd5xxiui bridge fdb show dev vxlan1
02:42:0a:ff:00:03 master br0
aa:cb:0d:a4:a6:b9 vlan 1 master br0 permanent
aa:cb:0d:a4:a6:b9 master br0 permanent
02:42:0a:ff:00:03 dst 188.131.210.113 link-netnsid 0 self permanent    -- this is exactly the master sandbox's entry
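To confirm the pings really ride the VXLAN tunnel, capture on the underlay while pinging (the underlay interface name eth0 is an assumption):
tcpdump -ni eth0 udp port 4789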
Step 2: start a new container attached to the ingress network
/usr/bin/docker-current: Error response from daemon: Could not attach to network ingress: rpc error: code = 7 desc = network ingress not manually attachable.
This matches "Attachable": false in the ingress inspect output above: the ingress network cannot be joined manually by a docker run container.
Step 3: