[Pitfall Series] Configuring a Docker overlay network (unfinished, to be continued)

Background reading:

For background on vxlan networking: https://www.cnblogs.com/shuiguizi/p/10923841.html

For background on how Docker networking works: https://www.cnblogs.com/shuiguizi/p/10922049.html

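Preparation:

The images used are centos and nginx. For convenience, I installed a few tools in the images downloaded from the official registry and then committed them as new images.
yum install net-tools
yum install iputils
yum install iproute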

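Configuration steps:

0. Install and start etcd; instructions are easy to find online.
Here etcd is deployed on both hosts. It elects a leader automatically, and the etcd database can be reached through either address.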


1. Reconfigure the docker daemon
vi /usr/lib/systemd/system/docker.service

Add the following options to the daemon's startup line. On minion:
--cluster-store=etcd://106.y.y.3:2379 \
--cluster-advertise=106.y.y.31:2375 \

On master:
--cluster-store=etcd://188.x.x.113:2379  \
--cluster-advertise=188.x.x.113:2375 \

systemctl daemon-reload
systemctl restart docker.service

2. Create the docker network
#docker network create ov_net2 -d overlay
[root@master ~]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
731d1b63b387        ov_net2             overlay             global


3. Create the two containers, one on each node
master:
docker run -ti -d --network=ov_net2 --name=centos21 centos:wxy /bin/sh

minion:
docker run -d --name=nginx22 --network=ov_net2 nginx

Analysis

1. Details of this overlay network

[root@master ~]# docker network inspect ov_net2
[
    {
        "Name": "ov_net2",
        "Id": "731d1b63b38768022160534b619d09d2e0fb139a7504070bf370a7706ed8ee9e",
        "Created": "2019-05-14T20:08:29.045284861+08:00",
        "Scope": "global",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "10.0.1.0/24",
                    "Gateway": "10.0.1.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {
            "d7dc5bf71ccb2e5fb1f6b98dd47b4f79f44a8b73e9536f3f15b036ba0c94f55d": {
                "Name": "centos21",
                "EndpointID": "c7771dca216130e46b60cb921d4488eb82f9a5e1e168ec4d7a9d91f183e82ea6",
                "MacAddress": "02:42:0a:00:01:02",
                "IPv4Address": "10.0.1.2/24",
                "IPv6Address": ""
            },
            "ep-611f45864389f90630fa70340dddd4c76b16ac070c49f60aa1679c753b41db7f": {
                "Name": "nginx22",
                "EndpointID": "611f45864389f90630fa70340dddd4c76b16ac070c49f60aa1679c753b41db7f",
                "MacAddress": "02:42:0a:00:01:03",
                "IPv4Address": "10.0.1.3/24",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]
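
One pattern worth noticing in this output: each MacAddress looks like 02:42 followed by the four bytes of the container's IPv4 address in hex (10.0.1.2 gives 02:42:0a:00:01:02). The sketch below reproduces that mapping in Python. It is based only on the addresses printed in this post; the helper name mac_from_ipv4 is made up for illustration, and a MAC address assigned explicitly would not follow the pattern.

```python
import ipaddress


def mac_from_ipv4(ip: str) -> str:
    """Return 02:42 plus the four IPv4 bytes in hex (the pattern seen in the inspect output)."""
    octets = ipaddress.IPv4Address(ip).packed  # the 4 raw address bytes
    return "02:42:" + ":".join(f"{b:02x}" for b in octets)


if __name__ == "__main__":
    for ip in ("10.0.1.2", "10.0.1.3"):
        print(ip, "->", mac_from_ipv4(ip))
```

This makes it easy to match a MAC seen in an fdb table or a packet capture back to a container IP.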

 

2. How do containers on different hosts learn each other's addresses? Answer: they read them from etcd.

[root@minion ~]# etcdctl get /docker/network/v1.0/endpoint/731d1b63b38768022160534b619d09d2e0fb139a7504070bf370a7706ed8ee9e/611f45864389f90630fa70340dddd4c76b16ac070c49f60aa1679c753b41db7f
{"anonymous":false,"disableResolution":false,"ep_iface":{"addr":"10.0.1.3/24","dstPrefix":"eth","mac":"02:42:0a:00:01:03","routes":null,"srcName":"vethb5c341b","v4PoolID":"GlobalDefault/10.0.1.0/24","v6PoolID":""},"exposed_ports":[{"Proto":6,"Port":80}],"generic":{"com.docker.network.endpoint.exposedports":[{"Proto":6,"Port":80}],"com.docker.network.portmap":[]},"id":"611f45864389f90630fa70340dddd4c76b16ac070c49f60aa1679c753b41db7f","ingressPorts":null,"joinInfo":{"StaticRoutes":null,"disableGatewayService":false},"locator":"106.13.146.31","myAliases":["42c3eff8768d"],"name":"nginx22","sandbox":"f7a0ce169bd7690b45887a462efc169953150311dbb03e4bb2ccaf17ab75add8","svcAliases":null,"svcID":"","svcName":"","virtualIP":"\u003cnil\u003e"}
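
To make the record above easier to read, here is a small Python sketch that builds the etcd key from the network ID and endpoint ID and pulls out the fields that matter for connectivity. The key layout is copied from the etcdctl command above and is an observation from this setup, not a documented stable interface; the function names and the trimmed sample record are illustrative only.

```python
import json

# Key prefix as it appears in the etcdctl command in this post (an observation,
# not a guaranteed layout).
ENDPOINT_PREFIX = "/docker/network/v1.0/endpoint"


def endpoint_key(network_id: str, endpoint_id: str) -> str:
    """Build the etcd key under which an endpoint record is stored."""
    return f"{ENDPOINT_PREFIX}/{network_id}/{endpoint_id}"


def summarize_endpoint(raw: str) -> dict:
    """Extract the name, overlay address, MAC and host locator from a record."""
    record = json.loads(raw)
    iface = record["ep_iface"]
    return {
        "name": record["name"],
        "addr": iface["addr"],
        "mac": iface["mac"],
        "locator": record["locator"],  # host address the endpoint is reachable at
    }


# Trimmed copy of the record shown above (only the fields used here).
SAMPLE = (
    '{"ep_iface":{"addr":"10.0.1.3/24","mac":"02:42:0a:00:01:03"},'
    '"locator":"106.13.146.31","name":"nginx22"}'
)

if __name__ == "__main__":
    print(summarize_endpoint(SAMPLE))
```

The locator field is the one to check first when cross-host traffic fails, since it is the underlay address the other host will try to reach.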

 


3. Verify connectivity

[root@master ~]# ip netns exec a740da7c2043 ping 10.0.1.3 -c 2
PING 10.0.1.3 (10.0.1.3) 56(84) bytes of data.
From 10.0.1.2 icmp_seq=1 Destination Host Unreachable
From 10.0.1.2 icmp_seq=2 Destination Host Unreachable

 

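4. Anyone who has read about Docker internals knows that a container essentially gets its own namespace. In addition, for an overlay network Docker creates one more namespace on each host, used for the vxlan connection.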

[root@master ~]# ip netns exec 1-731d1b63b3 tcpdump -i vxlan1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vxlan1, link-type EN10MB (Ethernet), capture size 262144 bytes
20:36:47.283514 ARP, Request who-has 10.0.1.3 tell 10.0.1.2, length 28
20:36:48.304014 ARP, Request who-has 10.0.1.3 tell 10.0.1.2, length 28
20:36:49.328182 ARP, Request who-has 10.0.1.3 tell 10.0.1.2, length 28

[root@master ~]# ip netns exec 1-731d1b63b3 netstat -i
Kernel Interface table
Iface             MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
br0              1450       35      0      0 0            14      0      0      0 BMRU
lo              65536        0      0      0 0             0      0      0      0 LRU
veth2            1450       35      0      0 0            25      0      0      0 BMRU
vxlan1           1450        0      0      0 0             0      0     39      0 BMRU

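Analysis:

tcpdump shows ARP requests going out on the vxlan interface.
In the netstat statistics, the received-packet counter on veth2 keeps increasing, but vxlan1 transmits nothing successfully: its TX-OK is 0 and its TX-DRP is 39.
What does this tell us?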

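A brief description of how a bridge forwards packets:
Interfaces are added to the bridge. When one of the bridge's interfaces receives a frame, the bridge decides from the destination address whether to pass it up to the protocol stack or to forward it. Here both the ARP and the ICMP packets need to be forwarded.
First, after processing inside the bridge, the frame leaves through the bridge's other interfaces, including vxlan1. This code belongs to the kernel's L2 layer, and if tcpdump is listening on the interface, a copy of the frame is handed to tcpdump.
Then the frame enters the driver, here the vxlan driver. The driver can be thought of as sitting between the link layer and the physical device, and netstat -i reports the send and receive counters at the driver level.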
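
The reasoning above (tcpdump sees the frame, the driver counters show it dropped) can be checked mechanically. The following Python sketch parses the netstat -i table pasted earlier and lists the interfaces whose TX-DRP counter is non-zero. The parsing assumes the column layout shown in this post; the function names are illustrative.

```python
NETSTAT_OUTPUT = """\
Iface             MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
br0              1450       35      0      0 0            14      0      0      0 BMRU
lo              65536        0      0      0 0             0      0      0      0 LRU
veth2            1450       35      0      0 0            25      0      0      0 BMRU
vxlan1           1450        0      0      0 0             0      0     39      0 BMRU
"""


def parse_netstat_i(text: str) -> dict:
    """Map interface name -> {column name: value} for a `netstat -i` table."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        rows[row["Iface"]] = row
    return rows


def tx_dropping(rows: dict) -> list:
    """Return the interfaces that dropped at least one outgoing packet."""
    return [name for name, row in rows.items() if int(row["TX-DRP"]) > 0]


if __name__ == "__main__":
    print(tx_dropping(parse_netstat_i(NETSTAT_OUTPUT)))
```

On the table from this post only vxlan1 is reported, which matches the conclusion that the frames are lost in the vxlan driver rather than in the bridge.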

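Back to the problem. From the above we know the data is dropped in the vxlan driver. So what does the driver do? Put simply, the vxlan driver looks up the fdb table to decide where to forward a frame. Here is what the table contains: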

[root@master ~]# ip netns exec 1-731d1b63b3 bridge fdb
12:54:57:62:92:74 dev vxlan1 vlan 1 master br0 permanent
12:54:57:62:92:74 dev vxlan1 master br0 permanent

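Explanation: 12:54:57:62:92:74 is the MAC address of vxlan1, which is also a port of bridge br0.
First entry: the destination MAC is associated with the vxlan1 interface (you can simply read it as the interface's own MAC). The remaining fields qualify vxlan1: it belongs to vlan 1 and is attached to bridge br0.

For comparison, this is the fdb table in the namespace when I built a vxlan network by hand: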

[root@minion ~]# ip netns exec ns200 bridge fdb
f2:4d:be:62:09:50 dev vxlan20 vlan 1 master br-vx2 permanent
f2:4d:be:62:09:50 dev vxlan20 master br-vx2 permanent
00:00:00:00:00:00 dev vxlan20 dst 188.131.210.113 via ifindex 2 link-netnsid 0 self permanent
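
The difference between the two tables is the all-zero MAC entry with a dst address. A minimal Python sketch, assuming the bridge fdb output format shown in this post, that parses both tables and reports whether such a default entry exists (parse_fdb and has_default_remote are illustrative names):

```python
ALL_ZERO_MAC = "00:00:00:00:00:00"

DOCKER_FDB = """\
12:54:57:62:92:74 dev vxlan1 vlan 1 master br0 permanent
12:54:57:62:92:74 dev vxlan1 master br0 permanent
"""

MANUAL_FDB = """\
f2:4d:be:62:09:50 dev vxlan20 vlan 1 master br-vx2 permanent
f2:4d:be:62:09:50 dev vxlan20 master br-vx2 permanent
00:00:00:00:00:00 dev vxlan20 dst 188.131.210.113 via ifindex 2 link-netnsid 0 self permanent
"""


def parse_fdb(text: str) -> list:
    """Parse `bridge fdb` lines into dicts with mac, dev and dst (dst may be None)."""
    entries = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        entry = {"mac": tokens[0], "dev": None, "dst": None}
        for key in ("dev", "dst"):
            if key in tokens[1:]:
                idx = tokens.index(key, 1)
                if idx + 1 < len(tokens):
                    entry[key] = tokens[idx + 1]
        entries.append(entry)
    return entries


def has_default_remote(entries: list) -> bool:
    """True if an all-zero MAC entry points at a remote VTEP address."""
    return any(e["mac"] == ALL_ZERO_MAC and e["dst"] for e in entries)


if __name__ == "__main__":
    print("docker-created:", has_default_remote(parse_fdb(DOCKER_FDB)))
    print("hand-built:", has_default_remote(parse_fdb(MANUAL_FDB)))
```

Without that entry (or a per-MAC entry with a dst), the vxlan device has no remote VTEP to send unknown or broadcast frames to.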

 

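Round 1: manually add a default entry to the fdb table

Compared with the vxlan tunnel I simulated earlier with namespaces, this table lacks the default (all-zero MAC) entry, so the vxlan driver does not know where to send the data. The data should leave through the real physical interface eth0, so I tried adding the entry by hand on both nodes: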
ip netns exec  1-731d1b63b3 bridge fdb add 00:00:00:00:00:00 dev vxlan1 dst 106.13.146.3
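Result:
It did not work. The data still has nowhere to go because no egress interface is specified, and when adding an fdb entry you cannot specify an interface that lives outside the current namespace. I was speechless....

Round 2: create a new namespace to replace the one Docker created for the vxlan connection, then add the containers' interfaces to the bridge in that namespace

---------------------------unfinished, to be continued--------------------------------------------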



============================================
The gossip protocol used for node discovery only starts once swarm mode is enabled.

vi /usr/lib/systemd/system/docker.service
systemctl daemon-reload
systemctl restart docker

]# docker swarm init --advertise-addr=188.131.210.113:2377
Swarm initialized: current node (m5jpgmwxow5ec256vw8bpgxi9) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join \
    --token SWMTKN-1-5vpumm8bssl8tsa7aq9bc9ytpgxdob5rp5dm0y4b8zed3ef5e9-eckhrx62u9vb0o95xxam98qjc \
    188.131.210.113:2377 


Pitfall 1: the port was previously set to 2375, and then this happens when the node joins the master:
[root@minion ~]# docker swarm join     --token SWMTKN-1-4dt7opomolsz9q2kdykheknj2cbmj8sgydxcljl99h07ob9dtj-0acex2ahyx98wm8fuc6rt0k4i     188.131.210.113:2375
Error response from daemon: rpc error: code = 14 desc = grpc: the connection is unavailable


The reason:
For this test, you need two different Docker hosts that can communicate with each other. Each host must have Docker 17.06 or higher with the following ports open between the two Docker hosts:

TCP port 2377 for cluster management communications
TCP and UDP port 7946 for communication among nodes
UDP port 4789 for overlay network traffic

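How this works:
Port 2377
Used for cluster management communication. For example, when a node wants to join the cluster, it sends its join request to this port on the master node.

Port 7946
Used for communication among the cluster nodes. A node first sends a message to its peer over tcp:7946; if it gets a correct reply, the link works and the two can stay in touch.
It then uses udp:7946 to spread the network information its node needs to share (note that the bulk sync errors in the daemon log further down are reported on TCP connections to port 7946).
As a result, a network created on the master can be discovered by the other nodes.
--wxy: this is presumably the so-called network control plane. Does it replace k-v stores such as etcd?

Port 4789
Look familiar? Right, this is the default port used by vxlan. It carries the overlay network's data: it is the port used by the VTEPs at both ends of the vxlan tunnel, and containers on different hosts communicate through it.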
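
The port list above can be written down as data and turned into firewall rules. The Python sketch below only generates command strings and does not touch the firewall; it uses the same iptables rule form that appears elsewhere in this post. Appending to the INPUT chain is an assumption about the host's firewall layout and may not suit every setup.

```python
# (protocol, port, purpose) triples taken from the port list quoted above.
SWARM_PORTS = [
    ("tcp", 2377, "cluster management"),
    ("tcp", 7946, "communication among nodes"),
    ("udp", 7946, "communication among nodes"),
    ("udp", 4789, "overlay (vxlan) data traffic"),
]


def iptables_accept_rules(ports=SWARM_PORTS) -> list:
    """Return one `iptables -A INPUT ... -j ACCEPT` command string per port."""
    return [
        f"iptables -A INPUT -p {proto} --dport {port} -j ACCEPT"
        for proto, port, _purpose in ports
    ]


if __name__ == "__main__":
    for rule in iptables_accept_rules():
        print(rule)
```

Printing the rules first and reviewing them before running anything as root keeps the change easy to audit.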

Fix: change the port to 2377.
[root@master ~]# docker node ls
ID                           HOSTNAME    STATUS  AVAILABILITY  MANAGER STATUS
m5jpgmwxow5ec256vw8bpgxi9 *  master.wxy  Ready   Active        Leader
z50ektieet4esj1gwlfaisdc4    minion.wxy  Ready   Active

[root@master ~]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
8808070a27e1        bridge              bridge              local
e0407f6da9d8        docker_gwbridge     bridge              local
e866d30f43bf        host                host                local
7j887qknji9s        ingress             overlay             swarm
424fac469906        none                null                local
[root@master ~]# 
[root@master ~]# docker network create -d overlay nginx-net
s8o2cknp3hc8l9cs5uqm4gskd
[root@master ~]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
8808070a27e1        bridge              bridge              local
e0407f6da9d8        docker_gwbridge     bridge              local
e866d30f43bf        host                host                local
7j887qknji9s        ingress             overlay             swarm
s8o2cknp3hc8        nginx-net           overlay             swarm
424fac469906        none                null                local


docker network create --driver=overlay --attachable ov-test


[root@master 5.0.10-1.el7.elrepo.x86_64]# ip netns  exec 1-po3p8i2id3 bridge fdb show dev vxlan1
02:42:0a:00:00:04 master br0 
9e:7e:18:eb:71:77 vlan 1 master br0 permanent
9e:7e:18:eb:71:77 master br0 permanent
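02:42:0a:00:00:04 dst 172.16.0.4 link-netnsid 0 self permanent  --- important
Note:
The network control plane has already taught this host the peer's MAC address, but the destination address is wrong: it must not be the peer's private address (172.16.0.4 is exactly minion's private address). I need to find a way to change it. Maybe the address I used when joining the swarm was wrong? Let's change it.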




[root@master ~]# docker network inspect ov-test
[
    {
        "Name": "ov-test",
        "Id": "po3p8i2id3f9thvd7b8qiuu1o",
        "Created": "2019-05-19T19:12:30.411636993+08:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Containers": {
            "18d6f2f74b8d615f67a2c270f846102bc043d9ee63514a1edaa1087964e0486f": {
                "Name": "centos",
                "EndpointID": "65e8f927972704b27e19647040d5abb999d7142f327a26234ff01775fdf95991",
                "MacAddress": "02:42:0a:00:00:02",
                "IPv4Address": "10.0.0.2/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4097"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "master.wxy-61c0f37910dd",
                "IP": "188.131.210.113"
            },
            {
                "Name": "minion.wxy-362fc1004972",
                "IP": "172.16.0.4"
            }
        ]
    }
]

[root@minion ~]# docker network inspect ov-test 
[
    {
        "Name": "ov-test",
        "Id": "po3p8i2id3f9thvd7b8qiuu1o",
        "Created": "2019-05-19T19:13:41.541163575+08:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Containers": {
            "12459f10b7f68e5eba856a446c46a24bce7d2a7f75b7db0cbd896bc72772455b": {
                "Name": "nginx",
                "EndpointID": "8ace966fe08b40b75f1e67d7c7a77e0e99abb3b5f9dc56f6df5c45b695698a85",
                "MacAddress": "02:42:0a:00:00:04",
                "IPv4Address": "10.0.0.4/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4097"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "master.wxy-61c0f37910dd",
                "IP": "188.131.210.113"
            },
            {
                "Name": "minion.wxy-362fc1004972",
                "IP": "172.16.0.4"
            }
        ]
    }
]
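
Both inspect outputs list minion with 172.16.0.4 in Peers, which is the root of the problem. A small Python sketch that flags peers advertised with a private address, using the standard ipaddress module. The PEERS list is copied from the output above; the function name is illustrative, and is_private also covers ranges such as loopback and link-local, which is fine for this check.

```python
import ipaddress

# Peers section copied from the `docker network inspect ov-test` output above.
PEERS = [
    {"Name": "master.wxy-61c0f37910dd", "IP": "188.131.210.113"},
    {"Name": "minion.wxy-362fc1004972", "IP": "172.16.0.4"},
]


def peers_with_private_ip(peers: list) -> list:
    """Return the names of peers whose advertised IP is not publicly routable."""
    return [p["Name"] for p in peers if ipaddress.ip_address(p["IP"]).is_private]


if __name__ == "__main__":
    for name in peers_with_private_ip(PEERS):
        print("private address advertised by", name)
```

When the two hosts only reach each other over public addresses, any peer reported here will be unreachable for vxlan traffic.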

After the fix:

docker swarm init --advertise-addr=188.131.210.113:2377



docker swarm join \
    --token SWMTKN-1-4m6s7rs3hkhh4bhdv78gymjc7bbq663cfw6qy4yh9ysja2nuvb-bknjkrcu52tvo50dl9ermvxnr \
    --advertise-addr 106.13.146.3:2377 \
    188.131.210.113:2377

docker swarm join \
    --token SWMTKN-1-1qvdgad5v0q5jafdw2ym20lcffi2cd0d4fsuxlr9p5dg8ogqrp-48ti1h5gccibyqx0phzboanq5 \
    --advertise-addr 106.13.146.3:2377 \
    188.131.210.113:2377






docker network create --driver=overlay --attachable ov-test2

[root@master ~]# docker run -d -ti --network=ov-test2 --name=centos2 centos:wxy /bin/sh
[root@minion ~]# docker run -d --network=ov-test2 --name=nginx2 nginx

journalctl -u docker.service

1. First error: the node's IP address was configured incorrectly
level=error msg="periodic bulk sync failure for network 5s2fbdra2g5m9qt14wmiveir8: bulk sync failed on node minion.wxy-611e1a90a99f: failed to send a TCP message during bulk sync: dial tcp 106.13.146.31:7946: connect: connection refused"

2. Second error: the node did not specify an advertise IP when joining the cluster, so during sync the master tried to reach the peer's private IP
level=error msg="Error in responding to bulk sync from node 172.16.0.4: failed to send a TCP message during bulk sync: dial tcp 172.16.0.4:7946: i/o timeout"
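
Both errors name the address the daemon tried to dial, so it helps to pull that out of the journalctl output. A Python sketch using a regular expression written for the two messages above; it is not a general parser for docker daemon logs, and the names are illustrative.

```python
import re

# Matches "dial tcp <host>:<port>: <reason>" as it appears in the two log lines.
DIAL_ERROR = re.compile(r'dial tcp (\S+?):(\d+): ([^"]+)')

LOG_LINES = [
    'msg="periodic bulk sync failure for network 5s2fbdra2g5m9qt14wmiveir8: '
    'bulk sync failed on node minion.wxy-611e1a90a99f: failed to send a TCP '
    'message during bulk sync: dial tcp 106.13.146.31:7946: connect: connection refused"',
    'msg="Error in responding to bulk sync from node 172.16.0.4: failed to send '
    'a TCP message during bulk sync: dial tcp 172.16.0.4:7946: i/o timeout"',
]


def dial_failure(line: str):
    """Return (host, port, reason) for a dial error, or None if the line has none."""
    match = DIAL_ERROR.search(line)
    if match is None:
        return None
    host, port, reason = match.groups()
    return host, int(port), reason


if __name__ == "__main__":
    for line in LOG_LINES:
        print(dial_failure(line))
```

A wrong public address tends to show up as connection refused, while an unreachable private address shows up as a timeout, which is the distinction between the two errors here.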



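3. Problem 2: networks are not shared
Troubleshooting: master sent tcp:7946 and got a reply, then sent udp:7946 and the node received that too, but the payload was small, so my guess is that the network information was not being packed into it.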
Server Version: 1.13.1


iptables -A INPUT -p udp --dport 4789 -j ACCEPT


================
Enable swarm mode on the minion node
docker swarm init --advertise-addr=106.13.146.3:2377


The iptables rule and the container port conflict.

Try:
sudo iptables -t nat -L -n --line-numbers | grep 7946
sudo iptables -t nat -D DOCKER 6926



===A look at the namespaces=======================
[root@minion ~]# ip addr

3: docker_gwbridge: mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:93:81:c9:8d brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker_gwbridge
       valid_lft forever preferred_lft forever
    inet6 fe80::42:93ff:fe81:c98d/64 scope link 
       valid_lft forever preferred_lft forever
4: docker0: mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:d5:cc:f2:db brd ff:ff:ff:ff:ff:ff
    inet 10.0.78.1/24 scope global docker0
       valid_lft forever preferred_lft forever
10: vethd59e050@if9: mtu 1500 qdisc noqueue master docker_gwbridge state UP group default 
    link/ether 22:8d:34:2b:9a:c6 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::208d:34ff:fe2b:9ac6/64 scope link 
       valid_lft forever preferred_lft forever

[root@minion ~]# ip netns exec ingress_sbox ip addr
7: eth0@if8: mtu 1450 qdisc noqueue state UP group default 
    inet 10.255.0.4/16 scope global eth0  --- sandbox container (network namespace)
9: eth1@if10: mtu 1500 qdisc noqueue state UP group default 
    inet 172.17.0.2/16 scope global eth1

[root@minion ~]# ip netns exec 1-ystd5xxiui ip addr
2: br0: mtu 1450 qdisc noqueue state UP group default 
    link/ether aa:cb:0d:a4:a6:b9 brd ff:ff:ff:ff:ff:ff
    inet 10.255.0.1/16 scope global br0
6: vxlan1: mtu 1450 qdisc noqueue master br0 state UNKNOWN group default 
    link/ether aa:cb:0d:a4:a6:b9 brd ff:ff:ff:ff:ff:ff link-netnsid 0
8: veth2@if7: mtu 1450 qdisc noqueue master br0 state UP group default 
    link/ether d2:04:e3:eb:fc:52 brd ff:ff:ff:ff:ff:ff link-netnsid 1

[root@minion ~]# ip netns
1-ystd5xxiui (id: 0)
ingress_sbox (id: 1)
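
In every transcript in this post, the per-network vxlan namespace is named 1- followed by the first ten characters of the network ID (1-731d1b63b3, 1-po3p8i2id3, 1-ystd5xxiui). The Python helper below encodes that observation so the namespace can be found from docker network inspect output. This is a pattern inferred from these transcripts, not a documented rule, and the fixed "1" prefix is simply copied from the observed names.

```python
def overlay_netns_name(network_id: str, prefix: str = "1") -> str:
    """Guess the vxlan namespace name for an overlay network ID.

    Follows the naming seen in this post: "<prefix>-" plus the first 10
    characters of the network ID. Treat the result as a hint only.
    """
    return f"{prefix}-{network_id[:10]}"


if __name__ == "__main__":
    for net_id in (
        "731d1b63b38768022160534b619d09d2e0fb139a7504070bf370a7706ed8ee9e",
        "po3p8i2id3f9thvd7b8qiuu1o",
        "ystd5xxiuiuqjbpn76bbzcaws",
    ):
        print(overlay_netns_name(net_id))
```

The result can then be passed to ip netns exec, as done throughout this post.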

The master looks the same.

[root@minion ~]# docker network inspect ingress
[
    {
        "Name": "ingress",
        "Id": "ystd5xxiuiuqjbpn76bbzcaws",
        "Created": "2019-05-20T18:35:25.014017659+08:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.255.0.0/16",
                    "Gateway": "10.255.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {
            "ingress-sbox": {
                "Name": "ingress-endpoint",
                "EndpointID": "081aaac8d188d9cb4c190bbb5863be933dcc2b98cde071dce9b035c4ea6df957",
                "MacAddress": "02:42:0a:ff:00:04",
                "IPv4Address": "10.255.0.4/16",  -----注意这个ip
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4096"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "master.wxy-b33a341ba33b",
                "IP": "188.131.210.113"
            },
            {
                "Name": "minion.wxy-47481bf33513",
                "IP": "106.13.146.3"
            }
        ]
    }
]


[root@master ~]# docker network inspect ingress
[
    {
        "Name": "ingress",
        "Id": "ystd5xxiuiuqjbpn76bbzcaws",
        "Created": "2019-05-20T18:34:27.512773865+08:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.255.0.0/16",
                    "Gateway": "10.255.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {
            "ingress-sbox": {
                "Name": "ingress-endpoint",
                "EndpointID": "77cc852efe15111e141ae78ee9c29ccf88637599068044677d64f0107f5db78a",
                "MacAddress": "02:42:0a:ff:00:03",
                "IPv4Address": "10.255.0.3/16",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4096"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "master.wxy-b33a341ba33b",
                "IP": "188.131.210.113"
            },
            {
                "Name": "minion.wxy-47481bf33513",
                "IP": "106.13.146.3"
            }
        ]
    }
]

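Note: ingress-sbox is actually a container; let's call it the sandbox container. At a deeper level it is just a namespace, and that namespace is named ingress_sbox. Containers that later join this network effectively share this namespace (of the network type). Each sandbox container is also paired with one more namespace that carries the vxlan-related pieces, i.e. the namespace that provides the overlay network's communication function.
Can the two ends of this network reach each other? Let's verify.
master's sandbox: 10.255.0.3
minion's sandbox: 10.255.0.4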
[root@minion ~]# ip netns exec ingress_sbox ping 10.255.0.3
PING 10.255.0.3 (10.255.0.3) 56(84) bytes of data.
64 bytes from 10.255.0.3: icmp_seq=1 ttl=64 time=9.64 ms
64 bytes from 10.255.0.3: icmp_seq=2 ttl=64 time=9.55 ms
64 bytes from 10.255.0.3: icmp_seq=3 ttl=64 time=9.75 ms
[root@minion ~]# ip netns exec 1-ystd5xxiui bridge fdb show dev vxlan1
02:42:0a:ff:00:03 master br0 
aa:cb:0d:a4:a6:b9 vlan 1 master br0 permanent
aa:cb:0d:a4:a6:b9 master br0 permanent
02:42:0a:ff:00:03 dst 188.131.210.113 link-netnsid 0 self permanent  -- this is exactly the master sandbox's entry


Step 2: start a new container and attach it to the ingress network
/usr/bin/docker-current: Error response from daemon: Could not attach to network ingress: rpc error: code = 7 desc = network ingress not manually attachable.


Step 3:
