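A bridge is a virtual network device and has the usual properties of one (it can be given an IP address, a MAC address, and so on). It is also a virtual switch whose function is similar to that of a physical switch. Bridging is a technique that relays and forwards frames at the link layer: it partitions the network by MAC address, isolates collisions, and joins multiple network segments at the data link layer.
An ordinary network device has only two ends: data that enters at one end leaves from the other. A physical NIC, for example, passes data received from the outside network up to the kernel protocol stack, and sends data coming from the protocol stack out to the physical network. A bridge is different. It has multiple ports, data can enter through any of them, and the port it leaves through is chosen by MAC address, much as in a physical switch.
A bridge is built on top of slave devices (physical devices, virtual devices, VLAN devices, and so on). Attaching a slave device is like plugging a cable between a real-world switch and a user terminal. The bridge can also be given an IP address (see the notes on Linux bridge MAC address behavior later in this article), which lets the host communicate with other hosts on the network through the bridge device. The slave devices are virtualized as ports: their own IP and MAC addresses are no longer used, and they are set to accept every packet, leaving the bridge device to decide what happens to each one: deliver it to the local host, forward it, drop it, or broadcast it.
A bridge is a common way to connect two separate network segments. Once joined by a bridge, the segments behave as a single segment. The mechanism is simple: frames are forwarded at L2, the data link layer.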
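The basic principle is shown in the figure below (image from the internet):
When a new frame arrives, the bridge records its source MAC address and the port it arrived on.
The bridge then looks up the frame's destination MAC address in its local cache. If a matching entry is found, the frame is forwarded only to the port recorded there; otherwise it is flooded to every port except the one it arrived on.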
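The two steps above can be sketched as a toy model in shell. This is only an illustration of the learn-then-forward idea, not how the kernel stores its table: the function names `learn`, `lookup` and `handle_frame`, the port labels, and the MAC addresses are all made up for the example, and ageing of entries and broadcast handling are left out.

```shell
#!/bin/sh
# Toy forwarding table: one "mac port" pair per line (illustrative only).
FDB=""

# learn MAC PORT: remember (or refresh) the port a source MAC was seen on.
learn() {
    FDB=$(printf '%s\n' "$FDB" | awk -v m="$1" 'NF == 2 && $1 != m')
    FDB=$(printf '%s\n%s %s\n' "$FDB" "$1" "$2")
}

# lookup MAC: print the port recorded for MAC, or nothing if it is unknown.
lookup() {
    printf '%s\n' "$FDB" | awk -v m="$1" '$1 == m { print $2; exit }'
}

# handle_frame IN_PORT SRC_MAC DST_MAC: learn the source, then decide the output.
handle_frame() {
    learn "$2" "$1"
    out=$(lookup "$3")
    if [ -z "$out" ]; then
        echo "flood to all ports except $1"
    elif [ "$out" = "$1" ]; then
        echo "drop (destination is on the ingress port)"
    else
        echo "forward to $out"
    fi
}

# The destination is unknown at first, so the frame is flooded.
handle_frame port1 aa:aa:aa:aa:aa:01 aa:aa:aa:aa:aa:02
# The reply finds the entry learned from the first frame.
handle_frame port2 aa:aa:aa:aa:aa:02 aa:aa:aa:aa:aa:01
```

The first call prints `flood to all ports except port1` and the second prints `forward to port1`, because the first frame taught the table where `aa:aa:aa:aa:aa:01` lives.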
# Install the bridge-utils package, load the bridge module, and enable kernel IP forwarding.
root@ubuntu:/home/sunld# apt-get install bridge-utils
root@ubuntu:/home/sunld# modprobe bridge
root@ubuntu:/home/sunld# echo "1">/proc/sys/net/ipv4/ip_forward
root@ubuntu:/home/sunld# brctl
Usage: brctl [commands]
commands:
addbr           <bridge>                  add bridge
delbr           <bridge>                  delete bridge
addif           <bridge> <device>         add interface to bridge
delif           <bridge> <device>         delete interface from bridge
hairpin         <bridge> <port> {on|off}  turn hairpin on/off
setageing       <bridge> <time>           set ageing time
setbridgeprio   <bridge> <prio>           set bridge priority
setfd           <bridge> <time>           set bridge forward delay
sethello        <bridge> <time>           set hello time
setmaxage       <bridge> <time>           set max message age
setpathcost     <bridge> <port> <cost>    set path cost
setportprio     <bridge> <port> <prio>    set port priority
show            [ <bridge> ]              show a list of bridges
showmacs        <bridge>                  show a list of mac addrs
showstp         <bridge>                  show bridge stp info
stp             <bridge> {on|off}         turn stp on/off
# Add a bridge
root@ubuntu:/home/sunld# brctl addbr br0
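# Attach the existing NICs to the bridge. The NICs are put into promiscuous mode and need no IP address, because the bridge works at L2, the link layer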
root@ubuntu:/home/sunld# ifconfig eth0 0.0.0.0 promisc
root@ubuntu:/home/sunld# ifconfig eth1 0.0.0.0 promisc
root@ubuntu:/home/sunld# brctl addif br0 eth0 eth1
# List the existing bridges
root@ubuntu:/home/sunld# brctl show
root@ubuntu:/home/sunld# bridge link
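# Assign an IP address to the bridge
root@ubuntu:/home/sunld# ifconfig br0 10.10.1.1 netmask 255.255.0.0 up
# Delete the bridge (it must be down first)
root@ubuntu:/home/sunld# brctl delbr br0
# Turn off the Spanning Tree Protocol to cut down on packet noise
root@ubuntu:/home/sunld# brctl stp br0 off
# Load the bridge module at boot
root@ubuntu:/home/sunld# echo "modprobe bridge">>/etc/rc.local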
# Bring up eth0, eth1 and br0 at boot. eth0 and eth1 have no IP configuration; when br0 is brought up, eth0 and eth1 are put into promiscuous mode and bridged together.
vim /etc/network/interfaces
auto lo eth0 eth1 br0
iface lo inet loopback
iface br0 inet static
address 10.10.10.1
netmask 255.255.0.0
gateway 10.10.10.254
pre-up ip link set eth0 promisc on
pre-up ip link set eth1 promisc on
pre-up echo "1">/proc/sys/net/ipv4/ip_forward
bridge_ports eth0 eth1
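The brctl and ifconfig steps above can also be written with iproute2, which is what the rest of this article uses. The following is a rough sketch under a few assumptions: the helper names `run` and `setup_bridge` are made up for the example, `10.10.1.1/16` is the CIDR form of the address and netmask used above, and bringing each port up explicitly is an addition of this sketch. The script only prints the commands unless `APPLY=1` is set, in which case it needs root.

```shell
#!/bin/sh
# Dry run by default: commands are only printed.
# Set APPLY=1 (as root) to actually execute them.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

# setup_bridge BRIDGE ADDR PORT...
# Creates BRIDGE, turns STP off, enslaves each PORT, and assigns ADDR (CIDR).
setup_bridge() {
    br=$1
    addr=$2
    shift 2
    run ip link add name "$br" type bridge             # brctl addbr
    run ip link set dev "$br" type bridge stp_state 0  # brctl stp <bridge> off
    for port in "$@"; do
        run ip -4 addr flush dev "$port"               # ifconfig <port> 0.0.0.0
        run ip link set dev "$port" promisc on         # ifconfig <port> promisc
        run ip link set dev "$port" master "$br"       # brctl addif
        run ip link set dev "$port" up
    done
    run ip addr add "$addr" dev "$br"                  # ifconfig <bridge> <addr> netmask <mask>
    run ip link set dev "$br" up
}

setup_bridge br0 10.10.1.1/16 eth0 eth1
```

Printing first and applying second makes it easy to review the exact command sequence before touching a machine's only network link.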
root@ubuntu:~# ip link add br-sunld08-test type bridge
root@ubuntu:~# ip link set dev br-sunld08-test up
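When a bridge is first created it is a standalone network device. Only one of its ports is connected, to the protocol stack; the other ports are connected to nothing, so at this point the bridge does nothing useful, as shown in the figure below: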
root@ubuntu:~# ip link show br-sunld08-test
48: br-sunld08-test: mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/ether 1a:7b:cf:3b:08:53 brd ff:ff:ff:ff:ff:ff
root@ubuntu:~# brctl show br-sunld08-test
bridge name bridge id STP enabled interfaces
br-sunld08-test 8000.000000000000 no
The deployment is shown in the figure below:
# Create a veth pair
root@ubuntu:~# ip link add test-veth08 type veth peer name test-veth09
# Assign IP addresses
root@ubuntu:~# ifconfig test-veth08 192.168.209.135/24 up
root@ubuntu:~# ifconfig test-veth09 192.168.209.136/24 up
# Bring the devices up
root@ubuntu:~# ip link set dev test-veth08 up
root@ubuntu:~# ip link set dev test-veth09 up
# Attach test-veth08 to br-sunld08-test
root@ubuntu:~# ip link set dev test-veth08 master br-sunld08-test
# Inspect the result
root@ubuntu:~# bridge link | grep test-veth08
50: test-veth08 state UP : <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br-sunld08-test state forwarding priority 32 cost 2
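What changes when test-veth08 is attached to br-sunld08-test:
br-sunld08-test and test-veth08 are connected, and the channel between them is two-way.
The channel between the protocol stack and test-veth08 becomes one-way: the stack can still send data to test-veth08, but data that test-veth08 receives from outside is no longer passed up to the stack.
The MAC address of br-sunld08-test changes to that of test-veth08 (this can be checked with ip a).
In effect the bridge intercepts traffic between test-veth08 and the protocol stack: everything test-veth08 would have handed to the stack is diverted to the bridge, and the bridge can also send data out through test-veth08.
Note: on some distributions the ping here may also fail for a different reason, namely ARP-related kernel settings that stop test-veth09 from answering ARP requests. Ubuntu is one such case. The workaround is: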
root@ubuntu:/home/sunld# echo 1 > /proc/sys/net/ipv4/conf/test-veth08/accept_local
root@ubuntu:/home/sunld# echo 1 > /proc/sys/net/ipv4/conf/test-veth09/accept_local
root@ubuntu:/home/sunld# echo 0 > /proc/sys/net/ipv4/conf/all/rp_filter
root@ubuntu:/home/sunld# echo 0 > /proc/sys/net/ipv4/conf/test-veth08/rp_filter
root@ubuntu:/home/sunld# echo 0 > /proc/sys/net/ipv4/conf/test-veth09/rp_filter
root@ubuntu:~# ping -c 4 192.168.209.136 -I test-veth08
PING 192.168.209.136 (192.168.209.136) from 192.168.209.135 test-veth08: 56(84) bytes of data.
From 192.168.209.135 icmp_seq=1 Destination Host Unreachable
From 192.168.209.135 icmp_seq=2 Destination Host Unreachable
From 192.168.209.135 icmp_seq=3 Destination Host Unreachable
From 192.168.209.135 icmp_seq=4 Destination Host Unreachable
# test-veth08's ARP cache has no entry for test-veth09's MAC address, so an ARP request is sent before the ping
# The capture on test-veth09 shows that it received the ARP request and sent a reply
root@ubuntu:/home/sunld# tcpdump -n -i test-veth09
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on test-veth09, link-type EN10MB (Ethernet), capture size 262144 bytes
00:22:38.369683 ARP, Request who-has 192.168.209.136 tell 192.168.209.135, length 28
00:22:38.369698 ARP, Reply 192.168.209.136 is-at 6a:11:bf:6b:21:8f, length 28
# The capture on test-veth08 shows that the request went out and the reply came back
root@ubuntu:/home/sunld# tcpdump -n -i test-veth08
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on test-veth08, link-type EN10MB (Ethernet), capture size 262144 bytes
00:22:11.736906 ARP, Request who-has 192.168.209.136 tell 192.168.209.135, length 28
00:22:11.736918 ARP, Reply 192.168.209.136 is-at 6a:11:bf:6b:21:8f, length 28
# On br-sunld08-test, only the reply is seen
root@ubuntu:/home/sunld# tcpdump -n -i br-sunld08-test
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-sunld08-test, link-type EN10MB (Ethernet), capture size 262144 bytes
00:24:48.601526 ARP, Reply 192.168.209.136 is-at 6a:11:bf:6b:21:8f, length 28
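The captures show that both the outbound and the return path work. The problem is that when test-veth08 receives the reply, it hands it to br-sunld08-test instead of the protocol stack, so the stack never learns test-veth09's MAC address and communication fails.
This analysis shows that an IP address on test-veth08 serves no purpose: even if the stack sends data out through test-veth08, the replies never make it back. So we move test-veth08's IP address to the bridge.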
root@ubuntu:~# ip addr del 192.168.209.135/24 dev test-veth08
root@ubuntu:~# ip addr add 192.168.209.135/24 dev br-sunld08-test
root@ubuntu:~# ping -c 4 192.168.209.136 -I br-sunld08-test
PING 192.168.209.136 (192.168.209.136) from 192.168.209.135 br-sunld08-test: 56(84) bytes of data.
64 bytes from 192.168.209.136: icmp_seq=1 ttl=64 time=0.238 ms
64 bytes from 192.168.209.136: icmp_seq=2 ttl=64 time=0.089 ms
64 bytes from 192.168.209.136: icmp_seq=3 ttl=64 time=0.243 ms
64 bytes from 192.168.209.136: icmp_seq=4 ttl=64 time=0.139 ms
Pinging the gateway still fails, though. The bridge connects only two network devices, 192.168.209.135 and 192.168.209.136, so br-sunld08-test has no way to reach 192.168.209.2.
root@ubuntu:~# ping -c 4 192.168.209.2 -I br-sunld08-test
PING 192.168.209.2 (192.168.209.2) from 192.168.209.135 br-sunld08-test: 56(84) bytes of data.
^C
--- 192.168.209.2 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2000ms
root@ubuntu:~# ip link set dev eth0 master br-sunld08-test
root@ubuntu:~# bridge link
2: eth0 state UP : mtu 1500 master br-sunld08-test state forwarding priority 32 cost 4
50: test-veth08 state UP : mtu 1500 master br-sunld08-test state forwarding priority 32 cost 2
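A bridge makes no distinction between physical and virtual devices; to it they are all just network devices. So once eth0 joins the bridge it behaves exactly like test-veth08: every packet it receives from the outside network is handed unconditionally to br-sunld08-test, and eth0 itself becomes little more than a cable.
At this point pinging the gateway through eth0 fails. But because the bridge is now connected to the outside physical network through eth0 acting as a cable, the devices attached to the bridge can ping the gateway. Those devices are test-veth09 and the bridge itself: test-veth09 is connected through test-veth08, which serves as its cable, while br-sunld08-test can be thought of as having a built-in NIC of its own.
Since eth0 now functions much like a cable, an IP address on eth0 no longer serves any purpose and also interferes with route selection in the protocol stack. For example, if the ping above is run without specifying an interface, the stack may prefer eth0 and the ping fails. The IP address on eth0 therefore needs to be removed.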
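As shown in the figure below (image from the internet):
When a device is attached to a bridge, the IP address on that device stops working: Linux no longer uses it to receive data at layer 3. The device's IP address should be assigned to the bridge device instead.
For a device attached to a bridge, a packet is handed to the bridge (which then performs the table lookup, broadcast, and other follow-up steps) only when the device receives it. Packets being sent are not handed to the bridge; they go straight on to the next egress. Users often overlook this when configuring networks, which leads to network failures.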
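Moving an address from an enslaved device to its bridge might look like the sketch below. The helper names `run` and `move_ip` are made up for the example, and the address `192.168.209.134/24` is hypothetical, since the article never states eth0's address; `192.168.209.2` is the gateway address pinged earlier. The script only prints the commands unless `APPLY=1` is set (as root).

```shell
#!/bin/sh
# Dry run by default: commands are only printed.
# Set APPLY=1 (as root) to actually execute them.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

# move_ip PORT BRIDGE ADDR GATEWAY
# Moves ADDR (CIDR notation) from the enslaved PORT to BRIDGE and points
# the default route at GATEWAY through BRIDGE.
move_ip() {
    run ip addr del "$3" dev "$1"
    run ip addr add "$3" dev "$2"
    run ip route replace default via "$4" dev "$2"
}

# Hypothetical address: eth0's real address is not given in the article.
move_ip eth0 br-sunld08-test 192.168.209.134/24 192.168.209.2
```

If this is run over an SSH session that arrives through the port being changed, the connection will likely drop between the first and second command, so it is safer to try it from a local console.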
# Create a bridge and check its default MAC
root@ubuntu:~# ip link add br-mac type bridge
20: br-mac: mtu 1500 qdisc noop state DOWN group default
link/ether f6:b0:c9:7c:04:1d brd ff:ff:ff:ff:ff:ff
# Create a veth pair
root@ubuntu:~# ip link add mac-veth01 type veth peer name mac-veth02
20: br-mac: mtu 1500 qdisc noop state DOWN group default
link/ether f6:b0:c9:7c:04:1d brd ff:ff:ff:ff:ff:ff
21: mac-veth02: mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 92:a2:23:d5:88:56 brd ff:ff:ff:ff:ff:ff
22: mac-veth01: mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:50 brd ff:ff:ff:ff:ff:ff
# Attach mac-veth01 (the larger MAC)
20: br-mac: mtu 1500 qdisc noop state DOWN group default
link/ether de:ee:ff:8d:0c:50 (automatically becomes mac-veth01's MAC) brd ff:ff:ff:ff:ff:ff
21: mac-veth02: mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 92:a2:23:d5:88:56 brd ff:ff:ff:ff:ff:ff
22: mac-veth01: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:50 brd ff:ff:ff:ff:ff:ff
# Attach mac-veth02 (the smaller MAC)
root@ubuntu:~# ip link set dev mac-veth02 master br-mac
20: br-mac: mtu 1500 qdisc noop state DOWN group default
link/ether 92:a2:23:d5:88:56 (changes to the smaller MAC, that of mac-veth02) brd ff:ff:ff:ff:ff:ff
21: mac-veth02: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether 92:a2:23:d5:88:56 brd ff:ff:ff:ff:ff:ff
22: mac-veth01: mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:50 brd ff:ff:ff:ff:ff:ff
# Increase mac-veth02's MAC
root@ubuntu:~# ifconfig mac-veth02 hw ether de:ee:ff:8d:0c:51
20: br-mac: mtu 1500 qdisc noop state DOWN group default
link/ether de:ee:ff:8d:0c:50 (changes to the smaller MAC, now that of mac-veth01) brd ff:ff:ff:ff:ff:ff
21: mac-veth02: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:51 brd ff:ff:ff:ff:ff:ff
22: mac-veth01: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:50 brd ff:ff:ff:ff:ff:ff
# Set br-mac's MAC explicitly (to a larger value)
root@ubuntu:~# ifconfig br-mac hw ether de:ee:ff:8d:0c:52
20: br-mac: mtu 1500 qdisc noop state DOWN group default
link/ether de:ee:ff:8d:0c:52 (changes to the specified MAC) brd ff:ff:ff:ff:ff:ff
21: mac-veth02: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:51 brd ff:ff:ff:ff:ff:ff
22: mac-veth01: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:50 brd ff:ff:ff:ff:ff:ff
# Set br-mac's MAC to match mac-veth01, then lower mac-veth02's MAC
root@ubuntu:~# ifconfig br-mac hw ether de:ee:ff:8d:0c:50
root@ubuntu:~# ifconfig mac-veth02 hw ether de:ee:ff:8d:0c:49
20: br-mac: mtu 1500 qdisc noop state DOWN group default
link/ether de:ee:ff:8d:0c:50 (same as the MAC that was set; unchanged) brd ff:ff:ff:ff:ff:ff
21: mac-veth02: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:49 brd ff:ff:ff:ff:ff:ff
22: mac-veth01: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:50 brd ff:ff:ff:ff:ff:ff
# Increase mac-veth01's MAC
root@ubuntu:~# ifconfig mac-veth01 hw ether de:ee:ff:8d:0c:51
20: br-mac: mtu 1500 qdisc noop state DOWN group default
link/ether de:ee:ff:8d:0c:50 (MAC unchanged) brd ff:ff:ff:ff:ff:ff
21: mac-veth02: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:49 brd ff:ff:ff:ff:ff:ff
22: mac-veth01: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:51 brd ff:ff:ff:ff:ff:ff
# Add new devices
root@ubuntu:~# ip link add mac-veth03 type veth peer name mac-veth04
20: br-mac: mtu 1500 qdisc noop state DOWN group default
link/ether de:ee:ff:8d:0c:50 brd ff:ff:ff:ff:ff:ff
21: mac-veth02: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:49 brd ff:ff:ff:ff:ff:ff
22: mac-veth01: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:51 brd ff:ff:ff:ff:ff:ff
23: mac-veth04: mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 46:62:dd:cd:4f:41 brd ff:ff:ff:ff:ff:ff
24: mac-veth03: mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether c6:3f:65:95:e0:93 brd ff:ff:ff:ff:ff:ff
# Attach mac-veth04 (a smaller MAC)
root@ubuntu:~# brctl addif br-mac mac-veth04
20: br-mac: mtu 1500 qdisc noop state DOWN group default
link/ether de:ee:ff:8d:0c:50 (unchanged) brd ff:ff:ff:ff:ff:ff
21: mac-veth02: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:49 brd ff:ff:ff:ff:ff:ff
22: mac-veth01: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether de:ee:ff:8d:0c:51 brd ff:ff:ff:ff:ff:ff
23: mac-veth04: mtu 1500 qdisc noop master br-mac state DOWN group default qlen 1000
link/ether 46:62:dd:cd:4f:41 brd ff:ff:ff:ff:ff:ff
24: mac-veth03: mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether c6:3f:65:95:e0:93 brd ff:ff:ff:ff:ff:ff
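If no hardware MAC has been set explicitly on a bridge, its MAC address follows its ports: whenever the ports change, the bridge automatically adopts the smallest MAC address among them. Once a MAC has been set explicitly, it stays fixed regardless of later port changes. Some references state that a bridge can only be given a MAC that already belongs to one of its ports; in the experiment above, however, de:ee:ff:8d:0c:52 belonged to no port and was still accepted.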
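The selection rule observed in the experiment can be written down as a small shell function. This is a model of the observed behavior only, not kernel code, and the function name `pick_bridge_mac` and its argument convention are made up for the example.

```shell
#!/bin/sh
# pick_bridge_mac PINNED PORT_MAC...
# Prints the MAC the bridge would end up with in the experiment above.
# PINNED is the MAC set explicitly on the bridge, or "" if none was set.
pick_bridge_mac() {
    pinned=$1
    shift
    if [ -n "$pinned" ]; then
        # An explicitly set MAC wins and ignores the ports.
        echo "$pinned"
        return
    fi
    # Fixed-width lowercase hex sorts correctly as plain text,
    # so the first line after sorting is the smallest MAC.
    printf '%s\n' "$@" | tr 'A-F' 'a-f' | LC_ALL=C sort | head -n 1
}

# mac-veth01 and mac-veth02 attached, nothing set on the bridge:
pick_bridge_mac "" de:ee:ff:8d:0c:50 92:a2:23:d5:88:56   # prints 92:a2:23:d5:88:56
# Bridge MAC set to ...:50, ports later changed to ...:49 and ...:51:
pick_bridge_mac de:ee:ff:8d:0c:50 de:ee:ff:8d:0c:49 de:ee:ff:8d:0c:51   # prints de:ee:ff:8d:0c:50
```

Both calls reproduce values seen in the transcript: the smaller port MAC in the first case, and the unchanged explicit MAC in the second.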