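This article analyzes the data flow of a virtual machine in a Neutron provider network environment.
The lab environment is as follows:
OpenStack: Havana (Neutron ML2 + Open vSwitch agent, VLAN mode)
Provider network: VLAN 100, subnet 100.100.100.0/24, gateway 100.100.100.1
The VM network topology is as follows:
We use pinging 8.8.8.8 from inside the VM as the example for the data flow.
When we run ping 8.8.8.8 in the VM, the guest kernel looks up its routing table to see whether it has a route to 8.8.8.8.
The output of ifconfig and ip route in the VM is as follows: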
# ifconfig
eth0 Link encap:Ethernet HWaddr FA:16:3E:2E:FD:E1
inet addr:100.100.100.2 Bcast:100.100.100.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1906 errors:0 dropped:0 overruns:0 frame:0
TX packets:1709 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:147567 (144.1 KiB) TX bytes:233064 (227.6 KiB)
# ip route show
169.254.169.254 via 100.100.100.3 dev eth0 proto static
100.100.100.0/24 dev eth0 proto kernel scope link src 100.100.100.2
169.254.0.0/16 dev eth0 scope link metric 1002
default via 100.100.100.1 dev eth0
The VM has no directly connected route to 8.8.8.8, so only the default route matches and the packet is sent to the default gateway, 100.100.100.1.
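The route selection described above can be sketched as a longest-prefix match over the four routes from the `ip route show` output. This is a simplified illustration only: the function name `lookup` is made up for this sketch, and it ignores details such as route metrics that the real kernel also considers.

```python
import ipaddress

# Routes copied from the `ip route show` output above:
# (destination prefix, gateway), where None means the route is on-link.
ROUTES = [
    ("169.254.169.254/32", "100.100.100.3"),
    ("100.100.100.0/24", None),
    ("169.254.0.0/16", None),
    ("0.0.0.0/0", "100.100.100.1"),  # the "default" route
]


def lookup(dst):
    """Return (matched prefix, next hop) using longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, gateway in ROUTES:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    if best is None:
        raise LookupError("no route to " + dst)
    net, gateway = best
    # On-link routes have no gateway: the next hop is the destination itself.
    next_hop = gateway if gateway is not None else dst
    return str(net), next_hop


if __name__ == "__main__":
    print(lookup("8.8.8.8"))        # only the default route matches
    print(lookup("100.100.100.7"))  # matched by the directly connected subnet
```

For 8.8.8.8 the only matching entry is the default route, so the next hop is the gateway. An address inside 100.100.100.0/24 would instead be delivered directly on eth0.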
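The VM first broadcasts an ARP request for the default gateway's MAC address. Once the gateway answers the ARP request, the VM knows the gateway's MAC address.
The VM then sends the ICMP echo request. Its source IP is 100.100.100.2, its destination IP is 8.8.8.8, its source MAC is FA:16:3E:2E:FD:E1, and its destination MAC is the default gateway's MAC.
The packet then arrives at tapdfc176e4-5a, the tap device that backs the VM's eth0 on the compute node.
Here is the relevant configuration of tapdfc176e4-5a on the compute node: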
# ifconfig tapdfc176e4-5a
tapdfc176e4-5a Link encap:Ethernet HWaddr FE:16:3E:2E:FD:E1
inet6 addr: fe80::fc16:3eff:fe2e:fde1/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:217 errors:0 dropped:0 overruns:0 frame:0
TX packets:249 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:28180 (27.5 KiB) TX bytes:21472 (20.9 KiB)
# brctl show
bridge name       bridge id           STP enabled   interfaces
qbr6750eac8-57    8000.d6c128fae672   no            qvb6750eac8-57
                                                    tap6750eac8-57
qbrdfc176e4-5a    8000.7e07e8dd1cf6   no            qvbdfc176e4-5a
                                                    tapdfc176e4-5a
virbr0            8000.525400e75eaa   yes           virbr0-nic
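Here we can see that tapdfc176e4-5a is attached to the Linux bridge qbrdfc176e4-5a rather than directly to the OVS bridge br-int. The reason is that if tapdfc176e4-5a were attached directly to br-int, Security Groups would have no chance to apply their filtering rules.
Security Groups are currently implemented with iptables. The Linux bridge code has a setting called bridge-nf-call-iptables which passes traffic crossing a bridge through iptables so that it can be filtered. We can check whether it is enabled with the following command: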
# cat /proc/sys/net/bridge/bridge-nf-call-iptables
1
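The same check can be scripted. The sketch below is only an illustration: the helper name `bridge_nf_enabled` is invented here, and treating a missing file as "disabled" is an assumption made for convenience (the file is absent when the kernel's bridge netfilter support is not loaded).

```python
from pathlib import Path

SYSCTL_PATH = "/proc/sys/net/bridge/bridge-nf-call-iptables"


def bridge_nf_enabled(path=SYSCTL_PATH):
    """Return True if the sysctl file contains 1, False if it is 0 or missing."""
    try:
        return Path(path).read_text().strip() == "1"
    except FileNotFoundError:
        # Assumption of this sketch: no file means bridged traffic is not
        # being passed to iptables.
        return False


if __name__ == "__main__":
    print("bridge-nf-call-iptables enabled:", bridge_nf_enabled())
```

If this reports False on a compute node, the Security Group chains shown later would not see traffic crossing the qbr bridges.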
We can use iptables -nvL to view this VM's Security Group rules in the filter table:
Chain neutron-openvswi-idfc176e4-5 (1 references)
pkts bytes target prot opt in out source destination
0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 state INVALID
318 20107 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
1 60 RETURN tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:22
0 0 RETURN udp -- * * 100.100.100.3 0.0.0.0/0 udp spt:67 dpt:68
0 0 RETURN udp -- * * 100.100.100.4 0.0.0.0/0 udp spt:67 dpt:68
0 0 neutron-openvswi-sg-fallback all -- * * 0.0.0.0/0 0.0.0.0/0
Chain neutron-openvswi-odfc176e4-5 (2 references)
pkts bytes target prot opt in out source destination
0 0 RETURN udp -- * * 0.0.0.0/0 0.0.0.0/0 udp spt:68 dpt:67
299 31689 neutron-openvswi-sdfc176e4-5 all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 DROP udp -- * * 0.0.0.0/0 0.0.0.0/0 udp spt:67 dpt:68
0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 state INVALID
295 31365 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
4 324 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0
0 0 neutron-openvswi-sg-fallback all -- * * 0.0.0.0/0 0.0.0.0/0
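Neutron filters the traffic entering and leaving the VM. The neutron-openvswi-idfc176e4-5 chain holds the access control rules for traffic entering the VM, and the neutron-openvswi-odfc176e4-5 chain holds the rules for traffic leaving the VM.
These two chains are called from neutron-openvswi-sg-chain: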
Chain neutron-openvswi-sg-chain (4 references)
pkts bytes target prot opt in out source destination
319 20167 neutron-openvswi-idfc176e4-5 all -- * * 0.0.0.0/0 0.0.0.0/0 PHYSDEV match --physdev-out tapdfc176e4-5a --physdev-is-bridged
299 31689 neutron-openvswi-odfc176e4-5 all -- * * 0.0.0.0/0 0.0.0.0/0 PHYSDEV match --physdev-in tapdfc176e4-5a --physdev-is-bridged
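To make the rule order concrete, here is a rough model of how the ingress chain neutron-openvswi-idfc176e4-5 treats a packet: rules are tried top to bottom and the first match decides. Two things are assumptions of this sketch and are not shown in the listings above: that a RETURN leads to the packet being accepted by the calling chain, and that neutron-openvswi-sg-fallback drops whatever reaches it. The `Packet` class and `evaluate` function are illustrative names.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    proto: str          # "tcp", "udp", "icmp", ...
    src: str            # source IP address
    state: str = "NEW"  # conntrack state: NEW, ESTABLISHED, RELATED, INVALID
    sport: int = 0
    dport: int = 0


def is_dhcp_reply_from(server):
    """Match a DHCP reply (udp 67 -> 68) coming from the given server."""
    return lambda p: (p.proto == "udp" and p.src == server
                      and (p.sport, p.dport) == (67, 68))


# Ingress chain neutron-openvswi-idfc176e4-5, in the order listed above.
INGRESS_RULES = [
    (lambda p: p.state == "INVALID", "DROP"),
    (lambda p: p.state in ("RELATED", "ESTABLISHED"), "RETURN"),
    (lambda p: p.proto == "tcp" and p.dport == 22, "RETURN"),
    (is_dhcp_reply_from("100.100.100.3"), "RETURN"),
    (is_dhcp_reply_from("100.100.100.4"), "RETURN"),
    (lambda p: True, "FALLBACK"),  # neutron-openvswi-sg-fallback
]


def evaluate(packet, rules=INGRESS_RULES):
    """Return "allowed" or "dropped" for a packet heading into the VM."""
    for matches, target in rules:
        if matches(packet):
            # Assumption: RETURN means the caller accepts the packet,
            # and the fallback chain drops it.
            return "allowed" if target == "RETURN" else "dropped"
    return "allowed"


if __name__ == "__main__":
    # Echo reply to our ping: conntrack sees it as part of an existing flow.
    print(evaluate(Packet("icmp", "8.8.8.8", state="ESTABLISHED")))
    # Unsolicited ping from outside: no rule allows it.
    print(evaluate(Packet("icmp", "8.8.8.8")))
```

Under these assumptions, the echo reply from 8.8.8.8 would be let through by the RELATED,ESTABLISHED rule, while a new inbound connection is only allowed on TCP port 22.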
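After the packet has passed the Linux bridge and the Security Group rules, it crosses the veth pair qvbdfc176e4-5a/qvodfc176e4-5a and arrives at port qvodfc176e4-5a on br-int. This port shows "tag: 2", which means it is an access port with VLAN ID 2.
The OVS output is as follows: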
# ovs-vsctl show
47115847-b828-47f3-bbdb-e18d4b0fd11e
Bridge br-int
Port "tap39b2b891-3b"
tag: 2
Interface "tap39b2b891-3b"
type: internal
Port br-int
Interface br-int
type: internal
Port "qvo6750eac8-57"
tag: 1
Interface "qvo6750eac8-57"
Port "qvodfc176e4-5a"
tag: 2
Interface "qvodfc176e4-5a"
Port "int-br-eth2"
Interface "int-br-eth2"
Port "qr-441abe6b-8b"
tag: 1
Interface "qr-441abe6b-8b"
type: internal
Bridge br-ex
Port br-ex
Interface br-ex
type: internal
Port "qg-019b0743-e4"
Interface "qg-019b0743-e4"
type: internal
Port "eth3"
Interface "eth3"
Bridge "br-eth2"
Port "eth2"
Interface "eth2"
Port "br-eth2"
Interface "br-eth2"
type: internal
Port "phy-br-eth2"
Interface "phy-br-eth2"
ovs_version: "1.11.0"
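The port-to-tag relationship in this output can be pulled out with a few lines of code. The sketch below works on a shortened copy of the br-int section above; the indentation in `SAMPLE` is assumed, and `port_tags` is a name made up for this illustration rather than part of any OVS tooling.

```python
import re

# Shortened copy of the br-int section of `ovs-vsctl show` (indentation assumed).
SAMPLE = '''\
    Bridge br-int
        Port "tap39b2b891-3b"
            tag: 2
            Interface "tap39b2b891-3b"
                type: internal
        Port "qvo6750eac8-57"
            tag: 1
            Interface "qvo6750eac8-57"
        Port "qvodfc176e4-5a"
            tag: 2
            Interface "qvodfc176e4-5a"
        Port "int-br-eth2"
            Interface "int-br-eth2"
'''


def port_tags(text):
    """Map each port name to its VLAN tag (None for ports without a tag)."""
    tags = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r'Port "?([^"]+)"?$', line)
        if match:
            current = match.group(1)
            tags[current] = None
        elif line.startswith("tag:") and current is not None:
            tags[current] = int(line.split(":", 1)[1])
    return tags


if __name__ == "__main__":
    for port, tag in port_tags(SAMPLE).items():
        print(port, "->", "untagged" if tag is None else "access VLAN %d" % tag)
```

The VM's port qvodfc176e4-5a carries tag 2, while int-br-eth2 has no tag, so it passes frames with their VLAN headers intact toward br-eth2.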
The virtual switch br-int sends the packet out through int-br-eth2. Because int-br-eth2 and phy-br-eth2 are a veth pair, the packet enters br-eth2 through phy-br-eth2. The OpenFlow output for br-eth2 is as follows:
# ovs-ofctl show br-eth2
OFPT_FEATURES_REPLY (xid=0x2): dpid:00000800270731f9
n_tables:254, n_buffers:256
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: OUTPUT SET_VLAN_VID SET_VLAN_PCP STRIP_VLAN SET_DL_SRC SET_DL_DST SET_NW_SRC SET_NW_DST SET_NW_TOS SET_TP_SRC SET_TP_DST ENQUEUE
1(eth2): addr:08:00:27:07:31:f9
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
2(phy-br-eth2): addr:a2:e1:41:5c:cc:bf
config: 0
state: 0
current: 10GB-FD COPPER
speed: 10000 Mbps now, 0 Mbps max
LOCAL(br-eth2): addr:08:00:27:07:31:f9
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
# ovs-ofctl dump-flows br-eth2
NXST_FLOW reply (xid=0x4):
cookie=0x0, duration=3968.876s, table=0, n_packets=20, n_bytes=2052, idle_age=1947, priority=4,in_port=2,dl_vlan=1 actions=mod_vlan_vid:1001,NORMAL
cookie=0x0, duration=3967.173s, table=0, n_packets=218, n_bytes=28424, idle_age=502, priority=4,in_port=2,dl_vlan=2 actions=mod_vlan_vid:100,NORMAL
cookie=0x0, duration=3972.688s, table=0, n_packets=10, n_bytes=764, idle_age=1986, priority=2,in_port=2 actions=drop
cookie=0x0, duration=3976.268s, table=0, n_packets=411, n_bytes=77162, idle_age=14, priority=1 actions=NORMAL
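Pay particular attention to the following OpenFlow entry:
cookie=0x0, duration=3967.173s, table=0, n_packets=218, n_bytes=28424, idle_age=502, priority=4,in_port=2,dl_vlan=2 actions=mod_vlan_vid:100,NORMAL
"dl_vlan=2" matches packets whose VLAN tag is 2. Our packet entered br-int through the access port with tag 2, so it carries a VLAN header with VLAN ID 2.
"actions=mod_vlan_vid:100" rewrites the VLAN header, changing the VLAN ID to 100.
"NORMAL" applies the standard switch behavior.
In other words, the OpenFlow rule on br-eth2 translates our packet into a packet with VLAN ID 100. Looking back at the OpenFlow entries on br-int, we find a similar rule for the reverse direction, which changes the VLAN ID from 100 to 2.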
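The br-eth2 flow table can be modeled as a priority-ordered list of match/action pairs. This is a minimal sketch, not how OVS is implemented: the `process` function and the dictionary representation of a packet are assumptions made for illustration, and only the fields that appear in the flows (in_port, dl_vlan) are modeled.

```python
# Flows from `ovs-ofctl dump-flows br-eth2`: (priority, match, actions).
# A field that is absent from the match acts as a wildcard.
BR_ETH2_FLOWS = [
    (4, {"in_port": 2, "dl_vlan": 1}, ["mod_vlan_vid:1001", "NORMAL"]),
    (4, {"in_port": 2, "dl_vlan": 2}, ["mod_vlan_vid:100", "NORMAL"]),
    (2, {"in_port": 2}, ["drop"]),
    (1, {}, ["NORMAL"]),
]


def process(packet, flows=BR_ETH2_FLOWS):
    """Apply the highest-priority matching flow.

    Returns (resulting packet, final action), where the final action is
    "NORMAL" or "drop".
    """
    for _priority, match, actions in sorted(flows, key=lambda f: -f[0]):
        if all(packet.get(field) == value for field, value in match.items()):
            result = dict(packet)
            final = None
            for action in actions:
                if action.startswith("mod_vlan_vid:"):
                    # Rewrite the VLAN ID in the 802.1Q header.
                    result["dl_vlan"] = int(action.split(":", 1)[1])
                else:
                    final = action
            return result, final
    # Not reachable with the flows above because of the catch-all entry.
    return dict(packet), "no-match"


if __name__ == "__main__":
    # Our ping arrives from br-int on port 2 (phy-br-eth2) with local VLAN 2.
    print(process({"in_port": 2, "dl_vlan": 2}))
    # A local VLAN with no mapping is dropped by the priority=2 flow.
    print(process({"in_port": 2, "dl_vlan": 7}))
```

The priority=2 drop entry matters here: anything coming from br-int whose local VLAN has no provider mapping never reaches the physical network.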
Here are the OpenFlow entries on br-int:
# ovs-ofctl show br-int
OFPT_FEATURES_REPLY (xid=0x2): dpid:000032774807d443
n_tables:254, n_buffers:256
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: OUTPUT SET_VLAN_VID SET_VLAN_PCP STRIP_VLAN SET_DL_SRC SET_DL_DST SET_NW_SRC SET_NW_DST SET_NW_TOS SET_TP_SRC SET_TP_DST ENQUEUE
1(int-br-eth2): addr:4e:1d:f3:fe:23:12
config: 0
state: 0
current: 10GB-FD COPPER
speed: 10000 Mbps now, 0 Mbps max
2(qvodfc176e4-5a): addr:36:92:d2:25:b7:8d
config: 0
state: 0
current: 10GB-FD COPPER
speed: 10000 Mbps now, 0 Mbps max
3(qr-441abe6b-8b): addr:f6:01:00:00:00:00
config: PORT_DOWN
state: LINK_DOWN
speed: 0 Mbps now, 0 Mbps max
4(qvo6750eac8-57): addr:fe:63:44:8b:9d:28
config: 0
state: 0
current: 10GB-FD COPPER
speed: 10000 Mbps now, 0 Mbps max
7(tap39b2b891-3b): addr:f6:01:00:00:00:00
config: PORT_DOWN
state: LINK_DOWN
speed: 0 Mbps now, 0 Mbps max
LOCAL(br-int): addr:32:77:48:07:d4:43
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
# ovs-ofctl dump-flows br-int
NXST_FLOW reply (xid=0x4):
cookie=0x0, duration=3960.902s, table=0, n_packets=2, n_bytes=748, idle_age=1949, priority=3,in_port=1,dl_vlan=1001 actions=mod_vlan_vid:1,NORMAL
cookie=0x0, duration=3959.222s, table=0, n_packets=242, n_bytes=21940, idle_age=494, priority=3,in_port=1,dl_vlan=100 actions=mod_vlan_vid:2,NORMAL
cookie=0x0, duration=3965.248s, table=0, n_packets=166, n_bytes=54124, idle_age=6, priority=2,in_port=1 actions=drop
cookie=0x0, duration=3969.286s, table=0, n_packets=608, n_bytes=69908, idle_age=494, priority=1 actions=NORMAL
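One way to confirm that the two bridges undo each other's VLAN rewriting is to extract the mappings from both flow dumps and compare them. The sketch below uses trimmed copies of the flow lines (cookie, counters and timers removed); the `vlan_map` helper and the regular expression are written for this illustration only.

```python
import re

# Trimmed flow entries from `ovs-ofctl dump-flows br-eth2` and `br-int`.
BR_ETH2 = """\
priority=4,in_port=2,dl_vlan=1 actions=mod_vlan_vid:1001,NORMAL
priority=4,in_port=2,dl_vlan=2 actions=mod_vlan_vid:100,NORMAL
priority=2,in_port=2 actions=drop
priority=1 actions=NORMAL
"""

BR_INT = """\
priority=3,in_port=1,dl_vlan=1001 actions=mod_vlan_vid:1,NORMAL
priority=3,in_port=1,dl_vlan=100 actions=mod_vlan_vid:2,NORMAL
priority=2,in_port=1 actions=drop
priority=1 actions=NORMAL
"""

VLAN_REWRITE = re.compile(r"dl_vlan=(\d+) actions=mod_vlan_vid:(\d+)")


def vlan_map(dump):
    """Return {matched VLAN ID: rewritten VLAN ID} for every rewrite flow."""
    return {int(old): int(new) for old, new in VLAN_REWRITE.findall(dump)}


outbound = vlan_map(BR_ETH2)  # local VLAN -> provider VLAN (leaving the host)
inbound = vlan_map(BR_INT)    # provider VLAN -> local VLAN (entering the host)

if __name__ == "__main__":
    print("outbound:", outbound)
    print("inbound: ", inbound)
    print("inverse: ", {v: k for k, v in outbound.items()} == inbound)
```

For this host the two mappings are exact inverses: local VLAN 2 becomes provider VLAN 100 on the way out through br-eth2, and VLAN 100 becomes 2 again on the way back in through br-int.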
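Once br-eth2 forwards the packet to eth2 and it leaves eth2 tagged with VLAN ID 100, it reaches the physical switch. The physical switch is connected to the compute node over a trunk, so as long as VLAN 100 is configured on it, the packet is forwarded to the gateway, and the gateway finally forwards it onward.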