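OK, the lab topology and configuration were copied and pasted in (CTRL+C/V) ahead of time, so now let's see how it actually works (manual funny face).
Topology:
CE1---PE1---P---PE2---CE2
The branch devices CE1 and CE2 have already learned each other's MAC addresses, so Layer 2 connectivity works. They have also learned the IRB gateway addresses on the PE devices, so Layer 3 connectivity works. Note that the IRB MAC address is the same on every PE device: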
root@CE1# run show arp
MAC Address Address Name Interface Flags
00:05:86:71:18:c0 192.168.10.20 192.168.10.20 ae0.2100 none
00:00:00:00:00:01 192.168.10.254 192.168.10.254 ae0.2100 none
00:05:86:71:18:c0 192.168.20.20 192.168.20.20 ae0.2200 none
00:00:00:00:00:02 192.168.20.254 192.168.20.254 ae0.2200 none
00:05:86:71:18:c0 192.168.30.20 192.168.30.20 ae0.2300 none
00:00:00:00:00:03 192.168.30.254 192.168.30.254 ae0.2300 none
root@CE2# run show arp
MAC Address Address Name Interface Flags
00:05:86:71:a0:c0 192.168.10.10 192.168.10.10 ae0.2100 none
00:00:00:00:00:01 192.168.10.254 192.168.10.254 ae0.2100 none
00:05:86:71:a0:c0 192.168.20.10 192.168.20.10 ae0.2200 none
00:00:00:00:00:02 192.168.20.254 192.168.20.254 ae0.2200 none
00:05:86:71:a0:c0 192.168.30.10 192.168.30.10 ae0.2300 none
00:00:00:00:00:03 192.168.30.254 192.168.30.254 ae0.2300 none
A quick ping test:
root@CE1# run ping 192.168.10.20 routing-instance ce1_vlan2100
PING 192.168.10.20 (192.168.10.20): 56 data bytes
64 bytes from 192.168.10.20: icmp_seq=0 ttl=64 time=186.833 ms
64 bytes from 192.168.10.20: icmp_seq=1 ttl=64 time=14.576 ms
64 bytes from 192.168.10.20: icmp_seq=2 ttl=64 time=20.534 ms
64 bytes from 192.168.10.20: icmp_seq=3 ttl=64 time=22.830 ms
64 bytes from 192.168.10.20: icmp_seq=4 ttl=64 time=27.790 ms
^C
--- 192.168.10.20 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 14.576/54.513/186.833/66.296 ms
[edit]
root@CE1# run ping 192.168.10.254 routing-instance ce1_vlan2100
PING 192.168.10.254 (192.168.10.254): 56 data bytes
64 bytes from 192.168.10.254: icmp_seq=0 ttl=64 time=31.226 ms
64 bytes from 192.168.10.254: icmp_seq=1 ttl=64 time=305.800 ms
64 bytes from 192.168.10.254: icmp_seq=2 ttl=64 time=10.599 ms
64 bytes from 192.168.10.254: icmp_seq=3 ttl=64 time=12.904 ms
^C
--- 192.168.10.254 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/stddev = 10.599/90.132/305.800/124.772 ms
root@CE2# run ping 192.168.10.10
PING 192.168.10.10 (192.168.10.10): 56 data bytes
64 bytes from 192.168.10.10: icmp_seq=0 ttl=64 time=205.853 ms
64 bytes from 192.168.10.10: icmp_seq=1 ttl=64 time=16.703 ms
64 bytes from 192.168.10.10: icmp_seq=2 ttl=64 time=28.356 ms
64 bytes from 192.168.10.10: icmp_seq=3 ttl=64 time=75.543 ms
64 bytes from 192.168.10.10: icmp_seq=4 ttl=64 time=25.344 ms
^C
--- 192.168.10.10 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 16.703/70.360/205.853/70.789 ms
[edit]
root@CE2# run ping 192.168.10.254
PING 192.168.10.254 (192.168.10.254): 56 data bytes
64 bytes from 192.168.10.254: icmp_seq=0 ttl=64 time=12.648 ms
64 bytes from 192.168.10.254: icmp_seq=1 ttl=64 time=5.729 ms
64 bytes from 192.168.10.254: icmp_seq=2 ttl=64 time=8.551 ms
64 bytes from 192.168.10.254: icmp_seq=3 ttl=64 time=136.645 ms
^C
--- 192.168.10.254 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/stddev = 5.729/40.893/136.645/55.337 ms
So CE1 can already ping CE2 within the same VLAN, and Layer 3 access between the different VLANs on CE1 works as well.
So how does EVPN actually make this happen?
CE1-PE1
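Between CE and PE, MAC addresses are still learned in the data plane. When PE1 receives a packet from CE1, for example DHCP or ARP, it reads the source MAC and records it in a MAC table. Which MAC table?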
root@PE1# run show bridge mac-table
MAC flags (S -static MAC, D -dynamic MAC, L -locally learned, C -Control MAC
O -OVSDB MAC, SE -Statistics enabled, NM -Non configured MAC, R -Remote PE MAC, P -Pinned MAC)
Routing instance : EVPN-A
Bridging domain : BR-2100, VLAN : 2100
MAC MAC Logical NH MAC active
address flags interface Index property source
00:05:86:71:18:c0 DC 1048584 2.2.2.2
00:05:86:71:a0:c0 D ae0.2100
MAC flags (S -static MAC, D -dynamic MAC, L -locally learned, C -Control MAC
O -OVSDB MAC, SE -Statistics enabled, NM -Non configured MAC, R -Remote PE MAC, P -Pinned MAC)
Routing instance : EVPN-A
Bridging domain : BR-2200, VLAN : 2200
MAC MAC Logical NH MAC active
address flags interface Index property source
00:05:86:71:18:c0 DC 1048584 2.2.2.2
00:05:86:71:a0:c0 D ae0.2200
MAC flags (S -static MAC, D -dynamic MAC, L -locally learned, C -Control MAC
O -OVSDB MAC, SE -Statistics enabled, NM -Non configured MAC, R -Remote PE MAC, P -Pinned MAC)
Routing instance : EVPN-A
Bridging domain : BR-2300, VLAN : 2300
MAC MAC Logical NH MAC active
address flags interface Index property source
00:05:86:71:18:c0 DC 1048584 2.2.2.2
00:05:86:71:a0:c0 D ae0.2300
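It is recorded in the MAC forwarding table of the MAC-VRF. So what is a MAC-VRF? And what is a VRF?
VRF stands for virtual routing and forwarding. Its job is to isolate networks: each VRF has its own independent forwarding information, which lets a single device serve multiple tenants.
EVPN has two kinds of VRF, the MAC-VRF and the IP-VRF. Think of a MAC-VRF as an L2 switch and an IP-VRF as an L3 router.
A MAC-VRF also has one RD and a set of RTs.
The RD (route distinguisher) mainly tells VRFs apart. When VPN routes are exchanged with other PE routers, the RD is carried along with the route in MP-BGP and is included together with the IP prefix. For example: 65000:20:192.0.2.0/24.
The RT (route target) is used to filter MAC routes. You can create import and export policies that accept and tag routes carrying a specific community value. We will set that aside for now.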
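The roles of the RD and the RT can be sketched in a few lines of Python. This is only an illustration, not how a router implements it: the `VpnRoute` class and `import_routes` function are made-up names, and the second route (RD `65000:30`, RT `target:65000:3000`) is a hypothetical tenant added to show the contrast. The RD `65000:20` and the RT `target:65000:2000` are the values that appear in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpnRoute:
    rd: str                   # route distinguisher, e.g. "65000:20"
    prefix: str               # customer prefix, e.g. "192.0.2.0/24"
    route_targets: frozenset  # RT communities attached on export

    def key(self) -> str:
        # RD + prefix is what keeps overlapping customer prefixes distinct.
        return f"{self.rd}:{self.prefix}"


def import_routes(routes, import_targets):
    """Keep only routes carrying at least one RT that the local VRF imports."""
    wanted = set(import_targets)
    return [r for r in routes if r.route_targets & wanted]


routes = [
    VpnRoute("65000:20", "192.0.2.0/24", frozenset({"target:65000:2000"})),
    # Hypothetical second tenant that reuses the same prefix.
    VpnRoute("65000:30", "192.0.2.0/24", frozenset({"target:65000:3000"})),
]

keys = [r.key() for r in routes]
accepted = import_routes(routes, ["target:65000:2000"])
```

The two routes share a prefix but produce different keys, which is the job of the RD. Only the first route survives the import filter, which is the job of the RT.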
At this point, we can see CE1's MAC in the MAC-VRF on PE1.
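The data-plane learning step can be pictured roughly like this. It is a toy model under simple assumptions: the `MacVrf` class and its methods are invented for illustration, and a real MAC-VRF also tracks flags, aging, and remote entries. The MAC address and interface name are the ones from the `show bridge mac-table` output above.

```python
class MacVrf:
    """Toy MAC-VRF: one MAC table per bridge domain, keyed by VLAN ID."""

    def __init__(self, name):
        self.name = name
        self.tables = {}  # vlan -> {mac: interface}

    def learn(self, vlan, src_mac, interface):
        # Data-plane learning: remember which interface a source MAC arrived on.
        self.tables.setdefault(vlan, {})[src_mac.lower()] = interface

    def lookup(self, vlan, dst_mac):
        # Returns None for an unknown destination.
        return self.tables.get(vlan, {}).get(dst_mac.lower())


vrf = MacVrf("EVPN-A")
# PE1 sees a frame (for example an ARP request) from CE1 on ae0.2100.
vrf.learn(2100, "00:05:86:71:A0:C0", "ae0.2100")
```

Because each bridge domain has its own table, the same MAC learned in VLAN 2100 is not visible in VLAN 2200 until a frame arrives there too.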
PE1-PE2
PE1 now has forwarding information for CE1's MAC. How is it packaged into BGP and passed to PE2?
Run a show command and pick two routes to look at.
2:2.2.2.2:2000::2100::00:00:00:00:00:01/304 MAC/IP (1 entry, 0 announced)
*BGP Preference: 170/-101
Route Distinguisher: 2.2.2.2:2000
Next hop type: Indirect, Next hop index: 0
Address: 0xc633070
Next-hop reference count: 26
Source: 2.2.2.2
Protocol next hop: 2.2.2.2
Indirect next hop: 0x2 no-forward INH Session ID: 0x0
State:
Local AS: 65000 Peer AS: 65000
Age: 1d 4:32:45 Metric2: 1
Validation State: unverified
Task: BGP_65000.2.2.2.2+179
AS path: I
Communities: target:65000:2000 evpn-default-gateway
Import Accepted
Route Label: 16
ESI: 00:00:00:00:00:00:00:00:00:00
Localpref: 100
Router ID: 2.2.2.2
Secondary Tables: EVPN-A.evpn.0
3:2.2.2.2:2000::2100::2.2.2.2/248
What are the 2 and 3 at the start?
EVPN defines several new BGP route types, and the leading number is the route type:
Type 1 – Ethernet auto-discovery route
Type 2 – MAC/IP advertisement route
Type 3 – Inclusive multicast Ethernet tag route
Type 4 – Ethernet segment (ES) route
Type 5 – IP prefix route
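Type 3 routes are used to send BUM traffic to the PEs of all sites that share the same VLAN.
The format is: 3:
Type 2 is the MAC/IP route. BGP with preference 170 means it was learned from the remote PE. Packets sent to this MAC address carry a label, for example: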
Route Label: 16
Based on that label, the PE delivers the Ethernet frame it receives to the corresponding MAC-VRF.
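To make the route strings easier to read, here is a small Python sketch that splits them into fields. The layout it assumes (type, RD, Ethernet tag, then a MAC or an originating router IP, then a length) is inferred only from the two strings shown in the output above. `parse_evpn_prefix` is a made-up helper, and it deliberately handles nothing beyond those two forms.

```python
ROUTE_TYPES = {
    1: "Ethernet auto-discovery",
    2: "MAC/IP advertisement",
    3: "Inclusive multicast Ethernet tag",
    4: "Ethernet segment",
    5: "IP prefix",
}


def parse_evpn_prefix(text):
    """Split a displayed Type 2 or Type 3 prefix into its visible fields."""
    body, _, length = text.rpartition("/")
    head, tag, last = body.split("::")
    route_type, rd = head.split(":", 1)
    route_type = int(route_type)
    if route_type not in (2, 3):
        raise ValueError("this sketch only handles Type 2 and Type 3")
    # Type 2 ends with a MAC, Type 3 with the originating router's IP.
    field = "mac" if route_type == 2 else "originator"
    return {
        "type": route_type,
        "name": ROUTE_TYPES[route_type],
        "rd": rd,
        "ethernet_tag": int(tag),
        field: last,
        "length": int(length),
    }


mac_route = parse_evpn_prefix("2:2.2.2.2:2000::2100::00:00:00:00:00:01/304")
imet_route = parse_evpn_prefix("3:2.2.2.2:2000::2100::2.2.2.2/248")
```

Both routes carry the RD `2.2.2.2:2000` and the Ethernet tag `2100`. The Type 2 route ends with the IRB gateway MAC, while the Type 3 route ends with the advertising PE's address.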
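Next is CE2-PE2
Before CE2 pings CE1, it sends an ARP request to look up CE1's MAC address. (With ARP proxy configured on PE2,) PE2 puts CE1's MAC address directly into the ARP response.
Once CE2 has the MAC address, it builds the Ethernet header and sends the ping packet to PE2.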
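These two steps can be sketched as follows. The code assumes PE2 already holds an IP-to-MAC binding for CE1; how the binding got there is not modeled. `proxy_arp_reply` and `build_ethernet_header` are illustrative names, not Junos functions. The addresses are taken from the `show arp` output on the CEs.

```python
def proxy_arp_reply(bindings, target_ip):
    """Return the MAC to put in the ARP reply, or None if the PE cannot answer."""
    return bindings.get(target_ip)


def mac_to_bytes(mac):
    return bytes.fromhex(mac.replace(":", ""))


def build_ethernet_header(dst_mac, src_mac, ethertype=0x0800):
    # 6-byte destination, 6-byte source, 2-byte EtherType (0x0800 = IPv4).
    return mac_to_bytes(dst_mac) + mac_to_bytes(src_mac) + ethertype.to_bytes(2, "big")


# Binding PE2 is assumed to hold for VLAN 2100 (CE1's address and MAC).
pe2_bindings = {"192.168.10.10": "00:05:86:71:a0:c0"}

# CE2 asks "who has 192.168.10.10?" and PE2 answers on CE1's behalf.
ce1_mac = proxy_arp_reply(pe2_bindings, "192.168.10.10")
# CE2 then builds the header for the ping, using its own MAC as the source.
header = build_ethernet_header(ce1_mac, "00:05:86:71:18:c0")
```

The resulting header is the usual 14 bytes, with CE1's MAC in the destination field even though the ARP answer came from PE2.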
PE2-PE1
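EVPN defines three data planes: MPLS, PBB, and VXLAN.
PBB and VXLAN are on the study plan for later, so let's look at MPLS first. Route Label: 16, seen earlier, is an MPLS label. A label of this kind is added to the ping packet on its way to PE1.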
PE1-CE1
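PE1 knows which VPN label maps to which MAC-VRF. It strips the VPN label, hands the ping packet to that MAC-VRF, reads the MAC forwarding table, and sends the packet out the corresponding port.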
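The label push on PE2 and the label lookup on PE1 can be sketched like this. It is a simplified model: only the VPN label is shown, the MAC table is flattened to one dictionary per MAC-VRF, and the function names are invented. The label value 16 is used purely as an example, assuming PE1 advertised it for this MAC-VRF.

```python
def pe2_encapsulate(frame, vpn_label):
    # Ingress PE: push the label that was advertised with the MAC route.
    return {"label": vpn_label, "frame": frame}


def pe1_forward(packet, label_to_vrf, mac_tables):
    # Egress PE: the label selects the MAC-VRF ...
    vrf = label_to_vrf[packet["label"]]
    # ... the label is stripped, leaving the original Ethernet frame ...
    frame = packet["frame"]
    # ... and the MAC table of that MAC-VRF selects the outgoing port.
    out_if = mac_tables[vrf][frame["dst_mac"]]
    return out_if, frame


label_to_vrf = {16: "EVPN-A"}  # example label value
mac_tables = {"EVPN-A": {"00:05:86:71:a0:c0": "ae0.2100"}}

frame = {"dst_mac": "00:05:86:71:a0:c0", "payload": "icmp echo request"}
out_if, delivered = pe1_forward(pe2_encapsulate(frame, 16), label_to_vrf, mac_tables)
```

The frame that leaves PE1 toward CE1 is the same frame CE2 sent. The label exists only between the two PEs.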
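Broadly speaking, the EVPN architecture is the same as BGP/MPLS L3 VPN. The difference is that EVPN forwards on MAC/IP, whereas the forwarding information in L3 VPN is IP only. EVPN gives the L2 network a control plane, so L2 information can now be learned in the control plane too, with BGP advertising the MAC addresses.
What is an EVI?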
An EVPN instance (EVI) is an EVPN routing and forwarding instance spanning all the PE routers participating in that VPN. An EVI is configured on the PE routers on a per-customer basis. Each EVI has a unique route distinguisher and one or more route targets.
Each EVI connects one or more customer networks, and EVIs are independent of one another.
What is an ET?
Ethernet tag: An Ethernet tag identifies a particular broadcast domain, such as a VLAN. An EVPN instance consists of one or more broadcast domains. Ethernet tags are assigned to the broadcast domains of a given EVPN instance by the provider of that EVPN. Each PE router in that EVPN instance performs a mapping between broadcast domain identifiers understood by each of its attached CE devices and the corresponding Ethernet tag.
If an EVI contains multiple broadcast domains, the ET is used to tell them apart.
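The per-PE mapping between local VLAN IDs and Ethernet tags can be sketched as below. The `Evi` class is a made-up illustration. In this lab the tag appears to equal the VLAN ID (the routes above show tag 2100 for VLAN 2100), and the second PE with local VLAN 10 is purely hypothetical, added to show that the local identifier may differ while the tag stays the same.

```python
class Evi:
    """Toy EVI on one PE: maps local VLAN IDs to provider-assigned Ethernet tags."""

    def __init__(self, name):
        self.name = name
        self.vlan_to_tag = {}

    def map_vlan(self, local_vlan, ethernet_tag):
        self.vlan_to_tag[local_vlan] = ethernet_tag

    def tag_for(self, local_vlan):
        return self.vlan_to_tag[local_vlan]


# This lab: three broadcast domains in one EVI, tag equal to the VLAN ID.
pe1 = Evi("EVPN-A")
for vlan in (2100, 2200, 2300):
    pe1.map_vlan(vlan, vlan)

# Hypothetical PE whose CE uses VLAN 10 for the broadcast domain tagged 2100.
other_pe = Evi("EVPN-A")
other_pe.map_vlan(10, 2100)
```

Both PEs agree on tag 2100 for the same broadcast domain, even though their attached CEs would know it by different VLAN IDs.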