This section discusses the functions, procedures, and associated BGP
routes used to support multihoming in EVPN. This covers both
multihomed device (MHD) and multihomed network (MHN) scenarios.
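Summary: this section covers the functions, procedures, and BGP routes that support multihoming in EVPN, for both multihomed device (MHD) and multihomed network (MHN) scenarios.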
PEs connected to the same Ethernet segment can automatically discover
each other with minimal to no configuration through the exchange of
the Ethernet Segment route.
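Summary: PEs attached to the same Ethernet segment discover each other by exchanging Ethernet Segment routes, with little or no configuration.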
The Route Distinguisher (RD) MUST be a Type 1 RD [RFC4364]. The
value field comprises an IP address of the PE (typically, the
loopback address) followed by a number unique to the PE.
The Ethernet Segment Identifier (ESI) MUST be set to the 10-octet
value described in Section 5.
The BGP advertisement that advertises the Ethernet Segment route MUST
also carry an ES-Import Route Target, as defined in Section 7.6.
The Ethernet Segment route filtering MUST be done such that the
Ethernet Segment route is imported only by the PEs that are
multihomed to the same Ethernet segment. To that end, each PE that
is connected to a particular Ethernet segment constructs an import
filtering rule to import a route that carries the ES-Import Route
Target, constructed from the ESI.
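Summary (constructing the Ethernet Segment route): the RD must be a Type 1 RD [RFC4364], whose value field is an IP address of the PE (usually the loopback address) followed by a number unique to that PE. The ESI must be the 10-octet value described in Section 5. The BGP advertisement carrying the Ethernet Segment route must also carry the ES-Import Route Target extended community. Filtering must ensure that the route is imported only by PEs multihomed to the same Ethernet segment; to that end, each PE attached to a segment builds an import filter that matches the ES-Import Route Target constructed from the ESI.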
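The construction rules above can be sketched in Python. This is a minimal illustration, not an implementation: the function names are hypothetical, the RD layout follows the Type 1 format of RFC 4364 (2-octet type, 4-octet IPv4 address, 2-octet number), and the ES-Import derivation (the 6 octets that follow the ESI Type octet) is an assumption about Section 7.6, which is not quoted in these notes.

```python
import ipaddress
import struct


def type1_rd(pe_ip: str, local_number: int) -> bytes:
    """Encode a Type 1 RD: type=1, the PE's IPv4 address, a PE-unique number."""
    return struct.pack("!H4sH", 1, ipaddress.IPv4Address(pe_ip).packed, local_number)


def es_import_rt_value(esi: bytes) -> bytes:
    """Derive the ES-Import value from a 10-octet ESI.

    Assumption: the value is the 6 octets after the ESI Type octet.
    """
    if len(esi) != 10:
        raise ValueError("an ESI is exactly 10 octets")
    return esi[1:7]


def imports_es_route(local_esis, route_es_import: bytes) -> bool:
    """Import filter: accept the route only if it matches a locally attached ES."""
    return any(es_import_rt_value(esi) == route_es_import for esi in local_esis)
```

A PE would call `imports_es_route` with the ESIs of its own segments, so that Ethernet Segment routes for segments it is not attached to are filtered out.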
In EVPN, MAC address reachability is learned via the BGP control
plane over the MPLS network. As such, in the absence of any fast
protection mechanism, the network convergence time is a function of
the number of MAC/IP Advertisement routes that must be withdrawn by
the PE encountering a failure. For highly scaled environments, this
scheme yields slow convergence.
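Summary: in EVPN, MAC reachability is learned through the BGP control plane over the MPLS network. Without a fast protection mechanism, convergence time therefore grows with the number of MAC/IP Advertisement routes that the failing PE must withdraw, so large-scale deployments converge slowly.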
To alleviate this, EVPN defines a mechanism to efficiently and
quickly signal, to remote PE nodes, the need to update their
forwarding tables upon the occurrence of a failure in connectivity to
an Ethernet segment. This is done by having each PE advertise a set
of one or more Ethernet A-D per ES routes for each locally attached
Ethernet segment (refer to Section 8.2.1 below for details on how
these routes are constructed). A PE may need to advertise more than
one Ethernet A-D per ES route for a given ES because the ES may be in
a multiplicity of EVIs and the RTs for all of these EVIs may not fit
into a single route. Advertising a set of Ethernet A-D per ES routes
for the ES allows each route to contain a subset of the complete set
of RTs. Each Ethernet A-D per ES route is differentiated from the
other routes in the set by a different Route Distinguisher (RD).
Upon a failure in connectivity to the attached segment, the PE
withdraws the corresponding set of Ethernet A-D per ES routes. This
triggers all PEs that receive the withdrawal to update their next-hop
adjacencies for all MAC addresses associated with the Ethernet
segment in question. If no other PE had advertised an Ethernet A-D
route for the same segment, then the PE that received the withdrawal
simply invalidates the MAC entries for that segment. Otherwise, the
PE updates its next-hop adjacencies accordingly.
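Summary: to mitigate this, EVPN defines a way to signal remote PEs quickly and efficiently that they must update their forwarding tables when connectivity to an Ethernet segment fails. Each PE advertises a set of one or more Ethernet A-D per ES routes for each locally attached Ethernet segment (see Section 8.2.1 for construction). More than one route may be needed because the ES can belong to many EVIs whose RTs may not all fit into a single route; each route in the set then carries a subset of the RTs and is distinguished from the others by its RD. When connectivity to the segment fails, the PE withdraws the corresponding set of Ethernet A-D per ES routes. Every PE that receives the withdrawal updates the next-hop adjacencies of all MAC addresses associated with that segment. If no other PE has advertised an Ethernet A-D route for the segment, the receiving PE simply invalidates the MAC entries for it; otherwise it updates the next hops accordingly.
Open question: is there an acknowledgment mechanism that tells a PE whether other PEs have received the route?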
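The mass-withdraw behaviour can be modelled with a small Python sketch. The class and method names are hypothetical, and the model simplifies the set of Ethernet A-D per ES routes from one PE to a single entry per (ESI, PE) pair; it only illustrates why one withdrawal updates every MAC on the segment at once.

```python
from collections import defaultdict


class RemotePE:
    """Toy model of a remote PE's view of multihomed segments."""

    def __init__(self):
        self.ad_per_es = defaultdict(set)  # ESI -> PEs with a live A-D per ES route
        self.mac_table = {}                # MAC -> ESI it was advertised with

    def learn_ad(self, esi, pe):
        self.ad_per_es[esi].add(pe)

    def learn_mac(self, mac, esi):
        self.mac_table[mac] = esi

    def withdraw_ad(self, esi, pe):
        """Handle withdrawal of a PE's A-D per ES routes for one segment."""
        self.ad_per_es[esi].discard(pe)
        if not self.ad_per_es[esi]:
            # No PE left on the segment: invalidate every MAC entry for it.
            self.mac_table = {m: e for m, e in self.mac_table.items() if e != esi}

    def next_hops(self, mac):
        """Next hops are resolved through the segment, not per MAC route."""
        esi = self.mac_table.get(mac)
        return set() if esi is None else set(self.ad_per_es[esi])
```

Because next hops are looked up through the ESI, a single `withdraw_ad` call changes the result of `next_hops` for every MAC on that segment without touching the MAC entries one by one.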
This section describes the procedures used to construct the Ethernet
A-D per ES route, which is used for fast convergence (as discussed
above) and for advertising the ESI label used for split-horizon
filtering (as discussed in Section 8.3). Support of this route is
REQUIRED.
The Route Distinguisher (RD) MUST be a Type 1 RD [RFC4364]. The
value field comprises an IP address of the PE (typically, the
loopback address) followed by a number unique to the PE.
The Ethernet Segment Identifier MUST be a 10-octet entity as
described in Section 5 ("Ethernet Segment"). The Ethernet A-D route
is not needed when the Segment Identifier is set to 0 (e.g., single-homed
scenarios).
The Ethernet Tag ID MUST be set to MAX-ET.
The MPLS label in the NLRI MUST be set to 0.
The ESI Label extended community MUST be included in the route. If
All-Active redundancy mode is desired, then the "Single-Active" bit
in the flags of the ESI Label extended community MUST be set to 0 and
the MPLS label in that Extended Community MUST be set to a valid MPLS
label value. The MPLS label in this Extended Community is referred
to as the ESI label and MUST have the same value in each Ethernet A-D
per ES route advertised for the ES. This label MUST be a downstream
assigned MPLS label if the advertising PE is using ingress
replication for receiving multicast, broadcast, or unknown unicast
traffic from other PEs. If the advertising PE is using P2MP MPLS
LSPs for sending multicast, broadcast, or unknown unicast traffic,
then this label MUST be an upstream assigned MPLS label. The usage
of this label is described in Section 8.3.
If Single-Active redundancy mode is desired, then the "Single-Active"
bit in the flags of the ESI Label extended community MUST be set to 1
and the ESI label SHOULD be set to a valid MPLS label value.
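Summary (constructing the Ethernet A-D per ES route): this route is used for fast convergence (described above) and for advertising the ESI label used in split-horizon filtering (Section 8.3). Support for it is required.
The RD must be a Type 1 RD [RFC4364]: an IP address of the PE (typically the loopback address) followed by a number unique to the PE. The ESI must be the 10-octet value described in Section 5; the Ethernet A-D route is not needed when the ESI is 0 (the single-homed case).
The Ethernet Tag ID must be set to MAX-ET.
The MPLS label in the NLRI must be set to 0.
The ESI Label extended community must be included in the route. For All-Active redundancy mode, the "Single-Active" bit in its flags must be 0 and its MPLS label must be a valid label value. That label is called the ESI label and must be identical in every Ethernet A-D per ES route advertised for the ES.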
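As a rough illustration of these field rules, the sketch below fills in an Ethernet A-D per ES route in Python. Several details are assumptions not stated in these notes: that MAX-ET is 0xFFFFFFFF, that the ESI Label extended community uses type 0x06 and sub-type 0x01, that the "Single-Active" bit is the low-order bit of the flags octet, and that the 20-bit label sits in the high-order bits of the 3-octet label field. The function names and the dictionary layout are hypothetical.

```python
import struct

MAX_ET = 0xFFFFFFFF  # assumed numeric value of MAX-ET


def esi_label_ext_community(esi_label: int, single_active: bool) -> bytes:
    """Encode an 8-octet ESI Label extended community (assumed layout)."""
    if not 0 <= esi_label < 2**20:
        raise ValueError("an MPLS label is a 20-bit value")
    flags = 0x01 if single_active else 0x00
    label_field = (esi_label << 4).to_bytes(3, "big")  # label in high-order 20 bits
    # type, sub-type, flags, 2 reserved octets, then the 3-octet label field
    return struct.pack("!BBBH", 0x06, 0x01, flags, 0) + label_field


def ad_per_es_route(rd: bytes, esi: bytes, esi_label: int, single_active: bool) -> dict:
    """Collect the fields of one Ethernet A-D per ES route."""
    return {
        "rd": rd,
        "esi": esi,
        "ethernet_tag_id": MAX_ET,  # MUST be MAX-ET
        "mpls_label": 0,            # the NLRI label MUST be 0
        "ext_communities": [esi_label_ext_community(esi_label, single_active)],
    }
```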
8.2.1.1. Ethernet A-D Route Targets
Each Ethernet A-D per ES route MUST carry one or more Route Target
(RT) attributes. The set of Ethernet A-D routes per ES MUST carry
the entire set of RTs for all the EVPN instances to which the
Ethernet segment belongs.
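Summary: each Ethernet A-D per ES route must carry one or more RT attributes (open question: how exactly are they carried?). Taken together, the set of Ethernet A-D per ES routes for an ES must carry every RT of every EVI to which the ES belongs.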
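One way to picture the requirement is to split the full RT list into chunks, one chunk per route, with a distinct RD for each route. The Python sketch below does that; `split_rts`, the per-route limit, and the `ip:number` RD notation are illustrative assumptions rather than anything the specification prescribes.

```python
def split_rts(pe_ip: str, rts: list, max_rts_per_route: int, first_number: int = 1) -> list:
    """Spread an ES's RTs over several A-D per ES routes with distinct RDs."""
    if max_rts_per_route < 1:
        raise ValueError("each route must carry at least one RT")
    routes = []
    for start in range(0, len(rts), max_rts_per_route):
        # A different PE-unique number yields a different Type 1 RD per route.
        rd = f"{pe_ip}:{first_number + len(routes)}"
        routes.append({"rd": rd, "rts": rts[start:start + max_rts_per_route]})
    return routes
```

For example, five RTs with a limit of two per route produce three routes whose RT lists together cover all five RTs.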
Consider a CE that is multihomed to two or more PEs on an Ethernet
segment ES1 operating in All-Active redundancy mode. If the CE sends
a broadcast, unknown unicast, or multicast (BUM) packet to one of the
non-Designated Forwarder (non-DF) PEs, say PE1, then PE1 will forward
that packet to all or a subset of the other PEs in that EVPN
instance, including the DF PE for that Ethernet segment. In this
case, the DF PE to which the CE is multihomed MUST drop the packet
and not forward back to the CE. This filtering is referred to as
"split-horizon filtering" in this document.
When a set of PEs are operating in Single-Active redundancy mode, the
use of this split-horizon filtering mechanism is highly recommended
because it prevents transient loops at the time of failure or
recovery that would impact the Ethernet segment -- e.g., when two PEs
think that both are DFs for that segment before the DF election
procedure settles down.
In order to achieve this split-horizon function, every BUM packet
originating from a non-DF PE is encapsulated with an MPLS label that
identifies the Ethernet segment of origin (i.e., the segment from
which the frame entered the EVPN network). This label is referred to
as the ESI label and MUST be distributed by all PEs when operating in
All-Active redundancy mode using a set of Ethernet A-D per ES routes,
per Section 8.2.1 above. The ESI label SHOULD be distributed by all
PEs when operating in Single-Active redundancy mode using a set of
Ethernet A-D per ES routes. These routes are imported by the PEs
connected to the Ethernet segment and also by the PEs that have at
least one EVPN instance in common with the Ethernet segment in the
route. As described in Section 8.2.1, the route MUST carry an ESI
Label extended community with a valid ESI label. The disposition PE
relies on the value of the ESI label to determine whether or not a
BUM frame is allowed to egress a specific Ethernet segment.
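Summary (split horizon): consider a CE multihomed to two or more PEs on Ethernet segment ES1 in All-Active mode. If the CE sends a BUM packet to a non-DF PE, say PE1, then PE1 forwards it to all or a subset of the other PEs in that EVI, including the DF for the segment. The DF PE to which the CE is multihomed must drop the packet rather than send it back to the CE. This document calls that "split-horizon filtering".
For PEs in Single-Active mode, split-horizon filtering is highly recommended because it prevents transient loops during a failure or recovery that affects the segment, for example when two PEs both believe they are the DF before DF election settles.
To implement split horizon, every BUM packet originating from a non-DF PE is encapsulated with an MPLS label that identifies the Ethernet segment of origin (that is, the segment through which the frame entered the EVPN network). This ESI label must be distributed by all PEs in All-Active mode, and should be distributed by all PEs in Single-Active mode, using a set of Ethernet A-D per ES routes. These routes are imported by the PEs attached to the segment and by PEs that share at least one EVI with the segment in the route. Each route must carry an ESI Label extended community with a valid ESI label. The disposition (egress) PE uses the ESI label value to decide whether a BUM frame may egress a given Ethernet segment.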
The following subsections describe the assignment procedures for the
ESI label, which differ depending on the type of tunnels being used
to deliver multi-destination packets in the EVPN network.
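Summary (assigning the ESI label): the following subsections describe how the ESI label is assigned; the procedure depends on the type of tunnel used to deliver multi-destination packets in the EVPN network.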
8.3.1.1. Ingress Replication
Each PE that operates in All-Active or Single-Active redundancy mode
and that uses ingress replication to receive BUM traffic advertises a
downstream assigned ESI label in the set of Ethernet A-D per ES
routes for its attached ES. This label MUST be programmed in the
platform label space by the advertising PE, and the forwarding entry
for this label must result in NOT forwarding packets received with
this label onto the Ethernet segment for which the label was
distributed.
The rules for the inclusion of the ESI label in a BUM packet by the
ingress PE operating in All-Active redundancy mode are as follows:
- A non-DF ingress PE MUST include the ESI label distributed by the
DF egress PE in the copy of a BUM packet sent to it.
- An ingress PE (DF or non-DF) SHOULD include the ESI label
distributed by each non-DF egress PE in the copy of a BUM packet
sent to it.
The rule for the inclusion of the ESI label in a BUM packet by the
ingress PE operating in Single-Active redundancy mode is as follows:
- An ingress DF PE SHOULD include the ESI label distributed by the
egress PE in the copy of a BUM packet sent to it.
In both All-Active and Single-Active redundancy mode, an ingress PE
MUST NOT include an ESI label in the copy of a BUM packet sent to an
egress PE that is not attached to the ES through which the BUM packet
entered the EVI.
As an example, consider PE1 and PE2, which are multihomed to CE1 on
ES1 and operating in All-Active multihoming mode. Further, consider
that PE1 is using P2P or MP2P LSPs to send packets to PE2. Consider
that PE1 is the non-DF for VLAN1 and PE2 is the DF for VLAN1, and PE1
receives a BUM packet from CE1 on VLAN1 on ES1. In this scenario,
PE2 distributes an Inclusive Multicast Ethernet Tag route for VLAN1
corresponding to an EVPN instance. So, when PE1 sends a BUM packet
that it receives from CE1, it MUST first push onto the MPLS label
stack the ESI label that PE2 has distributed for ES1. It MUST then
push onto the MPLS label stack the MPLS label distributed by PE2 in
the Inclusive Multicast Ethernet Tag route for VLAN1. The resulting
packet is further encapsulated in the P2P or MP2P LSP label stack
required to transmit the packet to PE2. When PE2 receives this
packet, it determines, from the top MPLS label, the set of ESIs to
which it will replicate the packet after any P2P or MP2P LSP labels
have been removed. If the next label is the ESI label assigned by
PE2 for ES1, then PE2 MUST NOT forward the packet onto ES1. If the
next label is an ESI label that has not been assigned by PE2, then
PE2 MUST drop the packet. It should be noted that in this scenario,
if PE2 receives a BUM packet for VLAN1 from CE1, then it SHOULD
encapsulate the packet with an ESI label received from PE1 when
sending it to PE1 in order to avoid any transient loops during a
failure scenario that would impact ES1 (e.g., port or link failure).
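Summary (ingress replication): each PE in All-Active or Single-Active mode that uses ingress replication to receive BUM traffic advertises a downstream-assigned ESI label in the set of Ethernet A-D per ES routes for its attached ES. The advertising PE must program this label in its platform label space, and the forwarding entry for the label must ensure that packets received with it are not forwarded onto the Ethernet segment for which the label was distributed.
The rules for an ingress PE including the ESI label in a BUM packet are given separately for All-Active mode and for Single-Active mode (see the lists above).
In both modes, an ingress PE must not include an ESI label in the copy of a BUM packet sent to an egress PE that is not attached to the ES through which the packet entered the EVI.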
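The inclusion rules can be condensed into a single decision function. The Python sketch below is only a reading aid: the function name, the string mode values, and the "unspecified" result for combinations the rules do not mention are all assumptions made for illustration.

```python
def esi_label_requirement(mode: str, ingress_is_df: bool,
                          egress_is_df: bool, egress_on_source_es: bool) -> str:
    """Return the requirement level for adding the egress PE's ESI label."""
    if not egress_on_source_es:
        # Applies to both modes: the egress PE is not attached to the source ES.
        return "MUST NOT"
    if mode == "all-active":
        if not ingress_is_df and egress_is_df:
            return "MUST"      # non-DF ingress sending to the DF egress
        if not egress_is_df:
            return "SHOULD"    # any ingress sending to a non-DF egress
        return "unspecified"
    if mode == "single-active":
        return "SHOULD" if ingress_is_df else "unspecified"
    raise ValueError(f"unknown redundancy mode: {mode}")
```

In the PE1/PE2 scenario, with PE1 as non-DF ingress and PE2 as DF egress in All-Active mode, the function returns "MUST".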
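Open question: what exactly are the ingress PE and the egress PE here?
Example: PE1 and PE2 are multihomed to CE1 on ES1 in All-Active mode, and PE1 uses P2P or MP2P LSPs to reach PE2. For VLAN1, PE1 is the non-DF and PE2 is the DF, and PE1 receives a BUM packet from CE1 on ES1. PE2 has advertised an Inclusive Multicast Ethernet Tag (IMET) route for VLAN1 in the corresponding EVI. When PE1 sends the packet, it must first push the ESI label that PE2 distributed for ES1, then push the MPLS label from PE2's IMET route for VLAN1, and finally add the P2P or MP2P LSP label stack needed to reach PE2. On receipt, PE2 removes any P2P or MP2P LSP labels and uses the top MPLS label to determine the set of ESIs to which it will replicate the packet. If the next label is the ESI label PE2 assigned for ES1, PE2 must not forward the packet onto ES1. If the next label is an ESI label that PE2 did not assign, PE2 must drop the packet (open question: why?). Note also that if PE2 receives a BUM packet for VLAN1 from CE1, it should encapsulate it with the ESI label received from PE1 when sending to PE1, to avoid transient loops during a failure affecting ES1 (such as a port or link failure).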
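The label-stack walk in this example can be sketched in Python. Stacks are modelled as lists with the top label first; the function names, the label numbers, and the lookup tables are hypothetical, and real forwarding planes do this in hardware rather than with dictionaries.

```python
def build_stack(transport_labels, imet_label, esi_label=None):
    """Ingress side: ESI label at the bottom, IMET label above it, transport on top."""
    stack = list(transport_labels) + [imet_label]
    if esi_label is not None:
        stack.append(esi_label)  # pushed first, so it ends up at the bottom
    return stack


def egress_flood_targets(stack, transport_count, imet_to_segments, own_esi_labels):
    """Egress side: return the local segments that receive a copy of the packet."""
    labels = stack[transport_count:]             # strip the P2P/MP2P LSP labels
    targets = set(imet_to_segments[labels[0]])   # top label selects the flood list
    if len(labels) > 1:
        esi_label = labels[1]
        if esi_label not in own_esi_labels:
            return set()                         # ESI label not assigned here: drop
        targets.discard(own_esi_labels[esi_label])  # split horizon for that ES
    return targets
```

With PE2 mapping its IMET label to segments ES1 and ES2 and its ESI label to ES1, a packet from PE1 carrying both labels is flooded only onto ES2.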
In the case where a CE is multihomed to multiple PE nodes, using a
Link Aggregation Group (LAG) with All-Active redundancy, it is
possible that only a single PE learns a set of the MAC addresses
associated with traffic transmitted by the CE. This leads to a
situation where remote PE nodes receive MAC/IP Advertisement routes
for these addresses from a single PE, even though multiple PEs are
connected to the multihomed segment. As a result, the remote PEs are
not able to effectively load balance traffic among the PE nodes
connected to the multihomed Ethernet segment. This could be the
case, for example, when the PEs perform data-plane learning on the
access, and the load-balancing function on the CE hashes traffic from
a given source MAC address to a single PE.
Another scenario where this occurs is when the PEs rely on control-plane
learning on the access (e.g., using ARP), since ARP traffic
will be hashed to a single link in the LAG.
To address this issue, EVPN introduces the concept of ’aliasing’,
which is the ability of a PE to signal that it has reachability to an
EVPN instance on a given ES even when it has learned no MAC addresses
from that EVI/ES. The Ethernet A-D per EVI route is used for this
purpose. A remote PE that receives a MAC/IP Advertisement route with
a non-reserved ESI SHOULD consider the advertised MAC address to be
reachable via all PEs that have advertised reachability to that MAC
address’s EVI/ES via the combination of an Ethernet A-D per EVI route
for that EVI/ES (and Ethernet tag, if applicable) AND Ethernet A-D
per ES routes for that ES with the "Single-Active" bit in the flags
of the ESI Label extended community set to 0.
Note that the Ethernet A-D per EVI route may be received by a remote
PE before it receives the set of Ethernet A-D per ES routes.
Therefore, in order to handle corner cases and race conditions, the
Ethernet A-D per EVI route MUST NOT be used for traffic forwarding by
a remote PE until it also receives the associated set of Ethernet A-D
per ES routes.
The backup path is a closely related function, but it is used in
Single-Active redundancy mode. In this case, a PE also advertises
that it has reachability to a given EVI/ES using the same combination
of Ethernet A-D per EVI route and Ethernet A-D per ES route as
discussed above, but with the "Single-Active" bit in the flags of the
ESI Label extended community set to 1. A remote PE that receives a
MAC/IP Advertisement route with a non-reserved ESI SHOULD consider
the advertised MAC address to be reachable via any PE that has
advertised this combination of Ethernet A-D routes, and it SHOULD
install a backup path for that MAC address.
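A remote PE's path selection for aliasing and backup paths might be sketched as follows. This Python example is an illustration under assumptions: the data shapes (tuples and dictionaries) and the function name are invented here, and the reserved ESI values (all zeros and all 0xFF octets) are taken as given rather than quoted from this section.

```python
RESERVED_ESIS = {bytes(10), b"\xff" * 10}  # assumed reserved values: ESI 0 and MAX-ESI


def paths_for_mac(mac_route, ad_per_evi, ad_per_es):
    """Return (primary, backup) PE sets for one MAC/IP Advertisement route.

    mac_route:  {"esi": bytes, "evi": int, "pe": str}
    ad_per_evi: set of (pe, esi, evi) tuples for received A-D per EVI routes
    ad_per_es:  {(pe, esi): single_active_bit} for received A-D per ES routes
    """
    esi, evi = mac_route["esi"], mac_route["evi"]
    primary, backup = {mac_route["pe"]}, set()
    if esi in RESERVED_ESIS:
        return primary, backup
    for pe, route_esi, route_evi in ad_per_evi:
        if (route_esi, route_evi) != (esi, evi):
            continue
        single_active = ad_per_es.get((pe, esi))
        if single_active is None:
            continue  # per-EVI route alone must not be used for forwarding
        if single_active:
            backup.add(pe)   # Single-Active: candidate backup path
        else:
            primary.add(pe)  # All-Active: aliased next hop
    return primary, backup - primary
```

A PE that has sent only the per-EVI route is skipped until its per-ES routes arrive, which mirrors the race-condition rule above.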
This section describes the procedures used to construct the Ethernet
A-D per EVPN instance (EVI) route, which is used for aliasing (as
discussed above). Support of this route is OPTIONAL.
The Route Distinguisher (RD) MUST be set per Section 7.9.
The Ethernet Segment Identifier MUST be a 10-octet entity as
described in Section 5 ("Ethernet Segment"). The Ethernet A-D route
is not needed when the Segment Identifier is set to 0.
The Ethernet Tag ID is the identifier of an Ethernet tag on the
Ethernet segment. This value may be a 12-bit VLAN ID, in which case
the low-order 12 bits are set to the VLAN ID and the high-order
20 bits are set to 0. Or, it may be another Ethernet tag used by the
EVPN. It MAY be set to the default Ethernet tag on the Ethernet
segment or to the value 0.
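The VLAN case of this encoding is simple enough to show directly. The Python sketch below assumes the Ethernet Tag ID is a 4-octet field in network byte order, consistent with the 12 + 20 bits described above; the function name is hypothetical.

```python
def ethernet_tag_from_vlan(vlan_id: int) -> bytes:
    """Place a 12-bit VLAN ID in the low-order bits of a 32-bit Ethernet Tag ID."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("a VLAN ID is a 12-bit value")
    return vlan_id.to_bytes(4, "big")  # the high-order 20 bits stay 0
```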
Note that the above allows the Ethernet A-D route to be advertised
with one of the following granularities:
+ One Ethernet A-D route per <ESI, Ethernet Tag ID> tuple per
MAC-VRF. This is applicable when the PE uses MPLS-based
disposition with VID translation or may be applicable when the
PE uses MAC-based disposition with VID translation.
+ One Ethernet A-D route for each <ESI> per MAC-VRF (where the
Ethernet Tag ID is set to 0). This is applicable when the PE uses
MAC-based disposition or MPLS-based disposition without VID
translation.
The usage of the MPLS label is described in Section 14 ("Load
Balancing of Unicast Packets").
The Next Hop field of the MP_REACH_NLRI attribute of the route MUST
be set to the IPv4 or IPv6 address of the advertising PE.
The Ethernet A-D route MUST carry one or more Route Target (RT)
attributes, per Section 7.10.
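Summary (aliasing): when a CE is multihomed to several PEs over a Link Aggregation Group (LAG) with All-Active redundancy, it is possible that only one PE learns a given set of the CE's MAC addresses. Remote PEs then receive MAC/IP Advertisement routes for those addresses from a single PE even though several PEs are attached to the multihomed segment, so they cannot load balance effectively across those PEs. This can happen when the PEs learn MAC addresses in the data plane on the access side and the CE's load-balancing hash sends all traffic from a given source MAC address to one PE.
It also happens when the PEs rely on control-plane learning on the access side (for example, ARP), because ARP traffic is hashed onto a single link of the LAG.
To address this, EVPN introduces "aliasing": a PE can signal that it has reachability to an EVI on a given ES even if it has learned no MAC addresses from that EVI/ES. The Ethernet A-D per EVI route serves this purpose. A remote PE that receives a MAC/IP Advertisement route with a non-reserved ESI should treat the MAC address as reachable via every PE that has advertised reachability to that EVI/ES through both an Ethernet A-D per EVI route for the EVI/ES (and Ethernet tag, if applicable) and Ethernet A-D per ES routes for the ES whose "Single-Active" bit is 0.
The Ethernet A-D per EVI route may reach a remote PE before the set of Ethernet A-D per ES routes does. To handle corner cases and race conditions, the remote PE must not use the per-EVI route for forwarding until it has also received the associated per-ES routes. (Open question: is there an ordering dependency between these two A-D routes?)
The backup path is a closely related function used in Single-Active mode. A PE advertises reachability to an EVI/ES with the same combination of Ethernet A-D per EVI and per ES routes, but with the "Single-Active" bit set to 1. A remote PE that receives a MAC/IP Advertisement route with a non-reserved ESI should treat the MAC address as reachable via any PE that advertised this combination and should install a backup path for that MAC address. (Open question: what does that last sentence mean in practice?)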
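Summary (constructing the Ethernet A-D per EVI route): this route is used for aliasing, and support for it is optional.
The RD must be set per Section 7.9.
The ESI must be a 10-octet value; the route is not needed when the ESI is 0.
The Ethernet Tag ID identifies an Ethernet tag on the Ethernet segment. It may be a 12-bit VLAN ID, in which case the low-order 12 bits hold the VLAN ID and the high-order 20 bits are 0, or it may be another Ethernet tag used by the EVPN. It may also be set to the default Ethernet tag on the segment or to 0.
This allows the Ethernet A-D route to be advertised at one of the two granularities listed above. (Open question: the concrete scenarios for each are still unclear to me.)
The usage of the MPLS label is described in Section 14.
The Next Hop field of the route's MP_REACH_NLRI attribute must be set to the IPv4 or IPv6 address of the advertising PE.
The Ethernet A-D route must carry one or more RT attributes.
My understanding is limited; corrections are welcome.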