1、Overview
This article presents a high-level overview of Open vSwitch* with the Data Plane Development Kit (OvS-DPDK), the high-performance, open source virtual switch, and links to further technical articles that dive deeper into individual OvS-DPDK features. It is written for OvS users who want to know more about DPDK integration.
Note: The OvS master branch and the 2.6 branch are each available as a zip download, and installation steps are available for both branches.
2、OvS-DPDK High-level Architecture
Open vSwitch is a production-quality, multilayer virtual switch licensed under the open source Apache* 2.0 license. It supports SDN control semantics via the OpenFlow* protocol and its OVSDB management interface. It is available from openvswitch.org and GitHub, and is also packaged in Linux distributions.
Native Open vSwitch generally forwards packets via the kernel space data path (see Figure 1). In the kernel data path, the switching “fastpath” consists of a simple flow table indicating forwarding/action rules for received packets. Exception packets (the first packet in a flow) do not match any existing entry in the kernel fastpath table and are sent to the user space daemon for processing (the “slowpath”). After user space handles the first packet in the flow, the daemon updates the flow table in kernel space so that subsequent packets in the flow are processed in the fastpath and not sent to user space. With this approach, native OvS eliminates the costly context switch between kernel and user space for a large percentage of received packets. However, the achievable packet throughput is limited by the forwarding bandwidth of the Linux network stack, which is not suited to use cases that require a high rate of packet processing, such as telco workloads.
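The fastpath/slowpath split can be pictured with a small Python model. This is only an illustrative sketch, not OvS code: the names `KernelDatapath`, `user_space_daemon`, and the rule dictionary are hypothetical, and a flow is reduced to a (source IP, destination IP) tuple.

```python
from typing import Callable, Dict, Tuple

FlowKey = Tuple[str, str]  # hypothetical flow key: (source IP, destination IP)


class KernelDatapath:
    """Toy model of the kernel fastpath: a flow table filled on demand."""

    def __init__(self, slow_path: Callable[[FlowKey], str]) -> None:
        self.flow_table: Dict[FlowKey, str] = {}
        self.slow_path = slow_path
        self.upcalls = 0  # packets that had to be sent to user space

    def receive(self, key: FlowKey) -> str:
        action = self.flow_table.get(key)
        if action is None:
            # Exception packet: no fastpath entry, so hand it to user space.
            self.upcalls += 1
            action = self.slow_path(key)
            # The daemon installs the flow so later packets stay in the fastpath.
            self.flow_table[key] = action
        return action


def user_space_daemon(key: FlowKey) -> str:
    """Stand-in for the forwarding decision made by the user space daemon."""
    rules = {"10.0.0.2": "output:2", "10.0.0.3": "output:3"}  # made-up rules
    return rules.get(key[1], "drop")


dp = KernelDatapath(user_space_daemon)
actions = [dp.receive(("10.0.0.1", "10.0.0.2")) for _ in range(1000)]
print(dp.upcalls)  # 1: only the first packet of the flow reached user space
```

In this model, 1000 packets of one flow cost a single trip to user space, which is the saving the kernel fastpath is designed to provide.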
Figure 2 below shows the high-level architecture of OvS-DPDK. OvS switching ports are represented by network devices (or netdevs). Netdev-dpdk is a DPDK-accelerated network device that uses DPDK to accelerate switch I/O, through three separate interfaces: one physical interface (handled by the librte_eth library within DPDK), and two virtual interfaces (librte_vhost and librte_ring). These interface with the physical and virtual devices connected to the virtual switch.
Other OvS architectural layers provide further functionality and interface with, for example, the SDN controller. Dpif-netdev provides user space forwarding and ofproto is the OvS library that implements an OpenFlow switch. It talks to OpenFlow controllers over the network and to switch hardware or software through an ofproto provider. The ovsdb server maintains the up-to-date switching table information for this OvS instance and communicates this to the SDN controller. The following section provides details of the switching/forwarding tables, with further information on the OvS architecture available through the openvswitch.org website.
3、OvS-DPDK Switching Table Hierarchy
A packet entering OvS-DPDK from a physical or virtual interface receives a unique identifier or hash, based on its header fields, which is then matched against an entry in one of three main switching tables: the exact match cache (EMC), the data path classifier (dpcls), or the ofproto classifier. The packet's identifier traverses these three tables in order until a match is found, at which point the actions indicated by the matching rule are executed and the packet is forwarded out of the switch once all actions are complete. This scheme is illustrated in Figure 3.
The three tables have different characteristics and associated throughput and latency. The EMC offers the fastest processing for a limited number of table entries. To be processed at the highest speed, the packet's identifier must exactly match an entry in this table on all fields (the 5-tuple of source IP and port, destination IP and port, and protocol); otherwise it will “miss” on the EMC and pass through to the dpcls. The dpcls contains many more table entries (arranged in multiple subtables) and enables wildcard matching of the packet identifier (for example, destination IP and port are specified but any source is allowed). The dpcls delivers approximately half the throughput of the EMC but caters to a much larger number of table entries. Packet flows matched in the dpcls are installed in the EMC so that subsequent packets with the same identifier can be processed at the highest speed.
A miss on the dpcls results in the packet identifier being sent to the ofproto classifier so that the OpenFlow controller can decide on the action. This path is the slowest, more than 10x slower than the EMC. Matches in the ofproto classifier result in new table entries being established in the faster switching tables so that subsequent packets in the same flow can be processed more quickly.
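The three-level lookup can be sketched as a toy Python model. It is a simplification under stated assumptions, not the OvS implementation: the `Switch` class, the `apply_mask` helper, and the boolean mask tuples are hypothetical names introduced here, the flow key is the 5-tuple only, and the ofproto classifier is reduced to a static rule list.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical flow key: (src_ip, src_port, dst_ip, dst_port, proto)
Key = Tuple[str, int, str, int, str]
Mask = Tuple[bool, bool, bool, bool, bool]  # True = field must match


def apply_mask(key: Key, mask: Mask) -> tuple:
    """Blank out wildcarded fields so the masked key can index a subtable."""
    return tuple(f if m else None for f, m in zip(key, mask))


class Switch:
    def __init__(self, openflow_rules: List[Tuple[Mask, tuple, str]]) -> None:
        self.emc: Dict[Key, str] = {}                  # exact match cache
        self.dpcls: Dict[Mask, Dict[tuple, str]] = {}  # one subtable per mask
        self.openflow_rules = openflow_rules           # stand-in for ofproto
        self.hits = {"emc": 0, "dpcls": 0, "ofproto": 0}

    def _dpcls_lookup(self, key: Key) -> Optional[str]:
        # Try each subtable: mask the key, then do an exact lookup.
        for mask, subtable in self.dpcls.items():
            action = subtable.get(apply_mask(key, mask))
            if action is not None:
                return action
        return None

    def _ofproto_lookup(self, key: Key) -> str:
        for mask, masked, action in self.openflow_rules:
            if apply_mask(key, mask) == masked:
                # Install a wildcard entry in the dpcls for later packets.
                self.dpcls.setdefault(mask, {})[masked] = action
                return action
        return "drop"  # toy default when no rule matches

    def process(self, key: Key) -> str:
        action = self.emc.get(key)
        if action is not None:
            self.hits["emc"] += 1
            return action
        action = self._dpcls_lookup(key)
        if action is not None:
            self.hits["dpcls"] += 1
        else:
            self.hits["ofproto"] += 1
            action = self._ofproto_lookup(key)
        self.emc[key] = action  # later packets of this flow hit the EMC
        return action


# Rule: TCP traffic to 10.0.0.5:80 goes out port 2, whatever the source.
rule_mask = (False, False, True, True, True)
sw = Switch([(rule_mask, (None, None, "10.0.0.5", 80, "tcp"), "output:2")])

a = ("10.0.0.1", 1111, "10.0.0.5", 80, "tcp")
b = ("10.0.0.9", 2222, "10.0.0.5", 80, "tcp")
results = [sw.process(a), sw.process(a), sw.process(b), sw.process(b)]
print(sw.hits)  # {'emc': 2, 'dpcls': 1, 'ofproto': 1}
```

The first packet of flow `a` falls through to the ofproto stand-in. The first packet of flow `b` differs only in its source, so it matches the wildcard entry already installed in the dpcls. The second packet of each flow is served from the EMC.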
4、OvS-DPDK Features and Performance
At the time of this writing, the following high-level OvS-DPDK features are available on the OvS master code branch:
- Support for DPDK v16.07 (the supported version increments with each new DPDK release)
- vHost user support
- vHost reconnect
- vHost multiqueue
- Native tunneling support: VXLAN, GRE, Geneve
- VLAN support
- MPLS support
- Ingress/egress QoS policing
- Jumbo frame support
- Connection tracking
- Statistics: DPDK vHost and extended DPDK stats
- Debug: DPDK pdump support
- Link bonding
- Link status
- VFIO support
- ODL/OpenStack detection of DPDK ports
- vHost user NUMA awareness
A recent performance comparison between native OvS and OvS-DPDK is highlighted in Figure 4. This shows the throughput in packets-per-second for the Phy-OvS-Phy use case, indicating a ~10x performance enhancement for OvS-DPDK over native OvS, increasing to ~12x with Intel® Hyper-Threading Technology (Intel® HT Technology) enabled (labelled 1C2T, or one physical core with two logical threads, in the figure legend). Similarly, the Phy-OvS-VM-OvS-Phy use case demonstrates a ~9x performance enhancement for OvS-DPDK over native OvS.
The hardware and software configuration for this data, along with further use case results, can be found in the Intel® Open Network Platform (Intel® ONP) performance report.
5、OvS-DPDK Availability
OvS-DPDK is available in the upstream openvswitch.org repository and through the Linux distributions listed below. The latest milestone release is OvS 2.6 (September 2016), and releases are made on a six-month cadence.
Code is available for download from the OvS master branch and the OvS 2.6 release branch, and installation steps are available for both.
Packaged versions of OvS with DPDK are available from:
- Red Hat* OpenStack Platform
- Ubuntu*
- Mirantis* OpenStack
- Open Platform for NFV*
6、Additional Information
To learn more about OvS-DPDK, check out the following videos and articles on Intel® Developer Zone, 01.org, Intel® Network Builders and Intel® Network Builders University.
User guides:
- Using OvS with DPDK for inter-VM NFV applications
- Using OvS with DPDK on Ubuntu
Developer guides:
- Open vSwitch with DPDK - how to build and install Open vSwitch using a DPDK datapath
- Using Open vSwitch with DPDK - includes advanced performance tuning information
Articles and Videos:
- Rate limiting configuration and usage for OvS with DPDK
- QoS configuration and usage for OvS with DPDK
- Configure vHost User multiqueue for OvS with DPDK
- vHost User NUMA awareness in OvS with DPDK
- DPDK Pdump in OvS with DPDK
- Enabling OvS with DPDK in OpenStack
- Jumbo Frames in Open vSwitch* with DPDK
- vHost User Client Mode in Open vSwitch* with DPDK
- OvS-DPDK Datapath Classifier - Part 1
- OvS-DPDK Datapath Classifier - Part 2
- Link Aggregation Configuration and Usage in Open vSwitch* with DPDK
- Analyzing OvS with DPDK bottlenecks using Intel® VTune Amplifier
- Using OvS and DPDK with Neutron in DevStack
- Build and Test a Simple NFV Inter-VM Use Case with OVS-DPDK (YouTube video series)
OvS with DPDK milestone release webinars:
- OvS with DPDK in OvS 2.6.0
- OvS with DPDK in OvS 2.5.0
- OvS with DPDK in OvS 2.4.0
Intel® Network Builders University:
- Open vSwitch with DPDK Architectural Deep Dive
- DPDK Open vSwitch: Accelerating the Path to the Guest
White paper:
- OvS with DPDK enables SDN and NFV transformation
Have a question? Feel free to post it to the Open vSwitch discussion mailing list.