iRDMA Flow Control Summary - 2

4.0 Priority Flow Control – Fundamentals

Priority Flow Control (PFC) is defined by IEEE Standard 802.1Qbb and is part of the Data Center Bridging (DCB) suite of enhancements, which is designed to make Ethernet a more viable, competitive transport in compute and storage environments.

The following sections provide a brief overview of the DCB standards and the role of PFC.

4.1  DCB Standards

The goal of DCB is to create a completely lossless Ethernet network that supports bandwidth allocation across links. The features of DCB apply to any high-performance Ethernet environment and have significant benefits for both LAN and RDMA traffic.

Several components work together to make this happen:

  • PFC (IEEE 802.1Qbb): Defines eight traffic priorities that can be paused independently.
  • Enhanced Transmission Selection (ETS) (IEEE 802.1Qaz): Assigns bandwidth percentages to each priority.
  • Congestion Notification (IEEE 802.1Qau): Provides end-to-end congestion management, further avoiding frame loss.
  • Data Center Bridging Capabilities Exchange Protocol (DCBX) (IEEE 802.1Qaz, the same standard as ETS): Discovers and exchanges DCB capabilities between link neighbors. It builds on functionality provided by the Link Layer Discovery Protocol (LLDP) (IEEE 802.1AB).
  • Differentiated Services Code Point (DSCP) (RFC 2474): Defines the IP header field called Differentiated Services (DS). Packets are selected for buffer management and packet scheduling based on the value in this field.
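To make "eight priorities that can be paused independently" concrete, here is a minimal sketch of how a PFC pause frame is laid out on the wire. The function name `build_pfc_frame` and its arguments are made up for illustration; the constants (destination MAC 01:80:C2:00:00:01, EtherType 0x8808, opcode 0x0101, a 16-bit priority-enable vector followed by eight 16-bit pause timers) follow the 802.1Qbb frame format as I understand it. The frame is built without an FCS and is not sent anywhere.

```python
import struct

PFC_DEST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808                # MAC Control frames
PFC_OPCODE = 0x0101                           # Priority-based Flow Control opcode


def build_pfc_frame(src_mac: bytes, pause_quanta: dict) -> bytes:
    """Build a PFC frame (without FCS).

    pause_quanta maps a priority (0-7) to a pause time in quanta (0-65535).
    Priorities that are not listed are left untouched by the receiver.
    """
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")

    enable_vector = 0
    timers = [0] * 8
    for prio, quanta in pause_quanta.items():
        if not 0 <= prio <= 7:
            raise ValueError("priority must be in 0..7")
        if not 0 <= quanta <= 0xFFFF:
            raise ValueError("quanta must fit in 16 bits")
        enable_vector |= 1 << prio   # bit n enables the timer for priority n
        timers[prio] = quanta

    header = PFC_DEST_MAC + src_mac + struct.pack("!H", MAC_CONTROL_ETHERTYPE)
    payload = struct.pack("!HH8H", PFC_OPCODE, enable_vector, *timers)
    # Pad with zeros to the 60-byte minimum Ethernet frame size (FCS excluded).
    return (header + payload).ljust(60, b"\x00")


if __name__ == "__main__":
    # Pause only priority 3 for the maximum time; other priorities keep flowing.
    frame = build_pfc_frame(bytes.fromhex("020000000001"), {3: 0xFFFF})
    print(frame[:34].hex())
```

A pause time of 0 for an enabled priority acts as a resume, which is how a receiver lifts a pause before the timer expires.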

4.1.1  DCB Willing vs. Non-willing Modes

The DCB standards distinguish between willing and non-willing DCB configuration. The distinction is whether the device is willing to accept its DCB settings from its link neighbor.

  • In willing mode, a DCB-enabled device can query its neighbor's DCB settings, then apply the same settings to itself.
  • In non-willing mode, DCB settings on the device must be explicitly configured.

A common strategy for using willing and non-willing modes in a cluster:

  1. Set switches as non-willing.
  2. Configure DCB (priority settings, traffic classes, bandwidth allocations, etc.) on the switch ports.
  3. Set adapters as willing.
  4. Adapters are automatically configured.

This simplifies DCB cluster configuration by centralizing DCB settings on a switch and pushing the configuration to the adapters, rather than configuring each host individually.
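The four-step strategy above can be modeled in a few lines. This is a toy simulation only, not the DCBX protocol: the `DcbConfig` and `Port` classes and the `exchange` function are hypothetical names, and real DCBX carries these settings in LLDP TLVs and has tie-breaking rules (for example when both ends are willing) that are ignored here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DcbConfig:
    pfc_priorities: frozenset   # priorities with PFC enabled
    ets_bandwidth: tuple        # bandwidth percentage per traffic class

    def __post_init__(self):
        if sum(self.ets_bandwidth) != 100:
            raise ValueError("ETS bandwidth percentages must sum to 100")


@dataclass
class Port:
    name: str
    willing: bool
    config: Optional[DcbConfig] = None


def exchange(local: Port, peer: Port) -> None:
    """Simplified rule: a willing port adopts a non-willing peer's config."""
    if local.willing and not peer.willing and peer.config is not None:
        local.config = peer.config


if __name__ == "__main__":
    # Steps 1-2: the switch port is non-willing and explicitly configured.
    switch = Port("switch-port", willing=False,
                  config=DcbConfig(frozenset({3}), (50, 50)))
    # Step 3: the adapter is willing and starts with no DCB settings.
    adapter = Port("adapter", willing=True)
    # Step 4: after the exchange the adapter mirrors the switch.
    exchange(adapter, switch)
    exchange(switch, adapter)   # no effect: the switch is non-willing
    print(adapter.config)
```

The point of the model is the asymmetry: configuration flows only from the non-willing side to the willing side, so changing the switch port is enough to reconfigure every attached adapter.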

PFC is supported on the 800 Series in both willing and non-willing modes. The 800 Series also has two DCB modes: software and firmware. For more background on software and firmware modes, refer to the Intel® Ethernet 800 Series ice driver README.

  • For PFC willing mode, software DCB is recommended but firmware DCB is also supported.
  • For PFC non-willing mode, software DCB must be used.
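The two rules above amount to a small lookup, which can be handy as a sanity check in a deployment script. The function name `supported_dcb_modes` is hypothetical and is not part of the ice driver or any Intel tool; it only encodes the two bullets.

```python
def supported_dcb_modes(pfc_willing: bool) -> list:
    """Return the DCB modes usable for a PFC mode, recommended mode first."""
    if pfc_willing:
        # Willing: software DCB is recommended, firmware DCB also works.
        return ["software", "firmware"]
    # Non-willing: software DCB is the only option.
    return ["software"]


def check_combination(pfc_willing: bool, dcb_mode: str) -> bool:
    """True if the chosen DCB mode is allowed for the chosen PFC mode."""
    return dcb_mode in supported_dcb_modes(pfc_willing)


if __name__ == "__main__":
    print(check_combination(pfc_willing=False, dcb_mode="firmware"))
```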

4.2  Determining PFC Priority Mode: PCP vs. DSCP

An Ethernet frame's priority can be determined by one of two distinct values: PCP (VLAN) or DSCP.

Priority Code Point (PCP) is used to classify and manage network traffic and to provide QoS in Layer 2 Ethernet networks. It uses the 3-bit PCP field in the VLAN header for packet classification.
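As an illustration of where those 3 bits live, the sketch below pulls the PCP out of an 802.1Q-tagged frame. The function names are made up for this example. It assumes a single 802.1Q tag with TPID 0x8100 directly after the source MAC, and a 16-bit TCI laid out as PCP (3 bits), DEI (1 bit), VLAN ID (12 bits); stacked tags (QinQ) are not handled.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q VLAN tag


def parse_vlan_tci(tci: int) -> tuple:
    """Split a 16-bit Tag Control Information value into (pcp, dei, vid)."""
    pcp = (tci >> 13) & 0x7   # top 3 bits: priority
    dei = (tci >> 12) & 0x1   # next bit: drop eligible indicator
    vid = tci & 0x0FFF        # low 12 bits: VLAN ID
    return pcp, dei, vid


def pcp_from_frame(frame: bytes):
    """Return the PCP of an 802.1Q-tagged frame, or None if it is untagged."""
    if len(frame) < 16:
        return None
    # Bytes 0-11 are destination and source MAC; the tag (if any) follows.
    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != TPID_8021Q:
        return None
    return parse_vlan_tci(tci)[0]


if __name__ == "__main__":
    tci = (3 << 13) | 100     # priority 3, VLAN 100
    tagged = bytes(12) + struct.pack("!HH", TPID_8021Q, tci) + b"\x08\x00" + bytes(46)
    print(pcp_from_frame(tagged))
```

Because the PCP is part of the VLAN tag, an untagged frame carries no PCP at all, which is one practical reason to consider the DSCP mode described next.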

Differentiated Services (DiffServ) uses a 6-bit DSCP in the 8-bit DS field of the IP header for packet classification. The DS field replaces the outdated IPv4 TOS field. Of the 6 DSCP bits, the 3 most significant bits represent the priority value and the next 3 bits represent the drop precedence within each traffic class.
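The bit layout described above can be checked with a little arithmetic. This is a sketch with a hypothetical function name, `split_ds_field`; it takes the raw 8-bit DS byte (the old TOS byte) and splits it as the paragraph describes. The remaining 2 low-order bits of the DS byte, which the paragraph does not mention, are the ECN bits.

```python
def split_ds_field(ds: int) -> dict:
    """Split the 8-bit DS field into DSCP, ECN, and the two DSCP halves."""
    if not 0 <= ds <= 0xFF:
        raise ValueError("DS field must be a single byte")
    dscp = ds >> 2                     # upper 6 bits of the DS byte
    return {
        "dscp": dscp,
        "ecn": ds & 0x3,               # lower 2 bits of the DS byte
        "priority": dscp >> 3,         # 3 most significant DSCP bits
        "drop_bits": dscp & 0x7,       # 3 least significant DSCP bits
    }


if __name__ == "__main__":
    # DSCP 26 (binary 011010) with ECN 0 gives a DS byte of 26 << 2 = 0x68.
    print(split_ds_field(26 << 2))
```

For DSCP 26 the top three bits are 011, so the priority is 3, and the low three bits are 010.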

Intel's ice driver supports two PFC modes in the PF driver: Layer 3 DSCP-based Quality of Service (L3 QoS) and Layer 2 VLAN-based QoS. For RoCEv2 traffi
