Original source: https://datatracker.ietf.org/doc/html/draft-ietf-rmcat-gcc
This document describes two methods of congestion control when using real-time communications on the World Wide Web (RTCWEB): one delay-based and one loss-based. |
本文档描述了通过万维网(RTCWEB)进行实时通信时的两种拥塞控制方法;一个是基于延时的拥塞控制,另一个是基于丢包的拥塞控制。 |
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. |
本文件中的关键词“必须”、“不得”、“要求”、“应”、“不应”、“应”、“不应”、“建议”、“可”和“可选”应按照RFC 2119中所述进行解释。 |
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. |
本互联网草案完全符合BCP 78和BCP 79的规定。 |
Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved. |
版权所有(c)2016 IETF Trust 和确定为文档作者的人员。保留所有权利。 |
1. Introduction |
Congestion control is a requirement for all applications sharing the Internet resources [RFC2914]. |
拥塞控制是针对所有共享Internet网络资源的应用程序的一种要求[RFC2914]。 |
o The media is usually encoded in forms that cannot be quickly changed to accommodate varying bandwidth, and bandwidth requirements can often be changed only in discrete, rather large steps |
o 流媒体通常形式的编码无法针对不同带宽做出快速改变,流媒体需要的带宽通常是离散的、甚至是大跨度的变化 |
This memo describes two congestion control algorithms that together are able to provide good performance and reasonable bandwidth sharing with other video flows using the same congestion control and with TCP flows that share the same links. |
本文描述了两种拥塞控制算法,它们在一起能够提供良好的性能,并能够和使用相同拥塞控制算法的其它视频流甚至包括相同物理链路上的TCP流一起,合理的共享带宽。 |
The mathematics of this document have been transcribed from a more formula-friendly format. |
本文的数学内容是从一种更为公式友好的格式转录而来的。 |
The following elements are in the system: 本系统中包含下列元素: |
o RTP packet - an RTP packet containing media data. |
o RTP数据包 - 包含流媒体数据的RTP数据包。 |
o RTCP sender at RTP receiver - sends receiver reports, REMB messages and transport-wide RTCP feedback messages. |
o RTP接收方同时作为RTCP的发送方 - 负责发送RR报文、REMB报文和传输层RTCP反馈报文(transport wide feedback packet)。 |
Together, loss-based controller and delay-based controller implement the congestion control algorithm. |
基于丢包的控制器和基于延迟的控制器共同实现了拥塞控制算法。 |
There are two ways to implement the proposed algorithm. One where both the controllers are running at the send-side, and one where the delay-based controller runs on the receive-side and the loss-based controller runs on the send-side. |
本算法有两种实现方式。一种是两个控制器都在发送端运行,另一种是基于延迟的控制器在接收端运行,基于丢包的控制器在发送端运行。 |
The first version can be realized by using a per-packet feedback protocol as described in [I-D.holmer-rmcat-transport-wide-cc-extensions]. Here, the RTP receiver will record the arrival time and the transport-wide sequence number of each received packet, which will be sent back to the sender periodically using the transport-wide feedback message. The RECOMMENDED feedback interval is once per received video frame or at least once every 30 ms if audio-only or multi-stream. If the feedback overhead needs to be limited this interval can be increased to 100 ms. |
第一种方法可以通过使用[I-D.holmer-rmcat-transport-wide-cc-extensions]中描述的每包反馈协议来实现。这里,RTP接收端将记录每个接收到的数据包的到达时间和传输层序列号(transport-wide sequence),该序列号将使用传输层反馈消息(transport-wide feedback)周期性地发送回发送方。建议的反馈间隔为每接收一个视频帧一次,或者如果仅音频或多流,则至少每30ms一次。如果需要限制反馈开销,该间隔可增加至100ms。 |
The second version can be realized by having a delay-based controller at the receive-side, monitoring and processing the arrival time and size of incoming packets. The sender SHOULD use the abs-send-time RTP header extension [abs-send-time] to enable the receiver to compute the inter-group delay variation. The output from the delay-based controller will be a bitrate, which will be sent back to the sender using the REMB feedback message [I-D.alvestrand-rmcat-remb]. The packet loss ratio is sent back via RTCP receiver reports. At the sender the bitrate in the REMB message and the fraction of packets lost are fed into the loss-based controller, which outputs a final target bitrate. It is RECOMMENDED to send the REMB message as soon as congestion is detected, and otherwise at least once every second. |
第二种方法可以通过在接收端部署基于延迟的控制器来实现,该控制器监视和处理传入数据包的到达时间和大小。发送方应该使用abs-send-time RTP报头扩展[abs-send-time],使接收方能够计算数据包组之间的延迟变化。基于延迟的控制器的输出将是比特率,该比特率将使用REMB消息[I-D.alvestrand-rmcat-REMB]反馈回发送方。丢包率通过RTCP-RR报文反馈。在发送方,REMB消息中的比特率和(RR报文中的)丢包率信息将被送入基于丢失的控制器,该控制器输出最终的目标比特率。建议在检测到拥塞时立即发送REMB消息,否则至少每秒发送一次。 |
Pacing is used to actuate the target bitrate computed by the controllers. |
控制器计算的目标比特率通过Pacing(调速器)生效 |
The delay-based control algorithm can be further decomposed into four parts: a pre-filtering, an arrival-time filter, an over-use detector, and a rate controller. |
基于延迟的控制算法可以进一步分解为四个部分:预滤波、到达时间滤波器、过载检测器和速率控制器。 |
This section describes an adaptive filter that continuously updates estimates of network parameters based on the timing of the received groups of packets. |
本节描述一种自适应滤波器,该自适应滤波器基于接收数据包的时间,连续更新网络参数的估计值。 |
d(i) = t(i) - t(i-1) - (T(i) - T(i-1))
An inter-departure time is computed between consecutive groups as T(i) - T(i-1), where T(i) is the departure timestamp of the last packet in the current packet group being processed. Any packets received out of order are ignored by the arrival-time model. |
使用T(i) -T(i-1)计算连续包组之间的发送时间间隔,其中T(i)是正在处理的当前包组中的最后一个报文的发送时间戳。另外,到达时间模型会忽略任何接收到的乱序报文。 |
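The inter-group delay variation can be illustrated with a small sketch. It assumes that t(i) is the arrival timestamp of the last packet of group i, which is what the formula implies alongside the definition of T(i); the `PacketGroup` type and the sample timestamps are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class PacketGroup:
    departure_ms: float  # T(i): send timestamp of the last packet in the group
    arrival_ms: float    # t(i): arrival timestamp of the last packet (assumed)


def inter_group_delay_variation(prev: PacketGroup, cur: PacketGroup) -> float:
    """Return d(i) = t(i) - t(i-1) - (T(i) - T(i-1)) in milliseconds."""
    inter_arrival = cur.arrival_ms - prev.arrival_ms
    inter_departure = cur.departure_ms - prev.departure_ms
    return inter_arrival - inter_departure


# Three groups sent 20 ms apart; the third one arrives 5 ms late.
groups = [PacketGroup(0.0, 50.0), PacketGroup(20.0, 70.0), PacketGroup(40.0, 95.0)]
variations = [inter_group_delay_variation(a, b) for a, b in zip(groups, groups[1:])]
```

A positive d(i) means the group took longer to arrive than it took to send, which is the kind of sample that hints at a growing queue.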
Breaking out the mean, m(i), from w(i) to make the process zero mean, we get |
从w(i)中取平均值m(i),使过程的平均值为零,我们得到 |
The pre-filtering aims at handling delay transients caused by channel outages. During an outage, packets being queued in network buffers, for reasons unrelated to congestion, are delivered in a burst when the outage ends. |
预滤波旨在处理由通道中断引起的延迟瞬变。在中断期间,由于与拥塞无关的原因而在网络缓冲区中排队的数据包在中断结束时以突发方式发送。 |
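One way to picture the grouping step is the sketch below. It assumes that packets are merged into one group when they were sent within burst_time (5 ms in Table 1) of the first packet of the current group; that exact rule, the function name, and the sample timestamps are assumptions made for illustration, not text from the draft.

```python
BURST_TIME_MS = 5.0  # RECOMMENDED burst_time from Table 1


def group_by_send_time(send_times_ms, burst_time_ms=BURST_TIME_MS):
    """Split sorted send timestamps into packet groups.

    Assumed rule: a packet joins the current group while it was sent no more
    than burst_time_ms after the first packet of that group.
    """
    groups = []
    for ts in send_times_ms:
        if groups and ts - groups[-1][0] <= burst_time_ms:
            groups[-1].append(ts)
        else:
            groups.append([ts])
    return groups


groups = group_by_send_time([0.0, 1.0, 4.0, 20.0, 21.0, 40.0])
```

Treating a burst as one group keeps the closely spaced packets inside it from producing misleading delay-variation samples.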
The parameter d(i) is readily available for each group of packets, i > 1. We want to estimate m(i) and use this estimate to detect whether or not the bottleneck link is over-used. The parameter can be estimated by any adaptive filter - we are using the Kalman filter. |
i>1时,参数d(i)可方便地用于每个包组。我们想估计m(i),并使用这个估计值来检测存在带宽瓶颈的链路上是否出现过载。可以使用任何自适应滤波器来估计这个参数 - 这里我们使用的是卡尔曼滤波器。 |
q(i) = E{u(i)^2} |
where f_max = max {1/(T(j) - T(j-1))} for j in i-K+1,...,i is the highest rate at which the last K packet groups have been received and chi is a filter coefficient typically chosen as a number in the interval [0.001, 0.1]. Since our assumption that v(i) should be zero-mean white Gaussian noise (WGN) is less accurate in some cases, we have introduced an additional outlier filter around the updates of var_v_hat. If z(i) > 3*sqrt(var_v_hat) the filter is updated with 3*sqrt(var_v_hat) rather than z(i). For instance v(i) will not be white in situations where packets are sent at a higher rate than the channel capacity, in which case they will be queued behind each other. |
其中f_max=max{ 1/(T(j) -T(j-1)) },j 为从 i-K+1 到 i,我们取最近收到的K个组中的最大值。chi是一个滤波系数,可以在0.001到0.1之间选择。由于我们假设v(i)应为零,因此在某些情况下,WGN的平均值不太准确,因此我们在var_v_hat的更新中引入了额外的异常值过滤器。如果z(i) > 3*sqrt(var_v_hat),则使用3*sqrt(var_v_hat)而不是z(i)更新过滤器。例如,在以高于信道容量的速率发送数据包的情况下,在这种情况下,这些包将会排队,v(i)将不会是白色的(即测量值要大于预测值) |
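The following is a minimal sketch of a scalar Kalman filter with the outlier clamp described above. It uses the textbook scalar Kalman update, which is assumed here rather than quoted. The smoothing weight `alpha` is passed in by the caller because its derivation from chi and f_max is not reproduced in this text, and the initial variance and the floor of 1.0 on var_v_hat are hypothetical choices.

```python
import math


class ArrivalTimeFilter:
    """Scalar Kalman filter sketch estimating the delay offset m(i) from d(i)."""

    def __init__(self, q=1e-3, e0=0.1, var_v0=1.0):
        self.m_hat = 0.0         # current offset estimate
        self.e = e0              # estimate error variance, e(0) from Table 1
        self.q = q               # state noise variance, q from Table 1
        self.var_v_hat = var_v0  # measurement noise variance (assumed start)

    def update(self, d, alpha):
        z = d - self.m_hat                         # innovation
        limit = 3.0 * math.sqrt(self.var_v_hat)
        z_clamped = min(z, limit)                  # outlier filter from the text
        # Exponential smoothing of the noise variance; the floor is an assumption.
        self.var_v_hat = max(
            alpha * self.var_v_hat + (1.0 - alpha) * z_clamped ** 2, 1.0
        )
        gain = (self.e + self.q) / (self.var_v_hat + self.e + self.q)
        self.m_hat += z * gain
        self.e = (1.0 - gain) * (self.e + self.q)
        return self.m_hat


filt = ArrivalTimeFilter()
first = filt.update(2.0, alpha=0.95)
for _ in range(500):
    last = filt.update(2.0, alpha=0.95)
```

With a constant input the estimate moves a small step on the first sample and then settles on the input value, which is the smoothing behaviour the over-use detector relies on.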
The inter-group delay variation estimate m(i), obtained as the output of the arrival-time filter, is compared with a threshold del_var_th(i). An estimate above the threshold is considered as an indication of over-use. Such an indication is not enough for the detector to signal over-use to the rate control subsystem. A definitive over-use will be signaled only if over-use has been detected for at least overuse_time_th milliseconds. However, if m(i) < m(i-1), over-use will not be signaled even if all the above conditions are met. Similarly, the opposite state, under-use, is detected when m(i) < -del_var_th(i). If neither over-use nor under-use is detected, the detector will be in the normal state. |
包组间延迟变化的估计值m(i)作为到达时间滤波器的输出,与阈值del_var_th(i)进行比较。超过此阈值的估计值被视为过载指示。但仅有这种指示不足以使检测器向速率控制子系统发出过载(over-use)信号。只有在检测到过载时间至少超过overuse_time_th毫秒的情况下,才会发出确定的过载信号。但是,如果m(i) < m(i-1),即使满足上述所有条件,也不会触发过载信号。 |
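The detection rules can be sketched as a small state holder. The class name and the way time above the threshold is accumulated (by adding the time elapsed since the previous update) are assumptions made for this illustration; the 10 ms default is the overuse_time_th value from Table 1.

```python
class OveruseDetector:
    """Sketch of the over-use detector; returns 'over-use', 'under-use' or 'normal'."""

    def __init__(self, overuse_time_th_ms=10.0):
        self.overuse_time_th_ms = overuse_time_th_ms
        self.time_over_ms = 0.0  # how long m(i) has stayed above the threshold
        self.prev_m = 0.0        # m(i-1)

    def detect(self, m, del_var_th, delta_ms):
        state = "normal"
        if m > del_var_th:
            self.time_over_ms += delta_ms
            # Signal only after a sustained excess, and not while m is falling.
            if self.time_over_ms >= self.overuse_time_th_ms and m >= self.prev_m:
                state = "over-use"
        else:
            self.time_over_ms = 0.0
            if m < -del_var_th:
                state = "under-use"
        self.prev_m = m
        return state


det = OveruseDetector()
signals = [det.detect(m, 12.5, 5.0) for m in (15.0, 16.0, 14.0, -13.0, 0.0)]
```

In this run the first sample is above the threshold but too brief, the second triggers over-use, and the third is suppressed because m(i) is already decreasing.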
The reason is that, by using a larger value of del_var_th, a larger queuing delay can be tolerated, whereas with a small del_var_th, the over-use detector quickly reacts to a small increase in the offset estimate m(i) by generating an over-use signal that reduces the delay-based estimate of the available bandwidth A_hat (see Section 4.4). Thus, it is necessary to dynamically tune the threshold del_var_th to get good performance in the most common scenarios, such as when competing with loss-based flows. |
具体原因是,使用更大的del_var_th值,可以容忍更大的队列延迟,而较小的del_var_th可以让过载检测器对偏移估计m(i)的小幅增加作出快速反应,生成过载信号减少基于延迟的可用带宽估计值A_hat (参见第4.4节)。因此,有必要动态调整阈值del_var_th,以便在最常见的场景中获得良好的性能,例如在与基于丢包的流发生竞争时。 |
with K(i)=K_d if |m(i)| < del_var_th(i-1) or K(i)=K_u otherwise. The rationale is to increase del_var_th(i) when m(i) is outside of the range [-del_var_th(i-1),del_var_th(i-1)], whereas, when the offset estimate m(i) falls back into the range, del_var_th is decreased. In this way when m(i) increases, for instance due to a TCP flow entering the same bottleneck, del_var_th(i) increases and avoids the uncontrolled generation of over-use signals which may lead to starvation of the flow controlled by the proposed algorithm [Pv13]. Moreover, del_var_th(i) SHOULD NOT be updated if this condition holds: |
如果 |m(i)| < del_var_th(i-1) 时,则 K(i)=K_d,否则 K(i)=K_u。这里的基本原理是,当 |m(i)| 超出 [-del_var_th(i-1),del_var_th(i-1)] 的范围时,我们应该增加del_var_th(i),而当偏移估计m(i)回落到范围内时,应该减小del_var_th。 |
On the other hand, when m(i) falls back into the range [-del_var_th(i-1),del_var_th(i-1)] the threshold del_var_th(i) is decreased so that a lower queuing delay can be achieved. |
另外,当m(i)回落到 [-del_var_th(i-1),del_var_th(i-1)] 范围内时,阈值del_var_th(i)应该被减小,以便可以实现较低的排队延迟(使算法对延迟变化更敏感)。 |
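A sketch of the threshold adaptation follows. The update equation itself does not appear in this text, so the form used here (moving the threshold toward |m(i)| at a rate K(i) scaled by the elapsed time) is an assumption that matches the described behaviour. The condition under which the threshold SHOULD NOT be updated is not shown above either, so it is left out.

```python
K_U = 0.01     # RECOMMENDED K_u from Table 1, used when |m(i)| is outside the range
K_D = 0.00018  # RECOMMENDED K_d from Table 1, used when |m(i)| is inside the range


def update_threshold(del_var_th_prev, m, delta_ms):
    """Return del_var_th(i) given del_var_th(i-1), m(i) and the elapsed time.

    Assumed form: th(i) = th(i-1) + delta_ms * K(i) * (|m(i)| - th(i-1)).
    """
    k = K_D if abs(m) < del_var_th_prev else K_U
    return del_var_th_prev + delta_ms * k * (abs(m) - del_var_th_prev)


raised = update_threshold(12.5, 20.0, 10.0)   # m outside the range: threshold grows
lowered = update_threshold(12.5, 2.0, 10.0)   # m inside the range: threshold shrinks
```

Because K_u is much larger than K_d, the threshold rises quickly when a competing flow inflates m(i) and decays slowly afterwards.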
The rate control is split in two parts, one controlling the bandwidth estimate based on delay, and one controlling the bandwidth estimate based on loss. Both are designed to increase the estimate of the available bandwidth A_hat as long as there is no detected congestion and to ensure that we will eventually match the available bandwidth of the channel and detect an over-use. |
速率控制分为两部分,一部分是通过基于延迟的带宽估计进行控制,另一部分是通过基于丢包带宽的估计进行控制。设计这两种方法的目的都是为了在没有检测到拥塞的情况下,增加可用带宽的估计值(A_hat),最终确保我们既能匹配信道的可用带宽也能对信道的过载状态进行检测。 |
The state transitions (with blank fields meaning "remain in state") are: |
Signal \ State | Hold | Increase | Decrease |
---|---|---|---|
Over-use | Decrease | Decrease | |
Normal | Increase | | Hold |
Under-use | | Hold | Hold |
信号 \ 状态 | 保持 | 增加 | 减少 |
---|---|---|---|
过载 | 减少 | 减少 | |
正常 | 增加 | | 保持 |
轻载 | | 保持 | 保持 |
The subsystem starts in the increase state, where it will stay until over-use or under-use has been detected by the detector subsystem. On every update the delay-based estimate of the available bandwidth is increased, either multiplicatively or additively, depending on its current state. |
子系统启动时的初始状态是增加状态,直到探测器子系统检测到过载或轻载为止。在每次更新时,基于延迟的可用带宽估计会根据当前状态采用乘法或加法的方式增加带宽估计值。 |
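The state machine can be written as a lookup table. This sketch reads the table with blank cells for the normal signal in the Increase state and for under-use in the Hold state, which is the reading consistent with the statement that the subsystem stays in the increase state until over-use or under-use is detected. The string labels are hypothetical names chosen for the example.

```python
# (signal, current state) -> next state; missing entries mean "remain in state".
TRANSITIONS = {
    ("over-use", "hold"): "decrease",
    ("over-use", "increase"): "decrease",
    ("normal", "hold"): "increase",
    ("normal", "decrease"): "hold",
    ("under-use", "increase"): "hold",
    ("under-use", "decrease"): "hold",
}


def next_state(state, signal):
    """Return the rate-control state after receiving a detector signal."""
    return TRANSITIONS.get((signal, state), state)


state = "increase"  # the subsystem starts in the increase state
trace = []
for signal in ("normal", "over-use", "normal", "normal", "under-use"):
    state = next_state(state, signal)
    trace.append(state)
```

The trace shows the typical cycle: increase until over-use, decrease once, hold while the queues drain, then increase again.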
If R_hat(i) increases above three standard deviations of the average max bitrate, we assume that the current congestion level has changed, at which point we reset the average max bitrate and go back to the multiplicative increase state. |
如果R_hat(i)增加到平均最大比特率的三个标准偏差以上,我们认为当前拥塞等级已经改变,此时我们重置平均最大比特率并回到乘法增加状态。 |
During multiplicative increase, the estimate is increased by at most 8% per second. |
在乘法增加期间,估计值每秒最多被增加8%。 |
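The 8% bound can be sketched as follows. The exponent form and the one-second cap on the elapsed time are assumptions chosen so that the growth never exceeds 8% per second; the function and parameter names are hypothetical.

```python
def multiplicative_increase(a_hat_bps, elapsed_ms):
    """Grow the delay-based estimate by at most 8% per second.

    Assumed form: eta = 1.08 ** min(elapsed_s, 1.0), A_hat(i) = eta * A_hat(i-1).
    """
    eta = 1.08 ** min(elapsed_ms / 1000.0, 1.0)
    return eta * a_hat_bps


after_one_second = multiplicative_increase(1_000_000.0, 1000.0)
after_long_gap = multiplicative_increase(1_000_000.0, 5000.0)   # capped at 8%
after_half_second = multiplicative_increase(1_000_000.0, 500.0)
```

Scaling the exponent by the elapsed time keeps the growth rate independent of how often feedback arrives.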
response_time_ms = 100 + rtt_ms |
Since the system depends on over-using the channel to verify the current available bandwidth estimate, we must make sure that our estimate does not diverge from the rate at which the sender is actually sending. Thus, if the sender is unable to produce a bit stream with the bitrate the congestion controller is asking for, the available bandwidth estimate should stay within a given bound. Therefore we introduce a threshold |
由于本系统依赖于信道过载来验证当前可用带宽估计,因此我们必须确保我们的估计值不会远离发送端实际的发送码率。因此,如果发送端无法产生具有拥塞控制器要求的比特率的数据流,为了将可用带宽估计值限制到一个范围。我们引入了一个阈值 |
When an over-use is detected the system transitions to the decrease state, where the delay-based available bandwidth estimate is decreased to a factor times the currently incoming bitrate. |
当检测到过载时,系统迁移到减少状态,在此状态下,基于延迟的可用带宽估计值应该减少到当前输入比特率乘以一个系数(beta)。 |
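A small sketch of the decrease step, assuming the incoming bitrate R_hat is measured over a sliding window of T seconds (Table 1 gives a range of 0.5 to 1 s) and that the factor is beta = 0.85. The helper names and the sample packets are hypothetical.

```python
def incoming_bitrate(packets, now_s, window_s=0.5):
    """Estimate R_hat in bits per second over the last window_s seconds.

    packets is an iterable of (arrival_time_s, size_bytes) pairs.
    """
    bits = sum(size * 8 for t, size in packets if now_s - window_s < t <= now_s)
    return bits / window_s


def decreased_estimate(r_hat_bps, beta=0.85):
    """On over-use, set the delay-based estimate to beta times the incoming bitrate."""
    return beta * r_hat_bps


packets = [(0.1, 1250), (0.6, 1250), (0.9, 1250)]
r_hat = incoming_bitrate(packets, now_s=1.0)  # only the last two packets count
a_hat = decreased_estimate(r_hat)
```

Basing the decrease on the measured incoming rate rather than on the previous estimate pulls the estimate below what the path is currently delivering, so the queues can drain.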
When the detector signals under-use to the rate control subsystem, we know that queues in the network path are being emptied, indicating that our available bandwidth estimate A_hat is lower than the actual available bandwidth. Upon that signal the rate control subsystem will enter the hold state, where the receive-side available bandwidth estimate will be held constant while waiting for the queues to stabilize at a lower level - a way of keeping the delay as low as possible. This decrease of delay is wanted, and expected, immediately after the estimate has been reduced due to over-use, but can also happen if the cross traffic over some links is reduced. |
在估计值由于过载而减少之后,当检测器向速率控制子系统发送轻载信号时,我们知道网络路径上的队列正在清空,这表明我们的带宽估计值低于实际可用带宽。收到该信号后,速率控制子系统将进入保持状态,在该状态下,接收端可用带宽估计值将保持不变,同时等待队列稳定在较低水平,这是一种尽可能降低延迟的方法。这种延迟的减少是需要的,也是被期望的,但是如果某些链路上的交叉流量减少,也可能发生这种情况。 |
Parameter | Description | RECOMMENDED |
---|---|---|
burst_time | Time limit in milliseconds between packet bursts which identifies a group | 5 ms |
q | State noise covariance matrix | q = 10^-3 |
e(0) | Initial value of the system error covariance | e(0) = 0.1 |
chi | Coefficient used for the measured noise variance | [0.001, 0.1] |
del_var_th(0) | Initial value for the adaptive threshold | 12.5 ms |
overuse_time_th | Time required to trigger an overuse signal | 10 ms |
K_u | Coefficient for the adaptive threshold | 0.01 |
K_d | Coefficient for the adaptive threshold | 0.00018 |
T | Time window for measuring the received bitrate | [0.5, 1] s |
beta | Decrease rate factor | 0.85 |
Table 1: RECOMMENDED values for the delay-based controller
参数 | 描述 | 推荐值 |
---|---|---|
突发时间 | 标识组的数据包突发之间的时间限制(以毫秒为单位) | 5 ms |
q | 状态噪声协方差矩阵 | q = 10^-3 |
e(0) | 系统误差协方差的初始值 | e(0) = 0.1 |
chi | 用于测量噪声方差的系数 | [0.001, 0.1] |
del_var_th(0) | 自适应阈值的初始值 | 12.5 ms |
overuse_time_th | 触发过载信号所需的时间 | 10 ms |
K_u | 自适应阈值系数 | 0.01 |
K_d | 自适应阈值系数 | 0.00018 |
T | 用于测量接收比特率的时间窗口 | [0.5, 1] s |
beta | 递减率系数 | 0.85 |
表1: 基于延时的控制器参数的推荐值
A second part of the congestion controller bases its decisions on the round-trip time, packet loss and available bandwidth estimates A_hat received from the delay-based controller. The available bandwidth estimates computed by the loss-based controller are denoted with As_hat. |
拥塞控制器的第二部分基于从基于延迟的控制器接收到的往返时间、分组丢失和可用带宽估计来做出决策。由基于损耗的控制器计算的可用带宽估计值用As_hat表示。 |
o If 2-10% of the packets have been lost since the previous report from the receiver, the sender available bandwidth estimate As_hat(i) will be kept unchanged. |
o 如果自上一次来自接收方的报告以来有2-10%的数据包丢失,则发送方可用带宽估计值As_hat(i)将保持不变。 |
The loss-based estimate As_hat is compared with the delay-based estimate A_hat. The actual sending rate is set as the minimum between As_hat and A_hat. |
将基于丢包的估计值As_hat与基于延迟的估计值A_hat进行比较。实际发送速率设置为As_hat和A_hat之间的最小值。 |
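The loss-based rule and the final selection can be sketched together. Only the 2-10% branch (keep the estimate unchanged) and the minimum rule are stated above. The branches for loss above 10% and below 2%, including the constants 0.5 and 1.05, are assumptions added to make the example complete.

```python
def loss_based_update(as_hat_prev_bps, loss_fraction):
    """Return As_hat(i) from As_hat(i-1) and the reported loss fraction (0..1)."""
    if loss_fraction > 0.10:
        # Assumed decrease branch: back off in proportion to the loss.
        return as_hat_prev_bps * (1.0 - 0.5 * loss_fraction)
    if loss_fraction < 0.02:
        # Assumed increase branch: probe upward by 5%.
        return 1.05 * as_hat_prev_bps
    # 2-10% loss: keep the estimate unchanged, as stated in the text.
    return as_hat_prev_bps


def target_rate(as_hat_bps, a_hat_bps):
    """The actual sending rate is the minimum of the two estimates."""
    return min(as_hat_bps, a_hat_bps)


unchanged = loss_based_update(1_000_000.0, 0.05)
reduced = loss_based_update(1_000_000.0, 0.20)
rate = target_rate(reduced, 800_000.0)
```

Taking the minimum means whichever controller currently sees congestion sets the pace.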
In case a sender implementing these algorithms talks to a receiver which does not implement any of the proposed RTCP messages and RTP header extensions, it is suggested that the sender monitors RTCP receiver reports and uses the fraction of lost packets and the round-trip time as input to the loss-based controller. The delay-based controller should be left disabled. |
如果发送方支持本文描述的控制算法,而接收方未实现任何本文描述的两种控制协议所需要的RTCP消息和RTP报头扩展,则建议发送方监控RTCP-RR报文,并使用丢包率和RTT往返时间作为基于丢包的控制器(loss-based controller)的输入。基于延迟的控制器(delay-based controller)应保持禁用状态。 |
This algorithm has been implemented in the open-source WebRTC project, has been in use in Chrome since M23, and is being used by Google Hangouts. |
该算法已在开源WebRTC项目中实现,自M23开始在Chrome中使用,并被Google Hangouts使用。 |
This draft is offered as input to the congestion control discussion. |
提交本草案作为拥塞控制讨论的输入。 |
This document makes no request of IANA. |
本文件对于IANA无任何要求。 |
An attacker with the ability to insert or remove messages on the connection would have the ability to disrupt rate control. This could cause the algorithm to produce either a sending rate that under-utilizes the bottleneck link capacity, or a sending rate so high that it causes network congestion. |
如果攻击者能够在连接上插入或删除消息,就能够中断(本文描述的)速率控制(算法)。这可能导致算法要么是发送速率低于瓶颈链路的实际容量,要么是过高的发送速率导致网络拥塞。 |
Thanks to Randell Jesup, Magnus Westerlund, Varun Singh, Tim Panton, Soo-Hyun Choo, Jim Gettys, Ingemar Johansson, Michael Welzl and others for providing valuable feedback on earlier versions of this draft. |
感谢Randell Jesup、Magnus Westerlund、Varun Singh、Tim Panton、Soo Hyun Choo、Jim Gettys、Ingemar Johansson、Michael Welzl和其他人为本草案的早期版本提供了宝贵的意见。 |
[I-D.alvestrand-rmcat-remb] Alvestrand, H., "RTCP message for Receiver Estimated Maximum Bitrate", draft-alvestrand-rmcat-remb-03 (work in progress), October 2013. |
[Pv13] De Cicco, L., Carlucci, G., and S. Mascolo, "Understanding the Dynamic Behaviour of the Google Congestion Control", Packet Video Workshop, December 2013. |
o Added change log |
o 添加了更改日志 |
o Defined the term "frame", incorporating the transmission time offset into its definition, and removed references to "video frame". |
o 定义术语“帧”,将传输时间偏移纳入其定义,并删除对“视频帧”的引用。 |
o Added a section on how to process multiple streams in a single estimator using RTP timestamps to NTP time conversion. |
o 增加了一节,介绍如何使用RTP时间戳到NTP时间转换在单个估计器中处理多个流。 |
Renamed draft to link the draft name to the RMCAT WG. |
重命名草稿以将草稿名称链接到RMCAT WG。 |
Spellcheck. Otherwise no changes, this is a "keepalive" release. |
拼写检查。否则没有更改,这是一个“keepalive”版本。 |
o Added Luca De Cicco and Saverio Mascolo as authors. |
o 增加了卢卡·德·西科和萨维里奥·马斯科洛作为作者。 |
o Swapped receiver-side/sender-side controller with delay-based/loss-based controller as there is no longer a requirement to run the delay-based controller on the receiver-side. |
o 由于不再需要在接收方运行基于延迟的控制器,因此将接收方/发送方控制器与基于延迟/丢失的控制器交换。 |
o Arrival-time filter converted from a two dimensional Kalman filter to a scalar Kalman filter. |
o 到达时间滤波器从二维卡尔曼滤波器转换为标量卡尔曼滤波器。 |
o Added a section which better describes the pre-filtering algorithm. |
o 增加了一节,更好地描述了预过滤算法。 |
Stefan Holmer |
Related reading: "The Little-Known Secrets of RFC and the IETF" https://zhuanlan.zhihu.com/p/426531687