Linux TCP Congestion Control Implementation

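Several important counters:

packets_out    : segments sent above snd.una (original transmissions not yet cumulatively acknowledged)

sacked_out     : segments acknowledged by SACK (when SACK is not in use, each duplicate ACK also increments this counter)

lost_out       : estimate of the segments lost in the network

retrans_out    : number of retransmitted segments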

 

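Note that lost_out is only an estimate; how it is computed depends on the implementation (in practice, on the loss recovery method in use).

The number of packets still in the network is in_flight = packets_out + retrans_out - left_out.

Here left_out is the number of packets that have left the network: left_out = sacked_out + lost_out.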
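
The two formulas above can be checked with a few lines of Python. This is only an illustrative sketch: the field names mirror the counters described here, but the `Counters` class itself is hypothetical and is not a kernel structure.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    """Hypothetical container for the four sender-side counters."""
    packets_out: int   # segments sent above snd.una
    sacked_out: int    # segments acknowledged by SACK (or counted dupacks)
    lost_out: int      # estimate of segments lost in the network
    retrans_out: int   # retransmitted segments

    @property
    def left_out(self) -> int:
        # Segments that have left the network.
        return self.sacked_out + self.lost_out

    @property
    def in_flight(self) -> int:
        # Segments still in the network.
        return self.packets_out + self.retrans_out - self.left_out


c = Counters(packets_out=10, sacked_out=3, lost_out=2, retrans_out=1)
print(c.left_out, c.in_flight)
```

With 10 segments outstanding, 3 SACKed, 2 presumed lost and 1 retransmitted, left_out is 5 and in_flight is 6.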

 

In addition, every packet in the send queue carries tag bits (quoted from the kernel comment):

 We have three tag bits: SACKED(S), RETRANS(R) and LOST(L).
 Packets in queue with these bits set are counted in variables
 sacked_out, retrans_out and lost_out, correspondingly.
 
 Valid combinations are:
 Tag    InFlight    Description
 0      1           - orig segment is in flight.
 S      0           - nothing flies, orig reached receiver.
 L      0           - nothing flies, orig lost by net.
 R      2           - both orig and retransmit are in flight.
 L|R    1           - orig is lost, retransmit is in flight.
 S|R    1           - orig reached receiver, retrans is still in flight.
 (L|S|R is logically valid, it could occur when L|R is sacked,
 but it is equivalent to plain S and code short-curcuits it to S.
  L|S is logically invalid, it would mean -1 packet in flight 8))
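
The InFlight column follows a simple rule: the original copy counts unless it is tagged S or L, and a retransmitted copy counts when R is set. Below is a minimal sketch of that rule. The `Tag` enum and its bit values are made up for this example and should not be read as the kernel's definitions.

```python
from enum import IntFlag


class Tag(IntFlag):
    """Illustrative tag bits; the numeric values are assumptions."""
    NONE = 0
    SACKED = 1
    RETRANS = 2
    LOST = 4


def in_flight_copies(tag: Tag) -> int:
    """Return how many copies of a segment are in flight for a given tag."""
    if Tag.SACKED in tag and Tag.LOST in tag:
        if Tag.RETRANS in tag:
            # L|S|R is equivalent to plain S.
            tag = Tag.SACKED
        else:
            # L|S would mean -1 packets in flight.
            raise ValueError("L|S is not a valid combination")
    original = 0 if tag & (Tag.SACKED | Tag.LOST) else 1
    retransmit = 1 if tag & Tag.RETRANS else 0
    return original + retransmit


for t in (Tag.NONE, Tag.SACKED, Tag.LOST, Tag.RETRANS,
          Tag.LOST | Tag.RETRANS, Tag.SACKED | Tag.RETRANS):
    print(int(t), in_flight_copies(t))
```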

 

With NewReno, the first unacknowledged packet is marked as lost when the sender enters the Recovery state.

With SACK, the holes between SACK blocks are marked as lost (FACK).
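
To make the difference concrete, here is a rough sketch of the two marking rules over a list of outstanding segment numbers. Both function names are hypothetical, and the FACK-style rule reads a "hole" as any un-SACKed segment below the highest SACKed one, which is an assumption of this sketch and not a statement about the kernel code.

```python
def mark_lost_newreno(outstanding):
    """NewReno-style: only the first unacknowledged segment is marked lost."""
    return outstanding[:1]


def mark_lost_fack(outstanding, sacked):
    """FACK-style: un-SACKed segments below the highest SACKed one are lost."""
    if not sacked:
        return []
    highest = max(sacked)
    return [seg for seg in outstanding if seg < highest and seg not in sacked]


outstanding = [1, 2, 3, 4, 5, 6]   # segments above snd.una, in order
sacked = {3, 5}                    # segments reported in SACK blocks

print(mark_lost_newreno(outstanding))
print(mark_lost_fack(outstanding, sacked))
```

In this example the NewReno rule marks only segment 1, while the FACK-style rule marks segments 1, 2 and 4. Segment 6 lies above the highest SACKed segment, so it is still assumed to be in flight.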

 

TCP congestion control state machine (driven by incoming ACKs)

A Open state. The normal state. The sender runs either slow start or congestion avoidance, depending on how the congestion window compares with ssthresh.

 

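B Disorder state. Entered when a duplicate ACK or a SACK is detected. The congestion window is not adjusted in this state, and each incoming packet triggers the transmission of one new packet.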

 

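C CWR state. Entered when a congestion notification is received, for example from ECN, ICMP source quench, or the local device. In this state the congestion window is reduced by 1 for every second incoming ACK until it reaches half of its original size.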

 

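D Recovery state. Entered when three duplicate ACKs are detected, usually from the Disorder state. The first unacknowledged packet is retransmitted immediately, and the congestion window is reduced by 1 for every second incoming ACK until it reaches ssthresh (which is set to half of the congestion window on entering Recovery). TCP stays in this state until all packets that were in the network on entry have been acknowledged, and then returns to the Open state.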

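E Loss state. Entered when the RTO timer expires. All packets in the network are marked as lost, the congestion window is set to 1, and slow start is used. Once all packets that were in the network on entry have been acknowledged, TCP returns to the Open state.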
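
The five states can be summarized as a small transition table. The sketch below models only the transitions spelled out in the list above; the event names, `next_state` and `reduce_window` are invented for illustration, and any transition the text does not describe simply leaves the state unchanged here.

```python
from enum import Enum, auto


class State(Enum):
    OPEN = auto()
    DISORDER = auto()
    CWR = auto()
    RECOVERY = auto()
    LOSS = auto()


# Only the transitions named in the text; event names are hypothetical.
TRANSITIONS = {
    (State.OPEN, "dupack_or_sack"): State.DISORDER,
    (State.OPEN, "congestion_notification"): State.CWR,
    (State.DISORDER, "third_dupack"): State.RECOVERY,
    (State.RECOVERY, "outstanding_acked"): State.OPEN,
    (State.LOSS, "outstanding_acked"): State.OPEN,
}


def next_state(state, event):
    """Return the next state; an RTO expiry leads to LOSS from any state."""
    if event == "rto_expired":
        return State.LOSS
    return TRANSITIONS.get((state, event), state)


def reduce_window(cwnd, target, acks):
    """Shrink cwnd by 1 for every second ACK, never going below target."""
    for n in range(1, acks + 1):
        if n % 2 == 0 and cwnd > target:
            cwnd -= 1
    return cwnd


# Walk through a fast-retransmit episode starting with cwnd = 10.
state = State.OPEN
for event in ("dupack_or_sack", "third_dupack"):
    state = next_state(state, event)
ssthresh = 10 // 2                      # half of cwnd on entering Recovery
cwnd = reduce_window(10, ssthresh, acks=20)
print(state.name, cwnd)
```

The same `reduce_window` helper covers the CWR state if `target` is set to half of the original window. The gradual reduction, one segment per two ACKs, is what distinguishes these states from the Loss state, where the window drops to 1 at once.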

 

There are occasions where the number of outstanding
packets decreases suddenly by several segments. For
example, a retransmitted segment and the following forward
transmissions can be acknowledged with a single
cumulative ACK. These situations would cause bursts of
data to be transmitted into the network, unless they are
taken into account in the TCP sender implementation.
The Linux TCP sender avoids the bursts by limiting the
congestion window to allow at most three segments to be
transmitted for an incoming ACK. Since burst avoidance
may reduce the congestion window size below the slow
start threshold, it is possible for the sender to enter slow
start after several segments have been acknowledged by
a single ACK.
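
One way to picture the burst limit is as a cap on the congestion window relative to the number of packets in flight. The following is a simplified sketch of that idea, not the kernel routine itself; `moderate_cwnd` and `MAX_BURST` are names chosen for this example.

```python
MAX_BURST = 3  # at most three segments per incoming ACK, as described above


def moderate_cwnd(cwnd, in_flight, max_burst=MAX_BURST):
    """Cap cwnd so that no more than max_burst new segments can be sent."""
    return min(cwnd, in_flight + max_burst)


cwnd, ssthresh = 20, 15
in_flight = 8   # was 20 before one cumulative ACK covered 12 segments

allowed_before = cwnd - in_flight        # burst size without the cap
cwnd = moderate_cwnd(cwnd, in_flight)
allowed_after = cwnd - in_flight         # burst size with the cap

print(allowed_before, allowed_after, cwnd < ssthresh)
```

Without the cap the sender could emit 12 segments back to back. With it, only 3 are allowed, and because the capped window (11) is now below ssthresh (15), the sender is back in slow start, which is the side effect the paragraph above points out.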

 
