Optimizing Streaming Media over Weak Networks (Applying BBR): A Code Walkthrough of QUIC BBR

——
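I am building an experimental project on GitHub: github.com/qw225967/Bifrost

Goal: help readers get familiar with common QoS and bandwidth-estimation techniques by exposing tuning interfaces for the key parameters at each stage, supporting full configuration through a single JSON file, and offering comprehensive visualization for observing the algorithms.

Feel free to try it out.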
——

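Contents

  • Optimizing Streaming Media over Weak Networks (Applying BBR): A Code Walkthrough of QUIC BBR
  • 1. Code Source
  • 2. Code Analysis
    • 2.1 Call Flow
    • 2.2 Sending Path
    • 2.3 Acknowledgment Path
      • 2.3.1 Receiving ACKs in the Upper Layer
      • 2.3.2 Congestion Calculation
      • 2.3.3 State Machine Transitions
  • 3. Summary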


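1. Code Source

  An earlier article gave a brief introduction to the basics of BBR. For this article I ported the relevant code from Google's quiche repository into the experimental project mentioned above, integrated the BBR algorithm, and adapted it for debugging. quiche actually supports quite a few congestion control (bandwidth estimation) algorithms; the factory function that creates the send algorithm is shown below: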

SendAlgorithmInterface* SendAlgorithmInterface::Create(
    const QuicClock* clock, const RttStats* rtt_stats,
    const QuicUnackedPacketMap* unacked_packets,
    CongestionControlType congestion_control_type, QuicRandom* random,
    QuicConnectionStats* stats, QuicPacketCount initial_congestion_window,
    SendAlgorithmInterface* old_send_algorithm) {
  QuicPacketCount max_congestion_window =
      GetQuicFlag(quic_max_congestion_window);
  switch (congestion_control_type) {
    case kGoogCC:  // GoogCC is not supported by quic/core, fall back to BBR.
    case kBBR:
      return new BbrSender(clock->ApproximateNow(), rtt_stats, unacked_packets,
                           initial_congestion_window, max_congestion_window,
                           random, stats);
    case kBBRv2:
      return new Bbr2Sender(
          clock->ApproximateNow(), rtt_stats, unacked_packets,
          initial_congestion_window, max_congestion_window, random, stats,
          old_send_algorithm &&
                  old_send_algorithm->GetCongestionControlType() == kBBR
              ? static_cast<BbrSender*>(old_send_algorithm)
              : nullptr);
    case kPCC:
      // PCC is currently not supported, fall back to CUBIC instead.
      ABSL_FALLTHROUGH_INTENDED;
    case kCubicBytes:
      return new TcpCubicSenderBytes(
          clock, rtt_stats, false /* don't use Reno */,
          initial_congestion_window, max_congestion_window, stats);
    case kRenoBytes:
      return new TcpCubicSenderBytes(clock, rtt_stats, true /* use Reno */,
                                     initial_congestion_window,
                                     max_congestion_window, stats);
  }
  return nullptr;
}

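  Besides the two versions of BBR, it also provides TCP CUBIC (and Reno, through the same TcpCubicSenderBytes class). The comments indicate that PCC is not supported yet and currently falls back to CUBIC, while GoogCC falls back to BBR. This makes quiche a convenient entry point for studying these common bandwidth estimation algorithms.

  This article focuses on BBR v1. I will also run competition experiments between it and the GCC algorithm integrated earlier, to compare the strengths and weaknesses of the two and to analyze which scenarios each one suits.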
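The fallback behavior of the factory can be illustrated with a stripped-down sketch. All types here (`CcType`, `Sender`, `BbrLikeSender`, and so on) are hypothetical stand-ins invented for illustration; they are not quiche classes, and the real constructors take many more arguments.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical, simplified stand-ins for the quiche types; not the real API.
enum class CcType { kCubicBytes, kRenoBytes, kBBR, kBBRv2, kPCC, kGoogCC };

struct Sender {
  virtual ~Sender() = default;
  virtual std::string Name() const = 0;
};
struct BbrLikeSender : Sender {
  std::string Name() const override { return "bbr"; }
};
struct Bbr2LikeSender : Sender {
  std::string Name() const override { return "bbr2"; }
};
struct CubicLikeSender : Sender {
  explicit CubicLikeSender(bool reno) : reno_(reno) {}
  std::string Name() const override { return reno_ ? "reno" : "cubic"; }
  bool reno_;
};

// Mirrors the shape of the switch in the factory: unsupported types fall
// through to a supported neighbor instead of failing.
std::unique_ptr<Sender> CreateSender(CcType type) {
  switch (type) {
    case CcType::kGoogCC:  // Not supported in this sketch, fall back to BBR.
      [[fallthrough]];
    case CcType::kBBR:
      return std::make_unique<BbrLikeSender>();
    case CcType::kBBRv2:
      return std::make_unique<Bbr2LikeSender>();
    case CcType::kPCC:  // Not supported in this sketch, fall back to CUBIC.
      [[fallthrough]];
    case CcType::kCubicBytes:
      return std::make_unique<CubicLikeSender>(/*reno=*/false);
    case CcType::kRenoBytes:
      return std::make_unique<CubicLikeSender>(/*reno=*/true);
  }
  assert(false && "unknown congestion control type");
  return nullptr;
}
```

Calling `CreateSender(CcType::kGoogCC)` in this sketch yields the BBR-like sender, which is the same fallback the quiche comment describes.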

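2. Code Analysis

2.1 Call Flow

  We start with the call flow inside quiche. The send-control algorithm is mainly driven from quic/core/quic_sent_packet_manager.cc, where QuicSentPacketManager creates a SendAlgorithmInterface instance and calls into it.

// The algorithm is created with the unacked-packet map and the RTT stats passed in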
 SetSendAlgorithm(SendAlgorithmInterface::Create(
      clock_, &rtt_stats_, &unacked_packets_, congestion_control_type, random_,
      stats_, initial_congestion_window_, send_algorithm_.get()));


// Packet sending path
// quic_connection.cc: sends the packet
bool QuicConnection::WritePacket(SerializedPacket* packet);

// quic_sent_packet_manager.cc: the sent-packet manager records the packet
bool QuicSentPacketManager::OnPacketSent(
    SerializedPacket* mutable_packet, QuicTime sent_time,
    TransmissionType transmission_type,
    HasRetransmittableData has_retransmittable_data, bool measure_rtt,
    QuicEcnCodepoint ecn_codepoint);

// send_algorithm_interface.h: dispatched through the send algorithm interface to BBR v1
// bbr_sender.cc
void BbrSender::OnPacketSent(QuicTime sent_time, QuicByteCount bytes_in_flight,
                             QuicPacketNumber packet_number,
                             QuicByteCount bytes,
                             HasRetransmittableData is_retransmittable);

// bandwidth_sampler.cc: the sampler records the sent packet
void BandwidthSampler::OnPacketSent(
    QuicTime sent_time, QuicPacketNumber packet_number, QuicByteCount bytes,
    QuicByteCount bytes_in_flight,
    HasRetransmittableData has_retransmittable_data);



// ACK receiving path
// quic_connection.cc: an ACK frame has been received
bool QuicConnection::OnAckFrameEnd(
    QuicPacketNumber start, const absl::optional<QuicEcnCounts>& ecn_counts);

// quic_sent_packet_manager.cc: the sent-packet manager processes the acknowledged packets
AckResult QuicSentPacketManager::OnAckFrameEnd(
    QuicTime ack_receive_time, QuicPacketNumber ack_packet_number,
    EncryptionLevel ack_decrypted_level,
    const absl::optional<QuicEcnCounts>& ecn_counts);

// send_algorithm_interface.h: dispatched through the send algorithm interface to BBR v1
// bbr_sender.cc
void BbrSender::OnCongestionEvent(bool /*rtt_updated*/,
                                  QuicByteCount prior_in_flight,
                                  QuicTime event_time,
                                  const AckedPacketVector& acked_packets,
                                  const LostPacketVector& lost_packets,
                                  QuicPacketCount /*num_ect*/,
                                  QuicPacketCount /*num_ce*/);

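// bandwidth_sampler.cc: the sampler computes samples from the acknowledged packets
BandwidthSamplerInterface::CongestionEventSample
BandwidthSampler::OnCongestionEvent(QuicTime ack_time,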
                                    const AckedPacketVector& acked_packets,
                                    const LostPacketVector& lost_packets,
                                    QuicBandwidth max_bandwidth,
                                    QuicBandwidth est_bandwidth_upper_bound,
                                    QuicRoundTripCount round_trip_count);

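2.2 Sending Path

  While data is being sent, samples are computed from the amounts of data sent and acknowledged. This walkthrough does not start from the upper layers; it begins at the point where the send algorithm is invoked.

// quic_sent_packet_manager.cc
// Called when a packet is sent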
bool QuicSentPacketManager::OnPacketSent(
    SerializedPacket* mutable_packet, QuicTime sent_time,
    TransmissionType transmission_type,
    HasRetransmittableData has_retransmittable_data, bool measure_rtt,
    QuicEcnCodepoint ecn_codepoint) {
  // Extract the packet information
  const SerializedPacket& packet = *mutable_packet;
  QuicPacketNumber packet_number = packet.packet_number;
  QUICHE_DCHECK_LE(FirstSendingPacketNumber(), packet_number);
  QUICHE_DCHECK(!unacked_packets_.IsUnacked(packet_number));
  QUIC_BUG_IF(quic_bug_10750_2, packet.encrypted_length == 0)
      << "Cannot send empty packets.";
  if (pending_timer_transmission_count_ > 0) {
    --pending_timer_transmission_count_;
  }
  // Decide whether the packet counts as in flight (it carries retransmittable data)
  bool in_flight = has_retransmittable_data == HAS_RETRANSMITTABLE_DATA;
  if (ignore_pings_ && mutable_packet->retransmittable_frames.size() == 1 &&
      mutable_packet->retransmittable_frames[0].type == PING_FRAME) {
    // Do not use PING only packet for RTT measure or congestion control.
    in_flight = false;
    measure_rtt = false;
  }
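  // With pacing enabled, go through the pacing sender (which forwards to the
  // send algorithm); otherwise call the send algorithm directly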
  if (using_pacing_) {
    pacing_sender_.OnPacketSent(sent_time, unacked_packets_.bytes_in_flight(),
                                packet_number, packet.encrypted_length,
                                has_retransmittable_data);
  } else {
    send_algorithm_->OnPacketSent(sent_time, unacked_packets_.bytes_in_flight(),
                                  packet_number, packet.encrypted_length,
                                  has_retransmittable_data);
  }

  // Deallocate message data in QuicMessageFrame immediately after packet
  // sent.
  if (packet.has_message) {
    for (auto& frame : mutable_packet->retransmittable_frames) {
      if (frame.type == MESSAGE_FRAME) {
        frame.message_frame->message_data.clear();
        frame.message_frame->message_length = 0;
      }
    }
  }

  if (packet.has_ack_frequency) {
    for (const auto& frame : packet.retransmittable_frames) {
      if (frame.type == ACK_FREQUENCY_FRAME) {
        OnAckFrequencyFrameSent(*frame.ack_frequency_frame);
      }
    }
  }
  RecordEcnMarkingSent(ecn_codepoint, packet.encryption_level);
  // Record the packet in the unacked packet map
  unacked_packets_.AddSentPacket(mutable_packet, transmission_type, sent_time,
                                 in_flight, measure_rtt, ecn_codepoint);
  // Reset the retransmission timer anytime a pending packet is sent.
  return in_flight;
}

// quic_unacked_packet_map.cc
// Adds a sent packet to the unacked packet map
void QuicUnackedPacketMap::AddSentPacket(SerializedPacket* mutable_packet,
                                         TransmissionType transmission_type,
                                         QuicTime sent_time, bool set_in_flight,
                                         bool measure_rtt,
                                         QuicEcnCodepoint ecn_codepoint) {
  // Copy out the packet fields
  const SerializedPacket& packet = *mutable_packet;
  QuicPacketNumber packet_number = packet.packet_number;
  QuicPacketLength bytes_sent = packet.encrypted_length;
  QUIC_BUG_IF(quic_bug_12645_1, largest_sent_packet_.IsInitialized() &&
                                    largest_sent_packet_ >= packet_number)
      << "largest_sent_packet_: " << largest_sent_packet_
      << ", packet_number: " << packet_number;
  QUICHE_DCHECK_GE(packet_number, least_unacked_ + unacked_packets_.size());
  while (least_unacked_ + unacked_packets_.size() < packet_number) {
    unacked_packets_.push_back(QuicTransmissionInfo());
    unacked_packets_.back().state = NEVER_SENT;
  }

  // Record the transmission info
  const bool has_crypto_handshake = packet.has_crypto_handshake == IS_HANDSHAKE;
  QuicTransmissionInfo info(packet.encryption_level, transmission_type,
                            sent_time, bytes_sent, has_crypto_handshake,
                            packet.has_ack_frequency, ecn_codepoint);
  info.largest_acked = packet.largest_acked;
  largest_sent_largest_acked_.UpdateMax(packet.largest_acked);

  if (!measure_rtt) {
    QUIC_BUG_IF(quic_bug_12645_2, set_in_flight)
        << "Packet " << mutable_packet->packet_number << ", transmission type "
        << TransmissionTypeToString(mutable_packet->transmission_type)
        << ", retransmittable frames: "
        << QuicFramesToString(mutable_packet->retransmittable_frames)
        << ", nonretransmittable_frames: "
        << QuicFramesToString(mutable_packet->nonretransmittable_frames);
    info.state = NOT_CONTRIBUTING_RTT;
  }
  // Update the in-flight accounting
  largest_sent_packet_ = packet_number;
  if (set_in_flight) {
    const PacketNumberSpace packet_number_space =
        GetPacketNumberSpace(info.encryption_level);
    bytes_in_flight_ += bytes_sent;
    bytes_in_flight_per_packet_number_space_[packet_number_space] += bytes_sent;
    ++packets_in_flight_;
    info.in_flight = true;
    largest_sent_retransmittable_packets_[packet_number_space] = packet_number;
    last_inflight_packet_sent_time_ = sent_time;
    last_inflight_packets_sent_time_[packet_number_space] = sent_time;
  }
  unacked_packets_.push_back(std::move(info));
  // Swap the retransmittable frames to avoid allocations.
  // TODO(ianswett): Could use emplace_back when Chromium can.
  if (has_crypto_handshake) {
    last_crypto_packet_sent_time_ = sent_time;
  }

  mutable_packet->retransmittable_frames.swap(
      unacked_packets_.back().retransmittable_frames);
}

// bbr_sender.cc
// BBR's send hook
void BbrSender::OnPacketSent(QuicTime sent_time, QuicByteCount bytes_in_flight,
                             QuicPacketNumber packet_number,
                             QuicByteCount bytes,
                             HasRetransmittableData is_retransmittable) {
  // Update slow-start statistics
  if (stats_ && InSlowStart()) {
    ++stats_->slowstart_packets_sent;
    stats_->slowstart_bytes_sent += bytes;
  }

  last_sent_packet_ = packet_number;

  if (bytes_in_flight == 0 && sampler_.is_app_limited()) {
    exiting_quiescence_ = true;
  }

  // Hand the packet to the bandwidth sampler
  sampler_.OnPacketSent(sent_time, packet_number, bytes, bytes_in_flight,
                        is_retransmittable);
}

// bandwidth_sampler.cc
// The sampler records the sent packet
void BandwidthSampler::OnPacketSent(
    QuicTime sent_time, QuicPacketNumber packet_number, QuicByteCount bytes,
    QuicByteCount bytes_in_flight,
    HasRetransmittableData has_retransmittable_data) {
  last_sent_packet_ = packet_number;

  if (has_retransmittable_data != HAS_RETRANSMITTABLE_DATA) {
    return;
  }

  // Accumulate the total bytes sent
  total_bytes_sent_ += bytes;

  // If there are no packets in flight, the time at which the new transmission
  // opens can be treated as the A_0 point for the purpose of bandwidth
  // sampling. This underestimates bandwidth to some extent, and produces some
  // artificially low samples for most packets in flight, but it provides with
  // samples at important points where we would not have them otherwise, most
  // importantly at the beginning of the connection.

  // With nothing in flight, restart the sampling baseline at this send
  if (bytes_in_flight == 0) {
    last_acked_packet_ack_time_ = sent_time;
    if (overestimate_avoidance_) {
      recent_ack_points_.Clear();
      recent_ack_points_.Update(sent_time, total_bytes_acked_);
      a0_candidates_.clear();
      a0_candidates_.push_back(recent_ack_points_.MostRecentPoint());
    }
    total_bytes_sent_at_last_acked_packet_ = total_bytes_sent_;

    // In this situation ack compression is not a concern, set send rate to
    // effectively infinite.
    last_acked_packet_sent_time_ = sent_time;
  }

  if (!connection_state_map_.IsEmpty() &&
      packet_number >
          connection_state_map_.last_packet() + max_tracked_packets_) {
    if (unacked_packet_map_ != nullptr && !unacked_packet_map_->empty()) {
      QuicPacketNumber maybe_least_unacked =
          unacked_packet_map_->GetLeastUnacked();
      QUIC_BUG(quic_bug_10437_1)
          << "BandwidthSampler in-flight packet map has exceeded maximum "
             "number of tracked packets("
          << max_tracked_packets_
          << ").  First tracked: " << connection_state_map_.first_packet()
          << "; last tracked: " << connection_state_map_.last_packet()
          << "; entry_slots_used: " << connection_state_map_.entry_slots_used()
          << "; number_of_present_entries: "
          << connection_state_map_.number_of_present_entries()
          << "; packet number: " << packet_number
          << "; unacked_map: " << unacked_packet_map_->DebugString()
          << "; total_bytes_sent: " << total_bytes_sent_
          << "; total_bytes_acked: " << total_bytes_acked_
          << "; total_bytes_lost: " << total_bytes_lost_
          << "; total_bytes_neutered: " << total_bytes_neutered_
          << "; last_acked_packet_sent_time: " << last_acked_packet_sent_time_
          << "; total_bytes_sent_at_last_acked_packet: "
          << total_bytes_sent_at_last_acked_packet_
          << "; least_unacked_packet_info: "
          << (unacked_packet_map_->IsUnacked(maybe_least_unacked)
                  ? unacked_packet_map_
                        ->GetTransmissionInfo(maybe_least_unacked)
                        .DebugString()
                  : "n/a");
    } else {
      QUIC_BUG(quic_bug_10437_2)
          << "BandwidthSampler in-flight packet map has exceeded maximum "
             "number of tracked packets.";
    }
  }

  // Store the per-packet connection state for use when the ack arrives
  bool success = connection_state_map_.Emplace(packet_number, sent_time, bytes,
                                               bytes_in_flight + bytes, *this);
  QUIC_BUG_IF(quic_bug_10437_3, !success)
      << "BandwidthSampler failed to insert the packet "
         "into the map, most likely because it's already "
         "in it.";
}

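  The purpose of this path is to record everything sent over the lifetime of the connection.
  Once a packet has been sent, one of the following happens:
  1. The peer receives the packet and acknowledges it correctly;
  2. The peer does not receive the packet, so it cannot acknowledge it;
  3. The peer receives the packet, but the ACK is lost.

  To cover these cases we need to track both acknowledged data and unacknowledged data (which includes data that is queued or still in transit). quiche therefore splits the accounting across two modules: QuicUnackedPacketMap, and the bandwidth sampler embedded in the BBR sender (which also accounts for acknowledged data).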
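The bookkeeping described here can be sketched in a few lines. `InFlightTracker` is a hypothetical class written for this article, not part of quiche; it folds the roles of the unacked packet map and the sampler's byte counters into one place and ignores encryption levels, retransmittable frames, and RTT measurement.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical, simplified tracker. quiche's QuicUnackedPacketMap and
// BandwidthSampler keep far more state than this.
class InFlightTracker {
 public:
  void OnPacketSent(uint64_t packet_number, uint64_t bytes) {
    // Packet numbers are expected to be strictly increasing.
    assert(unacked_.empty() || packet_number > unacked_.rbegin()->first);
    unacked_[packet_number] = bytes;
    bytes_in_flight_ += bytes;
    total_bytes_sent_ += bytes;
  }

  // Both return false when the packet is unknown, e.g. a duplicate ack.
  bool OnPacketAcked(uint64_t packet_number) {
    return Remove(packet_number, &total_bytes_acked_);
  }
  bool OnPacketLost(uint64_t packet_number) {
    return Remove(packet_number, &total_bytes_lost_);
  }

  uint64_t bytes_in_flight() const { return bytes_in_flight_; }
  uint64_t total_bytes_sent() const { return total_bytes_sent_; }
  uint64_t total_bytes_acked() const { return total_bytes_acked_; }
  uint64_t total_bytes_lost() const { return total_bytes_lost_; }

 private:
  // Takes the packet out of flight and credits its size to |counter|.
  bool Remove(uint64_t packet_number, uint64_t* counter) {
    auto it = unacked_.find(packet_number);
    if (it == unacked_.end()) return false;
    bytes_in_flight_ -= it->second;
    *counter += it->second;
    unacked_.erase(it);
    return true;
  }

  std::map<uint64_t, uint64_t> unacked_;  // packet number -> size in bytes
  uint64_t bytes_in_flight_ = 0;
  uint64_t total_bytes_sent_ = 0;
  uint64_t total_bytes_acked_ = 0;
  uint64_t total_bytes_lost_ = 0;
};
```

After sending three packets and then acking one and losing another, `bytes_in_flight()` equals the size of the remaining packet, while the sent, acked, and lost totals keep growing monotonically. Those monotonic totals are what the sampler later differences to obtain rates.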

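2.3 Acknowledgment Path

2.3.1 Receiving ACKs in the Upper Layer

  After data is sent, the peer bundles the packets it has received into an ACK frame and returns it to the sender. The sender then processes that information and works out whether congestion control needs to react.

// quic_sent_packet_manager.cc
// Records the contents of an ACK range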
void QuicSentPacketManager::OnAckRange(QuicPacketNumber start,
                                       QuicPacketNumber end) {
  // Update the largest acked packet if this range raises it
  if (!last_ack_frame_.largest_acked.IsInitialized() ||
      end > last_ack_frame_.largest_acked + 1) {
    // Largest acked increases.
    unacked_packets_.IncreaseLargestAcked(end - 1);
    last_ack_frame_.largest_acked = end - 1;
  }

  // Fetch the least unacked packet to check whether this range is still relevant
  // Drop ack ranges which ack packets below least_unacked.
  QuicPacketNumber least_unacked = unacked_packets_.GetLeastUnacked();
  if (least_unacked.IsInitialized() && end <= least_unacked) {
    return;
  }
  // Loop to collect the newly acked packets
  start = std::max(start, least_unacked);
  do {
    QuicPacketNumber newly_acked_start = start;
    if (acked_packets_iter_ != last_ack_frame_.packets.rend()) {
      newly_acked_start = std::max(start, acked_packets_iter_->max());
    }
    for (QuicPacketNumber acked = end - 1; acked >= newly_acked_start;
         --acked) {
      // Check if end is above the current range. If so add newly acked packets
      // in descending order.
      packets_acked_.push_back(AckedPacket(acked, 0, QuicTime::Zero()));
      if (acked == FirstSendingPacketNumber()) {
        break;
      }
    }
    if (acked_packets_iter_ == last_ack_frame_.packets.rend() ||
        start > acked_packets_iter_->min()) {
      // Finish adding all newly acked packets.
      return;
    }
    end = std::min(end, acked_packets_iter_->min());
    ++acked_packets_iter_;
  } while (start < end);
}


// Raise a congestion event based on the acked and lost packets
void QuicSentPacketManager::MaybeInvokeCongestionEvent(
    bool rtt_updated, QuicByteCount prior_in_flight, QuicTime event_time,
    absl::optional<QuicEcnCounts> ecn_counts,
    const QuicEcnCounts& previous_counts) {
  if (!rtt_updated && packets_acked_.empty() && packets_lost_.empty()) {
    return;
  }
  const bool overshooting_detected =
      stats_->overshooting_detected_with_network_parameters_adjusted;
  // A connection should send at most one flavor of ECT, so only one variable
  // is necessary.
  QuicPacketCount newly_acked_ect = 0, newly_acked_ce = 0;
  if (ecn_counts.has_value()) {
    QUICHE_DCHECK(GetQuicReloadableFlag(quic_send_ect1));
    newly_acked_ect = ecn_counts->ect1 - previous_counts.ect1;
    if (newly_acked_ect == 0) {
      newly_acked_ect = ecn_counts->ect0 - previous_counts.ect0;
    } else {
      QUIC_BUG_IF(quic_bug_518619343_04,
                  ecn_counts->ect0 - previous_counts.ect0)
          << "Sent ECT(0) and ECT(1) newly acked in the same ACK.";
    }
    newly_acked_ce = ecn_counts->ce - previous_counts.ce;
  }
  if (using_pacing_) {
    pacing_sender_.OnCongestionEvent(rtt_updated, prior_in_flight, event_time,
                                     packets_acked_, packets_lost_,
                                     newly_acked_ect, newly_acked_ce);
  } else {
    // Feed the acked and lost packets into the send algorithm
    send_algorithm_->OnCongestionEvent(rtt_updated, prior_in_flight, event_time,
                                       packets_acked_, packets_lost_,
                                       newly_acked_ect, newly_acked_ce);
  }
  if (debug_delegate_ != nullptr && !overshooting_detected &&
      stats_->overshooting_detected_with_network_parameters_adjusted) {
    debug_delegate_->OnOvershootingDetected();
  }
  packets_acked_.clear();
  packets_lost_.clear();
  if (network_change_visitor_ != nullptr) {
    network_change_visitor_->OnCongestionChange();
  }
}
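To make the range handling in OnAckRange easier to follow, here is a minimal sketch of its core idea: the range is half-open, `[start, end)`, anything below the least unacked packet is dropped, and the remaining packet numbers are collected largest first. `ExpandAckRange` is a hypothetical helper; it deliberately leaves out the de-duplication against the previous ACK frame that the real function performs through `acked_packets_iter_`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Expands the half-open range [start, end) into packet numbers in descending
// order, skipping everything below |least_unacked|.
// Hypothetical helper for illustration; not part of quiche.
std::vector<uint64_t> ExpandAckRange(uint64_t start, uint64_t end,
                                     uint64_t least_unacked) {
  assert(start < end);
  std::vector<uint64_t> acked;
  if (end <= least_unacked) {
    return acked;  // The whole range is already acknowledged.
  }
  if (start < least_unacked) {
    start = least_unacked;
  }
  // Count down with n = packet number + 1 to avoid unsigned underflow.
  for (uint64_t n = end; n > start; --n) {
    acked.push_back(n - 1);
  }
  return acked;
}
```

For example, `ExpandAckRange(5, 9, 6)` yields packets 8, 7, 6: packet 5 is below the least unacked packet and packet 9 is outside the half-open range.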

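2.3.2 Congestion Calculation

  This part covers how a congestion event is handled. When ACK information reaches BBR, the acks are first sampled and tallied, and the network state is then determined from the sampling result. The code is as follows: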

void BbrSender::OnCongestionEvent(bool /*rtt_updated*/,
                                  QuicByteCount prior_in_flight,
                                  QuicTime event_time,
                                  const AckedPacketVector& acked_packets,
                                  const LostPacketVector& lost_packets,
                                  QuicPacketCount /*num_ect*/,
                                  QuicPacketCount /*num_ce*/) {
  // Snapshot the sampler's total acked and lost byte counts before this event
  const QuicByteCount total_bytes_acked_before = sampler_.total_bytes_acked();
  const QuicByteCount total_bytes_lost_before = sampler_.total_bytes_lost();

  bool is_round_start = false;
  bool min_rtt_expired = false;
  QuicByteCount excess_acked = 0;
  QuicByteCount bytes_lost = 0;

  // The send state of the largest packet in acked_packets, unless it is
  // empty. If acked_packets is empty, it's the send state of the largest
  // packet in lost_packets.
  SendTimeState last_packet_send_state;

  // Without acked packets, the round counter and recovery state cannot be updated
  if (!acked_packets.empty()) {
    QuicPacketNumber last_acked_packet = acked_packets.rbegin()->packet_number;
    // Advance the round-trip counter if this ack starts a new round
    is_round_start = UpdateRoundTripCounter(last_acked_packet);
    // Update the recovery state
    UpdateRecoveryState(last_acked_packet, !lost_packets.empty(),
                        is_round_start);
  }

  // Run the bandwidth sampler
  BandwidthSamplerInterface::CongestionEventSample sample =
      sampler_.OnCongestionEvent(event_time, acked_packets, lost_packets,
                                 max_bandwidth_.GetBest(),
                                 QuicBandwidth::Infinite(), round_trip_count_);
  // Packets the sampler can no longer find (or that are no longer in flight)
  // yield an invalid send state
  if (sample.last_packet_send_state.is_valid) {
    last_sample_is_app_limited_ = sample.last_packet_send_state.is_app_limited;
    has_non_app_limited_sample_ |= !last_sample_is_app_limited_;
    if (stats_) {
      stats_->has_non_app_limited_sample = has_non_app_limited_sample_;
    }
  }
  // Avoid updating |max_bandwidth_| if a) this is a loss-only event, or b) all
  // packets in |acked_packets| did not generate valid samples. (e.g. ack of
  // ack-only packets). In both cases, sampler_.total_bytes_acked() will not
  // change.
  if (total_bytes_acked_before != sampler_.total_bytes_acked()) {
    QUIC_LOG_IF(WARNING, sample.sample_max_bandwidth.IsZero())
        << sampler_.total_bytes_acked() - total_bytes_acked_before
        << " bytes from " << acked_packets.size()
        << " packets have been acked, but sample_max_bandwidth is zero.";
    if (!sample.sample_is_app_limited ||
        sample.sample_max_bandwidth > max_bandwidth_.GetBest()) {
      max_bandwidth_.Update(sample.sample_max_bandwidth, round_trip_count_);
    }
  }

  // If the sample carries a valid RTT, try to update the minimum RTT
  if (!sample.sample_rtt.IsInfinite()) {
    min_rtt_expired = MaybeUpdateMinRtt(event_time, sample.sample_rtt);
  }
  bytes_lost = sampler_.total_bytes_lost() - total_bytes_lost_before;
  if (mode_ == STARTUP) {
    if (stats_) {
      stats_->slowstart_packets_lost += lost_packets.size();
      stats_->slowstart_bytes_lost += bytes_lost;
    }
  }
  excess_acked = sample.extra_acked;
  last_packet_send_state = sample.last_packet_send_state;

  if (!lost_packets.empty()) {
    ++num_loss_events_in_round_;
    bytes_lost_in_round_ += bytes_lost;
  }

  // Handle logic specific to PROBE_BW mode.
  // Update the gain cycle phase
  if (mode_ == PROBE_BW) {
    UpdateGainCyclePhase(event_time, prior_in_flight, !lost_packets.empty());
  }

  // Handle logic specific to STARTUP and DRAIN modes.
  // Once per round, check whether full bandwidth has been reached
  if (is_round_start && !is_at_full_bandwidth_) {
    CheckIfFullBandwidthReached(last_packet_send_state);
  }
  
  // Leave STARTUP for DRAIN, or DRAIN for PROBE_BW, when appropriate
  MaybeExitStartupOrDrain(event_time);

  // Handle logic specific to PROBE_RTT.
  // Check whether to enter or exit PROBE_RTT (min RTT probing)
  MaybeEnterOrExitProbeRtt(event_time, is_round_start, min_rtt_expired);

  // Calculate number of packets acked and lost.
  QuicByteCount bytes_acked =
      sampler_.total_bytes_acked() - total_bytes_acked_before;

  // After the model is updated, recalculate the pacing rate and congestion
  // window.
  // Compute the pacing rate
  CalculatePacingRate(bytes_lost);
  // Compute the congestion window
  CalculateCongestionWindow(bytes_acked, excess_acked);
  // Compute the recovery window
  CalculateRecoveryWindow(bytes_acked, bytes_lost);

  // Cleanup internal state.
  // Drop sampler state for packets below the least unacked packet
  sampler_.RemoveObsoletePackets(unacked_packets_->GetLeastUnacked());
  if (is_round_start) {
    num_loss_events_in_round_ = 0;
    bytes_lost_in_round_ = 0;
  }
}
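OnCongestionEvent relies on `UpdateRoundTripCounter` to decide when a new round starts, but the body of that function is not shown in this article. The sketch below reflects my understanding of the idea, under the assumption that a round ends once a packet sent after the previous round boundary is acknowledged. `RoundTripCounter` is a hypothetical class, and packet numbers are modeled as plain integers where 0 means "not set".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified round-trip counter. Packet numbers start at 1;
// the value 0 stands for "uninitialized".
class RoundTripCounter {
 public:
  void OnPacketSent(uint64_t packet_number) {
    assert(packet_number > last_sent_packet_);
    last_sent_packet_ = packet_number;
  }

  // Returns true when this ack starts a new round. A round ends when a packet
  // sent after the previous boundary is acked; the new boundary is the last
  // packet sent so far.
  bool OnPacketAcked(uint64_t last_acked_packet) {
    if (current_round_trip_end_ == 0 ||
        last_acked_packet > current_round_trip_end_) {
      ++round_trip_count_;
      current_round_trip_end_ = last_sent_packet_;
      return true;
    }
    return false;
  }

  uint64_t round_trip_count() const { return round_trip_count_; }

 private:
  uint64_t last_sent_packet_ = 0;
  uint64_t current_round_trip_end_ = 0;
  uint64_t round_trip_count_ = 0;
};
```

Counting rounds by packet number rather than by wall-clock time means that the windowed filters and the "once per round" checks advance at the pace of the connection itself.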

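  UpdateRecoveryState determines the recovery state: based on whether losses were reported, it decides whether the sender should be in loss recovery. The recovery states are NOT_IN_RECOVERY, CONSERVATION, and GROWTH.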

void BbrSender::UpdateRecoveryState(QuicPacketNumber last_acked_packet,
                                    bool has_losses, bool is_round_start) {
  // Disable recovery in startup, if loss-based exit is enabled.
  if (!is_at_full_bandwidth_) {
    return;
  }

  // Exit recovery when there are no losses for a round.
  // On loss, push the end of recovery out to the last packet sent
  if (has_losses) {
    end_recovery_at_ = last_sent_packet_;
  }

  switch (recovery_state_) {
    // Not in recovery: the first loss moves us into recovery-window calculation
    case NOT_IN_RECOVERY:
      // Enter conservation on the first loss.
      if (has_losses) {
        recovery_state_ = CONSERVATION;
        // This will cause the |recovery_window_| to be set to the correct
        // value in CalculateRecoveryWindow().
        recovery_window_ = 0;
        // Since the conservation phase is meant to be lasting for a whole
        // round, extend the current round as if it were started right now.
        current_round_trip_end_ = last_sent_packet_;
      }
      break;
	
    // CONSERVATION: hold the recovery window; at the start of the next round,
    // switch to GROWTH
    case CONSERVATION:
      if (is_round_start) {
        recovery_state_ = GROWTH;
      }
      ABSL_FALLTHROUGH_INTENDED;
	
    // GROWTH: the recovery window grows; exit recovery once there are no losses
    // and the latest ack is beyond end_recovery_at_
    case GROWTH:
      // Exit recovery if appropriate.
      if (!has_losses && last_acked_packet > end_recovery_at_) {
        recovery_state_ = NOT_IN_RECOVERY;
      }

      break;
  }
}
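The switch in UpdateRecoveryState can be restated as a pure function, which makes the transitions easy to check in isolation. `NextRecoveryState` is a hypothetical helper; `acked_past_recovery_end` stands in for the condition `last_acked_packet > end_recovery_at_`, and the early return for `!is_at_full_bandwidth_` as well as the side effects on `recovery_window_` and `current_round_trip_end_` are left out.

```cpp
#include <cassert>

enum class RecoveryState { kNotInRecovery, kConservation, kGrowth };

// Pure restatement of the switch in UpdateRecoveryState.
// Hypothetical helper for illustration; not part of quiche.
RecoveryState NextRecoveryState(RecoveryState state, bool has_losses,
                                bool is_round_start,
                                bool acked_past_recovery_end) {
  switch (state) {
    case RecoveryState::kNotInRecovery:
      // Enter conservation on the first loss.
      return has_losses ? RecoveryState::kConservation : state;
    case RecoveryState::kConservation:
      if (is_round_start) {
        state = RecoveryState::kGrowth;
      }
      [[fallthrough]];
    case RecoveryState::kGrowth:
      // Exit recovery after a loss-free ack beyond the recovery end point.
      if (!has_losses && acked_past_recovery_end) {
        return RecoveryState::kNotInRecovery;
      }
      return state;
  }
  assert(false && "unreachable");
  return state;
}
```

Note that because of the fallthrough, a sender in CONSERVATION can leave recovery in the same call in which it would otherwise have moved to GROWTH.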

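  The sampler's OnCongestionEvent processes the lost and acknowledged packets and picks the largest bandwidth sample seen in the event.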

BandwidthSamplerInterface::CongestionEventSample
BandwidthSampler::OnCongestionEvent(QuicTime ack_time,
                                    const AckedPacketVector& acked_packets,
                                    const LostPacketVector& lost_packets,
                                    QuicBandwidth max_bandwidth,
                                    QuicBandwidth est_bandwidth_upper_bound,
                                    QuicRoundTripCount round_trip_count) {
  CongestionEventSample event_sample;

  SendTimeState last_lost_packet_send_state;

  // First process the lost packets so the sampler's state reflects them
  for (const LostPacket& packet : lost_packets) {
    SendTimeState send_state =
        OnPacketLost(packet.packet_number, packet.bytes_lost);
    if (send_state.is_valid) {
      last_lost_packet_send_state = send_state;
    }
  }
  
  // No acked packets: this is a loss-only event, so return only the send state
  if (acked_packets.empty()) {
    // Only populate send state for a loss-only event.
    event_sample.last_packet_send_state = last_lost_packet_send_state;
    return event_sample;
  }

  // Look up each acked packet in turn
  SendTimeState last_acked_packet_send_state;
  QuicBandwidth max_send_rate = QuicBandwidth::Zero();
  for (const auto& packet : acked_packets) {
    // Skip acked packets that the sampler no longer tracks
    BandwidthSample sample =
        OnPacketAcknowledged(ack_time, packet.packet_number);
    if (!sample.state_at_send.is_valid) {
      continue;
    }

    last_acked_packet_send_state = sample.state_at_send;

    // Track the smallest RTT in this event
    if (!sample.rtt.IsZero()) {
      event_sample.sample_rtt = std::min(event_sample.sample_rtt, sample.rtt);
    }
	
    // Track the largest bandwidth sample in this event
    if (sample.bandwidth > event_sample.sample_max_bandwidth) {
      event_sample.sample_max_bandwidth = sample.bandwidth;
      event_sample.sample_is_app_limited = sample.state_at_send.is_app_limited;
    }
    if (!sample.send_rate.IsInfinite()) {
      max_send_rate = std::max(max_send_rate, sample.send_rate);
    }
    const QuicByteCount inflight_sample =
        total_bytes_acked() - last_acked_packet_send_state.total_bytes_acked;
    if (inflight_sample > event_sample.sample_max_inflight) {
      event_sample.sample_max_inflight = inflight_sample;
    }
  }

  if (!last_lost_packet_send_state.is_valid) {
    event_sample.last_packet_send_state = last_acked_packet_send_state;
  } else if (!last_acked_packet_send_state.is_valid) {
    event_sample.last_packet_send_state = last_lost_packet_send_state;
  } else {
    // If two packets are inflight and an alarm is armed to lose a packet and it
    // wakes up late, then the first of two in flight packets could have been
    // acknowledged before the wakeup, which re-evaluates loss detection, and
    // could declare the later of the two lost.
    event_sample.last_packet_send_state =
        lost_packets.back().packet_number > acked_packets.back().packet_number
            ? last_lost_packet_send_state
            : last_acked_packet_send_state;
  }
  
  // Update the max bandwidth, then finish the ack event (extra acked calculation)
  bool is_new_max_bandwidth = event_sample.sample_max_bandwidth > max_bandwidth;
  max_bandwidth = std::max(max_bandwidth, event_sample.sample_max_bandwidth);
  if (limit_max_ack_height_tracker_by_send_rate_) {
    max_bandwidth = std::max(max_bandwidth, max_send_rate);
  }
  // TODO(ianswett): Why is the min being passed in here?
  event_sample.extra_acked =
      OnAckEventEnd(std::min(est_bandwidth_upper_bound, max_bandwidth),
                    is_new_max_bandwidth, round_trip_count);

  return event_sample;
}
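The per-event maximum returned here is fed into `max_bandwidth_.Update(sample, round_trip_count_)` in BbrSender, which keeps the best estimate over a window of recent rounds. The sketch below shows one simple way to build such a windowed maximum, using a monotonic deque. This is a swapped-in technique chosen for brevity, not quiche's WindowedFilter implementation, and `WindowedMax` is a hypothetical class.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <utility>

// Windowed maximum over the last |window_rounds| rounds, implemented with a
// monotonic deque. Hypothetical sketch; quiche uses a different filter.
class WindowedMax {
 public:
  explicit WindowedMax(uint64_t window_rounds) : window_(window_rounds) {
    assert(window_rounds > 0);
  }

  void Update(uint64_t sample, uint64_t round) {
    // Older entries that are not larger than the new sample can never be the
    // maximum again, so drop them.
    while (!entries_.empty() && entries_.back().second <= sample) {
      entries_.pop_back();
    }
    entries_.emplace_back(round, sample);
    // Expire entries that fell out of the window. The entry just added is
    // never expired because window_ > 0.
    while (entries_.front().first + window_ <= round) {
      entries_.pop_front();
    }
  }

  // Returns 0 when no sample has been recorded yet.
  uint64_t GetBest() const {
    return entries_.empty() ? 0 : entries_.front().second;
  }

 private:
  uint64_t window_;
  // (round, sample) pairs with strictly decreasing samples.
  std::deque<std::pair<uint64_t, uint64_t>> entries_;
};
```

With a window of 3 rounds, a sample of 100 recorded in round 1 stays the best estimate through round 3 and is replaced by the next-largest recent sample in round 4. A windowed maximum lets the estimate decay when the path gets slower, instead of remembering a peak forever.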

  The sampler's per-packet acknowledgment logic:

BandwidthSample BandwidthSampler::OnPacketAcknowledgedInner(
    QuicTime ack_time, QuicPacketNumber packet_number,
    const ConnectionStateOnSentPacket& sent_packet) {
  // Accumulate the total acked bytes
  total_bytes_acked_ += sent_packet.size;
  total_bytes_sent_at_last_acked_packet_ =
      sent_packet.send_time_state.total_bytes_sent;
  last_acked_packet_sent_time_ = sent_packet.sent_time;
  last_acked_packet_ack_time_ = ack_time;
  if (overestimate_avoidance_) {
    recent_ack_points_.Update(ack_time, total_bytes_acked_);
  }

  if (is_app_limited_) {
    // Exit app-limited phase in two cases:
    // (1) end_of_app_limited_phase_ is not initialized, i.e., so far all
    // packets are sent while there are buffered packets or pending data.
    // (2) The current acked packet is after the sent packet marked as the end
    // of the app limit phase.
    if (!end_of_app_limited_phase_.IsInitialized() ||
        packet_number > end_of_app_limited_phase_) {
      is_app_limited_ = false;
    }
  }

  // There might have been no packets acknowledged at the moment when the
  // current packet was sent. In that case, there is no bandwidth sample to
  // make.
  if (sent_packet.last_acked_packet_sent_time == QuicTime::Zero()) {
//    QUIC_BUG(quic_bug_10437_4)
//        << "sent_packet.last_acked_packet_sent_time is zero";
    return BandwidthSample();
  }

  // Infinite rate indicates that the sampler is supposed to discard the
  // current send rate sample and use only the ack rate.
  // Compute the send rate
  QuicBandwidth send_rate = QuicBandwidth::Infinite();
  if (sent_packet.sent_time > sent_packet.last_acked_packet_sent_time) {
    send_rate = QuicBandwidth::FromBytesAndTimeDelta(
        sent_packet.send_time_state.total_bytes_sent -
            sent_packet.total_bytes_sent_at_last_acked_packet,
        sent_packet.sent_time - sent_packet.last_acked_packet_sent_time);
  }

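  // An AckPoint is a point on the cumulative-ack curve (time, total bytes
  // acked); choose the earlier point a0 to measure the ack rate from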
  AckPoint a0;
  if (overestimate_avoidance_ &&
      ChooseA0Point(sent_packet.send_time_state.total_bytes_acked, &a0)) {
//    QUIC_DVLOG(2) << "Using a0 point: " << a0;
  } else {
    a0.ack_time = sent_packet.last_acked_packet_ack_time,
    a0.total_bytes_acked = sent_packet.send_time_state.total_bytes_acked;
  }

  // During the slope calculation, ensure that ack time of the current packet is
  // always larger than the time of the previous packet, otherwise division by
  // zero or integer underflow can occur.
  if (ack_time <= a0.ack_time) {
    // TODO(wub): Compare this code count before and after fixing clock jitter
    // issue.
    if (a0.ack_time == sent_packet.sent_time) {
      // This is the 1st packet after quiescense.
      QUIC_CODE_COUNT_N(quic_prev_ack_time_larger_than_current_ack_time, 1, 2);
    } else {
      QUIC_CODE_COUNT_N(quic_prev_ack_time_larger_than_current_ack_time, 2, 2);
    }
//    QUIC_LOG_EVERY_N_SEC(ERROR, 60)
//        << "Time of the previously acked packet:"
//        << a0.ack_time.ToDebuggingValue()
//        << " is larger than the ack time of the current packet:"
//        << ack_time.ToDebuggingValue()
//        << ". acked packet number:" << packet_number
//        << ", total_bytes_acked_:" << total_bytes_acked_
//        << ", overestimate_avoidance_:" << overestimate_avoidance_
//        << ", sent_packet:" << sent_packet;
    return BandwidthSample();
  }
  
  // Compute the ack rate
  QuicBandwidth ack_rate = QuicBandwidth::FromBytesAndTimeDelta(
      total_bytes_acked_ - a0.total_bytes_acked, ack_time - a0.ack_time);

  // The sample is the smaller of the send rate and the ack rate
  BandwidthSample sample;
  sample.bandwidth = std::min(send_rate, ack_rate);
  // Note: this sample does not account for delayed acknowledgement time.  This
  // means that the RTT measurements here can be artificially high, especially
  // on low bandwidth connections.
  sample.rtt = ack_time - sent_packet.sent_time;
  sample.send_rate = send_rate;
  SentPacketToSendTimeState(sent_packet, &sample.state_at_send);

  if (sample.bandwidth.IsZero()) {
//    QUIC_LOG_EVERY_N_SEC(ERROR, 60)
//        << "ack_rate: " << ack_rate << ", send_rate: " << send_rate
//        << ". acked packet number:" << packet_number
//        << ", overestimate_avoidance_:" << overestimate_avoidance_ << "a1:{"
//        << total_bytes_acked_ << "@" << ack_time << "}, a0:{"
//        << a0.total_bytes_acked << "@" << a0.ack_time
//        << "}, sent_packet:" << sent_packet;
  }
  return sample;
}
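The heart of OnPacketAcknowledgedInner is the line `sample.bandwidth = std::min(send_rate, ack_rate)`: each rate is the slope between two points on a cumulative byte curve. The following sketch works through that arithmetic with made-up numbers. `Point`, `RateBytesPerSec`, and `BandwidthSampleBytesPerSec` are hypothetical names, times are in microseconds, and the special case where the send rate is treated as infinite is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// A point on a cumulative curve: total bytes sent (or acked) at a given time.
// Hypothetical type for illustration; not part of quiche.
struct Point {
  int64_t time_us;
  uint64_t total_bytes;
};

// Slope between two points, in bytes per second.
uint64_t RateBytesPerSec(const Point& from, const Point& to) {
  assert(to.time_us > from.time_us);
  assert(to.total_bytes >= from.total_bytes);
  const uint64_t delta_us = static_cast<uint64_t>(to.time_us - from.time_us);
  return (to.total_bytes - from.total_bytes) * 1000000ULL / delta_us;
}

// The bandwidth sample is the smaller of the send rate and the ack rate, so a
// burst of compressed acks cannot push the estimate above what was sent.
uint64_t BandwidthSampleBytesPerSec(const Point& send_prev,
                                    const Point& send_now,
                                    const Point& ack_prev,
                                    const Point& ack_now) {
  return std::min(RateBytesPerSec(send_prev, send_now),
                  RateBytesPerSec(ack_prev, ack_now));
}
```

As a worked example, suppose 12000 bytes were sent over 10 ms (a send rate of 1,200,000 bytes/s) and the matching 12000 bytes were acknowledged over 20 ms (an ack rate of 600,000 bytes/s). The sample is the ack rate, 600,000 bytes/s, because the path delivered the data more slowly than it was sent.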

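2.3.3 State Machine Transitions

  BBR has four important states, and the current state determines the gains applied to sending.
  They are STARTUP, DRAIN, PROBE_BW, and PROBE_RTT.

  STARTUP: the startup state; a high gain is used so the sending rate keeps growing quickly;
  DRAIN: the rate has grown past the link capacity, so a draining period is needed; the pacing gain drops to drain_gain_ so the queue can empty (in the code shown here the congestion window gain stays at high_cwnd_gain_);
  PROBE_BW: the maximum-bandwidth probing phase;
  PROBE_RTT: the minimum-RTT probing phase.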
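The transitions between the four states can be summarized as a small pure function that mirrors the order of checks in MaybeExitStartupOrDrain and MaybeEnterOrExitProbeRtt from bbr_sender.cc. `Mode`, `Signals`, and `NextMode` are hypothetical names for this sketch; each boolean stands in for a condition evaluated by the real code, and the `exiting_quiescence_` guard is ignored.

```cpp
#include <cassert>

enum class Mode { kStartup, kDrain, kProbeBw, kProbeRtt };

// Inputs that drive the transitions. Hypothetical struct for illustration.
struct Signals {
  bool full_bandwidth_reached;    // is_at_full_bandwidth_
  bool inflight_at_or_below_bdp;  // bytes_in_flight <= GetTargetCongestionWindow(1)
  bool min_rtt_expired;           // the min RTT estimate is stale
  bool probe_rtt_done;            // PROBE_RTT has lasted long enough
};

// Returns the mode after one congestion event.
Mode NextMode(Mode mode, const Signals& s) {
  // STARTUP -> DRAIN once the bandwidth estimate stops growing.
  if (mode == Mode::kStartup && s.full_bandwidth_reached) {
    mode = Mode::kDrain;
  }
  // DRAIN -> PROBE_BW once the queue built up in STARTUP has drained.
  if (mode == Mode::kDrain && s.inflight_at_or_below_bdp) {
    mode = Mode::kProbeBw;
  }
  // Any mode -> PROBE_RTT when the min RTT estimate expires. The exit time is
  // only scheduled later, so PROBE_RTT cannot end in the same event.
  if (s.min_rtt_expired && mode != Mode::kProbeRtt) {
    return Mode::kProbeRtt;
  }
  // PROBE_RTT -> PROBE_BW if the pipe was already filled, else STARTUP.
  if (mode == Mode::kProbeRtt && s.probe_rtt_done) {
    mode = s.full_bandwidth_reached ? Mode::kProbeBw : Mode::kStartup;
  }
  return mode;
}
```

In this sketch a single event can carry the sender from STARTUP through DRAIN into PROBE_BW when both conditions already hold, which matches the two consecutive `if` statements in MaybeExitStartupOrDrain.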
[Figure 1 from the original article; image not available]

// STARTUP
// Entered directly when BbrSender is created, or on leaving PROBE_RTT while full
// bandwidth has not been reached
void BbrSender::EnterStartupMode(QuicTime now) {
  if (stats_) {
    ++stats_->slowstart_count;
    stats_->slowstart_duration.Start(now);
  }
  mode_ = STARTUP;
  pacing_gain_ = high_gain_;
  congestion_window_gain_ = high_cwnd_gain_;
}

// DRAIN
// Entered once full bandwidth has been detected
void BbrSender::MaybeExitStartupOrDrain(QuicTime now) {
  if (mode_ == STARTUP && is_at_full_bandwidth_) {
    OnExitStartup(now);
    mode_ = DRAIN;
    pacing_gain_ = drain_gain_;
    congestion_window_gain_ = high_cwnd_gain_;
  }
  if (mode_ == DRAIN &&
      unacked_packets_->bytes_in_flight() <= GetTargetCongestionWindow(1)) {
    EnterProbeBandwidthMode(now);
  }
}

// PROBE_BW
// Entered from DRAIN once bytes in flight fall to the target congestion window;
// in this mode the sender keeps probing for bandwidth
void BbrSender::EnterProbeBandwidthMode(QuicTime now) {
  mode_ = PROBE_BW;
  congestion_window_gain_ = congestion_window_gain_constant_;

  // Pick a random offset for the gain cycle out of {0, 2..7} range. 1 is
  // excluded because in that case increased gain and decreased gain would not
  // follow each other.
  cycle_current_offset_ = random_->RandUint64() % (kGainCycleLength - 1);
  if (cycle_current_offset_ >= 1) {
    cycle_current_offset_ += 1;
  }

  last_cycle_start_ = now;
  pacing_gain_ = kPacingGain[cycle_current_offset_];
}

// PROBE_RTT
// Entered when the min RTT estimate expires
void BbrSender::MaybeEnterOrExitProbeRtt(QuicTime now, bool is_round_start,
                                         bool min_rtt_expired) {
  if (min_rtt_expired && !exiting_quiescence_ && mode_ != PROBE_RTT) {
    if (InSlowStart()) {
      OnExitStartup(now);
    }
    mode_ = PROBE_RTT;
    pacing_gain_ = 1;
    // Do not decide on the time to exit PROBE_RTT until the |bytes_in_flight|
    // is at the target small value.
    exit_probe_rtt_at_ = QuicTime::Zero();
  }

  if (mode_ == PROBE_RTT) {
    sampler_.OnAppLimited();

    if (exit_probe_rtt_at_ == QuicTime::Zero()) {
      // If the window has reached the appropriate size, schedule exiting
      // PROBE_RTT.  The CWND during PROBE_RTT is kMinimumCongestionWindow, but
      // we allow an extra packet since QUIC checks CWND before sending a
      // packet.
      if (unacked_packets_->bytes_in_flight() <
          ProbeRttCongestionWindow() + kMaxOutgoingPacketSize) {
        exit_probe_rtt_at_ = now + kProbeRttTime;
        probe_rtt_round_passed_ = false;
      }
    } else {
      if (is_round_start) {
        probe_rtt_round_passed_ = true;
      }
      if (now >= exit_probe_rtt_at_ && probe_rtt_round_passed_) {
        min_rtt_timestamp_ = now;
        if (!is_at_full_bandwidth_) {
          EnterStartupMode(now);
        } else {
          EnterProbeBandwidthMode(now);
        }
      }
    }
  }

  exiting_quiescence_ = false;
}
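The random starting offset chosen in EnterProbeBandwidthMode is easy to misread, so here is the mapping in isolation. `PickCycleOffset` is a hypothetical helper. The gain table is my assumption of what `kPacingGain` contains (one probing phase at 1.25, one draining phase at 0.75, then six cruising phases at 1); the article does not show it, so treat the values as illustrative.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kGainCycleLength = 8;
// Assumed contents of the gain cycle; illustrative values only.
constexpr float kPacingGain[kGainCycleLength] = {1.25f, 0.75f, 1.0f, 1.0f,
                                                 1.0f,  1.0f,  1.0f, 1.0f};

// Maps a random value to a starting offset in {0, 2, 3, ..., 7}. Offset 1
// (the 0.75 phase) is skipped so that a cycle never starts by draining
// without having probed first. Hypothetical helper; not part of quiche.
int PickCycleOffset(uint64_t random_value) {
  int offset = static_cast<int>(random_value % (kGainCycleLength - 1));  // 0..6
  if (offset >= 1) {
    offset += 1;  // Shift 1..6 up to 2..7.
  }
  assert(offset != 1 && offset >= 0 && offset < kGainCycleLength);
  return offset;
}
```

Randomizing the starting phase spreads the probing phases of different connections over time, so that flows sharing a bottleneck are less likely to all probe at once.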

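3. Summary

  This article briefly walked through the main flow of the BBR v1 implementation in quiche; many details were left out. The idea behind QUIC's BBR is to measure the carrying capacity of the whole transmission path: the emphasis is on sampled measurement rather than the prediction used in GCC. This difference is what makes BBR more proactive than GCC and more aggressive when competing for bandwidth. A key point in BBR is keeping the samples reliable, so the choice of filtering and regression algorithms is an important part of BBR's reliability. Later articles will look at these algorithms in more depth.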
