WebRtc Video Receiver (II): RTP Packet Reception Flow Analysis

1) Introduction

  • The article "WebRtc Video Receiver (I): Module Creation Analysis" covered how the Video Receive Stream is created and how it relates to the other modules.
  • This article focuses on how the Video Stream Receiver receives the RTP data stream and how that stream is processed.
  • First, a diagram of how video RTP packets received at the network layer are delivered to the Call module:
    WebRtc_Video_Stream_Receiver_02_01.png
  • As the diagram shows, when PeerConnection creates a channel it creates a MediaChannel (part of webrtc_video_engine) and associates it with the PC layer's BaseChannel.
  • When the network layer receives data, it hands it to BaseChannel via a signal.
  • BaseChannel passes the RTP packet to MediaChannel on the worker thread; through MediaChannel's inheritance hierarchy the data ultimately reaches the VideoMediaChannel module.
  • Finally, VideoMediaChannel's OnPacketReceived() function dispatches the RTP data by calling the Call module's DeliverPacket() function.
  • Before looking at how the Call module dispatches data to RtpVideoStreamReceiver, let's examine the RtpVideoStreamReceiver constructor; why the data is dispatched to this class was described in detail in "WebRtc Video Stream Receiver (I)".
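The delivery chain above can be condensed into a short sketch. This is not the verbatim WebRTC source: the types below are simplified stand-ins, though the function names (OnPacketReceived, DeliverPacket) mirror the real WebRtcVideoChannel and webrtc::PacketReceiver interfaces.

```cpp
#include <cstdint>
#include <vector>

enum class MediaType { VIDEO };
enum class DeliveryStatus { kOk, kUnknownSsrc, kPacketError };

// Simplified stand-in for webrtc::Call's PacketReceiver interface.
struct Call {
  DeliveryStatus DeliverPacket(MediaType type,
                               const std::vector<uint8_t>& packet) {
    // The real Call demuxes on SSRC and forwards the packet to the
    // matching receive stream (ultimately RtpVideoStreamReceiver).
    return packet.empty() ? DeliveryStatus::kPacketError
                          : DeliveryStatus::kOk;
  }
};

// Mirrors WebRtcVideoChannel::OnPacketReceived(): runs on the worker
// thread and simply delegates to the Call module.
DeliveryStatus OnPacketReceived(Call* call,
                                const std::vector<uint8_t>& packet) {
  return call->DeliverPacket(MediaType::VIDEO, packet);
}
```

The point of the design is that the media channel does no RTP parsing of its own; demuxing and stream-level handling are centralized in Call.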

2) RtpVideoStreamReceiver core members

  • Before analyzing the constructor, let's look at RtpVideoStreamReceiver's inheritance and dependency relationships.
    WebRtc_Video_Stream_Receiver_02_02.png
  • The packet_buffer_ member manages VCMPacket packets; once an RTP packet has been parsed, its payload is wrapped into a VCMPacket.
  • Following the inheritance above, as VCMPackets are inserted through video_coding::PacketBuffer's insert function they are assembled internally; when a complete frame is detected, OnAssembledFrame() fires, passing the frame back to the RtpVideoStreamReceiver module.
  • The reference_finder_ member is the RTP frame reference finder, responsible for managing frames. When OnAssembledFrame() is invoked, its handler uses reference_finder_ to manage and make decisions about the current frame; its exact role is analyzed later.
    WebRtc_Video_Stream_Receiver_02_03.png
  • Every RtpVideoStreamReceiver has an rtp_rtcp_ member, which shows that in WebRTC each stream has its own RTCP control module, used to receive RTCP control messages from the remote peer and to send RTCP requests.
  • The nack_module_ member derives from Module and runs as a timed thread; its job is to watch for packet loss and to process the loss list.

3) RtpVideoStreamReceiver RTP packet handling

  • Following the flow diagram in the introduction, data is dispatched by the Call module and ultimately reaches the RtpVideoStreamReceiver module, whose core processing logic is shown below:
    WebRtc_Video_Stream_Receiver_02_04.png
  • As the diagram shows, RtpVideoStreamReceiver's handling of the RTP stream involves three major steps.
  • First, the RTP packet is parsed: the RTP header and related fields are separated out, yielding the RTPVideoHeader, the RTPHeader, and the payload_data.
  • Second, a VCMPacket is built from the RTPVideoHeader, RTPHeader, payload_data, and related fields, with error-resilience checks: for H264, if the packet is the first packet of a frame and an IDR packet, the completeness of the SPS/PPS information is verified.
  • The packet's sequence number is also delivered to the NackModule through the NackModule::OnReceivedPacket callback. NackModule checks each received sequence number for continuity; a gap indicates packet loss, the missing sequence numbers are inserted into the corresponding loss list, and NackModule's timer mechanism sends the retransmission (NACK) requests.
  • Finally, if no loss occurred, the assembled VCMPacket is inserted into the packet_buffer_ member managed by RtpVideoStreamReceiver, where frames are assembled.
  • The sections below analyze these three steps and their underlying principles.
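The continuity check that NackModule performs on sequence numbers can be sketched as follows. This is assumed logic, not the real module (which also tracks reordering tolerance, RTT, and retransmission timers); the wraparound comparison does match WebRTC's IsNewerSequenceNumber() helper.

```cpp
#include <cstdint>
#include <set>

// Same arithmetic as WebRTC's IsNewerSequenceNumber(): correct across the
// 16-bit sequence-number wraparound.
bool IsNewerSequenceNumber(uint16_t seq, uint16_t prev) {
  return static_cast<uint16_t>(seq - prev) < 0x8000 && seq != prev;
}

class NackTracker {
 public:
  // Returns the sequence numbers newly detected as missing.
  std::set<uint16_t> OnReceivedPacket(uint16_t seq) {
    std::set<uint16_t> new_losses;
    if (!initialized_) {
      initialized_ = true;
      newest_seq_ = seq;
      return new_losses;
    }
    if (!IsNewerSequenceNumber(seq, newest_seq_)) {
      // Out-of-order or retransmitted packet: it is no longer missing.
      nack_list_.erase(seq);
      return new_losses;
    }
    // Every value between the previous newest and this one is a loss.
    for (uint16_t s = static_cast<uint16_t>(newest_seq_ + 1); s != seq; ++s) {
      nack_list_.insert(s);
      new_losses.insert(s);
    }
    newest_seq_ = seq;
    return new_losses;
  }

 private:
  bool initialized_ = false;
  uint16_t newest_seq_ = 0;
  std::set<uint16_t> nack_list_;  // candidates for NACK retransmission
};
```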

4) RtpVideoStreamReceiver RTP packet parsing

void RtpVideoStreamReceiver::ReceivePacket(const RtpPacketReceived& packet) {
  if (packet.payload_size() == 0) {
    // Padding or keep-alive packet.
    // TODO(nisse): Could drop empty packets earlier, but need to figure out how
    // they should be counted in stats.
    NotifyReceiverOfEmptyPacket(packet.SequenceNumber());
    return;
  } 
  if (packet.PayloadType() == config_.rtp.red_payload_type) {
    ParseAndHandleEncapsulatingHeader(packet);
    return;
  }
  /* The map holds a single entry: the payload type negotiated during the
     handshake for the decoder (e.g. 107 for H264). How it is populated is
     described in part (I). */
  const auto type_it = payload_type_map_.find(packet.PayloadType());
  if (type_it == payload_type_map_.end()) {
    return;
  }
  /* Create the depacketizer matching the payload type. */
  auto depacketizer =
      absl::WrapUnique(RtpDepacketizer::Create(type_it->second));
  if (!depacketizer) {
    RTC_LOG(LS_ERROR) << "Failed to create depacketizer.";
    return;
  }
    
  RtpDepacketizer::ParsedPayload parsed_payload;
  if (!depacketizer->Parse(&parsed_payload, packet.payload().data(),
                           packet.payload().size())) {
    RTC_LOG(LS_WARNING) << "Failed parsing payload.";
    return;
  }

  RTPHeader rtp_header;
  packet.GetHeader(&rtp_header);
  /* The parsed information is stored by the RtpDepacketizer. */
  RTPVideoHeader video_header = parsed_payload.video_header();
  ......
  video_header.is_last_packet_in_frame = rtp_header.markerBit;
  video_header.frame_marking.temporal_id = kNoTemporalIdx;

  if (parsed_payload.video_header().codec == kVideoCodecVP9) {
    const RTPVideoHeaderVP9& codec_header = absl::get<RTPVideoHeaderVP9>(
        parsed_payload.video_header().video_type_header);
    video_header.is_last_packet_in_frame |= codec_header.end_of_frame;
    video_header.is_first_packet_in_frame |= codec_header.beginning_of_frame;
  }
  /* Parse header extensions. */
  packet.GetExtension(&video_header.rotation);
  packet.GetExtension(&video_header.content_type);
  packet.GetExtension(&video_header.video_timing);
  /* Parse the playout delay limits. */
  packet.GetExtension(&video_header.playout_delay);
  packet.GetExtension(&video_header.frame_marking);

  // Color space should only be transmitted in the last packet of a frame,
  // therefore, neglect it otherwise so that last_color_space_ is not reset by
  // mistake.
  if (video_header.is_last_packet_in_frame) {
    video_header.color_space = packet.GetExtension<ColorSpaceExtension>();
    if (video_header.color_space ||
        video_header.frame_type == VideoFrameType::kVideoFrameKey) {
      // Store color space since it's only transmitted when changed or for key
      // frames. Color space will be cleared if a key frame is transmitted
      // without color space information.
      last_color_space_ = video_header.color_space;
    } else if (last_color_space_) {
      video_header.color_space = last_color_space_;
    }
  }
  ......
  OnReceivedPayloadData(parsed_payload.payload, parsed_payload.payload_length,
                        rtp_header, video_header, generic_descriptor_wire,
                        packet.recovered());
}
  • First, the received packet's payload_size is checked. A size of 0 indicates a padding (or keep-alive) packet; in that case NotifyReceiverOfEmptyPacket() forwards the packet information via rtp_rtcp_ to the NackModule, because NackModule must detect packet loss, and its criterion is sequence-number continuity.
  • Next, RED packets are checked for; they are not covered in this article.
  • payload_type_map_ is searched for the packet's payload type (how this map is populated is analyzed in detail in part (I)); if no matching decoder payload type is found, the function returns immediately.
  • Based on the payload type, RtpDepacketizer::Create(type_it->second) creates the matching RTP depacketizer. The actual unpacking is done by calling the depacketizer's Parse() function, whose implementation is diagrammed below:
    WebRtc_Video_Stream_Receiver_02_05.png
  • As the RtpDepacketizer inheritance diagram shows, each codec type has its own derived class. After Parse() is called, the parsed information is stored in the nested RtpDepacketizer::ParsedPayload struct, which records the RTPVideoHeader, payload, and payload_length; parsed_payload.video_header() returns the RTPVideoHeader instance.
  • The diagram also shows that for WebRTC to support H265 decoding, an H265 depacketizer class would likewise need to be derived, parsing the H265 data internally and filling in a ParsedPayload struct.
  • At this point the RTP packet parsing and extraction work is done; RtpVideoStreamReceiver::OnReceivedPayloadData() is then called to enter the next step.
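The H265 extension point mentioned above can be sketched as follows. This is purely illustrative: RtpDepacketizerH265 does not exist in the WebRTC tree at this point, the stand-in ParsedPayload is simplified, and the h265_nal_type field is hypothetical. Only the NAL-header layout (type in bits 1..6 of the first of two header bytes) follows the H.265 spec.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-in for RtpDepacketizer::ParsedPayload.
struct ParsedPayload {
  const uint8_t* payload = nullptr;
  size_t payload_length = 0;
  int h265_nal_type = -1;  // hypothetical field for this sketch
};

// Simplified stand-in for the RtpDepacketizer interface.
class RtpDepacketizer {
 public:
  virtual ~RtpDepacketizer() = default;
  virtual bool Parse(ParsedPayload* parsed, const uint8_t* data,
                     size_t len) = 0;
};

// Hypothetical H.265 depacketizer, derived the same way the per-codec
// classes (e.g. the H264 one) are derived in WebRTC.
class RtpDepacketizerH265 : public RtpDepacketizer {
 public:
  bool Parse(ParsedPayload* parsed, const uint8_t* data,
             size_t len) override {
    if (len < 2) return false;  // the H.265 NAL header is two bytes
    // H.265 NAL unit type lives in bits 1..6 of the first header byte.
    parsed->h265_nal_type = (data[0] >> 1) & 0x3F;
    parsed->payload = data;
    parsed->payload_length = len;
    return true;
  }
};
```

A full implementation would also need to handle H.265 aggregation (AP) and fragmentation (FU) packetization, analogous to STAP-A and FU-A for H264.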

5) RtpVideoStreamReceiver VCMPacket construction and keyframe requests

5.1) RtpVideoStreamReceiver VCMPacket construction and error resilience

int32_t RtpVideoStreamReceiver::OnReceivedPayloadData(
    const uint8_t* payload_data,
    size_t payload_size,
    const RTPHeader& rtp_header,
    const RTPVideoHeader& video_header,
    const absl::optional<RtpGenericFrameDescriptor>& generic_descriptor,
    bool is_recovered) {
  VCMPacket packet(payload_data, payload_size, rtp_header, video_header,
                   ntp_estimator_.Estimate(rtp_header.timestamp),
                   clock_->TimeInMilliseconds());
  packet.generic_descriptor = generic_descriptor;

  .......
  
  if (packet.codec() == kVideoCodecH264) {
    // Only when we start to receive packets will we know what payload type
    // that will be used. When we know the payload type insert the correct
    // sps/pps into the tracker.
    if (packet.payloadType != last_payload_type_) {
      last_payload_type_ = packet.payloadType;
      InsertSpsPpsIntoTracker(packet.payloadType);
    }

    switch (tracker_.CopyAndFixBitstream(&packet)) {
      case video_coding::H264SpsPpsTracker::kRequestKeyframe:
        rtcp_feedback_buffer_.RequestKeyFrame();
        rtcp_feedback_buffer_.SendBufferedRtcpFeedback();
        RTC_FALLTHROUGH();
      case video_coding::H264SpsPpsTracker::kDrop:
        return 0;
      case video_coding::H264SpsPpsTracker::kInsert:
        break;
    }

  } 
  ......  
  return 0;
}
  • First a VCMPacket is built from the RTPHeader, RTPVideoHeader, payload_size, and related arguments.
  • For an H264 payload, tracker_.CopyAndFixBitstream(&packet) performs the error-resilience handling and data fix-up on the VCMPacket.
  • In the normal case, rtcp_feedback_buffer_.SendBufferedRtcpFeedback() is called to send feedback to the remote peer.
  • In the normal case, the VCMPacket is finally inserted into packet_buffer_.
  • The focus here is the CopyAndFixBitstream function.
H264SpsPpsTracker::PacketAction H264SpsPpsTracker::CopyAndFixBitstream(
    VCMPacket* packet) {
  RTC_DCHECK(packet->codec() == kVideoCodecH264);

  const uint8_t* data = packet->dataPtr;
  const size_t data_size = packet->sizeBytes;
  const RTPVideoHeader& video_header = packet->video_header;
  auto& h264_header =
      absl::get<RTPVideoHeaderH264>(packet->video_header.video_type_header);

  bool append_sps_pps = false;
  auto sps = sps_data_.end();
  auto pps = pps_data_.end();

  for (size_t i = 0; i < h264_header.nalus_length; ++i) {
    const NaluInfo& nalu = h264_header.nalus[i];
    switch (nalu.type) {
      case H264::NaluType::kSps: {
        sps_data_[nalu.sps_id].width = packet->width();
        sps_data_[nalu.sps_id].height = packet->height();
        break;
      }
      case H264::NaluType::kPps: {
        pps_data_[nalu.pps_id].sps_id = nalu.sps_id;
        break;
      }
      case H264::NaluType::kIdr: {
        // If this is the first packet of an IDR, make sure we have the required
        // SPS/PPS and also calculate how much extra space we need in the buffer
        // to prepend the SPS/PPS to the bitstream with start codes.
        if (video_header.is_first_packet_in_frame) {
          if (nalu.pps_id == -1) {
            RTC_LOG(LS_WARNING) << "No PPS id in IDR nalu.";
            return kRequestKeyframe;
          }

          pps = pps_data_.find(nalu.pps_id);
          if (pps == pps_data_.end()) {
            RTC_LOG(LS_WARNING)
                << "No PPS with id << " << nalu.pps_id << " received";
            return kRequestKeyframe;
          }

          sps = sps_data_.find(pps->second.sps_id);
          if (sps == sps_data_.end()) {
            RTC_LOG(LS_WARNING)
                << "No SPS with id << " << pps->second.sps_id << " received";
            return kRequestKeyframe;
          }

          // Since the first packet of every keyframe should have its width and
          // height set we set it here in the case of it being supplied out of
          // band.
          packet->video_header.width = sps->second.width;
          packet->video_header.height = sps->second.height;

          // If the SPS/PPS was supplied out of band then we will have saved
          // the actual bitstream in |data|.
          if (sps->second.data && pps->second.data) {
            RTC_DCHECK_GT(sps->second.size, 0);
            RTC_DCHECK_GT(pps->second.size, 0);
            append_sps_pps = true;
          }
        }
        break;
      }
      default:
        break;
    }
  }

  RTC_CHECK(!append_sps_pps ||
            (sps != sps_data_.end() && pps != pps_data_.end()));

  // Calculate how much space we need for the rest of the bitstream.
  size_t required_size = 0;

  if (append_sps_pps) {
    required_size += sps->second.size + sizeof(start_code_h264);
    required_size += pps->second.size + sizeof(start_code_h264);
  }
  if (h264_header.packetization_type == kH264StapA) {
    const uint8_t* nalu_ptr = data + 1;
    while (nalu_ptr < data + data_size) {
      RTC_DCHECK(video_header.is_first_packet_in_frame);
      required_size += sizeof(start_code_h264);

      // The first two bytes describe the length of a segment.
      uint16_t segment_length = nalu_ptr[0] << 8 | nalu_ptr[1];
      nalu_ptr += 2;

      required_size += segment_length;
      nalu_ptr += segment_length;
    }
  } else {  // default: kH264FuA
    if (h264_header.nalus_length > 0) {
      required_size += sizeof(start_code_h264);
    }
    required_size += data_size;
  }

  // Then we copy to the new buffer.
  uint8_t* buffer = new uint8_t[required_size];
  uint8_t* insert_at = buffer;

  if (append_sps_pps) {
    // Insert SPS.
    memcpy(insert_at, start_code_h264, sizeof(start_code_h264));
    insert_at += sizeof(start_code_h264);
    memcpy(insert_at, sps->second.data.get(), sps->second.size);
    insert_at += sps->second.size;

    // Insert PPS.
    memcpy(insert_at, start_code_h264, sizeof(start_code_h264));
    insert_at += sizeof(start_code_h264);
    memcpy(insert_at, pps->second.data.get(), pps->second.size);
    insert_at += pps->second.size;

    // Update codec header to reflect the newly added SPS and PPS.
    NaluInfo sps_info;
    sps_info.type = H264::NaluType::kSps;
    sps_info.sps_id = sps->first;
    sps_info.pps_id = -1;
    NaluInfo pps_info;
    pps_info.type = H264::NaluType::kPps;
    pps_info.sps_id = sps->first;
    pps_info.pps_id = pps->first;
    if (h264_header.nalus_length + 2 <= kMaxNalusPerPacket) {
      h264_header.nalus[h264_header.nalus_length++] = sps_info;
      h264_header.nalus[h264_header.nalus_length++] = pps_info;
    } else {
      RTC_LOG(LS_WARNING) << "Not enough space in H.264 codec header to insert "
                             "SPS/PPS provided out-of-band.";
    }
  }

  // Copy the rest of the bitstream and insert start codes.
  if (h264_header.packetization_type == kH264StapA) {
    const uint8_t* nalu_ptr = data + 1;
    while (nalu_ptr < data + data_size) {
      memcpy(insert_at, start_code_h264, sizeof(start_code_h264));
      insert_at += sizeof(start_code_h264);

      // The first two bytes describe the length of a segment.
      uint16_t segment_length = nalu_ptr[0] << 8 | nalu_ptr[1];
      nalu_ptr += 2;

      size_t copy_end = nalu_ptr - data + segment_length;
      if (copy_end > data_size) {
        delete[] buffer;
        return kDrop;
      }

      memcpy(insert_at, nalu_ptr, segment_length);
      insert_at += segment_length;
      nalu_ptr += segment_length;
    }
  } else {
    if (h264_header.nalus_length > 0) {
      memcpy(insert_at, start_code_h264, sizeof(start_code_h264));
      insert_at += sizeof(start_code_h264);
    }
    memcpy(insert_at, data, data_size);
  }

  packet->dataPtr = buffer;
  packet->sizeBytes = required_size;
  return kInsert;
}
  • The loop iterates over the packet, which may contain multiple NAL units. If a NALU is an IDR slice and the packet is the first packet of its frame, then by the structure of an H264 bitstream the two NALUs ahead of it must be the SPS and PPS, as shown below:


    WebRtc_Video_Stream_Receiver_02_06.jpg

    WebRtc_Video_Stream_Receiver_02_07.png
  • Accordingly, when the code above encounters an SPS or PPS NALU it stores it directly into the sps_data_ or pps_data_ container.
  • If video_header.is_first_packet_in_frame holds and nalu.type == kIdr, then sps_data_ and pps_data_ must already contain the referenced entries; if not, the SPS/PPS information is missing, the IDR cannot be decoded, and kRequestKeyframe is returned immediately.
  • The data copy then runs: when append_sps_pps holds (i.e. is_first_packet_in_frame and kIdr), the buffer is laid out per the structure in the diagram above, and following the diagram alongside the code makes the copy straightforward to analyze.
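The STAP-A branch above is worth isolating: the first payload byte is the STAP-A NAL header, and each aggregated NAL unit is prefixed by a 2-byte big-endian length. Below is a minimal sketch of the size computation, with the bounds check that the copy loop performs moved up front; the 4-byte start code is an assumption standing in for start_code_h264.

```cpp
#include <cstddef>
#include <cstdint>

constexpr size_t kStartCodeSize = 4;  // 00 00 00 01

// Returns the buffer size needed once every aggregated NALU is given a
// start code, or 0 if a length field overruns the payload (malformed).
size_t StapARequiredSize(const uint8_t* data, size_t data_size) {
  size_t required = 0;
  const uint8_t* nalu_ptr = data + 1;  // skip the STAP-A NAL header byte
  while (nalu_ptr + 2 <= data + data_size) {
    // The first two bytes give the length of the next NAL unit.
    uint16_t segment_length =
        static_cast<uint16_t>(nalu_ptr[0] << 8 | nalu_ptr[1]);
    nalu_ptr += 2;
    if (nalu_ptr + segment_length > data + data_size) return 0;
    required += kStartCodeSize + segment_length;
    nalu_ptr += segment_length;
  }
  return required;
}
```

This mirrors why CopyAndFixBitstream makes two passes: one to compute required_size, one to copy each NALU behind a freshly inserted start code.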

5.2) RtpVideoStreamReceiver keyframe requests

int32_t RtpVideoStreamReceiver::OnReceivedPayloadData(
    const uint8_t* payload_data,
    size_t payload_size,
    const RTPHeader& rtp_header,
    const RTPVideoHeader& video_header,
    const absl::optional<RtpGenericFrameDescriptor>& generic_descriptor,
    bool is_recovered) {
  VCMPacket packet(payload_data, payload_size, rtp_header, video_header,
                   ntp_estimator_.Estimate(rtp_header.timestamp),
                   clock_->TimeInMilliseconds());
    ....
    switch (tracker_.CopyAndFixBitstream(&packet)) {
      case video_coding::H264SpsPpsTracker::kRequestKeyframe:
        rtcp_feedback_buffer_.RequestKeyFrame();
        rtcp_feedback_buffer_.SendBufferedRtcpFeedback();
        RTC_FALLTHROUGH();
      case video_coding::H264SpsPpsTracker::kDrop:
        return 0;
      case video_coding::H264SpsPpsTracker::kInsert:
        break;
    }
    .....
}
  • If tracker_.CopyAndFixBitstream returns kRequestKeyframe, the I-frame parameters in this packet are incomplete, and a new keyframe request must be issued.
  • RtpVideoStreamReceiver::RtcpFeedbackBuffer's RequestKeyFrame() method sets its request_key_frame_ flag to true.
  • Finally, RtcpFeedbackBuffer's SendBufferedRtcpFeedback() sends the request.
  • The core logic of the keyframe request is shown below:


    WebRtc_Video_Stream_Receiver_02_08.png
  • The RequestKeyFrame() and SendBufferedRtcpFeedback() implementations of RtpVideoStreamReceiver::RtcpFeedbackBuffer are as follows:
void RtpVideoStreamReceiver::RtcpFeedbackBuffer::RequestKeyFrame() {
  rtc::CritScope lock(&cs_);
  request_key_frame_ = true;
}
  • Sets request_key_frame_ to true.
void RtpVideoStreamReceiver::RtcpFeedbackBuffer::SendBufferedRtcpFeedback() {
  bool request_key_frame = false;
  std::vector<uint16_t> nack_sequence_numbers;
  absl::optional<LossNotificationState> lntf_state;
  ....
  {
    rtc::CritScope lock(&cs_);
    std::swap(request_key_frame, request_key_frame_);
  }
  .....
  if (request_key_frame) {
    key_frame_request_sender_->RequestKeyFrame();
  } else if (!nack_sequence_numbers.empty()) {
    nack_sender_->SendNack(nack_sequence_numbers, true);
  }
}
  • At this point request_key_frame is true.
  • key_frame_request_sender_ is a pointer to the RtpVideoStreamReceiver module, passed in as a constructor argument when rtcp_feedback_buffer_ is instantiated in RtpVideoStreamReceiver's constructor.
void RtpVideoStreamReceiver::RequestKeyFrame() {
  if (keyframe_request_sender_) {  // nullptr by default
    keyframe_request_sender_->RequestKeyFrame();
  } else {
    rtp_rtcp_->SendPictureLossIndication();
  }
}
  • keyframe_request_sender_ is nullptr by default; VideoReceiveStream's constructor passed nullptr when initializing its rtp_video_stream_receiver_ member.
  • So ultimately the rtp_rtcp module's SendPictureLossIndication() is called to send a PLI.
  • This concludes the analysis in this article; the NACK module and frame assembly are covered in the next one.
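As a closing illustration, the flag-swap pattern used by RtcpFeedbackBuffer above can be condensed into a standalone sketch. std::mutex and the pli_sent_ counter are stand-ins for rtc::CritScope and SendPictureLossIndication(); the point is that requests are buffered under a lock and flushed once per batch, so at most one PLI goes out even if several packets asked for a keyframe.

```cpp
#include <mutex>
#include <utility>

class RtcpFeedbackBuffer {
 public:
  void RequestKeyFrame() {
    std::lock_guard<std::mutex> lock(mutex_);
    request_key_frame_ = true;
  }

  void SendBufferedRtcpFeedback() {
    bool request_key_frame = false;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      // Swap resets the buffered flag for the next batch.
      std::swap(request_key_frame, request_key_frame_);
    }
    if (request_key_frame) {
      ++pli_sent_;  // stand-in for SendPictureLossIndication()
    }
  }

  int pli_sent() const { return pli_sent_; }

 private:
  std::mutex mutex_;
  bool request_key_frame_ = false;
  int pli_sent_ = 0;
};
```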
