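NetEQ combines an adaptive jitter control algorithm with a packet loss concealment algorithm. This lets it adapt quickly and at fine resolution to changing network conditions, keeping audio quality high while minimizing buffering delay.
The focus here is the NetEQ module, whose processing covers jitter removal, packet loss compensation, and decoding of the compressed audio.
Jitter removal principle
Jitter is usually removed with a jitter buffer. The receiver sets up a buffer; each arriving voice packet is first held there, and the system then pulls packets out of the buffer at a steady, smooth rate, decodes them, and plays them through the sound card.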
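Four voice packets (A, B, C, D) are sent at 30 ms intervals, at 30, 60, 90, and 120 ms, and experience network delays of 10, 30, 10, and 10 ms. They therefore arrive at 40, 90, 100, and 130 ms, so the jitter buffer must hold them until they are played out at 60, 90, 120, and 150 ms, that is, for 20, 0, 20, and 20 ms respectively.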
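The arithmetic in this example can be checked with a short sketch. It assumes a fixed playout delay equal to the largest observed network delay (30 ms here), which is what yields playout instants of 60, 90, 120, and 150 ms. The function and variable names are illustrative and do not come from NetEQ.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest fixed playout delay that lets every packet arrive before its
 * playout instant: the maximum observed network delay. */
int max_delay_ms(const int *delay_ms, size_t n)
{
    assert(n > 0);
    int max = delay_ms[0];
    for (size_t i = 1; i < n; i++) {
        if (delay_ms[i] > max) {
            max = delay_ms[i];
        }
    }
    return max;
}

/* For each packet, compute its arrival time, its playout time, and how long
 * it waits in the jitter buffer between the two. */
void playout_schedule(const int *send_ms, const int *delay_ms, size_t n,
                      int *arrival_ms, int *playout_ms, int *wait_ms)
{
    int playout_delay = max_delay_ms(delay_ms, n);
    for (size_t i = 0; i < n; i++) {
        arrival_ms[i] = send_ms[i] + delay_ms[i];
        playout_ms[i] = send_ms[i] + playout_delay;
        wait_ms[i]    = playout_ms[i] - arrival_ms[i];
    }
}
```

With the four packets above, packet B (the one delayed by 30 ms) is played the moment it arrives, while the other three each wait 20 ms in the buffer.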
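1. Static jitter buffer control algorithm
2. Adaptive jitter buffer control algorithm
Packet loss concealment principle
iLBC performs packet loss concealment only at the decoder. While decoding the received bitstream frame by frame, the iLBC decoder first checks whether the current frame is intact. If it is, the speech signal is reconstructed through the normal iLBC decoding path; if the packet turns out to be lost, the PLC unit takes over. See step 13 of the algorithm.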
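The MCU (Micro Control Unit) module is the control unit of the jitter buffer. Because the jitter buffer temporarily stores received packets, the MCU's main job is to schedule packet insertion and control packet output. Insertion means choosing the buffer position for each newly arrived network packet; output control means deciding when data must be output and which slot's packet to output. The jitter removal algorithm is implemented in the MCU control module.
The DSP module performs digital signal processing on the PCM source packets extracted from the MCU, including decoding, signal processing, and data output. Packet loss concealment is carried out in the DSP module.
[Figure: NetEQ module block diagram]
A network packet enters the jitter buffer through the following call chain: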
AudioCodingModuleImpl::IncomingPacket(const uint8_t* incoming_payload,
                                      const int32_t payload_length,
                                      const WebRtcRTPHeader& rtp_info)
{
    ...
    memcpy(payload, incoming_payload, payload_length);
    codecs_[current_receive_codec_idx_]->SplitStereoPacket(payload, &length);
    rtp_header.type.Audio.channel = 2;
    per_neteq_payload_length = length / 2;
    // Insert packet into NetEQ.
    if (neteq_.RecIn(payload, length, rtp_header,
                     last_receive_timestamp_) < 0)
    ...
}

ACMNetEQ::RecIn(const uint8_t* incoming_payload,
                const int32_t length_payload,
                const WebRtcRTPHeader& rtp_info,
                uint32_t receive_timestamp)
{
    ...
    WebRtcNetEQ_RecInRTPStruct(inst_[0], &neteq_rtpinfo, incoming_payload,
                               payload_length, receive_timestamp);
}

int WebRtcNetEQ_RecInRTPStruct(void *inst, WebRtcNetEQ_RTPInfo *rtpInfo,
                               const uint8_t *payloadPtr, int16_t payloadLenBytes,
                               uint32_t uw32_timeRec)
{
    ...
    RTPPacket_t RTPpacket;
    ...
    /* Load NetEQ's RTP struct from Module RTP struct */
    RTPpacket.payloadType = rtpInfo->payloadType;
    RTPpacket.seqNumber = rtpInfo->sequenceNumber;
    RTPpacket.timeStamp = rtpInfo->timeStamp;
    RTPpacket.ssrc = rtpInfo->SSRC;
    RTPpacket.payload = (const int16_t*) payloadPtr;
    RTPpacket.payloadLen = payloadLenBytes;
    RTPpacket.starts_byte1 = 0;
    WebRtcNetEQ_RecInInternal(&NetEqMainInst->MCUinst, &RTPpacket, uw32_timeRec);
}

int WebRtcNetEQ_RecInInternal(MCUInst_t *MCU_inst, RTPPacket_t *RTPpacketInput,
                              uint32_t uw32_timeRec)
{
    ...
    WebRtcNetEQ_PacketBufferInsert(&MCU_inst->PacketBuffer_inst, &RTPpacket[i_k],
                                   &flushed, MCU_inst->av_sync);
    ...
    WebRtcNetEQ_SplitAndInsertPayload(&RTPpacket[i_k], &MCU_inst->PacketBuffer_inst,
                                      &MCU_inst->PayloadSplit_inst, &flushed,
                                      MCU_inst->av_sync);
    ...
}

int WebRtcNetEQ_PacketBufferInsert(PacketBuf_t *bufferInst, const RTPPacket_t *RTPpacket,
                                   int16_t *flushed, int av_sync)
{
    // This function inserts an RTP packet into the packet buffer.
    ...
    /* Copy the packet information */
    bufferInst->payloadLocation[bufferInst->insertPosition] = bufferInst->currentMemoryPos;
    bufferInst->payloadLengthBytes[bufferInst->insertPosition] = RTPpacket->payloadLen;
    bufferInst->payloadType[bufferInst->insertPosition] = RTPpacket->payloadType;
    bufferInst->seqNumber[bufferInst->insertPosition] = RTPpacket->seqNumber;
    bufferInst->timeStamp[bufferInst->insertPosition] = RTPpacket->timeStamp;
    bufferInst->rcuPlCntr[bufferInst->insertPosition] = RTPpacket->rcuPlCntr;
    bufferInst->waitingTime[bufferInst->insertPosition] = 0;
    ...
}

int WebRtcNetEQ_SplitAndInsertPayload(RTPPacket_t* packet, PacketBuf_t* Buffer_inst,
                                      SplitInfo_t* split_inst, int16_t* flushed,
                                      int av_sync)
{
}
1: Parse the payload and insert it into the jitter buffer (Parse the payload and insert it into the buffer). WebRtcNetEQ_SplitAndInsertPayload
WebRtcNetEQ_PacketBufferInsert(Buffer_inst, &temp_packet, &localFlushed); is called in a loop that splits the incoming payload into packets and stores them in the PacketBuf_t instance:
while (len >= split_inst->deltaBytes)
{
    ....
    i_ok = WebRtcNetEQ_PacketBufferInsert(Buffer_inst, &temp_packet, &localFlushed);
    ...
}
/* Insert packet in the found position */
if (RTPpacket->starts_byte1 == 0)
{
    /* Payload is 16-bit aligned => just copy it */
    WEBRTC_SPL_MEMCPY_W16(bufferInst->currentMemoryPos,
        RTPpacket->payload, (RTPpacket->payloadLen + 1) >> 1);
}
else
{
    /* Payload is not 16-bit aligned => align it during copy operation */
    for (i = 0; i < RTPpacket->payloadLen; i++)
    {
        /* copy the (i+1)-th byte to the i-th byte */
        WEBRTC_SPL_SET_BYTE(bufferInst->currentMemoryPos,
            (WEBRTC_SPL_GET_BYTE(RTPpacket->payload, (i + 1))), i);
    }
}
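The splitting loop in step 1 can be illustrated in isolation. The following is a minimal sketch, not the NetEQ implementation: the Chunk type, the function name, and the delta_time parameter (the timestamp step per chunk) are hypothetical, and turning a leftover shorter than delta_bytes into a final short chunk is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor of one sub-packet. */
typedef struct {
    const uint8_t *payload;   /* start of this chunk inside the original payload */
    size_t         len_bytes; /* chunk length in bytes */
    uint32_t       timestamp; /* timestamp assigned to this chunk */
} Chunk;

/* Split a payload into chunks of delta_bytes each, advancing the timestamp by
 * delta_time per chunk. A remainder shorter than delta_bytes becomes a final
 * short chunk (an assumption of this sketch).
 * Returns the number of chunks written, or -1 if out[] is too small. */
int split_payload(const uint8_t *payload, size_t len, uint32_t timestamp,
                  size_t delta_bytes, uint32_t delta_time,
                  Chunk *out, size_t max_chunks)
{
    assert(delta_bytes > 0);
    size_t n = 0;

    while (len >= delta_bytes) {
        if (n == max_chunks) {
            return -1;
        }
        out[n].payload   = payload;
        out[n].len_bytes = delta_bytes;
        out[n].timestamp = timestamp;
        n++;

        payload   += delta_bytes;
        len       -= delta_bytes;
        timestamp += delta_time;
    }

    if (len > 0) {
        if (n == max_chunks) {
            return -1;
        }
        out[n].payload   = payload;
        out[n].len_bytes = len;
        out[n].timestamp = timestamp;
        n++;
    }
    return (int)n;
}
```

For example, a 100-byte payload split with delta_bytes = 38 and delta_time = 160 yields chunks of 38, 38, and 24 bytes whose timestamps are 160 samples apart.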
2: Update the automode parameters, chiefly the network delay statistic BLo (optBufferLevel). WebRtcNetEQ_UpdateIatStatistics
2.1 /* calculate inter-arrival time in integer packets (rounding down) */
timeIat = WebRtcSpl_DivW32W16(inst->packetIatCountSamp, packetLenSamp);
2.2 /* update iatProb = forgetting_factor * iatProb for all elements */
2.3 /* Calculate optimal buffer level based on updated statistics */
tempvar = (int32_t) WebRtcNetEQ_CalcOptimalBufLvl(inst, fsHz, mdCodec, timeIat,
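Steps 2.1 to 2.3 can be sketched in floating point to show the idea. This is a simplified illustration, not the NetEQ code: the histogram size, the initial distribution, the function names, and the coverage parameter (the share of observed inter-arrival times the buffer should absorb) are all assumptions, and the real implementation works in fixed-point arithmetic.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IAT 64 /* hypothetical histogram size, in packets */

typedef struct {
    /* iat_prob[k]: estimated probability of an inter-arrival time of k packets */
    double iat_prob[MAX_IAT + 1];
} IatStats;

/* Start with all probability mass on "one packet per packet interval". */
void iat_init(IatStats *s)
{
    for (size_t k = 0; k <= MAX_IAT; k++) {
        s->iat_prob[k] = 0.0;
    }
    s->iat_prob[1] = 1.0;
}

/* 2.1: inter-arrival time in whole packets (rounding down).
 * 2.2: scale every bin by the forgetting factor, then credit the observed bin
 *      so the histogram still sums to 1. */
void iat_update(IatStats *s, long iat_samples, int packet_len_samples,
                double forgetting)
{
    assert(packet_len_samples > 0);
    assert(forgetting > 0.0 && forgetting < 1.0);

    long iat = iat_samples / packet_len_samples;
    if (iat < 0) {
        iat = 0;
    }
    if (iat > MAX_IAT) {
        iat = MAX_IAT;
    }

    for (size_t k = 0; k <= MAX_IAT; k++) {
        s->iat_prob[k] *= forgetting;
    }
    s->iat_prob[iat] += 1.0 - forgetting;
}

/* 2.3: smallest level (in packets) whose cumulative probability reaches the
 * requested coverage. */
int optimal_buffer_level(const IatStats *s, double coverage)
{
    double cumulative = 0.0;
    for (int k = 0; k <= MAX_IAT; k++) {
        cumulative += s->iat_prob[k];
        if (cumulative >= coverage) {
            return k;
        }
    }
    return MAX_IAT;
}
```

A forgetting factor close to 1 makes the histogram change slowly, so a single late packet raises the target level only slightly; a smaller factor reacts faster at the cost of a noisier estimate.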
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
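Algorithm for extracting 10 ms of data to the sound card
3: Assign the DSP's endTimeStamp to playedOutTS (the reference used when extracting data) and record sampleLeft, the number of samples in the speech buffer still waiting to be played. (Write status data to shared memory)
4: Extract data from the jitter buffer. WebRtcNetEQ_DSP2MCUinterrupt
5: Scan the jitter buffer for the packet with the lowest timestamp. WebRtcNetEQ_PacketBufferFindLowestTimestamp(&inst->PacketBuffer_inst,inst->timeStamp, &uw32_availableTS, &i_bufferpos, 1, &payloadType);
5.1 /* Loop through all slots in buffer. */
5.2 /* If old payloads should be discarded. */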
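Step 5 amounts to a linear scan over the buffer slots. The sketch below shows one way such a scan could look, assuming 32-bit timestamps that may wrap around; the Slot type and the function names are hypothetical and the NetEQ data layout is different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified buffer slot. */
typedef struct {
    uint32_t timestamp;
    int      used; /* nonzero if the slot holds a packet */
} Slot;

/* True if timestamp a is strictly newer than b, allowing for 32-bit wraparound. */
static int ts_newer(uint32_t a, uint32_t b)
{
    return a != b && (uint32_t)(a - b) < 0x80000000u;
}

/* Scan all slots and return the index of the packet with the lowest (oldest)
 * timestamp, or -1 if the buffer is empty. If discard_old is nonzero, packets
 * older than played_out_ts are dropped during the scan. */
int find_lowest_timestamp(Slot *slots, size_t n, uint32_t played_out_ts,
                          int discard_old, uint32_t *lowest_ts)
{
    assert(lowest_ts != NULL);
    int best = -1;

    for (size_t i = 0; i < n; i++) { /* 5.1: loop through all slots */
        if (!slots[i].used) {
            continue;
        }
        /* 5.2: drop payloads that are already too old to be played */
        if (discard_old && ts_newer(played_out_ts, slots[i].timestamp)) {
            slots[i].used = 0;
            continue;
        }
        if (best < 0 || ts_newer(slots[best].timestamp, slots[i].timestamp)) {
            best = (int)i;
        }
    }

    if (best >= 0) {
        *lowest_ts = slots[best].timestamp;
    }
    return best;
}
```

Comparing timestamps through their unsigned difference keeps the ordering correct when the counter wraps from 0xFFFFFFFF back to 0.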
6: Compute the amount of data that has entered the receiver's NetEQ module but has not yet been played, denoted bufsize. (calculate total current buffer size (in ms*8), including sync buffer)
w32_bufsize = WebRtcSpl_DivW32W16((w32_bufsize + dspInfo.samplesLeft), fs_mult);
7: Compute BLc (bufferLevelFilt) from bufsize. WebRtcNetEQ_BufferLevelFilter
/*
 * Current buffer level in packet lengths
 * = (curSizeMs8 * fsMult) / packetSpeechLenSamp
 */
Here Sb is the total number of buffered samples; Np is taken to be the number of packets in the jitter buffer and Lp the number of samples per packet:
Sb = Np*Lp + sampleLeft;
curSizeFrames = Sb/Lp;
/* Filter buffer level */
if (inst->levelFiltFact > 0) /* check that filter factor is set */
{
    /* Filter:
     * buffLevelFilt = levelFiltFact * buffLevelFilt
     *                 + (1-levelFiltFact) * curSizeFrames
     *
     * levelFiltFact is in Q8
     */
    inst->buffLevelFilt = ((inst->levelFiltFact * inst->buffLevelFilt) >> 8) +
        (256 - inst->levelFiltFact) * curSizeFrames;
}
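The Q8 arithmetic of this filter is easy to misread, so here is a small standalone sketch of the same update. The function names are illustrative; the formula follows the excerpt above, with the filter factor and the filtered level both held in Q8 (256 represents 1.0) and the current size given in whole frames.

```c
#include <assert.h>
#include <stdint.h>

/* Current buffer level in whole packets: (Np*Lp + sampleLeft) / Lp. */
int cur_size_frames(int packets_in_buffer, int packet_len_samples,
                    int samples_left)
{
    assert(packet_len_samples > 0);
    return (packets_in_buffer * packet_len_samples + samples_left)
           / packet_len_samples;
}

/* First-order smoothing of the buffer level.
 * level_filt_fact_q8 and the returned level are in Q8; cur_frames is an
 * integer, so multiplying it by a Q8 weight gives a Q8 result directly. */
int32_t filter_buffer_level(int32_t buff_level_filt_q8,
                            int32_t level_filt_fact_q8, int cur_frames)
{
    assert(level_filt_fact_q8 > 0 && level_filt_fact_q8 <= 256);
    return ((level_filt_fact_q8 * buff_level_filt_q8) >> 8)
           + (256 - level_filt_fact_q8) * cur_frames;
}
```

As a worked case, a previous level of 2.0 frames (512 in Q8), a factor of 253, and a current size of 3 frames give (253 * 512 >> 8) + 3 * 3 = 506 + 9 = 515, or about 2.01 frames: the filtered level moves only slightly toward the new measurement.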
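8: Decide the MCU control command from BLo (optBufferLevel), BLc (bufferLevelFilt), bufsize, playedOutTS, availableTS, and NetEQ's previous playout mode.
9: Based on the MCU control command and sampleLeft, the amount of decoded but unplayed data in the speech buffer, decide whether data needs to be fetched from the jitter buffer. (Check sync buffer size (Step 11))
10: Extract one packet and place it in the shared-memory scratch area. WebRtcNetEQ_PacketBufferExtract
11: Derive the corresponding DSP processing command from the MCU control command that was in effect before data was fetched from the jitter buffer. (Step 13)
12: Decode the fetched data (Do decoding). inst->codec_ptr_inst.funcDecode()
13: Packet loss compensation. inst->codec_ptr_inst.funcDecodePLC
14: Following the DSP command, enter the corresponding playout mode and process the decoded data together with the data in the speech buffer. (Step 15)
15: Starting at curPosition in the speech buffer, copy 10 ms of data out to the sound card. WEBRTC_SPL_MEMCPY_W16(pw16_outData, &inst->speechBuffer[inst->curPosition], w16_tmp1);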
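The final copy in step 15 can be sketched on its own. This is a minimal illustration under stated assumptions: the SpeechBuf type, its end_position field, and the function name are hypothetical, 10 ms is taken as fs_hz / 100 samples of 16-bit mono audio, and returning fewer samples when the buffer runs short is a choice made for the example rather than NetEQ's behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of the speech buffer. */
typedef struct {
    int16_t *speech_buffer;
    size_t   cur_position; /* index of the next sample to play */
    size_t   end_position; /* one past the last valid sample */
} SpeechBuf;

/* Copy up to 10 ms of samples starting at cur_position into out and advance
 * cur_position. Returns the number of samples actually copied. */
size_t extract_10ms(SpeechBuf *sb, int fs_hz, int16_t *out)
{
    assert(fs_hz > 0);
    assert(sb->cur_position <= sb->end_position);

    size_t wanted    = (size_t)(fs_hz / 100); /* samples in 10 ms */
    size_t available = sb->end_position - sb->cur_position;
    size_t n = wanted < available ? wanted : available;

    memcpy(out, &sb->speech_buffer[sb->cur_position], n * sizeof(int16_t));
    sb->cur_position += n;
    return n;
}
```

At 8 kHz each call hands 80 samples to the caller, so a buffer holding 200 samples serves two full 10 ms blocks and then a final partial block of 40 samples.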