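Rate control in HEVC was the focus of my graduate research. My advisor has recently been pushing me to start writing my thesis, which requires summarizing the whole rate-control framework along with my own algorithmic improvements. I am taking this opportunity to organize both the theory and the actual code of rate control, as a reference for fellow bloggers. If you spot any mistakes, please point them out!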
(Reference software version: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.15/ )
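Part 1: the initialization of rate control
Part 2: the concrete implementation of rate control
Part 3: applications of rate control and some research directions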
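First, the purpose of rate control is to maximize the overall quality of the video under limited bandwidth while keeping the bit allocation accurate; in HEVC the bandwidth is expressed as the bitrate. The best starting point for understanding rate control is the proposal JCTVC-K0103 http://phenix.it-sudparis.eu/jct/ , which laid down the overall framework. The later proposals JCTVC-M0036 and JCTVC-M0257 extend and improve on JCTVC-K0103, but the overall framework stays the same. The overall flow of rate control is roughly as follows.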
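- Initialize rate control (at four levels: sequence, GOP, frame, and LCU).
- Allocate bits at each of these four levels.
- From the bits allocated to each unit, derive that unit's QP with the R-lambda and lambda-QP models, and apply it during encoding.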
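The third step can be sketched as below. This is a minimal illustration rather than the HM code: the function names are hypothetical, and the constants 4.2005 and 13.7122 in the lambda-QP mapping are the values I believe JCTVC-K0103 and HM use, so treat them as assumptions. The model parameters alpha and beta are passed in by the caller.

```cpp
#include <cassert>
#include <cmath>

// R-lambda model: lambda = alpha * bpp^beta, where bpp is bits per pixel.
// alpha and beta are model parameters supplied by the caller.
double estimateLambda(double bpp, double alpha, double beta)
{
    assert(bpp > 0.0);
    return alpha * std::pow(bpp, beta);
}

// lambda-QP model: QP = 4.2005 * ln(lambda) + 13.7122, rounded to the nearest integer.
// The two constants are assumed values, see the note above.
int estimateQP(double lambda)
{
    assert(lambda > 0.0);
    return static_cast<int>(std::lround(4.2005 * std::log(lambda) + 13.7122));
}
```

With a negative beta, fewer bits per pixel give a larger lambda and therefore a larger QP, which is the behaviour one expects from a rate controller.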
The rest of this post walks through these three parts and adds some of my own understanding of each module.
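1. Initialization
1.1. In TEncTop::create(), if m_RCEnableRateControl is on, i.e. rate control is enabled in the cfg file (RateControl set to 1), the whole rate-control module is initialized.
// m_cRateCtrl.init() initializes mainly from the sequence parameters: Int totalFrames, Int targetBitrate, Int frameRate, Int GOPSize, Int picWidth, Int picHeight, Int LCUWidth, Int LCUHeight, Int keepHierBits, Bool useLCUSeparateModel, GOPEntry GOPList[MAX_GOP]. The parameter worth explaining is keepHierBits, i.e. whether hierarchical bit allocation is used. If it is, each frame receives a different share of the bits; otherwise the frames keep their default weights. The code below shows the details.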
if ( m_RCEnableRateControl )
{
m_cRateCtrl.init( m_framesToBeEncoded, m_RCTargetBitrate, (Int)( (Double)m_iFrameRate/m_temporalSubsampleRatio + 0.5), m_iGOPSize, m_iSourceWidth, m_iSourceHeight,
m_maxCUWidth, m_maxCUHeight,m_RCKeepHierarchicalBit, m_RCUseLCUSeparateModel, m_GOPList );
}
Now for the definition of m_cRateCtrl.init(); some else branches have been removed here to make the code easier to read.
Void TEncRateCtrl::init( Int totalFrames, Int targetBitrate, Int frameRate, Int GOPSize, Int picWidth, Int picHeight, Int LCUWidth, Int LCUHeight, Int keepHierBits, Bool useLCUSeparateModel, GOPEntry GOPList[MAX_GOP] )
{
destroy();
Bool isLowdelay = true; // check whether the coding structure is low delay
for ( Int i=0; i<GOPSize-1; i++ )
{
if ( GOPList[i].m_POC > GOPList[i+1].m_POC )
{
isLowdelay = false;
break;
}
}
Int numberOfLevel = 1;
Int adaptiveBit = 0;
if ( keepHierBits > 0 )
{
numberOfLevel = Int( log((Double)GOPSize)/log(2.0) + 0.5 ) + 1;
}
if ( !isLowdelay && GOPSize == 8 )
{
numberOfLevel = Int( log((Double)GOPSize)/log(2.0) + 0.5 ) + 1;
}
numberOfLevel++; // intra picture
numberOfLevel++; // non-reference picture
Int* bitsRatio;
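bitsRatio = new Int[ GOPSize ]; // initialize the weight of each frame
for ( Int i=0; i<GOPSize; i++ )
{
bitsRatio[i] = 10;
if ( !GOPList[i].m_refPic )
{
bitsRatio[i] = 2;
}
}
if ( keepHierBits > 0 ) // with hierarchical bit allocation each frame gets its own weight; the weight is the share of the bits that frame receives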
{
Double bpp = (Double)( targetBitrate / (Double)( frameRate*picWidth*picHeight ) ); // bits per pixel (bpp): the number of bits the target bitrate provides for each pixel of each frame; the weights are chosen according to bpp
if ( GOPSize == 4 && isLowdelay ) // per-frame weights for low delay
{
if ( bpp > 0.2 )
{
bitsRatio[0] = 2;
bitsRatio[1] = 3;
bitsRatio[2] = 2;
bitsRatio[3] = 6;
}
else if( bpp > 0.1 )
{
bitsRatio[0] = 2;
bitsRatio[1] = 3;
bitsRatio[2] = 2;
bitsRatio[3] = 10;
}
if ( keepHierBits == 2 )
{
adaptiveBit = 1;
}
}
else if ( GOPSize == 8 && !isLowdelay ) // per-frame weights for random access
{
if ( bpp > 0.2 )
{
bitsRatio[0] = 15;
bitsRatio[1] = 5;
bitsRatio[2] = 4;
bitsRatio[3] = 1;
bitsRatio[4] = 1;
bitsRatio[5] = 4;
bitsRatio[6] = 1;
bitsRatio[7] = 1;
}
else if ( bpp > 0.1 )
{
bitsRatio[0] = 20;
bitsRatio[1] = 6;
bitsRatio[2] = 4;
bitsRatio[3] = 1;
bitsRatio[4] = 1;
bitsRatio[5] = 4;
bitsRatio[6] = 1;
bitsRatio[7] = 1;
}
if ( keepHierBits == 2 )
{
adaptiveBit = 2;
}
}
else
{
printf( "\n hierarchical bit allocation is not support for the specified coding structure currently.\n" );
}
}
Int* GOPID2Level = new Int[ GOPSize ];
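for ( Int i=0; i<GOPSize; i++ )
{
GOPID2Level[i] = 1;
if ( !GOPList[i].m_refPic )
{
GOPID2Level[i] = 2;
}
}
if ( keepHierBits > 0 ) // GOPID2Level is explained with a figure after this code block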
{
if ( GOPSize == 4 && isLowdelay )
{
GOPID2Level[0] = 3;
GOPID2Level[1] = 2;
GOPID2Level[2] = 3;
GOPID2Level[3] = 1;
}
}
if ( !isLowdelay && GOPSize == 8 )
{
GOPID2Level[0] = 1;
GOPID2Level[1] = 2;
GOPID2Level[2] = 3;
GOPID2Level[3] = 4;
GOPID2Level[4] = 4;
GOPID2Level[5] = 3;
GOPID2Level[6] = 4;
GOPID2Level[7] = 4;
}
m_encRCSeq = new TEncRCSeq; // pass the computed data to the rate-control initialization functions (TEncRateCtrl.cpp)
m_encRCSeq->create( totalFrames, targetBitrate, frameRate, GOPSize, picWidth, picHeight, LCUWidth, LCUHeight, numberOfLevel, useLCUSeparateModel, adaptiveBit );
m_encRCSeq->initBitsRatio( bitsRatio );
m_encRCSeq->initGOPID2Level( GOPID2Level );
m_encRCSeq->initPicPara();
if ( useLCUSeparateModel )
{
m_encRCSeq->initLCUPara();
}
m_CpbSaturationEnabled = false;
m_cpbSize = targetBitrate;
m_cpbState = (UInt)(m_cpbSize*0.5f);
m_bufferingRate = (Int)(targetBitrate / frameRate);
delete[] bitsRatio; // free the temporary arrays
delete[] GOPID2Level;
}
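The bpp value that selects the weight table can be checked by hand. The helper below is a small sketch with a hypothetical name; it only repeats the arithmetic of the bpp line in the listing above.

```cpp
#include <cassert>

// Bits per pixel: the target bitrate (bits per second) spread over
// every pixel of every frame shown in one second.
double bitsPerPixel(int targetBitrate, int frameRate, int picWidth, int picHeight)
{
    assert(frameRate > 0 && picWidth > 0 && picHeight > 0);
    return targetBitrate / (static_cast<double>(frameRate) * picWidth * picHeight);
}
```

For example, 1920x1080 at 30 fps with a target of 4,000,000 bit/s gives roughly 0.064 bpp, which is below the 0.1 threshold of the branches kept in the listing.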
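PS: The figure below illustrates the concept of GOPID2Level. It shows a Random Access example (the figure is a bit ugly, sorry...). The numbers 1~8 denote the 8 frames of one GOP. After the I frame (0) has been coded, frame 8 is coded first; frames 0 and 8 then serve together as the reference frames of frame 4, frames 0 and 4 as the references of frame 2, and frames 0 and 2 as the references of frame 1. This forms a hierarchy that is coded layer by layer, hence the identifier "layer" that marks which layer each frame belongs to. GOPID2Level is exactly this layer. It is indexed in coding order: GOPID2Level[0] is frame 8, GOPID2Level[1] is frame 4, GOPID2Level[2] is frame 2, GOPID2Level[3] and GOPID2Level[4] are frames 1 and 3, and so on.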
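The low-delay check and the level count from init() can be tried out in isolation. This is a sketch with hypothetical function names; numberOfLevels covers only the hierarchical case (keepHierBits > 0, or random access with a GOP of 8), and the coding order 8, 4, 2, 1, 3, 6, 5, 7 used in the note after the code is my assumption about the default random-access configuration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// A GOP is treated as low delay when the POC never decreases in coding order.
bool isLowDelay(const std::vector<int>& pocInCodingOrder)
{
    for (std::size_t i = 0; i + 1 < pocInCodingOrder.size(); ++i)
    {
        if (pocInCodingOrder[i] > pocInCodingOrder[i + 1])
        {
            return false;
        }
    }
    return true;
}

// Hierarchical case: log2(GOPSize) + 1 temporal layers,
// plus one level for intra pictures and one for non-reference pictures.
int numberOfLevels(int gopSize)
{
    assert(gopSize > 0);
    const int layers =
        static_cast<int>(std::log(static_cast<double>(gopSize)) / std::log(2.0) + 0.5) + 1;
    return layers + 2;
}
```

Passing the order 8, 4, 2, 1, 3, 6, 5, 7 to isLowDelay returns false because the POC drops from 8 to 4, which is how init() ends up in the random-access branches.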
1.2. In TEncTop::encode(), the GOP level is initialized; the main work is computing the target bits of each frame in the GOP.
if ( m_RCEnableRateControl )
{
m_cRateCtrl.initRCGOP( m_iNumPicRcvd );
}
initRCGOP() has only two lines of code; the important one is m_encRCGOP->create( m_encRCSeq, numberOfPictures ), shown below.
Void TEncRCGOP::create( TEncRCSeq* encRCSeq, Int numPic )
{
destroy();
Int targetBits = xEstGOPTargetBits( encRCSeq, numPic ); // compute the bits allocated to this GOP
if ( encRCSeq->getAdaptiveBits() > 0 && encRCSeq->getLastLambda() > 0.1 ) // normally not entered unless adaptiveBits is enabled
{
Double targetBpp = (Double)targetBits / encRCSeq->getNumPixel();
Double basicLambda = 0.0;
Double* lambdaRatio = new Double[encRCSeq->getGOPSize()];
Double* equaCoeffA = new Double[encRCSeq->getGOPSize()];
Double* equaCoeffB = new Double[encRCSeq->getGOPSize()];
if ( encRCSeq->getAdaptiveBits() == 1 ) // for GOP size =4, low delay case
{
if ( encRCSeq->getLastLambda() < 120.0 )
{
lambdaRatio[1] = 0.725 * log( encRCSeq->getLastLambda() ) + 0.5793;
lambdaRatio[0] = 1.3 * lambdaRatio[1];
lambdaRatio[2] = 1.3 * lambdaRatio[1];
lambdaRatio[3] = 1.0;
}
else
{
lambdaRatio[0] = 5.0;
lambdaRatio[1] = 4.0;
lambdaRatio[2] = 5.0;
lambdaRatio[3] = 1.0;
}
}
else if ( encRCSeq->getAdaptiveBits() == 2 ) // for GOP size = 8, random access case
{
if ( encRCSeq->getLastLambda() < 90.0 )
{
lambdaRatio[0] = 1.0;
lambdaRatio[1] = 0.725 * log( encRCSeq->getLastLambda() ) + 0.7963;
lambdaRatio[2] = 1.3 * lambdaRatio[1];
lambdaRatio[3] = 3.25 * lambdaRatio[1];
lambdaRatio[4] = 3.25 * lambdaRatio[1];
lambdaRatio[5] = 1.3 * lambdaRatio[1];
lambdaRatio[6] = 3.25 * lambdaRatio[1];
lambdaRatio[7] = 3.25 * lambdaRatio[1];
}
else
{
lambdaRatio[0] = 1.0;
lambdaRatio[1] = 4.0;
lambdaRatio[2] = 5.0;
lambdaRatio[3] = 12.3;
lambdaRatio[4] = 12.3;
lambdaRatio[5] = 5.0;
lambdaRatio[6] = 12.3;
lambdaRatio[7] = 12.3;
}
}
xCalEquaCoeff( encRCSeq, lambdaRatio, equaCoeffA, equaCoeffB, encRCSeq->getGOPSize() );
basicLambda = xSolveEqua( targetBpp, equaCoeffA, equaCoeffB, encRCSeq->getGOPSize() );
encRCSeq->setAllBitRatio( basicLambda, equaCoeffA, equaCoeffB );
delete []lambdaRatio;
delete []equaCoeffA;
delete []equaCoeffB;
}
m_picTargetBitInGOP = new Int[numPic];
Int i;
Int totalPicRatio = 0;
Int currPicRatio = 0;
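for ( i=0; i<numPic; i++ )
{
totalPicRatio += encRCSeq->getBitRatio( i );
}
for ( i=0; i<numPic; i++ )
{
currPicRatio = encRCSeq->getBitRatio( i );
m_picTargetBitInGOP[i] = (Int)( ((Double)targetBits) * currPicRatio / totalPicRatio ); // use each frame's weight to allocate its bits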
}
m_encRCSeq = encRCSeq;
m_numPic = numPic;
m_targetBits = targetBits;
m_picLeft = m_numPic;
m_bitsLeft = m_targetBits;
}
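The two steps of create(), estimating the GOP budget and splitting it by weight, can be sketched as follows. The body of xEstGOPTargetBits is not shown in this post, so estimateGopTargetBits is only my reading of the sliding-window idea: the function names are hypothetical, and the window of 40 pictures and the floor of 200 bits are assumptions. splitByRatio repeats the last loop of the listing above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sliding-window estimate of the bits for one GOP (hypothetical helper).
// smoothWindow = 40 and the 200-bit floor are assumed values.
long long estimateGopTargetBits(long long seqTargetBits, int totalFrames,
                                long long bitsLeft, int framesLeft,
                                int gopSize, int smoothWindow = 40)
{
    assert(totalFrames > 0 && framesLeft > 0 && smoothWindow > 0);
    const int window = std::min(smoothWindow, framesLeft);
    const long long avgBitsPerPic = seqTargetBits / totalFrames;
    // Spread the current surplus or deficit over the pictures in the window.
    const long long curBitsPerPic =
        (bitsLeft - avgBitsPerPic * (framesLeft - window)) / window;
    return std::max(curBitsPerPic * gopSize, 200LL);
}

// Split the GOP budget across pictures in proportion to their weights.
std::vector<int> splitByRatio(long long gopTargetBits, const std::vector<int>& bitsRatio)
{
    int totalRatio = 0;
    for (int r : bitsRatio)
    {
        totalRatio += r;
    }
    assert(totalRatio > 0);
    std::vector<int> picBits;
    picBits.reserve(bitsRatio.size());
    for (int r : bitsRatio)
    {
        picBits.push_back(
            static_cast<int>(static_cast<double>(gopTargetBits) * r / totalRatio));
    }
    return picBits;
}
```

With the low-delay weights 2, 3, 2, 6 the last picture of the GOP receives 6/13 of the budget, which shows how strongly the allocation favours the key picture.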
1.3. In TEncGOP::compressGOP(), the frame level is initialized.
if ( m_pcCfg->getUseRateCtrl() ) // TODO: does this work with multiple slices and slice-segments?
{
Int frameLevel = m_pcRateCtrl->getRCSeq()->getGOPID2Level( iGOPid );
if ( pcPic->getSlice(0)->getSliceType() == I_SLICE )
{
frameLevel = 0;
}
m_pcRateCtrl->initRCPic( frameLevel ); // initialize the frame level
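Step into m_pcRateCtrl->initRCPic(); inside it, m_encRCPic->create() defines the frame-level parameters, including the basic information, as shown below. (This part is not hard; the variable names make the meaning clear.)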
Void TEncRCPic::create( TEncRCSeq* encRCSeq, TEncRCGOP* encRCGOP, Int frameLevel, list<TEncRCPic*>& listPreviousPictures )
{
destroy();
m_encRCSeq = encRCSeq;
m_encRCGOP = encRCGOP;
Int targetBits = xEstPicTargetBits( encRCSeq, encRCGOP );
Int estHeaderBits = xEstPicHeaderBits( listPreviousPictures, frameLevel );
if ( targetBits < estHeaderBits + 100 )
{
targetBits = estHeaderBits + 100; // at least allocate 100 bits for picture data
}
m_frameLevel = frameLevel; // basic information
m_numberOfPixel = encRCSeq->getNumPixel();
m_numberOfLCU = encRCSeq->getNumberOfLCU();
m_estPicLambda = 100.0;
m_targetBits = targetBits;
m_estHeaderBits = estHeaderBits;
m_bitsLeft = m_targetBits;
Int picWidth = encRCSeq->getPicWidth();
Int picHeight = encRCSeq->getPicHeight();
Int LCUWidth = encRCSeq->getLCUWidth();
Int LCUHeight = encRCSeq->getLCUHeight();
Int picWidthInLCU = ( picWidth % LCUWidth ) == 0 ? picWidth / LCUWidth : picWidth / LCUWidth + 1;
Int picHeightInLCU = ( picHeight % LCUHeight ) == 0 ? picHeight / LCUHeight : picHeight / LCUHeight + 1;
m_lowerBound = xEstPicLowerBound( encRCSeq, encRCGOP );
m_LCULeft = m_numberOfLCU;
m_bitsLeft -= m_estHeaderBits;
m_pixelsLeft = m_numberOfPixel;
m_LCUs = new TRCLCU[m_numberOfLCU];
Int i, j;
Int LCUIdx;
for ( i=0; i
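Two small pieces of create() are easy to verify on their own: the LCU grid size and the lower limit on the picture budget. The helpers below are a sketch with hypothetical names that mirror the expressions in the listing above.

```cpp
#include <cassert>

// Ceiling division: how many LCUs are needed to cover picSize samples.
int sizeInLCU(int picSize, int lcuSize)
{
    assert(picSize > 0 && lcuSize > 0);
    return (picSize % lcuSize) == 0 ? picSize / lcuSize : picSize / lcuSize + 1;
}

// Keep at least minDataBits for picture data on top of the estimated header bits.
int clampPicTargetBits(int targetBits, int estHeaderBits, int minDataBits = 100)
{
    const int floorBits = estHeaderBits + minDataBits;
    return targetBits < floorBits ? floorBits : targetBits;
}
```

For a 1920x1080 picture with 64x64 LCUs this gives a grid of 30 by 17 LCUs; the last row is only partially covered by the picture.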
1.4. In TEncSlice::compressSlice(), the LCU level is initialized, mainly the parameters bpp, lambda, and QP.
if ( m_pcCfg->getUseRateCtrl() )
Int estQP = pcSlice->getSliceQp();
Double estLambda = -1.0;
Double bpp = -1.0;
That covers all the initialization parameter settings in RC.