All sizes below refer to luma block sizes; by default, min_cu_size = 8.
CTU size: 16x16, 32x32, 64x64
CU size: 8x8, 16x16, 32x32, 64x64
PU size
Let the CU size be MxM.
Intra PU: MxM, or M/2 x M/2 (only when the CU size equals min_cu_size)
Inter PU: 8 partition modes are supported
Restrictions on inter PUs
TU size
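4x4, 8x8, 16x16, 32x32
When a CB is larger than the maximum TB size, a further split is implied; when splitting further would produce a block smaller than the minimum TB size, no split is implied. Inside that range, and as long as the split depth is below the maximum split depth, the encoder may choose either to split or not to split. The maximum split depth is written into the SPS by the encoder, and its allowed range is [0, CtbLog2SizeY − MinTbLog2SizeY].
In addition, depending on the inter/intra PU partitioning, the TU has an inferred default split (see the notes further down). A TU must be equal to or smaller than an intra PU, but it may cross inter PU boundaries.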
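The three cases described above (implied split, implied no-split, signalled choice) can be sketched as a small classifier. This is only an illustration: the enum and the function name tu_split_decision are made up for these notes, and the inferred splits caused by IntraSplitFlag and interSplitFlag are deliberately left out.

```c
/* How the split decision for a transform block is obtained. */
typedef enum {
    SPLIT_IMPLIED,     /* block exceeds the maximum TB size: it must split   */
    NO_SPLIT_IMPLIED,  /* minimum TB size or maximum depth reached: no split */
    SPLIT_SIGNALLED    /* encoder's choice, a flag is coded in the bitstream */
} SplitDecision;

/* Hypothetical helper. Sizes are log2 of the luma width; depth counts the
 * splits already made below the coding block. */
static SplitDecision tu_split_decision(int log2_size, int depth,
                                       int log2_max_tb, int log2_min_tb,
                                       int max_depth)
{
    if (log2_size > log2_max_tb)
        return SPLIT_IMPLIED;
    if (log2_size <= log2_min_tb || depth >= max_depth)
        return NO_SPLIT_IMPLIED;
    return SPLIT_SIGNALLED;
}
```

For example, a 64x64 CB with a 32x32 maximum TB falls into the first case regardless of the depth limit.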
Annex A, "Profiles, tiers and levels", states for example:
Main Profile
– CtbLog2SizeY derived according to active SPSs for the base layer shall be in the range of 4 to 6, inclusive.
7.4.3.2.1 General sequence parameter set RBSP semantics
log2_diff_max_min_luma_transform_block_size
The variable MaxTbLog2SizeY is set equal to log2_min_luma_transform_block_size_minus2 + 2 + log2_diff_max_min_luma_transform_block_size.
The CVS shall not contain data that result in MaxTbLog2SizeY greater than Min( CtbLog2SizeY, 5 ).
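The two quoted constraints explain why the TU size list stops at 32x32 even though the CTU can be 64x64. A minimal sketch of the derivation and the checks follows; the function names are invented for illustration, and only the constraints quoted above are checked.

```c
#include <stdbool.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* MaxTbLog2SizeY as derived in the SPS semantics quoted above. */
static int max_tb_log2_size_y(int log2_min_luma_transform_block_size_minus2,
                              int log2_diff_max_min_luma_transform_block_size)
{
    return log2_min_luma_transform_block_size_minus2 + 2 +
           log2_diff_max_min_luma_transform_block_size;
}

/* True when both quoted constraints hold: the Main profile CTB range (4..6)
 * and MaxTbLog2SizeY <= Min(CtbLog2SizeY, 5). */
static bool sps_sizes_ok_main(int ctb_log2_size_y, int max_tb_log2)
{
    bool ctb_ok = ctb_log2_size_y >= 4 && ctb_log2_size_y <= 6;
    bool tb_ok  = max_tb_log2 <= min_int(ctb_log2_size_y, 5);
    return ctb_ok && tb_ok;
}
```

With a 16x16 CTU the largest TB is therefore 16x16, and with a 64x64 CTU it is still capped at 32x32.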
The semantics of the relevant syntax elements include:
When part_mode is not present, the variables PartMode and IntraSplitFlag are derived as follows:
– PartMode is set equal to PART_2Nx2N.
– IntraSplitFlag is set equal to 0.
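So for an intra CU that is not of the minimum CB size, part_mode is not present and the CU is not partitioned by default. For inter CUs, the spec restricts the values part_mode may take.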
The value of part_mode is restricted as follows:
– If CuPredMode[ x0 ][ y0 ] is equal to MODE_INTRA, part_mode shall be equal to 0 or 1.
– Otherwise (CuPredMode[ x0 ][ y0 ] is equal to MODE_INTER), the following applies:
– If log2CbSize is greater than MinCbLog2SizeY and amp_enabled_flag is equal to 1, part_mode shall be in the range of 0 to 2, inclusive, or in the range of 4 to 7, inclusive.
– Otherwise, if log2CbSize is greater than MinCbLog2SizeY and amp_enabled_flag is equal to 0, or log2CbSize is equal to 3, part_mode shall be in the range of 0 to 2, inclusive.
– Otherwise (log2CbSize is greater than 3 and equal to MinCbLog2SizeY), the value of part_mode shall be in the range of 0 to 3, inclusive.
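The restriction list above is easier to read as a predicate. The sketch below follows the quoted rules branch by branch; part_mode_allowed and its parameter names are hypothetical, not taken from any decoder.

```c
#include <stdbool.h>

/* Returns true if part_mode is allowed for the given CU, per the quoted rules. */
static bool part_mode_allowed(int part_mode, bool is_intra, int log2_cb_size,
                              int min_cb_log2_size_y, bool amp_enabled_flag)
{
    if (part_mode < 0 || part_mode > 7)
        return false;
    if (is_intra)
        return part_mode <= 1;
    if (log2_cb_size > min_cb_log2_size_y) {
        if (amp_enabled_flag)
            return part_mode != 3;   /* 0..2 or 4..7 */
        return part_mode <= 2;       /* 0..2 */
    }
    /* From here on log2_cb_size equals min_cb_log2_size_y. */
    if (log2_cb_size == 3)
        return part_mode <= 2;       /* 8x8 CU: value 3 is excluded */
    return part_mode <= 3;
}
```

Note that value 3 is only reachable for an inter CU that is at the minimum CB size and larger than 8x8.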
The following table gives the meaning of part_mode and IntraSplitFlag:
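The table itself is not reproduced in these notes. As a stand-in, here is the mapping as I recall it from the spec's part_mode table, written as code. Treat it as an assumption to verify against the spec: the enum order mirrors the PartMode names used in the quoted text, and derive_part_info is a made-up helper.

```c
#include <stdbool.h>

typedef enum {
    PART_2Nx2N, PART_2NxN, PART_Nx2N, PART_NxN,
    PART_2NxnU, PART_2NxnD, PART_nLx2N, PART_nRx2N
} PartMode;

typedef struct {
    PartMode part_mode;
    int intra_split_flag;
} PartInfo;

/* Assumed mapping (recalled, not copied from the spec): for intra, part_mode 0
 * means PART_2Nx2N and 1 means PART_NxN with IntraSplitFlag = 1; for inter,
 * part_mode 0..7 maps to the eight modes in order and IntraSplitFlag is 0. */
static PartInfo derive_part_info(bool is_intra, int part_mode)
{
    PartInfo info;
    if (is_intra) {
        info.part_mode = part_mode ? PART_NxN : PART_2Nx2N;
        info.intra_split_flag = part_mode ? 1 : 0;
    } else {
        info.part_mode = (PartMode)part_mode;
        info.intra_split_flag = 0;
    }
    return info;
}
```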
rqt_root_cbf equal to 1 specifies that the transform_tree( ) syntax structure is present for the current coding unit.
rqt_root_cbf equal to 0 specifies that the transform_tree( ) syntax structure is not present for the current coding unit.
When rqt_root_cbf is not present, its value is inferred to be equal to 1.
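IntraSplitFlag can be 1 only for an intra CU of the minimum CB size. In that case MaxTrafoDepth = max_transform_hierarchy_depth_intra + 1, which lets the TU split go one level deeper.
split_transform_flag indicates whether a block is split into four equal sub-blocks. When split_transform_flag is not present, its value is inferred as follows: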
When split_transform_flag[ x0 ][ y0 ][ trafoDepth ] is not present, it is inferred as follows:
– If one or more of the following conditions are true, the value of split_transform_flag[ x0 ][ y0 ][ trafoDepth ] is inferred to be equal to 1:
– log2TrafoSize is greater than MaxTbLog2SizeY.
– IntraSplitFlag is equal to 1 and trafoDepth is equal to 0.
– interSplitFlag is equal to 1.
– Otherwise, the value of split_transform_flag[ x0 ][ y0 ][ trafoDepth ] is inferred to be equal to 0.
The variable interSplitFlag is derived as follows:
– If max_transform_hierarchy_depth_inter is equal to 0 and CuPredMode[ x0 ][ y0 ] is equal to MODE_INTER and PartMode is not equal to PART_2Nx2N and trafoDepth is equal to 0, interSplitFlag is set equal to 1.
– Otherwise, interSplitFlag is set equal to 0.
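So, setting aside the effect of MaxTbLog2SizeY and MinTbLog2SizeY:
For an inter CU, if max_transform_hierarchy_depth_inter = 0, split_transform_flag is not present and its value is determined as follows:
Inter PART_NxN: the TU is split once automatically. At the next level split_transform_flag is still not present and can only be inferred as 0, so the TU and PU partitions coincide exactly.
Inter PART_2Nx2N: the TU is not split, and the TU and PU partitions again coincide exactly.
Other inter PU partitions: the TU is split once automatically, but its size differs from the PU size.
If max_transform_hierarchy_depth_inter != 0, however, this restriction does not apply.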
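The inter cases listed above can be checked with a short sketch. It assumes max_transform_hierarchy_depth_inter = 0 and, like the text, ignores the MaxTbLog2SizeY and MinTbLog2SizeY limits; the function names are invented for these notes.

```c
#include <stdbool.h>

/* interSplitFlag as in the quoted derivation. */
static bool inter_split_flag(int max_transform_hierarchy_depth_inter,
                             bool is_inter, bool is_2Nx2N, int trafo_depth)
{
    return max_transform_hierarchy_depth_inter == 0 && is_inter &&
           !is_2Nx2N && trafo_depth == 0;
}

/* log2 TU size of an inter CU when max_transform_hierarchy_depth_inter == 0.
 * No flag is coded at any depth, so only the inferred split at depth 0
 * can shrink the TU. */
static int inter_tu_log2_size_depth0(int log2_cb_size, bool is_2Nx2N)
{
    int log2_trafo_size = log2_cb_size;
    if (inter_split_flag(0, true, is_2Nx2N, 0))
        log2_trafo_size--;
    return log2_trafo_size;
}
```

For a 16x16 CU with PART_2NxN this gives 8x8 TUs while the PUs are 16x8, which is the "split once, but not the same size as the PU" case.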
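For intra, if the PU is split (IntraSplitFlag = 1), split_transform_flag is not present at the first level and is inferred as 1, so the split is automatic. At the next level split_transform_flag can be present again, provided the size and depth conditions allow it.
If the PU is not split (IntraSplitFlag = 0), split_transform_flag can be present from the first level on.
So for intra, a TU can only be equal to or smaller than the PU, and the two partitions need not coincide. Why can a TU not span intra PUs? Because intra prediction uses the reconstructed samples of neighbouring blocks as its reference. If a neighbouring block were transformed together with the current block, it could not be reconstructed before the current block is predicted.
Given that nothing is added to max_transform_hierarchy_depth_inter for inter, why is IntraSplitFlag added to max_transform_hierarchy_depth_intra for intra? The two cases have different motivations. For inter, a TU may cross PU boundaries; to keep discontinuities at PU boundaries from hurting compression efficiency, the TU tends to be split whenever the CU is split into PUs. This only happens when split_transform_flag is not present, and then the TU is split by at most one level; the encoder can of course also use split_transform_flag to signal explicitly that there is no split, or that the split continues for several levels. For intra, a TU must not be larger than the PU, so a TU never spans several intra PUs. Presumably, to allow a finer split in certain special cases, an intra TU is allowed one more level than max_transform_hierarchy_depth_intra.
IntraSplitFlag is 1 only when the CU size equals min_cb_size. With the default min_cb_size = 8, a single TU split already reaches 4x4, so at most one split is possible.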
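A small sketch of the depth arithmetic, to make the "one extra level, and at most one split for an 8x8 CU" point concrete. Both helpers are hypothetical, and the lower bound assumes that no split ever goes below the minimum TB size.

```c
static int max_int(int a, int b) { return a > b ? a : b; }

/* MaxTrafoDepth for an intra CU: the SPS depth plus IntraSplitFlag. */
static int max_trafo_depth_intra(int max_transform_hierarchy_depth_intra,
                                 int intra_split_flag)
{
    return max_transform_hierarchy_depth_intra + intra_split_flag;
}

/* Smallest log2 TU size reachable inside a CU, limited both by the depth
 * budget and by the minimum TB size (assumed to be a hard floor). */
static int smallest_tu_log2(int log2_cb_size, int max_trafo_depth, int log2_min_tb)
{
    return max_int(log2_cb_size - max_trafo_depth, log2_min_tb);
}
```

For an 8x8 intra NxN CU with a 4x4 minimum TB the result is 4x4 whether max_transform_hierarchy_depth_intra is 0 or 2, because the size floor is hit after one split.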
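If MinTbLog2SizeY = MinCbLog2SizeY and IntraSplitFlag or interSplitFlag is 1, then by the logic above split_transform_flag is inferred to be 1.
Yet the HEVC overview paper [1] says: "Not splitting is implicit when splitting would result in a luma TB size smaller than the indicated minimum."
So it looks as if a TU should never be smaller than MinTbLog2SizeY.
Where is the catch?
I went through several references without finding an answer, so let's analyse the HM code, specifically the decoder:
Relevant parameters: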
getQuadtreeTUMaxDepthInter
getQuadtreeTUMaxDepthIntra
getQuadtreeTULog2MinSize
getQuadtreeTULog2MaxSize
Code:
getQuadtreeTULog2MinSizeInCU()
  if (log2CbSize < (m_pcSlice->getSPS()->getQuadtreeTULog2MinSize() + quadtreeTUMaxDepth - 1 + interSplitFlag + intraSplitFlag) )
  {
    // when fully making use of signaled TUMaxDepth + inter/intraSplitFlag, resulting luma TB size is < QuadtreeTULog2MinSize
    log2MinTUSizeInCU = m_pcSlice->getSPS()->getQuadtreeTULog2MinSize();
  }
TDecEntropy::xDecodeTransform ()
  if( pcCU->isIntra(uiAbsPartIdx) && pcCU->getPartitionSize(uiAbsPartIdx) == SIZE_NxN && uiDepth == pcCU->getDepth(uiAbsPartIdx) )
  {
    uiSubdiv = 1;
  }
  else if( (pcCU->getSlice()->getSPS()->getQuadtreeTUMaxDepthInter() == 1) && (pcCU->isInter(uiAbsPartIdx)) && ( pcCU->getPartitionSize(uiAbsPartIdx) != SIZE_2Nx2N ) && (uiDepth == pcCU->getDepth(uiAbsPartIdx)) )
  {
    uiSubdiv = (uiLog2TrafoSize > quadtreeTULog2MinSizeInCU);
  }
  else if( uiLog2TrafoSize > pcCU->getSlice()->getSPS()->getQuadtreeTULog2MaxSize() )
  {
    uiSubdiv = 1;
  }
  else if( uiLog2TrafoSize == pcCU->getSlice()->getSPS()->getQuadtreeTULog2MinSize() )
  {
    uiSubdiv = 0;
  }
  else if( uiLog2TrafoSize == quadtreeTULog2MinSizeInCU )
  {
    uiSubdiv = 0;
  }
  else
  {
    assert( uiLog2TrafoSize > quadtreeTULog2MinSizeInCU );
    m_pcEntropyDecoderIf->parseTransformSubdivFlag( uiSubdiv, 5 - uiLog2TrafoSize );
  }
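This seems to say that in the intra case a TU can end up smaller than MinTbLog2SizeY, while in the inter case it cannot.
HM rewrites a good deal of the spec's logic, and even the meaning of some parameters is changed. For example, the value returned here by pcCU->getSlice()->getSPS()->getQuadtreeTUMaxDepthInter() had 1 added when it was stored, so the check for the inferred inter split has to compare against 1 rather than 0, which is not easy to follow.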
READ_UVLC_CHK( uiCode, "max_transform_hierarchy_depth_inter", 0, ctbLog2SizeY - minTbLog2SizeY); pcSPS->setQuadtreeTUMaxDepthInter( uiCode+1 );
READ_UVLC_CHK( uiCode, "max_transform_hierarchy_depth_intra", 0, ctbLog2SizeY - minTbLog2SizeY); pcSPS->setQuadtreeTUMaxDepthIntra( uiCode+1 );
FFmpeg follows the logic in the spec exactly:
if (log2_trafo_size <= s->ps.sps->log2_max_trafo_size &&
    log2_trafo_size > s->ps.sps->log2_min_tb_size &&
    trafo_depth < lc->cu.max_trafo_depth &&
    !(lc->cu.intra_split_flag && trafo_depth == 0)) {
    split_transform_flag = ff_hevc_split_transform_flag_decode(s, log2_trafo_size);
} else {
    int inter_split = s->ps.sps->max_transform_hierarchy_depth_inter == 0 &&
                      lc->cu.pred_mode == MODE_INTER &&
                      lc->cu.part_mode != PART_2Nx2N &&
                      trafo_depth == 0;
    split_transform_flag = log2_trafo_size > s->ps.sps->log2_max_trafo_size ||
                           (lc->cu.intra_split_flag && trafo_depth == 0) ||
                           inter_split;
}
However, later code uses log2_min_tu_size for array indexing, so a TU smaller than the minimum TB would cause problems:
int min_tu_size = 1 << s->ps.sps->log2_min_tb_size;
int log2_min_tu_size = s->ps.sps->log2_min_tb_size;
…
// TODO: store cbf_luma somewhere else
if (cbf_luma) {
    int i, j;
    for (i = 0; i < (1 << log2_trafo_size); i += min_tu_size)
        for (j = 0; j < (1 << log2_trafo_size); j += min_tu_size) {
            int x_tu = (x0 + j) >> log2_min_tu_size;
            int y_tu = (y0 + i) >> log2_min_tu_size;
            s->cbf_luma[y_tu * min_tu_width + x_tu] = 1;
        }
}
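To see what would go wrong, the sketch below reuses the index arithmetic from the excerpt and counts how many distinct grid cells the four sub-TUs of a block land in. The helper names are made up; only the shift-and-multiply indexing is taken from the excerpt.

```c
/* Index into a per-minimum-TB grid, computed as in the excerpt above. */
static int tu_grid_index(int x0, int y0, int log2_min_tb_size, int min_tu_width)
{
    int x_tu = x0 >> log2_min_tb_size;
    int y_tu = y0 >> log2_min_tb_size;
    return y_tu * min_tu_width + x_tu;
}

/* Counts the distinct grid cells hit by the four sub-TUs (each of size
 * 1 << log2_sub_size) of a block whose top-left corner is (x0, y0). */
static int distinct_cells_of_four_subtus(int x0, int y0, int log2_sub_size,
                                         int log2_min_tb_size, int min_tu_width)
{
    int seen[4], n = 0;
    for (int k = 0; k < 4; k++) {
        int x = x0 + ((k & 1) << log2_sub_size);
        int y = y0 + ((k >> 1) << log2_sub_size);
        int idx = tu_grid_index(x, y, log2_min_tb_size, min_tu_width);
        int dup = 0;
        for (int m = 0; m < n; m++)
            if (seen[m] == idx)
                dup = 1;
        if (!dup)
            seen[n++] = idx;
    }
    return n;
}
```

With a 4x4 minimum TB, four 4x4 TUs get four separate cells. If the minimum TB were 8x8 and the TUs were still 4x4, all four would share one cell, so per-TU information such as cbf_luma could no longer be stored separately.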
libhevc, the HEVC decoder used on Android, also handles this essentially the same way as the spec.
Why??
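One possible explanation, which should be checked against the SPS semantics: as far as I recall, the semantics of log2_min_luma_transform_block_size_minus2 require MinTbLog2SizeY to be strictly smaller than MinCbLog2SizeY in a conforming stream. If that holds, the premise MinTbLog2SizeY = MinCbLog2SizeY cannot occur, and a single inferred split can never go below the minimum TB size. The sketch below checks this exhaustively under that assumed constraint; both function names are hypothetical.

```c
#include <stdbool.h>

/* Returns true if one inferred split of any CB stays at or above the minimum
 * TB size for every configuration that satisfies the ASSUMED constraint
 * MinTbLog2SizeY < MinCbLog2SizeY (CTB log2 sizes 4..6 as in Main profile). */
static bool forced_split_never_below_min_tb(void)
{
    for (int ctb = 4; ctb <= 6; ctb++)
        for (int min_cb = 3; min_cb <= ctb; min_cb++)
            for (int min_tb = 2; min_tb < min_cb; min_tb++)  /* assumed constraint */
                for (int cb = min_cb; cb <= ctb; cb++)
                    if (cb - 1 < min_tb)                     /* TU size after one split */
                        return false;
    return true;
}

/* The same check for the hypothetical case MinTbLog2SizeY == MinCbLog2SizeY,
 * applied to a CB of the minimum size. */
static bool forced_split_ok_if_equal(int min_cb)
{
    int min_tb = min_cb;
    return min_cb - 1 >= min_tb;
}
```

Under the assumed constraint the first check passes, while the second fails for every minimum CB size. That would explain why decoders that follow the spec literally do not need to guard against the case.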
[1] Sullivan, G.J., et al., Overview of the High Efficiency Video Coding (HEVC) standard. IEEE Transactions on Circuits and Systems for Video Technology, 2012. 22(12): p. 1649-1668.