Hierarchical B-Frames or B-Pyramid Detection

 

http://www.ramugedia.com/hierarchical-b-frames-or-b-pyramid


What is Hierarchical B-Frame Mode, or B-pyramid (note that, in my opinion, B-pyramid is a bad term)?

If there is a run of B-frames and some B-frames in the run are used as reference frames by other B-frames, then this mode is called Hierarchical B-Frames coding, or B-pyramid.

The following figure, taken from the paper “Analysis of Hierarchical B Pictures and MCTF” by Heiko Schwarz, Detlev Marpe, and Thomas Wiegand, illustrates the concept of B-pyramid:

 

[Figure 1: hierarchical B-picture coding structure, from “Analysis of Hierarchical B Pictures and MCTF”]

 

Let’s display the first GOP from the above figure slightly differently:

 

[Figure 2: the first GOP of the above figure, redrawn]

 

So a geometric shape is revealed, but it is not a pyramid. That is why, in my opinion, the term B-pyramid is a bad one.

According to the results of the above-mentioned paper “Analysis of Hierarchical B Pictures and MCTF”, using Hierarchical B-Frames commonly improves coding efficiency (e.g. on Football CIF 30Hz the improvement is about 0.5 dB of Y-PSNR).

Pros and Cons of Hierarchical B-frames

Pros: better exploitation of temporal redundancy.

Cons: long coding latency (not suitable for low-latency applications)

 

How to Detect Hierarchical B-Frames or B-Pyramid?

For each frame we check all four of the following conditions:

  • Current frame is B

  • Previous frame (in decoding order) is also B (i.e. the number of successive B-frames is greater than one)

  • Previous RefIdc (nal_ref_idc) is non-zero (i.e. the previous B-frame is used for reference)

  • POC of the current B-frame is smaller than that of the previous one

If all of the above conditions are met then B-pyramid is detected.
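The four conditions above can be sketched as a small check over frames listed in decoding order. This is only a minimal illustration: the `Frame` record and the `has_b_pyramid` function are hypothetical names, and the `order` field stands for whatever display-order key is available (POC here).

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Frame:
    is_b: bool        # True if the picture is coded as B
    nal_ref_idc: int  # nal_ref_idc from the NAL header (0 = not used for reference)
    order: int        # display-order key, e.g. POC


def has_b_pyramid(frames: Iterable[Frame]) -> bool:
    """Return True if some frame meets all four conditions (frames in decoding order)."""
    prev: Optional[Frame] = None
    for cur in frames:
        if (prev is not None
                and cur.is_b                  # current frame is B
                and prev.is_b                 # previous frame is also B
                and prev.nal_ref_idc != 0     # previous B-frame is used for reference
                and cur.order < prev.order):  # current frame is displayed earlier
            return True
        prev = cur
    return False


# Decoding order I0 P8 B4 B2 B6, where B4 is a reference B-frame.
hierarchical = [Frame(False, 3, 0), Frame(False, 2, 8),
                Frame(True, 1, 4), Frame(True, 0, 2), Frame(True, 0, 6)]
# Decoding order I0 P3 B1 B2 with non-reference B-frames.
flat = [Frame(False, 3, 0), Frame(False, 2, 3),
        Frame(True, 0, 1), Frame(True, 0, 2)]
print(has_b_pyramid(hierarchical), has_b_pyramid(flat))
```

The check stops at the first match, since a single such pair of B-frames is enough to report B-pyramid.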

If the elementary stream is encapsulated in an MPEG-TS container then we can use PTS instead of POC and so avoid deriving the POC (in the case of pic_order_cnt_type=1 the derivation of POC is a tricky process). Indeed, to get the POC it’s necessary to dive into the SPS to get log2_max_pic_order_cnt_lsb_minus4 if pic_order_cnt_type=0, or a dozen other parameters in the case of pic_order_cnt_type=1.

 

 

How to Detect B-Pyramid if the Elementary Stream is Encapsulated in an MPEG-TS or MPEG4 Container?

MPEG TS Container

When the elementary stream is encapsulated in an MPEG-TS container we look for video frame boundaries to pick up the PTS. We get the PTS from the PES header, and the frame start is mandatorily indicated by an AUD (nal_type=9) in the transport packet payload. Notice that if the PES headers carry only PTS and never DTS, then DTS=PTS for every frame, frames are not reordered, and no B-pyramid can exist in such a case. Picture data (or slice data in the case of multiple slices per picture) is contained in a NALU with nal_type = 1 or 5 (IDR). It is possible that slice data is absent from the current transport packet and is present in the next video packet or the one after it (e.g. if the SPS is too long).
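As an illustration of picking up the PTS, here is a minimal sketch that reads the 33-bit PTS from the start of a PES packet. It assumes the PES packet has already been reassembled from transport packets and that `pes` begins at the 0x000001 start-code prefix; the function name `parse_pes_pts` is hypothetical.

```python
from typing import Optional


def parse_pes_pts(pes: bytes) -> Optional[int]:
    """Return the 33-bit PTS of a PES packet, or None if no PTS is signalled."""
    if len(pes) < 9 or pes[0:3] != b"\x00\x00\x01":
        return None
    pts_dts_flags = pes[7] >> 6   # binary 10 = PTS only, 11 = PTS and DTS
    if not (pts_dts_flags & 0x2) or len(pes) < 14:
        return None
    p = pes[9:14]                 # five bytes holding 33 PTS bits plus marker bits
    return ((((p[0] >> 1) & 0x07) << 30)
            | (p[1] << 22)
            | ((p[2] >> 1) << 15)
            | (p[3] << 7)
            | (p[4] >> 1))


# Hand-built video PES header (stream_id 0xE0) carrying PTS = 90000, i.e. 1 second at 90 kHz.
example_pes = bytes([0x00, 0x00, 0x01, 0xE0, 0x00, 0x00, 0x80, 0x80, 0x05,
                     0x21, 0x00, 0x05, 0xBF, 0x21])
print(parse_pes_pts(example_pes))
```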

Once a NAL unit with nal_type 1 or 5 is found, we need to extract nal_ref_idc from the NAL header and the first two syntax elements from the slice header: first_mb_in_slice and slice_type.

The NAL unit of each slice consists of:

Start code (0x000001 or 0x00000001), NAL header (1 byte), slice header and slice data.

nal_type = nal_header & 0x1f

nal_ref_idc = ( nal_header & 0x60 ) >> 5
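The two masks above can be applied to every start code found in a byte-stream buffer. The sketch below is a simple illustration with the hypothetical helper name `iter_annexb_nal_headers`; it searches only for the 3-byte start code, which also matches the tail of the 4-byte form.

```python
from typing import Iterator, Tuple


def iter_annexb_nal_headers(data: bytes) -> Iterator[Tuple[int, int, int]]:
    """Yield (header_offset, nal_ref_idc, nal_type) for each start code in `data`."""
    pos = data.find(b"\x00\x00\x01")
    while pos != -1 and pos + 3 < len(data):
        nal_header = data[pos + 3]            # the byte right after the start code
        nal_ref_idc = (nal_header & 0x60) >> 5
        nal_type = nal_header & 0x1F
        yield pos + 3, nal_ref_idc, nal_type
        pos = data.find(b"\x00\x00\x01", pos + 3)


# An AUD (header 0x09) followed by a non-IDR slice with nal_ref_idc = 2 (header 0x41).
example_stream = bytes([0x00, 0x00, 0x00, 0x01, 0x09, 0xF0,
                        0x00, 0x00, 0x01, 0x41, 0x9A])
print(list(iter_annexb_nal_headers(example_stream)))
```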

To determine first_mb_in_slice and slice_type we need to read the first byte of the slice header, slh[0], and perform the following operations:

  • Get the MSbit of the first byte: slh[0]>>7. Because first_mb_in_slice is Exp-Golomb coded, this bit is ‘1’ exactly when first_mb_in_slice = 0.

  • If the MSbit is 1 (first_mb_in_slice = 0) then the current slice is the first slice in a picture and it is actually the start of picture data (in such a case the next step is to determine whether the slice type is B or not).

  • If the MSbit is 0 (first_mb_in_slice > 0) then the current slice is not the first one in a picture and the picture type has already been determined.

  • For the first slice we have to determine whether the slice type is B or not. The slice type code corresponding to B has two values, 1 or 6. The Exp-Golomb bit representation of 1 is ‘010’ and of 6 is ‘00111’.

Hence, if the current slice is the first slice in a picture (i.e. first_mb_in_slice = 0, so the MSbit is ‘1’) and the picture type is B, then one of the following two bit patterns is transmitted at the start of the first byte slh[0] of the slice header:

1010     or      100111

Based on the above patterns we derive the following rules to determine whether the picture type is B or not:

 if ( slh[0] >> 4 ) == 0xA then the current slice is the first slice and the picture type is B

 if ( slh[0] & 0xFC ) == 0x9C then the current slice is the first slice and the picture type is B
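The two rules can be written as a one-line test on the first slice-header byte. The sketch below also includes a small Exp-Golomb encoder to show where the bit patterns come from; both function names are hypothetical.

```python
def ue_bits(value: int) -> str:
    """Exp-Golomb ue(v) code of `value`, returned as a string of '0'/'1' characters."""
    code = bin(value + 1)[2:]               # binary form of value + 1
    return "0" * (len(code) - 1) + code     # prefixed by (length - 1) zero bits


def is_first_slice_of_b_picture(slh0: int) -> bool:
    """Apply the two bit-pattern rules to the first byte of the slice header."""
    return (slh0 >> 4) == 0xA or (slh0 & 0xFC) == 0x9C


# A leading '1' is the ue(v) code of the value 0; B slice types 1 and 6 follow it.
print(ue_bits(0) + ue_bits(1), ue_bits(0) + ue_bits(6))
print(is_first_slice_of_b_picture(0xA8), is_first_slice_of_b_picture(0x9E),
      is_first_slice_of_b_picture(0xB8))
```

Only one byte is inspected, so the test works without a full bit reader; anything beyond the first two syntax elements of the slice header would need one.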

 

For each frame we check all four of the following conditions:

  • Current frame is B

  • Previous frame (in decoding order) is also B (i.e. the number of successive B-frames is greater than one)

  • Previous RefIdc (nal_ref_idc) is non-zero (i.e. the previous frame is used for reference)

  • PTS of the current B-frame is smaller than that of the previous one

If all of the above conditions are met then B-pyramid is detected.

 

MPEG4 Container (non-fragmented)

With the ‘stco’ and ‘stsz’ tables in the metadata (together with ‘stsc’, which maps samples to chunks) we can visit all access units successively in decoding order.

For each access unit we skip over non-VCL units (e.g. SEI) until the first slice data NAL unit is found (nal_type=1 or 5).

Then we read the NAL header (to determine nal_ref_idc) and the following byte (which is the first byte of the slice header) to determine the slice type (B or not B). Slice type and nal_ref_idc are determined exactly as in the previous section. Alternatively, whether the frame is used for reference can be derived from the sdtp box, provided that this box is present in the metadata (notice that signalling the sdtp box is not mandatory).
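The skipping step can be sketched as follows. One assumption not stated above: inside an MP4 sample the NAL units are normally stored with a length prefix instead of a start code, and the prefix size (commonly 4 bytes) comes from the track's avcC configuration record. The function name `first_slice_info` and its `length_size` parameter are hypothetical.

```python
from typing import Optional, Tuple


def first_slice_info(sample: bytes, length_size: int = 4) -> Optional[Tuple[int, int, int]]:
    """Return (nal_ref_idc, nal_type, slh0) of the first slice NAL unit in a sample."""
    pos = 0
    while pos + length_size <= len(sample):
        nal_len = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if nal_len == 0 or pos + nal_len > len(sample):
            return None                       # malformed or truncated sample
        nal_header = sample[pos]
        nal_type = nal_header & 0x1F
        if nal_type in (1, 5) and nal_len >= 2:
            # sample[pos + 1] is the first byte of the slice header
            return (nal_header & 0x60) >> 5, nal_type, sample[pos + 1]
        pos += nal_len                        # skip non-VCL units such as AUD or SEI
    return None


# A 2-byte AUD followed by a 3-byte non-IDR slice whose slice header starts with 0xA8.
example_sample = bytes([0x00, 0x00, 0x00, 0x02, 0x09, 0xF0,
                        0x00, 0x00, 0x00, 0x03, 0x41, 0xA8, 0x00])
print(first_slice_info(example_sample))
```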

 

With the ctts table in the metadata we derive the PTS of each access unit (if ctts is not present then PTS = DTS and no B-pyramid can exist in such a stream).
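A minimal sketch of that derivation, assuming both tables have already been parsed into lists of (sample_count, value) pairs. The DTS comes from the stts (decoding time-to-sample) table, which is not mentioned above but is needed to accumulate decode times; the function name `sample_pts` is hypothetical.

```python
from typing import List, Tuple


def sample_pts(stts: List[Tuple[int, int]], ctts: List[Tuple[int, int]]) -> List[int]:
    """Expand run-length (count, value) tables into a per-sample PTS list."""
    deltas = [delta for count, delta in stts for _ in range(count)]
    offsets = [offset for count, offset in ctts for _ in range(count)]
    pts = []
    dts = 0
    for i, delta in enumerate(deltas):
        offset = offsets[i] if i < len(offsets) else 0  # missing ctts entry: PTS = DTS
        pts.append(dts + offset)
        dts += delta
    return pts


# Five samples, one time unit apart, in decoding order I P B b b.
example_stts = [(5, 1)]
example_ctts = [(1, 2), (1, 5), (1, 2), (1, 0), (1, 1)]
print(sample_pts(example_stts, example_ctts))
```

In this example the third sample (PTS 4) is followed in decoding order by a sample with a smaller PTS (3), which is the ordering that the fourth condition looks for.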

For each frame we check all four of the following conditions:

  • Current frame is B

  • Previous frame (in decoding order) is also B (i.e. the number of successive B-frames is greater than one)

  • Previous RefIdc (nal_ref_idc) is non-zero (i.e. the previous frame is used for reference)

  • PTS of the current B-frame is smaller than that of the previous one

If all of the above conditions are met then B-pyramid is detected.

 
