HEVC Study Notes: Bitstream Analysis

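I. From the layered codec framework to the NAL unit

Taking H.264 as an example:
H.264 adapts to transmission over different networks mainly because it introduces a layered structure: a Video Coding Layer (VCL) and a Network Abstraction Layer (NAL), which separates compression coding from network transport.
Data compressed by the H.264 algorithm is encapsulated into NAL packets through the NAL-VCL interface.
[Figure 1]
The basic unit of the NAL is the NALU, and the VCL hierarchy from top to bottom is as follows:
[Figure 2]
Slices are partitioned so that the coded data can fit the maximum transmission unit (MTU) of different transport networks.
Slice groups exist to make data independent of the other groups for a specific purpose, such as limiting error propagation to protect picture quality, or separating foreground from background so that each can be coded separately.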

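Each NALU combines header information with data from the VCL; a NALU that carries coded picture data contains one slice:
[Figure 3]
[Figure 4]
Here RBSP stands for Raw Byte Sequence Payload,
and SODB stands for String Of Data Bits, the raw data bit string.
The padding bits, rbsp_trailing_bits, byte-align the bitstream.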
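The relation between SODB and RBSP can be sketched in a few lines of Python. This is only an illustration: the function name `sodb_to_rbsp` and the string-of-bits input are assumptions made here, not part of any codec API. It appends a stop bit of 1 followed by as many 0 bits as are needed to reach a byte boundary.

```python
def sodb_to_rbsp(sodb_bits: str) -> bytes:
    """Append rbsp_trailing_bits to a SODB given as a string of '0'/'1'.

    Hypothetical helper for illustration only.
    """
    if set(sodb_bits) - {"0", "1"}:
        raise ValueError("sodb_bits must contain only '0' and '1'")
    bits = sodb_bits + "1"            # rbsp_stop_one_bit
    bits += "0" * (-len(bits) % 8)    # rbsp_alignment_zero_bit(s) up to a byte boundary
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


print(sodb_to_rbsp("10110").hex())     # b4   (10110 + 1 + 00)
print(sodb_to_rbsp("11111111").hex())  # ff80 (already aligned, so a full extra byte is added)
```

Note that even an already byte-aligned SODB gains one more byte, because the stop bit is always present.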

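II. NALU header information

The H.264 header takes one byte, i.e. 8 bits. There are only 32 NAL types (0-31), which need 5 bits, leaving 3 bits. Leaving them unused would waste 3 bits in every one of a very large number of NAL units, so they are put to use as well:
Bit 1: a 1-bit forbidden bit (forbidden_zero_bit). When the network detects a bit error in the unit, it can set this bit to 1 so that the receiver can discard the unit.
Bits 2-3: a 2-bit priority field (NRI, nal_ref_idc). Priority decreases in the order 11, 10, 01, 00, and a busy decoder decodes the higher-priority units first; 00 means the unit is not used for reference, so it is the safest one to drop.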
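The one-byte H.264 header layout described above (1 + 2 + 5 bits) can be unpacked as in the following minimal sketch. The function name `parse_h264_nal_header` and the dictionary keys are chosen here for illustration.

```python
def parse_h264_nal_header(first_byte: int) -> dict:
    """Split the one-byte H.264 NAL header into its three fields (illustrative helper)."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte value")
    return {
        "forbidden_zero_bit": first_byte >> 7,        # bit 1
        "nal_ref_idc": (first_byte >> 5) & 0x03,      # bits 2-3 (NRI)
        "nal_unit_type": first_byte & 0x1F,           # remaining 5 bits, 0-31
    }


# 0x67 = 0110 0111: F = 0, NRI = 3, type = 7 (SPS in H.264)
print(parse_h264_nal_header(0x67))
```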

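The 32 NAL types are:
[Figure 5]
The HEVC and AVC NAL headers differ in three main ways:
01. The AVC header takes one byte, while the HEVC header takes two bytes, enough to support the HEVC scalable coding, multiview coding, and 3D video coding extensions.
02. In AVC the video parameters are stored in the PPS and SPS NAL packets; HEVC adds the VPS (video parameter set), which holds the profile, level, and similar information.
03. The HEVC NAL header adds an identifier for the temporal layer the NAL belongs to, drops NRI, and folds that information into nal_unit_type.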

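[Figure 6]
01. A 1-bit forbidden bit F (forbidden_zero_bit). Unlike the role described for AVC above, its purpose here is to prevent bit patterns that could be interpreted as MPEG-2 start codes in legacy MPEG-2 systems environments.
02. A 6-bit type field, nal_unit_type, giving 64 values: 0-31 are VCL units, and the 32 added values (32-63) are used for non-VCL units.
03. A 6-bit Layer_ID (nuh_layer_id), the layer identifier that indicates which layer the NAL belongs to. In the scalable extension, for example, it jointly labels the spatial and quality scalability layers, and in the 3D extension layer_id labels views and depth.
04. A 3-bit TID (nuh_temporal_id_plus1; TemporalId is this value minus 1), which indicates the temporal sub-layer the HEVC access unit belongs to.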
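The two-byte HEVC header (1 + 6 + 6 + 3 bits) can be unpacked as in the sketch below. The class and function names are assumptions made for this example. The last field is stored in the bitstream as nuh_temporal_id_plus1, so the code subtracts 1 to obtain the temporal ID.

```python
from typing import NamedTuple


class HevcNalHeader(NamedTuple):
    forbidden_zero_bit: int
    nal_unit_type: int
    nuh_layer_id: int
    temporal_id: int


def parse_hevc_nal_header(header: bytes) -> HevcNalHeader:
    """Unpack the 16-bit HEVC NAL unit header (illustrative helper)."""
    if len(header) < 2:
        raise ValueError("the HEVC NAL header is two bytes long")
    value = (header[0] << 8) | header[1]
    return HevcNalHeader(
        forbidden_zero_bit=value >> 15,        # 1 bit
        nal_unit_type=(value >> 9) & 0x3F,     # 6 bits
        nuh_layer_id=(value >> 3) & 0x3F,      # 6 bits
        temporal_id=(value & 0x07) - 1,        # 3 bits, stored as temporal ID plus 1
    )


# 0x40 0x01 = 0100 0000 0000 0001: type 32 (VPS), layer 0, temporal ID 0
print(parse_hevc_nal_header(bytes([0x40, 0x01])))
```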

NAL unit types in HEVC:
[Figure 7]

III. Bitstream analysis with software

First, compress the BasketballDrill test sequence with HM.

The configuration is as follows:

#======== File I/O ===============
InputFile                     : C:\Users\梁昊霖\Desktop\HM-16.20\BasketballDrill_832x480_50.yuv
InputBitDepth                 : 8           # Input bitdepth
InputChromaFormat             : 420         # Ratio of luminance to chrominance samples
FrameRate                     : 50          # Frame Rate per second
FrameSkip                     : 0           # Number of frames to be skipped in input
SourceWidth                   : 832         # Input  frame width
SourceHeight                  : 480         # Input  frame height
FramesToBeEncoded             : 5        # Number of frames to be coded

Level                         : 3.1


Low-delay mode is used, i.e. every frame except the first is a B frame.

#======== File I/O =====================
BitstreamFile                 : 50LP.bin
ReconFile                     : 50LP.yuv

#======== Profile ================
Profile                       : main

#======== Unit definition ================
MaxCUWidth                    : 64          # Maximum coding unit width in pixel
MaxCUHeight                   : 64          # Maximum coding unit height in pixel
MaxPartitionDepth             : 4           # Maximum coding unit depth
QuadtreeTULog2MaxSize         : 5           # Log2 of maximum transform size for
                                            # quadtree-based TU coding (2...6)
QuadtreeTULog2MinSize         : 2           # Log2 of minimum transform size for
                                            # quadtree-based TU coding (2...6)
QuadtreeTUMaxDepthInter       : 3
QuadtreeTUMaxDepthIntra       : 3

#======== Coding Structure =============
IntraPeriod                   : -1          # Period of I-Frame ( -1 = only first)
DecodingRefreshType           : 0           # Random Access 0:none, 1:CRA, 2:IDR, 3:Recovery Point SEI
GOPSize                       : 4           # GOP Size (number of B slice = GOPSize-1)
ReWriteParamSetsFlag          : 1           # Write parameter sets with every IRAP

IntraQPOffset                 : -1 
LambdaFromQpEnable            : 1           # see JCTVC-X0038 for suitable parameters for IntraQPOffset, QPoffset, QPOffsetModelOff, QPOffsetModelScale when enabled
#        Type POC QPoffset QPOffsetModelOff QPOffsetModelScale CbQPoffset CrQPoffset QPfactor tcOffsetDiv2 betaOffsetDiv2 temporal_id #ref_pics_active #ref_pics reference pictures     predict deltaRPS #ref_idcs reference idcs
Frame1:  B    1   5       -6.5                      0.2590         0          0          1.0      0            0               0           4                4         -1 -5 -9 -13       0
Frame2:  B    2   4       -6.5                      0.2590         0          0          1.0      0            0               0           4                4         -1 -2 -6 -10       1      -1       5         1 1 1 0 1
Frame3:  B    3   5       -6.5                      0.2590         0          0          1.0      0            0               0           4                4         -1 -3 -7 -11       1      -1       5         0 1 1 1 1
Frame4:  B    4   1        0.0                      0.0            0          0          1.0      0            0               0           4                4         -1 -4 -8 -12       1      -1       5         0 1 1 1 1

#=========== Motion Search =============
FastSearch                    : 1           # 0:Full search  1:TZ search
SearchRange                   : 64          # (0: Search range is a Full frame)
BipredSearchRange             : 4           # Search range for bi-prediction refinement
HadamardME                    : 1           # Use of hadamard measure for fractional ME
FEN                           : 1           # Fast encoder decision
FDM                           : 1           # Fast Decision for Merge RD cost

#======== Quantization =============
QP                            : 32          # Quantization parameter(0-51)
MaxDeltaQP                    : 0           # CU-based multi-QP optimization
MaxCuDQPDepth                 : 0           # Max depth of a minimum CuDQP for sub-LCU-level delta QP
DeltaQpRD                     : 0           # Slice-based multi-QP optimization
RDOQ                          : 1           # RDOQ
RDOQTS                        : 1           # RDOQ for transform skip
SliceChromaQPOffsetPeriodicity: 0           # Used in conjunction with Slice Cb/Cr QpOffsetIntraOrPeriodic. Use 0 (default) to disable periodic nature.
SliceCbQpOffsetIntraOrPeriodic: 0           # Chroma Cb QP Offset at slice level for I slice or for periodic inter slices as defined by SliceChromaQPOffsetPeriodicity. Replaces offset in the GOP table.
SliceCrQpOffsetIntraOrPeriodic: 0           # Chroma Cr QP Offset at slice level for I slice or for periodic inter slices as defined by SliceChromaQPOffsetPeriodicity. Replaces offset in the GOP table.

#=========== Deblock Filter ============
LoopFilterOffsetInPPS         : 1           # Dbl params: 0=varying params in SliceHeader, param = base_param + GOP_offset_param; 1 (default) =constant params in PPS, param = base_param)
LoopFilterDisable             : 0           # Disable deblocking filter (0=Filter, 1=No Filter)
LoopFilterBetaOffset_div2     : 0           # base_param: -6 ~ 6
LoopFilterTcOffset_div2       : 0           # base_param: -6 ~ 6
DeblockingFilterMetric        : 0           # blockiness metric (automatically configures deblocking parameters in bitstream). Applies slice-level loop filter offsets (LoopFilterOffsetInPPS and LoopFilterDisable must be 0)

#=========== Misc. ============
InternalBitDepth              : 8           # codec operating bit-depth

#=========== Coding Tools =================
SAO                           : 1           # Sample adaptive offset  (0: OFF, 1: ON)
AMP                           : 1           # Asymmetric motion partitions (0: OFF, 1: ON)
TransformSkip                 : 1           # Transform skipping (0: OFF, 1: ON)
TransformSkipFast             : 1           # Fast Transform skipping (0: OFF, 1: ON)
SAOLcuBoundary                : 0           # SAOLcuBoundary using non-deblocked pixels (0: OFF, 1: ON)

#============ Slices ================
SliceMode                : 0                # 0: Disable all slice options.
                                            # 1: Enforce maximum number of LCU in a slice,
                                            # 2: Enforce maximum number of bytes in a slice
                                            # 3: Enforce maximum number of tiles in a slice
SliceArgument            : 1500             # Argument for 'SliceMode'.
                                            # If SliceMode==1 it represents max. SliceGranularity-sized blocks per slice.
                                            # If SliceMode==2 it represents max. bytes per slice.
                                            # If SliceMode==3 it represents max. tiles per slice.

LFCrossSliceBoundaryFlag : 1                # In-loop filtering, including ALF and DB, is across or not across slice boundary.
                                            # 0:not across, 1: across

#============ PCM ================
PCMEnabledFlag                      : 0                # 0: No PCM mode
PCMLog2MaxSize                      : 5                # Log2 of maximum PCM block size.
PCMLog2MinSize                      : 3                # Log2 of minimum PCM block size.
PCMInputBitDepthFlag                : 1                # 0: PCM bit-depth is internal bit-depth. 1: PCM bit-depth is input bit-depth.
PCMFilterDisableFlag                : 0                # 0: Enable loop filtering on I_PCM samples. 1: Disable loop filtering on I_PCM samples.

#============ Tiles ================
TileUniformSpacing                  : 0                # 0: the column boundaries are indicated by TileColumnWidth array, the row boundaries are indicated by TileRowHeight array
                                                       # 1: the column and row boundaries are distributed uniformly
NumTileColumnsMinus1                : 0                # Number of tile columns in a picture minus 1
TileColumnWidthArray                : 2 3              # Array containing tile column width values in units of CTU (from left to right in picture)   
NumTileRowsMinus1                   : 0                # Number of tile rows in a picture minus 1
TileRowHeightArray                  : 2                # Array containing tile row height values in units of CTU (from top to bottom in picture)

LFCrossTileBoundaryFlag             : 1                # In-loop filtering is across or not across tile boundary.
                                                       # 0:not across, 1: across 

#============ WaveFront ================
WaveFrontSynchro                    : 0                # 0:  No WaveFront synchronisation (WaveFrontSubstreams must be 1 in this case).
                                                       # >0: WaveFront synchronises with the LCU above and to the right by this many LCUs.

#=========== Quantization Matrix =================
ScalingList                   : 0                      # ScalingList 0 : off, 1 : default, 2 : file read
ScalingListFile               : scaling_list.txt       # Scaling List file name. If the file does not exist, the default matrix is used.

#============ Lossless ================
TransquantBypassEnable     : 0                         # Value of PPS flag.
CUTransquantBypassFlagForce: 0                         # Force transquant bypass mode, when transquant_bypass_enable_flag is enabled

#============ Rate Control ======================
RateControl                         : 0                # Rate control: enable rate control
TargetBitrate                       : 1000000          # Rate control: target bitrate, in bps
KeepHierarchicalBit                 : 2                # Rate control: 0: equal bit allocation; 1: fixed ratio bit allocation; 2: adaptive ratio bit allocation
LCULevelRateControl                 : 1                # Rate control: 1: LCU level RC; 0: picture level RC
RCLCUSeparateModel                  : 1                # Rate control: use LCU level separate R-lambda model
InitialQP                           : 0                # Rate control: initial QP
RCForceIntraQP                      : 0                # Rate control: force intra QP to be equal to initial QP

### DO NOT ADD ANYTHING BELOW THIS LINE ###
### DO NOT DELETE THE EMPTY LINE BELOW ###

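Next, analyze the bitstream in the encoded binary file:
[Figure 8]
Only the first frame is an I frame, and it has the highest quality; all the other frames are B frames.
Among the B frames, a higher-quality B frame appears every 4 frames, because the configuration file sets:

GOPSize                       : 4           # GOP Size (number of B slice = GOPSize-1)
and, in the GOP table of the same file, Frame4 has a QPoffset of 1 while Frame1 to Frame3 use 5, 4, and 5, so the last frame of each 4-frame GOP is quantized more finely.
(Open questions: is the number of B slices equal to the GOP size minus 1 because the first slice is an I slice? Does a GOP not necessarily end with an I frame?)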
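To make the per-frame quality pattern concrete, the sketch below computes a nominal QP for each of the 5 encoded frames from the values in the configuration above (QP 32, IntraQPOffset -1, and the QPoffset column 5, 4, 5, 1 of the GOP table). It is a simplification under stated assumptions: it ignores the QPOffsetModelOff and QPOffsetModelScale terms and any lambda adjustment, and it assumes IntraQPOffset is simply added to the base QP for the I frame, so the QPs the encoder actually reports may differ. The function name `nominal_qp` is made up for this example.

```python
BASE_QP = 32                   # QP in the config
INTRA_QP_OFFSET = -1           # IntraQPOffset in the config
GOP_QP_OFFSETS = [5, 4, 5, 1]  # QPoffset column for Frame1..Frame4
FRAMES_TO_BE_ENCODED = 5


def nominal_qp(poc: int) -> int:
    """Base QP plus table offset; model-based offset terms are deliberately ignored."""
    if poc == 0:
        return BASE_QP + INTRA_QP_OFFSET
    return BASE_QP + GOP_QP_OFFSETS[(poc - 1) % len(GOP_QP_OFFSETS)]


for poc in range(FRAMES_TO_BE_ENCODED):
    frame_type = "I" if poc == 0 else "B"
    print(poc, frame_type, nominal_qp(poc))
```

Under these assumptions the I frame gets the lowest QP, and POC 4 (the last frame of the GOP) gets a lower QP than POC 1 to 3, which matches the pattern seen in the analyzer.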

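An I frame contains only I slices, a P frame only P slices, and a B frame only B slices.
An I slice contains only intra-coded (I) blocks; a P slice may contain both P blocks and I blocks; likewise, a B slice may contain both B blocks and I blocks. (H.264 calls these blocks macroblocks; in HEVC the corresponding unit is the coding unit, CU.)

In this stream, frames 4 and 5 both contain intra-coded blocks, and the higher the intra share, the higher the quality of the B frame.
[Figure 9]
[Figure 10]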

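View the bitstream in hexadecimal:
[Figure 11]
The boxed column gives the start address of each row. Each address holds one byte, i.e. two hexadecimal digits. Take EF: E is 14 in decimal, or 1110 in binary, and F is 15 in decimal, or 1111 in binary, so [1110 1111] is stored in one address unit.
[Figure 12]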
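The hex-to-binary conversion for one byte can be checked with a short snippet; the helper name `hex_byte_to_bits` is made up for this example.

```python
def hex_byte_to_bits(hex_byte: str) -> str:
    """Return the 8 bits of a two-digit hex byte, grouped as two nibbles."""
    value = int(hex_byte, 16)
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a value that fits in one byte")
    bits = format(value, "08b")
    return bits[:4] + " " + bits[4:]


print(hex_byte_to_bits("EF"))  # 1110 1111
```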

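With other, packet-based transport protocols, one UDP packet is one NAL unit, so the decoder can easily detect NAL boundaries and decode. In the byte stream format, however, NAL units are written out as a continuous stream of bytes, and the decoder cannot tell where each NAL starts and ends. A start code prefix, 0x000001, is therefore defined for synchronization; it is marked with red boxes in the figure above.
NAL units are separated by 0x000001, and the NAL header follows immediately after the start code prefix. For example, 40 01 in binary is 0100 0000 0000 0001; compare this with the NAL unit header structure:
[Figure 13]
The nal_unit_type is the bracketed part of 0[100 000]0 0000 0001. Converting the bits in [ ] to decimal gives 32, which is the NAL type for VPS. In the same way, the next two NALUs turn out to be SPS and PPS.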
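The same walk through the byte stream can be automated. The sketch below is a minimal illustration, not a full parser: it splits a buffer on the 0x000001 start code prefix and reads nal_unit_type from the first header byte. The function names and the sample bytes are made up for the example (only the 40 01, 42 01, and 44 01 headers mirror the stream discussed here), and emulation prevention bytes are not removed.

```python
START_CODE = b"\x00\x00\x01"
NAL_NAMES = {32: "VPS", 33: "SPS", 34: "PPS"}  # only the types discussed here


def split_annexb(stream: bytes) -> list:
    """Split a byte stream into NAL units at each 0x000001 start code prefix."""
    payload_starts = []
    pos = stream.find(START_CODE)
    while pos != -1:
        payload_starts.append(pos + len(START_CODE))
        pos = stream.find(START_CODE, pos + len(START_CODE))
    units = []
    for n, begin in enumerate(payload_starts):
        if n + 1 < len(payload_starts):
            end = payload_starts[n + 1] - len(START_CODE)
        else:
            end = len(stream)
        # Zero bytes before the next prefix (e.g. a 4-byte 00 00 00 01) are not part of the unit.
        unit = stream[begin:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
    return units


def nal_unit_type(unit: bytes) -> int:
    """nal_unit_type is the 6 bits after the forbidden bit in the first header byte."""
    return (unit[0] >> 1) & 0x3F


# Made-up sample: three short NAL units with headers 40 01, 42 01, 44 01.
sample = bytes.fromhex("00000001 4001 0c01 00000001 4201 0101 00000001 4401 c1")
for unit in split_annexb(sample):
    t = nal_unit_type(unit)
    print(unit[:2].hex(), t, NAL_NAMES.get(t, "other"))
```

To try it on a real file, read the encoder output in binary mode, for example `open("50LP.bin", "rb").read()`, and pass the bytes to `split_annexb`.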
