GOP
QP
Bit Rate
PSNR
Definitions:
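Ø GOP (Group of Pictures)
The GOP strategy affects coding quality. GOP stands for group of pictures: one GOP is a run of consecutive pictures. MPEG coding divides pictures (that is, frames) into three types, I, P and B. I is an intra-coded frame, P is a forward-predicted frame, and B is a bidirectionally interpolated frame. Put simply, an I-frame is a complete picture, while P-frames and B-frames record changes relative to the I-frame. Without the I-frame, the P-frames and B-frames cannot be decoded, which is why MPEG formats are hard to cut frame-accurately, and why the head and tail of a clip have to be fine-tuned. The longer the GOP, the larger the share of predicted frames (B-frames in particular), and in general the better the rate-distortion performance of the encoding.
In video coding, a group of pictures specifies the order in which intra- and inter-frames are arranged.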
The GOP is a group of successive pictures within a coded video stream. Each coded video stream consists of successive GOPs. From the pictures contained in it, the visible frames are generated.
A GOP can contain the following picture types:
§ I-picture or I-frame (intra coded picture) - reference picture, which represents a fixed image and which is independent of other picture types. Each GOP begins with this type of picture.
§ P-picture or P-frame (predictive coded picture) - contains motion-compensated difference information from the preceding I- or P-frame.
§ B-picture or B-frame (bidirectionally predictive coded picture) - contains difference information from the preceding and following I- or P-frame within a GOP.
§ D-picture or D-frame (DC direct coded picture) - used for fast-forward preview (MPEG-1 only).
A GOP always begins with an I-frame. Several P-frames follow, each a few frames apart from the previous anchor frame. The gaps in between are filled with B-frames. A few video codecs allow more than one I-frame in a GOP.
The I-frames contain the full image and do not require any additional information to reconstruct it. Therefore any errors within the GOP structure are corrected by the next I-frame. B-frames propagate errors only in codecs such as H.264, where B-frames can be referenced by other pictures in order to increase compression efficiency.
The more I-frames the video stream has, the more editable it is. However, having more I-frames increases the stream size. In order to save bandwidth and disk space, videos prepared for internet broadcast often have only one I-frame per GOP.
The GOP structure is often referred to by two numbers, for example M=3, N=12. The first gives the distance between two anchor frames (I or P). The second gives the distance between two full images (I-frames), which is the GOP length (that is, the GOP length is the distance from one I-frame to the next). For the example M=3, N=12, the GOP structure is IBBPBBPBBPBBI, where the final I is the first frame of the next GOP.
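As a small illustration of the M/N notation, the sketch below builds the display-order frame types of one GOP. The function name gop_pattern is made up for these notes, and the rule "every M-th frame is an anchor" is the simple fixed structure described above, not what an adaptive encoder necessarily produces.

```python
def gop_pattern(m: int, n: int) -> str:
    """Return the display-order frame types of one GOP.

    m: distance between anchor frames (I or P)
    n: GOP length, i.e. distance from one I-frame to the next
    """
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive")
    frames = []
    for i in range(n):
        if i == 0:
            frames.append("I")   # every GOP starts with an I-frame
        elif i % m == 0:
            frames.append("P")   # an anchor frame every m positions
        else:
            frames.append("B")   # the gaps between anchors are B-frames
    return "".join(frames)


# One GOP for M=3, N=12, followed by the I-frame that opens the next GOP.
print(gop_pattern(3, 12) + "I")  # IBBPBBPBBPBBI
```

With M=1 there are no B-frames at all, so gop_pattern(1, 4) gives IPPP.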
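QP (quantization parameter)
Surprisingly, Wikipedia has no explanation of this term, at least not yet. I had to look it up elsewhere; the explanation is as follows.
In the H.264 codec, the quantization parameter QP and the quantization step size Qstep are related like this:
Qstep takes 52 values in total (for luma coding).
QP is the index of Qstep and ranges from 0 to 51.
QP = 0, the minimum, means the finest quantization; QP = 51, the maximum, means the coarsest.
Qstep grows with QP: every time QP increases by 6, Qstep doubles.
For chroma coding, the maximum QP is 39.
In my depth-video experiments I used QP values of 22, 27, 32 and 37; in the results, 22 is the sharpest and 37 the blurriest.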
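The "doubles every 6" rule can be sketched in a few lines. The six base step sizes for QP 0 to 5 below are the values commonly tabulated for H.264 luma; they are not from the notes above, so treat them as an assumption of this sketch. The names BASE_QSTEP and qstep are made up for illustration.

```python
# Qstep for QP 0..5 as commonly tabulated for H.264 luma (assumed here).
BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qstep(qp: int) -> float:
    """Quantization step size for a luma QP in 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    # The base value repeats every 6 QPs, doubling each time.
    return BASE_QSTEP[qp % 6] * 2 ** (qp // 6)


# The four QPs used in the depth-video experiments.
for qp in (22, 27, 32, 37):
    print(qp, qstep(qp))
```

Under these assumed base values, going from QP 22 to QP 37 makes the step size several times larger, which is consistent with QP 37 looking the blurriest.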
Bit Rate
In telecommunications and computing, bitrate (sometimes written bit rate, data rate or as a variable R[1]) is the number of bits that are conveyed or processed per unit of time.
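Bit rate is the number of data bits transferred per unit of time; the unit we usually use is kbps, kilobits per second. Loosely speaking, it can be thought of like a sampling rate: the more bits spent per unit of time, the higher the precision and the closer the processed file is to the original, so the picture carries more detail, but the compression ratio is correspondingly lower.
Bit rate x time = total size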
Multimedia encoding
In digital multimedia, bit rate often refers to the number of bits used per unit of playback time to represent a continuous medium such as audio or video after source coding (data compression). The encoding bit rate of a multimedia file is the size of the file in bytes divided by the playback time of the recording (in seconds), multiplied by eight.
For realtime streaming multimedia, the encoding bit rate is the goodput that is required to avoid interruption:
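The two relations above (bit rate x time = total size, and file size in bytes x 8 / playback time = encoding bit rate) are simple enough to check with a sketch. The function names and the example numbers are made up for illustration.

```python
def encoding_bit_rate_bps(size_bytes: float, duration_s: float) -> float:
    """Average encoding bit rate in bits per second."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_s


def total_size_bytes(bit_rate_bps: float, duration_s: float) -> float:
    """File size in bytes implied by a bit rate held for duration_s seconds."""
    return bit_rate_bps * duration_s / 8


# A hypothetical 45,000,000-byte clip that plays for 60 seconds.
rate = encoding_bit_rate_bps(45_000_000, 60)
print(rate / 1000, "kbps")         # 6000.0 kbps
print(total_size_bytes(rate, 60))  # 45000000.0
```

Note that kbps here means 1000 bits per second, while file sizes are often shown in units of 1024 bytes, so numbers reported by different tools may differ slightly.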
Encoding bit rate = Required goodput
The term average bitrate is used in case of variable bitrate multimedia source coding schemes. In this context, the peak bit rate is the maximum number of bits required for any short-term block of compressed data.[12]
A theoretical lower bound for the encoding bit rate for lossless data compression is the source information rate, also known as the entropy rate.
Entropy rate ≤ Multimedia bit rate
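To get a feel for this bound, the sketch below estimates the entropy of a byte string from its symbol frequencies. This is only a zeroth-order estimate: it equals the entropy rate only if the bytes are independent and identically distributed, which real audio and video are not, so it illustrates the idea rather than giving a true bound for such data. The function name is made up for illustration.

```python
import math
from collections import Counter


def empirical_entropy_bits(data: bytes) -> float:
    """Zeroth-order entropy estimate in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # H = -sum(p * log2(p)) over the observed byte values.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


print(empirical_entropy_bits(b"aabb"))  # 1.0 bit per byte
print(empirical_entropy_bits(b"abcd"))  # 2.0 bits per byte
```

Multiplying the bits per symbol by the number of symbols per second gives a rate in bits per second, which can be compared with the encoding bit rate.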
PSNR (Peak signal-to-noise ratio)
The phrase peak signal-to-noise ratio, often abbreviated PSNR, is an engineering term for the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation. Because many signals have a very wide dynamic range, PSNR is usually expressed in terms of the logarithmic decibel scale.
The PSNR is most commonly used as a measure of quality of reconstruction of lossy compression codecs (e.g., for image compression). The signal in this case is the original data, and the noise is the error introduced by compression. When comparing compression codecs, PSNR is only an approximation to human perception of reconstruction quality. A higher PSNR would normally indicate that the reconstruction is of higher quality, but in some cases one reconstruction may appear closer to the original than another even though it has a lower PSNR. One has to be extremely careful with the range of validity of this metric; it is only conclusively valid when it is used to compare results from the same codec (or codec type) and same content. In short, the higher the PSNR, the lower the distortion.
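The notes above do not spell out the formula. The usual definition is PSNR = 10 * log10(MAX^2 / MSE), where MAX is the largest possible sample value (255 for 8-bit data) and MSE is the mean squared error between original and reconstruction. A minimal sketch over flat lists of samples, with a made-up function name and made-up sample values:

```python
import math


def psnr(original, reconstructed, max_value=255):
    """PSNR in dB between two equal-length sequences of sample values."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two signals.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals: no noise at all
    return 10 * math.log10(max_value ** 2 / mse)


orig = [100, 100, 100, 100]
print(psnr(orig, [101, 99, 101, 99]))   # small error, higher PSNR
print(psnr(orig, [110, 90, 110, 90]))   # larger error, lower PSNR
```

For real frames the same computation would run over every pixel (typically of the luma plane), and the per-frame values are then averaged over the sequence.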