1. Video Stream Structure
l Stream: a coded video stream consists of successive GOPs
l GOP: a group of successive pictures
l Picture: equivalent to a frame (I frame, B frame, or P frame)
l Slice: a group of macro blocks
l Macro Block: consists of two or more blocks of pixels. In MPEG-2 and other early codecs, a macro block covers 16x16 luma pixels and is built from 8x8 blocks. H.263 and H.264 also use a 16x16 macro block; H.264 can additionally split it into partitions as small as 4x4 pixels.
l Structure Diagram
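As a rough stand-in for the hierarchy listed above, the nesting can be sketched with plain data classes. All class and field names here are illustrative assumptions, not part of any codec API, and real bitstreams carry far more header information at every level.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MacroBlock:
    # 16x16 is the macro block size used by H.263 and H.264 (assumed default).
    width: int = 16
    height: int = 16


@dataclass
class Slice:
    macro_blocks: List[MacroBlock] = field(default_factory=list)


@dataclass
class Picture:
    frame_type: str  # "I", "P" or "B"
    slices: List[Slice] = field(default_factory=list)


@dataclass
class GOP:
    pictures: List[Picture] = field(default_factory=list)


@dataclass
class Stream:
    gops: List[GOP] = field(default_factory=list)


def count_pictures(stream: Stream) -> int:
    """Total number of pictures across all GOPs in the stream."""
    return sum(len(gop.pictures) for gop in stream.gops)


# Hypothetical stream: two GOPs, each with one I, one B and one P picture.
example_stream = Stream(
    gops=[GOP(pictures=[Picture("I"), Picture("B"), Picture("P")]) for _ in range(2)]
)
```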
2. I Frame, B Frame, P Frame
l I Frame: intra coded frames. All the information needed to reconstruct an I Frame access unit is contained in that access unit. It is conceptually similar to a standalone JPEG image.
l P Frame: forward predictive coded frames (inter frames). Macro-blocks in a P-Frame refer to an access unit in the past (a preceding I or P frame).
l B Frame: bidirectional predictive coded frames (inter frames). Macro-blocks in a B-Frame can refer to both an access unit in the past and an access unit in the future (in display order).
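One consequence of these dependencies is that frames cannot always be decoded in the order they are displayed: a B frame needs its future reference first. The sketch below reorders a display-order sequence into a plausible decode order, assuming the classic MPEG-2 style pattern in which each B frame depends on the nearest I or P frame on either side. The function name and frame labels are illustrative.

```python
def display_to_decode_order(display_order):
    """Reorder frame types from display order to a plausible decode order.

    Assumes each B frame references the nearest I/P frame before and after it,
    so the following I/P frame must be decoded before the B frames it encloses.
    """
    decode_order = []
    pending_b = []  # B frames waiting for their future reference
    for index, frame_type in enumerate(display_order):
        label = f"{frame_type}{index}"
        if frame_type == "B":
            pending_b.append(label)
        else:
            # An I or P frame: decode it first, then the B frames that needed it.
            decode_order.append(label)
            decode_order.extend(pending_b)
            pending_b.clear()
    # Trailing B frames would really wait for the next GOP; emit them last here.
    decode_order.extend(pending_b)
    return decode_order


print(display_to_decode_order("IBBPBBP"))
```

For the display order I B B P B B P, the P frame at position 3 moves ahead of the two B frames that precede it on screen, and likewise for the P frame at position 6.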
3. CBR, VBR, ABR
l CBR
² Constant bit rate encoding means that the rate at which a codec's output data should be consumed is constant.
² CBR is not the optimal choice for storage: it does not allocate enough data to complex sections (resulting in degraded quality) while wasting data on simple sections.
l VBR
² VBR allows a higher bitrate (and therefore more storage space) to be allocated to the more complex segments of media files while less space is allocated to less complex segments.
² The advantages of VBR are that it produces a better quality-to-space ratio compared to a CBR file of the same data. The bits available are used more flexibly to encode the sound or video data more accurately, with fewer bits used in less demanding passages and more bits used in difficult-to-encode passages.
² The disadvantages are that it may take more time to encode, as the process is more complex, and that some hardware might not be compatible with VBR files.
l ABR
² It can have higher bitrate and lower bitrate parts, and the average bitrate for a certain timeframe is obtained by dividing the number of bits used during the timeframe by the number of seconds in the timeframe.
² As it is a form of variable bitrate, this allows more complex portions of the material to use more bits and less complex areas to use fewer bits.
² ABR encoding is desirable for users who want the general benefits of VBR encoding (an optimum bitrate from frame to frame) but with a relatively predictable file size.
l At a given bitrate, VBR is usually higher quality than ABR, which is higher quality than CBR (constant bitrate).
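The average-bitrate definition above (bits used during a timeframe divided by the seconds in that timeframe) can be checked with a few lines of arithmetic. This is a minimal sketch; the two bitrate profiles are made-up numbers chosen only to show that a constant and a variable profile with the same average produce the same file size.

```python
def average_bitrate(bits_per_interval, interval_seconds=1.0):
    """Average bitrate in bits per second over the whole timeframe."""
    total_bits = sum(bits_per_interval)
    total_seconds = len(bits_per_interval) * interval_seconds
    return total_bits / total_seconds


def estimated_size_bytes(avg_bitrate_bps, duration_seconds):
    """File size implied by an average bitrate (container overhead ignored)."""
    return avg_bitrate_bps * duration_seconds / 8


# Hypothetical profiles: bits spent in each of four one-second intervals.
cbr_profile = [2_000_000, 2_000_000, 2_000_000, 2_000_000]
vbr_profile = [500_000, 3_500_000, 1_000_000, 3_000_000]

for name, profile in (("CBR", cbr_profile), ("VBR/ABR", vbr_profile)):
    avg = average_bitrate(profile)
    size = estimated_size_bytes(avg, len(profile))
    print(f"{name}: average {avg:.0f} bit/s, about {size:.0f} bytes")
```

Both profiles average 2 Mbit/s, so the file size is equally predictable; the variable profile simply spends its bits where the content is harder to encode.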
4. 1-Pass vs. 2-Pass
l One-pass VBR analyzes the video through a moving window. That is, the encoder looks ahead a second or two, decides how complex the video is, and allocates bits based on that information. Because it only looks a short time ahead, it has no idea how complex the video is much later in the movie, so the allocation is not as optimal as it could be.
l Two pass VBR analyzes the entire movie for video complexity in the first pass. It then stores this information. It can take advantage of knowing the variation in complexity of the entire movie and allocate bits more effectively than the 1-pass approach.
l For example, perhaps the beginning of the movie is all very high action but the end is people sitting quietly talking. The high action is much more difficult to encode and should be allocated many more bits than the talking part of the movie. The two pass algorithm can take advantage of knowing the complexity of the entire movie while the one pass cannot.
l Single-pass encoding is used when the encoding speed is most important - e.g. for real-time encoding.
l Two pass encoding is used when the encoding quality is most important. Two pass encoding cannot be used in real-time encoding, live broadcast or live streaming.
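The difference between the two approaches can be illustrated with a toy model. This is only a sketch under strong assumptions: "complexity" is a made-up per-segment score, the two-pass allocator splits the budget in proportion to it, and the one-pass allocator compares each segment only to the average of a short lookahead window. Real encoders use far more elaborate rate control, including feedback that keeps a one-pass encode on budget.

```python
def two_pass_allocation(complexities, total_bits):
    """Split the whole budget in proportion to globally known complexity."""
    total_complexity = sum(complexities)
    return [total_bits * c / total_complexity for c in complexities]


def one_pass_allocation(complexities, total_bits, lookahead):
    """Scale a flat per-segment target by complexity relative to a short window."""
    target = total_bits / len(complexities)
    allocation = []
    for i, c in enumerate(complexities):
        window = complexities[i:i + lookahead]  # all the encoder can see
        local_mean = sum(window) / len(window)
        allocation.append(target * c / local_mean)
    return allocation


# Hypothetical movie: high action at the start, quiet talking at the end.
complexities = [8, 8, 8, 8, 1, 1, 1, 1]
budget = 36.0

print(two_pass_allocation(complexities, budget))
print(one_pass_allocation(complexities, budget, lookahead=2))
```

In this toy example the two-pass allocator gives the action segments eight times the bits of the talking segments, while the one-pass allocator, seeing only two segments at a time, treats most of them as average.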
5. 3:2 Pull Down
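l 3:2 pulldown shows 24 fps film on 29.97 fps NTSC video: the film is slowed by 0.1% to about 23.976 fps, and every 4 film frames are then spread over 10 video fields (5 interlaced frames)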
l Implement
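The field cadence can be sketched as follows. This is a minimal illustration that assumes the 2:3 variant of the cadence (first film frame held for 2 fields, the next for 3, and so on) and ignores top/bottom field parity; the function name is illustrative.

```python
def pulldown_2_3(film_frames):
    """Map film frames to interlaced video frames using a 2:3 field cadence.

    Each video frame is returned as a (first_field, second_field) pair naming
    the film frame each field was taken from.
    """
    fields = []
    for index, frame in enumerate(film_frames):
        repeat = 2 if index % 2 == 0 else 3  # alternate 2 fields, 3 fields
        fields.extend([frame] * repeat)
    # Pair consecutive fields into video frames; a leftover odd field is dropped.
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


print(pulldown_2_3(["A", "B", "C", "D"]))
```

Four film frames A, B, C, D become five video frames: AA, BB, BC, CD, DD. Two of the five mix fields from different film frames, and one second of film (24 frames) fills 30 video frames.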
6. Why 29.97 fps Rather Than 30 fps
l In the days of black and white TV, the frame rate was 30 fps.
l When color was added, the signal had to carry extra data but stay backward compatible with black and white TVs.
l To keep the added color subcarrier from visibly interfering with the sound carrier, the frame rate was reduced slightly, from 30 frames per second to 30/1.001 (very close to 29.97) frames per second.
7. Drop Frame & Non Drop Frame
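l Drop-frame timecode keeps timecode that counts 30 frames per second in step with clock time for NTSC video running at 29.97 fps.
l (30 fps - 29.97 fps) x 60 seconds = 1.8 frames per minute. You cannot adjust by 1.8 frames per minute, but you can adjust by 18 whole frames every 10 minutes.
l Because 18 frames cannot be spread evenly over 10 minutes, drop-frame timecode skips two frame numbers at the start of every minute except each tenth minute: 9 minutes x 2 = 18 frame numbers per 10 minutes. Only frame numbers are skipped; no actual video frames are discarded. Non-drop-frame timecode counts every frame number and therefore falls behind clock time by about 108 frames (3.6 seconds) per hour.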
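The numbering rule can be sketched as a conversion from a running frame count to a drop-frame timecode label. This is a minimal sketch assuming the usual convention that frame numbers 00 and 01 are the ones skipped at the start of each affected minute, and that a semicolon before the frame field marks drop-frame timecode; the function name is illustrative.

```python
FRAMES_PER_10_MIN = 10 * 60 * 30 - 18   # 17982 real frames per 10 minutes
FRAMES_FIRST_MIN = 60 * 30              # minute 0 of each block: nothing skipped
FRAMES_OTHER_MIN = 60 * 30 - 2          # minutes 1-9: two numbers skipped


def frames_to_dropframe_timecode(frame_count):
    """Convert a zero-based frame count to an HH:MM:SS;FF drop-frame label."""
    blocks, rem = divmod(frame_count, FRAMES_PER_10_MIN)
    if rem < FRAMES_FIRST_MIN:
        skipping_minutes = 0
    else:
        skipping_minutes = 1 + (rem - FRAMES_FIRST_MIN) // FRAMES_OTHER_MIN
    # Add back the skipped numbers so the label can be split at a nominal 30 fps.
    label = frame_count + 18 * blocks + 2 * skipping_minutes
    ff = label % 30
    ss = (label // 30) % 60
    mm = (label // 1800) % 60
    hh = label // 108000
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"


for n in (1799, 1800, 17981, 17982):
    print(n, frames_to_dropframe_timecode(n))
```

Frame 1799 is labelled 00:00:59;29 and the very next frame is 00:01:00;02, which shows the two skipped numbers. At the ten-minute boundary nothing is skipped: frame 17981 is 00:09:59;29 and frame 17982 is 00:10:00;00.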