With AVC/H.264, the MPEG-4 standard defines one of the newest and technically most advanced video coding formats available.

The AVC/H.264 video coding standard was finalized in 2003 and specified identically by two groups: MPEG (Moving Picture Experts Group) from ISO, and VCEG (Video Coding Experts Group) from the ITU (International Telecommunication Union), a suborganisation of the United Nations, which also standardised the H.263 format (now mainly used in video conferencing software).

The AVC/H.264 standard itself was developed by the Joint Video Team (JVT), which included experts from both MPEG and VCEG. On the MPEG side the standard is called MPEG-4 Part 10 (ISO 14496-10); on the ITU side it is called H.264 (the ITU document number), the name by which the format is already widely known. MPEG chose Advanced Video Coding (AVC) as the "official" title for the new standard, as the video counterpart to the Advanced Audio Coding (AAC) audio format.

AVC/H.264 Profiles

The AVC/H.264 standard defines four different Profiles: Baseline, Main, Extended and High Profile (each of which is subdivided into Levels):
- Baseline Profile offers I/P-Frames, supports progressive and CAVLC only
- Extended Profile offers I/P/B/SP/SI-Frames, supports progressive and CAVLC only
- Main Profile offers I/P/B-Frames, supports progressive and interlaced, and offers CAVLC or CABAC
- High Profile (aka FRExt) adds to Main Profile: 8x8 intra prediction, custom quants, lossless video coding, more yuv formats (4:4:4...)

The most usable profile for DVD backups seems to be the High Profile, perhaps with the following tools (also check the tool description of MPEG-4 ASP, as all of those tools except GMC are available in AVC too):

CAVLC/CABAC:
AVC/H.264 defines two more advanced tools for entropy coding of the bitstream syntax (macroblock type, motion vectors, reference index...)
than MPEG-4 ASP: Context-Adaptive Variable Length Coding (CAVLC) and Context-Adaptive Binary Arithmetic Coding (CABAC).
CABAC is a more powerful compression method than CAVLC (aka UVLC), which is the default method in AVC/H.264, and is said to bring the bitrate down by a further 10-15% or so (especially at high bitrates). Like CAVLC, CABAC is a lossless method and therefore never hurts the quality, but it slows down encoding and decoding.

Loop/Deblocking Filter:
In contrast to prefiltering (done on the input, for example via AviSynth) or postprocessing/filtering (done by the decoder on the final output), loop filtering is applied during the encoding process to every single frame, after it has been encoded but before it is used as a reference for the following frames. This helps to avoid blocking artifacts, especially at low bitrates, but slows down encoding and decoding.

Variable Block Sizes/Macroblock Partitions:
In contrast to MPEG-4 ASP (where the block sizes can vary between 16x16 and 8x8 pixels, and only with Inter4V/4MV), AVC/H.264 allows a macroblock to be divided down to 4x4 pixels for motion search (including intermediate steps like 8x4...). The block size is adaptive/variable; a good encoder will be smart enough to decide which size is best for each specific macroblock.

Multiple Reference Frames:
In contrast to MPEG-4 ASP (which only allows the frame before the current frame to be used as a reference), AVC/H.264 allows choosing from multiple frames for inter motion search, which means the codec can decide whether to simply refer to the previous frame (as in ASP) or to an even earlier one. Because of that (e.g. a P-Frame can refer to a frame before the latest I-Frame), a new frame type had to be introduced: IDR-Frames, which are I-Frames that act as a barrier for references, i.e. no frame following an IDR-Frame is allowed to refer to a frame before it.
Allowing multiple reference frames slows down encoding and decoding, and cutting is only possible at IDR-Frames.

Weighted Prediction:
With Weighted Prediction, weights can be applied to a reference frame (e.g. a previous picture can be scaled brightness-wise). This helps especially with fades, where each picture is very similar to the previous one except that it is darker. WP does not help with cross-fades (e.g. a fade from one scene to another).

Rate Distortion Optimisation (RDO):
RDO allows the encoder to make the most efficient coding decision whenever it has to choose between different options (for example inter/intra decisions, motion search...).
RDO is not a tool defined by the AVC/H.264 specs, but a new decision-making approach that was first introduced by the H.264 reference software. Other codecs can also make use of RDO; XviD's VHQ mode, for example, already enables it.

An overview of AVC/H.264 compared to other popular video coding formats:

Available AVC/H.264 Codecs
AVC/H.264 implementations are currently available from x264, Nero, Apple, Sorenson, Elecard, Moonlight, VSS, mpegable, Envivio, Hdot264 (binary), DSPR, JM (reference software) (binary), ffmpeg, Philips, FastVDO, Skal, Sony and many more.

Encoders
- x264: the first publicly available High Profile encoder, opensource (GPL) (Source), available for VFW: x264vfw, ffdshow (output .avi), as commandline: x264cli (outputs .mp4, raw), mencoder (outputs raw, .avi) (Doom9's MeGUI) or ffmpeg
  x264 supports 2pass, CABAC, Loop, multiple B-Frames, B-References, multiple Reference Frames, 4x4 P-Frame, 8x8 B-Frame Blocksizes, anamorphic signalling and High Profile: 8x8 dct and intra prediction, lossless and custom quant matrices
- NeroDigital AVC: usable in Nero Recode2, outputs .mp4
  ND AVC supports 2pass, CABAC, (adaptive) Loop, multiple B-Frames, multiple Reference Frames, weighted prediction, 8x8 P-Frame Blocksizes, 16x16 B-Frame Blocksizes, Adaptive
Quant. (Psy High)
- Sorenson: usable in Sorenson Squeeze 4, outputs .mp4
  Sorenson supports 2pass, max 2 B-Frames, B-References, Loop and multiple Slices
- Apple: usable in Quicktime 7, outputs .mp4, .3gp and .mov, very slow
  uses 2pass, max 1 B-Frame, Loop (0,0), P8x8, B8x8, I4x4, Adapt. Quant, 5 Slices, no CABAC, no Weighted Pred., no multi Ref.
- JM: the AVC reference software offers in v9.3 Main and High Profile: B/SP-Frames, CABAC, Loop Filter, 4x4 Blocksizes, multiple Reference Frames, Adaptive Quant, Error Resilience, RDO, Lossless Coding, Custom Quants, Rate Control and so on...
- Hdot264: opensource (GPL) VFW version of the reference software by doom9 member charact3r, still based on a very old version of the reference (JM 4.0c)
- VSS: free preview VFW Encoder (limited to 5 days), based on the reference encoder
- Elecard: usable in Elecard Mobile Converter, outputs .mp4, and in MainConcept's v2 encoder, outputs .264 and .mpg PS/TS

not publicly available anymore:
- Moonlight: usable in Moonlight's OneClick Compressor v1.1 and CyberLink's PowerEncoder, outputs .mpg
  Moonlight supports 1pass (VBR/CBR/Fixed Quants), CABAC, Loop, 2 B-Frames, 8x8 P-Frame Sizes, Adapt. Quant, PAR, Interlacing
- MainConcept: was usable in the v1 encoder (adds a watermark), outputs .264 and .mpg PS/TS
  1pass (CBR/VBR/fixed Quants), P-Frame Reorder, CABAC, Loop, Multiple B-Vops, Multiple Ref, 4x4 P-Frame Sizes, PAR, RDO
- mpegable: offered a free VFW Encoder for some time (not based on the reference), doesn't handle YV12
  mpegable supports 1pass (fixed quants), uses P-Frames only, 8x8 P-Frame Blocksizes, CAVLC only, Loop
- Envivio: usable in 4Coder, outputs .mp4

Decoders (comparison)
- ffmpeg: opensource (LGPL), used e.g.
in ffdshow (VFW and DShow decoder), mplayer and VideoLAN
  supports B-Frames, B-References, CABAC, Loop, Weighted Prediction and High Profile (8x8 dct and intra prediction, lossless)
- Apple: AVC decoding inside Quicktime 7, supports .mp4/.mov, very slow
  supports only 1 B-Frame, CABAC and Loop, but no mixed references, no multiple B-Frames and no interlacing
- NeroDigital AVC: DShow Decoder and .mp4 Parser coming with Recode2
  supports Main and High Profile
- VSS: preview VFW Decoder (limited to 5 days) and a DShow Decoder (limited to 30 days)
  VSS DShow supports .avi (with VSSH and H264 fourcc), CABAC, Loop, B-Frames
- Elecard: available in Elecard's MPEG Player v4.0 and MainConcept's v2 encoder
- Envivio: AVC DShow decoder called EnvivioTV, not freely available, handling AVC in .mp4 (since 2.0, current version: 2-1-181)
- Philips: DShow AVC decoder freely available in the AVC Alliance player (handles raw AVC only)
- FastVDO: time limited (5 minutes per video) High Profile DShow Decoder
- Pegasus: not really compliant DShow AVC decoder available here
- Basic AVC Decoder in C, for a university project, available here

not publicly available anymore:
- Moonlight: DShow decoder/Parser handling AVC in .mpg, .mp4 and .264, available in Moonlight's MPEG Player v3.0
  supports Main and High Profile
- MainConcept: the v1 preview offered a free DShow AVC decoder (adds watermark) and Parser handling AVC as .mpg PS/TS
- mpegable: offered a free VFW decoder for some time (usable also in DShow), supports .avi (with DAVC fourcc)

Sample content
NeroDigital: mp4, mp4
Sorenson: mp4
AVC Alliance: raw
Moonlight: raw/medium bitrates, raw/low bitrates, raw, mpg
FastVDO: raw/high profile
Apple: mov
Videosoft: avi, avi/new, avi/old
Lead: ogm

Current issues with AVC/H.264
- interoperability: most implementations currently support different container formats:
  .mp4: the container for AVC defined in the MPEG-4 standard (ISO 14496-15), currently supported by Apple, Nero, Sorenson, Envivio, Elecard/Moonlight and x264
  .mpg PS/TS:
the containers for AVC defined in the MPEG-2 standard (ISO 13818-1, AMD3), currently supported by MainConcept and Elecard/Moonlight
  .avi: AVC-in-AVI is not standardized anywhere and therefore already causes incompatibilities. The limitations of AVI and VFW (e.g. regarding B-Frames or arbitrary frame coding orders), together with the hacks these two formats make necessary, hinder the full implementation of all the features AVC offers. They therefore harm the achievable quality, or at least the speed of development, interoperability and with it competition. AVI is currently used by VSS and x264 (mencoder and vfw)
  .264/.h264: the raw bitstream, not stored in a container; output for example by the reference software, x264cli, mencoder and MainConcept
- speed: some current implementations are pretty slow
  Still, x264 and NeroDigital's AVC encoder already seem to offer good speed and quality. But this doesn't change the fact that AVC is a very advanced video coding format, so encoding and decoding on old CPUs can be very time consuming.

MPEG-4 AVC/H.264 on Hardware - HD-DVD/Blu-ray
The DVD Forum and the Blu-ray Disc Association are currently working on successors to the DVD format that support High Definition content (simply larger picture sizes than current DVD): HD-DVD and BD-ROM.
As reported here, MPEG-4 AVC/H.264 will be mandatory for HD-DVD.
Blu-ray has also included MPEG-4 AVC/H.264, as written here.
It is therefore very likely that AVC/H.264 will be THE upcoming video format, widely used and supported, as is the case with MPEG-2 (used in DVD) today.

Further documentation
Read more about MPEG-4 AVC/H.264 here for a detailed overview, summarized info here, or here for a list of available implementations.
The AVC Verification Test Results can be found here.
The full AVC/H.264 specs can be downloaded here (draft from 7-14 March 2003).
Technical info about Blu-ray is available here.
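The notes above mention that cutting is only possible at IDR-Frames and that .264/.h264 files hold the raw bitstream. As a rough illustration of how the two fit together, the sketch below scans a raw stream for NAL units and reports where IDR slices start. It assumes the stream uses the Annex B byte-stream layout (units separated by 00 00 01 start codes) and that the low five bits of the byte after the start code carry the NAL unit type, with type 5 marking an IDR slice. The function names and the tiny hand-made sample are hypothetical and only there for demonstration; this is not how any of the encoders or decoders listed above are implemented.

```python
from typing import Iterator, List, Tuple

START_CODE = b"\x00\x00\x01"  # a 4-byte start code (00 00 00 01) ends with this too
NAL_TYPE_IDR = 5              # coded slice of an IDR picture


def iter_nal_units(data: bytes) -> Iterator[Tuple[int, int]]:
    """Yield (offset of the 3-byte start code, nal_unit_type) for each NAL unit."""
    pos = 0
    while True:
        pos = data.find(START_CODE, pos)
        if pos < 0 or pos + 3 >= len(data):
            return
        header = data[pos + 3]     # first byte after the start code
        yield pos, header & 0x1F   # low 5 bits hold the NAL unit type
        pos += 3


def idr_offsets(data: bytes) -> List[int]:
    """Return the offsets of all IDR slices, i.e. candidate cut points."""
    return [off for off, nal_type in iter_nal_units(data) if nal_type == NAL_TYPE_IDR]


# Hand-made sample: SPS (7), PPS (8), IDR slice (5), non-IDR slice (1), IDR slice (5).
# The payload bytes are dummies, not decodable video.
sample = (
    b"\x00\x00\x00\x01\x67\xaa\xbb"
    b"\x00\x00\x01\x68\xcc"
    b"\x00\x00\x01\x65\x88\x80"
    b"\x00\x00\x01\x41\x9a"
    b"\x00\x00\x01\x65\x11"
)
nal_types = [nal_type for _, nal_type in iter_nal_units(sample)]
cut_points = idr_offsets(sample)
```

On a real file one would read the bytes with `open(path, "rb").read()` and pass them to `idr_offsets`; a cutting tool would then restrict its cut positions to those offsets.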
Posted by rainn at 2006-03-05 01:17:59
Primer 7: The Concept of Streaming Media -[勤奋的小白兔]
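1 The concept of streaming media
An important concept in the transmission of digital video and audio is "streaming media". Streaming media means that video, audio and data are transmitted together from a source to a destination, where they can be received as a continuous real-time stream. Here the source is the server-side application, and the destination, or receiver, is the client application.
Once the server application has sent the stream data, the client application can receive it and display or play it back. Typically the client starts displaying the video or playing the audio as soon as it has received enough data and stored it in a buffer.
An important characteristic of streaming media is its sensitivity to time, which is exactly what applications with strict real-time requirements need, so it is natural that such applications are closely tied to streaming media. Whether streaming media is practical depends mainly on network bandwidth and on improvements in compression algorithms. Today, with better network protocols and advances in network infrastructure and compression technology, streaming media has become easier and easier to implement.

2 Streaming media transmission modes
There are three main transmission techniques for streaming media: point-to-point (unicast), multicast and broadcast. Multicast is also known as group broadcast. In unicast, source and destination are in a one-to-one relationship: a stream sent from one source (the server-side application) can reach only one destination (a client application). Multicast is a broadcast based on a "group": source and destinations are in a one-to-many relationship, but that relationship can only be established within the same group. In other words, once a stream has been sent from a source, any destination (client application) that has joined the same group number as the source can receive it, while destinations outside the group cannot. In broadcast, source and destinations are also in a one-to-many relationship, but it is not limited to a group: once a stream has been sent from a source, all destinations on the same network segment can receive it. Broadcast can be seen as a special case of multicast.
Broadcast and multicast matter a great deal for streaming media, because streaming data is usually very large and takes up a lot of network bandwidth. With unicast, one copy of the stream has to be sent for every destination, so the required bandwidth is proportional to the number of destinations. With broadcast or multicast, the source only has to send the stream once, and all client applications in the group or on the same network segment can receive it, which greatly reduces the bandwidth used.

3 Digital video and audio transmission technology
Digital video and audio transmission falls under streaming media transmission. Once analog video and audio signals have been converted to digital form by a capture device, the amount of data is enormous; without compression, transmitting digital video and audio over a network would be unthinkable. At the same time, digital video and audio transmission is highly time-sensitive and has strict real-time requirements, which are hard to meet without special network transport protocols. The usual approach is therefore to compress the digital video and audio at the source, send it to the destination over a network with guaranteed quality of service (QoS), such as ATM, and then decompress it at the destination for display or playback. If transmission has to go over a network without QoS guarantees, such as an IP network, then at the very least the Real-time Transport Protocol (RTP) must be used.
Many digital video and audio compression technologies exist or are under development. Different technologies have different emphases and suit different applications. Some of them have been standardized, but many have not. Commonly used standardized compression technologies include MPEG-1, MPEG-2 and H.261/H.263; MPEG-4, among others, is still under development. MPEG-1 and MPEG-2 suit high-bandwidth video and audio applications that can deliver high quality and low delay, while H.261, H.263 and the developing MPEG-4 suit low-bandwidth applications with less demanding requirements on image quality and delay.
The figure shows a schematic of how digital video and audio transmission works, covering several typical application areas currently based on digital video and audio streams. As the figure shows, different application areas build on different network technologies and different compression technologies.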
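The bandwidth argument in section 2 can be made concrete with a small calculation. This is a minimal sketch, assuming every receiver gets the same constant-bitrate stream and ignoring protocol overhead; the function names and the figures (a 4 Mbit/s stream, 50 receivers) are illustrative and not taken from the text.

```python
def unicast_source_bandwidth(bitrate_mbps: float, receivers: int) -> float:
    """Unicast: the source sends one copy per receiver, so load grows linearly."""
    return bitrate_mbps * receivers


def multicast_source_bandwidth(bitrate_mbps: float, receivers: int) -> float:
    """Multicast/broadcast: the source sends a single copy, whatever the audience size."""
    return bitrate_mbps if receivers > 0 else 0.0


# Illustrative numbers: one 4 Mbit/s stream delivered to 50 clients.
unicast_load = unicast_source_bandwidth(4.0, 50)      # 200.0 Mbit/s at the source
multicast_load = multicast_source_bandwidth(4.0, 50)  # 4.0 Mbit/s at the source
```

The saving is the factor by which the audience grows, which is why the text calls multicast and broadcast so significant for large streams.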