Vorbis Format Summary

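 For a while I could not work out how Ogg and Vorbis relate to each other; after some reading it is mostly clear. Ogg is a container, while Vorbis is an audio compression scheme, comparable to AAC or AC3, that can be encapsulated in Ogg. By convention, a file with the .ogg extension contains only Vorbis audio.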

 3 What is Vorbis

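    Vorbis is a codec, i.e. a compression scheme; by itself it provides no frame synchronization, error detection, or similar features. Once audio PCM data has been compressed into raw packets, a higher-level synchronization mechanism is still needed to transport them. Currently Ogg or RTP can be used.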

       Ogg Vorbis is a fully open, non-proprietary, patent-and-royalty-free, general-purpose compressed audio format for mid to high quality (8kHz-48.0kHz, 16+ bit, polyphonic) audio and music at fixed and variable bitrates from 16 to 128 kbps/channel. This places Vorbis in the same competitive class as audio representations such as MPEG-4 (AAC), and similar to, but higher performance than MPEG-1/2 audio layer 3, MPEG-4 audio (TwinVQ), WMA and PAC.

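       Vorbis produces its output as packets. Besides the data packets, the stream contains three header packets: the Identification Header, the Comment Header, and the Setup Header. Their order is fixed: the first must be the Identification header, the second the Comment header, and the third the Setup header. The decoder must find all three valid headers before it can decode any audio packet.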

 

       All three headers begin with the same fields:

       Common Header Decode

       Each header packet begins with the same header fields.

        1) [packet_type] : 8 bit value 

            Decode continues according to packet type; the identification header is type 1, the comment
                header type 3 and the setup header type 5 (these types are all odd as a packet with a leading
                single bit of '0' is an audio packet). The packets must occur in the order of identification,
                comment, setup.

        2) 0x76, 0x6f, 0x72, 0x62, 0x69, 0x73: the characters ’v’,’o’,’r’,’b’,’i’,’s’ as six octets
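The common header check can be sketched in a few lines of Python. This is only an illustration under the assumption that each packet is already available as a `bytes` object (for example, after being extracted from Ogg pages); the function names `parse_common_header` and `check_header_order` are made up for this example and are not part of any library.

```python
VORBIS_MAGIC = b"vorbis"  # 0x76 0x6f 0x72 0x62 0x69 0x73
HEADER_KINDS = {1: "identification", 3: "comment", 5: "setup"}


def parse_common_header(packet: bytes) -> str:
    """Return the header kind, or raise ValueError if this is not a Vorbis header."""
    if len(packet) < 7:
        raise ValueError("packet too short to be a Vorbis header")
    packet_type = packet[0]  # [packet_type]: 8 bit value
    if packet_type not in HEADER_KINDS:
        raise ValueError(f"unexpected packet type {packet_type}")
    if packet[1:7] != VORBIS_MAGIC:
        raise ValueError("missing 'vorbis' signature")
    return HEADER_KINDS[packet_type]


def check_header_order(packets) -> None:
    """Raise ValueError unless the first three packets are id, comment, setup."""
    kinds = [parse_common_header(p) for p in packets[:3]]
    if kinds != ["identification", "comment", "setup"]:
        raise ValueError(f"bad header order: {kinds}")
```

A caller would run `check_header_order` on the first three packets of a stream before handing anything to an audio decoder.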

       After these fields comes the part specific to each header, described in turn below:

 

3.1 Identification Header Packet

       The identification header is a short header of only a few fields used to declare the stream definitively as Vorbis, and provide a few externally relevant pieces of information about the audio stream. In other words, it marks the stream as Vorbis and carries basic parameters such as the audio sample rate. The identification header is coded as follows:

       1) [vorbis_version] = read 32 bits as unsigned integer
       2) [audio_channels] = read 8 bit integer as unsigned
       3) [audio_sample_rate] = read 32 bits as unsigned integer
       4) [bitrate_maximum] = read 32 bits as signed integer
       5) [bitrate_nominal] = read 32 bits as signed integer
       6) [bitrate_minimum] = read 32 bits as signed integer
       7) [blocksize_0] = 2 exponent (read 4 bits as unsigned integer)
       8) [blocksize_1] = 2 exponent (read 4 bits as unsigned integer)
       9) [framing_flag] = read one bit

 

       [vorbis_version] is to read '0' in order to be compatible with this document. Both [audio_channels] and [audio_sample_rate] must read greater than zero. Allowed final blocksize values are 64, 128, 256, 512, 1024, 2048, 4096 and 8192 in Vorbis I. [blocksize_0] must be less than or equal to [blocksize_1]. The framing bit must be nonzero. Failure to meet any of these conditions renders a stream undecodable.
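The field list and the validity rules above can be put together in a small parser. The sketch below assumes the fields are packed least-significant-bit first and little-endian, so the two 4-bit block size exponents share one byte (blocksize_0 in the low nibble) and the framing flag is the low bit of the following byte, giving a 30-byte packet. The names `IdentHeader` and `parse_identification_header` are hypothetical.

```python
import struct
from typing import NamedTuple


class IdentHeader(NamedTuple):
    vorbis_version: int
    audio_channels: int
    audio_sample_rate: int
    bitrate_maximum: int
    bitrate_nominal: int
    bitrate_minimum: int
    blocksize_0: int
    blocksize_1: int


def parse_identification_header(packet: bytes) -> IdentHeader:
    """Parse and validate an identification header; raise ValueError if undecodable."""
    if len(packet) < 30 or packet[:7] != b"\x01vorbis":
        raise ValueError("not a Vorbis identification header")
    # u32, u8, u32, three signed 32-bit bitrates, packed block sizes, framing byte
    (version, channels, rate, br_max, br_nom, br_min,
     sizes, framing) = struct.unpack_from("<IBIiiiBB", packet, 7)
    blocksize_0 = 1 << (sizes & 0x0F)  # first 4 bits read: low nibble
    blocksize_1 = 1 << (sizes >> 4)    # next 4 bits read: high nibble
    if version != 0:
        raise ValueError("unsupported vorbis_version")
    if channels == 0 or rate == 0:
        raise ValueError("channels and sample rate must be greater than zero")
    if not (64 <= blocksize_0 <= blocksize_1 <= 8192):
        raise ValueError("invalid block sizes")
    if framing & 1 == 0:
        raise ValueError("framing bit not set")
    return IdentHeader(version, channels, rate, br_max, br_nom, br_min,
                       blocksize_0, blocksize_1)
```

Because the block sizes are stored as exponents, checking the range 64 to 8192 is enough to guarantee they are one of the allowed powers of two.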

       The bitrate fields above are used only as hints. The nominal bitrate field especially may be considerably off in purely VBR streams. The fields are meaningful only when greater than zero.

  • All three fields set to the same value implies a fixed rate, or tightly bounded, nearly fixed-rate bitstream
  • Only nominal set implies a VBR or ABR stream that averages the nominal bitrate
  • Maximum and/or minimum set implies a VBR bitstream that obeys the bitrate limits
  • None set indicates the encoder does not care to speculate.             
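As a rough illustration, the four cases above could be turned into a helper that labels a stream from its three bitrate fields. The function name and the returned strings are invented for this example, and the order in which the cases are tested is one possible reading of the list, not something the spec prescribes.

```python
def describe_bitrate_hints(maximum: int, nominal: int, minimum: int) -> str:
    """Interpret the bitrate hint fields; values <= 0 count as 'not set'."""
    has_max, has_nom, has_min = maximum > 0, nominal > 0, minimum > 0
    if has_max and maximum == nominal == minimum:
        return "fixed or nearly fixed rate"
    if has_max or has_min:
        return "VBR within the given bitrate limits"
    if has_nom:
        return "VBR or ABR averaging the nominal bitrate"
    return "no hint given"
```

Since these fields are only hints, a decoder would typically use such a label for display purposes and never for deciding whether a stream is decodable.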

3.2 Comment Header Packet

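       The Comment Header Packet mainly carries descriptive information about the stream, such as author, date, and version. Its length is not limited: it may be empty, or it may occupy one or two pages in Ogg.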

       The comment header comprises the entirety of the second bitstream header packet. Unlike the first bitstream header packet, it is not generally the only packet on the second page and may not be restricted to within the second bitstream page. The length of the comment header packet is (practically) unbounded. The comment header packet is not optional; it must be present in the bitstream even if it is effectively empty.

       The comment header is encoded as follows (as per Ogg’s standard bitstream mapping which renders least-significant-bit of the word to be coded into the least significant available bit of the current bitstream octet first):

       The contents of this packet are organized as follows: first a single vendor string, then a series of comment fields.

1. Vendor string length (32 bit unsigned quantity specifying number of octets)

2. Vendor string ([vendor string length] octets coded from beginning of string to end of string, not null terminated)

3. Number of comment fields (32 bit unsigned quantity specifying number of fields)

4. Comment field 0 length (if [Number of comment fields]>0; 32 bit unsigned quantity specifying number of octets)

5. Comment field 0 ([Comment field 0 length] octets coded from beginning of string to end of string, not null terminated)

6. Comment field 1 length (if [Number of comment fields]>1...)...

       This is actually somewhat easier to describe in code; implementation of the above can be found in vorbis/lib/info.c, _vorbis_pack_comment() and _vorbis_unpack_comment().
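For readers who prefer not to dig through the C source, here is a minimal Python sketch of the same layout. It is not a port of `_vorbis_unpack_comment()`; the name `parse_comment_header` is made up, the packet is assumed to start with the 7-byte common header (type 3 plus "vorbis"), the strings are assumed to be UTF-8, and parsing stops after the last comment field without looking at anything that follows.

```python
import struct


def parse_comment_header(packet: bytes):
    """Return (vendor, [comment, ...]) from a Vorbis comment header packet."""
    if packet[:7] != b"\x03vorbis":
        raise ValueError("not a Vorbis comment header")
    pos = 7

    def read_string() -> str:
        # 32-bit little-endian length, then that many octets, not null terminated
        nonlocal pos
        if pos + 4 > len(packet):
            raise ValueError("truncated length field")
        (length,) = struct.unpack_from("<I", packet, pos)
        pos += 4
        if pos + length > len(packet):
            raise ValueError("truncated string")
        data = packet[pos:pos + length]
        pos += length
        return data.decode("utf-8", errors="replace")  # UTF-8 is an assumption

    vendor = read_string()
    if pos + 4 > len(packet):
        raise ValueError("truncated comment count")
    (count,) = struct.unpack_from("<I", packet, pos)
    pos += 4
    comments = []
    for _ in range(count):
        comments.append(read_string())
    return vendor, comments
```

Every length is checked against the remaining packet size before slicing, because the lengths come straight from the stream and the packet is practically unbounded.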

      

3.3 Setup Header Packet

      The setup header contains the codec setup information required for decoding, in the following order:

       1 lists of codebook configurations

       2 time-domain transform configurations(placeholder in Vorbis I)

       3 floor configurations

       4 residue configurations

       5 channel mapping configurations

       6 mode configurations

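       After all of the steps above, a framing flag is read. If it is 1, setup header decode is complete, and audio data packets can then be received and decoded. If the framing flag is not set, a framing error has occurred and the stream cannot be decoded.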

 

3.4 Audio Packet

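      The audio packets follow immediately after the three header packets. On receiving a packet, the first step is to check its packet type, i.e. whether it is audio. If it is not audio, or the stream is in error, the decoder ignores the packet and does not decode it. I am not familiar with the details of the internal decoding process.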
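The packet type check itself is tiny. The sketch below assumes the bitstream is packed least-significant-bit first, so the "leading single bit" is the low bit of the first byte, which is consistent with the header types 1, 3 and 5 all being odd. The names `is_audio_packet` and `audio_packets` are hypothetical.

```python
def is_audio_packet(packet: bytes) -> bool:
    """True if the first bit of the packet is 0, i.e. the packet carries audio."""
    return len(packet) > 0 and (packet[0] & 0x01) == 0


def audio_packets(packets):
    """Yield only the audio packets, silently skipping everything else."""
    for packet in packets:
        if is_audio_packet(packet):
            yield packet
```

Skipping rather than raising mirrors the behaviour described above, where a non-audio packet is simply ignored by the decoder.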

 

3.5 API Overview

       To be added.

4   References:

  1 Official site: http://www.xiph.org/vorbis/

   2 Vorbis_I_spec.pdf
