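AAC audio format: Advanced Audio Coding (AAC) is a lossy audio compression format defined by the MPEG-4 standard. It was developed by Fraunhofer, with Dolby, Sony, and AT&T as the main contributors.
Simply put, ADTS (Audio Data Transport Stream) can be decoded starting from any frame, because every frame carries its own header. ADIF (Audio Data Interchange Format) has a single header for the whole stream, so decoding has to start from the beginning of the data.
The header formats of the two also differ. At present, encoded and extracted audio streams are usually in ADTS format. The structure of each is shown below:
Every frame of an ADTS AAC file consists of an ADTS header followed by AAC audio data.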
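The difference between the two containers is already visible in the first bytes of a file. The sketch below guesses the container from a buffer. It assumes that an ADIF stream starts with the four ASCII bytes "ADIF" and that an ADTS frame starts with the 12-bit syncword 0xFFF followed by layer bits equal to 0. The names aac_probe and enum aac_container are made up for this example and do not come from any library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for this sketch; not part of FFmpeg or any other library. */
enum aac_container { AAC_UNKNOWN, AAC_ADIF, AAC_ADTS };

/* Guess the container format from the first bytes of a buffer. */
static enum aac_container aac_probe(const uint8_t *buf, size_t len)
{
    /* Assumption: ADIF data begins with the ASCII tag "ADIF". */
    if (len >= 4 && memcmp(buf, "ADIF", 4) == 0)
        return AAC_ADIF;

    /* ADTS: 12 sync bits set (0xFFF) and the 2-bit layer field equal to 0.
     * The mask 0xF6 keeps the low sync nibble and the layer bits, and
     * ignores the ID and protection_absent bits. */
    if (len >= 2 && buf[0] == 0xFF && (buf[1] & 0xF6) == 0xF0)
        return AAC_ADTS;

    return AAC_UNKNOWN;
}
```

A syncword match on two bytes can be a false positive inside arbitrary data, so a real demuxer would normally confirm the guess by checking that further frames follow at the expected offsets.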
Let's look at how the ADTS header is built.
The ADTS header consists of adts_fixed_header() and adts_variable_header(),
that is, a fixed header followed by a variable header.
/* adts_fixed_header */
put_bits(&pb, 12, 0xfff); /* syncword */
put_bits(&pb, 1, 0); /* ID */
put_bits(&pb, 2, 0); /* layer */
put_bits(&pb, 1, 1); /* protection_absent */
put_bits(&pb, 2, ctx->objecttype); /* profile_objecttype */
put_bits(&pb, 4, ctx->sample_rate_index); /* sampling_frequency_index */
put_bits(&pb, 1, 0); /* private_bit */
put_bits(&pb, 3, ctx->channel_conf); /* channel_configuration */
put_bits(&pb, 1, 0); /* original_copy */
put_bits(&pb, 1, 0); /* home */
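Let's look closely at what each field means:
syncword: 0xfff. This is the sync marker; it signals the start of an ADTS frame.
ID: identifies the MPEG version: 0 means MPEG-4, 1 means MPEG-2.
layer: always 0.
protection_absent: indicates whether CRC error checking is used: 1 means no CRC, 0 means a CRC is present.
profile: indicates which AAC profile is used; there are 4 in total: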
static const int aacenc_profiles[] = {
FF_PROFILE_AAC_MAIN,
FF_PROFILE_AAC_LOW,
FF_PROFILE_AAC_LTP,
FF_PROFILE_MPEG2_AAC_LOW,
};
AudioObjectType = profile + 1. This is because 0 in AudioObjectType stands for AOT_NULL, while profile starts counting at 0, so the two are offset by one position.
enum AudioObjectType {
AOT_NULL,
// Support? Name
AOT_AAC_MAIN, ///< Y Main
AOT_AAC_LC, ///< Y Low Complexity
AOT_AAC_SSR, ///< N (code in SoC repo) Scalable Sample Rate
AOT_AAC_LTP, ///< Y Long Term Prediction
AOT_SBR, ///< Y Spectral Band Replication
AOT_AAC_SCALABLE, ///< N Scalable
AOT_TWINVQ, ///< N Twin Vector Quantizer
AOT_CELP, ///< N Code Excited Linear Prediction
AOT_HVXC, ///< N Harmonic Vector eXcitation Coding
AOT_TTSI = 12, ///< N Text-To-Speech Interface
AOT_MAINSYNTH, ///< N Main Synthesis
AOT_WAVESYNTH, ///< N Wavetable Synthesis
AOT_MIDI, ///< N General MIDI
AOT_SAFX, ///< N Algorithmic Synthesis and Audio Effects
AOT_ER_AAC_LC, ///< N Error Resilient Low Complexity
AOT_ER_AAC_LTP = 19, ///< N Error Resilient Long Term Prediction
AOT_ER_AAC_SCALABLE, ///< N Error Resilient Scalable
AOT_ER_TWINVQ, ///< N Error Resilient Twin Vector Quantizer
AOT_ER_BSAC, ///< N Error Resilient Bit-Sliced Arithmetic Coding
AOT_ER_AAC_LD, ///< N Error Resilient Low Delay
AOT_ER_CELP, ///< N Error Resilient Code Excited Linear Prediction
AOT_ER_HVXC, ///< N Error Resilient Harmonic Vector eXcitation Coding
AOT_ER_HILN, ///< N Error Resilient Harmonic and Individual Lines plus Noise
AOT_ER_PARAM, ///< N Error Resilient Parametric
AOT_SSC, ///< N SinuSoidal Coding
AOT_PS, ///< N Parametric Stereo
AOT_SURROUND, ///< N MPEG Surround
AOT_ESCAPE, ///< Y Escape Value
AOT_L1, ///< Y Layer 1
AOT_L2, ///< Y Layer 2
AOT_L3, ///< Y Layer 3
AOT_DST, ///< N Direct Stream Transfer
AOT_ALS, ///< Y Audio LosslesS
AOT_SLS, ///< N Scalable LosslesS
AOT_SLS_NON_CORE, ///< N Scalable LosslesS (non core)
AOT_ER_AAC_ELD, ///< N Error Resilient Enhanced Low Delay
AOT_SMR_SIMPLE, ///< N Symbolic Music Representation Simple
AOT_SMR_MAIN, ///< N Symbolic Music Representation Main
AOT_USAC_NOSBR, ///< N Unified Speech and Audio Coding (no SBR)
AOT_SAOC, ///< N Spatial Audio Object Coding
AOT_LD_SURROUND, ///< N Low Delay MPEG Surround
AOT_USAC, ///< N Unified Speech and Audio Coding
};
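As a small illustration of the profile + 1 rule, the helper below turns the 2-bit ADTS profile field into an AudioObjectType number and a readable name. Both function names are hypothetical, and the name table simply follows the first four non-null entries of the enum above.

```c
/* Hypothetical helpers for this sketch; not part of FFmpeg. */

/* Map the 2-bit ADTS profile field (0..3) to an AudioObjectType number (1..4). */
static int adts_profile_to_aot(unsigned profile)
{
    return (int)(profile & 0x3) + 1;
}

/* Readable name for the 2-bit profile field, following AOT_AAC_MAIN..AOT_AAC_LTP. */
static const char *adts_profile_name(unsigned profile)
{
    static const char *const names[4] = {
        "Main",                 /* profile 0 -> AOT_AAC_MAIN (1) */
        "Low Complexity",       /* profile 1 -> AOT_AAC_LC   (2) */
        "Scalable Sample Rate", /* profile 2 -> AOT_AAC_SSR  (3) */
        "Long Term Prediction", /* profile 3 -> AOT_AAC_LTP  (4) */
    };
    return names[profile & 0x3];
}
```

Because the field is only 2 bits wide, object types above 4 (for example AOT_SBR) cannot be written into it directly.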
sampling_frequency_index: the index of the sample rate in use, looked up in the following table:
static const int mpeg4audio_sample_rates[16] = {
96000, 88200, 64000, 48000, 44100, 32000,
24000, 22050, 16000, 12000, 11025, 8000, 7350
};
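channel_configuration: indicates the number of channels (for values 1 to 6 it equals the channel count; 7 denotes 8 channels, and 0 means the layout is signalled inside the audio data).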
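Reading the fixed header back is the mirror image of the put_bits calls: the 28 bits occupy the first three bytes and the upper half of the fourth. The following is a minimal sketch of such a reader. The struct adts_fixed and the functions adts_parse_fixed and adts_sample_rate_hz are names invented for this example, and the caller is assumed to supply at least 4 readable bytes.

```c
#include <stdint.h>

/* Hypothetical struct for this sketch: the fields of adts_fixed_header(). */
struct adts_fixed {
    unsigned id;                       /* 0 = MPEG-4, 1 = MPEG-2 */
    unsigned layer;                    /* always 0 */
    unsigned protection_absent;        /* 1 = no CRC, 0 = CRC present */
    unsigned profile;                  /* AudioObjectType - 1 */
    unsigned sampling_frequency_index; /* index into the sample rate table */
    unsigned private_bit;
    unsigned channel_configuration;
    unsigned original_copy;
    unsigned home;
};

/* Sample rate in Hz for an index, or 0 for indices 13..15 (no table entry). */
static int adts_sample_rate_hz(unsigned index)
{
    static const int rates[13] = {
        96000, 88200, 64000, 48000, 44100, 32000,
        24000, 22050, 16000, 12000, 11025, 8000, 7350
    };
    return index < 13 ? rates[index] : 0;
}

/* Parse the 28-bit fixed header from p[0..3].
 * Returns 0 on success, -1 if the 12-bit syncword is not 0xFFF. */
static int adts_parse_fixed(const uint8_t *p, struct adts_fixed *h)
{
    if (p[0] != 0xFF || (p[1] & 0xF0) != 0xF0)
        return -1;

    h->id                       = (p[1] >> 3) & 0x1;
    h->layer                    = (p[1] >> 1) & 0x3;
    h->protection_absent        =  p[1]       & 0x1;
    h->profile                  = (p[2] >> 6) & 0x3;
    h->sampling_frequency_index = (p[2] >> 2) & 0xF;
    h->private_bit              = (p[2] >> 1) & 0x1;
    /* channel_configuration straddles bytes 2 and 3: 1 bit + 2 bits. */
    h->channel_configuration    = ((unsigned)(p[2] & 0x1) << 2) | ((p[3] >> 6) & 0x3);
    h->original_copy            = (p[3] >> 5) & 0x1;
    h->home                     = (p[3] >> 4) & 0x1;
    return 0;
}
```

The remaining low 4 bits of the fourth byte already belong to the variable header.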
/* adts_variable_header */
put_bits(&pb, 1, 0); /* copyright_identification_bit */
put_bits(&pb, 1, 0); /* copyright_identification_start */
put_bits(&pb, 13, full_frame_size); /* aac_frame_length */
put_bits(&pb, 11, 0x7ff); /* adts_buffer_fullness */
put_bits(&pb, 2, 0); /* number_of_raw_data_blocks_in_frame */
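aac_frame_length: the size of the whole ADTS frame in bytes, i.e. the header plus the raw AAC data.
protection_absent: if 0, the header is 9 bytes long (7 bytes plus a 2-byte CRC); if 1, the header is 7 bytes long.
adts_buffer_fullness: 0x7FF indicates a variable-bitrate stream.
number_of_raw_data_blocks_in_frame: the ADTS frame contains number_of_raw_data_blocks_in_frame + 1 raw AAC frames.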
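The variable header fields are not byte-aligned, so reading them takes some shifting and masking. The sketch below extracts them from a 7-byte header and derives the header and payload sizes from the rules above. The struct adts_variable and the three function names are invented for this example, and the caller is assumed to pass at least 7 readable bytes that start at a syncword.

```c
#include <stdint.h>

/* Hypothetical struct for this sketch: the fields of adts_variable_header(). */
struct adts_variable {
    unsigned copyright_identification_bit;
    unsigned copyright_identification_start;
    unsigned aac_frame_length;      /* header + raw AAC data, in bytes */
    unsigned adts_buffer_fullness;  /* 0x7FF = variable bitrate */
    unsigned raw_data_blocks;       /* number_of_raw_data_blocks_in_frame + 1 */
};

/* Parse the 28-bit variable header, which starts in the low nibble of p[3]. */
static void adts_parse_variable(const uint8_t *p, struct adts_variable *v)
{
    v->copyright_identification_bit   = (p[3] >> 3) & 0x1;
    v->copyright_identification_start = (p[3] >> 2) & 0x1;
    /* 13 bits: 2 from p[3], 8 from p[4], 3 from p[5]. */
    v->aac_frame_length = ((unsigned)(p[3] & 0x03) << 11)
                        | ((unsigned)p[4] << 3)
                        | ((unsigned)p[5] >> 5);
    /* 11 bits: 5 from p[5], 6 from p[6]. */
    v->adts_buffer_fullness = ((unsigned)(p[5] & 0x1F) << 6)
                            | ((unsigned)p[6] >> 2);
    v->raw_data_blocks = (unsigned)(p[6] & 0x03) + 1;
}

/* Header size in bytes: 7 without CRC (protection_absent = 1), 9 with CRC. */
static unsigned adts_header_size(const uint8_t *p)
{
    return (p[1] & 0x1) ? 7u : 9u;
}

/* Size of the raw AAC data in this frame, or 0 if the length field is too small. */
static unsigned adts_payload_size(const uint8_t *p)
{
    struct adts_variable v;
    unsigned header = adts_header_size(p);

    adts_parse_variable(p, &v);
    return v.aac_frame_length >= header ? v.aac_frame_length - header : 0;
}
```

Since aac_frame_length covers the whole frame, adding it to the offset of the current syncword gives the offset at which the next frame is expected to start.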
I opened an AAC file in a hex viewer:
The red box in the figure marks the ADTS header:
FF F1 4C 80 2B 9F FC
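As a cross-check, these seven bytes can be rebuilt by hand. The sketch below packs the same fields that the put_bits calls write, using plain shifts instead of FFmpeg's bit writer. The function name adts_write_header and its parameter list are assumptions made for this example. It always writes ID = 0, protection_absent = 1, adts_buffer_fullness = 0x7FF and one raw data block per frame, like the snippets above.

```c
#include <stdint.h>

/* Hypothetical helper: pack a 7-byte ADTS header without CRC.
 * frame_length is the full frame size (header + raw AAC data) in bytes. */
static void adts_write_header(uint8_t *out, unsigned profile,
                              unsigned sample_rate_index,
                              unsigned channel_conf,
                              unsigned frame_length)
{
    out[0] = 0xFF;  /* syncword, upper 8 bits */
    out[1] = 0xF1;  /* syncword lower 4 bits, ID = 0, layer = 0, protection_absent = 1 */
    /* profile (2), sampling_frequency_index (4), private_bit = 0, channel high bit */
    out[2] = (uint8_t)(((profile & 0x3) << 6) |
                       ((sample_rate_index & 0xF) << 2) |
                       ((channel_conf >> 2) & 0x1));
    /* channel low 2 bits, original_copy = home = 0, both copyright bits = 0,
     * then the top 2 bits of aac_frame_length */
    out[3] = (uint8_t)(((channel_conf & 0x3) << 6) |
                       ((frame_length >> 11) & 0x3));
    out[4] = (uint8_t)((frame_length >> 3) & 0xFF);        /* middle 8 bits */
    out[5] = (uint8_t)(((frame_length & 0x7) << 5) | 0x1F); /* low 3 bits + fullness top 5 */
    out[6] = 0xFC;  /* fullness low 6 bits, number_of_raw_data_blocks_in_frame = 0 */
}
```

Called with profile 1 (Low Complexity), sample rate index 3 (48000 Hz), 2 channels and a frame length of 348 bytes, this produces FF F1 4C 80 2B 9F FC, which is how the header above reads when decoded field by field.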
Using an analysis tool: