FFmpeg audio decoding: tutorial03 (personal analysis)

The audio decoding part uses the SDL_AudioSpec structure:

/**
 * When filling in the desired audio spec structure,
 * - 'desired->freq' should be the desired audio frequency in samples-per-second.
 * - 'desired->format' should be the desired audio format.
 * - 'desired->samples' is the desired size of the audio buffer, in samples.
 *     This number should be a power of two, and may be adjusted by the audio
 *     driver to a value more suitable for the hardware.  Good values seem to
 *     range between 512 and 8096 inclusive, depending on the application and
 *     CPU speed.  Smaller values yield faster response time, but can lead
 *     to underflow if the application is doing heavy processing and cannot
 *     fill the audio buffer in time.  A stereo sample consists of both right
 *     and left channels in LR ordering.
 *     Note that the number of samples is directly related to time by the
 *     following formula:  ms = (samples*1000)/freq
 * - 'desired->size' is the size in bytes of the audio buffer, and is
 *     calculated by SDL_OpenAudio().
 * - 'desired->silence' is the value used to set the buffer to silence,
 *     and is calculated by SDL_OpenAudio().
 * - 'desired->callback' should be set to a function that will be called
 *     when the audio device is ready for more data.  It is passed a pointer
 *     to the audio buffer, and the length in bytes of the audio buffer.
 *     This function usually runs in a separate thread, and so you should
 *     protect data structures that it accesses by calling SDL_LockAudio()
 *     and SDL_UnlockAudio() in your code.
 * - 'desired->userdata' is passed as the first parameter to your callback
 *     function.
 *
 * @note The calculated values in this structure are calculated by SDL_OpenAudio()
 *
 */
typedef struct SDL_AudioSpec {
    int freq;           /**< DSP frequency -- samples per second */
    Uint16 format;      /**< Audio data format */
    Uint8  channels;    /**< Number of channels: 1 mono, 2 stereo */
    Uint8  silence;     /**< Audio buffer silence value (calculated) */
    Uint16 samples;     /**< Audio buffer size in samples (power of 2) */
    Uint16 padding;     /**< Necessary for some compile environments */
    Uint32 size;        /**< Audio buffer size in bytes (calculated) */
    /**
     *  This function is called when the audio device needs more data.
     *
     *  @param[out] stream  A pointer to the audio data buffer
     *  @param[in]  len     The length of the audio buffer in bytes.
     *
     *  Once the callback returns, the buffer will no longer be valid.
     *  Stereo samples are stored in a LRLRLR ordering.
     */
    void (SDLCALL *callback)(void *userdata, Uint8 *stream, int len);
    void  *userdata;
} SDL_AudioSpec;
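The header comment relates the buffer size in samples to time through ms = (samples*1000)/freq. As a rough sketch of that arithmetic (buffer_ms is a hypothetical helper written for this note, not an SDL function):

```c
#include <assert.h>

/* Hypothetical helper, not part of SDL: duration of one audio buffer in
 * milliseconds, using the formula from the SDL comment:
 *     ms = (samples * 1000) / freq
 * Integer division truncates the result. */
unsigned buffer_ms(unsigned samples, unsigned freq)
{
    assert(freq > 0);
    return (samples * 1000u) / freq;
}
```

For example, buffer_ms(1024, 44100) gives 23, so a 1024-sample buffer at 44.1 kHz is refilled roughly every 23 ms. Smaller buffers respond faster but leave the callback less time to decode.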

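callback is a user-defined function that supplies the audio data; SDL calls it whenever the audio device needs more data. userdata is the first parameter passed to callback. It is assigned when the SDL_AudioSpec variable is initialized, and what is usually passed is a pointer to the AVCodecContext.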

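The comment states that size is the size of the audio buffer in bytes and that it is calculated when SDL_OpenAudio() is called. Tracing shows that the value of size is the same as the third parameter, len, of the audio_callback function.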
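One way to see where a size of 4096 could come from: the byte size of the buffer is samples times channels times bytes per sample. The sketch below assumes SDL 1.2 format codes, whose low byte holds the bits per sample (for example 0x8010 for signed 16-bit little-endian). It also assumes 1024 samples and 2 channels, values that this article does not state. audio_buffer_bytes is a hypothetical helper, not an SDL function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of SDL: size in bytes of an audio buffer.
 * Assumes an SDL 1.2 style format code whose low byte is the number of
 * bits per sample (0x0008 -> 8 bits, 0x8010 -> 16 bits). */
uint32_t audio_buffer_bytes(uint16_t format, uint8_t channels, uint16_t samples)
{
    uint32_t bytes_per_sample = (format & 0xFFu) / 8u;
    assert(bytes_per_sample > 0 && channels > 0);
    return bytes_per_sample * channels * samples;
}
```

With the assumed settings, audio_buffer_bytes(0x8010, 2, 1024) is 4096, which matches the len value seen in the trace.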

 

Add printf calls to the program to print the key variables:

void audio_callback(void *userdata, Uint8 *stream, int len) {

    AVCodecContext *aCodecCtx = (AVCodecContext *)userdata;
    int len1, audio_size;

    /* Decoded audio that is carried over between calls */
    static uint8_t audio_buf[(AVCODEC_MAX_AUDIO_FRAME_SIZE * 3) / 2];
    static unsigned int audio_buf_size = 0;
    static unsigned int audio_buf_index = 0;

    printf("audio callback 1 len=%d\n", len);
    while(len > 0) {
        if(audio_buf_index >= audio_buf_size) {
            /* We have already sent all our data; get more */
            audio_size = audio_decode_frame(aCodecCtx, audio_buf, sizeof(audio_buf));
            if(audio_size < 0) {
                /* If error, output silence */
                audio_buf_size = 1024; // arbitrary?
                memset(audio_buf, 0, audio_buf_size);
            } else {
                audio_buf_size = audio_size;
            }
            audio_buf_index = 0;
            printf("audio callback 2 (audio_buf_size,audio_buf_index) = (%u,%u)\n", audio_buf_size, audio_buf_index);
        }
        /* Copy whichever is smaller: what is left in audio_buf or what is left in stream */
        len1 = audio_buf_size - audio_buf_index;
        if(len1 > len)
            len1 = len;
        memcpy(stream, (uint8_t *)audio_buf + audio_buf_index, len1);
        len -= len1;
        stream += len1;
        audio_buf_index += len1;
        printf("audio callback 3 (len1,len,audio_buf_index) = (%d,%d,%u)\n", len1, len, audio_buf_index);
    }
}

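len is the size in bytes of the audio buffer stream, that is, the amount of decoded audio data the callback has to supply on this call.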

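From material found online: SDL calls audio_callback over and over; each call decodes data and appends it to stream. Once SDL considers that stream holds enough data to play, it plays it. The third parameter, len, is the amount of buffer space allotted to audio_callback for writing into stream.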

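Suppose len=4096 and the decoded block audio_buf holds 4608 bytes. A single audio_callback call cannot write all of audio_buf into stream, so it takes two calls. The first call writes the first 4096 bytes of audio_buf into stream. On the second call the previous buffer has been used up, so a fresh 4096-byte buffer is provided; the remaining 512 bytes are written into it, which leaves 3584 bytes of buffer space. Those 3584 bytes are filled in the same call from the next decoded frame, and whatever is left of that frame is carried over to the next audio_callback call.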
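The bookkeeping in this example can be reproduced without SDL or FFmpeg. The sketch below tracks only byte counts and copies no audio. FillState, simulate_callback and leftover are hypothetical names introduced for illustration; the loop mirrors the while loop of audio_callback, with the decode step replaced by "a frame of frame_bytes bytes has arrived".

```c
#include <assert.h>
#include <stddef.h>

/* State that the real callback keeps in static variables. */
typedef struct {
    size_t buf_size;   /* bytes currently held in audio_buf     */
    size_t buf_index;  /* bytes of audio_buf already written out */
    size_t decodes;    /* number of simulated decode calls       */
} FillState;

/* Hypothetical stand-in for one audio_callback call: hand out `len` bytes,
 * "decoding" a new frame of `frame_bytes` bytes whenever the buffer is empty. */
void simulate_callback(FillState *st, size_t len, size_t frame_bytes)
{
    assert(frame_bytes > 0);
    while (len > 0) {
        if (st->buf_index >= st->buf_size) {
            st->buf_size = frame_bytes;   /* pretend one frame was decoded */
            st->buf_index = 0;
            st->decodes++;
        }
        size_t chunk = st->buf_size - st->buf_index;
        if (chunk > len)
            chunk = len;
        len -= chunk;
        st->buf_index += chunk;
    }
}

/* Bytes that were decoded but not yet written to stream. */
size_t leftover(const FillState *st)
{
    return st->buf_size - st->buf_index;
}
```

Starting from a zeroed FillState and calling simulate_callback(&st, 4096, 4608) twice leaves 512 bytes after the first call and 1024 bytes after the second, with two decodes in total.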

Tracing confirms that this is what happens:

[NULL @ 010B3A60]Invalid and inefficient vfw-avi packed B frames detected
video stream
Compiler did not align stack variables. Libavcodec has been miscompiled
and may be very slow or crash. This is not a bug in libavcodec,
but in the compiler. You may try recompiling using gcc >= 4.2.
Do not report crashes to FFmpeg developers.
[mpeg4 @ 010B3A60]
Invalid and inefficient vfw-avi packed B frames detected
audio callback 1 len=4096
audio callback 2 (audio_buf_size, audio_buf_index) = (4608,0)
audio callback 3 (len1,      len, audio_buf_index) = (4096,0,4096)
audio callback 1 len=4096
audio callback 3 (len1,      len, audio_buf_index) = (512,3584,4608)
audio callback 2 (audio_buf_size, audio_buf_index) = (4608,0)
audio callback 3 (len1,      len, audio_buf_index) = (3584,0,3584)
audio callback 1 len=4096
audio callback 3 (len1,      len, audio_buf_index) = (1024,3072,4608)
audio callback 2 (audio_buf_size, audio_buf_index) = (4608,0)
audio callback 3 (len1,      len, audio_buf_index) = (3072,0,3072)
audio callback 1 len=4096
audio callback 3 (len1,      len, audio_buf_index) = (1536,2560,4608)
audio callback 2 (audio_buf_size, audio_buf_index) = (4608,0)
audio callback 3 (len1,      len, audio_buf_index) = (2560,0,2560)

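Here audio_buf_size is the amount of raw audio data produced by one decode; it is 4608 bytes every time. stream is the audio output buffer, and its size is size = len = 4096 bytes. The 4608 decoded bytes cannot be written into stream in one go, so they have to be written in several pieces.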

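The "audio callback 2" lines above are printed after each decode, and each decode yields 4608 bytes of audio data. The first callback writes 4096 bytes (len1) into stream, leaving 4608 - 4096 = 512 bytes. The second callback writes those 512 bytes, decodes another 4608 bytes and writes 4096 - 512 = 3584 of them, leaving 4608 - 3584 = 1024 bytes, and so on.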
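The "and so on" follows a simple pattern: after k callbacks, k*len bytes have been consumed, and the decoder has produced just enough whole frames to cover that. A small sketch of the arithmetic, assuming every decode returns the same number of bytes (true in this trace, not guaranteed in general); leftover_after is a hypothetical helper:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes decoded but not yet played after `k` callbacks,
 * assuming every callback asks for `len` bytes and every decode returns
 * exactly `frame_bytes` bytes. */
size_t leftover_after(size_t k, size_t len, size_t frame_bytes)
{
    assert(frame_bytes > 0);
    size_t consumed = k * len;
    /* Whole frames needed to cover `consumed` bytes (ceiling division). */
    size_t frames = (consumed + frame_bytes - 1) / frame_bytes;
    return frames * frame_bytes - consumed;
}
```

With len = 4096 and frame_bytes = 4608 the leftover grows by 512 bytes per callback (512, 1024, 1536, 2048, ...) and returns to 0 after the 9th callback, because 9 * 4096 = 8 * 4608 = 36864.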

The analysis above makes clear how audio_callback decodes audio and writes it into the buffer.
