The audio decoding part uses the SDL_AudioSpec structure:
/**
* When filling in the desired audio spec structure,
* - 'desired->freq' should be the desired audio frequency in samples-per-second.
* - 'desired->format' should be the desired audio format.
* - 'desired->samples' is the desired size of the audio buffer, in samples.
* This number should be a power of two, and may be adjusted by the audio
* driver to a value more suitable for the hardware. Good values seem to
* range between 512 and 8096 inclusive, depending on the application and
* CPU speed. Smaller values yield faster response time, but can lead
* to underflow if the application is doing heavy processing and cannot
* fill the audio buffer in time. A stereo sample consists of both right
* and left channels in LR ordering.
* Note that the number of samples is directly related to time by the
* following formula: ms = (samples*1000)/freq
* - 'desired->size' is the size in bytes of the audio buffer, and is
* calculated by SDL_OpenAudio().
* - 'desired->silence' is the value used to set the buffer to silence,
* and is calculated by SDL_OpenAudio().
* - 'desired->callback' should be set to a function that will be called
* when the audio device is ready for more data. It is passed a pointer
* to the audio buffer, and the length in bytes of the audio buffer.
* This function usually runs in a separate thread, and so you should
* protect data structures that it accesses by calling SDL_LockAudio()
* and SDL_UnlockAudio() in your code.
* - 'desired->userdata' is passed as the first parameter to your callback
* function.
*
* @note The calculated values in this structure are calculated by SDL_OpenAudio()
*
*/
typedef struct SDL_AudioSpec {
int freq; /**< DSP frequency -- samples per second */
Uint16 format; /**< Audio data format */
Uint8 channels; /**< Number of channels: 1 mono, 2 stereo */
Uint8 silence; /**< Audio buffer silence value (calculated) */
Uint16 samples; /**< Audio buffer size in samples (power of 2) */
Uint16 padding; /**< Necessary for some compile environments */
Uint32 size; /**< Audio buffer size in bytes (calculated) */
/**
* This function is called when the audio device needs more data.
*
* @param[out] stream A pointer to the audio data buffer
* @param[in] len The length of the audio buffer in bytes.
*
* Once the callback returns, the buffer will no longer be valid.
* Stereo samples are stored in a LRLRLR ordering.
*/
void (SDLCALL *callback)(void *userdata, Uint8 *stream, int len);
void *userdata;
} SDL_AudioSpec;
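callback is a user-defined function that supplies audio; SDL calls it whenever the audio device needs more data. userdata is passed as the first argument to callback. It is assigned when the SDL_AudioSpec variable is initialized, and here it is normally the AVCodecContext variable.
The header comment states that size is the size of the audio buffer in bytes and is calculated when SDL_OpenAudio() is called. Tracing the program shows that size has the same value as len, the third argument of audio_callback.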
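The relationship between samples, size and len can be checked with a little arithmetic. The sketch below assumes signed 16-bit stereo audio with samples = 1024; these values are assumptions (the text above does not state them), chosen because they produce the 4096 bytes seen in the trace. The helper names audio_buffer_bytes and audio_buffer_ms are hypothetical and are not part of SDL.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one audio buffer in bytes:
   bytes per sample * channels * samples.
   bits_per_sample is 16 for a signed 16-bit format. */
static uint32_t audio_buffer_bytes(unsigned bits_per_sample,
                                   unsigned channels,
                                   unsigned samples)
{
    return (bits_per_sample / 8u) * channels * samples;
}

/* Playback time of one buffer, from the header comment:
   ms = (samples * 1000) / freq (integer division). */
static uint32_t audio_buffer_ms(unsigned samples, unsigned freq)
{
    assert(freq > 0);
    return (samples * 1000u) / freq;
}
```

Under these assumptions audio_buffer_bytes(16, 2, 1024) is 4096, and at 44100 Hz one buffer lasts about 23 ms.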
Add printf calls to the program to print the key variables:
void audio_callback(void *userdata, Uint8 *stream, int len) {
    AVCodecContext *aCodecCtx = (AVCodecContext *)userdata;
    int len1, audio_size;
    /* Decoded data and read position persist between callbacks */
    static uint8_t audio_buf[(AVCODEC_MAX_AUDIO_FRAME_SIZE * 3) / 2];
    static unsigned int audio_buf_size = 0;
    static unsigned int audio_buf_index = 0;
    printf("audio callback 1 len=%d\n", len);
    while(len > 0) {
        if(audio_buf_index >= audio_buf_size) {
            /* We have already sent all our data; get more */
            audio_size = audio_decode_frame(aCodecCtx, audio_buf, sizeof(audio_buf));
            if(audio_size < 0) {
                /* If error, output silence */
                audio_buf_size = 1024; // arbitrary?
                memset(audio_buf, 0, audio_buf_size);
            } else {
                audio_buf_size = audio_size;
            }
            audio_buf_index = 0;
            printf("audio callback 2 (audio_buf_size,audio_buf_index) = (%u,%u)\n", audio_buf_size, audio_buf_index);
        }
        /* Copy as much as is available, but no more than stream still needs */
        len1 = audio_buf_size - audio_buf_index;
        if(len1 > len)
            len1 = len;
        memcpy(stream, (uint8_t *)audio_buf + audio_buf_index, len1);
        len -= len1;
        stream += len1;
        audio_buf_index += len1;
        printf("audio callback 3 (len1,len,audio_buf_index) = (%d,%d,%u)\n", len1, len, audio_buf_index);
    }
}
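len is the size of the audio buffer in bytes, that is, the amount of audio data the callback has to write into stream.
According to material found online: SDL keeps calling audio_callback, and each call decodes data and appends it to stream. Once SDL considers that stream holds enough data to play, it plays it. The third parameter, len, is the size of the write buffer handed to audio_callback, i.e. how many bytes it may write into stream on that call.
Suppose len = 4096 and the decoded block in audio_buf is 4608 bytes. One call to audio_callback cannot write all of audio_buf into stream, so the block is split across two calls. The first call writes the first 4096 bytes of audio_buf. The second call receives a fresh 4096-byte buffer, writes the remaining 512 bytes, and still has 3584 bytes of room. The while loop fills that room within the same call by decoding the next frame and copying its first 3584 bytes.
The trace confirms this: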
[NULL @ 010B3A60]Invalid and inefficient vfw-avi packed B frames detected
video stream
Compiler did not align stack variables. Libavcodec has been miscompiled
and may be very slow or crash. This is not a bug in libavcodec,
but in the compiler. You may try recompiling using gcc >= 4.2.
Do not report crashes to FFmpeg developers.
[mpeg4 @ 010B3A60]
Invalid and inefficient vfw-avi packed B frames detected
audio callback 1 len=4096
audio callback 2 (audio_buf_size, audio_buf_index) = (4608,0)
audio callback 3 (len1, len, audio_buf_index) = (4096,0,4096)
audio callback 1 len=4096
audio callback 3 (len1, len, audio_buf_index) = (512,3584,4608)
audio callback 2 (audio_buf_size, audio_buf_index) = (4608,0)
audio callback 3 (len1, len, audio_buf_index) = (3584,0,3584)
audio callback 1 len=4096
audio callback 3 (len1, len, audio_buf_index) = (1024,3072,4608)
audio callback 2 (audio_buf_size, audio_buf_index) = (4608,0)
audio callback 3 (len1, len, audio_buf_index) = (3072,0,3072)
audio callback 1 len=4096
audio callback 3 (len1, len, audio_buf_index) = (1536,2560,4608)
audio callback 2 (audio_buf_size, audio_buf_index) = (4608,0)
audio callback 3 (len1, len, audio_buf_index) = (2560,0,2560)
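Here audio_buf_size is the amount of raw audio data produced by one decode, and it is 4608 bytes every time. stream is the audio output buffer; its size is size, i.e. len = 4096 bytes. The 4608 decoded bytes do not fit into stream in one go, so they have to be written in several pieces.
Each "audio callback 2" line above is printed right after a frame is decoded, and every decode yields 4608 bytes. The first callback writes 4096 bytes (len1) to stream, leaving 4608 - 4096 = 512 bytes. The second callback first writes those 512 bytes, then decodes another 4608 bytes and writes 4096 - 512 = 3584 of them, leaving 4608 - 3584 = 1024 bytes, and so on.
With this analysis, the way audio_callback decodes audio and writes it into the buffer is clear.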
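The bookkeeping in the trace can be reproduced without SDL or FFmpeg. The following is a minimal sketch, not the tutorial's code: fake_decode_frame is a hypothetical stand-in for audio_decode_frame that always returns 4608 bytes of silence, and fill_stream and leftover_bytes are names invented for this illustration. The copy loop itself follows audio_callback.

```c
#include <assert.h>
#include <string.h>

#define FRAME_BYTES 4608   /* decoded bytes per frame, as in the trace */

/* Hypothetical stand-in for audio_decode_frame(): always "decodes"
   one frame of FRAME_BYTES bytes of silence into buf. */
static int fake_decode_frame(unsigned char *buf, int buf_size)
{
    assert(buf_size >= FRAME_BYTES);
    memset(buf, 0, FRAME_BYTES);
    return FRAME_BYTES;
}

/* State kept between calls, like the static variables in audio_callback. */
static unsigned char audio_buf[FRAME_BYTES];
static unsigned int audio_buf_size = 0;
static unsigned int audio_buf_index = 0;

/* Fills stream with exactly len bytes, decoding new frames as needed.
   Returns the number of frames decoded during this call. */
static int fill_stream(unsigned char *stream, int len)
{
    int decoded = 0;
    while (len > 0) {
        if (audio_buf_index >= audio_buf_size) {
            /* Everything decoded so far has been consumed; get more. */
            audio_buf_size = (unsigned int)fake_decode_frame(audio_buf, (int)sizeof(audio_buf));
            audio_buf_index = 0;
            decoded++;
        }
        /* Copy what is available, capped at what stream still needs. */
        int len1 = (int)(audio_buf_size - audio_buf_index);
        if (len1 > len)
            len1 = len;
        memcpy(stream, audio_buf + audio_buf_index, (size_t)len1);
        len -= len1;
        stream += len1;
        audio_buf_index += (unsigned int)len1;
    }
    return decoded;
}

/* Bytes decoded but not yet copied out, carried over to the next call. */
static unsigned int leftover_bytes(void)
{
    return audio_buf_size - audio_buf_index;
}
```

Calling fill_stream three times with a 4096-byte buffer leaves 512, 1024 and 1536 bytes over, matching the len1 values at the start of the second, third and fourth callbacks in the trace.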