This installment is not especially long, but explaining it clearly is another matter, so I have decided to lay out my debugging logs along the way to make it easier to follow. At the end of the previous article we learned what DTS/PTS are, along with three strategies for audio/video synchronization. So how do these concepts show up in the FFmpeg framework? Let's take the audio decoding thread as an example, shown in the code below.
void* audio_avframe_decoder(void* arg)
{
    JNIEnv *env = NULL;
    if ( (*gJavaVM)->AttachCurrentThread(gJavaVM, &env, NULL) != JNI_OK) {
        LOGE("gJavaVM->Env Error!\n");
        pthread_exit((void *) -1);
    }
    SyncPlayer* player = (SyncPlayer*) arg;
    AVCodecContext* audioCodecCtx = player->input_codec_ctx[player->audio_stream_index];
    AV_PACKET_BUFFER* audioAVPacketBuffer = player->audio_avpacket_buffer;
    AVStream *audioStream = player->input_format_ctx->streams[player->audio_stream_index];
    // Stream duration in seconds.
    int64_t audioDuration = (int64_t) (audioStream->duration * av_q2d(audioStream->time_base));
    LOGI("audio stream time_base : %d/%d \n", audioStream->time_base.num, audioStream->time_base.den);
    AVFrame *frame = av_frame_alloc();
    // Actual buffer for the 16-bit PCM output.
    uint8_t *out_buffer = (uint8_t *) av_malloc(MAX_AUDIO_FRAME_SIZE);
    // AudioTrack.play
    (*env)->CallVoidMethod(env, player->audio_track, player->audio_track_play_mid);
    int ret;
    int64_t pts;
    while (player->stop_thread_audio_decoder == 0)
    {
        pthread_mutex_lock(&audioAVPacketBuffer->mutex);
        AVPacket* packet = get_read_packet(audioAVPacketBuffer);
        pthread_mutex_unlock(&audioAVPacketBuffer->mutex);
        // AVPacket -> AVFrame
        ret = avcodec_send_packet(audioCodecCtx, packet);
        if (ret == AVERROR_EOF) {
            av_packet_unref(packet);
            LOGW("audio_decoder avcodec_send_packet:%d\n", ret);
            break; // leave while(player->stop_thread_audio_decoder==0)
        } else if (ret < 0) {
            av_packet_unref(packet);
            LOGE("audio_decoder avcodec_send_packet:%d\n", ret);
            continue;
        }
        while (ret >= 0)
        {
            ret = avcodec_receive_frame(audioCodecCtx, frame);
            if (ret == AVERROR(EAGAIN)) {
                //LOGD("audio_decoder avcodec_receive_frame:%d\n", ret);
                break; // leave while(ret>=0)
            } else if (ret < 0 || ret == AVERROR_EOF) {
                LOGW("audio_decoder avcodec_receive_frame:%d %s\n", ret, av_err2str(ret));
                av_packet_unref(packet);
                goto end; // cleanup (resource release etc.) happens at end:
            }
            // !test start
            if ((pts = av_frame_get_best_effort_timestamp(frame)) == AV_NOPTS_VALUE)
                pts = 0;
            LOGI("audio current frame PTS : %lld\n", pts);
            // Note: pts is an int64_t, so this multiply truncates to whole seconds.
            pts *= av_q2d(audioStream->time_base);
            LOGI("audio current frame PTS : %lld\n", pts);
            // !test end
            if (ret >= 0)
            {
                swr_convert(player->swr_ctx, &out_buffer, MAX_AUDIO_FRAME_SIZE,
                            (const uint8_t **) frame->data, frame->nb_samples);
                // Size in bytes of the converted samples.
                int out_buffer_size = av_samples_get_buffer_size(NULL, player->out_channel_nb,
                                                                 frame->nb_samples, player->out_sample_fmt, 1);
                // AudioTrack.write(byte[], int, int) takes a byte array (jbyteArray in JNI),
                // so copy the out_buffer contents into one.
                jbyteArray audio_data_byteArray = (*env)->NewByteArray(env, out_buffer_size);
                jbyte* fp_AudioDataArray = (*env)->GetByteArrayElements(env, audio_data_byteArray, NULL);
                memcpy(fp_AudioDataArray, out_buffer, (size_t) out_buffer_size);
                (*env)->ReleaseByteArrayElements(env, audio_data_byteArray, fp_AudioDataArray, 0);
                // AudioTrack.write the PCM data
                (*env)->CallIntMethod(env, player->audio_track, player->audio_track_write_mid,
                                      audio_data_byteArray, 0, out_buffer_size);
                // !!! Delete the local reference, or the local reference table will overflow.
                (*env)->DeleteLocalRef(env, audio_data_byteArray);
            }
        }
        av_packet_unref(packet);
    }
end:
    av_free(out_buffer);
    av_frame_free(&frame);
    LOGI("thread_audio_avframe_decoder exit ...\n");
    (*gJavaVM)->DetachCurrentThread(gJavaVM);
    return 0;
}
First, let's take a look at what AVStream->time_base actually is, along with the av_q2d function.
/**
 * Rational number (pair of numerator and denominator).
 */
typedef struct AVRational{
    int num; ///< Numerator
    int den; ///< Denominator
} AVRational;

AVRational time_base;
/////////////////////////////////////////////////////////
static inline double av_q2d(AVRational a){
    return a.num / (double) a.den;
}
Next, let's get clear on what a PTS tick is.

PTS: Presentation Time Stamp. The PTS is mainly used to determine when a decoded video frame should be displayed.
DTS: Decode Time Stamp. The DTS mainly marks when the bitstream read into memory should be fed into the decoder.

So pts says when a frame starts being displayed, and dts says when the data stream starts being decoded. But how do we interpret this "when"? Suppose some frame starts being displayed at the 10th second. What is its pts? Is it 10? Is it 10 s? Or neither?

To answer that question, FFmpeg introduces the concept of a time base, time_base: the basic unit used to measure time.

If you divide 1 second into 25 equal parts, think of it as a ruler where each mark represents 1/25 s; the time_base is {1, 25}.
If you divide 1 second into 90000 parts instead, each tick is 1/90000 s, and the time_base is {1, 90000}.
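To make the answer concrete, here is a minimal, self-contained sketch of the tick-to-seconds conversion; the 90 kHz clock and the pts value are illustration numbers of my own, not taken from the demo:

#include <libavutil/rational.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    // Illustration only: a 1/90000 time base, as in MPEG transport streams.
    AVRational time_base = { 1, 90000 };
    int64_t pts = 900000;                        // 900000 ticks on that clock
    double seconds = pts * av_q2d(time_base);    // ticks * seconds-per-tick
    printf("pts %lld -> %.6f s\n", (long long) pts, seconds); // 10.000000 s
    return 0;
}

So for a frame displayed at the 10-second mark on a 1/90000 time base, the pts is 900000 ticks: neither "10" nor "10 s" until it is multiplied by the time base.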
Now the pts ticks returned by av_frame_get_best_effort_timestamp, and their time base, should make sense. We add the same test code at the corresponding spot in the video decoding thread (after avcodec_receive_frame, before rendering). After launching the demo, this is the debug log I got.
[sync_player.c][video_avframe_decoder][229]: video stream time_base : 1/29970
[sync_player.c][audio_avframe_decoder][143]: audio stream time_base : 1/48000
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 1024
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 2048
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 3072
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 0
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 4096
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 5120
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 6144
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 7168
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 2000
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 8192
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 9216
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 10240
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 11264
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 12288
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 13312
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 3000
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 14336
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 15360
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 16384
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 17408
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 18432
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 4000
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 19456
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 20480
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 21504
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 22528
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 23552
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 24576
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 5000
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 25600
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 26624
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 27648
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 28672
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 29696
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 30720
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 31744
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 6000
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 32768
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 33792
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 34816
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 35840
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 36864
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 37888
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 7000
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 38912
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 39936
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 40960
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 41984
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 43008
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 44032
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 8000
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 45056
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 46080
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 47104
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 48128
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 1
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 49152
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 1
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 9000
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0
From the first two lines of the log, it is clear that audio and video each have their own time base. You can roughly derive these yourself if you hold on to what a time base is (the quantization unit of one second): the audio time base is the reciprocal of the sample rate (samples per second), here 1/48000 for 48000 Hz audio. The video time base is tied to the frame rate (frames displayed per second): here it is 1/29970, the 29.97 fps frame rate scaled by 1000, so each frame advances 1000 ticks. (The test MP4: frame rate 29.97 ≈ 30, stereo AAC whose 1/48000 time base corresponds to 48000 Hz sampling.)

The log also shows that after just a brief moment, the audio PTS has accumulated a full second's worth of ticks, so the truncated conversion reads 1 s, while the video is still at 9000/29970, barely 0.3 s: the audio and video are already visibly out of sync.

And some of you may feel (I certainly did) that the pts ticks obtained via av_frame_get_best_effort_timestamp, converted this way, only give second-level granularity: fine for debugging, but far too coarse for synchronization. Is there a more precise representation? There is. We replace the !test code as follows:
// audio_avframe_decoder
audioClock = frame->pts * av_q2d(audioStream->time_base);
LOGD(" current audioClock : %f\n", audioClock);

// video_avframe_decoder
videoClock = frame->pts * av_q2d(videoStream->time_base);
LOGD(" current videoClock : %f\n", videoClock);
Declare two global variables of type double, audioClock and videoClock. This time we use the pts field carried directly by the decoded frame.
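A minimal sketch of those globals; note that the two decoder threads read and write them concurrently, which the demo leaves unsynchronized for simplicity (atomics or a mutex would be safer):

// Shared playback clocks, in seconds since the start of the stream.
// audioClock is written by the audio decoder thread, videoClock by the
// video decoder thread; unguarded access is a deliberate simplification.
double audioClock = 0.0;
double videoClock = 0.0;

With those in place, the log output is as follows: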
[sync_player.c][video_avframe_decoder][231]: video stream time_base : 1/29970
[sync_player.c][audio_avframe_decoder][142]: audio stream time_base : 1/48000
[sync_player.c][audio_avframe_decoder][191]: current audioClock : 0.000000
[sync_player.c][audio_avframe_decoder][191]: current audioClock : 0.021333
[sync_player.c][audio_avframe_decoder][191]: current audioClock : 0.042667
[sync_player.c][audio_avframe_decoder][191]: current audioClock : 0.064000
[sync_player.c][video_avframe_decoder][295]: current videoClock : 0.000000
[sync_player.c][audio_avframe_decoder][191]: current audioClock : 0.085333
[sync_player.c][audio_avframe_decoder][191]: current audioClock : 0.106667
[sync_player.c][audio_avframe_decoder][191]: current audioClock : 0.128000
[sync_player.c][audio_avframe_decoder][191]: current audioClock : 0.149333
[sync_player.c][video_avframe_decoder][295]: current videoClock : 0.033367
That is precision down to the microsecond level (1 s = 1,000 ms = 1,000,000 µs): each 1024-sample AAC frame advances the audio clock by 1024/48000 = 0.021333 s, and each video frame advances the video clock by 1000/29970 = 0.033367 s. This is usable for the sync logic.

So how do we synchronize? In the demo's current state, audioClock runs ahead of videoClock: at the right moments the audio needs to slow down and wait, while the lagging video needs to speed up. Following that principle, we add code to the decoding loop of the audio thread:
while (player->stop_thread_audio_decoder == 0)
{
    pthread_mutex_lock(&audioAVPacketBuffer->mutex);
    AVPacket* packet = get_read_packet(audioAVPacketBuffer);
    pthread_mutex_unlock(&audioAVPacketBuffer->mutex);
    // AVPacket -> AVFrame
    ret = avcodec_send_packet(audioCodecCtx, packet);
    if (ret == AVERROR_EOF) {
        av_packet_unref(packet);
        LOGW("audio_decoder avcodec_send_packet:%d\n", ret);
        break; // leave while(player->stop_thread_audio_decoder==0)
    } else if (ret < 0) {
        av_packet_unref(packet);
        LOGE("audio_decoder avcodec_send_packet:%d\n", ret);
        continue;
    }
    while (ret >= 0)
    {
        ret = avcodec_receive_frame(audioCodecCtx, frame);
        if (ret == AVERROR(EAGAIN)) {
            //LOGD("audio_decoder avcodec_receive_frame:%d\n", ret);
            break; // leave while(ret>=0)
        } else if (ret < 0 || ret == AVERROR_EOF) {
            LOGW("audio_decoder avcodec_receive_frame:%d %s\n", ret, av_err2str(ret));
            av_packet_unref(packet);
            goto end; // cleanup (resource release etc.) happens at end:
        }
        // !test start
        //if ((pts = av_frame_get_best_effort_timestamp(frame)) == AV_NOPTS_VALUE)
        //    pts = 0;
        //LOGI("audio current frame PTS : %lld\n", pts);
        //pts *= av_q2d(audioStream->time_base);
        //LOGI("audio current frame PTS : %lld\n", pts);
        // !test end
        audioClock = frame->pts * av_q2d(audioStream->time_base);
        //LOGD(" current audioClock : %f\n", audioClock);
        if (ret >= 0)
        {
            swr_convert(player->swr_ctx, &out_buffer, MAX_AUDIO_FRAME_SIZE,
                        (const uint8_t **) frame->data, frame->nb_samples);
            // Size in bytes of the converted samples.
            int out_buffer_size = av_samples_get_buffer_size(NULL, player->out_channel_nb,
                                                             frame->nb_samples, player->out_sample_fmt, 1);
            // AudioTrack.write(byte[], int, int) takes a byte array (jbyteArray in JNI),
            // so copy the out_buffer contents into one.
            jbyteArray audio_data_byteArray = (*env)->NewByteArray(env, out_buffer_size);
            jbyte* fp_AudioDataArray = (*env)->GetByteArrayElements(env, audio_data_byteArray, NULL);
            memcpy(fp_AudioDataArray, out_buffer, (size_t) out_buffer_size);
            (*env)->ReleaseByteArrayElements(env, audio_data_byteArray, fp_AudioDataArray, 0);
            // AudioTrack.write the PCM data
            (*env)->CallIntMethod(env, player->audio_track, player->audio_track_write_mid,
                                  audio_data_byteArray, 0, out_buffer_size);
            if (audioClock - videoClock > AV_SYNC_THRESHOLD)
            {   // audio is ahead: write the same PCM again + usleep to wait
                (*env)->CallIntMethod(env, player->audio_track, player->audio_track_write_mid,
                                      audio_data_byteArray, 0, out_buffer_size);
                usleep(15000); // the wait really ought to be derived from the pts gap
            }
            // !!! Delete the local reference, or the local reference table will overflow.
            (*env)->DeleteLocalRef(env, audio_data_byteArray);
        }
    }
    av_packet_unref(packet);
}
After the normal AudioTrack.write of the PCM data, we check the current difference audioClock - videoClock. If it exceeds the defined sync threshold, we usleep to wait and write this frame's PCM data one more time (human hearing is far more sensitive than vision, and a pitch artifact beats losing the sound altogether, right?), thereby slowing the audio down.
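The hard-coded 15 ms above is only a placeholder; as the comment in the code admits, the wait really ought to be derived from the pts gap. A minimal sketch of that idea, reusing the demo's own audioClock/videoClock/AV_SYNC_THRESHOLD, with a 100 ms cap that is my own assumption:

// Sketch: wait in proportion to how far audio has run ahead of video.
double gap = audioClock - videoClock;          // seconds; > 0 means audio is ahead
if (gap > AV_SYNC_THRESHOLD) {
    double wait = gap > 0.1 ? 0.1 : gap;       // cap each wait at 100 ms (assumption)
    usleep((useconds_t) (wait * 1000000.0));   // seconds -> microseconds
}

The video side gets the mirror-image check: drop frames when video falls behind.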
while (player->stop_thread_video_decoder == 0)
{
    pthread_mutex_lock(&videoAVPacketBuffer->mutex);
    AVPacket* packet = get_read_packet(videoAVPacketBuffer);
    pthread_mutex_unlock(&videoAVPacketBuffer->mutex);
    // AVPacket -> AVFrame
    ret = avcodec_send_packet(videoCodecCtx, packet);
    if (ret == AVERROR_EOF) {
        av_packet_unref(packet);
        LOGW("video_decoder avcodec_send_packet:%d\n", ret);
        break; // leave while(player->stop_thread_video_decoder == 0)
    } else if (ret < 0) {
        av_packet_unref(packet);
        LOGE("video_decoder avcodec_send_packet:%d\n", ret);
        continue;
    }
    while (ret >= 0)
    {
        ret = avcodec_receive_frame(videoCodecCtx, yuv_frame);
        if (ret == AVERROR(EAGAIN)) {
            //LOGD("video_decoder avcodec_receive_frame:%d\n", ret);
            break; // leave while(ret >= 0)
        } else if (ret < 0 || ret == AVERROR_EOF) {
            LOGW("video_decoder avcodec_receive_frame:%d %s\n", ret, av_err2str(ret));
            av_packet_unref(packet);
            goto end;
        }
        // !test start
        //if ((pts = av_frame_get_best_effort_timestamp(yuv_frame)) == AV_NOPTS_VALUE)
        //    pts = 0;
        //LOGD("video current frame PTS : %lld\n", pts);
        //pts *= av_q2d(videoStream->time_base);
        //LOGD("video current frame PTS : %lld\n", pts);
        // !test end
        videoClock = yuv_frame->pts * av_q2d(videoStream->time_base);
        //LOGD(" current videoClock : %f\n", videoClock);
        if (audioClock - videoClock > AV_SYNC_THRESHOLD)
            continue; // video is behind: drop this frame to catch up
        if (ret >= 0)
        {
            ANativeWindow_lock(nativeWindow, &nativeWinBuffer, NULL);
            // Lock and bind ANativeWindow + ANativeWindow_Buffer.
            av_image_fill_arrays(rgb_frame->data, rgb_frame->linesize, nativeWinBuffer.bits,
                                 AV_PIX_FMT_RGBA, videoCodecCtx->width, videoCodecCtx->height, 1);
            // Point the rgb AVFrame at the actual bits of the ANativeWindow_Buffer.
            I420ToARGB(yuv_frame->data[0], yuv_frame->linesize[0],
                       yuv_frame->data[2], yuv_frame->linesize[2],
                       yuv_frame->data[1], yuv_frame->linesize[1],
                       rgb_frame->data[0], rgb_frame->linesize[0],
                       videoCodecCtx->width, videoCodecCtx->height);
            // yuv AVFrame -> rgb AVFrame
            ANativeWindow_unlockAndPost(nativeWindow);
            // Unlock and swap the display buffer onto the screen.
        }
    }
    av_packet_unref(packet);
}
In the video decoding thread's loop, right before the AVFrame is rendered to the ANativeWindow, we likewise check audioClock - videoClock. If it exceeds the defined sync threshold, we simply skip rendering the current frame; dropping frames speeds the video up so it can catch up with the audio.

So what value should AV_SYNC_THRESHOLD take? That depends on how far videoClock and audioClock were drifting apart earlier; there is no universal constant, and it needs to be tuned strategically for the situation at hand.
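For reference, a plausible definition might look like the sketch below; the 50 ms value is my own assumption (ffplay, for comparison, works with sync thresholds in the 0.04 to 0.1 s range), not something this demo prescribes:

// Assumed value: drift beyond ~50 ms between the two clocks is generally
// perceptible, so treat anything above it as out of sync. Tune per content.
#define AV_SYNC_THRESHOLD 0.05   // seconds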
The demo's A/V sync is honestly only passable at this point, but we now understand the principle clearly. Next, let's get to the bottom of why the video is so slow: we simply cannot reach 30 fps, so why can other players? For now we comment out the sync logic and add timing logs inside the video decoding thread's loop.
int ret;
int64_t pts;
long start;
int count = 0;
while (player->stop_thread_video_decoder == 0)
{
    count++;
    // clock() returns CPU time in CLOCKS_PER_SEC units (1,000,000 per
    // second on POSIX), so the deltas below are in microseconds.
    start = clock();
    LOGD("video decoder %d round start at %ld ...\n", count, start / 1000);
    pthread_mutex_lock(&videoAVPacketBuffer->mutex);
    AVPacket* packet = get_read_packet(videoAVPacketBuffer);
    pthread_mutex_unlock(&videoAVPacketBuffer->mutex);
    LOGI("video get_read_packet at %ld \n", clock() - start);
    // AVPacket -> AVFrame
    ret = avcodec_send_packet(videoCodecCtx, packet);
    LOGI("video avcodec_send_packet at %ld \n", clock() - start);
    if (ret == AVERROR_EOF) {
        av_packet_unref(packet);
        LOGW("video_decoder avcodec_send_packet:%d\n", ret);
        break; // leave while(player->stop_thread_video_decoder == 0)
    } else if (ret < 0) {
        av_packet_unref(packet);
        LOGE("video_decoder avcodec_send_packet:%d\n", ret);
        continue;
    }
    while (ret >= 0)
    {
        ret = avcodec_receive_frame(videoCodecCtx, yuv_frame);
        if (ret == AVERROR(EAGAIN)) {
            //LOGD("video_decoder avcodec_receive_frame:%d\n", ret);
            break; // leave while(ret >= 0)
        } else if (ret < 0 || ret == AVERROR_EOF) {
            LOGW("video_decoder avcodec_receive_frame:%d %s\n", ret, av_err2str(ret));
            av_packet_unref(packet);
            goto end;
        }
        LOGI("video avcodec_receive_frame at %ld \n", clock() - start);
        // !test start
        //if ((pts = av_frame_get_best_effort_timestamp(yuv_frame)) == AV_NOPTS_VALUE)
        //    pts = 0;
        //LOGD("video current frame PTS : %lld\n", pts);
        //pts *= av_q2d(videoStream->time_base);
        //LOGD("video current frame PTS : %lld\n", pts);
        // !test end
        videoClock = yuv_frame->pts * av_q2d(videoStream->time_base);
        //LOGD(" current videoClock : %f\n", videoClock);
        //if (audioClock - videoClock > AV_SYNC_THRESHOLD)
        //    continue;
        if (ret >= 0)
        {
            LOGI("video prepare draw at %ld \n", clock() - start);
            ANativeWindow_lock(nativeWindow, &nativeWinBuffer, NULL);
            // Lock and bind ANativeWindow + ANativeWindow_Buffer.
            av_image_fill_arrays(rgb_frame->data, rgb_frame->linesize, nativeWinBuffer.bits,
                                 AV_PIX_FMT_RGBA, videoCodecCtx->width, videoCodecCtx->height, 1);
            // Point the rgb AVFrame at the actual bits of the ANativeWindow_Buffer.
            I420ToARGB(yuv_frame->data[0], yuv_frame->linesize[0],
                       yuv_frame->data[2], yuv_frame->linesize[2],
                       yuv_frame->data[1], yuv_frame->linesize[1],
                       rgb_frame->data[0], rgb_frame->linesize[0],
                       videoCodecCtx->width, videoCodecCtx->height);
            LOGI("video I420ToARGB at %ld \n", clock() - start);
            // yuv AVFrame -> rgb AVFrame
            ANativeWindow_unlockAndPost(nativeWindow);
            // Unlock and swap the display buffer onto the screen.
            LOGI("video finish draw at %ld \n", clock() - start);
        }
    }
    av_packet_unref(packet);
    LOGD("video decoder %d round end at %ld ...\n", count, (clock() - start) / 1000);
    LOGD("-------------------------------------\n");
}
With timing printouts added at each key point, we get the following log.
[sync_player.c][video_avframe_decoder][268]: video decoder 1 round start at 1223 ...
[sync_player.c][video_avframe_decoder][272]: video get_read_packet at 53
[sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 39951
[sync_player.c][video_avframe_decoder][336]: video decoder 1 round end at 40036 ...
[sync_player.c][video_avframe_decoder][337]: -------------------------------------
[sync_player.c][video_avframe_decoder][268]: video decoder 2 round start at 1263 ...
[sync_player.c][video_avframe_decoder][272]: video get_read_packet at 25
[sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 28174
[sync_player.c][video_avframe_decoder][298]: video avcodec_receive_frame at 28257
[sync_player.c][video_avframe_decoder][317]: video prepare draw at 28290
[sync_player.c][video_avframe_decoder][328]: video I420ToARGB at 115386
[sync_player.c][video_avframe_decoder][332]: video finish draw at 116751
[sync_player.c][video_avframe_decoder][336]: video decoder 2 round end at 116843 ...
[sync_player.c][video_avframe_decoder][337]: -------------------------------------
[sync_player.c][video_avframe_decoder][268]: video decoder 3 round start at 1380 ...
[sync_player.c][video_avframe_decoder][272]: video get_read_packet at 44
[sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 16678
[sync_player.c][video_avframe_decoder][298]: video avcodec_receive_frame at 16788
[sync_player.c][video_avframe_decoder][317]: video prepare draw at 16831
[sync_player.c][video_avframe_decoder][328]: video I420ToARGB at 90281
[sync_player.c][video_avframe_decoder][332]: video finish draw at 91667
[sync_player.c][video_avframe_decoder][336]: video decoder 3 round end at 91758 ...
[sync_player.c][video_avframe_decoder][337]: -------------------------------------
Analyzing the first three rounds already makes it plain: the bulk of each round's end-to-end time is I420ToARGB, i.e. the YUV-to-RGB conversion, which costs on the order of 100 ms per frame here, turning our video into roughly 10 fps. (Cue the crying-laughing emoji.)

Some of you might say: drop libyuv and convert formats directly with FFmpeg's SwsContext API; wouldn't that be more efficient? A quick search will tell you the opposite: libyuv is reported to outperform FFmpeg's built-in format conversion by close to 10x.

A sharper classmate might push up their glasses and say: please add NEON instruction acceleration. Yes, NEON can indeed speed parts of this up; I haven't benchmarked it yet and opinions online are mixed, but enabling NEON surely beats not enabling it!

So what do we do? The boss won't accept this result, and users can tolerate it even less! If the YUV-to-RGB conversion is this expensive, why not cut it out entirely and render YUV directly? Indeed, people have thought of this before and found an answer in Android's early system player, but that code is complex and dated, and Google has since abandoned AwesomePlayer. It remains fine material for study:
Android用surface直接显示yuv数据(一)
Android用surface直接显示yuv数据(二)
Android用surface直接显示yuv数据(三)
(a three-part Chinese series on displaying YUV data directly on a Surface)
What I want to add, though, is that OpenGL can do this easily as well! So in the next topic, OpenGL + shaders, I will come back to this demo and make it a true 30 fps player, or even 60 fps!
The day I finished this article, I started upgrading both FFmpeg and libyuv to use NEON. Enabling NEON support in both libraries is fairly simple.

For FFmpeg, just add --enable-neon to the build script introduced earlier in 手动编译Android使用的FFmpeg库(Linux) (manually building the FFmpeg libraries for Android on Linux):
#!/bin/bash
make clean

export NDK=/usr/ndk/android-ndk-r14/
export SYSROOT=$NDK/platforms/android-9/arch-arm/
export TOOLCHAIN=$NDK/toolchains/arm-linux-androideabi-4.8/prebuilt/linux-x86_64
export CPU=arm
export PREFIX=$(pwd)/android/$CPU
export ADDI_CFLAGS="-marm"

./configure --target-os=linux \
    --prefix=$PREFIX --arch=arm \
    --disable-doc \
    --enable-shared \
    --disable-static \
    --disable-yasm \
    --disable-symver \
    --enable-gpl \
    --enable-neon \
    --disable-ffmpeg \
    --disable-ffplay \
    --disable-ffprobe \
    --disable-ffserver \
    --cross-prefix=$TOOLCHAIN/bin/arm-linux-androideabi- \
    --enable-cross-compile \
    --sysroot=$SYSROOT \
    --extra-cflags="-Os -fpic $ADDI_CFLAGS" \
    --extra-ldflags="$ADDI_LDFLAGS" \
    $ADDITIONAL_CONFIGURE_FLAG

make clean
make
make install
I don't think I previously covered building libyuv from source, since it is all quite simple; https://blog.csdn.net/abcdnml/article/details/70243783 is a good reference. The Android.mk:
LOCAL_PATH := $(call my-dir)

include $(CLEAR_VARS)

LOCAL_CPP_EXTENSION := .cc

LOCAL_SRC_FILES := \
    source/compare.cc \
    source/compare_common.cc \
    source/compare_neon64.cc \
    source/compare_gcc.cc \
    source/convert.cc \
    source/convert_argb.cc \
    source/convert_from.cc \
    source/convert_from_argb.cc \
    source/convert_to_argb.cc \
    source/convert_to_i420.cc \
    source/cpu_id.cc \
    source/planar_functions.cc \
    source/rotate.cc \
    source/rotate_argb.cc \
    source/rotate_mips.cc \
    source/rotate_neon64.cc \
    source/row_any.cc \
    source/row_common.cc \
    source/row_mips.cc \
    source/row_neon64.cc \
    source/row_gcc.cc \
    source/scale.cc \
    source/scale_any.cc \
    source/scale_argb.cc \
    source/scale_common.cc \
    source/scale_mips.cc \
    source/scale_neon64.cc \
    source/scale_gcc.cc \
    source/video_common.cc

# TODO(fbarchard): Enable mjpeg encoder.
#   source/mjpeg_decoder.cc
#   source/convert_jpeg.cc
#   source/mjpeg_validate.cc

ifeq ($(TARGET_ARCH_ABI),armeabi-v7a)
    LOCAL_CFLAGS += -DLIBYUV_NEON
    LOCAL_SRC_FILES += \
        source/compare_neon.cc.neon \
        source/rotate_neon.cc.neon \
        source/row_neon.cc.neon \
        source/scale_neon.cc.neon
endif

LOCAL_EXPORT_C_INCLUDES := $(LOCAL_PATH)/include
LOCAL_C_INCLUDES += $(LOCAL_PATH)/include

LOCAL_MODULE := libyuv
LOCAL_MODULE_TAGS := optional

include $(BUILD_SHARED_LIBRARY)
As you can see from libyuv's build script, when TARGET_ARCH_ABI is armeabi-v7a it already adds NEON support automatically (the -DLIBYUV_NEON flag plus the .neon source files). So all we have to do is add an Application.mk that specifies the target ABIs, APP_ABI := armeabi armeabi-v7a mips, and then use the .so from libs/armeabi-v7a.
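For completeness, a minimal Application.mk consistent with that line might look like this; APP_PLATFORM is my own assumption, mirroring the android-9 sysroot used in the FFmpeg build script above:

# Application.mk -- build all three ABIs; the armeabi-v7a output is the
# one that picks up the NEON-accelerated libyuv code paths.
APP_ABI := armeabi armeabi-v7a mips
APP_PLATFORM := android-9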
After the upgrade, let's look at the decoding threads' debug logs again.
[sync_player.c][audio_avframe_decoder][140]: audio stream time_base : 1/48000
[sync_player.c][video_avframe_decoder][235]: video stream time_base : 1/29970
[sync_player.c][video_avframe_decoder][268]: video decoder 1 round start at 1190 ...
[sync_player.c][video_avframe_decoder][272]: video get_read_packet at 78
[sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 39669
[sync_player.c][video_avframe_decoder][335]: video decoder 1 round end at 39 ...
[sync_player.c][video_avframe_decoder][336]: -------------------------------------
[sync_player.c][video_avframe_decoder][268]: video decoder 2 round start at 1230 ...
[sync_player.c][video_avframe_decoder][272]: video get_read_packet at 23
[sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 39946
[sync_player.c][video_avframe_decoder][298]: video avcodec_receive_frame at 40033
[sync_player.c][video_avframe_decoder][335]: video decoder 2 round end at 40 ...
[sync_player.c][video_avframe_decoder][336]: -------------------------------------
[sync_player.c][video_avframe_decoder][268]: video decoder 3 round start at 1270 ...
[sync_player.c][video_avframe_decoder][272]: video get_read_packet at 23
[sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 15373
[sync_player.c][video_avframe_decoder][298]: video avcodec_receive_frame at 15459
[sync_player.c][video_avframe_decoder][317]: video prepare draw at 15497
[sync_player.c][video_avframe_decoder][331]: video finish draw at 25104
[sync_player.c][video_avframe_decoder][335]: video decoder 3 round end at 25 ...
[sync_player.c][video_avframe_decoder][336]: -------------------------------------
[sync_player.c][video_avframe_decoder][268]: video decoder 4 round start at 1295 ...
[sync_player.c][video_avframe_decoder][272]: video get_read_packet at 23
[sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 38360
[sync_player.c][video_avframe_decoder][298]: video avcodec_receive_frame at 38448
[sync_player.c][video_avframe_decoder][335]: video decoder 4 round end at 38 ...
[sync_player.c][video_avframe_decoder][336]: -------------------------------------
[sync_player.c][video_avframe_decoder][268]: video decoder 5 round start at 1334 ...
[sync_player.c][video_avframe_decoder][272]: video get_read_packet at 23
[sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 49854
[sync_player.c][video_avframe_decoder][298]: video avcodec_receive_frame at 49948
[sync_player.c][video_avframe_decoder][317]: video prepare draw at 49990
[sync_player.c][video_avframe_decoder][331]: video finish draw at 58722
[sync_player.c][video_avframe_decoder][335]: video decoder 5 round end at 58 ...
[sync_player.c][video_avframe_decoder][336]: -------------------------------------
Comparing the video finish draw minus prepare draw deltas, the convert-and-draw stage is roughly 10x faster: about 88 ms per frame before (round 2 above), versus roughly 9.6 ms in round 3 and 8.7 ms in round 5 now.