NDK Study Notes: FFmpeg Audio/Video Synchronization, Part 3 (Catch-Up Sync; Upgrading ffmpeg/libyuv with NEON Support)


This installment does not cover a huge amount of ground, but explaining it clearly is harder than it looks, so I have decided to include my debugging logs along the way to make it easier to follow. At the end of the previous article we learned what DTS/PTS are, along with the three common audio/video synchronization strategies. So how do these concepts show up in the FFmpeg framework? Let's take the audio decoding thread as an example, shown in the code below.

void* audio_avframe_decoder(void* arg)
{
    JNIEnv *env = NULL;
    if ( (*gJavaVM)->AttachCurrentThread(gJavaVM, &env,NULL) != JNI_OK) {
        LOGE("gJavaVM->Env Error!\n");
        pthread_exit((void *) -1);
    }

    SyncPlayer* player = (SyncPlayer*)arg;
    AVCodecContext* audioCodecCtx = player->input_codec_ctx[player->audio_stream_index];
    AV_PACKET_BUFFER* audioAVPacketButter = player->audio_avpacket_buffer;

    AVStream *audioStream = player->input_format_ctx->streams[player->audio_stream_index];
    int64_t audioDuration = (int64_t) (audioStream->duration * av_q2d(audioStream->time_base));
    LOGI("audio steam time_base : %d/%d \n",audioStream->time_base.num, audioStream->time_base.den);

    AVFrame *frame = av_frame_alloc();
    // Actual memory backing the 16-bit PCM output data.
    uint8_t *out_buffer = (uint8_t *)av_malloc(MAX_AUDIO_FRAME_SIZE);
    // AudioTrack.play
    (*env)->CallVoidMethod(env, player->audio_track, player->audio_track_play_mid);

    int ret;
    int64_t pts;

    while(player->stop_thread_audio_decoder == 0)
    {
        pthread_mutex_lock(&audioAVPacketButter->mutex);
        AVPacket* packet = get_read_packet(audioAVPacketButter);
        pthread_mutex_unlock(&audioAVPacketButter->mutex);
        //AVPacket->AVFrame
        ret = avcodec_send_packet(audioCodecCtx, packet);
        if (ret == AVERROR_EOF){
            av_packet_unref(packet);
            LOGW("audio_decoder avcodec_send_packet:%d\n", ret);
            break; // exit while(player->stop_thread_audio_decoder==0)
        }else if(ret < 0){
            av_packet_unref(packet);
            LOGE("audio_decoder avcodec_send_packet:%d\n", ret);
            continue;
        }

        while(ret >= 0)
        {
            ret = avcodec_receive_frame(audioCodecCtx, frame);
            if (ret == AVERROR(EAGAIN) ) {
                //LOGD("audio_decoder avcodec_receive_frame:%d\n", ret);
                break; // exit while(ret>=0)
            } else if (ret < 0 || ret == AVERROR_EOF) {
                LOGW("audio_decoder avcodec_receive_frame:%d %s\n", ret, av_err2str(ret));
                av_packet_unref(packet);
                goto end;  // jump to end: release resources and clean up
            }

            // !test start
            if ((pts = av_frame_get_best_effort_timestamp(frame)) == AV_NOPTS_VALUE)
                pts = 0;
            LOGI("audio current frame PTS : %lld\n",pts);
            pts *= av_q2d(audioStream->time_base);
            LOGI("audio current frame PTS : %lld\n",pts);
            // !test end

            if (ret >= 0)
            {
                swr_convert(player->swr_ctx, &out_buffer, MAX_AUDIO_FRAME_SIZE, (const uint8_t **) frame->data, frame->nb_samples);
                //get the byte size of the converted samples
                int out_buffer_size = av_samples_get_buffer_size(NULL, player->out_channel_nb,
                                                                 frame->nb_samples, player->out_sample_fmt, 1);
                //AudioTrack.write(byte[], int, int) takes a byte array, i.e. a jbyteArray in JNI,
                //so the out_buffer data must be copied into one
                jbyteArray audio_data_byteArray = (*env)->NewByteArray(env, out_buffer_size);
                jbyte* fp_AudioDataArray = (*env)->GetByteArrayElements(env, audio_data_byteArray, NULL);
                memcpy(fp_AudioDataArray, out_buffer, (size_t) out_buffer_size);
                (*env)->ReleaseByteArrayElements(env, audio_data_byteArray, fp_AudioDataArray,0);
                // AudioTrack.write: hand the PCM data over
                (*env)->CallIntMethod(env,player->audio_track,player->audio_track_write_mid,
                                      audio_data_byteArray, 0, out_buffer_size);
                //!!! Delete the local reference, otherwise the local reference table will overflow
                (*env)->DeleteLocalRef(env,audio_data_byteArray);
            }
        }
        av_packet_unref(packet);
    }

end:
    av_free(out_buffer);
    av_frame_free(&frame);
    LOGI("thread_audio_avframe_decoder exit ...\n");
    (*gJavaVM)->DetachCurrentThread(gJavaVM);
    return 0;
}

First, we fetch the stream currently being decoded, an AVStream, via AVFormatContext->streams[stream_idx]. Then note the // !test block in the middle of the code: we call av_frame_get_best_effort_timestamp to get the current decoded frame's pts in ticks and print it, then multiply by the audio stream's time base, av_q2d(audioStream->time_base), to get the actual pts, and print that as well. Hmm... how should we make sense of this?

Let's first look at what AVStream->time_base actually is, together with the av_q2d function.

/**
 * Rational number (pair of numerator and denominator).
 */
typedef struct AVRational{
    int num; ///< Numerator
    int den; ///< Denominator
} AVRational;

AVRational time_base;
/////////////////////////////////////////////////////////
static inline double av_q2d(AVRational a){
    return a.num / (double) a.den;
}

Next, let's go over what a PTS tick actually is.

PTS: Presentation Time Stamp. The PTS indicates when a decoded frame should be displayed.
DTS: Decoding Time Stamp. The DTS indicates when the bitstream that has been read into memory should be fed into the decoder.

In other words, the pts says when a frame starts being displayed, and the dts says when the data starts being decoded. But what does "when" mean here? Suppose some frame starts being displayed at the 10-second mark. What is its pts then? 10? 10s? Or neither?

To answer that, FFmpeg introduces the concept of a time base, time_base: the basic unit used to measure time.
If you divide 1 second into 25 equal parts, think of it as a ruler where each tick stands for 1/25 of a second; the time_base is then {1,25}.
If you divide 1 second into 90000 parts, each tick is 1/90000 of a second, and the time_base is {1,90000}.
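
To make this concrete, here is a minimal check (a sketch using the av_q2d above): with time_base = {1,90000}, a frame displayed at the 10-second mark has a pts of neither 10 nor "10s", but 10 * 90000 = 900000 ticks.

AVRational tb = {1, 90000};               // 90000 ticks per second
int64_t pts_ticks = 900000;               // the frame's pts, counted in ticks
double seconds = pts_ticks * av_q2d(tb);  // back to wall-clock time: 10.0 s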

With that, the pts ticks returned by av_frame_get_best_effort_timestamp, and the time base they are measured against, should make sense. We add the same test code at the matching spot in the video decoding thread (after avcodec_receive_frame, before rendering). Launching the demo, I then got the following debug log.

[sync_player.c][video_avframe_decoder][229]: video steam time_base : 1/29970    
[sync_player.c][audio_avframe_decoder][143]: audio steam time_base : 1/48000    
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 1024     
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 2048     
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 3072     
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 0        
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 4096     
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 5120     
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 6144     
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 7168     
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 2000     
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 8192     
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 9216     
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 10240    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 11264    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 12288    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 13312    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 3000     
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 14336    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 15360    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 16384    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 17408    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 18432    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 4000     
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 19456    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 20480    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 21504    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 22528    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 23552    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 24576    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 5000     
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 25600    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 26624    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 27648    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 28672    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 29696    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 30720    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 31744    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 6000     
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 32768    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 33792    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 34816    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 35840    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 36864    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 37888    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 7000     
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 38912    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 39936    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 40960    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 41984    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 43008    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 44032    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 8000     
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 45056    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 46080    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 47104    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 0        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 48128    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 1        
[sync_player.c][audio_avframe_decoder][186]: audio current frame PTS : 49152    
[sync_player.c][audio_avframe_decoder][188]: audio current frame PTS : 1        
[sync_player.c][video_avframe_decoder][288]: video current frame PTS : 9000     
[sync_player.c][video_avframe_decoder][290]: video current frame PTS : 0           

From the first two log lines we can see clearly that the audio and video time bases are different. We can actually infer these time bases ourselves by holding on to the concept (the quantization unit of one second): the audio time base is the reciprocal of the sample rate (audio samples per second), and the video time base is the reciprocal of the frame rate (frames displayed per second). (Test MP4 file: frame rate 29.97 ≈ 30; stereo AAC audio whose 1/48000 time base implies a 48000Hz sample rate.)
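
These figures can be cross-checked against the log (a sketch; the 1024 step is the AAC frame size in samples, and the 1000-tick video step is read off the logged pts values, since 29970 = 29.97 fps with 1000 ticks per frame):

AVRational tb_audio = {1, 48000};              // from the log
double audio_step = 1024 * av_q2d(tb_audio);   // one AAC frame: 0.021333 s
AVRational tb_video = {1, 29970};              // from the log
double video_step = 1000 * av_q2d(tb_video);   // one video frame: 0.033367 s

That is exactly why the audio pts advances in steps of 1024 and the video pts in steps of 1000.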

The log also shows that after only a moment, the audio PTS has already accumulated a full second's worth of ticks (it reaches 1s), while the video is still at 9000/29970, not even 0.5s: the audio and video are already out of sync.

Also, did anyone else feel (I certainly did) that the pts obtained this way from av_frame_get_best_effort_timestamp only converts to whole seconds (the product gets truncated back into the int64_t pts), which is fine for debugging but far too coarse for synchronization? Is there a more precise representation? There is. We replace the !test code as follows:

// audio_avframe_decoder
audioClock = frame->pts * av_q2d(audioStream->time_base);
LOGD(" current audioClock : %f\n", audioClock);

// video_avframe_decoder
videoClock = frame->pts * av_q2d(videoStream->time_base);
LOGD(" current videoClock : %f\n", videoClock);

We create two global variables of type double, audioClock and videoClock, and this time use the pts field carried by the decoded frame directly. The declarations might look like this (a sketch; placement and the volatile qualifier are my assumptions):
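
// Shared playback clocks in seconds: file-scope globals in sync_player.c,
// each written by one decoder thread and read by the other. volatile is a
// simplification here; a mutex or C11 atomics would be stricter.
static volatile double audioClock = 0.0;
static volatile double videoClock = 0.0;

Running the demo again, the log output is as follows: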

[sync_player.c][video_avframe_decoder][231]: video steam time_base : 1/29970 
[sync_player.c][audio_avframe_decoder][142]: audio steam time_base : 1/48000 
[sync_player.c][audio_avframe_decoder][191]:  current audioClock : 0.000000  
[sync_player.c][audio_avframe_decoder][191]:  current audioClock : 0.021333  
[sync_player.c][audio_avframe_decoder][191]:  current audioClock : 0.042667  
[sync_player.c][audio_avframe_decoder][191]:  current audioClock : 0.064000  
[sync_player.c][video_avframe_decoder][295]:  current videoClock : 0.000000  
[sync_player.c][audio_avframe_decoder][191]:  current audioClock : 0.085333  
[sync_player.c][audio_avframe_decoder][191]:  current audioClock : 0.106667  
[sync_player.c][audio_avframe_decoder][191]:  current audioClock : 0.128000  
[sync_player.c][audio_avframe_decoder][191]:  current audioClock : 0.149333  
[sync_player.c][video_avframe_decoder][295]:  current videoClock : 0.033367  

We are now precise down to the microsecond level (1s = 1,000ms = 1,000,000µs), which is usable for the synchronization logic.

So how do we actually synchronize? In the demo's current state, audioClock runs faster than videoClock, so at the right moments the audio must slow its pace and wait, and the too-slow videoClock must speed up. Following that principle, we add code to the decode loop of the audio decoding thread:

    while(player->stop_thread_audio_decoder == 0)
    {
        pthread_mutex_lock(&audioAVPacketButter->mutex);
        AVPacket* packet = get_read_packet(audioAVPacketButter);
        pthread_mutex_unlock(&audioAVPacketButter->mutex);
        //AVPacket->AVFrame
        ret = avcodec_send_packet(audioCodecCtx, packet);
        if (ret == AVERROR_EOF){
            av_packet_unref(packet);
            LOGW("audio_decoder avcodec_send_packet:%d\n", ret);
            break; // exit while(player->stop_thread_audio_decoder==0)
        }else if(ret < 0){
            av_packet_unref(packet);
            LOGE("audio_decoder avcodec_send_packet:%d\n", ret);
            continue;
        }

        while(ret >= 0)
        {
            ret = avcodec_receive_frame(audioCodecCtx, frame);
            if (ret == AVERROR(EAGAIN) ) {
                //LOGD("audio_decoder avcodec_receive_frame:%d\n", ret);
                break; // exit while(ret>=0)
            } else if (ret < 0 || ret == AVERROR_EOF) {
                LOGW("audio_decoder avcodec_receive_frame:%d %s\n", ret, av_err2str(ret));
                av_packet_unref(packet);
                goto end;  // jump to end: release resources and clean up
            }

            // !test start
            //if ((pts = av_frame_get_best_effort_timestamp(frame)) == AV_NOPTS_VALUE)
            //    pts = 0;
            //LOGI("audio current frame PTS : %lld\n",pts);
            //pts *= av_q2d(audioStream->time_base);
            //LOGI("audio current frame PTS : %lld\n",pts);
            // !test end

            audioClock = frame->pts * av_q2d(audioStream->time_base);
            //LOGD(" current audioClock : %f\n", audioClock);

            if (ret >= 0)
            {
                swr_convert(player->swr_ctx, &out_buffer, MAX_AUDIO_FRAME_SIZE, (const uint8_t **) frame->data, frame->nb_samples);
                //get the byte size of the converted samples
                int out_buffer_size = av_samples_get_buffer_size(NULL, player->out_channel_nb,
                                                                 frame->nb_samples, player->out_sample_fmt, 1);
                //AudioTrack.write(byte[], int, int) takes a byte array, i.e. a jbyteArray in JNI,
                //so the out_buffer data must be copied into one
                jbyteArray audio_data_byteArray = (*env)->NewByteArray(env, out_buffer_size);
                jbyte* fp_AudioDataArray = (*env)->GetByteArrayElements(env, audio_data_byteArray, NULL);
                memcpy(fp_AudioDataArray, out_buffer, (size_t) out_buffer_size);
                (*env)->ReleaseByteArrayElements(env, audio_data_byteArray, fp_AudioDataArray,0);
                // AudioTrack.write: hand the PCM data over
                (*env)->CallIntMethod(env,player->audio_track,player->audio_track_write_mid,
                                      audio_data_byteArray, 0, out_buffer_size);
                if(fabs(audioClock-videoClock) > AV_SYNC_THRESHOLD)
                {   // write this frame's PCM again + usleep to wait
                    (*env)->CallIntMethod(env,player->audio_track,player->audio_track_write_mid,
                                          audio_data_byteArray, 0, out_buffer_size);
                    usleep(15000); // the wait time really ought to be derived from the pts instead
                }
                //!!! Delete the local reference, otherwise the local reference table will overflow
                (*env)->DeleteLocalRef(env,audio_data_byteArray);
            }
        }
        av_packet_unref(packet);
    }

After the normal AudioTrack.write of the PCM data, we check the current difference audioClock - videoClock. If it exceeds the defined sync threshold, we usleep to delay and write the same frame's PCM data again (human hearing is far more sensitive than vision; a distorted sound beats losing sound altogether, right?), thereby slowing the audio down.
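
The fixed usleep(15000) is only a stopgap; as the comment says, the wait really ought to be derived from the timestamps. A minimal sketch, assuming the audioClock/videoClock globals and the AV_SYNC_THRESHOLD used in this article:

double diff = audioClock - videoClock;       // > 0 means audio is ahead
if (diff > AV_SYNC_THRESHOLD) {
    double wait = diff > 0.1 ? 0.1 : diff;   // cap the sleep (an assumed policy) so one bad pts can't stall playback
    usleep((useconds_t)(wait * 1000000.0));  // sleep roughly the amount audio is ahead
}

The video decoding thread gets the mirror-image treatment: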

    while(player->stop_thread_video_decoder == 0)
    {
        pthread_mutex_lock(&videoAVPacketButter->mutex);
        AVPacket* packet = get_read_packet(videoAVPacketButter);
        pthread_mutex_unlock(&videoAVPacketButter->mutex);

        //AVPacket->AVFrame
        ret = avcodec_send_packet(videoCodecCtx, packet);
        if (ret == AVERROR_EOF){
            av_packet_unref(packet);
            LOGW("video_decoder avcodec_send_packet:%d\n", ret);
            break; // exit while(player->stop_thread_video_decoder == 0)
        }else if(ret < 0){
            av_packet_unref(packet);
            LOGE("video_decoder avcodec_send_packet:%d\n", ret);
            continue;
        }

        while(ret >= 0)
        {
            ret = avcodec_receive_frame(videoCodecCtx, yuv_frame);
            if (ret == AVERROR(EAGAIN) ){
                //LOGD("video_decoder avcodec_receive_frame:%d\n", ret);
                break; // exit while(ret >= 0)
            }else if (ret < 0 || ret == AVERROR_EOF) {
                LOGW("video_decoder avcodec_receive_frame:%d %s\n", ret, av_err2str(ret));
                av_packet_unref(packet);
                goto end;
            }

            // !test start
            //if ((pts = av_frame_get_best_effort_timestamp(yuv_frame)) == AV_NOPTS_VALUE)
            //    pts = 0;
            //LOGD("video current frame PTS : %lld\n",pts);
            //pts *= av_q2d(videoStream->time_base);
            //LOGD("video current frame PTS : %lld\n",pts);
            // !test end
            videoClock = yuv_frame->pts * av_q2d(videoStream->time_base);
            //LOGD(" current videoClock : %f\n", videoClock);

            if(fabs(audioClock-videoClock) > AV_SYNC_THRESHOLD)
                continue;
            if (ret >= 0)
            {
                ANativeWindow_lock(nativeWindow, &nativeWinBuffer, NULL);
                // lock the ANativeWindow and bind the ANativeWindow_Buffer
                av_image_fill_arrays(rgb_frame->data, rgb_frame->linesize, nativeWinBuffer.bits,
                                     AV_PIX_FMT_RGBA, videoCodecCtx->width, videoCodecCtx->height, 1 );
                // point the rgb AVFrame at the ANativeWindow_Buffer's actual bits
                I420ToARGB(yuv_frame->data[0], yuv_frame->linesize[0],
                           yuv_frame->data[2], yuv_frame->linesize[2],
                           yuv_frame->data[1], yuv_frame->linesize[1],
                           rgb_frame->data[0], rgb_frame->linesize[0],
                           videoCodecCtx->width, videoCodecCtx->height);
                // convert the yuv AVFrame into the rgb AVFrame
                ANativeWindow_unlockAndPost(nativeWindow);
                // unlock and post: swap the display buffer onto the screen
            }
        }
        av_packet_unref(packet);
    }

In the video decoding thread's loop, before the AVFrame is actually rendered to the ANativeWindow, we perform the same audioClock - videoClock check. If the difference exceeds the defined sync threshold, we simply skip rendering the current frame; dropping frames speeds the video up so it can catch up with the audio.
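
One subtlety: because the check uses fabs(), the video thread would also drop frames when video runs ahead of audio, which would widen the gap rather than close it. In this demo the audio always leads, so it happens to work; a sign-aware variant (a sketch with the same names) is safer:

double diff = audioClock - videoClock;
if (diff > AV_SYNC_THRESHOLD) {
    continue;                                 // video is late: drop the frame to catch up
} else if (diff < -AV_SYNC_THRESHOLD) {
    usleep((useconds_t)(-diff * 1000000.0));  // video is early: hold the frame back
}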

And what should AV_SYNC_THRESHOLD be? That depends on the spread we observed earlier between videoClock and audioClock. There is no fixed value; it has to be tuned strategically according to the actual situation.
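
For concreteness, a few tens of milliseconds is a reasonable starting point (my assumption, not a value fixed by the demo):

// Sync tolerance in seconds, matching the units of audioClock/videoClock.
#define AV_SYNC_THRESHOLD 0.05   // 50 ms; tune per content and device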

Digging deeper

The demo's A/V sync is frankly only barely passable, but we now have the principle down. Next, let's get to the bottom of why the video is so slow: we cannot reach 30fps at all, yet other players can. For now we disable the sync logic and add elapsed-time debug logs inside the video decoding thread's loop:

    int ret;
    int64_t pts;
    long start;
    int count = 0;

    while(player->stop_thread_video_decoder == 0)
    {
        count++;
        start = clock();
        LOGD("video decoder %d round start at %ld ...\n", count, start/1000);
        pthread_mutex_lock(&videoAVPacketButter->mutex);
        AVPacket* packet = get_read_packet(videoAVPacketButter);
        pthread_mutex_unlock(&videoAVPacketButter->mutex);
        LOGI("video get_read_packet at %ld \n", clock()-start);

        //AVPacket->AVFrame
        ret = avcodec_send_packet(videoCodecCtx, packet);
        LOGI("video avcodec_send_packet at %ld \n", clock()-start);
        if (ret == AVERROR_EOF){
            av_packet_unref(packet);
            LOGW("video_decoder avcodec_send_packet:%d\n", ret);
            break; // exit while(player->stop_thread_video_decoder == 0)
        }else if(ret < 0){
            av_packet_unref(packet);
            LOGE("video_decoder avcodec_send_packet:%d\n", ret);
            continue;
        }

        while(ret >= 0)
        {
            ret = avcodec_receive_frame(videoCodecCtx, yuv_frame);
            if (ret == AVERROR(EAGAIN) ){
                //LOGD("video_decoder avcodec_receive_frame:%d\n", ret);
                break; // exit while(ret >= 0)
            }else if (ret < 0 || ret == AVERROR_EOF) {
                LOGW("video_decoder avcodec_receive_frame:%d %s\n", ret, av_err2str(ret));
                av_packet_unref(packet);
                goto end;
            }
            LOGI("video avcodec_receive_frame at %ld \n", clock()-start);

            // !test start
            //if ((pts = av_frame_get_best_effort_timestamp(yuv_frame)) == AV_NOPTS_VALUE)
            //    pts = 0;
            //LOGD("video current frame PTS : %lld\n",pts);
            //pts *= av_q2d(videoStream->time_base);
            //LOGD("video current frame PTS : %lld\n",pts);
            // !test end
            videoClock = yuv_frame->pts * av_q2d(videoStream->time_base);
            //LOGD(" current videoClock : %f\n", videoClock);

            //if(fabs(audioClock-videoClock) > AV_SYNC_THRESHOLD)
            //    continue;
            if (ret >= 0)
            {
                LOGI("video prepare draw at %ld \n", clock()-start);
                ANativeWindow_lock(nativeWindow, &nativeWinBuffer, NULL);
                // lock the ANativeWindow and bind the ANativeWindow_Buffer
                av_image_fill_arrays(rgb_frame->data, rgb_frame->linesize, nativeWinBuffer.bits,
                                     AV_PIX_FMT_RGBA, videoCodecCtx->width, videoCodecCtx->height, 1 );
                // point the rgb AVFrame at the ANativeWindow_Buffer's actual bits
                I420ToARGB(yuv_frame->data[0], yuv_frame->linesize[0],
                           yuv_frame->data[2], yuv_frame->linesize[2],
                           yuv_frame->data[1], yuv_frame->linesize[1],
                           rgb_frame->data[0], rgb_frame->linesize[0],
                           videoCodecCtx->width, videoCodecCtx->height);
                LOGI("video I420ToARGB at %ld \n", clock()-start);
                // convert the yuv AVFrame into the rgb AVFrame
                ANativeWindow_unlockAndPost(nativeWindow);
                // unlock and post: swap the display buffer onto the screen
                LOGI("video finish draw at %ld \n", clock()-start);
            }
        }
        av_packet_unref(packet);
        LOGD("video decoder %d round end at %ld ...\n", count, (clock()-start)/1000);
        LOGD("-------------------------------------\n");
    }

We added elapsed-time prints at each key point (clock() ticks; with CLOCKS_PER_SEC = 1,000,000 per POSIX, these read as microseconds of CPU time) and obtained the following log:

[sync_player.c][video_avframe_decoder][268]: video decoder 1 round start at 1223 ...
[sync_player.c][video_avframe_decoder][272]: video get_read_packet at 53 
[sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 39951 
[sync_player.c][video_avframe_decoder][336]: video decoder 1 round end at 40036 ...
[sync_player.c][video_avframe_decoder][337]: -------------------------------------
[sync_player.c][video_avframe_decoder][268]: video decoder 2 round start at 1263 ...
[sync_player.c][video_avframe_decoder][272]: video get_read_packet at 25 
[sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 28174 
[sync_player.c][video_avframe_decoder][298]: video avcodec_receive_frame at 28257 
[sync_player.c][video_avframe_decoder][317]: video prepare draw at 28290 
[sync_player.c][video_avframe_decoder][328]: video I420ToARGB at 115386 
[sync_player.c][video_avframe_decoder][332]: video finish draw at 116751 
[sync_player.c][video_avframe_decoder][336]: video decoder 2 round end at 116843 ...
[sync_player.c][video_avframe_decoder][337]: -------------------------------------
[sync_player.c][video_avframe_decoder][268]: video decoder 3 round start at 1380 ...
[sync_player.c][video_avframe_decoder][272]: video get_read_packet at 44 
[sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 16678 
[sync_player.c][video_avframe_decoder][298]: video avcodec_receive_frame at 16788 
[sync_player.c][video_avframe_decoder][317]: video prepare draw at 16831 
[sync_player.c][video_avframe_decoder][328]: video I420ToARGB at 90281 
[sync_player.c][video_avframe_decoder][332]: video finish draw at 91667 
[sync_player.c][video_avframe_decoder][336]: video decoder 3 round end at 91758 ...
[sync_player.c][video_avframe_decoder][337]: -------------------------------------

Analyzing the first three rounds already makes it obvious: the bulk of each round's time is I420ToARGB, i.e. the whole YUV-to-RGB conversion, averaging an astonishing ~100ms per frame, which turns the video into 10fps. (Cry-laughing.)
Some might say: drop libyuv and convert formats with ffmpeg's own SwsContext API instead; wouldn't that be faster? A quick search tells you the opposite: libyuv is commonly reported to be close to 10x faster than ffmpeg's built-in conversion.
A sharper reader might push up their glasses and say: please enable NEON instruction acceleration. Indeed, NEON can improve this too; I had not yet tried it at this point, and opinions online are mixed, but having NEON enabled certainly beats not having it.
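
If you want to confirm at runtime whether the device actually supports NEON before choosing a code path, the NDK's cpufeatures module does exactly this (a minimal sketch; the cpufeatures static library must be linked in):

#include <cpu-features.h>

// Returns 1 if running on an ARM CPU that reports NEON support.
static int device_has_neon(void) {
    return android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM &&
           (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0;
}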

So what now? The boss will not accept this result, and users will tolerate it even less! Since the YUV-to-RGB step costs so much, why not cut it out altogether and render YUV directly? Indeed, others thought of this long ago and found the answer in the player of early Android releases. The code is complex and dated, and Google has since abandoned AwesomePlayer, but it is still worth studying:

Displaying YUV data directly on a Surface in Android (Part 1)

Displaying YUV data directly on a Surface in Android (Part 2)

Displaying YUV data directly on a Surface in Android (Part 3)

And here is the thing: OpenGL can do this easily as well! So I will return to this in the next topic, OpenGL + shaders, and get our demo to a genuine 30fps, or even a higher 60fps!
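
As a taste of that approach (a sketch, not this demo's code): a GLES 2.0 fragment shader can sample the Y/U/V planes as three textures and convert each pixel to RGB on the GPU; the constants below are the usual BT.601 coefficients.

// Sketch: fragment shader source as a C string for glShaderSource().
static const char *kI420FragmentShader =
    "precision mediump float;                          \n"
    "varying vec2 vTexCoord;                           \n"
    "uniform sampler2D yTex;  // Y plane               \n"
    "uniform sampler2D uTex;  // U plane (half size)   \n"
    "uniform sampler2D vTex;  // V plane (half size)   \n"
    "void main() {                                     \n"
    "    float y = texture2D(yTex, vTexCoord).r;       \n"
    "    float u = texture2D(uTex, vTexCoord).r - 0.5; \n"
    "    float v = texture2D(vTex, vTexCoord).r - 0.5; \n"
    "    gl_FragColor = vec4(y + 1.402 * v,            \n"
    "                        y - 0.344 * u - 0.714 * v,\n"
    "                        y + 1.772 * u,            \n"
    "                        1.0);                     \n"
    "}";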

2019.01.24 update: upgraded libyuv with NEON support; performance improved.

Right after finishing the article, I set about enabling NEON support in both ffmpeg and libyuv. Enabling NEON in either library is actually quite simple.

For ffmpeg, just add --enable-neon to the build script introduced earlier in "Building the FFmpeg library for Android by hand (Linux)":

#!/bin/bash
make clean
export NDK=/usr/ndk/android-ndk-r14/
export SYSROOT=$NDK/platforms/android-9/arch-arm/
export TOOLCHAIN=$NDK/toolchains/arm-linux-androideabi-4.8/prebuilt/linux-x86_64
export CPU=arm
export PREFIX=$(pwd)/android/$CPU
export ADDI_CFLAGS="-marm"
 
./configure --target-os=linux \
--prefix=$PREFIX --arch=arm \
--disable-doc \
--enable-shared \
--disable-static \
--disable-yasm \
--disable-symver \
--enable-gpl \
--enable-neon \
--disable-ffmpeg \
--disable-ffplay \
--disable-ffprobe \
--disable-ffserver \
--cross-prefix=$TOOLCHAIN/bin/arm-linux-androideabi- \
--enable-cross-compile \
--sysroot=$SYSROOT \
--extra-cflags="-Os -fpic $ADDI_CFLAGS" \
--extra-ldflags="$ADDI_LDFLAGS" \
$ADDITIONAL_CONFIGURE_FLAG
make clean
make
make install


I don't think I previously covered building libyuv from source, since it is fairly simple; https://blog.csdn.net/abcdnml/article/details/70243783 covers it well. The Android.mk looks like this:

LOCAL_PATH:= $(call my-dir)
include $(CLEAR_VARS)

LOCAL_CPP_EXTENSION := .cc
LOCAL_SRC_FILES := \
    source/compare.cc           \
    source/compare_common.cc    \
    source/compare_neon64.cc    \
    source/compare_gcc.cc       \
    source/convert.cc           \
    source/convert_argb.cc      \
    source/convert_from.cc      \
    source/convert_from_argb.cc \
    source/convert_to_argb.cc   \
    source/convert_to_i420.cc   \
    source/cpu_id.cc            \
    source/planar_functions.cc  \
    source/rotate.cc            \
    source/rotate_argb.cc       \
    source/rotate_mips.cc       \
    source/rotate_neon64.cc     \
    source/row_any.cc           \
    source/row_common.cc        \
    source/row_mips.cc          \
    source/row_neon64.cc        \
    source/row_gcc.cc           \
    source/scale.cc             \
    source/scale_any.cc         \
    source/scale_argb.cc        \
    source/scale_common.cc      \
    source/scale_mips.cc        \
    source/scale_neon64.cc      \
    source/scale_gcc.cc         \
    source/video_common.cc

# TODO(fbarchard): Enable mjpeg encoder.
#   source/mjpeg_decoder.cc
#   source/convert_jpeg.cc
#   source/mjpeg_validate.cc

ifeq ($(TARGET_ARCH_ABI),armeabi-v7a)
    LOCAL_CFLAGS += -DLIBYUV_NEON
    LOCAL_SRC_FILES += \
        source/compare_neon.cc.neon    \
        source/rotate_neon.cc.neon     \
        source/row_neon.cc.neon        \
        source/scale_neon.cc.neon
endif

LOCAL_EXPORT_C_INCLUDES := $(LOCAL_PATH)/include
LOCAL_C_INCLUDES += $(LOCAL_PATH)/include

LOCAL_MODULE := libyuv
LOCAL_MODULE_TAGS := optional

include $(BUILD_SHARED_LIBRARY)

As you can see from libyuv's build script, when TARGET_ARCH_ABI == armeabi-v7a it already adds NEON support automatically. So all we have to do is write an additional Application.mk specifying the target ABIs, APP_ABI := armeabi armeabi-v7a mips, and then use the .so from libs/armeabi-v7a.
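
A minimal Application.mk for that (the platform level is my assumption, matching the ffmpeg script above):

APP_ABI := armeabi armeabi-v7a mips
APP_PLATFORM := android-9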


After the upgrade, let's look at the decoder thread's debug log again.

 [sync_player.c][audio_avframe_decoder][140]: audio steam time_base : 1/48000 
 [sync_player.c][video_avframe_decoder][235]: video steam time_base : 1/29970 
 [sync_player.c][video_avframe_decoder][268]: video decoder 1 round start at 1190 ...
 [sync_player.c][video_avframe_decoder][272]: video get_read_packet at 78 
 [sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 39669 
 [sync_player.c][video_avframe_decoder][335]: video decoder 1 round end at 39 ...
 [sync_player.c][video_avframe_decoder][336]: -------------------------------------
 [sync_player.c][video_avframe_decoder][268]: video decoder 2 round start at 1230 ...
 [sync_player.c][video_avframe_decoder][272]: video get_read_packet at 23 
 [sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 39946 
 [sync_player.c][video_avframe_decoder][298]: video avcodec_receive_frame at 40033 
 [sync_player.c][video_avframe_decoder][335]: video decoder 2 round end at 40 ...
 [sync_player.c][video_avframe_decoder][336]: -------------------------------------
 [sync_player.c][video_avframe_decoder][268]: video decoder 3 round start at 1270 ...
 [sync_player.c][video_avframe_decoder][272]: video get_read_packet at 23 
 [sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 15373 
 [sync_player.c][video_avframe_decoder][298]: video avcodec_receive_frame at 15459 
 [sync_player.c][video_avframe_decoder][317]: video prepare draw at 15497 
 [sync_player.c][video_avframe_decoder][331]: video finish draw at 25104 
 [sync_player.c][video_avframe_decoder][335]: video decoder 3 round end at 25 ...
 [sync_player.c][video_avframe_decoder][336]: -------------------------------------
 [sync_player.c][video_avframe_decoder][268]: video decoder 4 round start at 1295 ...
 [sync_player.c][video_avframe_decoder][272]: video get_read_packet at 23 
 [sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 38360 
 [sync_player.c][video_avframe_decoder][298]: video avcodec_receive_frame at 38448 
 [sync_player.c][video_avframe_decoder][335]: video decoder 4 round end at 38 ...
 [sync_player.c][video_avframe_decoder][336]: -------------------------------------
 [sync_player.c][video_avframe_decoder][268]: video decoder 5 round start at 1334 ...
 [sync_player.c][video_avframe_decoder][272]: video get_read_packet at 23 
 [sync_player.c][video_avframe_decoder][276]: video avcodec_send_packet at 49854 
 [sync_player.c][video_avframe_decoder][298]: video avcodec_receive_frame at 49948 
 [sync_player.c][video_avframe_decoder][317]: video prepare draw at 49990 
 [sync_player.c][video_avframe_decoder][331]: video finish draw at 58722 
 [sync_player.c][video_avframe_decoder][335]: video decoder 5 round end at 58 ...
 [sync_player.c][video_avframe_decoder][336]: -------------------------------------

Comparing the video finish draw minus video prepare draw deltas, the conversion-and-render step is now roughly 10x faster (round 3: 25104 - 15497 ≈ 9.6ms, versus ~88ms in round 2 before the upgrade).
