I have gone through quite a few examples already. This chapter is about video synchronization, and there is a lot of new material to learn. Let me start by porting the code.
#include "stdafx.h" #ifdef TUTORIAL_05 // tutorial05.c // A pedagogical video player that really works! // // This tutorial was written by Stephen Dranger ([email protected]). // // Code based on FFplay, Copyright (c) 2003 Fabrice Bellard, // and a tutorial by Martin Bohme ([email protected]) // Tested on Gentoo, CVS version 5/01/07 compiled with GCC 4.1.1 // // Use the Makefile to build all the samples. // // Run using // tutorial05 myvideofile.mpg // // to play the video. extern "C" { #include "libavutil/avstring.h" #include "libavutil/mathematics.h" #include "libavutil/pixdesc.h" #include "libavutil/imgutils.h" #include "libavutil/dict.h" #include "libavutil/parseutils.h" #include "libavutil/samplefmt.h" #include "libavutil/avassert.h" #include "libavutil/time.h" #include "libavformat/avformat.h" #include "libavdevice/avdevice.h" #include "libswscale/swscale.h" #include "libavutil/opt.h" #include "libavcodec/avfft.h" #include "libswresample/swresample.h" #include "SDL1.2/SDL.h" #include "SDL1.2/SDL_thread.h" } #pragma comment(lib, "avcodec.lib") #pragma comment(lib, "avformat.lib") #pragma comment(lib, "avutil.lib") #pragma comment(lib, "avdevice.lib") #pragma comment(lib, "avfilter.lib") #pragma comment(lib, "postproc.lib") #pragma comment(lib, "swresample.lib") #pragma comment(lib, "swscale.lib") #pragma comment(lib, "SDL.lib") #ifdef __MINGW32__ #undef main /* Prevents SDL from overriding main() */ #endif #include <stdio.h> #include <math.h> #define SDL_AUDIO_BUFFER_SIZE 1024 #define MAX_AUDIO_FRAME_SIZE 192000 #define MAX_AUDIOQ_SIZE (5 * 16 * 1024) #define MAX_VIDEOQ_SIZE (5 * 256 * 1024) #define AV_SYNC_THRESHOLD 0.01 #define AV_NOSYNC_THRESHOLD 10.0 #define FF_ALLOC_EVENT (SDL_USEREVENT) #define FF_REFRESH_EVENT (SDL_USEREVENT + 1) #define FF_QUIT_EVENT (SDL_USEREVENT + 2) #define VIDEO_PICTURE_QUEUE_SIZE 1 // BD int g_iIndex_video_pkt = 0; // ED typedef struct PacketQueue { AVPacketList *first_pkt, *last_pkt; int nb_packets; int size; SDL_mutex *mutex; SDL_cond *cond; } PacketQueue; typedef struct VideoPicture { SDL_Overlay *bmp; int width, height; /* source height & width */ int allocated; double pts; // BD AVPictureType type; int iIndex; // ED } VideoPicture; typedef struct VideoState { AVFormatContext *pFormatCtx; int videoStream, audioStream; // audio double audio_clock; AVStream *audio_st; PacketQueue audioq; AVFrame audio_frame; uint8_t audio_buf[(MAX_AUDIO_FRAME_SIZE * 3) / 2]; unsigned int audio_buf_size; unsigned int audio_buf_index; AVPacket audio_pkt; uint8_t *audio_pkt_data; int audio_pkt_size; int audio_hw_buf_size; double frame_timer; double frame_last_pts; double frame_last_delay; // video double video_clock; ///<pts of last decoded frame / predicted pts of next decoded frame AVStream *video_st; PacketQueue videoq; VideoPicture pictq[VIDEO_PICTURE_QUEUE_SIZE]; int pictq_size, pictq_rindex, pictq_windex; SDL_mutex *pictq_mutex; SDL_cond *pictq_cond; SDL_Thread *parse_tid; SDL_Thread *video_tid; char filename[1024]; int quit; AVIOContext *io_context; struct SwsContext *sws_ctx; } VideoState; SDL_Surface *screen; /* Since we only have one decoding thread, the Big Struct can be global in case we need it. */ VideoState *global_video_state; struct SwrContext *swr_ctx; DECLARE_ALIGNED(16, uint8_t, audio_buf2)[MAX_AUDIO_FRAME_SIZE * 4]; static inline double rint(double x) { return x >= 0 ? 
floor(x + 0.5) : ceil(x - 0.5); } void packet_queue_init(PacketQueue *q) { memset(q, 0, sizeof(PacketQueue)); q->mutex = SDL_CreateMutex(); q->cond = SDL_CreateCond(); } int packet_queue_put(PacketQueue *q, AVPacket *pkt) { AVPacketList *pkt1; if( av_dup_packet(pkt) < 0 ) { return -1; } pkt1 = (AVPacketList *)av_malloc(sizeof(AVPacketList)); if( !pkt1 ) { return -1; } pkt1->pkt = *pkt; pkt1->next = NULL; SDL_LockMutex(q->mutex); if( !q->last_pkt ) { q->first_pkt = pkt1; } else { q->last_pkt->next = pkt1; } q->last_pkt = pkt1; q->nb_packets ++; q->size += pkt1->pkt.size; SDL_CondSignal(q->cond); SDL_UnlockMutex(q->mutex); return 0; } static int packet_queue_get(PacketQueue *q, AVPacket *pkt, int block) { AVPacketList *pkt1; int ret; SDL_LockMutex(q->mutex); for( ; ; ) { if( global_video_state->quit ) { ret = -1; break; } pkt1 = q->first_pkt; if( pkt1 ) { q->first_pkt = pkt1->next; if( !q->first_pkt ) { q->last_pkt = NULL; } q->nb_packets --; q->size -= pkt1->pkt.size; *pkt = pkt1->pkt; av_free(pkt1); ret = 1; break; } else if( !block ) { ret = 0; break; } else { SDL_CondWait(q->cond, q->mutex); } } SDL_UnlockMutex(q->mutex); return ret; } double get_audio_clock(VideoState *is) { double pts; int hw_buf_size, bytes_per_sec, n; // 当前音频buffer播放完的时间 pts = is->audio_clock; /* maintained in the audio thread */ // 当前音频buffer的剩余时间 hw_buf_size = is->audio_buf_size - is->audio_buf_index; bytes_per_sec = 0; // 计算音频1秒钟所需的数据量 n = is->audio_st->codec->channels * 2; if( is->audio_st ) { bytes_per_sec = is->audio_st->codec->sample_rate * n; } // (double)hw_buf_size / bytes_per_sec;为当前音频播放完还需要的时间 // pts减去上面的值得到当前的时间戳 if( bytes_per_sec ) { pts -= (double)hw_buf_size / bytes_per_sec; } return pts; } int audio_decode_frame(VideoState *is, double *pts_ptr) { int len1, data_size = 0, n; AVPacket *pkt = &is->audio_pkt; double pts; for( ; ; ) { while( is->audio_pkt_size > 0 ) { int got_frame; len1 = avcodec_decode_audio4(is->audio_st->codec, &is->audio_frame, &got_frame, pkt); if( len1 < 0 ) { /* if error, skip frame */ is->audio_pkt_size = 0; break; } if( got_frame ) { AVCodecContext* aCodecCtx = is->audio_st->codec; uint64_t dec_channel_layout = (aCodecCtx->channel_layout && aCodecCtx->channels == av_get_channel_layout_nb_channels(aCodecCtx->channel_layout)) ? 
aCodecCtx->channel_layout : av_get_default_channel_layout(aCodecCtx->channels); AVSampleFormat tgtFmt = AV_SAMPLE_FMT_S16; if( aCodecCtx->sample_fmt != tgtFmt ) { // 需要重采样 if( swr_ctx == NULL ) { swr_ctx = swr_alloc(); swr_ctx = swr_alloc_set_opts(swr_ctx, dec_channel_layout, tgtFmt, aCodecCtx->sample_rate, dec_channel_layout, aCodecCtx->sample_fmt, aCodecCtx->sample_rate, 0, NULL); if( !swr_ctx || swr_init(swr_ctx) < 0 ) { assert(false); } } if( swr_ctx ) { const uint8_t **in = (const uint8_t **)is->audio_frame.extended_data; uint8_t *out[] = {audio_buf2}; int out_count = sizeof(audio_buf2) / aCodecCtx->channels / av_get_bytes_per_sample(aCodecCtx->sample_fmt); int len2 = swr_convert(swr_ctx, out, out_count, in, is->audio_frame.nb_samples); if( len2 < 0 ) { LogPrintfA("swr_convert() failed\n"); break; } if( len2 == out_count ) { LogPrintfA("warning: audio buffer is probably too small\n"); swr_init(swr_ctx); } data_size = len2 * aCodecCtx->channels * av_get_bytes_per_sample(tgtFmt); memcpy(is->audio_buf, audio_buf2, data_size); } } else { // 不需要重采样 data_size = av_samples_get_buffer_size(NULL, aCodecCtx->channels, is->audio_frame.nb_samples, aCodecCtx->sample_fmt, 1); assert(data_size <= is->audio_buf_size); memcpy(is->audio_buf, is->audio_frame.data[0], data_size); } } is->audio_pkt_data += len1; is->audio_pkt_size -= len1; if( data_size <= 0 ) { /* No data yet, get more frames */ continue; } pts = is->audio_clock; *pts_ptr = pts; // 2为: 16位采样, 一次占用的字节数, 若非16位采样, 就要修改字节数了 // 这里是为了计算播放本次音频buffer所需的时间 n = 2 * is->audio_st->codec->channels; is->audio_clock += (double)data_size / (double)(n * is->audio_st->codec->sample_rate); //LogPrintf(_T("is->audio_clock: %f, plus: %f\n"), is->audio_clock, (double)data_size / (double)(n * is->audio_st->codec->sample_rate) ); /* We have data, return it and come back for more later */ return data_size; } if( pkt->data ) { av_free_packet(pkt); } if( is->quit ) { return -1; } /* next packet */ if( packet_queue_get(&is->audioq, pkt, 1) < 0 ) { return -1; } is->audio_pkt_data = pkt->data; is->audio_pkt_size = pkt->size; /* if update, update the audio clock w/pts */ if( pkt->pts != AV_NOPTS_VALUE ) { is->audio_clock = av_q2d(is->audio_st->time_base)*pkt->pts; } } } void audio_callback(void *userdata, Uint8 *stream, int len) { VideoState *is = (VideoState *)userdata; int len1, audio_size; double pts; while( len > 0 ) { if(is->audio_buf_index >= is->audio_buf_size) { /* We have already sent all our data; get more */ audio_size = audio_decode_frame(is, &pts); if( audio_size < 0 ) { /* If error, output silence */ is->audio_buf_size = 1024; memset(is->audio_buf, 0, is->audio_buf_size); } else { is->audio_buf_size = audio_size; } is->audio_buf_index = 0; } len1 = is->audio_buf_size - is->audio_buf_index; if( len1 > len ) { len1 = len; } memcpy(stream, (uint8_t *)is->audio_buf + is->audio_buf_index, len1); len -= len1; stream += len1; is->audio_buf_index += len1; } } static Uint32 sdl_refresh_timer_cb(Uint32 interval, void *opaque) { SDL_Event event; event.type = FF_REFRESH_EVENT; event.user.data1 = opaque; SDL_PushEvent(&event); return 0; /* 0 means stop timer */ } /* schedule a video refresh in 'delay' ms */ static void schedule_refresh(VideoState *is, int delay) { SDL_AddTimer(delay, sdl_refresh_timer_cb, is); } void video_display(VideoState *is) { SDL_Rect rect; VideoPicture *vp; //AVPicture pict; float aspect_ratio; int w, h, x, y; //int i; vp = &is->pictq[is->pictq_rindex]; if( vp->bmp ) { if(is->video_st->codec->sample_aspect_ratio.num == 0) { aspect_ratio = 0; } 
else { aspect_ratio = av_q2d(is->video_st->codec->sample_aspect_ratio) * is->video_st->codec->width / is->video_st->codec->height; } if( aspect_ratio <= 0.0 ) { aspect_ratio = (float)is->video_st->codec->width / (float)is->video_st->codec->height; } h = screen->h; w = ((int)rint(h * aspect_ratio)) & -3; if( w > screen->w ) { w = screen->w; h = ((int)rint(w / aspect_ratio)) & -3; } x = (screen->w - w) / 2; y = (screen->h - h) / 2; rect.x = x; rect.y = y; rect.w = w; rect.h = h; // BD //LogPrintfA("---------------------------------------------------------- [%05d] refresh bmp, Packet:%d, type: %s, pts: %f\n", // ::GetCurrentThreadId(), vp->iIndex, GetPictureTypeString(vp->type).c_str(), vp->pts); // ED SDL_DisplayYUVOverlay(vp->bmp, &rect); } } void video_refresh_timer(void *userdata) { VideoState *is = (VideoState *)userdata; VideoPicture *vp; double actual_delay, delay, sync_threshold, ref_clock, diff; if( is->video_st ) { if( is->pictq_size == 0 ) { schedule_refresh(is, 1); } else { // 目标: 计算下一帧图像的显示时间 vp = &is->pictq[is->pictq_rindex]; // frame_last_pts存着上一帧图像的pts, 用当前帧的pts减去上一帧的pts, 从而计算出一个估计的delay值 // 该delay值是上一帧图像已播放的时长 delay = vp->pts - is->frame_last_pts; /* the pts from last time */ // BD static int iIndex = 0; //LogPrintfA("上一帧播放时长为: %f\n", delay); // ED // 这个delay值有一个范围,如果超出范围的话,则用再上一次的delay值 if( delay <= 0 || delay >= 1.0 ) { /* if incorrect delay, use previous one */ delay = is->frame_last_delay; } /* save for next time */ is->frame_last_delay = delay; // 将当前帧的pts保存下来 is->frame_last_pts = vp->pts; /* update delay to sync to audio */ // ref_clock: audio播放的时间戳 ref_clock = get_audio_clock(is); diff = vp->pts - ref_clock; // BD //LogPrintfA("vp->pts: %f, ref_clock: %f, diff: %f; delay: %f\n", vp->pts, ref_clock, diff, delay); // ED /* Skip or repeat the frame. Take delay into account FFPlay still doesn't "know if this is the best guess." */ // delay和AV_SYNC_THRESHOLD之间取一个最大值 // new sync_threshold = FFMAX(delay, AV_SYNC_THRESHOLD); // 时间正负在(-0.01, 0.01)范围之外需要重新计算延迟 if( fabs(diff) < AV_NOSYNC_THRESHOLD ) { if( diff <= -sync_threshold ) { // 如果diff是个很小的负数,则说明当前视频帧已经落后于主时钟源了,下一帧图像应该快点显示,所以delay=0 delay = 0; } else if( diff >= sync_threshold ) { // 如果diff是一个比较大的正数,则说明当前视频帧已经超前于主时钟源了,下一帧图像应该延迟显示 delay = 2 * delay; } else { // diff是个可接受的数值, 可直接使用上一个delay // LogPrintfA("abcd\n"); } } else { assert(false); } // BD double frame_timer_old = is->frame_timer; // ED // frame_timer是一个delay累加的值, 加上delay后, frame_timer即为下一帧图像开始显示的时间 is->frame_timer += delay; /* computer the REAL delay */ // frame_timer减去当前系统时钟,得到一个actual_delay值 actual_delay = is->frame_timer - (av_gettime() / 1000000.0); if( actual_delay < 0.010 ) { /* Really it should skip the picture instead */ actual_delay = 0.010; } schedule_refresh(is, (int)(actual_delay * 1000 + 0.5)); /* show the picture! */ video_display(is); /* update queue for next picture! 
*/ if( ++ is->pictq_rindex == VIDEO_PICTURE_QUEUE_SIZE ) { is->pictq_rindex = 0; } SDL_LockMutex(is->pictq_mutex); is->pictq_size--; SDL_CondSignal(is->pictq_cond); SDL_UnlockMutex(is->pictq_mutex); } } else { schedule_refresh(is, 100); } } void alloc_picture(void *userdata) { VideoState *is = (VideoState *)userdata; VideoPicture *vp; vp = &is->pictq[is->pictq_windex]; if( vp->bmp ) { // we already have one make another, bigger/smaller SDL_FreeYUVOverlay(vp->bmp); } // Allocate a place to put our YUV image on that screen vp->bmp = SDL_CreateYUVOverlay(is->video_st->codec->width, is->video_st->codec->height, SDL_YV12_OVERLAY, screen); vp->width = is->video_st->codec->width; vp->height = is->video_st->codec->height; SDL_LockMutex(is->pictq_mutex); vp->allocated = 1; SDL_CondSignal(is->pictq_cond); SDL_UnlockMutex(is->pictq_mutex); } int queue_picture(VideoState *is, AVFrame *pFrame, double pts, int iIndex) { VideoPicture *vp; AVPicture pict; /* wait until we have space for a new pic */ SDL_LockMutex(is->pictq_mutex); while( is->pictq_size >= VIDEO_PICTURE_QUEUE_SIZE && !is->quit ) { SDL_CondWait(is->pictq_cond, is->pictq_mutex); } SDL_UnlockMutex(is->pictq_mutex); if( is->quit ) { return -1; } // windex is set to 0 initially vp = &is->pictq[is->pictq_windex]; /* allocate or resize the buffer! */ if( !vp->bmp || vp->width != is->video_st->codec->width || vp->height != is->video_st->codec->height ) { SDL_Event event; vp->allocated = 0; /* we have to do it in the main thread */ event.type = FF_ALLOC_EVENT; event.user.data1 = is; SDL_PushEvent(&event); /* wait until we have a picture allocated */ SDL_LockMutex(is->pictq_mutex); while( !vp->allocated && !is->quit ) { SDL_CondWait(is->pictq_cond, is->pictq_mutex); } SDL_UnlockMutex(is->pictq_mutex); if( is->quit ) { return -1; } } /* We have a place to put our picture on the queue */ /* If we are skipping a frame, do we set this to null but still return vp->allocated = 1? */ if( vp->bmp ) { SDL_LockYUVOverlay(vp->bmp); /* point pict at the queue */ pict.data[0] = vp->bmp->pixels[0]; pict.data[1] = vp->bmp->pixels[2]; pict.data[2] = vp->bmp->pixels[1]; pict.linesize[0] = vp->bmp->pitches[0]; pict.linesize[1] = vp->bmp->pitches[2]; pict.linesize[2] = vp->bmp->pitches[1]; // Convert the image into YUV format that SDL uses sws_scale ( is->sws_ctx, (uint8_t const * const *)pFrame->data, pFrame->linesize, 0, is->video_st->codec->height, pict.data, pict.linesize ); SDL_UnlockYUVOverlay(vp->bmp); vp->pts = pts; // BD vp->type = pFrame->pict_type; vp->iIndex = iIndex; // ED /* now we inform our display thread that we have a pic ready */ if( ++ is->pictq_windex == VIDEO_PICTURE_QUEUE_SIZE ) { is->pictq_windex = 0; } SDL_LockMutex(is->pictq_mutex); is->pictq_size++; SDL_UnlockMutex(is->pictq_mutex); } return 0; } /* * 这里就是简单的计算video_clock的值 */ double synchronize_video(VideoState *is, AVFrame *src_frame, double pts) { double frame_delay; if( pts != 0 ) { /* if we have pts, set video clock to it */ is->video_clock = pts; } else { /* if we aren't given a pts, set it to the clock */ pts = is->video_clock; } /* update the video clock */ // 若视频帧率为25fps, 则1帧耗时0.04s, 而这里time_base的值为1/50, 即0.02秒 frame_delay = av_q2d(is->video_st->codec->time_base); /* if we are repeating a frame, adjust clock accordingly */ frame_delay += src_frame->repeat_pict * (frame_delay * 0.5); is->video_clock += frame_delay; return pts; } uint64_t global_video_pkt_pts = AV_NOPTS_VALUE; /* These are called whenever we allocate a frame * buffer. 
We use this to store the global_pts in * a frame at the time it is allocated. */ int our_get_buffer(struct AVCodecContext *c, AVFrame *pic, int flags) { int ret = avcodec_default_get_buffer(c, pic); uint64_t *pts = (uint64_t *)av_malloc(sizeof(uint64_t)); *pts = global_video_pkt_pts; pic->opaque = pts; return ret; } void our_release_buffer(struct AVCodecContext *c, AVFrame *pic) { if( pic ) { av_freep(&pic->opaque); } avcodec_default_release_buffer(c, pic); } int video_thread(void *arg) { VideoState *is = (VideoState *)arg; AVPacket pkt1, *packet = &pkt1; int frameFinished; AVFrame *pFrame; double pts; pFrame = av_frame_alloc(); for( ; ; ) { if( packet_queue_get(&is->videoq, packet, 1) < 0 ) { // means we quit getting packets break; } pts = 0; // Save global pts to be stored in pFrame in first call global_video_pkt_pts = packet->pts; // Decode video frame int iRet = avcodec_decode_video2(is->video_st->codec, pFrame, &frameFinished, packet); if( iRet < 0 ) { // error int a=2; int b=a; } else if( iRet == 0 ) { // no frame could be decompressed int a=2; int b=a; } else { // ok } // BD LogPrintfA("[%05d] Packet:%d, type: %s, dts: %I64d, pts: %I64d\n", ::GetCurrentThreadId(), ++ g_iIndex_video_pkt, GetPictureTypeString(pFrame->pict_type).c_str(), packet->dts, packet->pts); // ED if( packet->dts == AV_NOPTS_VALUE && pFrame->opaque && *(uint64_t*)pFrame->opaque != AV_NOPTS_VALUE ) { pts = *(uint64_t *)pFrame->opaque; } else if( packet->dts != AV_NOPTS_VALUE ) { pts = packet->dts; } else { pts = 0; } // 根据pts来计算一桢在整个视频中的时间位置 pts *= av_q2d(is->video_st->time_base); // BD AVRational a1 = is->video_st->r_frame_rate; int64_t ptsBst = av_frame_get_best_effort_timestamp(pFrame); double ptsOld = pts; if( AV_PICTURE_TYPE_I == pFrame->pict_type ) { int a=2; int b=a; } // ED // Did we get a video frame? 
if( frameFinished ) { pts = synchronize_video(is, pFrame, pts); // BD if( ptsOld != pts ) { int a=2; int b=a; } //LogPrintfA("[%05d] Packet:%d, truely pts: %f\n", ::GetCurrentThreadId(), g_iIndex_video_pkt, pts); // ED if( queue_picture(is, pFrame, pts, g_iIndex_video_pkt) < 0 ) { break; } } av_free_packet(packet); } av_free(pFrame); return 0; } int stream_component_open(VideoState *is, int stream_index) { AVFormatContext *pFormatCtx = is->pFormatCtx; AVCodecContext *codecCtx = NULL; AVCodec *codec = NULL; AVDictionary *optionsDict = NULL; SDL_AudioSpec wanted_spec, spec; if(stream_index < 0 || stream_index >= pFormatCtx->nb_streams) { return -1; } // Get a pointer to the codec context for the video stream codecCtx = pFormatCtx->streams[stream_index]->codec; if( codecCtx->codec_type == AVMEDIA_TYPE_AUDIO ) { // Set audio settings from codec info wanted_spec.freq = codecCtx->sample_rate; wanted_spec.format = AUDIO_S16SYS; wanted_spec.channels = codecCtx->channels; wanted_spec.silence = 0; wanted_spec.samples = SDL_AUDIO_BUFFER_SIZE; wanted_spec.callback = audio_callback; wanted_spec.userdata = is; if( SDL_OpenAudio(&wanted_spec, &spec) < 0 ) { fprintf(stderr, "SDL_OpenAudio: %s\n", SDL_GetError()); return -1; } is->audio_hw_buf_size = spec.size; } codec = avcodec_find_decoder(codecCtx->codec_id); if( !codec || (avcodec_open2(codecCtx, codec, &optionsDict) < 0) ) { fprintf(stderr, "Unsupported codec!\n"); return -1; } switch( codecCtx->codec_type ) { case AVMEDIA_TYPE_AUDIO: { is->audioStream = stream_index; is->audio_st = pFormatCtx->streams[stream_index]; is->audio_buf_size = 0; is->audio_buf_index = 0; memset(&is->audio_pkt, 0, sizeof(is->audio_pkt)); packet_queue_init(&is->audioq); SDL_PauseAudio(0); } break; case AVMEDIA_TYPE_VIDEO: { is->videoStream = stream_index; is->video_st = pFormatCtx->streams[stream_index]; is->frame_timer = (double)av_gettime() / 1000000.0; is->frame_last_delay = 40e-3; // BD LogPrintfA("初始化: frame_timer: %f, frame_last_delay: %f\n", is->frame_timer, is->frame_last_delay); // ED packet_queue_init(&is->videoq); is->video_tid = SDL_CreateThread(video_thread, is); is->sws_ctx = sws_getContext ( is->video_st->codec->width, is->video_st->codec->height, is->video_st->codec->pix_fmt, is->video_st->codec->width, is->video_st->codec->height, PIX_FMT_YUV420P, SWS_BILINEAR, NULL, NULL, NULL ); codecCtx->get_buffer2 = our_get_buffer; codecCtx->release_buffer = our_release_buffer; } break; default: break; } return 0; } int decode_interrupt_cb(void *opaque) { return (global_video_state && global_video_state->quit); } int decode_thread(void *arg) { VideoState *is = (VideoState *)arg; AVFormatContext *pFormatCtx = NULL; AVPacket pkt1, *packet = &pkt1; AVDictionary *io_dict = NULL; AVIOInterruptCB callback; int video_index = -1; int audio_index = -1; int i; is->videoStream = -1; is->audioStream = -1; global_video_state = is; // will interrupt blocking functions if we quit! 
callback.callback = decode_interrupt_cb; callback.opaque = is; if( avio_open2(&is->io_context, is->filename, 0, &callback, &io_dict) ) { fprintf(stderr, "Unable to open I/O for %s\n", is->filename); return -1; } // Open video file if( avformat_open_input(&pFormatCtx, is->filename, NULL, NULL) != 0 ) { return -1; // Couldn't open file } is->pFormatCtx = pFormatCtx; // Retrieve stream information if( avformat_find_stream_info(pFormatCtx, NULL) < 0 ) { return -1; // Couldn't find stream information } // Dump information about file onto standard error av_dump_format(pFormatCtx, 0, is->filename, 0); // Find the first video stream for( i = 0; i < pFormatCtx->nb_streams; i++ ) { if( pFormatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO && video_index < 0 ) { video_index = i; } if( pFormatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO && audio_index < 0 ) { audio_index = i; } } if( audio_index >= 0 ) { stream_component_open(is, audio_index); } if( video_index >= 0 ) { stream_component_open(is, video_index); } if( is->videoStream < 0 || is->audioStream < 0 ) { fprintf(stderr, "%s: could not open codecs\n", is->filename); goto fail; } // Begin -- set video size by oldmtn // Make a screen to put our video int width = pFormatCtx->streams[video_index]->codec->width; int height = pFormatCtx->streams[video_index]->codec->height; screen = SDL_SetVideoMode(width, height, 0, 0); if( !screen ) { fprintf(stderr, "SDL: could not set video mode - exiting\n"); exit(1); } // End -- set video size by oldmtn // main decode loop for( ; ; ) { if( is->quit ) { break; } // seek stuff goes here if( is->audioq.size > MAX_AUDIOQ_SIZE || is->videoq.size > MAX_VIDEOQ_SIZE ) { SDL_Delay(10); continue; } if( av_read_frame(is->pFormatCtx, packet) < 0 ) { if( is->pFormatCtx->pb->error == 0 ) { SDL_Delay(100); /* no error; wait for user input */ continue; } else { break; } } // Is this a packet from the video stream? if( packet->stream_index == is->videoStream ) { packet_queue_put(&is->videoq, packet); } else if( packet->stream_index == is->audioStream ) { packet_queue_put(&is->audioq, packet); } else { av_free_packet(packet); } } /* all done - wait for it */ while( !is->quit ) { SDL_Delay(100); } fail: { SDL_Event event; event.type = FF_QUIT_EVENT; event.user.data1 = is; SDL_PushEvent(&event); } return 0; } int _tmain() { SDL_Event event; VideoState *is; is = (VideoState *)av_mallocz(sizeof(VideoState)); //char szFile[] = "cuc_ieschool.flv"; char szFile[] = "edu.flv"; //char szFile[] = "song.flv"; //char szFile[] = "drj.mkv"; //char szFile[] = "city.mkv"; // Register all formats and codecs av_register_all(); if(SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER)) { fprintf(stderr, "Could not initialize SDL - %s\n", SDL_GetError()); exit(1); } av_strlcpy(is->filename, szFile, 1024); is->pictq_mutex = SDL_CreateMutex(); is->pictq_cond = SDL_CreateCond(); schedule_refresh(is, 40); is->parse_tid = SDL_CreateThread(decode_thread, is); if(!is->parse_tid) { av_free(is); return -1; } for( ; ; ) { SDL_WaitEvent(&event); switch(event.type) { case FF_QUIT_EVENT: case SDL_QUIT: is->quit = 1; /* * If the video has finished playing, then both the picture and * audio queues are waiting for more data. Make them stop * waiting and terminate normally. 
*/ SDL_CondSignal(is->audioq.cond); SDL_CondSignal(is->videoq.cond); SDL_Quit(); exit(0); break; case FF_ALLOC_EVENT: alloc_picture(event.user.data1); break; case FF_REFRESH_EVENT: video_refresh_timer(event.user.data1); break; default: break; } } return 0; } #endif // TUTORIAL_05
PTS and DTS
I have not been working with FFmpeg applications for long. There are 8 tutorials in total and this is the 5th; it has taken me the longest and has been the hardest to understand. The movie file is first split into audio and video streams, and every packet carries a corresponding pts. Audio is kept in sync automatically by the sound card clock, and the audio pts is what we use to synchronize the video.
Both audio and video keep a running total of how much has been played: audio_clock and video_clock. ffmpeg-tutorial05 compares these two clocks and adjusts the delay of the current video frame accordingly, which is how audio/video synchronization is achieved.
Fortunately, both the audio and video streams contain information about how fast and at what time they are supposed to be played. Audio streams have a sample rate, and video streams have a frames-per-second value. However, if we simply synchronized the video by counting frames and multiplying by the frame rate, there is a good chance it would drift out of sync with the audio. As a complement, packets in a stream carry what are called a DTS (decoding timestamp) and a PTS (presentation timestamp). To understand these two values, you need to know how movies are stored. Some formats, like MPEG, use what are called B frames (B stands for bidirectional). The other two kinds of frames are I frames and P frames (I for intra/key frame, P for predicted frame). An I frame contains a complete image. A P frame depends on the previous I and P frames and is encoded as differences against them. A B frame is similar to a P frame, except that it depends on information from both the preceding and the following frames. This also explains why we may not get a finished picture right after calling avcodec_decode_video.
So for a movie, the frames might be displayed in this order: I B B P. But we need to know the information in the P frame before we can display the B frames, so the frames may be stored like this: I P B B. This is why we have both a decoding timestamp and a presentation timestamp: the decoding timestamp tells us when a frame needs to be decoded, and the presentation timestamp tells us when it needs to be displayed. So in this case, our stream might look like this:
PTS: 1 4 2 3
DTS: 1 2 3 4
Stream: I P B B
Generally the PTS and DTS will only differ when the stream contains B frames.
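For a quick look at how the two timestamps differ in practice, you can print both as packets are read. This is only an illustrative sketch; packet and is->videoStream follow the naming of the full listing above, and a packet without a timestamp will show up as AV_NOPTS_VALUE (a large negative number):
// Sketch: print both timestamps for each video packet (debugging only)
if(packet->stream_index == is->videoStream) {
printf("video packet: pts=%lld dts=%lld\n",
(long long)packet->pts, (long long)packet->dts);
}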
From stepping through the code myself, not every AVPacket actually carries a valid PTS.
When we get a packet from av_read_frame(), the PTS and DTS for the data inside that packet are stored in it. But what we really want is the PTS of the newly decoded raw frame, so that we know when to display it. The frame we get back from avcodec_decode_video(), however, is just an AVFrame that doesn't contain a useful PTS value (note: the AVFrame does have a pts field, but it won't always hold what we want at the moment we get the frame). Luckily, ffmpeg reorders the packets so that the DTS of the packet being processed by avcodec_decode_video() always matches the PTS of the frame it returns. But there is another caveat: we won't always get that information either.
Not to worry, because there is another way to find the PTS of a frame: we can have the program reorder the packets itself. We save the PTS of the first packet of a frame and use it as the PTS of the whole frame. And we can figure out which packet is the first packet of a frame with the help of avcodec_decode_video(). How? Whenever a packet starts a new frame, avcodec_decode_video() calls a function to allocate a buffer for that frame, and ffmpeg allows us to redefine that allocation function. So we write a new function that stashes the packet's timestamp there.
Of course, even then, we might not get a proper timestamp. We'll deal with that problem later.
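As a side note, newer FFmpeg versions can also make this guess for us: av_frame_get_best_effort_timestamp() returns the most plausible timestamp for a decoded frame, and the full source listing above queries it too. A minimal sketch, assuming pFrame has just been filled by the decoder:
// Sketch: let FFmpeg pick the most plausible timestamp for the decoded frame
int64_t best = av_frame_get_best_effort_timestamp(pFrame);
if(best == AV_NOPTS_VALUE) {
best = 0;
}
pts = best * av_q2d(is->video_st->time_base); // convert to seconds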
Synchronization
Now, it's great to know when we're supposed to show a particular video frame, but how do we actually do it? Here's the idea: after we show a frame, we figure out when the next frame should be shown, and then simply set a new timeout for that amount of time. As you might expect, we check the PTS of the next frame against the system clock to see how long the timeout should be. This approach works, but there are two issues that need to be dealt with.
First, how do we know what the next PTS will be? You might think we could just add the frame duration given by the video rate to the current PTS, and you would be mostly right. However, some kinds of video call for frames to be repeated, which means we play the current frame again. That would make the program show the next frame too soon, so we need to account for it.
Second, as the program stands right now, the video and the audio happily play along without making any effort to stay in sync. If everything worked perfectly we wouldn't have to worry, but your computer isn't perfect and neither are many video files. So we have three choices: sync the audio to the video, sync the video to the audio, or sync both to an external clock (such as your computer's clock). From here on, we will sync the video to the audio.
Coding it: getting the frame's timestamp
Now let's go into the code and do all of this. We'll need to add some members to our big struct, but we'll do that as we need them. First, let's look at the video thread. Remember, this is where we pick up the packets that the decode thread put on the queue. What we need here is to get the timestamp of the frame that avcodec_decode_video gives us. The first way we discussed was getting the DTS of the last packet we processed, which is easy:
double pts;
for(;;) {
if(packet_queue_get(&is->videoq, packet, 1) < 0) {
// means we quit getting packets
break;
}
pts = 0;
// Decode video frame
len1 = avcodec_decode_video(is->video_st->codec,
pFrame, &frameFinished,
packet->data, packet->size);
if(packet->dts != AV_NOPTS_VALUE) {
pts = packet->dts;
} else {
pts = 0;
}
pts *= av_q2d(is->video_st->time_base); // time_base is 1/frame_rate; here it is 1/25
If we can't figure out the PTS, we just set it to 0.
Ok, that was easy. But as we said, if the packet's DTS doesn't help us, we need to use the PTS of the first packet that belongs to this frame. We do this by telling ffmpeg to use our own functions to allocate the frame. Here are the function prototypes:
int get_buffer(struct AVCodecContext *c, AVFrame *pic);
void release_buffer(struct AVCodecContext *c, AVFrame *pic);
The allocation function doesn't tell us anything about packets, so every time we get a packet we save its PTS to a global variable, and our get_buffer function can read it from there. Then we stash that value in a somewhat obscure field of the AVFrame structure (opaque). So to start with, these are our functions:
uint64_t global_video_pkt_pts = AV_NOPTS_VALUE;
// AV_NOPTS_VALUE here serves as a "no value" marker, much like NULL. our_get_buffer and our_release_buffer are user-defined and are assigned to the AVCodecContext's get_buffer and release_buffer, so whenever ffmpeg allocates or releases a frame buffer it ends up calling our functions.
int our_get_buffer(struct AVCodecContext *c, AVFrame *pic) {
int ret = avcodec_default_get_buffer(c, pic);
uint64_t *pts = av_malloc(sizeof(uint64_t));
*pts = global_video_pkt_pts;
pic->opaque = pts;
return ret;
}
void our_release_buffer(struct AVCodecContext *c, AVFrame *pic) {
if(pic) av_freep(&pic->opaque);
avcodec_default_release_buffer(c, pic);
}
avcodec_default_get_buffer and avcodec_default_release_buffer are ffmpeg's default buffer-allocation functions. av_freep is a memory-management helper that not only frees the memory but also sets the pointer to NULL.
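A tiny illustration of that difference (not part of the tutorial code):
uint64_t *p = (uint64_t *)av_malloc(sizeof(uint64_t));
av_freep(&p); // frees the memory AND sets p to NULL
// av_free(p) would only free the memory and leave p dangling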
Now, going down to our stream-open function (stream_component_open), we add these lines to tell ffmpeg to use them:
codecCtx->get_buffer = our_get_buffer;
codecCtx->release_buffer = our_release_buffer;
Now we have to add the code that saves the PTS to the global variable and then uses the stored value when needed. Our code now looks like this:
for(;;) {
if(packet_queue_get(&is->videoq, packet, 1) < 0) {
// means we quit getting packets
break;
}
pts = 0;
// Save global pts to be stored in pFrame in first call
global_video_pkt_pts = packet->pts;
// Decode video frame
len1 = avcodec_decode_video(is->video_st->codec, pFrame, &frameFinished,
packet->data, packet->size);
if(packet->dts == AV_NOPTS_VALUE
&& pFrame->opaque && *(uint64_t*)pFrame->opaque != AV_NOPTS_VALUE) {
pts = *(uint64_t *)pFrame->opaque;
} else if(packet->dts != AV_NOPTS_VALUE) {
pts = packet->dts;
} else {
pts = 0;
}
pts *= av_q2d(is->video_st->time_base);
A technical note: you may have noticed that we use int64 for the PTS. That's because the PTS is stored as an integer. The value is a timestamp that measures time in units of the stream's time_base. For example, if a stream has 24 frames per second, a PTS of 42 means the frame should go where the 42nd frame would be, assuming we have a frame every 1/24 of a second (which is not necessarily true).
We can convert this value to seconds by dividing by the frame rate. For fixed-fps content the stream's time_base is 1/framerate, so to get the PTS in seconds we simply multiply the raw value by the time_base.
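A concrete example (with made-up numbers): if the stream's time_base is 1/25 and a frame has a raw pts of 100, av_q2d() turns the AVRational 1/25 into the double 0.04, so
double seconds = 100 * av_q2d(is->video_st->time_base); // 100 * 0.04 = 4.0 seconds
and the frame should be shown 4 seconds into the stream.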
Coding it: synchronizing using the PTS
So now we've got our PTS, and we have to take care of the two synchronization problems discussed above. We're going to define a function called synchronize_video that updates the PTS so everything stays in sync; it also finally handles the case where we don't get a PTS for a frame. At the same time, we need to know when the next frame is expected so we can set the refresh rate correctly. We can do that with an internal video_clock that tracks how much time has passed according to the video. We add these values to the big struct.
typedef struct VideoState {
double video_clock; ///< pts of last decoded frame / predicted pts of next decoded frame
Here's the synchronize_video function, which is pretty self-explanatory:
double synchronize_video(VideoState *is, AVFrame *src_frame, double pts) {
double frame_delay;
if(pts != 0) {
/* if we have pts, set video clock to it */
is->video_clock = pts;
} else {
/* if we aren't given a pts, set it to the clock */
pts = is->video_clock;
}
/* update the video clock */
frame_delay = av_q2d(is->video_st->codec->time_base);
/* if we are repeating a frame, adjust clock accordingly */
frame_delay += src_frame->repeat_pict * (frame_delay * 0.5);
is->video_clock += frame_delay;
return pts;
}
You'll notice that we account for repeated frames in this function, too.
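To make the repeat-frame adjustment concrete (illustrative numbers only): with a codec time_base of 1/25, frame_delay starts at 0.04 s. If src_frame->repeat_pict is 1, the code adds 1 * (0.04 * 0.5) = 0.02 s, so video_clock advances by 0.06 s for that frame instead of 0.04 s.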
Now let's get our proper PTS and queue up the frame using queue_picture, adding a new pts argument:
// Did we get a video frame?
if(frameFinished) {
pts = synchronize_video(is, pFrame, pts);
if(queue_picture(is, pFrame, pts) < 0) {
break;
}
}
The only thing that changes about queue_picture is that we save the pts value into the VideoPicture structure that we queue up, so we have to add a pts member to the struct and add one line of code:
typedef struct VideoPicture {
...
double pts;
} VideoPicture;
int queue_picture(VideoState *is, AVFrame *pFrame, double pts) {
... stuff ...
if(vp->bmp) {
... convert picture ...
vp->pts = pts;
... alert queue ...
}
So now we've got pictures lining up in our picture queue with proper PTS values; let's take a look at our video refresh function. You may recall that last time we simply faked it with an 80ms refresh time. Well, now we're going to find out the real value.
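As a reminder, schedule_refresh just arms an SDL timer whose callback pushes an FF_REFRESH_EVENT back to the main event loop, which in turn calls video_refresh_timer. This is the same mechanism used in the full listing above:
static Uint32 sdl_refresh_timer_cb(Uint32 interval, void *opaque) {
SDL_Event event;
event.type = FF_REFRESH_EVENT;
event.user.data1 = opaque;
SDL_PushEvent(&event);
return 0; /* 0 means stop timer */
}
/* schedule a video refresh in 'delay' ms */
static void schedule_refresh(VideoState *is, int delay) {
SDL_AddTimer(delay, sdl_refresh_timer_cb, is);
}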
Our strategy is to predict the time of the next frame by simply measuring the time between the previous timestamp and the current one. At the same time, we need to sync the video to the audio. We're going to maintain an audio clock: an internal value that tracks the position of the audio we're currently playing, like the readout on any mp3 player. Since we're syncing the video to the audio, the video thread uses this value to figure out whether it is running too fast or too slow.
We'll implement that code later; for now assume we already have a get_audio_clock function that gives us the time on the audio clock. Once we have that value, what should we do if the video and audio are out of sync? It would be silly and clumsy to try to jump to the correct frame by skipping or some other brute-force method. Instead, we adjust the value we've computed for the next refresh: if the PTS is too far behind the audio time, we double the computed delay; if the PTS is too far ahead of the audio time, we refresh as quickly as possible. Now that we have an adjusted refresh time, or delay, we compare it against our computer's clock using a running frame_timer. This frame_timer sums up all of the delays we've calculated while playing the movie; in other words, frame_timer is the time at which we should display the next frame. We simply add the new delay to the frame timer, compare it with the system clock, and use that value to schedule the next refresh. This may be a little confusing, so study the code carefully:
void video_refresh_timer(void *userdata) {
VideoState *is = (VideoState *)userdata;
VideoPicture *vp;
double actual_delay, delay, sync_threshold, ref_clock, diff;
if(is->video_st) {
if(is->pictq_size == 0) {
schedule_refresh(is, 1);
} else {
vp = &is->pictq[is->pictq_rindex];
delay = vp->pts - is->frame_last_pts;
if(delay <= 0 || delay >= 1.0) {
delay = is->frame_last_delay;
}
is->frame_last_delay = delay;
is->frame_last_pts = vp->pts;
ref_clock = get_audio_clock(is);
diff = vp->pts - ref_clock;
sync_threshold = (delay > AV_SYNC_THRESHOLD) ? delay : AV_SYNC_THRESHOLD;
if(fabs(diff) < AV_NOSYNC_THRESHOLD) {
if(diff <= -sync_threshold) {
delay = 0;
} else if(diff >= sync_threshold) {
delay = 2 * delay;
}
}
is->frame_timer += delay;
actual_delay = is->frame_timer - (av_gettime() / 1000000.0);
if(actual_delay < 0.010) {
actual_delay = 0.010;
}
schedule_refresh(is, (int)(actual_delay * 1000 + 0.5));
video_display(is);
if(++is->pictq_rindex == VIDEO_PICTURE_QUEUE_SIZE) {
is->pictq_rindex = 0;
}
SDL_LockMutex(is->pictq_mutex);
is->pictq_size--;
SDL_CondSignal(is->pictq_cond);
SDL_UnlockMutex(is->pictq_mutex);
}
} else {
schedule_refresh(is, 100);
}
}
is->frame_timer is the time at which the next frame should be refreshed (displayed).
is->frame_timer = (double)av_gettime() / 1000000.0; records the starting time of playback. Then, before each frame is displayed, we first compute the time at which that frame should play; only once we have that time can we set the timer for the refresh.
actual_delay = is->frame_timer - (av_gettime() / 1000000.0); is the concrete delay we need to schedule (is->frame_timer is the time at which the frame should play, av_gettime() / 1000000.0 is the current time, and their difference is the actual delay).
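A quick run through those two lines with made-up numbers: suppose is->frame_timer is 10.500 s after the delay has been added, and av_gettime() / 1000000.0 returns 10.462 s. Then actual_delay = 0.038 s, and schedule_refresh(is, (int)(0.038 * 1000 + 0.5)) arms the timer for 38 ms. If actual_delay had come out below 0.010 s, it would be clamped to 10 ms.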
/*********************************************************************
Note is->frame_timer here: it is first assigned in stream_component_open:
is->frame_timer = (double)av_gettime() / 1000000.0; takes the system time as the moment the first frame starts playing. After that, each frame's delay is accumulated onto it, so is->frame_timer is the playback time of the current frame:
is->frame_timer += delay;
First the program compares the frame's playback time with the audio clock:
diff = vp->pts - ref_clock;
and then with the system time:
is->frame_timer += delay;
actual_delay = is->frame_timer - (av_gettime() / 1000000.0);
*********************************************************************/
There are a few checks we make here. First, we make sure the delay between the current PTS and the previous PTS makes sense; if it doesn't, we guess and reuse the last delay. Next, we have a sync threshold, because things are never perfectly in sync (ffplay uses 0.01 for this value), and we make sure the threshold is never smaller than the gap between timestamps. Finally, we set the minimum refresh value to 10 milliseconds.
(I'm not sure where this sentence really belongs.) In fact, we should probably skip the frame in that case, but we're not going to bother.
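To make the threshold logic concrete (again, illustrative numbers): say delay = 0.040 s, so sync_threshold = max(0.040, AV_SYNC_THRESHOLD) = 0.040 s. If the frame's pts is 0.100 s behind the audio clock (diff = -0.100), then diff <= -sync_threshold and delay becomes 0, so the frame is shown as soon as possible. If it is 0.100 s ahead (diff = +0.100), delay doubles to 0.080 s. And if |diff| were larger than AV_NOSYNC_THRESHOLD (10 seconds), the clocks are considered hopelessly far apart and the delay is left unchanged.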
We've added quite a few variables to the big struct, so don't forget to check the code. Also don't forget to initialize the frame timer frame_timer and the initial frame delay frame_last_delay in stream_component_open:
av_gettime() returns the time in microseconds, so we divide by 1,000,000 to convert it to seconds.
is->frame_timer = (double)av_gettime() / 1000000.0;
is->frame_last_delay = 40e-3;
Synchronization: the audio clock
Now let's see how we get the audio clock. We can update the clock time in the audio decoding function, audio_decode_frame. Remember, we don't process a new packet every time this function is called, so there are two places where we have to update the clock. The first is when we get a new packet: we simply set the audio clock to the packet's PTS. Then, since a packet can contain many frames, we keep the audio clock up to date using the number of samples and the sample rate. So, when we get the packet:
if(pkt->pts != AV_NOPTS_VALUE) {
is->audio_clock = av_q2d(is->audio_st->time_base)*pkt->pts;
}
And then, as we process the packet:
pts = is->audio_clock;
*pts_ptr = pts;
n = 2 * is->audio_st->codec->channels;
is->audio_clock += (double)data_size /
(double)(n * is->audio_st->codec->sample_rate);
A small detail: the function's prototype has changed to include pts_ptr, so make sure you update it. pts_ptr is a pointer used to tell audio_callback the PTS of the current audio packet; it will be used next time to synchronize the audio with the video.
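In audio_callback the caller simply passes the address of a local variable, as in the full listing above:
/* inside audio_callback */
double pts;
audio_size = audio_decode_frame(is, &pts);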
Now we can finally implement our get_audio_clock function. It's not as simple as just returning is->audio_clock. Notice that we set the audio timestamp every time we process audio, but if you look at audio_callback, it takes time to move all the data from the audio packet into our output buffer. That means the value recorded in the audio clock can be too far ahead of what is actually playing, so we have to check how much of the buffer is still left to be written out. Here is the complete code:
double get_audio_clock(VideoState *is) {
double pts;
int hw_buf_size, bytes_per_sec, n;
pts = is->audio_clock;
hw_buf_size = is->audio_buf_size - is->audio_buf_index;
bytes_per_sec = 0;
n = is->audio_st->codec->channels * 2;
if(is->audio_st) {
bytes_per_sec = is->audio_st->codec->sample_rate * n;
}
if(bytes_per_sec) {
pts -= (double)hw_buf_size / bytes_per_sec;
}
return pts;
}
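A worked example with made-up numbers: suppose is->audio_clock is 5.000 s, the decoded buffer holds 8192 bytes of which 4096 have already been consumed (audio_buf_index = 4096), and the stream is 2-channel 16-bit audio at 44100 Hz. Then hw_buf_size = 4096, bytes_per_sec = 44100 * 2 * 2 = 176400, and the clock is pulled back by 4096 / 176400 ≈ 0.023 s, so get_audio_clock returns roughly 4.977 s as the current audio position.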