ffmpeg: decoding video and encoding the decoded data

Preface

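    Developing one of our company's products required FFmpeg, and one problem I ran into was how to extract an image in a specific format from a video file to use as the thumbnail in a video list. The process is simple: read a video frame, decode it, then encode it according to the settings of the target image. Most search results, however, only cover the ffmpeg command line. This article follows the first part of the "An ffmpeg and SDL Tutorial" series and adds an implementation that encodes the image extracted from the video. If you need to build the FFmpeg libraries, see this introduction.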

Initializing FFmpeg

The FFmpeg libraries must be initialized before use. For the avformat library used here, that means calling:
/**
 * Initialize libavformat and register all the muxers, demuxers and
 * protocols. If you do not call this function, then you can select
 * exactly which formats you want to support.
 *
 * @see av_register_input_format()
 * @see av_register_output_format()
 */
void av_register_all(void); // such functions are generally named avXXX_register_all()

Opening the video file and initializing the decoder

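After av_register_all has run successfully, call avformat_open_input. Its first argument is the address of the AVFormatContext pointer for our video file, and the second is the path of the video file (the last two arguments are ignored here). Then call avformat_find_stream_info to retrieve the stream information.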

AVFormatContext   *pInputFormatCtx = 0;
if (avformat_open_input (&pInputFormatCtx, filePath, 0, 0) != 0)
{
	return -1;
}
if (avformat_find_stream_info (pInputFormatCtx, 0) < 0)
{
	return -1;
}
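The AVStream structure of a video stream contains a codec context (AVCodecContext *). Its codec_id field tells us how the video is encoded, so that we can open the right decoder. First find the video stream:
int videoStream = -1;
for (uint32_t i = 0; i < pInputFormatCtx->nb_streams; i++)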
{
	if (pInputFormatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO)
	{
		videoStream = i;
		break;
	}
}
Find the decoder:
AVCodecContext *pInputCodecCtx = pInputFormatCtx->streams[videoStream]->codec;
AVCodec *pDecoderCodec = avcodec_find_decoder (pInputCodecCtx->codec_id);
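The decoder is opened with avcodec_open2, but it should not be called directly on the stream's own codec context. First look at its documentation: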
/**
 * Initialize the AVCodecContext to use the given AVCodec. Prior to using this
 * function the context has to be allocated with avcodec_alloc_context3().
 *
 * The functions avcodec_find_decoder_by_name(), avcodec_find_encoder_by_name(),
 * avcodec_find_decoder() and avcodec_find_encoder() provide an easy way for
 * retrieving a codec.
 *
 * @warning This function is not thread safe!
 *
 * @note Always call this function before using decoding routines (such as
 * @ref avcodec_decode_video2()).
 *
 * @code
 * avcodec_register_all();
 * av_dict_set(&opts, "b", "2.5M", 0);
 * codec = avcodec_find_decoder(AV_CODEC_ID_H264);
 * if (!codec)
 *     exit(1);
 *
 * context = avcodec_alloc_context3(codec);
 *
 * if (avcodec_open2(context, codec, opts) < 0)
 *     exit(1);
 * @endcode
 *
 * @param avctx The context to initialize.
 * @param codec The codec to open this context for. If a non-NULL codec has been
 *              previously passed to avcodec_alloc_context3() or
 *              avcodec_get_context_defaults3() for this context, then this
 *              parameter MUST be either NULL or equal to the previously passed
 *              codec.
 * @param options A dictionary filled with AVCodecContext and codec-private options.
 *                On return this object will be filled with options that were not found.
 *
 * @return zero on success, a negative value on error
 * @see avcodec_alloc_context3(), avcodec_find_decoder(), avcodec_find_encoder(),
 *      av_dict_set(), av_opt_find().
 */
int avcodec_open2(AVCodecContext *avctx, const AVCodec *codec, AVDictionary **options);
The documentation requires a context allocated with avcodec_alloc_context3, so the code here should be:
AVCodecContext *pDecoderCodecCtx = avcodec_alloc_context3 (pDecoderCodec);
if (avcodec_copy_context (pDecoderCodecCtx, pInputCodecCtx) != 0)
{
	fprintf (stderr, "Couldn't copy codec context");
	return -1;
}
if (avcodec_open2 (pDecoderCodecCtx, pDecoderCodec, NULL) < 0)
{
	fprintf (stderr, "Couldn't init decode codeccontext to use avcodec");
	return -1;
}

Reading video frames and decoding

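Most FFmpeg structures cannot be allocated directly with malloc, and likewise are not meant to be released directly with free. Here we need space to hold the frame data that is read. In FFmpeg a frame is represented by the AVFrame structure: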
AVFrame *pRawFrame = av_frame_alloc ();
Reading and decoding use two functions, av_read_frame and avcodec_decode_video2:
AVPacket packet;
int done = 0;
while (!done && av_read_frame (pInputFormatCtx, &packet) >= 0)
{
	int frameFinished = 0;
	if (packet.stream_index == videoStream) // is this packet from the video stream?
	{
		// Decode the video frame
		avcodec_decode_video2 (pDecoderCodecCtx, pRawFrame, &frameFinished, &packet);
		if (frameFinished && pRawFrame->key_frame)
		{
			// Process the decoded frame here
			done = 1; // stop after the first key frame has been handled
		}
	}
	av_free_packet (&packet); // free the memory av_read_frame allocated for the AVPacket
}

Encoding the frame data

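Once a video frame has been read and decoded, it must be encoded so that it can be saved to a file in the target format. The encoding function is: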
int avcodec_encode_video2(AVCodecContext *avctx, AVPacket *avpkt,
                          const AVFrame *frame, int *got_packet_ptr);
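
Once the encoder has filled a packet, saving the image amounts to writing the packet's bytes to a file. The following is a minimal sketch in plain C of that step with the error checks spelled out. The helper name write_buffer is an assumption made for illustration and is not part of FFmpeg; in the context of this article, data and size would come from pkt.data and pkt.size.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not an FFmpeg function): write `size` bytes from
 * `data` to the file at `path`. Returns 0 on success, -1 on any failure. */
int write_buffer(const char *path, const uint8_t *data, size_t size)
{
    FILE *fp;
    size_t written;

    if (path == NULL || (data == NULL && size > 0))
        return -1;

    fp = fopen(path, "wb"); /* binary mode matters on Windows */
    if (fp == NULL)
        return -1;

    written = size > 0 ? fwrite(data, 1, size, fp) : 0;

    /* fclose can also report a failed flush, so check it as well */
    if (fclose(fp) != 0 || written != size)
        return -1;
    return 0;
}
```

A call such as write_buffer(out, pkt.data, (size_t)pkt.size) would then report a short write or an unwritable path instead of silently producing a broken file.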

Before encoding, an important step is finding the matching encoder. FFmpeg provides two functions for this:
/**
 * Find a registered encoder with a matching codec ID.
 *
 * @param id AVCodecID of the requested encoder
 * @return An encoder if one was found, NULL otherwise.
 */
AVCodec *avcodec_find_encoder(enum AVCodecID id);

/**
 * Find a registered encoder with the specified name.
 *
 * @param name name of the requested encoder
 * @return An encoder if one was found, NULL otherwise.
 */
AVCodec *avcodec_find_encoder_by_name(const char *name);

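The "name" parameter of the second function is not a file name. It is the codec name registered in the avcodec_register_all(void) function in allcodecs.c. You can look the names up in the source code, but that is not very intuitive. We use the first function instead, which leaves the question of how to find the AVCodecID for the output file.
For image files, /libavformat/img2.c has a function that conveniently guesses the AVCodecID from the file name:
enum AVCodecID ff_guess_image2_codec( const char *filename);
This is an internal function and cannot be pulled in with #include <libavformat/internal.h>, so we need another route: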
/**
 * Return the output format in the list of registered output formats
 * which best matches the provided parameters, or return NULL if
 * there is no match.
 *
 * @param short_name if non-NULL checks if short_name matches with the
 * names of the registered formats
 * @param filename if non-NULL checks if filename terminates with the
 * extensions of the registered formats
 * @param mime_type if non-NULL checks if mime_type matches with the
 * MIME type of the registered formats
 */
AVOutputFormat *av_guess_format(const char *short_name,
                                const char *filename,
                                const char *mime_type);
To find the encoder for an image format:
AVOutputFormat *avOutputFormat =  av_guess_format (0, outfilename, 0); // e.g. outfilename = "thumb.jpg"
AVCodec* pCodec = avcodec_find_encoder (avOutputFormat->video_codec);
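
The documentation quoted above says that the filename argument is checked against the extensions of the registered formats. The sketch below illustrates that idea only; it is not FFmpeg's implementation. The helper name has_extension and the comma-separated extension list are assumptions made for the example.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: return 1 if the extension of `filename` appears in
 * the comma-separated list `extensions` (compared case-insensitively),
 * and 0 otherwise. */
int has_extension(const char *filename, const char *extensions)
{
    const char *ext;
    const char *p;
    size_t ext_len;

    if (filename == NULL || extensions == NULL)
        return 0;

    ext = strrchr(filename, '.');      /* last dot starts the extension */
    if (ext == NULL || ext[1] == '\0')
        return 0;
    ext++;
    ext_len = strlen(ext);

    p = extensions;
    while (*p != '\0') {
        const char *comma = strchr(p, ',');
        size_t len = comma ? (size_t)(comma - p) : strlen(p);

        if (len == ext_len) {
            size_t i;
            for (i = 0; i < len; i++) {
                if (tolower((unsigned char)p[i]) != tolower((unsigned char)ext[i]))
                    break;
            }
            if (i == len)
                return 1;              /* every character matched */
        }
        p += len;
        if (*p == ',')
            p++;                       /* skip the separator */
    }
    return 0;
}
```

With this sketch, has_extension("thumb.jpg", "jpeg,jpg") is true while has_extension("thumb.png", "jpeg,jpg") is false. In real code the lookup itself should stay with av_guess_format, whose result must be checked for NULL before its video_codec field is read.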

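If we are not limited to one specific image format, or do not want to study which pixel formats the codec of a given image format supports, the other key point is finding the pixel formats supported by the encoder of the target format.
First look at the definition of the AVCodec structure. Note the field [ const enum AVPixelFormat *pix_fmts; ], which holds the list of pixel formats the codec supports: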
/**
 * AVCodec.
 */
typedef struct AVCodec {
    /**
     * Name of the codec implementation.
     * The name is globally unique among encoders and among decoders (but an
     * encoder and a decoder can share the same name).
     * This is the primary way to find a codec from the user perspective.
     */
    const char *name;
    /**
     * Descriptive name for the codec, meant to be more human readable than name.
     * You should use the NULL_IF_CONFIG_SMALL() macro to define it.
     */
    const char *long_name;
    enum AVMediaType type;
    enum AVCodecID id;
    /**
     * Codec capabilities.
     * see AV_CODEC_CAP_*
     */
    int capabilities;
    const AVRational *supported_framerates; ///< array of supported framerates, or NULL if any, array is terminated by {0,0}
    const enum AVPixelFormat *pix_fmts;     ///< array of supported pixel formats, or NULL if unknown, array is terminated by -1
    const int *supported_samplerates;       ///< array of supported audio samplerates, or NULL if unknown, array is terminated by 0
    const enum AVSampleFormat *sample_fmts; ///< array of supported sample formats, or NULL if unknown, array is terminated by -1
    const uint64_t *channel_layouts;         ///< array of support channel layouts, or NULL if unknown. array is terminated by 0
    uint8_t max_lowres;                     ///< maximum value for lowres supported by the decoder, no direct access, use av_codec_get_max_lowres()
    const AVClass *priv_class;              ///< AVClass for the private context
    const AVProfile *profiles;              ///< array of recognized profiles, or NULL if unknown, array is terminated by {FF_PROFILE_UNKNOWN}

    /*****************************************************************
     * No fields below this line are part of the public API. They
     * may not be used outside of libavcodec and can be changed and
     * removed at will.
     * New public fields should be added right above.
     *****************************************************************
     */
    /* the private fields below this point have been omitted */
} AVCodec;
With that list, we can use a function provided in avcodec.h to find the pixel format that best matches the data to be converted:
/**
 * Find the best pixel format to convert to given a certain source pixel
 * format.  When converting from one pixel format to another, information loss
 * may occur.  For example, when converting from RGB24 to GRAY, the color
 * information will be lost. Similarly, other losses occur when converting from
 * some formats to other formats. avcodec_find_best_pix_fmt_of_2() searches which of
 * the given pixel formats should be used to suffer the least amount of loss.
 * The pixel formats from which it chooses one, are determined by the
 * pix_fmt_list parameter.
 *
 *
 * @param[in] pix_fmt_list AV_PIX_FMT_NONE terminated array of pixel formats to choose from
 * @param[in] src_pix_fmt source pixel format
 * @param[in] has_alpha Whether the source pixel format alpha channel is used.
 * @param[out] loss_ptr Combination of flags informing you what kind of losses will occur.
 * @return The best pixel format to convert to or -1 if none was found.
 */
enum AVPixelFormat avcodec_find_best_pix_fmt_of_list(const enum AVPixelFormat *pix_fmt_list,
                                            enum AVPixelFormat src_pix_fmt,
                                            int has_alpha, int *loss_ptr);
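
The pix_fmts field follows a common C convention: an array terminated by a sentinel value (-1), or NULL when the list is unknown. The sketch below shows how such a list can be walked safely. It uses stand-in constants (FMT_NONE, FMT_A, FMT_B, FMT_C) rather than the real enum AVPixelFormat values, and pick_format is a deliberately simple fallback. It does not rank formats by conversion loss the way avcodec_find_best_pix_fmt_of_list does.

```c
#include <stddef.h>

/* Stand-in identifiers for illustration; real code would use
 * enum AVPixelFormat and AV_PIX_FMT_NONE instead. */
enum { FMT_NONE = -1, FMT_A = 0, FMT_B = 1, FMT_C = 2 };

/* Return 1 if fmt appears in the FMT_NONE-terminated list, else 0.
 * A NULL list means "unknown" and is treated here as "not found". */
int fmt_list_contains(const int *list, int fmt)
{
    if (list == NULL)
        return 0;
    for (; *list != FMT_NONE; list++) {
        if (*list == fmt)
            return 1;
    }
    return 0;
}

/* Keep the source format if the encoder lists it; otherwise fall back to
 * the first listed format. Returns FMT_NONE for a NULL or empty list. */
int pick_format(const int *list, int src)
{
    if (list == NULL || list[0] == FMT_NONE)
        return FMT_NONE;
    return fmt_list_contains(list, src) ? src : list[0];
}
```

The NULL check matters in practice: an encoder whose pix_fmts is NULL gives no list to choose from, so the caller has to decide on a pixel format some other way.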

Scaling the image

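The image extracted from the video is used as a thumbnail, so a size of around 320*180 is enough, while HD video is 1920*1080. The decoded frame therefore has to be scaled before it is encoded. av_image_alloc() and avpicture_fill() can be used to allocate and initialize the buffer for the image, but avpicture_alloc does both in one step: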
/**
 * Allocate memory for the pixels of a picture and setup the AVPicture
 * fields for it.
 *
 * Call avpicture_free() to free it.
 *
 * @param picture            the picture structure to be filled in
 * @param pix_fmt            the pixel format of the picture
 * @param width              the width of the picture
 * @param height             the height of the picture
 * @return zero if successful, a negative error code otherwise
 *
 * @see av_image_alloc(), avpicture_fill()
 */
int avpicture_alloc(AVPicture *picture, enum AVPixelFormat pix_fmt, int width, int height);
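
To get a feel for how much memory such a buffer needs, the sketch below computes the plane sizes of a planar YUV 4:2:0 image with 8 bits per sample, the layout of formats such as AV_PIX_FMT_YUVJ420P. It assumes no row padding or alignment, so the buffer FFmpeg actually allocates may be larger. The struct and function names are made up for the example.

```c
#include <stddef.h>

/* Sizes in bytes of the three planes of an unpadded 8-bit YUV 4:2:0 image. */
typedef struct Yuv420Sizes {
    size_t y;     /* full-resolution luma plane */
    size_t u;     /* chroma plane, halved in both directions */
    size_t v;     /* chroma plane, halved in both directions */
    size_t total; /* sum of the three planes */
} Yuv420Sizes;

/* Hypothetical helper: fill `out` for a width x height image.
 * Returns 0 on success, -1 on invalid arguments. */
int yuv420_plane_sizes(int width, int height, Yuv420Sizes *out)
{
    size_t chroma_w, chroma_h;

    if (width <= 0 || height <= 0 || out == NULL)
        return -1;

    /* odd dimensions round the chroma size up */
    chroma_w = ((size_t)width + 1) / 2;
    chroma_h = ((size_t)height + 1) / 2;

    out->y = (size_t)width * (size_t)height;
    out->u = chroma_w * chroma_h;
    out->v = chroma_w * chroma_h;
    out->total = out->y + out->u + out->v;
    return 0;
}
```

For a 320x180 thumbnail this gives 57600 bytes of luma plus two chroma planes of 14400 bytes each, 86400 bytes in total, compared with 3110400 bytes for a 1920x1080 frame.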
After memory has been allocated for the image, initialize an SWS context with sws_getContext and call sws_scale to do the scaling.
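
sws_scale stretches the source to whatever destination size it is given, so a source whose aspect ratio differs from the requested thumbnail size would be distorted. One possible way to choose the destination size is sketched below, assuming the goal is to fit the frame inside a bounding box while keeping its aspect ratio. The function name fit_thumbnail is hypothetical, and rounding to even dimensions is a choice made here because 4:2:0 formats subsample chroma by two.

```c
/* Hypothetical helper: compute the largest size that fits inside
 * max_w x max_h while keeping the aspect ratio of src_w x src_h.
 * Results are rounded down to even numbers (minimum 2).
 * Returns 0 on success, -1 on invalid arguments. */
int fit_thumbnail(int src_w, int src_h, int max_w, int max_h,
                  int *out_w, int *out_h)
{
    long long w, h;

    if (src_w <= 0 || src_h <= 0 || max_w < 2 || max_h < 2 ||
        out_w == 0 || out_h == 0)
        return -1;

    /* Compare src_w/src_h with max_w/max_h without floating point. */
    if ((long long)src_w * max_h >= (long long)src_h * max_w) {
        /* The source is relatively wider: the width is the limit. */
        w = max_w;
        h = (long long)src_h * max_w / src_w;
    } else {
        /* The source is relatively taller: the height is the limit. */
        h = max_h;
        w = (long long)src_w * max_h / src_h;
    }

    w &= ~1LL; /* round down to an even value */
    h &= ~1LL;
    if (w < 2) w = 2;
    if (h < 2) h = 2;

    *out_w = (int)w;
    *out_h = (int)h;
    return 0;
}
```

Fitting a 1920x1080 frame into a 320x320 box yields 320x180, the thumbnail size mentioned above. The resulting width and height would then be the destination size passed to sws_getContext.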

Code listing

#pragma once
#define inline __inline
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#include <libavdevice/avdevice.h>
#include <stdio.h>
#include <stdlib.h> // atoi

static void OutputFrame (AVCodecContext *pOrigCodecCtx, AVFrame *pFrame, const char* out, int height, int width)
{
	AVOutputFormat *avOutputFormat =  av_guess_format (0, out, 0);
	if (avOutputFormat == 0)
	{
		fprintf (stderr, "can't guess the output format for [%s]\n", out);
		return;
	}
	AVCodec* pOutAVCodec = avcodec_find_encoder (avOutputFormat->video_codec);
	if (pOutAVCodec == 0)
	{
		fprintf (stderr, "can't find an encoder for [%s]\n", out);
		return;
	}
	AVCodecContext *pOutCodecCtx = avcodec_alloc_context3 (pOutAVCodec);
	pOutCodecCtx->bit_rate = pOrigCodecCtx->bit_rate;
	pOutCodecCtx->width = width;   // pOrigCodecCtx->width;
	pOutCodecCtx->height = height; // pOrigCodecCtx->height;
	pOutCodecCtx->pix_fmt = avcodec_find_best_pix_fmt_of_list(pOutAVCodec->pix_fmts,pOrigCodecCtx->pix_fmt, 1, 0); // AV_PIX_FMT_YUVJ420P;
	pOutCodecCtx->codec_id = avOutputFormat->video_codec; //AV_CODEC_ID_MJPEG;
	pOutCodecCtx->codec_type = pOrigCodecCtx->codec_type; //AVMEDIA_TYPE_VIDEO;
	pOutCodecCtx->time_base.num = pOrigCodecCtx->time_base.num;
	pOutCodecCtx->time_base.den = pOrigCodecCtx->time_base.den;

	if (avcodec_open2 (pOutCodecCtx, pOutAVCodec, 0) < 0)
	{
		fprintf (stderr, "open  codec err !\n");
		return;
	}

	pOutCodecCtx->mb_lmin = pOutCodecCtx->qmin * FF_QP2LAMBDA;
	pOutCodecCtx->mb_lmax = pOutCodecCtx->qmax * FF_QP2LAMBDA;
	pOutCodecCtx->flags = CODEC_FLAG_QSCALE;
	pOutCodecCtx->global_quality = pOutCodecCtx->qmin * FF_QP2LAMBDA;
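	// Scale the image
	AVFrame *pFrameSWS = av_frame_alloc ();
	if (pFrameSWS == NULL) return;
	// Allocate a buffer for the scaled picture
	if (avpicture_alloc ((AVPicture *)pFrameSWS
		, pOutCodecCtx->pix_fmt
		, pOutCodecCtx->width
		, pOutCodecCtx->height
		) < 0)
	{
		av_frame_free (&pFrameSWS);
		return;
	}
	// pts and quality belong on the frame that is actually passed to the encoder
	pFrameSWS->pts = 1;
	pFrameSWS->quality = pOutCodecCtx->global_quality;
	pFrameSWS->format = pOutCodecCtx->pix_fmt;
	pFrameSWS->width = pOutCodecCtx->width;
	pFrameSWS->height = pOutCodecCtx->height;

	// Initialize the SWS context for software scaling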
	struct SwsContext *sws_ctx = sws_getContext (
		pOrigCodecCtx->width
		,pOrigCodecCtx->height
		,pOrigCodecCtx->pix_fmt
		, pOutCodecCtx->width
		, pOutCodecCtx->height
		, pOutCodecCtx->pix_fmt
		,SWS_BILINEAR
		,NULL
		,NULL
		,NULL
		);

	sws_scale (sws_ctx, (uint8_t const * const *)pFrame->data,
		pFrame->linesize, 0, pOrigCodecCtx->height,
		pFrameSWS->data, pFrameSWS->linesize);

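	AVPacket pkt;
	av_init_packet (&pkt);
	pkt.data = NULL;    // packet data will be allocated by the encoder
	pkt.size = 0;
	int gotPacket = 0;  // set to 1 when the encoder has produced a packet
	// The return value is 0 on success or a negative error code, not a size
	int ret = avcodec_encode_video2 (pOutCodecCtx, &pkt, pFrameSWS, &gotPacket);
	if (ret >= 0 && gotPacket)
	{
		FILE *pFile = fopen (out, "wb");
		if (pFile != NULL)
		{
			fwrite (pkt.data, 1, pkt.size, pFile);
			fclose (pFile);
		}
	}
	av_free_packet (&pkt);
	sws_freeContext (sws_ctx);
	avpicture_free ((AVPicture *)pFrameSWS);
	av_frame_free (&pFrameSWS);
	avcodec_close (pOutCodecCtx);
	av_free (pOutCodecCtx);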
}

int main (int argc, char *argv[])
{
	if (argc < 5)
	{
		printf ("usage: <exec> <input video file> <output thumbnail filename> <height> <width>\n");
		return -1;
	}
	// Initialize libavformat (registers all supported formats and codecs)
	av_register_all ();
	av_log_set_level (AV_LOG_DEBUG);
	// Open the video file, read the format context, and find the video stream in it
	AVFormatContext   *pInputFormatCtx = NULL;
	if (avformat_open_input (&pInputFormatCtx, argv[1], NULL, NULL) != 0)
	{
		return -1;
	}
	if (avformat_find_stream_info (pInputFormatCtx, NULL) < 0)
	{
		return -1;
	}
	av_dump_format (pInputFormatCtx, 0, argv[1], 0); // print information about the file
	int videoStream = -1;
	for (uint32_t i = 0; i < pInputFormatCtx->nb_streams; i++)
	{
		if (pInputFormatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO)
		{
			videoStream = i;
			break;
		}
	}
	if (videoStream == -1) 	return -1;
	// Get the codec context of the video stream and look up the decoder by its codec ID
	
	AVCodecContext *pInputCodecCtx = pInputFormatCtx->streams[videoStream]->codec;
	AVCodec *pDecoderCodec = avcodec_find_decoder (pInputCodecCtx->codec_id);
	if (pDecoderCodec == NULL)
	{
		fprintf (stderr, "Unsupported codec!\n");
		return -1;
	}

	// Allocate a context for the decoder found, copy the input stream's codec context into it, and open the decoder with it
	AVCodecContext *pDecoderCodecCtx = avcodec_alloc_context3 (pDecoderCodec);
	if (avcodec_copy_context (pDecoderCodecCtx, pInputCodecCtx) != 0)
	{
		fprintf (stderr, "Couldn't copy codec context");
		return -1;
	}
	if (avcodec_open2 (pDecoderCodecCtx, pDecoderCodec, NULL) < 0)
	{
		fprintf (stderr, "Couldn't init decode codeccontext to use avcodec");
		return -1;
	}

	// Allocate memory for the video frame
	AVFrame *pRawFrame = av_frame_alloc ();
	if (0 == pRawFrame) return -1;
	// Extract the first key frame, encode it, and save it to a file
	AVPacket packet;
	int saved = 0;
	while (!saved && av_read_frame (pInputFormatCtx, &packet) >= 0)
	{
		int frameFinished = 0;
		if (packet.stream_index == videoStream) // is this packet from the video stream?
		{
			// Decode the video frame
			avcodec_decode_video2 (pDecoderCodecCtx, pRawFrame, &frameFinished, &packet);
			if (frameFinished && pRawFrame->key_frame) // key frame check
			{
				OutputFrame (pDecoderCodecCtx, pRawFrame, argv[2], atoi(argv[3]), atoi(argv[4]) );
				saved = 1; // stop once the thumbnail has been written
			}
		}
		av_free_packet (&packet); // free the memory av_read_frame allocated for the AVPacket
	}
	// Clean up
	av_frame_free (&pRawFrame);
	avcodec_close (pDecoderCodecCtx);
	av_free (pDecoderCodecCtx);
	// The stream's own codec context was never opened here; avformat_close_input releases it
	avformat_close_input (&pInputFormatCtx);
	return 0;
}
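
The listing passes argv[3] and argv[4] straight through atoi, which returns 0 for text that is not a number and gives no way to detect the error. A stricter parser could look like the sketch below. The helper name parse_dimension is an assumption for illustration, and rejecting zero and negative values is a policy chosen here because an image dimension must be positive.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse `text` as a positive decimal integer that fits
 * in an int. Returns 0 and stores the value in *out on success, -1 otherwise. */
int parse_dimension(const char *text, int *out)
{
    char *end = NULL;
    long value;

    if (text == NULL || out == NULL || *text == '\0')
        return -1;

    errno = 0;
    value = strtol(text, &end, 10);

    /* reject overflow, trailing characters, and non-positive sizes */
    if (errno != 0 || *end != '\0' || value <= 0 || value > INT_MAX)
        return -1;

    *out = (int)value;
    return 0;
}
```

In main, a failed parse_dimension call on either argument could print the usage message and return before any FFmpeg structure is allocated.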


