Tutorial 01: Making Screencaps
Code: tutorial01.c
Overview
Movie files have a few basic components. First, the file itself is called a container, and the type of container determines where the information in the file goes. Examples of containers are AVI and Quicktime. Next, you have a bunch of streams; for example, you usually have an audio stream and a video stream. (A "stream" is just a fancy word for "a succession of data elements made available over time".) The data elements in a stream are called frames. Each stream is encoded by a different kind of codec. The codec defines how the actual data is COded and DECoded - hence the name CODEC. Examples of codecs are DivX and MP3. Packets are then read from the stream. Packets are pieces of data that can contain bits of data that are decoded into raw frames that we can finally manipulate for our application. For our purposes, each packet contains complete frames, or multiple frames in the case of audio.
At its very basic level, dealing with video and audio streams is very easy:
10 OPEN video_stream FROM video.avi
20 READ packet FROM video_stream INTO frame
30 IF frame NOT COMPLETE GOTO 20
40 DO SOMETHING WITH frame
50 GOTO 20
Handling multimedia with ffmpeg is pretty much as simple as this program, although some programs might have a very complex "DO SOMETHING" step. So in this tutorial, we're going to open a file, read from the video stream inside it, and our DO SOMETHING is going to be writing the frame to a PPM file.
Opening the File
First, let's see how we open a file in the first place. With ffmpeg, you have to first initialize the library. (Note that the exact header paths depend on how ffmpeg is installed on your system: this sketch assumes the old-style <ffmpeg/...> headers, while other installs use <avcodec.h>/<avformat.h> or <libavcodec/avcodec.h>/<libavformat/avformat.h>.)
#include <ffmpeg/avcodec.h>
#include <ffmpeg/avformat.h>
...
int main(int argc, char *argv[]) {
av_register_all();
This registers all available file formats and codecs with the library so they will be used automatically when a file with the corresponding format/codec is opened. Note that you only need to call av_register_all() once, so we do it here in main(). If you like, it's possible to register only certain individual file formats and codecs, but there's usually no reason why you would have to do that.
Now we can actually open the file:
AVFormatContext *pFormatCtx;
// Open video file
if(av_open_input_file(&pFormatCtx, argv[1], NULL, 0, NULL)!=0)
  return -1; // Couldn't open file
We get our filename from the first argument. This function reads the file header and stores information about the file format in the AVFormatContext structure we have given it. The last three arguments are used to specify the file format, buffer size, and format options, but by setting these to NULL or 0, libavformat will auto-detect them.
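(As an aside, if you ever need to force a specific container instead of relying on auto-detection, this old API lets you look up an AVInputFormat and pass it as the third argument. This is only a sketch and not part of the tutorial code; the short name "avi" is just an illustration.)
// Hypothetical example: force the AVI demuxer instead of auto-detecting the format
AVInputFormat *pInputFmt = av_find_input_format("avi");
if(pInputFmt==NULL || av_open_input_file(&pFormatCtx, argv[1], pInputFmt, 0, NULL)!=0)
  return -1; // Couldn't open file with the forced format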
This function only looks at the header, so next we need to check out the stream information in the file:
// Retrieve stream information
if(av_find_stream_info(pFormatCtx)<0)
  return -1; // Couldn't find stream information
This function populates pFormatCtx->streams with the proper information. We introduce a handy debugging function to show us what's inside:
// Dump information about file onto standard error
dump_format(pFormatCtx, 0, argv[1], 0);
Now pFormatCtx->streams is just an array of pointers, of size pFormatCtx->nb_streams, so let's walk through it until we find a video stream.
int i, videoStream;
AVCodecContext *pCodecCtx;
// Find the first video stream
videoStream=-1;
for(i=0; i<pFormatCtx->nb_streams; i++)
  if(pFormatCtx->streams[i]->codec->codec_type==CODEC_TYPE_VIDEO) {
    videoStream=i;
    break;
  }
if(videoStream==-1)
return -1; // Didn't find a video stream
// Get a pointer to the codec context for the video stream
pCodecCtx=pFormatCtx->streams[videoStream]->codec;
The stream's information about the codec is in what we call the "codec context." This contains all the information about the codec that the stream is using, and now we have a pointer to it. But we still have to find the actual codec and open it:
AVCodec *pCodec;
// Find the decoder for the video stream
pCodec=avcodec_find_decoder(pCodecCtx->codec_id);
if(pCodec==NULL) {
  fprintf(stderr, "Unsupported codec!\n");
  return -1; // Codec not found
}
// Open codec
if(avcodec_open(pCodecCtx, pCodec)<0)
return -1; // Could not open codec
Some of you might remember from the old tutorial that there were two other parts to this code: adding CODEC_FLAG_TRUNCATED to pCodecCtx->flags and adding a hack to correct grossly incorrect frame rates. These two fixes aren't in ffplay.c anymore, so I have to assume that they are not necessary anymore. There's another difference to point out since we removed that code: pCodecCtx->time_base now holds the frame rate information. time_base is a struct that has the numerator and denominator (AVRational). We represent the frame rate as a fraction because many codecs have non-integer frame rates (like NTSC's 29.97fps).
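Because time_base is (for most video codecs in this API) the duration of a single frame, you can get an approximate frames-per-second value by flipping the fraction. A quick sketch, with the caveat that not every container fills time_base this way:
// Rough sketch: approximate fps from time_base (den/num, because time_base
// is the per-frame duration). Not guaranteed to be meaningful for every format.
if(pCodecCtx->time_base.num != 0)
  printf("approx. frame rate: %.2f fps\n",
         (double)pCodecCtx->time_base.den / (double)pCodecCtx->time_base.num);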
Storing the Data
Now we need a place to actually store the frame:
AVFrame *pFrame;
// Allocate video frame
pFrame=avcodec_alloc_frame();
Since we're planning to output PPM files, which are stored in 24-bit RGB, we're going to have to convert our frame from its native format to RGB. ffmpeg will do these conversions for us. For most projects (including ours) we're going to want to convert our initial frame to a specific format. Let's allocate a frame for the converted frame now.
// Allocate an AVFrame structure
AVFrame *pFrameRGB;
pFrameRGB=avcodec_alloc_frame();
if(pFrameRGB==NULL)
  return -1;
Even though we've allocated the frame, we still need a place to put the raw data when we convert it. We use avpicture_get_size to get the size we need, and allocate the space manually:
uint8_t *buffer;
int numBytes;
// Determine required buffer size and allocate buffer
numBytes=avpicture_get_size(PIX_FMT_RGB24, pCodecCtx->width,
                            pCodecCtx->height);
buffer=(uint8_t *)av_malloc(numBytes*sizeof(uint8_t));
av_malloc is ffmpeg's malloc that is just a simple wrapper around malloc that makes sure the memory addresses are aligned and such. It will not protect you from memory leaks, double freeing, or other malloc problems.
Now we use avpicture_fill to associate the frame with our newly allocated buffer. About the AVPicture cast: the AVPicture struct is a subset of the AVFrame struct - the beginning of the AVFrame struct is identical to the AVPicture struct.
// Assign appropriate parts of buffer to image planes in pFrameRGB
// Note that pFrameRGB is an AVFrame, but AVFrame is a superset
// of AVPicture
avpicture_fill((AVPicture *)pFrameRGB, buffer, PIX_FMT_RGB24,
               pCodecCtx->width, pCodecCtx->height);
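If you want to convince yourself of what avpicture_fill just did, here is a purely illustrative check (not part of the tutorial code; it assumes <assert.h> is included): PIX_FMT_RGB24 is a packed format, so the whole image lives in plane 0 of pFrameRGB.
// Illustrative sanity check: plane 0 starts at our buffer and each row is width*3 bytes
assert(pFrameRGB->data[0] == buffer);
assert(pFrameRGB->linesize[0] == pCodecCtx->width * 3);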
Finally! Now we're ready to read from the stream!
Reading the Data
What we're going to do is read through the entire video stream by reading in the packet, decoding it into our frame, and once our frame is complete, we will convert and save it.
int frameFinished;
AVPacket packet;
i=0;
while(av_read_frame(pFormatCtx, &packet)>=0) {
  // Is this a packet from the video stream?
  if(packet.stream_index==videoStream) {
    // Decode video frame
    avcodec_decode_video(pCodecCtx, pFrame, &frameFinished,
                         packet.data, packet.size);
    // Did we get a video frame?
    if(frameFinished) {
      // Convert the image from its native format to RGB
      img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB24,
                  (AVPicture *)pFrame, pCodecCtx->pix_fmt,
                  pCodecCtx->width, pCodecCtx->height);
      // Save the frame to disk
      if(++i<=5)
        SaveFrame(pFrameRGB, pCodecCtx->width,
                  pCodecCtx->height, i);
    }
  }
  // Free the packet that was allocated by av_read_frame
  av_free_packet(&packet);
}
A note on packets
Technically a packet can contain partial frames or other bits of data, but ffmpeg's parser ensures that the packets we get contain either complete or multiple frames.
The process, again, is simple: av_read_frame() reads in a packet and stores it in the AVPacket struct. Note that we've only allocated the packet structure - ffmpeg allocates the internal data for us, which is pointed to by packet.data. This is freed by av_free_packet() later. avcodec_decode_video() converts the packet to a frame for us. However, we might not have all the information we need for a frame after decoding a packet, so avcodec_decode_video() sets frameFinished for us when we have the next frame. Finally, we use img_convert() to convert from the native format (pCodecCtx->pix_fmt) to RGB. Remember that you can cast an AVFrame pointer to an AVPicture pointer. Finally, we pass the frame and height and width information to our SaveFrame function.
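One caveat worth knowing: img_convert() was dropped from later ffmpeg releases. If your build doesn't have it, the rough libswscale equivalent looks like the sketch below (not part of this tutorial's code; it needs the swscale header and -lswscale when linking, and the context should be created once, outside the decode loop, and reused):
// Sketch of the libswscale replacement for img_convert()
struct SwsContext *sws_ctx = sws_getContext(pCodecCtx->width, pCodecCtx->height,
                                            pCodecCtx->pix_fmt,
                                            pCodecCtx->width, pCodecCtx->height,
                                            PIX_FMT_RGB24, SWS_BILINEAR,
                                            NULL, NULL, NULL);
// Convert the decoded frame into pFrameRGB in one full-height slice
sws_scale(sws_ctx, pFrame->data, pFrame->linesize, 0, pCodecCtx->height,
          pFrameRGB->data, pFrameRGB->linesize);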
Now all we need to do is make the SaveFrame function to write the RGB information to a file in PPM format. We're going to be kind of sketchy on the PPM format itself; trust us, it works.
void SaveFrame(AVFrame *pFrame, int width, int height, int iFrame) {
  FILE *pFile;
  char szFilename[32];
  int y;
  // Open file
  sprintf(szFilename, "frame%d.ppm", iFrame);
  pFile=fopen(szFilename, "wb");
  if(pFile==NULL)
    return;
  // Write header
  fprintf(pFile, "P6\n%d %d\n255\n", width, height);
  // Write pixel data
  for(y=0; y<height; y++)
    fwrite(pFrame->data[0]+y*pFrame->linesize[0], 1, width*3, pFile);
  // Close file
  fclose(pFile);
}
We do a bit of standard file opening, etc., and then write the RGB data. We write the file one line at a time. A PPM file is simply a file that has RGB information laid out in a long string. If you know HTML colors, it would be like laying out the color of each pixel end to end, so that #ff0000#ff0000.... would be a red screen. (It's stored in binary and without the separator, but you get the idea.) The header indicates how wide and tall the image is, and the max size of the RGB values.
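If you'd rather sanity-check the output without an image viewer, here is a tiny stand-alone sketch (not part of the tutorial) that reads back the header and the first pixel of frame1.ppm:
#include <stdio.h>

int main(void) {
  FILE *f = fopen("frame1.ppm", "rb");
  int w, h, maxval;
  unsigned char rgb[3];
  if(f==NULL)
    return 1;
  fscanf(f, "P6 %d %d %d", &w, &h, &maxval);  // ASCII header: magic, width, height, max value
  fgetc(f);                                   // consume the single whitespace after maxval
  fread(rgb, 1, 3, f);                        // first (top-left) pixel, 3 bytes of RGB
  printf("%dx%d, first pixel #%02x%02x%02x\n", w, h, rgb[0], rgb[1], rgb[2]);
  fclose(f);
  return 0;
}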
Now, going back to our main() function. Once we're done reading from the video stream, we just have to clean everything up:
// Free the RGB image
av_free(buffer);
av_free(pFrameRGB);
// Free the YUV frame
av_free(pFrame);
// Close the codec
avcodec_close(pCodecCtx);
// Close the video file
av_close_input_file(pFormatCtx);
return 0;
You'll notice we use av_free for the memory we allocated with avcodec_alloc_frame and av_malloc.
That's it for the code! Now, if you're on Linux or a similar platform, you'll run:
gcc -o tutorial01 tutorial01.c -lavformat -lavcodec -lz -lavutil -lm
If you have an older version of ffmpeg, you may need to drop -lavutil:
gcc -o tutorial01 tutorial01.c -lavformat -lavcodec -lz -lm
Most image programs should be able to open PPM files. Test it on some movie files.
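For instance, assuming you have some clip lying around (myvideofile.mpg is just a placeholder name), run:
./tutorial01 myvideofile.mpg
and you should end up with frame1.ppm through frame5.ppm in the current directory.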
>> Tutorial 2: Outputting to the Screen
Original article: http://www.dranger.com/ffmpeg/
// tutorial01.c
// Code based on a tutorial by Martin Bohme ([email protected])
// Tested on Gentoo, CVS version 5/01/07 compiled with GCC 4.1.1
// A small sample program that shows how to use libavformat and libavcodec to
// read video from a file.
//
// Use
//
// gcc -o tutorial01 tutorial01.c -lavformat -lavcodec -lz
//
// to build (assuming libavformat and libavcodec are correctly installed on
// your system).
//
// Run using
//
// tutorial01 myvideofile.mpg
//
// to write the first five frames from "myvideofile.mpg" to disk in PPM
// format.
#include