FLV Format Explained

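The FLV container format consists of a file header and a file body. The FLV body is a sequence of (previous tag size field + tag) pairs. The previous tag size field precedes each tag, occupies 4 bytes, and gives the size of the preceding tag. The first previous tag size after the FLV header is 0.
Tags fall into three types:

  • Audio data
  • Video data
  • Script (frame) data

In the FLV container format, the FLV header occupies 9 bytes.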

Structure of the FLV container format

[Figure 1: structure of the FLV container format]

FLV Header

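Note: in the Type columns below, UI denotes an unsigned integer and the number after it is its length in bits. For example, UI8 is an unsigned integer one byte long, UI24 is three bytes, and UI[8*n] is a run of n bytes. UB denotes a bit field; UB5 is 5 bits of a byte. Bit-field structs in C are a useful comparison.

The FLV header occupies 9 bytes. It identifies the file as FLV and declares which audio and video streams follow. Within one FLV file, all tags of a given type belong to a single stream, so a file holds at most one audio stream and one video stream: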

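Field | Type | Comment
Signature | UI8 | 'F' (0x46)
Signature | UI8 | 'L' (0x4C)
Signature | UI8 | 'V' (0x56)
Version | UI8 | FLV version; 0x01 means version 1
Reserved | UB5 | The first 5 bits are all 0
Audio flag | UB1 | Whether an audio stream is present
Reserved | UB1 | Always 0
Video flag | UB1 | Whether a video stream is present
Header size | UI32 | 9 for FLV version 1; the size of the FLV header, including this field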
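
To make the header layout concrete, here is a minimal sketch that decodes the 9 bytes from a memory buffer. The names `FlvHeaderInfo` and `parse_flv_header` are made up for this illustration and are not part of FFmpeg; the bit positions follow the table above (audio flag is bit 2, video flag is bit 0 of the fifth byte).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; it is not an FFmpeg struct. */
typedef struct {
    uint8_t  version;
    int      has_audio;
    int      has_video;
    uint32_t header_size;
} FlvHeaderInfo;

/* Parses the 9-byte FLV header at buf. Returns 0 on success, -1 if the
 * buffer is too short or the signature is not "FLV". */
static int parse_flv_header(const uint8_t *buf, size_t len, FlvHeaderInfo *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 9 || buf[0] != 0x46 || buf[1] != 0x4C || buf[2] != 0x56)
        return -1;
    out->version     = buf[3];
    out->has_audio   = (buf[4] & 0x04) != 0;   /* UB5 reserved, then audio bit */
    out->has_video   = (buf[4] & 0x01) != 0;   /* UB1 reserved, then video bit */
    out->header_size = ((uint32_t)buf[5] << 24) | ((uint32_t)buf[6] << 16) |
                       ((uint32_t)buf[7] << 8)  |  (uint32_t)buf[8];
    return 0;
}
```

A flags byte of 0x05 therefore announces both an audio and a video stream.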

FLV Body

The FLV body is a sequence of previous tag size + tag pairs. Each previous tag size occupies 4 bytes and gives the size of the preceding tag.
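
The pairing can be checked mechanically. The sketch below walks a body held in memory and verifies that each previous tag size equals the size of the tag before it. It assumes the tag layout described in the FLV Tag section (an 11-byte header whose bytes 1 to 3 hold the data size), so a whole tag is 11 + data size bytes. `count_flv_tags` and the two read helpers are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t read_be24(const uint8_t *p)
{
    return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | (uint32_t)p[2];
}

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Walks an FLV body (the bytes right after the FLV header).
 * Returns the number of complete tags, or -1 if a previous tag size
 * field does not match the tag that precedes it. */
static int count_flv_tags(const uint8_t *body, size_t len)
{
    size_t pos;
    int count = 0;

    assert(body != NULL);
    if (len < 4 || read_be32(body) != 0)      /* first previous tag size is 0 */
        return -1;
    pos = 4;
    while (len - pos >= 11) {
        size_t tag_size = 11 + (size_t)read_be24(body + pos + 1);
        if (len - pos < tag_size + 4)
            break;                            /* truncated tail, stop here */
        if (read_be32(body + pos + tag_size) != tag_size)
            return -1;
        pos += tag_size + 4;
        count++;
    }
    return count;
}
```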

FLV Tag

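Each tag in turn consists of a Tag Header and Tag Data. The Tag Header stores the type of the current tag, the length of the data area, and similar information. The Tag Header normally occupies 11 bytes. The tag structure is: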

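Field | Type | Comment
Tag type | UI8 | 8: audio, 9: video, 18: script data
Data size | UI24 | Size of this tag's data area, excluding the header
Timestamp | UI24 | Timestamp of this frame in milliseconds; a relative value, and the timestamp of the first tag is always 0
Timestamp extended | UI8 | Used when the timestamp exceeds 0xFFFFFF. This byte holds the upper 8 bits of the timestamp; the three bytes above hold the lower 24 bits.
Stream ID | UI24 | Always 0
Data | UI[8*n] | The tag data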
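
As an illustration of the table, the following sketch decodes an 11-byte Tag Header from memory, including the way the extended byte becomes the top 8 bits of a 32-bit timestamp. `FlvTagHeaderInfo` and `parse_flv_tag_header` are names invented for this example. Masking the type byte with 0x1F mirrors what FFmpeg's `flv_read_packet` does further down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    uint8_t  type;          /* 8 = audio, 9 = video, 18 = script data */
    uint32_t data_size;     /* size of the data area, header excluded */
    uint32_t timestamp_ms;  /* extended byte in bits 24..31 */
    uint32_t stream_id;     /* always 0 */
} FlvTagHeaderInfo;

/* Parses an 11-byte Tag Header. Returns 0 on success, -1 if too short. */
static int parse_flv_tag_header(const uint8_t *buf, size_t len,
                                FlvTagHeaderInfo *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 11)
        return -1;
    out->type         = buf[0] & 0x1F;
    out->data_size    = ((uint32_t)buf[1] << 16) | ((uint32_t)buf[2] << 8) |
                         (uint32_t)buf[3];
    out->timestamp_ms = ((uint32_t)buf[7] << 24) |           /* high 8 bits */
                        ((uint32_t)buf[4] << 16) | ((uint32_t)buf[5] << 8) |
                         (uint32_t)buf[6];                   /* low 24 bits */
    out->stream_id    = ((uint32_t)buf[8] << 16) | ((uint32_t)buf[9] << 8) |
                         (uint32_t)buf[10];
    return 0;
}
```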

Tag types

There are three FLV tag types: audio, video, and script.

Script Tag Data

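Also known as the MetaData tag, it stores metadata about the FLV video and audio, such as duration, width, and height. It is usually the first tag in an FLV file, directly after the FLV header, and there is only one.
The Tag Data of this type is made up of the two AMF packets described below.

First AMF packet:
The first byte is the AMF packet type, normally 0x02 (string). Bytes 2-3 are a UI16 giving the string length, normally 0x000A (the length of "onMetaData"). The string itself follows.

Second AMF packet:
The first byte is the AMF packet type, normally 0x08 (array). Bytes 2-5 are a UI32 giving the number of array elements. The array elements follow, each a pair of element name and value. Common array elements are: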

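Name | Comment
duration | Duration
width | Video width
height | Video height
videodatarate | Video bit rate
framerate | Video frame rate
videocodecid | Video codec
audiosamplerate | Audio sample rate
audiosamplesize | Audio sample size
stereo | Whether the audio is stereo
audiocodecid | Audio codec
filesize | File size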
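
The two packet headers described above can be read with a few lines of C. This is only a sketch working on a memory buffer; `read_amf_string` and `read_amf_array_count` are hypothetical helpers for this example, not FFmpeg functions, and decoding the element values themselves is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads an AMF string packet: type 0x02, UI16 length, then the bytes.
 * Copies the string into out as a C string. Returns the number of bytes
 * consumed, or -1 on a wrong type, short input, or too small out buffer. */
static int read_amf_string(const uint8_t *buf, size_t len,
                           char *out, size_t out_size)
{
    size_t n;

    assert(buf != NULL && out != NULL);
    if (len < 3 || buf[0] != 0x02)
        return -1;
    n = ((size_t)buf[1] << 8) | (size_t)buf[2];
    if (n > len - 3 || n + 1 > out_size)
        return -1;
    memcpy(out, buf + 3, n);
    out[n] = '\0';
    return (int)(n + 3);
}

/* Reads the start of the second packet: type 0x08, then a UI32 element
 * count. Returns 0 on success, -1 otherwise. */
static int read_amf_array_count(const uint8_t *buf, size_t len, uint32_t *count)
{
    assert(buf != NULL && count != NULL);
    if (len < 5 || buf[0] != 0x08)
        return -1;
    *count = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) |
             ((uint32_t)buf[3] << 8)  |  (uint32_t)buf[4];
    return 0;
}
```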

Audio Tag Data

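Audio Tag Data is split into an audio tag header and a data area: the first byte holds the parameters of the audio data, and the audio stream data starts at the second byte.
[Figure 2: layout of Audio Tag Data]
The first byte carries the audio information, in the following format: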

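Field | Type | Comment
Sound format | UB4 | 0 = Linear PCM, platform endian
    1 = ADPCM
    2 = MP3
    3 = Linear PCM, little endian
    4 = Nellymoser 16-kHz mono
    5 = Nellymoser 8-kHz mono
    6 = Nellymoser
    7 = G.711 A-law logarithmic PCM
    8 = G.711 mu-law logarithmic PCM
    9 = reserved
    10 = AAC
    11 = Speex
    14 = MP3 8-kHz
    15 = Device-specific sound
    Note: although values 7 and 8 are listed, the specification marks them as reserved, so G.711 A-law is effectively unsupported in FLV; if it is needed, linear PCM may have to be used instead.
Sample rate | UB2 | 0 = 5.5 kHz
    1 = 11 kHz
    2 = 22 kHz
    3 = 44 kHz
    Always 3 for AAC.
    This field therefore cannot express a 48 kHz sample rate (for AAC the actual rate is taken from the AAC configuration data, not from this field).
Sample size | UB1 | 0 = snd8Bit
    1 = snd16Bit
    Compressed audio is always 16-bit
Channels | UB1 | 0 = sndMono, mono
    1 = sndStereo, stereo (two channels)
    Always 1 for AAC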
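
A short sketch of how the four bit fields come out of that first byte. The struct and function names are invented for this example. The sample-rate expression `44100 << index >> 3` is the same one FFmpeg uses in `flv_read_packet` below, and it yields 5512, 11025, 22050, and 44100 Hz for indexes 0 to 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    int format;          /* UB4: 2 = MP3, 10 = AAC, ... */
    int sample_rate_hz;  /* derived from the UB2 index */
    int bits;            /* 8 or 16 */
    int channels;        /* 1 or 2 */
} FlvAudioInfo;

/* Splits the first byte of Audio Tag Data into its four fields. */
static void decode_flv_audio_flags(uint8_t flags, FlvAudioInfo *out)
{
    int rate_index = (flags >> 2) & 0x03;

    assert(out != NULL);
    out->format         = flags >> 4;
    out->sample_rate_hz = 44100 << rate_index >> 3;
    out->bits           = (flags & 0x02) ? 16 : 8;
    out->channels       = (flags & 0x01) ? 2 : 1;
}
```

For example, 0xAF (binary 1010 11 1 1) is the usual first byte of an AAC tag: format 10, rate index 3, 16-bit, stereo.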

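The audio data starts at the second byte:

Field | Type | Comment
Sound data | UI[8*n] | For linear PCM, each sample is stored as a signed 16-bit little-endian value.
    If the sound format is AAC, the data is AAC AUDIO DATA; otherwise it is the sound data of the given format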

Video Tag Data

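The first byte of video Tag Data holds the parameters of the video data, and the video stream data starts at the second byte.
The first byte carries the video information, in the following format: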

Field | Type | Comment
Frame type | UB4 | 1: keyframe (for AVC, a seekable frame); for H.264 this is an IDR frame, a key frame at which decoding can start
    2: inter frame (for AVC, a non-seekable frame); an ordinary H.264 frame
    3: disposable inter frame (H.263 only)
    4: generated keyframe (reserved for server use only)
    5: video info/command frame
Codec ID | UB4 | Which codec is used:
    1: JPEG (currently unused)
    2: Sorenson H.263
    3: Screen video
    4: On2 VP6
    5: On2 VP6 with alpha channel
    6: Screen video version 2
    7: AVC
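
Splitting this byte is a matter of one shift and one mask. A minimal sketch, with a function name invented for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Splits the first byte of Video Tag Data: the high 4 bits are the frame
 * type and the low 4 bits are the codec ID. */
static void decode_flv_video_flags(uint8_t flags, int *frame_type, int *codec_id)
{
    assert(frame_type != NULL && codec_id != NULL);
    *frame_type = flags >> 4;     /* 1 = keyframe, 2 = inter frame, ... */
    *codec_id   = flags & 0x0F;   /* 7 = AVC */
}
```

With these fields, 0x17 is an AVC keyframe and 0x27 is an AVC inter frame.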

The video data starts at the second byte:

Field | Type | Comment
Video data | UI[8*n] | For AVC, see AVC VIDEO PACKET

AVC VIDEO PACKET

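Field | Type | Comment
AVC packet type | UI8 | 0: AVC sequence header
    1: AVC NALU
    2: AVC end of sequence (a lower-level NALU sequence ender is not required)
CTS | SI24 | If the AVC packet type is 1, the cts offset (SI means a signed integer; FFmpeg sign-extends this field)
    If the AVC packet type is 0, the value is 0
Data | UI[8*n] | If the AVC packet type is 0, the decoder configuration (SPS, PPS).
    If it is 1, one or more NALUs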

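pts: presentation time, i.e. when the receiver shows this frame on the display. The unit is 1/90000 second.
dts: decoding time, i.e. the timestamp carried in RTP packets, which gives the decoding order. The unit is 1/90000 second. Note that the CompositionTime field in the specification holds the cts offset, not the pts itself.
cts offset: cts = (pts - dts) / 90. The unit of cts is milliseconds.
pts and dts should differ only when B-frames are present, that is, for the Main profile and above. Baseline does not have this issue: its pts and dts are always equal, so cts is always 0.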
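
The arithmetic above can be sketched as follows. The helper names are hypothetical. The first function shows one straightforward way to sign-extend the 24-bit field (FFmpeg uses an add/xor trick for the same purpose), the second applies cts = (pts - dts) / 90 to 90 kHz timestamps, and the third recovers pts in milliseconds from a tag's timestamp and cts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads a 24-bit big-endian value and sign-extends it to 32 bits. */
static int32_t read_be24_signed(const uint8_t *p)
{
    int32_t v;

    assert(p != NULL);
    v = ((int32_t)p[0] << 16) | ((int32_t)p[1] << 8) | (int32_t)p[2];
    if (v & 0x800000)          /* bit 23 set means the value is negative */
        v -= 0x1000000;
    return v;
}

/* cts in milliseconds from pts and dts given in 1/90000 second units. */
static int32_t cts_ms_from_90khz(int64_t pts_90k, int64_t dts_90k)
{
    return (int32_t)((pts_90k - dts_90k) / 90);
}

/* pts in milliseconds from the tag timestamp (dts, ms) and the cts field. */
static int64_t pts_ms_from_tag(int64_t dts_ms, int32_t cts_ms)
{
    return dts_ms + cts_ms;
}
```

A frame whose pts is 3600 ticks (40 ms) later than its dts is thus stored with cts = 40.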

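Structure of Data in an AVC VIDEO PACKET:

Field | Type | Comment
Length | UI32 | Length of the NALU, excluding this length field.
NALU data | UI[8*n] | NALU data without the four-byte start code; it begins directly at the H.264 NAL header, e.g. 65 ** ** **, 41 ** ** **
Length | UI32 | Length of the NALU, excluding this length field.
NALU data | UI[8*n] | NALU data without the four-byte start code; it begins directly at the H.264 NAL header, e.g. 65 ** ** **, 41 ** ** **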
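
A sketch of walking such a payload, assuming the 4-byte length prefix shown in the table (the function name `count_nalus` is made up for this example). The low 5 bits of the first payload byte give the NALU type, so 0x65 is type 5 (IDR slice) and 0x41 is type 1 (non-IDR slice).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walks length-prefixed NALUs (4-byte big-endian length, then payload).
 * Returns the number of NALUs, or -1 if a length is zero or runs past the
 * end of the buffer. If first_type is not NULL it receives the NALU type
 * (low 5 bits of the NAL header byte) of the first NALU. */
static int count_nalus(const uint8_t *data, size_t len, int *first_type)
{
    size_t pos = 0;
    int count = 0;

    assert(data != NULL);
    while (len - pos >= 4) {
        uint32_t nal_len = ((uint32_t)data[pos]     << 24) |
                           ((uint32_t)data[pos + 1] << 16) |
                           ((uint32_t)data[pos + 2] << 8)  |
                            (uint32_t)data[pos + 3];
        pos += 4;
        if (nal_len == 0 || nal_len > len - pos)
            return -1;
        if (count == 0 && first_type != NULL)
            *first_type = data[pos] & 0x1F;
        pos += nal_len;
        count++;
    }
    return pos == len ? count : -1;
}
```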

Parsing FLV with FFmpeg

In FFmpeg, AVInputFormat is the struct that describes each input format. The one for FLV is:

AVInputFormat ff_flv_demuxer = {
    .name           = "flv",
    .long_name      = NULL_IF_CONFIG_SMALL("FLV (Flash Video)"),
    .priv_data_size = sizeof(FLVContext),
    .read_probe     = flv_probe,
    .read_header    = flv_read_header,
    .read_packet    = flv_read_packet,
    .read_seek      = flv_read_seek,
    .read_close     = flv_read_close,
    .extensions     = "flv",
    .priv_class     = &flv_class,
};

Now look at flv_read_header, which reads the FLV header:

static int flv_read_header(AVFormatContext *s)
{
    int flags;
    FLVContext *flv = s->priv_data;
    int offset;
    int pre_tag_size = 0;
    // skip 4 bytes: the "FLV" signature and the version
    avio_skip(s->pb, 4);
    // read one byte: the flags
    flags = avio_r8(s->pb);
    // record which streams the header announces; none has been seen yet
    flv->missing_streams = flags & (FLV_HEADER_FLAG_HASVIDEO | FLV_HEADER_FLAG_HASAUDIO);

    s->ctx_flags |= AVFMTCTX_NOHEADER;
    // read 4 bytes: the header size
    offset = avio_rb32(s->pb);
    // seek to the end of the header (an absolute offset, normally 9)
    avio_seek(s->pb, offset, SEEK_SET);

    /* Annex E. The FLV File Format
     * E.3 TheFLVFileBody
     *     Field               Type    Comment
     *     PreviousTagSize0    UI32    Always 0
     * */
    // read 4 bytes: the first previous tag size of the body
    pre_tag_size = avio_rb32(s->pb);
    if (pre_tag_size) {
        av_log(s, AV_LOG_WARNING, "Read FLV header error, input file is not a standard flv format, first PreviousTagSize0 always is 0\n");
    }

    s->start_time = 0;
    flv->sum_flv_tag_size = 0;
    flv->last_keyframe_stream_index = -1;

    return 0;
}

Next, let us look at how the FLV file body is read:

static int flv_read_packet(AVFormatContext *s, AVPacket *pkt)
{
    FLVContext *flv = s->priv_data;
    int ret, i, size, flags;
    enum FlvTagType type;
    int stream_type=-1;
    int64_t next, pos, meta_pos;
    int64_t dts, pts = AV_NOPTS_VALUE;
    int av_uninit(channels);
    int av_uninit(sample_rate);
    AVStream *st    = NULL;
    int last = -1;
    int orig_size;

retry:
    /* pkt size is repeated at end. skip it */
    // remember the current position
        pos  = avio_tell(s->pb);
        // read one byte: the Type field of the Tag Header;
        // the low 5 bits (& 0x1F) hold the tag type
        type = (avio_r8(s->pb) & 0x1F);
        // read 3 bytes: the size of the Tag Data
        orig_size =
        size = avio_rb24(s->pb);
        // accumulate the size of the tags read so far (data + 11-byte header)
        flv->sum_flv_tag_size += size + 11;
        // read 3 bytes: the timestamp
        dts  = avio_rb24(s->pb);
        // read one more byte, the extended timestamp, and combine it into dts
        dts |= (unsigned)avio_r8(s->pb) << 24;
        av_log(s, AV_LOG_TRACE, "type:%d, size:%d, last:%d, dts:%"PRId64" pos:%"PRId64"\n", type, size, last, dts, avio_tell(s->pb));
        if (avio_feof(s->pb))
            return AVERROR_EOF;
        // skip three bytes: the StreamID
        avio_skip(s->pb, 3); /* stream id, always 0 */
        flags = 0;
		
        if (flv->validate_next < flv->validate_count) {
            int64_t validate_pos = flv->validate_index[flv->validate_next].pos;
            if (pos == validate_pos) {
                if (FFABS(dts - flv->validate_index[flv->validate_next].dts) <=
                    VALIDATE_INDEX_TS_THRESH) {
                    flv->validate_next++;
                } else {
                    clear_index_entries(s, validate_pos);
                    flv->validate_count = 0;
                }
            } else if (pos > validate_pos) {
                clear_index_entries(s, validate_pos);
                flv->validate_count = 0;
            }
        }
        // if the tag data size is 0, jump to leave
        if (size == 0) {
            ret = FFERROR_REDO;
            goto leave;
        }

        next = size + avio_tell(s->pb);
        // audio tag
        if (type == FLV_TAG_TYPE_AUDIO) {
            stream_type = FLV_STREAM_TYPE_AUDIO;
            // read one byte: the audio parameters
            flags    = avio_r8(s->pb);
            size--;
        } else if (type == FLV_TAG_TYPE_VIDEO) {
            stream_type = FLV_STREAM_TYPE_VIDEO;
            // read one byte: the video parameters
            flags    = avio_r8(s->pb);
            size--;
            // skip video info/command frames
            if ((flags & FLV_VIDEO_FRAMETYPE_MASK) == FLV_FRAME_VIDEO_INFO_CMD)
                goto skip;
        } else if (type == FLV_TAG_TYPE_META) { // script data (metadata) tag
            stream_type=FLV_STREAM_TYPE_DATA;
            if (size > 13 + 1 + 4) { // Header-type metadata stuff
                int type;
                meta_pos = avio_tell(s->pb);
                // read the metabody (see "1 flv_read_metabody" below)
                type = flv_read_metabody(s, next);
                if (type == 0 && dts == 0 || type < 0 || type == TYPE_UNKNOWN) {
                    if (type < 0 && flv->validate_count &&
                        flv->validate_index[0].pos     > next &&
                        flv->validate_index[0].pos - 4 < next
                    ) {
                        av_log(s, AV_LOG_WARNING, "Adjusting next position due to index mismatch\n");
                        next = flv->validate_index[0].pos - 4;
                    }
                    goto skip;
                } else if (type == TYPE_ONTEXTDATA) {
                    avpriv_request_sample(s, "OnTextData packet");
                    // see "2 flv_data_packet" below
                    return flv_data_packet(s, pkt, dts, next);
                } else if (type == TYPE_ONCAPTION) {
                    return flv_data_packet(s, pkt, dts, next);
                }
                avio_seek(s->pb, meta_pos, SEEK_SET);
            }
        } else {
            av_log(s, AV_LOG_DEBUG,
                   "Skipping flv packet: type %d, size %d, flags %d.\n",
                   type, size, flags);
skip:
            if (avio_seek(s->pb, next, SEEK_SET) != next) {
                 // This can happen if flv_read_metabody above read past
                 // next, on a non-seekable input, and the preceding data has
                 // been flushed out from the IO buffer.
                 av_log(s, AV_LOG_ERROR, "Unable to seek to the next packet\n");
                 return AVERROR_INVALIDDATA;
            }
            ret = FFERROR_REDO;
            goto leave;
        }

        /* skip empty data packets */
        if (!size) {
            ret = FFERROR_REDO;
            goto leave;
        }

        /* now find stream */
        // find the stream that matches this tag
        for (i = 0; i < s->nb_streams; i++) {
            st = s->streams[i];
            if (stream_type == FLV_STREAM_TYPE_AUDIO) {
                if (st->codecpar->codec_type == AVMEDIA_TYPE_AUDIO &&
                    (s->audio_codec_id || flv_same_audio_codec(st->codecpar, flags)))
                    break;
            } else if (stream_type == FLV_STREAM_TYPE_VIDEO) {
                if (st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO &&
                    (s->video_codec_id || flv_same_video_codec(st->codecpar, flags)))
                    break;
            } else if (stream_type == FLV_STREAM_TYPE_DATA) {
                if (st->codecpar->codec_type == AVMEDIA_TYPE_SUBTITLE)
                    break;
            }
        }
        if (i == s->nb_streams) {
            static const enum AVMediaType stream_types[] = {AVMEDIA_TYPE_VIDEO, AVMEDIA_TYPE_AUDIO, AVMEDIA_TYPE_SUBTITLE};
            // create the stream if none matches
            st = create_stream(s, stream_types[stream_type]);
            if (!st)
                return AVERROR(ENOMEM);

        }
        av_log(s, AV_LOG_TRACE, "%d %X %d \n", stream_type, flags, st->discard);

        if ((s->pb->seekable & AVIO_SEEKABLE_NORMAL) &&
            ((flags & FLV_VIDEO_FRAMETYPE_MASK) == FLV_FRAME_KEY ||
              stream_type == FLV_STREAM_TYPE_AUDIO))
            av_add_index_entry(st, pos, dts, size, 0, AVINDEX_KEYFRAME);

        if (  (st->discard >= AVDISCARD_NONKEY && !((flags & FLV_VIDEO_FRAMETYPE_MASK) == FLV_FRAME_KEY || (stream_type == FLV_STREAM_TYPE_AUDIO)))
            ||(st->discard >= AVDISCARD_BIDIR  &&  ((flags & FLV_VIDEO_FRAMETYPE_MASK) == FLV_FRAME_DISP_INTER && (stream_type == FLV_STREAM_TYPE_VIDEO)))
            || st->discard >= AVDISCARD_ALL
        ) {
            avio_seek(s->pb, next, SEEK_SET);
            ret = FFERROR_REDO;
            goto leave;
        }

    // if not streamed and no duration from metadata then seek to end to find
    // the duration from the timestamps
    if ((s->pb->seekable & AVIO_SEEKABLE_NORMAL) &&
        (!s->duration || s->duration == AV_NOPTS_VALUE) &&
        !flv->searched_for_end) {
        int size;
        const int64_t pos   = avio_tell(s->pb);
        // Read the last 4 bytes of the file, this should be the size of the
        // previous FLV tag. Use the timestamp of its payload as duration.
        int64_t fsize       = avio_size(s->pb);
retry_duration:
        avio_seek(s->pb, fsize - 4, SEEK_SET);
        size = avio_rb32(s->pb);
        if (size > 0 && size < fsize) {
            // Seek to the start of the last FLV tag at position (fsize - 4 - size)
            // but skip the byte indicating the type.
            avio_seek(s->pb, fsize - 3 - size, SEEK_SET);
            if (size == avio_rb24(s->pb) + 11) {
                uint32_t ts = avio_rb24(s->pb);
                ts         |= avio_r8(s->pb) << 24;
                if (ts)
                    s->duration = ts * (int64_t)AV_TIME_BASE / 1000;
                else if (fsize >= 8 && fsize - 8 >= size) {
                    fsize -= size+4;
                    goto retry_duration;
                }
            }
        }

        avio_seek(s->pb, pos, SEEK_SET);
        flv->searched_for_end = 1;
    }
    // audio stream
    if (stream_type == FLV_STREAM_TYPE_AUDIO) {
        int bits_per_coded_sample;
        channels = (flags & FLV_AUDIO_CHANNEL_MASK) == FLV_STEREO ? 2 : 1;
        sample_rate = 44100 << ((flags & FLV_AUDIO_SAMPLERATE_MASK) >>
                                FLV_AUDIO_SAMPLERATE_OFFSET) >> 3;
        bits_per_coded_sample = (flags & FLV_AUDIO_SAMPLESIZE_MASK) ? 16 : 8;
        if (!st->codecpar->channels || !st->codecpar->sample_rate ||
            !st->codecpar->bits_per_coded_sample) {
            st->codecpar->channels              = channels;
            st->codecpar->channel_layout        = channels == 1
                                               ? AV_CH_LAYOUT_MONO
                                               : AV_CH_LAYOUT_STEREO;
            st->codecpar->sample_rate           = sample_rate;
            st->codecpar->bits_per_coded_sample = bits_per_coded_sample;
        }
        if (!st->codecpar->codec_id) {
            // set the audio codec
            flv_set_audio_codec(s, st, st->codecpar,
                                flags & FLV_AUDIO_CODECID_MASK);
            flv->last_sample_rate =
            sample_rate           = st->codecpar->sample_rate;
            flv->last_channels    =
            channels              = st->codecpar->channels;
        } else {
            AVCodecParameters *par = avcodec_parameters_alloc();
            if (!par) {
                ret = AVERROR(ENOMEM);
                goto leave;
            }
            par->sample_rate = sample_rate;
            par->bits_per_coded_sample = bits_per_coded_sample;
            flv_set_audio_codec(s, st, par, flags & FLV_AUDIO_CODECID_MASK);
            sample_rate = par->sample_rate;
            avcodec_parameters_free(&par);
        }
    } else if (stream_type == FLV_STREAM_TYPE_VIDEO) { // video stream
        // set the video codec
        int ret = flv_set_video_codec(s, st, flags & FLV_VIDEO_CODECID_MASK, 1);
        if (ret < 0)
            return ret;
        size -= ret;
    } else if (stream_type == FLV_STREAM_TYPE_DATA) { // script (data) stream
        st->codecpar->codec_id = AV_CODEC_ID_TEXT;
    }
    // AAC, H.264 and MPEG-4 payloads start with a packet-type byte
    if (st->codecpar->codec_id == AV_CODEC_ID_AAC ||
        st->codecpar->codec_id == AV_CODEC_ID_H264 ||
        st->codecpar->codec_id == AV_CODEC_ID_MPEG4) {
        // read 1 byte: the packet type (compare the AVC VIDEO PACKET layout above)
        int type = avio_r8(s->pb);
        size--;

        if (size < 0) {
            ret = AVERROR_INVALIDDATA;
            goto leave;
        }

        if (st->codecpar->codec_id == AV_CODEC_ID_H264 || st->codecpar->codec_id == AV_CODEC_ID_MPEG4) {
            // sign extension
            // read 3 bytes: the cts, sign-extended from 24 bits
            int32_t cts = (avio_rb24(s->pb) + 0xff800000) ^ 0xff800000;
            // derive pts from dts and cts
            pts = dts + cts;
            if (cts < 0) { // dts might be wrong
                if (!flv->wrong_dts)
                    av_log(s, AV_LOG_WARNING,
                        "Negative cts, previous timestamps might be wrong.\n");
                flv->wrong_dts = 1;
            } else if (FFABS(dts - pts) > 1000*60*15) {
                av_log(s, AV_LOG_WARNING,
                       "invalid timestamps %"PRId64" %"PRId64"\n", dts, pts);
                dts = pts = AV_NOPTS_VALUE;
            }
        }
        if (type == 0 && (!st->codecpar->extradata || st->codecpar->codec_id == AV_CODEC_ID_AAC ||
            st->codecpar->codec_id == AV_CODEC_ID_H264)) {
            AVDictionaryEntry *t;

            if (st->codecpar->extradata) {
                if ((ret = flv_queue_extradata(flv, s->pb, stream_type, size)) < 0)
                    return ret;
                ret = FFERROR_REDO;
                goto leave;
            }
            if ((ret = flv_get_extradata(s, st, size)) < 0)
                return ret;

            /* Workaround for buggy Omnia A/XE encoder */
            t = av_dict_get(s->metadata, "Encoder", NULL, 0);
            if (st->codecpar->codec_id == AV_CODEC_ID_AAC && t && !strcmp(t->value, "Omnia A/XE"))
                st->codecpar->extradata_size = 2;

            if (st->codecpar->codec_id == AV_CODEC_ID_AAC && 0) {
                MPEG4AudioConfig cfg;

                if (avpriv_mpeg4audio_get_config(&cfg, st->codecpar->extradata,
                                                 st->codecpar->extradata_size * 8, 1) >= 0) {
                st->codecpar->channels       = cfg.channels;
                st->codecpar->channel_layout = 0;
                if (cfg.ext_sample_rate)
                    st->codecpar->sample_rate = cfg.ext_sample_rate;
                else
                    st->codecpar->sample_rate = cfg.sample_rate;
                av_log(s, AV_LOG_TRACE, "mp4a config channels %d sample rate %d\n",
                        st->codecpar->channels, st->codecpar->sample_rate);
                }
            }

            ret = FFERROR_REDO;
            goto leave;
        }
    }

    /* skip empty data packets */
    if (!size) {
        ret = FFERROR_REDO;
        goto leave;
    }
    // read the payload into the AVPacket
    ret = av_get_packet(s->pb, pkt, size);
    if (ret < 0)
        return ret;
    pkt->dts          = dts;
    pkt->pts          = pts == AV_NOPTS_VALUE ? dts : pts;
    pkt->stream_index = st->index;
    pkt->pos          = pos;
    if (flv->new_extradata[stream_type]) {
        uint8_t *side = av_packet_new_side_data(pkt, AV_PKT_DATA_NEW_EXTRADATA,
                                                flv->new_extradata_size[stream_type]);
        if (side) {
            memcpy(side, flv->new_extradata[stream_type],
                   flv->new_extradata_size[stream_type]);
            av_freep(&flv->new_extradata[stream_type]);
            flv->new_extradata_size[stream_type] = 0;
        }
    }
    if (stream_type == FLV_STREAM_TYPE_AUDIO &&
                    (sample_rate != flv->last_sample_rate ||
                     channels    != flv->last_channels)) {
        flv->last_sample_rate = sample_rate;
        flv->last_channels    = channels;
        ff_add_param_change(pkt, channels, 0, sample_rate, 0, 0);
    }

    if (    stream_type == FLV_STREAM_TYPE_AUDIO ||
            ((flags & FLV_VIDEO_FRAMETYPE_MASK) == FLV_FRAME_KEY) ||
            stream_type == FLV_STREAM_TYPE_DATA)
        pkt->flags |= AV_PKT_FLAG_KEY;

leave:
    last = avio_rb32(s->pb);
    if (last != orig_size + 11 && last != orig_size + 10 &&
        !avio_feof(s->pb) &&
        (last != orig_size || !last) && last != flv->sum_flv_tag_size &&
        !flv->broken_sizes) {
        av_log(s, AV_LOG_ERROR, "Packet mismatch %d %d %d\n", last, orig_size + 11, flv->sum_flv_tag_size);
        avio_seek(s->pb, pos + 1, SEEK_SET);
        ret = resync(s);
        av_packet_unref(pkt);
        if (ret >= 0) {
            goto retry;
        }
    }
    return ret;
}
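
One step of flv_read_packet worth isolating is the duration probe: on seekable input it reads the last 4 bytes of the file as a previous tag size, steps back to the start of that tag, and uses its timestamp as the duration. Below is a simplified in-memory sketch of the same idea. The function and helper names are hypothetical, and unlike FFmpeg it does not retry with earlier tags when the last timestamp is 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t read_be24(const uint8_t *p)
{
    return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | (uint32_t)p[2];
}

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns the timestamp in milliseconds of the last tag of an FLV file held
 * in memory, or -1 if the trailing previous tag size is implausible. */
static int64_t last_tag_timestamp_ms(const uint8_t *file, size_t fsize)
{
    uint32_t prev_size;
    const uint8_t *tag;

    assert(file != NULL);
    if (fsize < 4)
        return -1;
    prev_size = read_be32(file + fsize - 4);
    if (prev_size < 11 || prev_size > fsize - 4)
        return -1;
    tag = file + fsize - 4 - prev_size;          /* start of the last tag */
    if (read_be24(tag + 1) + 11 != prev_size)    /* data size + header */
        return -1;
    return ((int64_t)tag[7] << 24) | (int64_t)read_be24(tag + 4);
}
```

FFmpeg then scales this millisecond value to AV_TIME_BASE units before storing it in s->duration.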

1 flv_read_metabody

Here we only care about the type it returns.

static int flv_read_metabody(AVFormatContext *s, int64_t next_pos)
{
    FLVContext *flv = s->priv_data;
    AMFDataType type;
    AVStream *stream, *astream, *vstream;
    AVStream av_unused *dstream;
    AVIOContext *ioc;
    int i;
    // only needs to hold the string "onMetaData".
    // Anything longer is something we don't want.
    char buffer[32];

    astream = NULL;
    vstream = NULL;
    dstream = NULL;
    ioc     = s->pb;

    // first object needs to be "onMetaData" string
    // read one byte (8 bits): the type of the first AMF packet
    type = avio_r8(ioc);
    if (type != AMF_DATA_TYPE_STRING ||
        amf_get_string(ioc, buffer, sizeof(buffer)) < 0)
        return TYPE_UNKNOWN;

    if (!strcmp(buffer, "onTextData"))
        return TYPE_ONTEXTDATA;

    if (!strcmp(buffer, "onCaption"))
        return TYPE_ONCAPTION;

    if (!strcmp(buffer, "onCaptionInfo"))
        return TYPE_ONCAPTIONINFO;

    if (strcmp(buffer, "onMetaData") && strcmp(buffer, "onCuePoint")) {
        av_log(s, AV_LOG_DEBUG, "Unknown type %s\n", buffer);
        return TYPE_UNKNOWN;
    }

    // find the streams now so that amf_parse_object doesn't need to do
    // the lookup every time it is called.
    for (i = 0; i < s->nb_streams; i++) {
        stream = s->streams[i];
        if (stream->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
            vstream = stream;
            flv->last_keyframe_stream_index = i;
        } else if (stream->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
            astream = stream;
            if (flv->last_keyframe_stream_index == -1)
                flv->last_keyframe_stream_index = i;
        }
        else if (stream->codecpar->codec_type == AVMEDIA_TYPE_SUBTITLE)
            dstream = stream;
    }

    // parse the second object (we want a mixed array)
    if (amf_parse_object(s, astream, vstream, buffer, next_pos, 0) < 0)
        return -1;

    return 0;
}

2 flv_data_packet

static int flv_data_packet(AVFormatContext *s, AVPacket *pkt,
                           int64_t dts, int64_t next)
{
    AVIOContext *pb = s->pb;
    AVStream *st    = NULL;
    char buf[20];
    int ret = AVERROR_INVALIDDATA;
    int i, length = -1;
    int array = 0;
    // read 1 byte: the type of the AMF packet
    switch (avio_r8(pb)) {
    case AMF_DATA_TYPE_ARRAY:
        array = 1;
    case AMF_DATA_TYPE_MIXEDARRAY:
        avio_seek(pb, 4, SEEK_CUR);
    case AMF_DATA_TYPE_OBJECT:
        break;
    default:
        goto skip;
    }
    // walk the entries of the AMF packet
    while (array || (ret = amf_get_string(pb, buf, sizeof(buf))) > 0) {
        // read the type of the value
        AMFDataType type = avio_r8(pb);
        if (type == AMF_DATA_TYPE_STRING && (array || !strcmp(buf, "text"))) {
            // read the string length
            length = avio_rb16(pb);
            // read the string contents into the packet
            ret    = av_get_packet(pb, pkt, length);
            if (ret < 0)
                goto skip;
            else
                break;
        } else {
            if ((ret = amf_skip_tag(pb, type)) < 0)
                goto skip;
        }
    }

    if (length < 0) {
        ret = AVERROR_INVALIDDATA;
        goto skip;
    }

    for (i = 0; i < s->nb_streams; i++) {
        st = s->streams[i];
        if (st->codecpar->codec_type == AVMEDIA_TYPE_SUBTITLE)
            break;
    }

    if (i == s->nb_streams) {
        st = create_stream(s, AVMEDIA_TYPE_SUBTITLE);
        if (!st)
            return AVERROR(ENOMEM);
        st->codecpar->codec_id = AV_CODEC_ID_TEXT;
    }

    pkt->dts  = dts;
    pkt->pts  = dts;
    pkt->size = ret;

    pkt->stream_index = st->index;
    pkt->flags       |= AV_PKT_FLAG_KEY;

skip:
    avio_seek(s->pb, next + 4, SEEK_SET);

    return ret;
}

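As the code shows, FFmpeg parses FLV by following the container layout: it sets up the codec for each kind of stream and then reads the payload into an AVPacket, which is the data.

Reference: https://juejin.im/post/5ae04c6651882567244da8eb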
