Analysis of Mainstream Open-Source Video Formats and How to Read Media Information with AVFoundation

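Before we start: about video formats

  • A video format is not just the file/container format (MP4, AVI, RMVB, and so on) that most people have in mind, nor just the codec (H.264, H.263, RealVideo, and so on). It is the combination of codec, container format, and GOP arrangement.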

H.264

  • An H.264 GOP (Group of Pictures) is made up of key frames (I) and predicted frames (P and B).
    In memory of Lei Xiaohua.
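
To make the GOP arrangement concrete, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). It assumes the common M/N notation that analysis tools report (the TS analysis later in this article shows "M=2, N=24"), where N is taken to be the GOP length and M the distance between anchor frames (I or P). The function name `gop_pattern` is hypothetical, and real encoders may place frames differently, for example with adaptive B-frames.

```c
#include <stddef.h>

/* Fill `out` with the display-order frame types of one GOP.
 * n: GOP length in frames (assumed meaning of "N").
 * m: distance between anchor frames, I or P (assumed meaning of "M"),
 *    so m - 1 B frames sit between two anchors.
 * `out` needs room for n characters plus the terminating NUL.
 * Returns 0 on success, -1 on invalid arguments. */
int gop_pattern(int n, int m, char *out, size_t out_size)
{
    if (n <= 0 || m <= 0 || out == NULL || out_size < (size_t)n + 1)
        return -1;

    for (int i = 0; i < n; i++) {
        if (i == 0)
            out[i] = 'I';      /* every GOP starts with a key frame */
        else if (i % m == 0)
            out[i] = 'P';      /* anchor frame predicted from earlier frames */
        else
            out[i] = 'B';      /* bidirectionally predicted frame */
    }
    out[n] = '\0';
    return 0;
}
```

Calling `gop_pattern(12, 3, buf, sizeof buf)` with a 32-byte buffer fills it with "IBBPBBPBBPBB"; with N=24 and M=2 the pattern alternates B and P after the leading I.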

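0. A few abbreviations

  • AVC: Advanced Video Coding, also known as H.264.
  • AAC: Advanced Audio Coding. When streaming content such as TV and audio broadcasts, this format delivers high-quality audio at a much lower bit rate than MP3.
  • VBR (Variable Bit Rate): the bit rate varies with the complexity of the picture, so coding efficiency is relatively high and little blocking appears during motion. The rate-control algorithm chooses the bit rate from the image content: simple content gets fewer bits and complex content gets more, which preserves quality while still respecting bandwidth limits. This algorithm prioritizes image quality. H.264 has three rate-control methods: CBR, VBR, and CVBR.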

1. MP4 analysis

  • JSON analysis of the MP4 video file
{
"media": {
"@ref": "/Users/Keer_LGQ/Documents/GitHub/DKShortVideo/0001.mp4",
"track": [
{
"@type": "General",
"VideoCount": "1",
"AudioCount": "1",
"FileExtension": "mp4",
"Format": "MPEG-4",
"Format_Profile": "Base Media",
"CodecID": "mp42",
"FileSize": "8136183",
"Duration": "174.893",
"OverallBitRate_Mode": "VBR",
"OverallBitRate": "372167",
"FrameRate": "30.000",
"FrameCount": "5246",
"StreamSize": "64369",
"HeaderSize": "64353",
"DataSize": "8071830",
"FooterSize": "0",
"IsStreamable": "Yes",
"Encoded_Date": "UTC 2015-05-29 12:19:01",
"Tagged_Date": "UTC 2015-05-29 12:19:01",
"File_Modified_Date": "UTC 2017-06-05 01:57:20",
"File_Modified_Date_Local": "2017-06-05 09:57:20"
},
{
"@type": "Video",
"StreamOrder": "0",
"ID": "1",
"Format": "AVC",
"Format_Profile": "Main",
"Format_Level": "3.1",
"Format_Settings_CABAC": "Yes",
"Format_Settings_RefFrames": "3",
"CodecID": "avc1",
"Duration": "174.867",
"Source_Duration": "174.867",
"BitRate": "243609",
"Width": "854",
"Height": "480",
"Stored_Width": "864",
"Sampled_Width": "854",
"Sampled_Height": "480",
"PixelAspectRatio": "1.000",
"DisplayAspectRatio": "1.779",
"Rotation": "0.000",
"FrameRate_Mode": "VFR",
"FrameRate": "30.000",
"FrameRate_Minimum": "30.000",
"FrameRate_Maximum": "30.030",
"FrameCount": "5246",
"Standard": "NTSC",
"ColorSpace": "YUV",
"ChromaSubsampling": "4:2:0",
"BitDepth": "8",
"ScanType": "Progressive",
"StreamSize": "5324888",
"Source_StreamSize": "5324888",
"Language": "en",
"Encoded_Date": "UTC 2015-05-29 12:19:01",
"Tagged_Date": "UTC 2015-05-29 12:19:01",
"colour_range": "Limited",
"extra": {
"mdhd_Duration": "174867"
}
},
{
"@type": "Audio",
"StreamOrder": "1",
"ID": "2",
"Format": "AAC",
"Format_Profile": "LC",
"CodecID": "mp4a-40-2",
"Duration": "174.893",
"Source_Duration": "174.892",
"BitRate_Mode": "VBR",
"BitRate": "125651",
"BitRate_Maximum": "184324",
"Channels": "2",
"ChannelPositions": "Front: L R",
"ChannelLayout": "L R",
"SamplesPerFrame": "1024",
"SamplingRate": "44100",
"SamplingCount": "7712782",
"FrameRate": "43.066",
"FrameCount": "7532",
"Source_FrameCount": "7532",
"Compression_Mode": "Lossy",
"StreamSize": "2746926",
"StreamSize_Proportion": "0.33762",
"Source_StreamSize": "2746926",
"Source_StreamSize_Proportion": "0.33762",
"Language": "en",
"Encoded_Date": "UTC 2015-05-29 12:19:01",
"Tagged_Date": "UTC 2015-05-29 12:19:01",
"extra": {
"mdhd_Duration": "174893"
}
}
]
}
}
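
Several fields in the JSON above can be cross-checked with simple arithmetic. The sketch below, in plain C, assumes that the overall bit rate is simply total bits divided by duration and that the audio "frame rate" is the sampling rate divided by the samples per frame. Both assumptions agree with the numbers shown here, but the function names are hypothetical and the analysis tool may compute its fields differently.

```c
#include <stdint.h>

/* Average bit rate in bits per second: total bits divided by duration.
 * Returns 0 for a non-positive duration. */
double average_bitrate_bps(uint64_t size_bytes, double duration_s)
{
    if (duration_s <= 0.0)
        return 0.0;
    return (double)size_bytes * 8.0 / duration_s;
}

/* Audio packets ("frames") per second for a codec with fixed-size packets,
 * for example AAC-LC with 1024 samples per packet. */
double audio_frame_rate(double sampling_rate_hz, unsigned samples_per_frame)
{
    if (samples_per_frame == 0)
        return 0.0;
    return sampling_rate_hz / samples_per_frame;
}
```

With FileSize 8136183 and Duration 174.893, `average_bitrate_bps` gives about 372167 bit/s, matching OverallBitRate. `audio_frame_rate(44100.0, 1024)` gives about 43.066, matching the audio FrameRate and the nominalFrameRate that AVFoundation logs for the audio track.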

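Parsing

// assetWithURL: takes an NSURL, not an NSString; "*.mp4" stands in for the real file path
self.asset = [AVURLAsset assetWithURL:[NSURL fileURLWithPath:@"*.mp4"]];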

(lldb) po _asset.tracks
<__NSArrayM 0x1c405fa70>(
,

)
    // Load the "tracks" key asynchronously so that reading it does not block the calling thread.
    [_asset loadValuesAsynchronouslyForKeys:@[@"tracks"] completionHandler: ^{

        // Make sure the key actually loaded before touching the tracks.
        NSError *error = nil;
        if ([_asset statusOfValueForKey:@"tracks" error:&error] != AVKeyValueStatusLoaded) {
            NSLog(@"failed to load tracks: %@", error);
            return;
        }

        NSArray *videoTracks = [_asset tracksWithMediaType:AVMediaTypeVideo];
        AVAssetTrack *trackVideo = videoTracks.firstObject; // nil when there is no video track
        if (trackVideo == nil) {
            return;
        }
        
        CGSize size = [trackVideo naturalSize];
        Float64 sec = CMTimeGetSeconds([_asset duration]);
        float videoFrameRate = [trackVideo nominalFrameRate];
        
        NSLog(@"trackVideo = %@",trackVideo);
        NSLog(@"trackVideoSize = %f--%f",size.width,size.height);
        NSLog(@"trackVideoSec = %f",sec);
        NSLog(@"trackVideoFrameRate = %f",videoFrameRate);
        NSLog(@"trackVideo mediaType = %@",trackVideo.mediaType);
        NSLog(@"trackVideo formatDescriptions = %@",trackVideo.formatDescriptions);
        NSLog(@"trackVideo playable = %d",trackVideo.playable);
        NSLog(@"trackVideo decodable = %d",trackVideo.decodable);
        NSLog(@"trackVideo totalSampleDataLength = %lld",trackVideo.totalSampleDataLength);
        NSLog(@"trackVideo languageCode = %@",trackVideo.languageCode);
        // Note: the label says "mediaType", but this line prints the start and duration of timeRange.
        NSLog(@"trackVideo mediaType = %f  -- %f",CMTimeGetSeconds(trackVideo.timeRange.start),CMTimeGetSeconds(trackVideo.timeRange.duration));
        
        NSLog(@"trackVideo segments = %@",trackVideo.segments);
        NSLog(@"trackVideo commonMetadata = %@",trackVideo.commonMetadata);
        NSLog(@"trackVideo metadata = %@",trackVideo.metadata);
        NSLog(@"trackVideo availableMetadataFormats = %@",trackVideo.availableMetadataFormats);
        NSLog(@"trackVideo availableTrackAssociationTypes = %@",trackVideo.availableTrackAssociationTypes);
    }];

  • Video track info
2018-04-16 13:47:15.491982+0800 DKShortVideo[6282:3083500] trackVideo = 
2018-04-16 13:47:15.492057+0800 DKShortVideo[6282:3083500] trackVideoSize = 854.000000--480.000000
2018-04-16 13:47:15.492109+0800 DKShortVideo[6282:3083500] trackVideoSec = 174.892700
2018-04-16 13:47:15.492145+0800 DKShortVideo[6282:3083500] trackVideoFrameRate = 30.000000
2018-04-16 13:47:15.492226+0800 DKShortVideo[6282:3083500] trackVideo mediaType = vide
2018-04-16 13:47:15.496043+0800 DKShortVideo[6282:3083500] trackVideo formatDescriptions = (
    " {\n\tmediaType:'vide' \n\tmediaSubType:'avc1' \n\tmediaSpecific: {\n\t\tcodecType: 'avc1'\t\tdimensions: 854 x 480 \n\t} \n\textensions: {{type = immutable dict, count = 13,\nentries =>\n\t2 : {contents = \"FormatName\"} = {contents = \"AVC Coding\"}\n\t3 : {contents = \"SpatialQuality\"} = {value = +0, type = kCFNumberSInt32Type}\n\t4 : {contents = \"Version\"} = {value = +0, type = kCFNumberSInt16Type}\n\t5 : {contents = \"CVImageBufferChromaLocationBottomField\"} = Left\n\t8 : {contents = \"CVPixelAspectRatio\"} = {type = immutable dict, count = 2,\nentries =>\n\t1 : {contents = \"HorizontalSpacing\"} = {value = +1, type = kCFNumberSInt32Type}\n\t2 : {contents = \"VerticalSpacing\"} = {value = +1, type = kCFNumberSInt32Type}\n}\n\n\t11 : {contents = \"TemporalQuality\"} = {value = +0, type = kCFNumberSInt32Type}\n\t12 : {contents = \"RevisionLevel\"} = {value = +0, type = kCFNumberSInt16Type}\n\t16 : {contents = \"CVImageBufferChromaLocationTopField\"} = Left\n\t17 : {contents = \"VerbatimISOSampleEntry\"} = {length = 136, capacity = 136, bytes = 0x00000088617663310000000000000001 ... 4801000468ebcd48}\n\t18 : {contents = \"SampleDescriptionExtensionAtoms\"} = {type = immutable dict, count = 1,\nentries =>\n\t2 : {contents = \"avcC\"} = {length = 42, capacity = 42, bytes = 0x014d001fffe1001b674d401f965201b0 ... 4801000468ebcd48}\n}\n\n\t19 : {contents = \"FullRangeVideo\"} = {value = false}\n\t20 : {contents = \"CVFieldCount\"} = {value = +1, type = kCFNumberSInt32Type}\n\t22 : {contents = \"Depth\"} = {value = +24, type = kCFNumberSInt16Type}\n}\n}\n}"
)
2018-04-16 13:47:15.509285+0800 DKShortVideo[6282:3083500] trackVideo playable = 1
2018-04-16 13:47:15.510107+0800 DKShortVideo[6282:3083500] trackVideo decodable = 1
2018-04-16 13:47:15.510365+0800 DKShortVideo[6282:3083500] trackVideo totalSampleDataLength = 5324888
2018-04-16 13:47:15.510639+0800 DKShortVideo[6282:3083500] trackVideo languageCode = eng
2018-04-16 13:47:15.511167+0800 DKShortVideo[6282:3083500] trackVideo mediaType = 0.000000  -- 174.866667
2018-04-16 13:47:15.511615+0800 DKShortVideo[6282:3083500] trackVideo segments = (
    ""
)
  • Audio track info
// The log labels still say "video", but this output is for the audio track
trackVideo = 
trackVideoSize = 0.000000--0.000000
trackVideoSec = 174.892700
trackVideoFrameRate = 43.066406
trackVideo mediaType = soun
trackVideo formatDescriptions = (
    " {\n\tmediaType:'soun' \n\tmediaSubType:'aac ' \n\tmediaSpecific: {\n\t\tASBD: {\n\t\t\tmSampleRate: 44100.000000 \n\t\t\tmFormatID: 'aac ' \n\t\t\tmFormatFlags: 0x0 \n\t\t\tmBytesPerPacket: 0 \n\t\t\tmFramesPerPacket: 1024 \n\t\t\tmBytesPerFrame: 0 \n\t\t\tmChannelsPerFrame: 2 \n\t\t\tmBitsPerChannel: 0 \t} \n\t\tcookie: {{length = 36, capacity = 36, bytes = 0x038080801f0040100480808014401500 ... 8080021210060102}} \n\t\tACL: {\U7acb\U4f53\U58f0\Uff08L R\Uff09}\n\t\tFormatList Array: {(null)} \n\t} \n\textensions: {{type = immutable dict, count = 1,\nentries =>\n\t2 : {contents = \"VerbatimISOSampleEntry\"} = {length = 84, capacity = 84, bytes = 0x000000546d7034610000000000000001 ... 8080021210060102}\n}\n}\n}"
)
trackVideo playable = 1
trackVideo decodable = 1
trackVideo totalSampleDataLength = 2746926
trackVideo languageCode = eng
trackVideo mediaType = 0.000000  -- 174.892698
trackVideo segments = (
    ""
)

The log output shows that most of the information about an MP4 file's tracks is found in formatDescriptions.

typedef const struct CM_BRIDGED_TYPE(id) opaqueCMFormatDescription *CMFormatDescriptionRef;

What does mediaType = soun mean? CMMediaType is a FourCharCode typedef whose known values are declared in an enum:


typedef FourCharCode CMMediaType;
#if COREMEDIA_USE_DERIVED_ENUMS_FOR_CONSTANTS
enum : CMMediaType
#else
enum
#endif
{
    kCMMediaType_Video              = 'vide',
    kCMMediaType_Audio              = 'soun',
    kCMMediaType_Muxed              = 'muxx',
    kCMMediaType_Text               = 'text',
    kCMMediaType_ClosedCaption      = 'clcp',
    kCMMediaType_Subtitle           = 'sbtl',
    kCMMediaType_TimeCode           = 'tmcd',
    kCMMediaType_Metadata           = 'meta',
};
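
These constants are four ASCII characters packed into one 32-bit integer, which is why the log prints "vide" and "soun". The sketch below shows the packing in plain C, assuming the first character occupies the most significant byte (that is how 'soun' becomes 0x736F756E). The helper names are hypothetical and are not part of Core Media.

```c
#include <stdint.h>

/* Pack four characters into a 32-bit code, first character in the
 * most significant byte. */
uint32_t fourcc_from_chars(char a, char b, char c, char d)
{
    return ((uint32_t)(unsigned char)a << 24) |
           ((uint32_t)(unsigned char)b << 16) |
           ((uint32_t)(unsigned char)c << 8)  |
            (uint32_t)(unsigned char)d;
}

/* Unpack a 32-bit code into its four characters plus a terminating NUL.
 * `out` must have room for 5 bytes. */
void fourcc_to_string(uint32_t code, char out[5])
{
    out[0] = (char)((code >> 24) & 0xFF);
    out[1] = (char)((code >> 16) & 0xFF);
    out[2] = (char)((code >> 8) & 0xFF);
    out[3] = (char)(code & 0xFF);
    out[4] = '\0';
}
```

For example, `fourcc_from_chars('s', 'o', 'u', 'n')` yields 0x736F756E, and passing 0x76696465 to `fourcc_to_string` produces "vide". The same unpacking works for media subtypes such as 'avc1'.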

As the analysis above shows, reading an MP4 file with AVFoundation is fairly straightforward.

2. m3u8 and .ts

Load an m3u8 video URL and inspect the response with Charles.


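// VOD: short for "Video on Demand". As the name suggests, a [video-on-demand system](https://baike.baidu.com/item/%E8%A7%86%E9%A2%91%E7%82%B9%E6%92%AD%E7%B3%BB%E7%BB%9F) plays programs at the viewer's request, sending the video that the user clicked or selected to the user who asked for it.
// TARGETDURATION: the maximum segment duration in seconds; here each segment lasts at most 10 seconds.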
*response:*

#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-PLAYLIST-TYPE:VOD
#EXTINF:9.97667,    
fileSequence0.ts
#EXTINF:9.97667,    
fileSequence1.ts
#EXTINF:9.97667,    
fileSequence2.ts
#EXTINF:9.97667,    
fileSequence3.ts
.
.
.
#EXTINF:9.97667,    
fileSequence148.ts
#EXTINF:9.97667,    
fi

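So what the player actually requests are the individual .ts segments. The playlist declares each one as 9.97667 seconds, while the segment downloaded and analyzed below measures 9.943333008 seconds.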
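
As a rough illustration of how the playlist describes its segments, the plain C sketch below scans an in-memory playlist for #EXTINF lines and sums their durations. The function name `m3u8_total_duration` is hypothetical, and the code only reads the leading number after the tag: it ignores titles, byte ranges, and every other tag, so it is not a complete HLS parser.

```c
#include <stdlib.h>
#include <string.h>

/* Sum the durations of all #EXTINF entries in a NUL-terminated playlist.
 * If `count` is not NULL, it receives the number of segments found. */
double m3u8_total_duration(const char *playlist, int *count)
{
    static const char tag[] = "#EXTINF:";
    const size_t tag_len = sizeof tag - 1;
    double total = 0.0;
    int n = 0;
    const char *line = playlist;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, tag, tag_len) == 0) {
            /* strtod stops at the comma that ends the duration field. */
            total += strtod(line + tag_len, NULL);
            n++;
        }
        /* Advance to the start of the next line, if there is one. */
        const char *newline = strchr(line, '\n');
        line = (newline != NULL) ? newline + 1 : NULL;
    }

    if (count != NULL)
        *count = n;
    return total;
}
```

Fed the first three entries of the response above, it reports 3 segments and roughly 29.93 seconds in total.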

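Next, let's look at how a .ts file is parsed.
First, download one .ts segment:

// curl's lowercase -o option writes the downloaded data to the given file (an absolute path is used here). Example:
curl http://devimages.apple.com.edgekey.net/streaming/examples/bipbop_4x3/gear2/fileSequence2.ts -o /Users/Keer_LGQ/Documents/Video\ Filter/video.ts --progress-bar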
  • JSON analysis of the TS video file
{
"media": {
"@ref": "/Users/Keer_LGQ/Documents/GitHub/DKShortVideo/video.ts",
"track": [
{
"@type": "General",
"ID": "1",
"VideoCount": "1",
"AudioCount": "1",
"FileExtension": "ts",
"Format": "MPEG-TS",
"FileSize": "816108",
"Duration": "9.943333008",
"OverallBitRate_Mode": "VBR",
"OverallBitRate": "655246",
"File_Modified_Date": "UTC 2018-04-16 06:39:59",
"File_Modified_Date_Local": "2018-04-16 14:39:59",
"extra": {
"OverallBitRate_Precision_Min": "655212",
"OverallBitRate_Precision_Max": "655280"
}
},
{
"@type": "Video",
"StreamOrder": "0-0",
"ID": "257",
"MenuID": "1",
"Format": "AVC",
"Format_Profile": "Main",
"Format_Level": "3",
"Format_Settings_CABAC": "No",
"Format_Settings_RefFrames": "2",
"Format_Settings_GOP": "M=2, N=24",
"CodecID": "27",
"Duration": "9.877",
"Width": "640",
"Height": "480",
"Sampled_Width": "640",
"Sampled_Height": "480",
"PixelAspectRatio": "1.000",
"DisplayAspectRatio": "1.333",
"ColorSpace": "YUV",
"ChromaSubsampling": "4:2:0",
"BitDepth": "8",
"ScanType": "Progressive",
"Delay": "29.920000",
"colour_range": "Limited",
"colour_description_present": "Yes",
"colour_primaries": "BT.601 NTSC",
"transfer_characteristics": "BT.709",
"matrix_coefficients": "BT.601"
},
{
"@type": "Audio",
"StreamOrder": "0-1",
"ID": "258",
"MenuID": "1",
"Format": "AAC",
"Format_Version": "4",
"Format_Profile": "LC",
"MuxingMode": "ADTS",
"CodecID": "15",
"Duration": "9.287",
"BitRate_Mode": "VBR",
"Channels": "2",
"ChannelPositions": "Front: L R",
"ChannelLayout": "L R",
"SamplesPerFrame": "1024",
"SamplingRate": "22050",
"SamplingCount": "204778",
"FrameRate": "21.533",
"Compression_Mode": "Lossy",
"Delay": "29.873377",
"Delay_Source": "Container"
}
]
}
}

So what exactly is a TS file?
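
In short, a .ts file is an MPEG-2 transport stream: a sequence of fixed-size 188-byte packets, each starting with the sync byte 0x47 and carrying a 13-bit packet identifier (PID) that says which stream the payload belongs to. The plain C sketch below reads only those two header facts. The helper names are hypothetical, and treating the IDs 257 and 258 in the JSON above as PIDs is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define TS_PACKET_SIZE 188
#define TS_SYNC_BYTE   0x47

/* Return the 13-bit PID of one TS packet, or -1 if the buffer is NULL
 * or does not start with the sync byte. */
int ts_packet_pid(const uint8_t *packet)
{
    if (packet == NULL || packet[0] != TS_SYNC_BYTE)
        return -1;
    /* PID = low 5 bits of byte 1 followed by all 8 bits of byte 2. */
    return ((packet[1] & 0x1F) << 8) | packet[2];
}

/* Number of whole packets in a file of `file_size` bytes, or -1 if the
 * size is negative or not a multiple of the packet size. */
long ts_packet_count(long file_size)
{
    if (file_size < 0 || file_size % TS_PACKET_SIZE != 0)
        return -1;
    return file_size / TS_PACKET_SIZE;
}
```

The FileSize of 816108 bytes reported above is exactly 4341 packets of 188 bytes, which is consistent with this layout. A packet whose first three bytes are 0x47 0x41 0x01 carries PID 257.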

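Note the difference between loading an m3u8 stream and loading a local video:

  • Local video: once an AVURLAsset has been created from the URL, the AVAsset's NSArray *tracks can be read.
  • Network video: the tracks only become available, on the AVPlayerItem's NSArray *tracks, after enough of the stream has loaded. You can read them in an observer on the AVPlayerItem, or from AVPlayer's periodic playback callback:
    - (id)addPeriodicTimeObserverForInterval:(CMTime)interval queue:(nullable dispatch_queue_t)queue usingBlock:(void (^)(CMTime time))block;
  • Video track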
--------------------------------------------
2018-04-16 16:38:47.524132+0800 DKShortVideo[6440:3145512] trackVideo = 
2018-04-16 16:38:47.524251+0800 DKShortVideo[6440:3145512] trackVideoSize = 640.000000--480.000000
2018-04-16 16:38:47.524370+0800 DKShortVideo[6440:3145512] trackVideoSec = 1800.000000
2018-04-16 16:38:47.524465+0800 DKShortVideo[6440:3145512] trackVideoFrameRate = 0.000000
2018-04-16 16:38:47.524902+0800 DKShortVideo[6440:3145512] trackVideo mediaType = vide
2018-04-16 16:38:47.531169+0800 DKShortVideo[6440:3145512] trackVideo formatDescriptions = (
    " {\n\tmediaType:'vide' \n\tmediaSubType:'avc1' \n\tmediaSpecific: {\n\t\tcodecType: 'avc1'\t\tdimensions: 640 x 480 \n\t} \n\textensions: {{type = immutable dict, count = 9,\nentries =>\n\t0 : {contents = \"SampleDescriptionExtensionAtoms\"} = {type = immutable dict, count = 1,\nentries =>\n\t2 : {contents = \"avcC\"} = {length = 35, capacity = 35, bytes = 0x014d401effe10014274d401ea9181407 ... 1001000428de09c8}\n}\n\n\t1 : {contents = \"CVImageBufferYCbCrMatrix\"} = {contents = \"ITU_R_601_4\"}\n\t2 : {contents = \"CVFieldCount\"} = {value = +1, type = kCFNumberSInt32Type}\n\t4 : {contents = \"Depth\"} = {value = +24, type = kCFNumberSInt32Type}\n\t5 : {contents = \"CVImageBufferColorPrimaries\"} = SMPTE_C\n\t6 : {contents = \"FullRangeVideo\"} = {value = false}\n\t8 : {contents = \"CVImageBufferTransferFunction\"} = {contents = \"ITU_R_709_2\"}\n\t10 : {contents = \"CVImageBufferChromaLocationBottomField\"} = Center\n\t12 : {contents = \"CVImageBufferChromaLocationTopField\"} = Center\n}\n}\n}"
)
2018-04-16 16:38:47.550922+0800 DKShortVideo[6440:3145512] trackVideo playable = 1
2018-04-16 16:38:47.551010+0800 DKShortVideo[6440:3145512] trackVideo decodable = 1
2018-04-16 16:38:47.551068+0800 DKShortVideo[6440:3145512] trackVideo totalSampleDataLength = 0
2018-04-16 16:38:47.551126+0800 DKShortVideo[6440:3145512] trackVideo languageCode = (null)
2018-04-16 16:38:47.551229+0800 DKShortVideo[6440:3145512] trackVideo mediaType = nan  -- nan
2018-04-16 16:38:47.554477+0800 DKShortVideo[6440:3145512]  --------------------------------------------
  • Audio track
--------------------------------------------
2018-04-16 16:38:47.554736+0800 DKShortVideo[6440:3145512] trackVideo = 
2018-04-16 16:38:47.554774+0800 DKShortVideo[6440:3145512] trackVideoSize = 640.000000--480.000000
2018-04-16 16:38:47.554817+0800 DKShortVideo[6440:3145512] trackVideoSec = 1800.000000
2018-04-16 16:38:47.554851+0800 DKShortVideo[6440:3145512] trackVideoFrameRate = 0.000000
2018-04-16 16:38:47.554929+0800 DKShortVideo[6440:3145512] trackVideo mediaType = soun
2018-04-16 16:38:47.562085+0800 DKShortVideo[6440:3145512] trackVideo formatDescriptions = (
    " {\n\tmediaType:'soun' \n\tmediaSubType:'aac ' \n\tmediaSpecific: {\n\t\tASBD: {\n\t\t\tmSampleRate: 22050.000000 \n\t\t\tmFormatID: 'aac ' \n\t\t\tmFormatFlags: 0x0 \n\t\t\tmBytesPerPacket: 0 \n\t\t\tmFramesPerPacket: 1024 \n\t\t\tmBytesPerFrame: 0 \n\t\t\tmChannelsPerFrame: 2 \n\t\t\tmBitsPerChannel: 0 \t} \n\t\tcookie: {{length = 39, capacity = 39, bytes = 0x03808080220000000480808014401500 ... 1390068080800102}} \n\t\tACL: {(null)}\n\t\tFormatList Array: {(null)} \n\t} \n\textensions: {(null)}\n}"
)
2018-04-16 16:38:47.562175+0800 DKShortVideo[6440:3145512] trackVideo playable = 1
2018-04-16 16:38:47.562201+0800 DKShortVideo[6440:3145512] trackVideo decodable = 1
2018-04-16 16:38:47.562222+0800 DKShortVideo[6440:3145512] trackVideo totalSampleDataLength = 0
2018-04-16 16:38:47.562243+0800 DKShortVideo[6440:3145512] trackVideo languageCode = (null)
2018-04-16 16:38:47.562423+0800 DKShortVideo[6440:3145512] trackVideo mediaType = nan  -- nan

2018-04-16 16:38:47.562884+0800 DKShortVideo[6440:3145512]  --------------------------------------------
  • Closed-caption track
2018-04-16 16:38:47.562932+0800 DKShortVideo[6440:3145512]  --------------------------------------------
2018-04-16 16:38:47.563000+0800 DKShortVideo[6440:3145512] trackVideo = 
2018-04-16 16:38:47.563024+0800 DKShortVideo[6440:3145512] trackVideoSize = 640.000000--480.000000
2018-04-16 16:38:47.563317+0800 DKShortVideo[6440:3145512] trackVideoSec = 1800.000000
2018-04-16 16:38:47.563353+0800 DKShortVideo[6440:3145512] trackVideoFrameRate = 0.000000
2018-04-16 16:38:47.563418+0800 DKShortVideo[6440:3145512] trackVideo mediaType = clcp
2018-04-16 16:38:47.564093+0800 DKShortVideo[6440:3145512] trackVideo formatDescriptions = (
    " {\n\tmediaType:'clcp' \n\tmediaSubType:'atcc' \n\tmediaSpecific: {\n(null) \n\t} \n\textensions: {(null)}\n}"
)
2018-04-16 16:38:47.564131+0800 DKShortVideo[6440:3145512] trackVideo playable = 1
2018-04-16 16:38:47.564152+0800 DKShortVideo[6440:3145512] trackVideo decodable = 1
2018-04-16 16:38:47.564171+0800 DKShortVideo[6440:3145512] trackVideo totalSampleDataLength = 0
2018-04-16 16:38:47.564191+0800 DKShortVideo[6440:3145512] trackVideo languageCode = (null)
2018-04-16 16:38:47.564211+0800 DKShortVideo[6440:3145512] trackVideo mediaType = nan  -- nan

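  • From a track of type AVMediaTypeVideo you can read the video's width, height, total duration, and similar information.

A solid understanding of audio and video track information goes a long way when learning audio/video development.

I look forward to improving together with everyone.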
