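Metadata moov (Part 3): the tref box (ISO-14496-12)
Author: Pirate Leo
Email: [email protected]
ISO 14496-12 defines a base file format for encapsulating media data. Common container formats such as mp4, 3gp, and ismv are all derived from this base file format.
For an overall view of the base file format, see my earlier post "MP4 File Format Explained: Structure Overview".
This series starts from MP4 files and walks through the important boxes in the file.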
<======================================================================>
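This post continues the analysis of the moov box. If you are new to moov, I recommend starting with my earlier post "MP4 File Format Explained: Metadata moov (Part 1)".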
Level | Box | Mandatory | Description |
---|---|---|---|
1 | moov | √ | container for all the metadata |
2 | mvhd | √ | movie header, overall declarations |
2 | trak | √ | container for an individual track or stream |
3 | tkhd | √ | track header, overall information about the track |
3 | tref | | track reference container |
3 | edts | | edit list container |
4 | elst | | an edit list |
3 | mdia | √ | container for the media information in a track |
4 | mdhd | √ | media header, overall information about the media |
4 | hdlr | √ | handler, declares the media (handler) type |
4 | minf | √ | media information container |
5 | vmhd | | video media header, overall information (video track only) |
5 | smhd | | sound media header, overall information (sound track only) |
5 | hmhd | | hint media header, overall information (hint track only) |
5 | nmhd | | Null media header, overall information (some tracks only) |
5 | dinf | √ | data information box, container |
6 | dref | √ | data reference box, declares source(s) of media data in track |
5 | stbl | √ | sample table box, container for the time/space map |
6 | stsd | √ | sample descriptions (codec types, initialization etc.) |
6 | stts | √ | (decoding) time-to-sample |
6 | ctts | | (composition) time to sample |
6 | stsc | √ | sample-to-chunk, partial data-offset information |
6 | stsz | | sample sizes (framing) |
6 | stz2 | | compact sample sizes (framing) |
6 | stco | √ | chunk offset, partial data-offset information |
6 | co64 | | 64-bit chunk offset |
6 | stss | | sync sample table (random access points) |
6 | stsh | | shadow sync sample table |
6 | padb | | sample padding bits |
6 | stdp | | sample degradation priority |
6 | sdtp | | independent and disposable samples |
6 | sbgp | | sample-to-group |
6 | sgpd | | sample group description |
6 | subs | | sub-sample information |
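I could not find an MP4 file containing a tref box on my machine, so I cannot analyze real data here.
The specification, however, is enough to understand what the tref box does:
A tref box describes the relationship between two tracks.
For example, suppose an MP4 file has three video tracks with IDs 2, 3, and 4, and three audio tracks with IDs 6, 7, and 8.
When playing the video in track 2, which of the audio tracks 6, 7, and 8 should be played with it? This has to be stated in the tref boxes of tracks 2 and 6, which binds the two tracks together.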
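To make the binding concrete, here is a minimal Python sketch that serializes a tref box in which track 2 references track 6. It is only an illustration under assumptions: the helper names `make_box` and `make_tref` are hypothetical, the reference type `sync` is taken from Apple's table of track reference types, and only the plain 32-bit box size is handled.

```python
import struct

def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Serialize a basic box: 32-bit big-endian size, 4-byte type, then payload."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly 4 bytes")
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def make_tref(references: dict) -> bytes:
    """Build a tref box with one child box per reference type.

    `references` maps a 4-byte reference type to a list of track IDs.
    """
    children = b"".join(
        make_box(ref_type, struct.pack(f">{len(ids)}I", *ids))
        for ref_type, ids in references.items()
    )
    return make_box(b"tref", children)

# Hypothetical example: the tref box stored inside the trak of track 2,
# declaring a 'sync' reference to audio track 6.
tref_for_track_2 = make_tref({b"sync": [6]})
print(tref_for_track_2.hex())
# 00000014747265660000000c73796e6300000006
```

The output reads as: a 20-byte `tref` box containing a 12-byte `sync` child box whose payload is the single track ID 6.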
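This situation almost never shows up in the MP4 files we commonly see, so where is it actually used?
As we know, ISO-14496-12 is a base file format. It is the basis not only for mp4 files but also for many streaming formats used for live online delivery, such as the ismv files in Microsoft's Smooth Streaming solution.
Suppose we are a TV station that publishes its programs with Microsoft's Smooth Streaming, and we offer 13 channels, CCAV 1-13. Our server may push out only a single media stream. That stream carries all 13 channels, so it has at least 13 video tracks and 13 audio tracks. Viewers watch on a set-top-box-like hardware device from some vendor. The device can decode and play, but it must find the video and audio that belong to each channel (it must not show the CCAV 5 basketball game while playing the audio of 共同关注 from CCAV 13). This is where the tref box is needed to map each video track to its audio track.
This is one practical use of the tref box, somewhat like the PAT and PMT in the ts format. The official specification describes another use, the reference clock track. Put simply, the audio and video tracks both reference the same time code track so that they play in sync, similar to the relationship between the PCR and each track's PTS in the ts format.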
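The channel-pairing scenario can be sketched as a lookup over already-parsed track references. Everything here is assumed for illustration: the function `paired_tracks`, the in-memory `track_refs` layout, the track ID numbering (video 1-13, audio 101-113), and the use of the `sync` reference type. Because a reference may be declared on either track, the lookup checks both directions.

```python
def paired_tracks(track_refs: dict, track_id: int, ref_type: str = "sync") -> list:
    """Return the IDs of tracks linked to `track_id` by `ref_type`, in either direction.

    `track_refs` maps a track ID to {reference type: [referenced track IDs]}.
    """
    # References declared by the track itself.
    linked = set(track_refs.get(track_id, {}).get(ref_type, []))
    # References declared by other tracks that point at this track.
    for other_id, refs in track_refs.items():
        if track_id in refs.get(ref_type, []):
            linked.add(other_id)
    linked.discard(track_id)
    return sorted(linked)

# Hypothetical lineup: video track N references audio track 100 + N.
track_refs = {video_id: {"sync": [100 + video_id]} for video_id in range(1, 14)}

print(paired_tracks(track_refs, 5))    # [105]
print(paired_tracks(track_refs, 113))  # [13]
```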
Now for the concrete fields:
aligned(8) class TrackReferenceBox extends Box('tref')
{
}
aligned(8) class TrackReferenceTypeBox (unsigned int(32) reference_type) extends Box(reference_type)
{
    unsigned int(32) track_IDs[];
}
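Read against this definition, the body of a tref box is just a sequence of child boxes, each with the usual 8-byte header (size, then the reference type as the box type) followed by 32-bit track IDs. Since I have no real file to dissect, the following Python sketch parses a hand-made byte string instead. The function name `parse_tref_payload` is hypothetical, and the sketch assumes plain 32-bit sizes (no 64-bit largesize and no size 0).

```python
import struct

def parse_tref_payload(payload: bytes) -> dict:
    """Parse the body of a tref box (the bytes after its own 8-byte header).

    Returns {reference type: [track IDs]}.
    """
    refs = {}
    offset = 0
    while offset + 8 <= len(payload):
        size, ref_type = struct.unpack_from(">I4s", payload, offset)
        if size < 8 or offset + size > len(payload):
            raise ValueError(f"bad child box size {size} at offset {offset}")
        # Everything after the 8-byte header is an array of 32-bit track IDs.
        count = (size - 8) // 4
        ids = list(struct.unpack_from(f">{count}I", payload, offset + 8))
        refs.setdefault(ref_type.decode("latin-1"), []).extend(ids)
        offset += size
    return refs

# Hand-made sample: a 'sync' child referencing track 6,
# then a 'cdsc' child referencing tracks 2 and 3.
sample = bytes.fromhex(
    "0000000c73796e6300000006"
    "00000010636473630000000200000003"
)
print(parse_tref_payload(sample))
# {'sync': [6], 'cdsc': [2, 3]}
```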
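As the name suggests, the tref box lists the tracks that this track refers to when it is parsed.
Each trak box may contain zero or one tref box (the MP4 files we usually see have none).
Each tref box may contain one or more track reference type boxes.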
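The "zero or one tref per trak" rule can be checked while walking the children of a trak box. The sketch below is an assumption-laden illustration rather than a full parser: `iter_boxes`, `find_tref`, and `box` are hypothetical helpers, only 32-bit box sizes are supported, and the sample trak uses empty placeholder `tkhd` and `mdia` boxes that would not be valid in a real file.

```python
import struct

def iter_boxes(data: bytes):
    """Yield (type, payload) for each box stored back to back in `data`."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < 8 or offset + size > len(data):
            raise ValueError(f"bad box size {size} at offset {offset}")
        yield box_type, data[offset + 8 : offset + size]
        offset += size

def find_tref(trak_payload: bytes):
    """Return the payload of the tref box inside a trak, or None if absent."""
    trefs = [payload for box_type, payload in iter_boxes(trak_payload)
             if box_type == b"tref"]
    if len(trefs) > 1:
        raise ValueError("a trak box may contain at most one tref box")
    return trefs[0] if trefs else None

def box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a box for the demo data below."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Placeholder trak body: empty tkhd, a tref with one 'sync' child, empty mdia.
trak_payload = (
    box(b"tkhd")
    + box(b"tref", box(b"sync", struct.pack(">I", 6)))
    + box(b"mdia")
)
print(find_tref(trak_payload).hex())  # 0000000c73796e6300000006
print(find_tref(box(b"tkhd")))        # None
```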
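Apple's official documentation illustrates this structure with a diagram.
In Apple's specification, atom is another name for box. The diagram shows that a tref box contains several child boxes, each of which carries a type and track IDs.
The type is filled in according to the following table (defined by Apple):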
Track reference types
Reference type | Description |
---|---|
'tmcd' | Time code. Usually references a time code track. |
'chap' | Chapter or scene list. Usually references a text track. |
'sync' | Synchronization. Usually between a video and sound track. Indicates that the two tracks are synchronized. The reference can be from either track to the other, or there may be two references. |
'scpt' | Transcript. Usually references a text track. |
'ssrc' | Non-primary source. Indicates that the referenced track should send its data to this track, rather than presenting it. The referencing track will use the data to modify how it presents its data. See "Track Input Map Atoms" for more information. |
'hint' | The referenced tracks contain the original media for this hint track. |
ISO 14496-12 itself defines the following reference types:
• 'hint' the referenced track(s) contain the original media for this hint track.
• 'cdsc' this track describes the referenced track.
• 'hind' this track depends on the referenced hint track, i.e., it should only be used if the referenced hint track is used.