Element ID, coded with a UTF-8-like system:

bits, big-endian
1xxx xxxx - Class A IDs (2^7 -1 possible values) (base 0x8X)
01xx xxxx xxxx xxxx - Class B IDs (2^14-1 possible values) (base 0x4X 0xXX)
001x xxxx xxxx xxxx xxxx xxxx - Class C IDs (2^21-1 possible values) (base 0x2X 0xXX 0xXX)
0001 xxxx xxxx xxxx xxxx xxxx xxxx xxxx - Class D IDs (2^28-1 possible values) (base 0x1X 0xXX 0xXX 0xXX)

Element data size, coded with the same UTF-8-like system:

bits, big-endian
1xxx xxxx - value 0 to 2^7-2
01xx xxxx xxxx xxxx - value 0 to 2^14-2
001x xxxx xxxx xxxx xxxx xxxx - value 0 to 2^21-2
0001 xxxx xxxx xxxx xxxx xxxx xxxx xxxx - value 0 to 2^28-2
0000 1xxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx - value 0 to 2^35-2
0000 01xx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx - value 0 to 2^42-2
0000 001x xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx - value 0 to 2^49-2
0000 0001 xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx - value 0 to 2^56-2

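Stripping the length marker (a prefix such as 001) leaves the xxx bits, which carry the actual value. For an element data size, those bits are the size itself. Element IDs are conventionally written with the marker bits included, which is why the IDs listed below begin with 0x1A or 0x42.
Both the element ID and the element data size begin with such a prefix. The element data follows directly after the element data size and carries no prefix of its own.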
EBML = 0x1A45DFA3,
EBMLVersion = 0x4286,
EBMLReadVersion = 0x42F7,
EBMLMaxIDLength = 0x42F2,
EBMLMaxSizeLength = 0x42F3,
DocType = 0x4282,
DocTypeVersion = 0x4287,
DocTypeReadVersion = 0x4285,
Below is the EBML header of a file, copied from ghex (after opening a file in ghex, the hex view can be saved as HTML through the Save As menu):
1a 45 df a3 01 00 00 00 00 00 00 1f 42 86 81 01 42 f7
81 01 42 f2 81 04 42 f3 81 08 42 82 84 77 65 62 6d 42
87 81 02 42 85 81 02 18 53 80 67 01 00 00 00 00 18 ab
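Element ID: 1a 45 df a3
Element data size: 01 00 00 00 00 00 00 1f [the first byte is 0000 0001, so the size field is 8 bytes long; with the marker bit stripped, the value is 0x1f, decimal 31]
Element data: the 31 bytes that follow [for the level-0 EBML header element, the data consists of its sub-elements, so the data size is the total length in bytes of all sub-elements in the header]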
Element ID: 42 82 [DocType]
Element data size: 84 [0x84 is binary 1000 0100; stripping the leading 1 leaves 000 0100, decimal 4, so the data that follows occupies four bytes]
Element data: 77 65 62 6d [the ASCII characters w e b m]
/* EBML typefind helper */
static gboolean
ebml_check_header (GstTypeFind * tf, const gchar * doctype, int doctype_len)
{
  /* 4 bytes for EBML ID, 1 byte for header length identifier */
  guint8 *data = gst_type_find_peek (tf, 0, 4 + 1);
  gint len_mask = 0x80, size = 1, n = 1, total;

  if (!data)
    return FALSE;

  /* ebml header? */
  if (data[0] != 0x1A || data[1] != 0x45 || data[2] != 0xDF || data[3] != 0xA3)
    return FALSE;

  /* length of header */
  total = data[4];
  /*
   * len_mask is binary 1000 0000. The loop tests total & len_mask to count
   * the leading zero bits and stops at the first 1 bit; size then equals the
   * number of bytes occupied by the EBML header element's data size field.
   */
  while (size <= 8 && !(total & len_mask)) {
    size++;
    len_mask >>= 1;
  }
  if (size > 8)
    return FALSE;

  /* compute the number of bytes of data in the EBML header (level 0):
   * strip the marker bit, then append the remaining size bytes */
  total &= (len_mask - 1);
  while (n < size)
    total = (total << 8) | data[4 + n++];

  /* get new data for full header, 4 bytes for EBML ID,
   * EBML length tag and the actual header */
  data = gst_type_find_peek (tf, 0, 4 + size + total);
  if (!data)
    return FALSE;

  /* only check doctype if asked to do so */
  if (doctype == NULL || doctype_len == 0)
    return TRUE;

  /* the header must contain the doctype. For now, we don't parse the
   * whole header but simply check for the availability of that array
   * of characters inside the header. Not fully fool-proof, but good
   * enough. */
  for (n = 4 + size; n <= 4 + size + total - doctype_len; n++)
    if (!memcmp (&data[n], doctype, doctype_len))
      return TRUE;

  return FALSE;
}

static void
matroska_type_find (GstTypeFind * tf, gpointer unused)
{
  if (ebml_check_header (tf, "matroska", 8))
    gst_type_find_suggest (tf, GST_TYPE_FIND_MAXIMUM, MATROSKA_CAPS);
  else if (ebml_check_header (tf, NULL, 0))
    gst_type_find_suggest (tf, GST_TYPE_FIND_LIKELY, MATROSKA_CAPS);
}

Multimedia container formats in detail: MKV [1][2][3]
http://blog.csdn.net/tx3344/article/details/8162656
MKV file format
http://blog.chinaunix.net/uid-12845622-id-311943.html
Matroska file parsing: SimpleBlock
http://www.cnblogs.com/tangdoudou/archive/2012/05/14/2499063.html