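The previous post covered how to traverse the records in a data page. This post covers how to read a table's rows out one at a time. As mentioned earlier, InnoDB calls a table's primary key index the Clustered Key and its other, ordinary indexes Secondary Keys, and the two use different record formats. We start with the Clustered Key. A Clustered Key has leaf and non-leaf data pages, whose contents also differ; mysql-file-parser models them as two classes, ClusteredKeyLeafPage and ClusteredKeyNonLeafPage, both extending IndexPage. Because record parsing involves a lot of code, the demo code below is not exhaustive and focuses on the parsing approach; for the full implementation, download the source code.
Let's first look at the general structure of a user record: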
+---------------------------------------------------------------------+
| User Record |
+------------------------+-------------+------------------------------+
| variable-field-lengths | null-bitmap | user-record-extra | contents |
+------------------------+-------------+------------------------------+
variable-field-lengths: (variable), the list of byte lengths stored in contents for each variable-length field.
null-bitmap: (variable), the bitmap of nullable columns (PK and internal columns are never nullable). A bit is 1 if the column is NULL.
user-record-extra: (fixed), 5 bytes for compact/dynamic row format.
contents: (variable), field contents.
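We have already covered parsing user-record-extra (the Extra Bytes). It has a fixed length: reading the 5 bytes immediately before the next_offset position yields it. It holds the record's metadata and links the records into a singly linked list.
Before user-record-extra there is more data, called the null-bitmap. InnoDB does not store NULL values in contents; it marks them in a bitmap instead. The null-bitmap is variable-length, so to read it we first need to know how many bytes it occupies.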
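Bitmap length = (nullableColumnCount + 7) / 8, using integer division. As the formula shows, the length is determined by the number of nullable columns in the table. If every column is NOT NULL, no null-bitmap is stored; if the table has at least one nullable column, a null-bitmap is present. Each bit represents one nullable column: 0 means the value is not NULL, 1 means it is NULL.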
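The formula and the bit test can be sketched in a few lines of Java. This is only an illustration, not the mysql-file-parser implementation: the class and method names are hypothetical, and it assumes the bitmap bytes have already been copied into an array in reading order, that is, `bitmap[0]` is the byte closest to the 5-byte record header and the first nullable column maps to its lowest bit.

```java
class NullBitmap {
    /** Number of bytes occupied by the null-bitmap; 0 when no column is nullable. */
    static int bitmapBytes(int nullableColumnCount) {
        return (nullableColumnCount + 7) / 8;
    }

    /**
     * Tests whether the nullableIndex-th nullable column (0-based) is NULL.
     * Assumption: bitmap[0] is the byte nearest the record header.
     */
    static boolean isNull(byte[] bitmap, int nullableIndex) {
        int b = bitmap[nullableIndex / 8] & 0xFF;
        return (b & (1 << (nullableIndex % 8))) != 0;
    }

    public static void main(String[] args) {
        // Hand-built sample: 6 nullable columns, only the third one is NULL.
        byte[] bitmap = {0x04};
        System.out.println(bitmapBytes(6));    // bytes needed for 6 nullable columns
        System.out.println(isNull(bitmap, 2)); // third nullable column
    }
}
```

Nine nullable columns would need two bytes, and the ninth column would then be the lowest bit of `bitmap[1]`.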
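Before the null-bitmap comes the variable-field-lengths part, the list of lengths of the variable-length fields. The list itself is variable-length, so reading it also requires determining its length first. If a table has no variable-length columns (varchar,text,…), variable-field-lengths does not exist; if it does have them, variable-field-lengths records the number of bytes actually stored for each variable-length field. So how many bytes does variable-field-lengths use to record the length of one variable-length field?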
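CHAR columns are a special case. If the character set of a CHAR column is variable-width, such as utf8mb4 or utf8mb3, InnoDB treats the column as a variable-length field; if the character set is fixed-width, such as latin1, the column is treated as a fixed-length field.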
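A minimal sketch of that rule, assuming the parser knows the minimum and maximum bytes per character of the column's character set. The class and method names are hypothetical, and only the three character sets mentioned above are listed.

```java
class CharStorage {
    /** A CHAR column is variable-length when its charset is not fixed-width. */
    static boolean isVariableLength(int minBytesPerChar, int maxBytesPerChar) {
        return minBytesPerChar != maxBytesPerChar;
    }

    /** Convenience lookup for the character sets named in the text. */
    static boolean isVariableLengthChar(String charset) {
        switch (charset) {
            case "latin1":  return isVariableLength(1, 1);
            case "utf8mb3": return isVariableLength(1, 3);
            case "utf8mb4": return isVariableLength(1, 4);
            default:
                throw new IllegalArgumentException("charset not covered by this sketch: " + charset);
        }
    }
}
```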
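If the maximum length of a variable-length field does not exceed 255 bytes, one byte records the field's length. For example, take a column defined as col1 varchar(10) character set utf8mb4. col1 holds up to 10 characters, and a utf8mb4 character takes up to 4 bytes, so col1 occupies at most 10*4=40 bytes of storage. In this case variable-field-lengths uses 1 byte to store the actual length of col1's content.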
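A single byte holds at most 255, so if a field's maximum possible length exceeds 255 bytes, one byte cannot always express its length and two bytes may be needed. For example, take a column defined as col2 varchar(100) character set utf8mb4: col2 occupies up to 4*100, that is 400 bytes of storage. Does that field then always use 2 bytes to record its length? Not necessarily! Depending on the length actually stored, it may be 1 byte or 2 bytes: when the high bit (0x80) of the first length byte is set, a second byte follows, otherwise the single byte is the length. I spent quite a while on this. The algorithm is as follows: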
if (!field->fixed_length) {
    // if (debug) printf("Variable-length field: read the length\n");
    /* Variable-length field: read the length */
    len = *lens--;
    if (field->max_length > 255 || field->type == FT_BLOB || field->type == FT_TEXT) {
        if (len & 0x80) {
            /* 1exxxxxxx xxxxxxxx */
            len <<= 8;
            len |= *lens--;
            offs += len & 0x3fff;
            if (len & 0x4000) {
                len = offs | REC_OFFS_EXTERNAL;
            } else {
                len = offs;
            }
            goto resolved;
        }
    }
    len = offs += len;
}
This code comes from "c_parser.c" in "undrop-for-innodb". My C is weak and I cannot explain all of its subtleties, so I simply ported the C code above to Java.
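For readers who prefer Java, the length-decoding step of the C fragment above can be sketched as follows. It is not the actual mysql-file-parser code: `VarLenDecoder`, `VarLen` and `decode` are hypothetical names, and the sketch returns the field's own length instead of accumulating the running offset `offs` as the C code does. `pos` points at the current length byte, and the bytes are consumed backwards, mirroring `*lens--`.

```java
class VarLenDecoder {
    /** Decoded entry: data length, off-page (external) flag, length bytes consumed. */
    static final class VarLen {
        final int length;
        final boolean external;
        final int bytesUsed;

        VarLen(int length, boolean external, int bytesUsed) {
            this.length = length;
            this.external = external;
            this.bytesUsed = bytesUsed;
        }
    }

    static VarLen decode(byte[] buf, int pos, int maxLength, boolean isBlobOrText) {
        int len = buf[pos] & 0xFF;
        // Two bytes are only possible for long columns, and only used when
        // the high bit of the first byte is set.
        if ((maxLength > 255 || isBlobOrText) && (len & 0x80) != 0) {
            len = (len << 8) | (buf[pos - 1] & 0xFF);
            // Low 14 bits: length; bit 0x4000: field is stored off-page.
            return new VarLen(len & 0x3FFF, (len & 0x4000) != 0, 2);
        }
        return new VarLen(len, false, 1);
    }

    public static void main(String[] args) {
        // col2 varchar(100) utf8mb4 (max 400 bytes) holding 400 bytes: 0x81 0x90.
        // The array is in file order, so the first length byte is the last element.
        byte[] lens = {(byte) 0x90, (byte) 0x81};
        VarLen v = decode(lens, 1, 400, false);
        System.out.println(v.length + " bytes, length bytes used: " + v.bytesUsed);
    }
}
```

The same col2 column holding only 100 bytes would store the single byte 0x64, which is why the number of length bytes cannot be derived from the column definition alone.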
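With the null-bitmap and variable-field-lengths parsed, we know the stored length of every field and can read each field's content from the contents part that follows. Next, let's look at the record structure of Clustered Key leaf and non-leaf pages: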
### Contents (Clustered Key - Leaf Page) ###
-----------------------+--------------------+------------------+--------------------+
Cluster Key Fields (k) | Transaction ID (6) | Roll Pointer (7) | Non-Key Fields (j) |
-----------------------+--------------------+------------------+--------------------+
### Contents (Clustered Key - Non-Leaf Page) ###
---------------------------------------+----------------------
Cluster Key Min. Key on Child Page (k) | Child Page Number (4)
---------------------------------------+----------------------
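A Clustered Key Leaf Page stores the row data. As the diagram shows, a record starts with the primary key (PK) fields, followed by the two pseudo-columns DB_TRX_ID and DB_ROLL_PTR, and finally the table's remaining non-key columns. If a table has no primary key defined, InnoDB uses the pseudo-column DB_ROW_ID(6) as the primary key.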
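As a small illustration of this layout, the sketch below reads the leading part of a leaf record's contents for a table whose primary key is a 2-byte unsigned integer, such as film_id in sakila.film: first the key, then the 6-byte DB_TRX_ID, then the 7-byte DB_ROLL_PTR. The class and helper names are hypothetical, and the sample bytes are hand-built for the example rather than read from a real page. InnoDB stores integers big-endian, which is what `readUnsignedShortBE` assumes.

```java
class LeafRecordPrefix {
    /** Reads a 2-byte big-endian unsigned integer. */
    static int readUnsignedShortBE(byte[] buf, int pos) {
        return ((buf[pos] & 0xFF) << 8) | (buf[pos + 1] & 0xFF);
    }

    /** Renders len bytes starting at from as lowercase hex. */
    static String hex(byte[] buf, int from, int len) {
        StringBuilder sb = new StringBuilder();
        for (int i = from; i < from + len; i++) {
            sb.append(String.format("%02x", buf[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hand-built sample: key (2 bytes) + DB_TRX_ID (6 bytes) + DB_ROLL_PTR (7 bytes).
        byte[] contents = {
            0x03, (byte) 0xE7,
            0x00, 0x00, 0x00, 0x73, (byte) 0x81, 0x11,
            (byte) 0x81, 0x00, 0x00, 0x02, 0x39, 0x1e, (byte) 0x80
        };
        int pos = 0;
        System.out.println("film_id: " + readUnsignedShortBE(contents, pos));
        pos += 2;
        System.out.println("DB_TRX_ID: " + hex(contents, pos, 6));
        pos += 6;
        System.out.println("DB_ROLL_PTR: " + hex(contents, pos, 7));
    }
}
```

The non-key fields would follow at offset 15, each with the length determined from the column type, the null-bitmap and variable-field-lengths.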
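One more detail before the demo code: page(4) of the ibd file (Page(3) in versions before MySQL 8.0) is described in Jeremy Cole's "InnoDB_Structures.pdf" as "INDEX: Root page of first index". This page is always the root of the primary key index, which means page(4) is the starting point for parsing table data.
When a table has only a few rows, you can observe that page(4) has page_level=0: it is a Clustered Key Leaf Page and all of the table's rows are stored in it. As you keep inserting data and the page can no longer hold all the records, page(4) turns into a Clustered Key Non-Leaf Page, and the data that used to be in page(4) is spread across other leaf pages.
The following demonstrates how to read all the data of the sakila.film table with mysql-file-parser.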
public class IdxPage3 {
    public static TableMeta getFilmTableMeta() {
        ColumnMeta filmId = newFixLengthColumnMeta(UNSIGNED_SMALLINT, 1, "film_id", false);
        ColumnMeta trxId = newTrxIdColumnMeta(2);
        ColumnMeta rollPtr = newRollPtrColumnMeta(3);
        ColumnMeta title = newColumnMeta(VARCHAR, 4, "title", 512, false, true);
        ColumnMeta description = newColumnMeta(TEXT, 5, "description", 65535, true, true);
        ColumnMeta releaseYear = newFixLengthColumnMeta(YEAR, 6, "release_year", true);
        ColumnMeta languageId = newFixLengthColumnMeta(UNSIGNED_TINYINT, 7, "language_id", false);
        ColumnMeta originalLanguageId = newFixLengthColumnMeta(UNSIGNED_TINYINT, 8, "original_language_id", true);
        ColumnMeta rentalDuration = newFixLengthColumnMeta(UNSIGNED_TINYINT, 9, "rental_duration", false);
        ColumnMeta rentalRate = newDecimalColumnMeta(10, "rental_rate", false, 4, 2);
        ColumnMeta length = newFixLengthColumnMeta(UNSIGNED_SMALLINT, 11, "length", true);
        ColumnMeta replacementCost = newDecimalColumnMeta(12, "replacement_cost", false, 5, 2);
        ColumnMeta rating = newEnumColumnMeta(13, "rating", true, "G","PG","PG-13","R","NC-17");
        ColumnMeta specialFeatures = newSetColumnMeta(14, "special_features", true, "Trailers","Commentaries","Deleted Scenes","Behind the Scenes");
        ColumnMeta lastUpdate = newFixLengthColumnMeta(TIMESTAMP, 15, "last_update", false);
        TableMeta tableMeta = new TableMeta()
                .addColumn(filmId)
                .addColumn(trxId)
                .addColumn(rollPtr)
                .addColumn(title)
                .addColumn(description)
                .addColumn(releaseYear)
                .addColumn(languageId)
                .addColumn(originalLanguageId)
                .addColumn(rentalDuration)
                .addColumn(rentalRate)
                .addColumn(length)
                .addColumn(replacementCost)
                .addColumn(rating)
                .addColumn(specialFeatures)
                .addColumn(lastUpdate)
                .setClusterKey(1, filmId);
        return tableMeta;
    }

    public static void main(String[] args) throws Exception {
        String fileName = "D:\\Data\\mysql\\8.0.18\\data\\sakila\\film.ibd";
        try (IbdFileParser parser = new IbdFileParser(fileName)) {
            Map<Integer, List<Long>> pageTypeMap = parser.getPageTypeMap();
            List<Long> indexPageNumbers = pageTypeMap.get(FilHeader.FIL_PAGE_INDEX);
            // page(4) is the root of the primary key index; its index id identifies the clustered key
            IndexPage pkRoot = (IndexPage) parser.getPage(4);
            long pkId = pkRoot.getIndexHeader().getIndexId().longValueExact();
            StringBuilder buff = new StringBuilder();
            int recCount = 0;
            for (long pageNumber : indexPageNumbers) {
                IndexPage indexPage = (IndexPage) parser.getPage(pageNumber);
                IndexHeader indexHeader = indexPage.getIndexHeader();
                int level = indexHeader.getPageLevel();
                long indexId = indexHeader.getIndexId().longValueExact();
                // only leaf pages (level 0) of the clustered key hold row data
                if (indexId == pkId && level == 0) {
                    List<ClusteredKeyLeafRecord> userRecords = new ClusteredKeyLeafPage(indexPage.getPageRaw(),
                            indexPage.getPageSize()).getUserRecords(getFilmTableMeta());
                    for (ClusteredKeyLeafRecord userRecord : userRecords) {
                        List<RecordField> fields = userRecord.getRecordFields();
                        buff.append("\n*************************** ").append(++recCount).append(". row ***************************\n");
                        for (RecordField field : fields) {
                            buff.append(String.format("%20s", field.getName())).append(": ").append(field.getContent())
                                    .append("\n");
                        }
                    }
                }
            }
            System.out.println(buff);
        }
    }
}
Program output:
......
*************************** 999. row ***************************
film_id: 999
DB_TRX_ID: 000000738111
DB_ROLL_PTR: 81000002391e80
title: ZOOLANDER FICTION
description: A Fateful Reflection of a Waitress And a Boat who must Discover a Sumo Wrestler in Ancient China
release_year: 2006
language_id: 1
original_language_id: null
rental_duration: 5
rental_rate: 2.99
length: 101
replacement_cost: 28.99
rating: R
special_features: [Trailers, Deleted Scenes]
last_update: 2006-02-15T05:03:42
*************************** 1000. row ***************************
film_id: 1000
DB_TRX_ID: 000000738111
DB_ROLL_PTR: 81000002391e98
title: ZORRO ARK
description: A Intrepid Panorama of a Mad Scientist And a Boy who must Redeem a Boy in A Monastery
release_year: 2006
language_id: 1
original_language_id: null
rental_duration: 3
rental_rate: 4.99
length: 50
replacement_cost: 18.99
rating: NC-17
special_features: [Trailers, Commentaries, Behind the Scenes]
last_update: 2006-02-15T05:03:42
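To get all of a table's data, it is enough to parse every ClusteredKeyLeafPage (index_id = the primary key's index id, level = 0). Record parsing depends on the table's metadata (column types, the order of the columns in the table, whether each column is nullable, and so on).
Non-leaf pages are parsed in a similar way. sakila.film has only one non-leaf page, page(4), so we parse the contents of page(4) directly: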
public class IdxPage4 {
    public static void main(String[] args) throws Exception {
        String fileName = "D:\\Data\\mysql\\8.0.18\\data\\sakila\\film.ibd";
        try (IbdFileParser parser = new IbdFileParser(fileName)) {
            Page page = parser.getPage(4);
            ClusteredKeyNonLeafPage rootPage = new ClusteredKeyNonLeafPage(page.getPageRaw(), page.getPageSize());
            List<ClusteredKeyNonLeafRecord> records = rootPage.getUserRecords(IdxPage3.getFilmTableMeta());
            StringBuilder buff = new StringBuilder();
            buff.append("Cluster Key Min. Key Child Page Number \n")
                    .append("-------------------- ----------------- \n");
            for (ClusteredKeyNonLeafRecord record : records) {
                long childPage = record.getChildPageNumber();
                List<RecordField> minKeys = record.getMinClusterKeyOnChild();
                String minKeyValue = "";
                for (RecordField minKey : minKeys) {
                    minKeyValue += (minKey.getName() + " = " + minKey.getContent() + " ");
                }
                buff.append(String.format("%21s ", minKeyValue))
                        .append(String.format("%17d", childPage))
                        .append("\n");
            }
            System.out.println(buff);
        }
    }
}
Program output:
Cluster Key Min. Key Child Page Number
-------------------- -----------------
film_id = 1 8
film_id = 51 9
film_id = 153 10
film_id = 255 11
film_id = 359 12
film_id = 462 13
film_id = 565 14
film_id = 669 15
film_id = 772 18
film_id = 874 19
film_id = 976 20
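This index information lets us quickly locate the record we need:
Records with film_id between 1 and 50 are stored in page(8);
Records with film_id between 51 and 152 are stored in page(9);
…
Records with film_id greater than or equal to 976 are stored in page(20);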
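The lookup rule in this list, picking the entry with the largest minimum key that does not exceed the search key, can be sketched with a `TreeMap`. This is only an illustration: `NodePointerLookup` and its methods are hypothetical names, not part of mysql-file-parser, and the map is filled by hand with the (minimum key, child page) pairs from the program output above.

```java
import java.util.Map;
import java.util.TreeMap;

class NodePointerLookup {
    /** Returns the child page whose key range contains the given key. */
    static long childPageFor(TreeMap<Long, Long> minKeyToChild, long key) {
        if (minKeyToChild.isEmpty()) {
            throw new IllegalArgumentException("non-leaf page has no records");
        }
        // Largest minimum key <= search key.
        Map.Entry<Long, Long> entry = minKeyToChild.floorEntry(key);
        if (entry == null) {
            // Keys below the first minimum key belong to the leftmost child.
            entry = minKeyToChild.firstEntry();
        }
        return entry.getValue();
    }

    /** Node pointers of page(4) of sakila.film, copied from the output above. */
    static TreeMap<Long, Long> filmRoot() {
        long[][] pointers = {
            {1, 8}, {51, 9}, {153, 10}, {255, 11}, {359, 12}, {462, 13},
            {565, 14}, {669, 15}, {772, 18}, {874, 19}, {976, 20}
        };
        TreeMap<Long, Long> map = new TreeMap<>();
        for (long[] p : pointers) {
            map.put(p[0], p[1]);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(childPageFor(filmRoot(), 666L)); // child page for film_id=666
    }
}
```

For film_id=666 the floor entry is 565, so the lookup yields page(14).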
To verify, suppose we want the record with film_id=666. According to the index information it is stored in page(14). Let's check whether that is correct:
page = parser.getPage(14);
ClusteredKeyLeafPage leafPage = new ClusteredKeyLeafPage(page.getPageRaw(), page.getPageSize());
List<ClusteredKeyLeafRecord> leafRecords = leafPage.getUserRecords(IdxPage3.getFilmTableMeta());
for (ClusteredKeyLeafRecord record : leafRecords) {
    // ordered field list, sorted by column position in the table, starting at 0
    List<RecordField> fields = record.getRecordFields();
    RecordField filmIdField = fields.get(0);
    long filmId = ((BigInteger) filmIdField.getContent()).longValueExact();
    if (filmId == 666L) {
        for (RecordField field : fields) {
            System.out.println(String.format("%20s", field.getName()) + ": " + field.getContent());
        }
    }
}
The output is as follows:
film_id: 666
DB_TRX_ID: 000000738111
DB_ROLL_PTR: 81000002073ee8
title: PAYCHECK WAIT
description: A Awe-Inspiring Reflection of a Boy And a Man who must Discover a Moose in The Sahara Desert
release_year: 2006
language_id: 1
original_language_id: null
rental_duration: 4
rental_rate: 4.99
length: 145
replacement_cost: 27.99
rating: PG-13
special_features: [Commentaries, Deleted Scenes]
last_update: 2006-02-15T05:03:42
Querying the database for this record gives exactly the same result as the parsed output.
root@localhost [testcase]> select * from sakila.film where film_id=666\G
*************************** 1. row ***************************
film_id: 666
title: PAYCHECK WAIT
description: A Awe-Inspiring Reflection of a Boy And a Man who must Discover a Moose in The Sahara Desert
release_year: 2006
language_id: 1
original_language_id: NULL
rental_duration: 4
rental_rate: 4.99
length: 145
replacement_cost: 27.99
rating: PG-13
special_features: Commentaries,Deleted Scenes
last_update: 2006-02-15 05:03:42
1 row in set (0.00 sec)