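InnoDB stores table rows as an index-organized table (IOT): a B-tree is built on the primary key, and the row records are stored in the leaf nodes of that tree. This differs from Oracle, where a table is created as a heap-organized table (HOT) by default and its rows are stored in a heap structure. In InnoDB, the B-tree that stores the row records is called the Clustered Index, and the table's ordinary B-tree indexes (those not on the primary key) are called Secondary Indexes.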
The data in each table is divided into pages. The pages that make up each table are arranged in a tree data structure called a B-tree index. Table data and secondary indexes both use this type of structure. The B-tree index that represents an entire table is known as the clustered index, which is organized according to the primary key columns. The nodes of a clustered index data structure contain the values of all columns in the row. The nodes of a secondary index structure contain the values of index columns and primary key columns.
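The nodes (pages) of a Clustered Index and those of a Secondary Index hold different content, and the tree itself contains both leaf nodes (Leaf) and non-leaf nodes (Non-Leaf). As mentioned earlier, InnoDB stores both table records and their indexes in FIL_PAGE_INDEX pages, so the first step in parsing a FIL_PAGE_INDEX page is to determine what kind of data the page holds.
Because table data and index data both live in FIL_PAGE_INDEX pages, let us first look at the structure of such a page: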
0+--------------------------+
| FIL Header (38) |
38+--------------------------+
| INDEX Header (36) |
74+--------------------------+
| FSEG Header (20) |
94+--------------------------+
| System Records (26) |
120+--------------------------+
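| User Records | // Records are physically unordered (not stored in key order), similar to a heap table,
| | // but a singly-linked list chains the user records together in key order.
+--------Heap Top----------+ // Heap Top is the high-water mark (HWM) of the space used in the page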
| Free Space |
+--------------------------+
| Page Directory | // The page directory grows downwards from the FIL trailer in ascending order by key, so its space is consumed from the end of the page toward the front.
| | // The number of entries is stored in the INDEX header, in PAGE_N_DIR_SLOTS.
16376+--------------------------+
| FIL Trailer (8) |
16384+--------------------------+
* The number on the left is the byte offset within the page.
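The arithmetic behind the diagram can be written down as a small sketch. It assumes the 16 KiB page size shown above; the class and method names (`PageLayout`, `dirSlotOffset`, `freeSpace`) are made up for illustration and are not part of any parser used in this article. The free-space figure is only the gap between Heap Top and the page directory and ignores space that could be reclaimed from deleted records.

```java
/** Illustrative constants taken from the layout diagram; not an official API. */
final class PageLayout {
    static final int PAGE_SIZE = 16384;        // assumption: default 16 KiB page
    static final int FIL_HEADER_SIZE = 38;
    static final int INDEX_HEADER_SIZE = 36;
    static final int FSEG_HEADER_SIZE = 20;
    static final int SYSTEM_RECORDS_SIZE = 26; // system records area, as in the diagram
    static final int FIL_TRAILER_SIZE = 8;
    static final int DIR_SLOT_SIZE = 2;        // each directory slot is a 2-byte offset

    /** Offset where user records begin: 38 + 36 + 20 + 26. */
    static int userRecordsStart() {
        return FIL_HEADER_SIZE + INDEX_HEADER_SIZE + FSEG_HEADER_SIZE + SYSTEM_RECORDS_SIZE;
    }

    /** Offset of directory slot i; slot 0 sits directly in front of the FIL trailer. */
    static int dirSlotOffset(int slot) {
        return PAGE_SIZE - FIL_TRAILER_SIZE - DIR_SLOT_SIZE * (slot + 1);
    }

    /** Bytes between Heap Top and the lowest-addressed directory slot. */
    static int freeSpace(int heapTop, int nDirSlots) {
        return dirSlotOffset(nDirSlots - 1) - heapTop;
    }

    public static void main(String[] args) {
        System.out.println("user records start at " + userRecordsStart());
        System.out.println("slot 0 at " + dirSlotOffset(0));
        System.out.println("free bytes with heapTop=120, 2 slots: " + freeSpace(120, 2));
    }
}
```

Because the slots are addressed from the end of the page, adding a slot never moves existing slots; it only lowers the boundary of the free space.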
The first step of parsing is to decode the INDEX Header of the FIL_PAGE_INDEX page:
38 +----------------------+
| PAGE_N_DIR_SLOTS (2) |
40 +----------------------+
| PAGE_HEAP_TOP (2) |
42 +----------------------+
| PAGE_N_HEAP (2) |
44 +----------------------+
| PAGE_FREE (2) |
46 +----------------------+
| PAGE_GARBAGE (2) |
48 +----------------------+
| PAGE_LAST_INSERT (2) |
50 +----------------------+
| PAGE_DIRECTION (2) |
52 +----------------------+
| PAGE_N_DIRECTION (2) |
54 +----------------------+
| PAGE_N_RECS (2) |
56 +----------------------+
| PAGE_MAX_TRX_ID (8) |
64 +----------------------+
| PAGE_LEVEL (2) |
66 +----------------------+
| PAGE_INDEX_ID (8) |
74 +----------------------+
* The number on the left is the byte offset within the page.
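As a rough illustration of how these offsets are used, the sketch below reads a few header fields straight from a page buffer. InnoDB stores integers big-endian, which is also the default byte order of `ByteBuffer`. The FIL header offsets (prev at 8, next at 12, page type at 24) and the FIL_PAGE_INDEX value 17855 come from the InnoDB on-disk format rather than from the diagram above. `IndexHeaderSketch` and its methods are hypothetical names, not the parser classes used later in this article.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: reads FIL and INDEX header fields by absolute offset. */
final class IndexHeaderSketch {
    static final int FIL_PAGE_PREV = 8;       // 4 bytes
    static final int FIL_PAGE_NEXT = 12;      // 4 bytes
    static final int FIL_PAGE_TYPE = 24;      // 2 bytes
    static final int PAGE_N_DIR_SLOTS = 38;
    static final int PAGE_HEAP_TOP = 40;
    static final int PAGE_N_HEAP = 42;
    static final int PAGE_N_RECS = 54;
    static final int PAGE_LEVEL = 64;
    static final int PAGE_INDEX_ID = 66;      // 8 bytes
    static final int FIL_PAGE_INDEX = 17855;  // 0x45BF

    static int u16(ByteBuffer page, int off) { return page.getShort(off) & 0xFFFF; }
    static long u32(ByteBuffer page, int off) { return page.getInt(off) & 0xFFFFFFFFL; }

    static boolean isIndexPage(ByteBuffer page) { return u16(page, FIL_PAGE_TYPE) == FIL_PAGE_INDEX; }
    static long prev(ByteBuffer page) { return u32(page, FIL_PAGE_PREV); }
    static long next(ByteBuffer page) { return u32(page, FIL_PAGE_NEXT); }
    static int nDirSlots(ByteBuffer page) { return u16(page, PAGE_N_DIR_SLOTS); }
    static int heapTop(ByteBuffer page) { return u16(page, PAGE_HEAP_TOP); }
    // Bit 15 of PAGE_N_HEAP flags the compact record format; the low 15 bits are the count.
    static boolean isCompact(ByteBuffer page) { return (u16(page, PAGE_N_HEAP) & 0x8000) != 0; }
    static int nHeap(ByteBuffer page) { return u16(page, PAGE_N_HEAP) & 0x7FFF; }
    static int nRecs(ByteBuffer page) { return u16(page, PAGE_N_RECS); }
    static int level(ByteBuffer page) { return u16(page, PAGE_LEVEL); }
    static long indexId(ByteBuffer page) { return page.getLong(PAGE_INDEX_ID); }

    public static void main(String[] args) {
        // Synthetic page for demonstration only; a real page would be read from an .ibd file.
        ByteBuffer page = ByteBuffer.allocate(16384);
        page.putShort(FIL_PAGE_TYPE, (short) FIL_PAGE_INDEX);
        page.putShort(PAGE_LEVEL, (short) 1);
        page.putLong(PAGE_INDEX_ID, 596L);
        System.out.printf("index page=%b level=%d index_id=%d%n",
                isIndexPage(page), level(page), indexId(page));
    }
}
```

Reading by absolute offset keeps the sketch free of cursor state, so the fields can be read in any order.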
By parsing the INDEX Header we can tell which index a FIL_PAGE_INDEX page belongs to, its record type, its page level, and so on:
import java.util.List;
// IbdFileParser, IndexPage, FilHeader and IndexHeader are the parser classes this program
// relies on; their package is not shown in the original, so their imports are omitted.

public class IdxPage1 {
    public static void main(String[] args) throws Exception {
        String fileName = "D:\\Data\\mysql\\8.0.18\\data\\sakila\\film.ibd";
        try (IbdFileParser parser = new IbdFileParser(fileName)) {
            // Page numbers of every FIL_PAGE_INDEX page in the tablespace
            List<Long> pageNums = parser.getPageTypeMap().get(FilHeader.FIL_PAGE_INDEX);
            StringBuilder buff = new StringBuilder();
            buff.append(" PAGE       PAGE_TYPE LEVEL INDEX_ID   PAGE_PREV   PAGE_NEXT\n")
                .append("----- --------------- ----- -------- ----------- -----------\n");
            for (long pageNum : pageNums) {
                IndexPage indexPage = (IndexPage) parser.getPage(pageNum);
                FilHeader filHeader = indexPage.getFilHeader();       // page type and sibling pointers
                IndexHeader indexHeader = indexPage.getIndexHeader(); // level and owning index
                buff.append(String.format("%5d ", pageNum))
                    .append(String.format("%15s ", filHeader.getPageTypeName()))
                    .append(String.format("%5d ", indexHeader.getPageLevel()))
                    .append(String.format("%8d ", indexHeader.getIndexId()))
                    .append(String.format("%11d ", filHeader.getPreviousPage()))
                    .append(String.format("%11d ", filHeader.getNextPage()))
                    .append("\n");
            }
            System.out.println(buff);
        }
    }
}
Program output:
PAGE PAGE_TYPE LEVEL INDEX_ID PAGE_PREV PAGE_NEXT
----- --------------- ----- -------- ----------- -----------
4 FIL_PAGE_INDEX 1 596 4294967295 4294967295
5 FIL_PAGE_INDEX 1 597 4294967295 4294967295
6 FIL_PAGE_INDEX 0 598 4294967295 4294967295
7 FIL_PAGE_INDEX 0 599 4294967295 4294967295
8 FIL_PAGE_INDEX 0 596 4294967295 9
9 FIL_PAGE_INDEX 0 596 8 10
10 FIL_PAGE_INDEX 0 596 9 11
11 FIL_PAGE_INDEX 0 596 10 12
12 FIL_PAGE_INDEX 0 596 11 13
13 FIL_PAGE_INDEX 0 596 12 14
14 FIL_PAGE_INDEX 0 596 13 15
15 FIL_PAGE_INDEX 0 596 14 18
16 FIL_PAGE_INDEX 0 597 4294967295 17
17 FIL_PAGE_INDEX 0 597 16 4294967295
18 FIL_PAGE_INDEX 0 596 15 19
19 FIL_PAGE_INDEX 0 596 18 20
20 FIL_PAGE_INDEX 0 596 19 4294967295
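The listing above depends on the IbdFileParser classes. For readers who do not have them at hand, the same columns can be approximated with the JDK alone. This is a minimal sketch, not the author's implementation: it assumes an uncompressed tablespace with 16 KiB pages, and the class name `RawIndexPageScan` is invented for this example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

final class RawIndexPageScan {
    static final int PAGE_SIZE = 16384;      // assumption: default page size, no compression
    static final int FIL_PAGE_INDEX = 17855; // 0x45BF

    /** Returns one row per index page: {page, level, index_id, prev, next}. */
    static List<long[]> scan(Path ibdFile) throws IOException {
        List<long[]> rows = new ArrayList<>();
        try (FileChannel ch = FileChannel.open(ibdFile, StandardOpenOption.READ)) {
            ByteBuffer page = ByteBuffer.allocate(PAGE_SIZE); // big-endian by default
            long pageCount = ch.size() / PAGE_SIZE;
            for (long pageNo = 0; pageNo < pageCount; pageNo++) {
                page.clear();
                long start = pageNo * PAGE_SIZE;
                while (page.hasRemaining()) {
                    if (ch.read(page, start + page.position()) < 0) break;
                }
                if (page.hasRemaining()) break;               // truncated file
                if ((page.getShort(24) & 0xFFFF) != FIL_PAGE_INDEX) continue;
                rows.add(new long[] {
                    pageNo,
                    page.getShort(64) & 0xFFFF,               // PAGE_LEVEL
                    page.getLong(66),                         // PAGE_INDEX_ID
                    page.getInt(8) & 0xFFFFFFFFL,             // previous page
                    page.getInt(12) & 0xFFFFFFFFL             // next page
                });
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: RawIndexPageScan <path-to-ibd-file>");
            return;
        }
        for (long[] r : scan(Paths.get(args[0]))) {
            System.out.printf("%5d %5d %8d %11d %11d%n", r[0], r[1], r[2], r[3], r[4]);
        }
    }
}
```

Scanning a file that MySQL is actively writing may show pages in an inconsistent state, so a copy of the .ibd file or a stopped server is the safer input.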
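First, look at INDEX_ID, the unique identifier of an index within InnoDB. The output shows that the sakila.film table has four indexes, numbered 596, 597, 598 and 599, and each page stores data of the index named by its INDEX_ID. The same information can be queried in MySQL: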
select
    idx.space   as space_id,
    idx.page_no,
    idx.index_id,
    idx.name    as index_name,
    tab.name    as table_name
from information_schema.innodb_indexes idx
join information_schema.innodb_tables tab
    on idx.table_id = tab.table_id
where idx.index_id in (596, 597, 598, 599);
-- Query output:
space_id|page_no|index_id|index_name |table_name |
--------+-------+--------+---------------------------+-----------+
387| 4| 596|PRIMARY |sakila/film|
387| 5| 597|idx_title |sakila/film|
387| 6| 598|idx_fk_language_id |sakila/film|
387| 7| 599|idx_fk_original_language_id|sakila/film|
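This matches our parsing result. Combining PAGE_LEVEL with PAGE_PREV/PAGE_NEXT yields the basic layout of the whole B-tree. Take the primary key (PRIMARY, 596) as an example: the highest page_level of index 596 is 1, found on page(4), and the highest level marks the root of the tree. Both page_prev and page_next of page(4) are 0xffffffff (4294967295), a null pointer (FIL_NULL) meaning the page has no sibling on that side. page(8) has page_prev = 0xffffffff and level = 0, so it is a leaf and the leftmost one. page(20) has page_next = 0xffffffff and level = 0, so it is the rightmost leaf. From the parsed output we can therefore sketch the basic structure of the tree: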
Level 1: page(4)
|
+---------+----------+---------+--------+
/ / | \ \
Level 0: page(8) <-> page(9) <-> page(10) <-> ... <-> page(20)
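The reasoning used to draw this tree can be expressed as code. The sketch below takes the rows printed by the program earlier, picks the page with the highest level as the root, and walks the leaf level from the page whose previous pointer is 0xffffffff. `LeafChainSketch`, `rootPage` and `leafChain` are illustrative names and not part of the parser; the sketch also assumes a single page at the highest level, which holds for the output shown here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LeafChainSketch {
    static final long NIL = 0xFFFFFFFFL; // FIL_NULL: no sibling on that side

    // {page, level, index_id, prev, next}, copied from the program output above.
    static final long[][] ROWS = {
        {4, 1, 596, NIL, NIL}, {5, 1, 597, NIL, NIL}, {6, 0, 598, NIL, NIL}, {7, 0, 599, NIL, NIL},
        {8, 0, 596, NIL, 9}, {9, 0, 596, 8, 10}, {10, 0, 596, 9, 11}, {11, 0, 596, 10, 12},
        {12, 0, 596, 11, 13}, {13, 0, 596, 12, 14}, {14, 0, 596, 13, 15}, {15, 0, 596, 14, 18},
        {16, 0, 597, NIL, 17}, {17, 0, 597, 16, NIL},
        {18, 0, 596, 15, 19}, {19, 0, 596, 18, 20}, {20, 0, 596, 19, NIL},
    };

    /** The page with the highest level is taken as the root; returns -1 if the index is absent. */
    static long rootPage(long[][] rows, long indexId) {
        long root = -1;
        long maxLevel = -1;
        for (long[] r : rows) {
            if (r[2] == indexId && r[1] > maxLevel) {
                maxLevel = r[1];
                root = r[0];
            }
        }
        return root;
    }

    /** Follows the next pointers of the level-0 pages, starting at the leftmost leaf. */
    static List<Long> leafChain(long[][] rows, long indexId) {
        Map<Long, long[]> byPage = new HashMap<>();
        long current = NIL;
        for (long[] r : rows) {
            if (r[2] != indexId || r[1] != 0) continue; // keep only leaves of this index
            byPage.put(r[0], r);
            if (r[3] == NIL) current = r[0];            // leftmost leaf has no predecessor
        }
        List<Long> chain = new ArrayList<>();
        // The size check guards against a corrupted chain that loops back on itself.
        while (byPage.containsKey(current) && chain.size() < byPage.size()) {
            chain.add(current);
            current = byPage.get(current)[4];
        }
        return chain;
    }

    public static void main(String[] args) {
        System.out.println("root=" + rootPage(ROWS, 596) + " leaves=" + leafChain(ROWS, 596));
    }
}
```

For index 596 this prints `root=4 leaves=[8, 9, 10, 11, 12, 13, 14, 15, 18, 19, 20]`. The jump from page 15 to page 18 reflects that pages 16 and 17 belong to index 597, so leaf pages of one index need not be contiguous in the file.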
The following articles will begin to examine the records stored inside a FIL_PAGE_INDEX page.