InnoDB File Physical Structure Analysis 5 - FIL_PAGE_INDEX

This article looks at the reclaimable garbage records (Garbage/Deleted Records) in a FIL_PAGE_INDEX page. When we delete a record (delete from …), InnoDB usually does not physically remove it from storage; instead it sets a delete-mark flag on the record, and we call such rows garbage records. The delete flag lives in the record's Extra Bytes. Just as with normal records (User Records), InnoDB chains the reclaimable garbage records within a page into a singly linked list. The user-record list starts at the Infimum Record and ends at the Supremum Record, while the garbage-record list starts at PAGE_FREE (First Garbage Record Offset) in the INDEX Header of the FIL_PAGE_INDEX page and ends at the last garbage record (next_offset = 0, i.e. it points to itself). Both lists are traversed in exactly the same way.
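
To make the Extra Bytes layout concrete, here is a minimal sketch of reading the delete flag and the next-record offset, assuming the COMPACT row format, where the 5 bytes immediately before a record's origin hold the info bits (delete flag = 0x20) and a signed 16-bit relative next-record offset in the last two bytes. The class and method names are my own illustration, not part of the article's parser:

```java
public class RecordHeaderSketch {
	// COMPACT format: the 5 bytes before a record's origin are
	// 1 byte info bits + n_owned, 2 bytes heap number/record type,
	// 2 bytes next-record offset (signed, relative to this record).
	static final int DELETED_FLAG_BIT = 0x20;

	/** True if the delete-mark bit is set in the record's info bits. */
	static boolean isDeleted(byte[] page, int recordOrigin) {
		int infoByte = page[recordOrigin - 5] & 0xFF;
		return (infoByte & DELETED_FLAG_BIT) != 0;
	}

	/** Signed 16-bit offset of the next record, relative to this one. */
	static short nextRecordOffset(byte[] page, int recordOrigin) {
		int hi = page[recordOrigin - 2] & 0xFF;
		int lo = page[recordOrigin - 1] & 0xFF;
		return (short) ((hi << 8) | lo);
	}
}
```

A negative offset such as -1432 (0xFA68) simply means the next record in the chain sits earlier in the page, which matches the output shown below.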

The sakila database is MySQL's official sample database. To obtain it, open https://dev.mysql.com/doc/index-other.html and find the "Example Databases" section, which has the download and documentation links.

Now the code that parses the Garbage Records:

/*
Perform the deletes in the database:
root@localhost [sakila]> set foreign_key_checks = 0; --> sakila has foreign key constraints; to simplify the deletes, disable foreign key checks in this session;
Query OK, 0 rows affected (0.00 sec)

root@localhost [sakila]> delete from sakila.film where film_id in (646, 656, 666); --> delete 3 records;
Query OK, 3 rows affected (0.01 sec)
*/

public class IdxPage5 {
	public static void main(String[] args) throws Exception {
		String fileName = "D:\\Data\\mysql\\8.0.18\\data\\sakila\\film.ibd";
		try (IbdFileParser parser = new IbdFileParser(fileName)) {
			// As shown in the previous article "InnoDB File Physical Structure Analysis 4", page(14) holds the rows with film_id between 565 and 668;
			Page page = parser.getPage(14);
			// System.out.println(ParserHelper.hexDump(page.getPageRaw()));
			ClusteredKeyLeafPage leafPage = new ClusteredKeyLeafPage(page.getPageRaw(), page.getPageSize());
			List<ClusteredKeyLeafRecord> garbageRecords = leafPage.getGarbageRecords(IdxPage3.getFilmTableMeta());
			StringBuilder buff = new StringBuilder();
			for (ClusteredKeyLeafRecord record : garbageRecords) {
				List<RecordField> fields = record.getRecordFields();
				buff.append("\n ==> Extra: deleted = ").append(record.getDeletedFlag()).append("; next offset = ")
						.append(record.getNextRecordOffset()).append(" <==\n");
				for (RecordField field : fields) {
					buff.append(String.format("%20s", field.getName())).append(": ").append(field.getContent()).append("\n");
				}
			}
			System.out.println(buff);
		}
	}
}

/* 
Program output:
 ==> Extra: deleted = true; next offset = -1432 <==
             film_id: 666
           DB_TRX_ID: 00000000843c
         DB_ROLL_PTR: 01000001321526
               title: PAYCHECK WAIT
         description: A Awe-Inspiring Reflection of a Boy And a Man who must Discover a Moose in The Sahara Desert
        release_year: 2006
         language_id: 1
original_language_id: null
     rental_duration: 4
         rental_rate: 4.99
              length: 145
    replacement_cost: 27.99
              rating: PG-13
    special_features: [Commentaries, Deleted Scenes]
         last_update: 2006-02-15T05:03:42

 ==> Extra: deleted = true; next offset = -1489 <==
             film_id: 656
           DB_TRX_ID: 00000000843c
         DB_ROLL_PTR: 010000013214c7
               title: PAPI NECKLACE
         description: A Fanciful Display of a Car And a Monkey who must Escape a Squirrel in Ancient Japan
        release_year: 2006
         language_id: 1
original_language_id: null
     rental_duration: 3
         rental_rate: 0.99
              length: 128
    replacement_cost: 9.99
              rating: PG
    special_features: [Trailers, Deleted Scenes, Behind the Scenes]
         last_update: 2006-02-15T05:03:42

 ==> Extra: deleted = true; next offset = 0 <==
             film_id: 646
           DB_TRX_ID: 00000000843c
         DB_ROLL_PTR: 01000001321466
               title: OUTBREAK DIVINE
         description: A Unbelieveable Yarn of a Database Administrator And a Woman who must Succumb a A Shark in A U-Boat
        release_year: 2006
         language_id: 1
original_language_id: null
     rental_duration: 6
         rental_rate: 0.99
              length: 169
    replacement_cost: 12.99
              rating: NC-17
    special_features: [Trailers, Deleted Scenes, Behind the Scenes]
         last_update: 2006-02-15T05:03:42
*/

The output matches our expectation: although the rows were removed by the delete statement, the data is still kept inside the page. Only the delete flag in the Extra Bytes has been set to true, and the last deleted record points not to the "Supremum Record" but to itself (next offset = 0).

The example retrieves the deleted rows via the getGarbageRecords() method of ClusteredKeyLeafPage. The only difference from getUserRecords(), which fetches normal user records, is the starting position of the traversal:

public class ClusteredKeyLeafPage {
	public List<ClusteredKeyLeafRecord> getUserRecords(TableMeta tableMeta) {
		int pos = getSystemRecords().getInfimumNextRecordPos(); 
		return iterateRecordInPage(tableMeta, pos);
	}
	
	public List<ClusteredKeyLeafRecord> getGarbageRecords(TableMeta tableMeta) {
		int pos = getIndexHeader().getFirstGarbageOffset();
		return iterateRecordInPage(tableMeta, pos);
	}
	
	private List<ClusteredKeyLeafRecord> iterateRecordInPage(TableMeta tableMeta, int firstRecordPos) {
		// ...
		// conditions that end the traversal
		if (recCount > maxRecs || nextOffset == 0 || nextRecord == SUPREMUM_EXTRA_END_POS) {
			break;
		}
		// ...
	}
}
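
The stop condition nextOffset == 0 can be exercised with a toy model of the chain. The map below stands in for real page bytes and is purely illustrative: each entry maps an absolute record position to the signed relative offset of its successor, mirroring how the garbage list is walked from PAGE_FREE:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class GarbageChainDemo {
	/**
	 * Follows a record chain starting at firstPos. Each map entry gives
	 * the signed relative offset of the next record; an offset of 0
	 * terminates the chain, mirroring the last garbage record pointing
	 * at itself.
	 */
	static List<Integer> walkChain(Map<Integer, Integer> nextOffsets, int firstPos) {
		List<Integer> visited = new ArrayList<>();
		int pos = firstPos;
		while (pos != 0) {
			visited.add(pos);
			int nextOffset = nextOffsets.getOrDefault(pos, 0);
			if (nextOffset == 0) { // last record: chain ends here
				break;
			}
			pos = pos + nextOffset; // offsets are relative and may be negative
		}
		return visited;
	}
}
```

Feeding it the offsets seen in the program output (-1432, -1489, 0) visits exactly three records, in the same order the parser printed them.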

More test observations:

  • If you run "truncate sakila.film" instead, this method no longer works, because the storage of the entire ibd file is "reset" (the file shrinks and page(14) no longer exists). A full-table delete ("delete from sakila.film") usually does not do this, but there is an exception: when the table is very small (the index has a depth of 1) and all rows fit in a single (Leaf) Page, a full-table delete was observed to behave like truncate, wiping the record data of the whole page (zeroed to 00), which can be confirmed with hexdump.

  • When deleting all the records within a page, the records are not wiped, but I have observed cases where some of the deleted records still remain on the User Record linked list.

Finally, here is a way to inspect the contents of a given page with the hexdump command:

# Suppose we want to look at page(4); the page size is 16KB (16384 bytes);
# so page(4) starts at 4 * 16384 = 65536 and the read length is 16384;
# the hexdump command is therefore:
[think@TP-T470 sakila]$ hexdump --skip 65536 --length 16384 -C -v film2.ibd
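
The offset arithmetic generalizes to any page number. As a small illustrative helper (the class name and the hard-coded 16KB default are my own assumptions, not from the article), the same command can be built programmatically:

```java
public class HexdumpHelper {
	static final int PAGE_SIZE = 16 * 1024; // default InnoDB page size (16384 bytes)

	/** Byte offset of the start of the given page within the ibd file. */
	static long pageOffset(int pageNo) {
		return (long) pageNo * PAGE_SIZE;
	}

	/** Builds the hexdump command line for one page of an ibd file. */
	static String hexdumpCommand(String ibdFile, int pageNo) {
		return String.format("hexdump --skip %d --length %d -C -v %s",
				pageOffset(pageNo), PAGE_SIZE, ibdFile);
	}
}
```

Note that if the instance was configured with a non-default innodb_page_size, the 16384 constant must be adjusted accordingly.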
