背景
dba的日常工作肯定有一项是ddl变更,ddl变更会锁表,这个可以说是dba心中永远的痛,特别是执行ddl变更,导致库上大量线程处于“Waiting for meta data lock”状态的时候。因此mysql 5.6的online ddl特性是dba们最期待的新特性,这个特性解决了执行ddl锁表的问题,保证了在进行表变更时,不会堵塞线上业务读写,保障在变更时,库依然能正常对外提供访问。网上关于online ddl的文章很多,但涉及原理的很少,都是介绍语法之类的,本文将详细介绍online ddl的原理,知其然,更要知其所以然。
ddl实现方式
5.6 online ddl推出以前,执行ddl主要有两种方式copy方式和inplace方式,inplace方式又称为(fast index creation)。相对于copy方式,inplace方式不拷贝数据,因此较快。但是这种方式仅支持添加、删除索引两种方式,而且与copy方式一样需要全程锁表,实用性不是很强。下面以加索引为例,简单介绍这两种方式的实现流程。
copy方式
(1).新建带索引的临时表
(2).锁原表,禁止DML,允许查询
(3).将原表数据拷贝到临时表(无排序,一行一行拷贝)
(4).进行rename,升级字典锁,禁止读写
(5).完成创建索引操作
inplace方式
(1).新建索引的数据字典
(2).锁表,禁止DML,允许查询
(3).读取聚集索引,构造新的索引项,排序并插入新索引
(4).等待打开当前表的所有只读事务提交
(5).创建索引结束
online ddl实现
online方式实质也包含了copy和inplace方式,对于不支持online的ddl操作采用copy方式,比如修改列类型,删除主键等;对于inplace方式,mysql内部以“是否修改记录格式”为基准也分为两类,一类需要重建表(修改记录格式),比如添加、删除列、修改列默认值等;另外一类是只需要修改表的元数据,比如添加、删除索引、修改列名等。Mysql将这两类方式分别称为rebuild方式和no-rebuild方式。online ddl主要包括3个阶段,prepare阶段,ddl执行阶段,commit阶段,rebuild方式比no-rebuild方式实质多了一个ddl执行阶段,prepare阶段和commit阶段类似。下面将主要介绍ddl执行过程中三个阶段的流程。
Prepare阶段:
创建新的临时frm文件持有EXCLUSIVE-MDL锁,禁止读写根据alter类型,确定执行方式(copy,online-rebuild,online-norebuild)更新数据字典的内存对象分配row_log对象记录增量生成新的临时ibd文件ddl执行阶段:
降级EXCLUSIVE-MDL锁,允许读写扫描old_table的聚集索引每一条记录rec 遍历 新表的聚集索引和二级索引,逐一处理根据rec构造对应的索引项将构造索引项插入sort_buffer块将sort_buffer块插入新的索引处理ddl执行过程中产生的增量(仅rebuild类型需要)commit阶段
升级到EXCLUSIVE-MDL锁,禁止读写重做最后row_log中最后一部分增量更新innodb的数据字典表提交事务(刷事务的redo日志)修改统计信息rename临时idb文件,frm文件变更完成
关键函数堆栈
拷贝数据
row_merge_build_indexes
row_merge_read_clustered_index //拷贝全量{
遍历老表的聚集索引
row_build //创建一个row
row_merge_buf_add //将row加入到sort_buffer
row_merge_insert_index_tuples //插入到新表(聚集索引+二级索引)
}
row_log_table_apply //对于rebuild类型,处理增量{
row_log_table_apply_insert //以insert为例
row_log_table_apply_convert_mrec //将buf项转为tuple
{
插入聚集索引 // row_ins_clust_index_entry_low
插入二级索引 // row_ins_sec_index_entry_low
}
}
修改表数据字典
commit_try_norebuild,commit_try_rebuild
常见的ddl操作
类型 |
并发DML |
算法 |
备注
|
添加/删除索引
|
Yes
|
Online(no-rebuild)
|
全文索引不支持
|
修改default值 修改列名 修改自增列值 |
Yes
|
Nothing
|
仅需要修改元数据
|
添加/删除列 交换列顺序 修改NULL/NOT NULL 修改ROW-FORMAT 添加/修改主键 |
Yes
|
Online(rebuild)
|
由于记录格式改变,需要重建表
|
修改列类型 Optimize table 转换字符集 |
No
|
Copy
|
需要锁表,不支持online
|
官网支持列表
Table 14.6 Summary of Online Status for DDL Operations
Operation | In-Place? | Copies Table? | Allows Concurrent DML? | Allows Concurrent Query? | Notes |
---|---|---|---|---|---|
CREATE INDEX , ADD INDEX |
Yes* | No* | Yes | Yes | Some restrictions for FULLTEXT index; see next row. |
ADD FULLTEXT INDEX |
Yes | No* | No | Yes | Creating the first FULLTEXT index for a table involves a table copy, unless there is a user-supplied FTS_DOC_ID column. Subsequent FULLTEXT indexes on the same table can be created in-place. |
DROP INDEX |
Yes | No | Yes | Yes | Modifies .frm file only, not the data file. |
OPTIMIZE TABLE |
Yes | Yes | Yes | Yes | Uses ALGORITHM=INPLACE as of MySQL 5.6.17. ALGORITHM=COPY is used if old_alter_table=1 or mysqld --skip-new option is enabled. OPTIMIZE TABLE using online DDL (ALGORITHM=INPLACE ) is not supported for tables with FULLTEXT indexes. |
Set default value for a column | Yes | No | Yes | Yes | Modifies .frm file only, not the data file. |
Change auto-increment value for a column | Yes | No | Yes | Yes | Modifies a value stored in memory, not the data file. |
Add a foreign key constraint | Yes* | No* | Yes | Yes | To avoid copying the table, disable foreign_key_checks during constraint creation. |
Drop a foreign key constraint | Yes | No | Yes | Yes | The foreign_key_checks option can be enabled or disabled. |
Rename a column | Yes* | No* | Yes* | Yes | To allow concurrent DML, keep the same data type and only change the column name. |
Add a column | Yes* | Yes* | Yes* | Yes | Concurrent DML is not allowed when adding an auto-increment column. Although ALGORITHM=INPLACE is allowed, the data is reorganized substantially, so it is still an expensive operation. |
Drop a column | Yes | Yes* | Yes | Yes | Although ALGORITHM=INPLACE is allowed, the data is reorganized substantially, so it is still an expensive operation. |
Reorder columns | Yes | Yes | Yes | Yes | Although ALGORITHM=INPLACE is allowed, the data is reorganized substantially, so it is still an expensive operation. |
Change ROW_FORMAT property |
Yes | Yes | Yes | Yes | Although ALGORITHM=INPLACE is allowed, the data is reorganized substantially, so it is still an expensive operation. |
Change KEY_BLOCK_SIZE property |
Yes | Yes | Yes | Yes | Although ALGORITHM=INPLACE is allowed, the data is reorganized substantially, so it is still an expensive operation. |
Make column NULL |
Yes | Yes | Yes | Yes | Although ALGORITHM=INPLACE is allowed, the data is reorganized substantially, so it is still an expensive operation. |
Make column NOT NULL |
Yes* | Yes | Yes | Yes | STRICT_ALL_TABLES or STRICT_TRANS_TABLES SQL_MODE is required for the operation to succeed. The operation fails if the column contains NULL values. As of 5.6.7, the server prohibits changes to foreign key columns that have the potential to cause loss of referential integrity. For more information, see Section 13.1.7, “ALTER TABLE Syntax”. Although ALGORITHM=INPLACE is allowed, the data is reorganized substantially, so it is still an expensive operation. |
Change data type of column | No | Yes | No | Yes | |
Add primary key | Yes* | Yes | Yes | Yes | Although ALGORITHM=INPLACE is allowed, the data is reorganized substantially, so it is still an expensive operation. ALGORITHM=INPLACE is not allowed under certain conditions if columns have to be converted to NOT NULL . See Example 14.9, “Creating and Dropping the Primary Key”. |
Drop primary key and add another | Yes | Yes | Yes | Yes | ALGORITHM=INPLACE is only allowed when you add a new primary key in the same ALTER TABLE ; the data is reorganized substantially, so it is still an expensive operation. |
Drop primary key | No | Yes | No | Yes | Restrictions apply when you drop a primary key primary key without adding a new one in the same ALTER TABLE statement. |
Convert character set | No | Yes | No | Yes | Rebuilds the table if the new character encoding is different. |
Specify character set | No | Yes | No | Yes | Rebuilds the table if the new character encoding is different. |
Rebuild with FORCE option |
Yes | Yes | Yes | Yes | Uses ALGORITHM=INPLACE as of MySQL 5.6.17. ALGORITHM=COPY is used if old_alter_table=1 or mysqld --skip-new option is enabled. Table rebuild using online DDL (ALGORITHM=INPLACE ) is not supported for tables with FULLTEXT indexes. |
Rebuild with “null” ALTER TABLE ... ENGINE=INNODB |
Yes | Yes | Yes | Yes | Uses ALGORITHM=INPLACE as of MySQL 5.6.17. ALGORITHM=COPY is used if old_alter_table=1 or mysqld --skip-new option is enabled. Table rebuild using online DDL (ALGORITHM=INPLACE ) is not supported for tables with FULLTEXT indexes. |
Set table-level persistent statistics options (STATS_PERSISTENT , STATS_AUTO_RECALC STATS_SAMPLE_PAGES ) |
Yes | No | Yes | Yes | Modifies .frm file only, not the data file. |
若干问题
1.如何实现数据完整性
使用online ddl后,用户心中一定有一个疑问,一边做ddl,一边做dml,表中的数据不会乱吗?这里面关键部件是row_log。row_log记录了ddl变更过程中新产生的dml操作,并在ddl执行的最后将其应用到新的表中,保证数据完整性。
2.online与数据一致性如何兼得
实际上,online ddl并非整个过程都是online,在prepare阶段和commit阶段都会持有MDL-Exclusive锁,禁止读写;而在整个ddl执行阶段,允许读写。由于prepare和commit阶段相对于ddl执行阶段时间特别短,因此基本可以认为是全程online的。Prepare阶段和commit阶段的禁止读写,主要是为了保证数据一致性。Prepare阶段需要生成row_log对象和修改内存的字典;Commit阶段,禁止读写后,重做最后一部分增量,然后提交,保证数据一致。
3.如何实现server层和innodb层一致性
在prepare阶段,server层会生成一个临时的frm文件,里面包含了新表的格式;innodb层生成了临时的ibd文件(rebuild方式);在ddl执行阶段,将数据从原表拷贝到临时ibd文件,并且将row_log增量应用到临时ibd文件;在commit阶段,innodb层修改表的数据字典,然后提交;最后innodb层和mysql层面分别重命名frm和idb文件。
参考文档
http://hedengcheng.com/?p=405
http://hedengcheng.com/?p=421
http://hedengcheng.com/?p=148