1.版本
1)操作系统
cat /etc/issue
cat /etc/issue
CentOS release 6.6 (Final)
Kernel \r on an \m
cat /proc/version
cat /proc/version
Linux version 2.6.32-504.el6.x86_64 ([email protected]) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC) ) #1 SMP Wed Oct 15 04:27:16 UTC 2014
2)mysql数据库版本
MySQL --version
MySQL Ver 14.14 Distrib 5.6.26, for linux-glibc2.5 (x86_64) using EditLine wrapper
2.问题描述
一个朋友让我帮忙分析一个问题,他先执行了一个online ddl,如 alter table test engine=innodb(mysql 5.6引入online ddl)。然后在另一个session中开启了一个事物,在事物中对test表执行了dml操作,但是不提交也不回滚。最后他发现先执行的ddl操作,被后面执行的dml操作阻塞。
1)先在我的测试环境还原一下这个问题
按时间顺序,执行如下操作:
session1
alter table tb_battle_member_bak1 engine=innodb;
session2
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> delete from tb_battle_member_bak1 where BattleID='CX201605240076';
Query OK, 1 row affected (0.01 sec)
2)在session3 观察session1和session2操作情况
mysql> show processlist;
+-----+------+----------------------+-------+-------------+---------+-----------------------------------------------------------------------+-------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+------+----------------------+-------+-------------+---------+-----------------------------------------------------------------------+-------------------------------------------------+
| 2 | repl | 172.172.178.76:45429 | NULL | Binlog Dump | 2518274 | Master has sent all binlog to slave; waiting for binlog to be updated | NULL |
| 202 | repl | 172.172.178.77:46833 | NULL | Binlog Dump | 1552398 | Master has sent all binlog to slave; waiting for binlog to be updated | NULL |
| 271 | root | 127.0.0.1:22201 | test5 | Query | 177 | altering table | alter table tb_battle_member_bak1 engine=innodb |
| 272 | root | 127.0.0.1:22202 | test5 | Sleep | 161 | | NULL |
| 273 | root | 127.0.0.1:22204 | NULL | Query | 0 | init | show processlist |
+-----+------+----------------------+-------+-------------+---------+-----------------------------------------------------------------------+-------------------------------------------------+
5 rows in set (0.00 sec)
mysql> show processlist;
+-----+------+----------------------+-------+-------------+---------+-----------------------------------------------------------------------+-------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+------+----------------------+-------+-------------+---------+-----------------------------------------------------------------------+-------------------------------------------------+
| 2 | repl | 172.172.178.76:45429 | NULL | Binlog Dump | 2518275 | Master has sent all binlog to slave; waiting for binlog to be updated | NULL |
| 202 | repl | 172.172.178.77:46833 | NULL | Binlog Dump | 1552399 | Master has sent all binlog to slave; waiting for binlog to be updated | NULL |
| 271 | root | 127.0.0.1:22201 | test5 | Query | 178 | Waiting for table metadata lock | alter table tb_battle_member_bak1 engine=innodb |
| 272 | root | 127.0.0.1:22202 | test5 | Sleep | 162 | | NULL |
| 273 | root | 127.0.0.1:22204 | NULL | Query | 0 | init | show processlist |
+-----+------+----------------------+-------+-------------+---------+-----------------------------------------------------------------------+-------------------------------------------------+
5 rows in set (0.00 sec)
NOTE:我们发现session1的online ddl在altering table这一步花了大概177秒,然后在转入下一步(committing alter table to stor)的时候,发生等待,我们看到的等待信息是Waiting for table metadata lock。
3) 回滚session2 的事物,查询session1 中所执行sql的profile(profile需提前开启,请参考http://blog.csdn.net/shaochenshuo/article/details/51580296)
session1 profile:
mysql> show profiles;
+----------+--------------+----------------------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+--------------+----------------------------------------------------------------------+
| 1 | 0.00120700 | select @@version_comment limit 1 |
| 2 | 0.00086900 | show slave hosts |
| 3 | 0.00203125 | show databases |
| 4 | 0.00065800 | SELECT DATABASE() |
| 5 | 0.00088225 | show tables |
| 6 | 0.00026350 | SELECT DATABASE() |
| 7 | 0.00087025 | show tables |
| 8 | 4.55985575 | select count(1) from tb_battle_member_bak1 |
| 9 | 0.42077325 | truncate table tb_battle_member_bak1 |
| 10 | 260.97370150 | insert into tb_battle_member_bak1 select * from tb_battle_member_bak |
| 11 | 205.00446375 | alter table tb_battle_member_bak1 engine=innodb |
+----------+--------------+----------------------------------------------------------------------+
11 rows in set, 1 warning (0.00 sec)
mysql> show profile for query 11;
+--------------------------------+------------+
| Status | Duration |
+--------------------------------+------------+
| starting | 0.000383 |
| checking permissions | 0.000011 |
| checking permissions | 0.000036 |
| init | 0.000007 |
| Opening tables | 0.000133 |
| setup | 0.000242 |
| creating table | 0.054524 |
| After create | 0.000493 |
| System lock | 0.000019 |
| preparing for alter table | 0.073421 |
| altering table | 177.612401 |
| Waiting for table metadata loc | 1.000169 | >>因为session2 中关于该表的事物未提交,所以产生metadata等待
| altering table | 0.000018 | >>因锁等待无法执行committing alter...,再次转到altering table(因为执行已经执行完,此时执行时间很短)
| Waiting for table metadata loc | 1.000157 |
| altering table | 0.000023 |
| Waiting for table metadata loc | 1.000120 |
| altering table | 0.000019 |
...........................
...........................
...........................
| Waiting for table metadata loc | 0.042325 |
| altering table | 0.000098 |
| committing alter table to stor | 1.205005 | >>此时session2中事物已提交,altering table后顺利转到下一步
| end | 0.000029 |
| query end | 0.010155 |
| closing tables | 0.000050 |
| freeing items | 0.000182 |
| logging slow query | 0.000142 |
| cleaning up | 0.000028 |
+--------------------------------+------------+
72 rows in set, 1 warning (0.00 sec)
从上面的分析我们可以看出来,online ddl执行到committing alter table to stor这一步的时候需要获取表的metadata lock(online ddl中最耗时的altering table这一步不需要获取metadata锁,不会阻塞dml)。而这时session2 对该表的dml并未提交(占用该表的metadata 锁),所以session1中的online ddl被session2 中的dml阻塞。
居然知道了原因,那么修改应用,解决问题就简单了。
关于Waiting for table metadata lock 问题可参考:http://blog.csdn.net/shaochenshuo/article/details/51577728