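MySQL 5.5 introduced the metadata lock. As the name suggests, a metadata lock does not protect the data in a table; it protects database objects (metadata): table definitions, schemas, stored procedures, functions, triggers, and MySQL scheduled events. The key to understanding metadata locks is to think of them in terms of transaction semantics: while a transaction is running, every database object it has used must remain stable. For example, if you SELECT from a table inside a transaction, that table must not be dropped or altered before your transaction finishes.
Related documentation: http://dev.mysql.com/doc/refman/5.6/en/metadata-locking.html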
1. What metadata locks do
MySQL uses metadata locking to manage concurrent access to database objects and to ensure data consistency. Metadata locking applies not just to tables, but also to schemas and stored programs (procedures, functions, triggers, and scheduled events).
In short: metadata locks manage concurrent access to database objects and keep data consistent.
2. Metadata locks cause overhead and lock contention
Metadata locking does involve some overhead, which increases as query volume increases. Metadata contention increases the more that multiple queries attempt to access the same objects.
Metadata locking adds some overhead. The more queries access the same database object, the more contention there is for the metadata lock on that object.
3.
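Metadata locking is not a replacement for the table definition cache, and its mutexes and locks differ from the LOCK_open mutex.
In other words, metadata locks were not introduced to replace the table definition cache, and they use their own mutexes and locks rather than the LOCK_open mutex.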
4.
To ensure transaction serializability, the server must not permit one session to perform a data definition language (DDL) statement on a table that is used in an uncompleted explicitly or implicitly started transaction in another session. The server achieves this by acquiring metadata locks on tables used within a transaction and deferring release of those locks until the transaction ends. A metadata lock on a table prevents changes to the table's structure. This locking approach has the implication that a table that is being used by a transaction within one session cannot be used in DDL statements by other sessions until the transaction ends.
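A transaction acquires a metadata lock on each database object when one of its statements first uses that object, and it holds all of those locks until it ends (commit or rollback). Transactions and metadata locks are therefore tightly coupled: any transaction that uses a table holds a metadata lock on it, and the lock goes away only when the transaction ends. The lock prevents the objects used by the transaction from being changed; for example, it blocks changes to the structure of a table the transaction has read. As a result, DDL on an object used by an open transaction blocks until that transaction ends.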
5.
This principle applies not only to transactional tables, but also to nontransactional tables. Suppose that a session begins a transaction that uses transactional table t and nontransactional table nt as follows:
START TRANSACTION;
SELECT * FROM t;
SELECT * FROM nt;
The server holds metadata locks on both t and nt until the transaction ends. If another session attempts a DDL or write lock operation on either table, it blocks until metadata lock release at transaction end. For example, a second session blocks if it attempts any of these operations:
DROP TABLE t;
ALTER TABLE t ...;
DROP TABLE nt;
ALTER TABLE nt ...;
LOCK TABLE t ... WRITE;
Metadata locks apply to tables in nontransactional storage engines as well as transactional ones. They block not only DDL but also LOCK TABLE table_name WRITE statements.
6.
If the server acquires metadata locks for a statement that is syntactically valid but fails during execution, it does not release the locks early. Lock release is still deferred to the end of the transaction because the failed statement is written to the binary log and the locks protect log consistency.
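If a statement is syntactically valid but fails during execution, the metadata locks it acquired are not released right away; they are held until the transaction ends. This keeps the binary log consistent.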
7.
In autocommit mode, each statement is in effect a complete transaction, so metadata locks acquired for the statement are held only to the end of the statement.
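In autocommit mode (the default for the mysql command-line client), each statement is a single-statement transaction that commits automatically, so its metadata locks are released as soon as the statement finishes.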
8.
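Metadata locks acquired during a PREPARE statement are released once the statement has been prepared, even if preparation occurs within a multiple-statement transaction.
Metadata locks taken inside a transaction are normally held until the transaction ends, with one exception: locks acquired by a PREPARE statement (commonly used for dynamic SQL in stored procedures) are released as soon as the statement has been prepared.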
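The PREPARE exception can be tried out with a sequence like the one below. This is only a sketch: it assumes a table named cats with an id column (the table used in the experiments later in this article), and the statement name stmt and the variable @id are made up for illustration.

```sql
-- Session A: prepare and execute a statement inside one transaction.
START TRANSACTION;

-- The metadata lock taken while preparing is released as soon as
-- PREPARE returns, even though the transaction is still open.
PREPARE stmt FROM 'SELECT * FROM cats WHERE id = ?';

-- Executing the statement takes a metadata lock on cats that is
-- held until COMMIT or ROLLBACK, as for any other statement.
SET @id = 1;
EXECUTE stmt USING @id;

DEALLOCATE PREPARE stmt;
COMMIT;
```

If the manual's description holds, an ALTER TABLE on cats issued from a second session between PREPARE and EXECUTE should not block, while one issued after EXECUTE and before COMMIT should.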
9.
Before MySQL 5.5, when a transaction acquired the equivalent of a metadata lock for a table used within a statement, it released the lock at the end of the statement. This approach had the disadvantage that if a DDL statement occurred for a table that was being used by another session in an active transaction, statements could be written to the binary log in the wrong order.
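MySQL 5.5 introduced the metadata lock to replace the equivalent mechanism in earlier versions.
The difference between the two is the release point: a metadata lock is held until the transaction ends, whereas its predecessor was released as soon as the statement finished. Holding the lock until the end of the transaction keeps statements in the correct order in the binary log.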
10. Experiment 1 (LOCK TABLE xxx WRITE; test environment: MySQL 5.6.27 on CentOS)
First, run this in terminal A:
mysql> lock table cats write;
Query OK, 0 rows affected (0.01 sec)
Then run this in terminal B:
select * from cats;
It blocks.
Then run this in terminal A:
mysql> show processlist;
+----+------+-----------+---------+---------+------+---------------------------------+--------------------+
| Id | User | Host      | db      | Command | Time | State                           | Info               |
+----+------+-----------+---------+---------+------+---------------------------------+--------------------+
|  1 | root | localhost | ngx_lua | Query   | 2940 | Waiting for table metadata lock | select * from cats |
|  2 | root | localhost | ngx_lua | Query   |    0 | init                            | show processlist   |
|  3 | root | localhost | NULL    | Sleep   | 2913 |                                 | NULL               |
+----+------+-----------+---------+---------+------+---------------------------------+--------------------+
3 rows in set (0.00 sec)
The State column shows why select * from cats is blocked: Waiting for table metadata lock.
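The lock is held by the session that ran lock table cats write. That is terminal A, the session running show processlist here (Id 2); the sleeping session 3 is a separate idle connection. Terminal A's metadata lock on cats excludes every other session's request for it.
At this point you might ask: what about MVCC? Isn't a SELECT supposed to read through MVCC without taking any locks?
MVCC only covers row data inside InnoDB. LOCK TABLE ... WRITE is an explicit table-level lock taken at the server layer, and it conflicts with the shared metadata lock that every statement, including a plain SELECT, needs before it can use the table. This may be one place where MySQL behaves differently from Oracle.
Next we run kill 3 to kill the sleeping session and see whether that releases the metadata lock:
mysql> kill 3
    -> ;
Query OK, 0 rows affected (0.00 sec)

mysql> show processlist;
+----+------+-----------+---------+---------+------+---------------------------------+--------------------+
| Id | User | Host      | db      | Command | Time | State                           | Info               |
+----+------+-----------+---------+---------+------+---------------------------------+--------------------+
|  1 | root | localhost | ngx_lua | Query   | 3605 | Waiting for table metadata lock | select * from cats |
|  2 | root | localhost | ngx_lua | Query   |    0 | init                            | show processlist   |
+----+------+-----------+---------+---------+------+---------------------------------+--------------------+
2 rows in set (0.00 sec)

mysql> unlock table cats;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'cats' at line 1
mysql> unlock tables;
Query OK, 0 rows affected (0.01 sec)

mysql> show processlist;
+----+------+-----------+---------+---------+------+-------+------------------+
| Id | User | Host      | db      | Command | Time | State | Info             |
+----+------+-----------+---------+---------+------+-------+------------------+
|  1 | root | localhost | ngx_lua | Sleep   | 3757 |       | NULL             |
|  2 | root | localhost | ngx_lua | Query   |    0 | init  | show processlist |
+----+------+-----------+---------+---------+------+-------+------------------+
2 rows in set (0.01 sec)

mysql>
kill 3 did not release the metadata lock, because session 3 was not the one holding it. The lock was released only when terminal A ran unlock tables; (UNLOCK TABLES releases only the current session's locks, which confirms that session 2 was the holder). After that, the blocked select completed. Note also that unlock table cats is a syntax error: the statement takes no table name.
Note: lock table cats read; takes a shared lock. It does not block SELECT statements in other sessions, but it does block their writes and DDL on the table.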
11. Experiment 2
First, turn off autocommit in terminal A:
mysql> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)

mysql> show session variables like 'autocommit';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | OFF   |
+---------------+-------+
1 row in set (0.00 sec)

mysql> commit;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from cats;
+----+------+
| id | name |
+----+------+
|  3 | NULL |
|  2 |      |
|  1 | Andy |
+----+------+
3 rows in set (0.01 sec)
This sets the session's autocommit to OFF, and the select then starts a transaction. Note that this transaction is never committed.
Then run a DDL statement in terminal B (alter table cats drop index name, as the processlist output below shows):
It blocks. What blocks it is the uncommitted transaction, which is still holding its metadata lock on cats.
mysql> show processlist;
+----+------+-----------+---------+---------+------+---------------------------------+----------------------------------+
| Id | User | Host      | db      | Command | Time | State                           | Info                             |
+----+------+-----------+---------+---------+------+---------------------------------+----------------------------------+
|  1 | root | localhost | ngx_lua | Query   |    0 | init                            | show processlist                 |
|  2 | root | localhost | ngx_lua | Query   |  177 | Waiting for table metadata lock | alter table cats drop index name |
|  4 | root | localhost | NULL    | Sleep   |  322 |                                 | NULL                             |
+----+------+-----------+---------+---------+------+---------------------------------+----------------------------------+
3 rows in set (0.00 sec)
After a manual commit in terminal A, the DDL is unblocked and runs to completion.
mysql> alter table cats drop index name;
Query OK, 0 rows affected (4 min 30.81 sec)
Records: 0  Duplicates: 0  Warnings: 0
It was blocked for 4 min 30.81 sec.
These two experiments show that DDL statements and LOCK TABLE xxx WRITE contend with transactions for the metadata lock, because DDL and LOCK TABLE xxx WRITE request it in exclusive mode.
12. Experiment 3
In terminal A (autocommit=off), run:
mysql> show variables like 'autocommit';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | OFF   |
+---------------+-------+
1 row in set (0.00 sec)

mysql> update cats set name='Linus' where id=1;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0
Then in terminal B (autocommit=on), run:
mysql> update cats set name='strup' where id=3;
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0
Terminal B is not blocked.
Even if A and B both have autocommit=off and neither commits, so that both are inside open transactions, they do not block each other. This shows that ordinary UPDATE, SELECT, and DELETE statements do not contend on the metadata lock: several running transactions can hold the metadata lock on the same database object at once. These non-DDL statements change table data, not table structure, so a transaction requests the metadata lock in shared mode. (The two updates also touch different rows, so they do not collide on row locks either.)
13. Experiment 4
First, in terminal A, set autocommit=off and run any UPDATE, SELECT, or DELETE statement:
mysql> show variables like 'autocommit';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | OFF   |
+---------------+-------+
1 row in set (0.00 sec)

mysql> update uu_test set sex='M' where id=1;
Then run a DDL statement in terminal B:
alter table uu_test add index(userId);
The DDL in terminal B stays blocked. Checking from A:
mysql> show processlist;
+----+------+-----------+------+---------+------+---------------------------------+---------------------------------------+
| Id | User | Host      | db   | Command | Time | State                           | Info                                  |
+----+------+-----------+------+---------+------+---------------------------------+---------------------------------------+
|  1 | root | localhost | aazj | Query   |    0 | init                            | show processlist                      |
|  2 | root | localhost | aazj | Query   |  351 | Waiting for table metadata lock | alter table uu_test add index(userId) |
|  4 | root | localhost | NULL | Sleep   | 2900 |                                 | NULL                                  |
+----+------+-----------+------+---------+------+---------------------------------+---------------------------------------+
3 rows in set (0.00 sec)
The DDL is blocked by the metadata lock held by A's uncommitted transaction. This shows how much damage an uncommitted transaction can do: it holds its metadata locks indefinitely.
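14. Experiment 5:
In terminal A, run a DDL statement that takes a while:
mysql> alter table uu_test add index(user_homeTel);
Query OK, 0 rows affected (12.86 sec)
Records: 0  Duplicates: 0  Warnings: 0
Then, before the DDL finishes, immediately run this in terminal B:
mysql> update uu_test set user_Sex='M' where userId=1;
Query OK, 0 rows affected (0.14 sec)
Rows matched: 256  Changed: 0  Warnings: 0
While the DDL was running, it did not block UPDATE, SELECT, or DELETE statements in other transactions. In other words, the DDL did not hold an exclusive metadata lock for its whole execution. This is worth remembering. With online DDL in MySQL 5.6 (adding an index is one such operation), the statement needs the exclusive metadata lock only briefly, when it starts and again when it finishes; in between it holds a weaker lock that lets DML through. DDL that has to copy the table, by contrast, blocks writes for its whole run. This is how online DDL differs from transactions and from LOCK TABLE xxx WRITE: a transaction holds its metadata locks until it ends, and LOCK TABLE xxx WRITE holds its lock until UNLOCK TABLES is issued.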
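To make this expectation explicit instead of relying on the default, MySQL 5.6 lets an ALTER TABLE state the algorithm and locking level it requires. The sketch below assumes the uu_test table and user_homeTel column from this experiment; the index name idx_home_tel is invented for illustration.

```sql
-- Ask for an in-place index build that permits concurrent reads and writes.
-- If the server cannot honor ALGORITHM=INPLACE or LOCK=NONE for this
-- operation, the statement fails with an error instead of quietly
-- falling back to a blocking table copy.
ALTER TABLE uu_test
  ADD INDEX idx_home_tel (user_homeTel),
  ALGORITHM=INPLACE,
  LOCK=NONE;
```

Even with LOCK=NONE, the statement still needs the exclusive metadata lock for a moment at the start and at the end, so an open transaction on uu_test can still stall it.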
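15. Experiment 6 (the worst hazard of DDL):
First, in terminal A, set autocommit=off, run any SELECT/UPDATE/DELETE statement, and leave it uncommitted so that it keeps holding the metadata lock:
mysql> select userId,user_Sex from uu_test limit 2;
+--------+----------+
| userId | user_Sex |
+--------+----------+
|      1 | M        |
|      2 | F        |
+--------+----------+
2 rows in set (0.09 sec)
Then run a DDL statement in terminal B (alter table uu_test add index(user_QQ), per the InnoDB status output further down). As expected, it is blocked by the metadata lock above:
Then, in terminal C, run any SELECT/UPDATE/DELETE statement against the same table uu_test, and something surprising happens:
The select on uu_test in terminal C is blocked as well.
Looking at show processlist from terminal D:
The DDL statement alter table uu_test add index(user_QQ) is blocked by the uncommitted transaction, and the DDL in turn blocks every later statement against uu_test, because they all need the metadata lock. This is probably the most dangerous aspect of DDL. More generally: a long transaction holds its metadata lock for a long time, which blocks a DDL statement's exclusive request, and that pending DDL then blocks every later statement that touches the same database object. Intuitively, the statement in C needs only a shared lock, which is compatible with the one A holds, so it should not block. The observed behavior is consistent with lock requests being queued, with a pending exclusive request taking priority over shared requests that arrive after it; this keeps DDL from being starved by a steady stream of readers.
Then we commit in terminal A. The DDL in B obtains the exclusive metadata lock at once and soon gives it up again, after which the select in C obtains its metadata lock too. Because the DDL takes a long time, it finishes after C does. This again shows that the DDL does not hold the exclusive metadata lock for its whole execution (otherwise C could not finish before B). The InnoDB status output captured while B and C were still blocked looks like this:
TRANSACTIONS
------------
Trx id counter 22565
Purge done for trx's n:o < 21037 undo n:o < 0 state: running but idle
History list length 486
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 4, OS thread handle 0x96cffb70, query id 129 localhost root init
show engine innodb status
---TRANSACTION 0, not started
MySQL thread id 3, OS thread handle 0x96e34b70, query id 127 localhost root Waiting for table metadata lock
select userId,user_Sex from uu_test limit 1
---TRANSACTION 0, not started
MySQL thread id 2, OS thread handle 0x96e65b70, query id 86 localhost root Waiting for table metadata lock
alter table uu_test add index(user_QQ)
---TRANSACTION 22564, ACTIVE 103 sec
MySQL thread id 1, OS thread handle 0x96e96b70, query id 46 localhost root cleaning up
Trx read view will not see trx with id >= 22565, sees < 22565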
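On the MySQL 5.6 server used here, the holder of a metadata lock has to be inferred from the processlist and the InnoDB status. On MySQL 5.7 and later, the Performance Schema table metadata_locks lists granted and pending requests directly. The following is a sketch for such a newer server, not something that runs on the 5.6.27 instance used in these experiments; it assumes the table of interest is uu_test.

```sql
-- MySQL 5.7+ only. Enable the metadata lock instrument if it is off
-- (it is disabled by default in 5.7).
UPDATE performance_schema.setup_instruments
   SET ENABLED = 'YES', TIMED = 'YES'
 WHERE NAME = 'wait/lock/metadata/sql/mdl';

-- Show who holds (GRANTED) and who waits for (PENDING) locks on uu_test,
-- mapped back to the Id column of SHOW PROCESSLIST.
SELECT t.PROCESSLIST_ID,
       m.OBJECT_SCHEMA,
       m.OBJECT_NAME,
       m.LOCK_TYPE,
       m.LOCK_DURATION,
       m.LOCK_STATUS
  FROM performance_schema.metadata_locks AS m
  JOIN performance_schema.threads AS t
    ON t.THREAD_ID = m.OWNER_THREAD_ID
 WHERE m.OBJECT_TYPE = 'TABLE'
   AND m.OBJECT_NAME = 'uu_test';
```

Locks that were taken before the instrument was enabled may not appear, so it is best to enable it before reproducing the scenario.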
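How to deal with this situation is described next:
16. Handling Waiting for table metadata lock:
In 15. Experiment 6 above, the DDL is blocked by a metadata lock held by an uncommitted or long-running transaction, and the DDL in turn blocks every other SQL statement that touches the same database object. What can be done at that point?
1) If the blocker is an uncommitted transaction, commit it or roll it back (section 17 below shows how to find uncommitted transactions);
2) If a long-running transaction is blocking the DDL, we can use kill sid to kill the session that is running the DDL. Because the DDL is still waiting for the metadata lock, it has not changed the table at all:
mysql> kill 2;
Query OK, 0 rows affected (0.01 sec)

mysql> show processlist;
+----+------+-----------+------+---------+------+-------+------------------+
| Id | User | Host      | db   | Command | Time | State | Info             |
+----+------+-----------+------+---------+------+-------+------------------+
|  1 | root | localhost | aazj | Query   |    0 | init  | show processlist |
|  3 | root | localhost | aazj | Sleep   | 1176 |       | NULL             |
+----+------+-----------+------+---------+------+-------+------------------+
2 rows in set (0.00 sec)
After kill 2 terminated the DDL's session, the Waiting for table metadata lock state in terminal C disappeared immediately. Terminal B, where the DDL was running, reports this error:
mysql> alter table uu_test add index(user_QQ);
ERROR 2013 (HY000): Lost connection to MySQL server during query
Killing the DDL's session is safe because the DDL was still blocked on the metadata lock and had not made any change to the table structure.
If it is not killed, it keeps blocking every later SQL statement (inside transactions) that touches the same database object.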
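Rather than reading the full processlist by eye, the waiting sessions can be picked out with a query against information_schema.PROCESSLIST. This is a sketch of that approach; the id 2 in the KILL statement is just the DDL session's id from the example above and has to be replaced with whatever the query reports.

```sql
-- Sessions currently stuck on a metadata lock, longest wait first.
SELECT ID, USER, HOST, DB, TIME AS wait_seconds, INFO
  FROM information_schema.PROCESSLIST
 WHERE STATE = 'Waiting for table metadata lock'
 ORDER BY TIME DESC;

-- Cancel only the waiting statement and keep its connection open.
-- Replace 2 with the ID of the blocked DDL reported by the query above.
KILL QUERY 2;
```

KILL QUERY is a gentler alternative to the plain kill used above: the client that issued the DDL gets an error for that statement but does not lose its connection.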
Note: this also raises the question of how to find uncommitted transactions:
17. How to find transactions that have been left uncommitted:
Uncommitted transactions cause all sorts of problems, such as lock waits, deadlocks, and the Waiting for table metadata lock pile-up shown above. So how do we find transactions in MySQL that, for whatever reason, were never committed?
mysql> SELECT * FROM information_schema.INNODB_TRX\G
*************************** 1. row ***************************
                    trx_id: 22564
                 trx_state: RUNNING
               trx_started: 2015-10-19 13:17:09
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 0
       trx_mysql_thread_id: 1
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 0
          trx_lock_structs: 0
     trx_lock_memory_bytes: 312
           trx_rows_locked: 0
         trx_rows_modified: 0
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 10000
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)
The relevant fields are trx_id: 22564, trx_state: RUNNING, trx_started: 2015-10-19 13:17:09, and trx_mysql_thread_id: 1.
Transaction 22564 has stayed in trx_state RUNNING, and trx_started shows that it has been open for a long time. Its trx_mysql_thread_id is 1, so session 1 is the one that forgot to commit.
The transaction section of show engine innodb status above tells the same story:
---TRANSACTION 22564, ACTIVE 103 sec
MySQL thread id 1, OS thread handle 0x96e96b70, query id 46 localhost root cleaning up
Trx read view will not see trx with id >= 22565, sees < 22565
The line ---TRANSACTION 22564, ACTIVE 103 sec shows that transaction 22564 has been open for a long time, and the id 22564 matches trx_id: 22564 in information_schema.INNODB_TRX.
Once the session id of the uncommitted transaction is known, running kill 1 in a mysql client terminates it. After session 1 is killed, its transaction is rolled back, and the DDL and the other statements it was blocking leave the Waiting for table metadata lock state and run to completion.
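The manual lookup above can be folded into a single query that joins INNODB_TRX to the processlist and keeps only transactions whose session is idle. This is a sketch; the column aliases (open_seconds, session_id, idle_seconds) are chosen here for readability and are not part of MySQL.

```sql
-- Open InnoDB transactions whose session is idle (Command = Sleep),
-- oldest transaction first. These are the usual "forgot to commit" cases.
SELECT t.trx_id,
       t.trx_started,
       TIMESTAMPDIFF(SECOND, t.trx_started, NOW()) AS open_seconds,
       p.ID   AS session_id,
       p.USER,
       p.HOST,
       p.DB,
       p.TIME AS idle_seconds
  FROM information_schema.INNODB_TRX AS t
  JOIN information_schema.PROCESSLIST AS p
    ON p.ID = t.trx_mysql_thread_id
 WHERE p.COMMAND = 'Sleep'
 ORDER BY t.trx_started;
```

The session_id column is the value to pass to kill. Only InnoDB transactions show up in INNODB_TRX, so a session that holds metadata locks purely on non-InnoDB tables would not be found this way.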
18. Using lock_wait_timeout to keep DDL from waiting on a metadata lock indefinitely
Since DDL can be blocked on a metadata lock by a long-running or uncommitted transaction, we can set a metadata lock timeout at session level in the session that runs the DDL. If the DDL does not obtain the metadata lock within that time, it gives up:
mysql> set lock_wait_timeout=50;
Query OK, 0 rows affected (0.07 sec)

mysql> alter table uu_test add index(user_QQ);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
Here the DDL statement gave up after the 50-second timeout.
Description of lock_wait_timeout:
lock_wait_timeout
Command-Line Format    | --lock_wait_timeout=#
System Variable        | lock_wait_timeout
Variable Scope         | Global, Session
Dynamic Variable       | Yes
Permitted Values, Type | integer
Default                | 31536000
Min Value              | 1
Max Value              | 31536000
This variable specifies the timeout in seconds for attempts to acquire metadata locks. The permissible values range from 1 to 31536000 (1 year). The default is 31536000.
This timeout applies to all statements that use metadata locks. These include DML and DDL operations on tables, views, stored procedures, and stored functions, as well as LOCK TABLES, FLUSH TABLES WITH READ LOCK, and HANDLER statements.
This timeout does not apply to implicit accesses to system tables in the mysql database, such as grant tables modified by GRANT or REVOKE statements or table logging statements. The timeout does apply to system tables accessed directly, such as with SELECT or UPDATE.
The timeout value applies separately for each metadata lock attempt. A given statement can require more than one lock, so it is possible for the statement to block for longer than the lock_wait_timeout value before reporting a timeout error. When lock timeout occurs, ER_LOCK_WAIT_TIMEOUT is reported.
lock_wait_timeout does not apply to delayed inserts, which always execute with a timeout of 1 year. This is done to avoid unnecessary timeouts because a session that issues a delayed insert receives no notification of delayed insert timeouts.
Note: do not confuse lock_wait_timeout with innodb_lock_wait_timeout. The former applies to metadata locks and defaults to one year; the latter is the timeout for InnoDB row locks and defaults to 50 seconds.
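Put together, a guarded DDL run might look like the sketch below. The 10-second value is an arbitrary choice for illustration, and the table and column are the ones from the experiments above.

```sql
-- Give up after 10 seconds instead of the one-year default.
SET SESSION lock_wait_timeout = 10;

ALTER TABLE uu_test ADD INDEX (user_QQ);

-- Put the session back to the global setting for later statements.
SET SESSION lock_wait_timeout = DEFAULT;
```

If the ALTER fails with ERROR 1205, the table is unchanged, and the statement can be retried once the blocking transaction has been found and dealt with.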
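19. Summary:
1) A metadata lock protects metadata, that is, database objects (table definitions and so on), not the data in the table;
2) A running transaction must acquire a metadata lock on every database object it uses, and it releases those locks when it ends (locks taken by PREPARE are the exception);
3) DDL statements and LOCK TABLE xxx WRITE contend with transactions for the metadata lock, because DDL and LOCK TABLE ... WRITE request it in exclusive mode while transactions request it in shared mode.
Ordinary UPDATE, SELECT, and DELETE statements do not contend on the metadata lock; several running transactions can hold the (shared) metadata lock on the same database object at once.
4) The mysql client defaults to autocommit=on; do not change its default to autocommit=off. JDBC connections also start with autocommit on, but applications and frameworks commonly turn it off, so make sure every transaction opened that way is committed or rolled back;
5) Every transaction takes metadata locks and releases them only when it ends, so avoid large transactions in MySQL, especially long-running ones.
Otherwise the metadata lock is held for a long time and blocks any DDL or LOCK TABLE ... WRITE statement on that database object from other sessions;
6) How DDL differs from transactions and LOCK TABLE xxx WRITE:
An online DDL statement does not hold the exclusive metadata lock for its whole execution; it needs it only briefly, when it starts and again when it finishes.
A transaction holds its metadata locks for its whole duration, and LOCK TABLE xxx WRITE holds its lock until UNLOCK TABLES releases it.
7) Long transactions and LOCK TABLE ... WRITE hold the metadata lock for a long time. Before running DDL, use show processlist (and information_schema.INNODB_TRX) to check whether a long-running transaction is using the table the DDL touches; otherwise the DDL risks being blocked on the metadata lock indefinitely. The real danger is not DDL but long transactions. Alternatively, set lock_wait_timeout at session level in the session that runs the DDL, so that it does not wait on the metadata lock forever.
8) DDL run from the mysql client is not affected by autocommit=on/off: a DDL statement implicitly commits any open transaction before it runs and commits again when it finishes;
9) The worst hazard of DDL:
An uncommitted or long-running transaction holds its metadata lock for a long time and blocks a later DDL statement's exclusive request for that lock.
The pending exclusive request then blocks every later statement that touches the same database object, because those statements need the metadata lock too.
10) Uncommitted transactions cause lock waits, deadlocks, Waiting for table metadata lock, and other problems. The fix is to find the session id of the uncommitted transaction and run kill sid so that it is rolled back.
11) A LOCK TABLE x WRITE lock is released when the holding session runs UNLOCK TABLES or its connection ends; killing some other session, as kill 3 did in Experiment 1, has no effect on it. Killing the session of a DDL statement that is blocked on the metadata lock makes it abandon its request (it never obtained the lock) and removes it from the wait queue, so it no longer blocks other SELECT/UPDATE/DELETE statements.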