回顾一下生产中的一次MySQL异常,Cause: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction解决与处理。
异常如下:
Cause: java.sql.SQLException: Lock wait timeout exceeded;
try restarting transaction
翻译:锁等待超时,尝试重启事务。
① information_schema.innodb_trx–当前运行的所有事务
Field | Type | Null | Key | Default | Extra |
---|---|---|---|---|---|
trx_id | varchar(18) | NO | 事务ID | ||
trx_state | varchar(13) | NO | 事务状态: | ||
trx_started | datetime | NO | 0000-00-00 00:00:00 | 事务开始时间; | |
trx_requested_lock_id | varchar(81) | YES | NULL | innodb_locks.lock_id | |
trx_wait_started | datetime | YES | NULL | 事务开始等待的时间 | |
trx_weight | bigint(21) unsigned | NO | 0 | 事务权重 | |
trx_mysql_thread_id | bigint(21) unsigned | NO | 0 | 事务线程ID | |
trx_query | varchar(1024) | YES | NULL | 具体SQL语句 | |
trx_operation_state | varchar(64) | YES | NULL | 事务当前操作状态 | |
trx_tables_in_use | bigint(21) unsigned | NO | 0 | 事务中有多少个表被使用 | |
trx_tables_locked | bigint(21) unsigned | NO | 0 | 事务拥有多少个锁 | |
trx_lock_structs | bigint(21) unsigned | NO | 0 | ||
trx_lock_memory_bytes | bigint(21) unsigned | NO | 0 | 事务锁住的内存大小(B) | |
trx_rows_locked | bigint(21) unsigned | NO | 0 | 事务锁住的行数 | |
trx_rows_modified | bigint(21) unsigned | NO | 0 | 事务更改的行数 | |
trx_concurrency_tickets | bigint(21) unsigned | NO | 0 | 事务并发票数 | |
trx_isolation_level | varchar(16) | NO | 事务隔离级别 | ||
trx_unique_checks | int(1) | NO | 0 | 是否唯一性检查 | |
trx_foreign_key_checks | int(1) | NO | 0 | 是否外键检查 | |
trx_last_foreign_key_error | varchar(256) | YES | NULL | 最后的外键错误 | |
trx_adaptive_hash_latched | int(1) | NO | 0 | ||
trx_adaptive_hash_timeout | bigint(21) unsigned | NO | 0 |
② information_schema.innodb_locks–当前出现的锁
Field | Type | Null | Key | Default | Extra |
---|---|---|---|---|---|
lock_id | varchar(81) | NO | 锁ID | ||
lock_trx_id | varchar(18) | NO | 拥有锁的事务ID | ||
lock_mode | varchar(32) | NO | 锁模式 | ||
lock_type | varchar(32) | NO | 锁类型 | ||
lock_table | varchar(1024) | NO | 被锁的表 | ||
lock_index | varchar(1024) | YES | NULL | 被锁的索引 | |
lock_space | bigint(21) unsigned | YES | NULL | 被锁的表空间号 | |
lock_page | bigint(21) unsigned | YES | NULL | 被锁的页号 | |
lock_rec | bigint(21) unsigned | YES | NULL | 被锁的记录号 | |
lock_data | varchar(8192) | YES | NULL | 被锁的数据 |
③ information_schema.innodb_lock_waits–锁等待的对应关系
Field | Type | Null | Key | Default | Extra |
---|---|---|---|---|---|
requesting_trx_id | varchar(18) | NO | 请求锁的事务ID | ||
requested_lock_id | varchar(81) | NO | 请求锁的锁ID | ||
blocking_trx_id | varchar(18) | NO | 当前拥有锁的事务ID | ||
blocking_lock_id | varchar(81) | NO | 当前拥有锁的锁ID |
开始测试
第一步:创建测试表tx1,并插入测试数据;
create table tx1
(id int primary key ,
c1 varchar(20),
c2 varchar(30)),
c3 datetime
engine=innodb default charset = utf8 ;
insert into tx1 values
(1,'aaaa','aaaaa2',SYSDATE()),
(2,'bbbb','bbbbb2',,SYSDATE()),
(3,'cccc','ccccc2',,SYSDATE());
第二步:手动开启事务,查询三个表数据
start transaction;
update tx1 set c1='heyf',c2='heyf',c3=SYSDATE() where id =3 ;
select * from information_schema.innodb_trx;
select * from information_schema.INNODB_LOCKS;
select * from information_schema.INNODB_LOCK_WAITS;
此时没有锁,锁等待关系,只有innodb_trx表中有数据
*************************** 1. row ***************************
trx_id: 75E34
trx_state: RUNNING
trx_started: 2018-06-06 16:55:37
trx_requested_lock_id: NULL
trx_wait_started: NULL
trx_weight: 3
trx_mysql_thread_id: 235
trx_query: NULL
trx_operation_state: NULL
trx_tables_in_use: 0
trx_tables_locked: 0
trx_lock_structs: 2
trx_lock_memory_bytes: 376
trx_rows_locked: 1
trx_rows_modified: 1
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 10000
1 row in set (1.02 sec)
第三步:在另一会话中更新该记录,产生锁等待;
start transaction;
update tx1 set c1='heyfffff',c2='heyffffff',c3=SYSDATE()
where id =3 ;
查看innodb_trx表数据:
mysql> select * from information_schema.innodb_trx\G;
*************************** 1. row ***************************
trx_id: 75E35
trx_state: LOCK WAIT##事务状态
trx_started: 2018-06-06 17:02:33
trx_requested_lock_id: 75E35:0:7509:5
trx_wait_started: 2018-06-06 17:02:33
trx_weight: 2
trx_mysql_thread_id: 238
trx_query: update tx1 set c1='heyfffff',c2='heyffffff',c3=SYSDATE() where id =3
trx_operation_state: starting index read
trx_tables_in_use: 1
trx_tables_locked: 1
trx_lock_structs: 2
trx_lock_memory_bytes: 376
trx_rows_locked: 1
trx_rows_modified: 0
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 10000
*************************** 2. row ***************************
trx_id: 75E34
trx_state: RUNNING##事务状态
trx_started: 2018-06-06 16:55:37
trx_requested_lock_id: NULL
trx_wait_started: NULL
trx_weight: 3
trx_mysql_thread_id: 235
trx_query: NULL
trx_operation_state: NULL
trx_tables_in_use: 0
trx_tables_locked: 0
trx_lock_structs: 2
trx_lock_memory_bytes: 376
trx_rows_locked: 1
trx_rows_modified: 1
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 10000
2 rows in set (0.00 sec)
查看INNODB_LOCKS表数据:
mysql> select * from information_schema.INNODB_LOCKS\G;
*************************** 1. row ***************************
lock_id: 75E35:0:7509:5
lock_trx_id: 75E35
lock_mode: X
lock_type: RECORD
lock_table: `test`.`tx1`
lock_index: `PRIMARY`
lock_space: 0
lock_page: 7509
lock_rec: 5
lock_data: 3
*************************** 2. row ***************************
lock_id: 75E34:0:7509:5
lock_trx_id: 75E34
lock_mode: X
lock_type: RECORD
lock_table: `test`.`tx1`
lock_index: `PRIMARY`
lock_space: 0
lock_page: 7509
lock_rec: 5
lock_data: 3
2 rows in set (0.00 sec)
查看INNODB_LOCK_WAITS表数据:
mysql> select * from information_schema.INNODB_LOCK_WAITS;\G
*************************** 1. row ***************************
requesting_trx_id: 75E35 ## 请求锁的事务
requested_lock_id: 75E35:0:7509:5 ## 请求锁的锁ID
blocking_trx_id: 75E34 ## 拥有锁的事务
blocking_lock_id: 75E34:0:7509:5 ## 拥有锁的锁ID
在执行第二个update的时候,由于第一个update事务还未提交,故而第二个update在等待,其事务状态为LOCK WAIT ,等待时间超过innodb_lock_wait_timeout值时,则会报ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
异常。
在第二个update锁等待超时之后,对第一个update手动提交事务,则第一个update语句成功更新数据库中数据表。
锁等待递进
如果是多个锁等待,比如有三个update,update同一行记录,则锁等待关系会层级递进,第二个第三个update都保留对第一个update的锁等待且第三个update保留对第二个update的锁等待,如下图:
① 查看并修改变量值
show GLOBAL VARIABLES like '%innodb_lock_wait_timeout%';
set GLOBAL innodb_lock_wait_timeout=100;##设置大小值看系统情况
innodb_lock_wait_timeout指的是事务等待获取资源等待的最长时间,超过这个时间还未分配到资源则会返回应用失败。参数的时间单位是秒,默认值50S。
② 找到一直未提交事务导致后来进程死锁等待的进程,并杀掉
根据锁等待表中的拥有锁的事务id(blocking_trx_id),从innodb_trx表中找到trx_mysql_thread_id值,kill掉。
如 这里杀掉 进程235:
select trx_mysql_thread_id from information_schema.innodb_trx it
JOIN information_schema.INNODB_LOCK_WAITS ilw
on ilw.blocking_trx_id = it.trx_id;
##trx_mysql_thread_id: 235
kill 235
③ 优化SQL,优化数据库,优化项目。第一个update未执行完,第二个update就来了,超过等待时间就会报锁等待超时异常。在数据并发项目遇到这种情况概率比较大,这时候就要从项目、数据库、执行SQL多方面入手了。