问题:昨天线上出现报错导致一个功能无法执行,查找日志发现是mysql的死锁问题。
分析问题:其实解决问题最大的难点在于分析问题找到出现问题出现在哪里,这个过程花费的时间和思考是最多的,而使用代码解决问题反而很快速。
(1)报错日志如下:
2021-10-09 18:59:04] local.INFO: RobotAuction-SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction (SQL: select id
, cash
, freeze
, area_id
from users
where users
.id
in (2634, 2662, 2672, 2673, 2675, 2685, 2808, 2811, 2818, 2834, 2869, 2886, 2926, 2961, 2962, 2981, 3066, 3080, 3124, 3131, 3135) and users
.deleted_at
is null for update)-669
从报错日志上去查看msyql的死锁日志,再结合业务上的可能的操作进行分析
(2)死锁日志:
2021-10-09 18:59:04 0x150d81d79700
* (1) TRANSACTION:
TRANSACTION 385220, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 11 lock struct(s), heap size 1136, 19 row lock(s)
MySQL thread id 1376113, OS thread handle 23147762767616, query id 23440050 localhost 127.0.0.1 lpt Sending data
select `id`, `cash`, `freeze`, `area_id` from `users` where `users`.`id` in (2634, 2662, 2672, 2673, 2675, 2685, 2808, 2811, 2818, 2834, 2869, 2886, 2926, 2961, 2962, 2981, 3066, 3080, 3124, 3131, 3135) and `users`.`deleted_at` is null
for update
(1) WAITING FOR THIS LOCK TO BE GRANTED: 【1】等待id = 2675 的事务锁
RECORD LOCKS space id 191 page no 14 n bits 120 index PRIMARY of table lpt
.users
trx id 385220 lock_mode X locks rec but not gap waiting
Record lock, heap no 49 PHYSICAL RECORD: n_fields 52; compact format; info bits 0
** 省略部分
* (2) TRANSACTION:
TRANSACTION 385219, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
7 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 6
MySQL thread id 1376112, OS thread handle 23147757147904, query id 23440106 localhost 127.0.0.1 lpt statistics
select `id`, `cash` from `users` where `users`.`id` = 2675 and `users`.`deleted_at` is null limit 1 for update
(2) HOLDS THE LOCK(S): 【2】持有id=2675的锁
RECORD LOCKS space id 191 page no 14 n bits 120 index PRIMARY of table lpt
.users
trx id 385219 lock_mode X locks rec but not gap
Record lock, heap no 49 PHYSICAL RECORD: n_fields 52; compact format; info bits 0
** 省略部分
(2) WAITING FOR THIS LOCK TO BE GRANTED: 【3】等待 id = 2673的数据释放锁
RECORD LOCKS space id 191 page no 6 n bits 120 index PRIMARY of table lpt
.users
trx id 385219 lock_mode X locks rec but not gap waiting
Record lock, heap no 32 PHYSICAL RECORD: n_fields 52; compact format; info bits 0
** 省略部分
* WE ROLL BACK TRANSACTION (1) 【4】选择开销小的进行回滚,选择了事务(1),对照前面的任务未执行,回滚了
问题所在:
1、使用的锁都是排他锁
2、这典型的加锁顺序不同造成的死锁
3、分析过程【1】【2】【3】【4】
解析办法
1、通过分析业务,调整业务上的顺序来解决 ✔
2、通过表锁来解决,却会牺牲性能并且不能使用索引 ×