I. Table structure, deadlock log, and transaction isolation level
Table structure:
CREATE TABLE `ep_lucky_bag_activity_user_task_schedule` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key id',
  `user_id` char(32) NOT NULL COMMENT 'user id',
  `task_id` char(32) NOT NULL COMMENT 'task id',
  `rate` int(11) NOT NULL DEFAULT '0' COMMENT 'task completion progress',
  `use_rate` int(11) NOT NULL DEFAULT '0' COMMENT '',
  `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
  `update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'update time',
  `deleted` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT 'whether deleted',
  PRIMARY KEY (`id`),
  UNIQUE KEY `udx_user_id_task_id` (`user_id`,`task_id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='task progress table';
Deadlock log:
------------------------
LATEST DETECTED DEADLOCK
------------------------
2020-11-09 13:09:42 0x7f8e4379e700
*** (1) TRANSACTION:
TRANSACTION 29466672039, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 1 row lock(s), undo log entries 2
MySQL thread id 18422134, OS thread handle 140514547791616, query id 73544985667 XXXX XXXX update
insert into
ep_lucky_bag_activity_user_task_schedule (user_id, task_id, rate, use_rate)
values
('pqtjtyylwovtrqjghpkwrtpy972s6641', 'ab67f3ed87b141f29132a5f94ce50233', 1, 0)
on duplicate key update
rate = rate + 1,
use_rate = use_rate + 0,
update_time = now()
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1316 page no 5085 n bits 288 index udx_user_id_task_id of table `XXX`.`ep_lucky_bag_activity_user_task_schedule` trx id 29466672039 lock_mode X waiting
*** (2) TRANSACTION:
TRANSACTION 29466671925, ACTIVE 0 sec inserting, thread declared inside InnoDB 5000
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 4
MySQL thread id 18422851, OS thread handle 140248994146048, query id 73544985718 XXX XXX update
insert into
ep_lucky_bag_activity_user_task_schedule (user_id, task_id, rate, use_rate)
values
('pqtjtyylwovtrqjghpkwrtpy972s6641', '7e8fb79190fd435cad78242bf090a1b1', 1, 0)
on duplicate key update
rate = rate + 1,
use_rate = use_rate + 0,
update_time = now()
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1316 page no 5085 n bits 288 index udx_user_id_task_id of table `XXX`.`ep_lucky_bag_activity_user_task_schedule` trx id 29466671925 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1316 page no 5085 n bits 288 index udx_user_id_task_id of table `XXX`.`ep_lucky_bag_activity_user_task_schedule` trx id 29466671925 lock_mode X locks gap before rec insert intention waiting
*** WE ROLL BACK TRANSACTION (1)
The transaction isolation level is READ COMMITTED.
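II. Lock analysis of insert on duplicate key update under READ COMMITTED
The following comes from my own reading online plus a few simple experiments; I cannot guarantee it is correct.
1. Check whether the gap the new record would be inserted into has a gap lock.
Gap lock present: request an insert intention lock and wait until the gap lock on that gap is released, then go to step 2.
No gap lock: go straight to step 2.
2. Check whether there is a unique key conflict.
Conflict:
(1) If the unique key entry is covered by an implicit lock, the implicit lock is converted to an explicit record lock, held by the transaction that actually inserted the record.
(2) Acquire a next-key lock on the unique key entry (gap lock first, then record lock), then perform the update.
No conflict: insert the record, which takes an implicit lock.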
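The two steps above can be written down as a small decision function. This is only a sketch of the author's model, not of InnoDB internals; the function name `upsert_lock_steps` and the step descriptions are made up for illustration.

```python
from __future__ import annotations


def upsert_lock_steps(gap_locked: bool, duplicate: bool,
                      implicit_lock: bool = False) -> list[str]:
    """Return, in order, the lock steps the model above predicts for one upsert."""
    steps = []
    # Step 1: a gap lock on the target gap forces an insert intention wait.
    if gap_locked:
        steps.append("wait: insert intention lock until the gap lock is released")
    # Step 2: unique key check.
    if duplicate:
        if implicit_lock:
            steps.append("inserter's implicit lock is converted to a record lock")
        steps.append("acquire next-key lock: gap lock")
        steps.append("acquire next-key lock: record lock")
        steps.append("update the existing row")
    else:
        steps.append("insert the row, taking an implicit lock")
    return steps


# Second upsert on a key that another open transaction has just inserted.
for step in upsert_lock_steps(gap_locked=False, duplicate=True, implicit_lock=True):
    print(step)
```

The ordering inside the conflict branch (gap lock before record lock) is the part that matters for the deadlock analysed later.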
III. SQL executed by each transaction at the time of the deadlock
_ | Transaction 29466671925 | Transaction 29466672039 |
---|---|---|
1 | insert ('pqt...', 'ab6...') on duplicate key update | _ |
2 | _ | insert ('pqt...', 'ab6...') on duplicate key update |
3 | insert ('pqt...', '7e8...') on duplicate key update -> deadlock | _ |
IV. Lock analysis of the log
(1) TRANSACTION:
insert into
ep_lucky_bag_activity_user_task_schedule (user_id, task_id, rate, use_rate)
values
('pqtjtyylwovtrqjghpkwrtpy972s6641', 'ab67f3ed87b141f29132a5f94ce50233', 1, 0)
on duplicate key update
rate = rate + 1,
use_rate = use_rate + 0,
update_time = now()
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1316 page no 5085 n bits 288 index udx_user_id_task_id of table `XXX`.`ep_lucky_bag_activity_user_task_schedule` trx id 29466672039 lock_mode X waiting
Transaction 29466672039 is waiting for an X next-key lock on the unique index entry ('pqt...', 'ab6...').
(2) TRANSACTION:
insert into
ep_lucky_bag_activity_user_task_schedule (user_id, task_id, rate, use_rate)
values
('pqtjtyylwovtrqjghpkwrtpy972s6641', '7e8fb79190fd435cad78242bf090a1b1', 1, 0)
on duplicate key update
rate = rate + 1,
use_rate = use_rate + 0,
update_time = now()
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1316 page no 5085 n bits 288 index udx_user_id_task_id of table `XXX`.`ep_lucky_bag_activity_user_task_schedule` trx id 29466671925 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1316 page no 5085 n bits 288 index udx_user_id_task_id of table `XXX`.`ep_lucky_bag_activity_user_task_schedule` trx id 29466671925 lock_mode X locks gap before rec insert intention waiting
Transaction 29466671925 holds an X record lock on the unique index entry ('pqt...', 'ab6...')
and is waiting for an insert intention lock on the gap before ('pqt...', 'ab6...').
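The readings above follow from the wording InnoDB uses on each RECORD LOCKS line: a bare `lock_mode X` is a next-key lock, `locks rec but not gap` is a record lock, `locks gap before rec` is a gap lock, and `insert intention` marks an insert intention lock. The helper below is a rough sketch of that mapping; `classify_lock` is a hypothetical name, and it only covers the phrases that appear in this kind of log.

```python
import re


def classify_lock(line: str) -> str:
    """Map one RECORD LOCKS line from the deadlock log to a lock type."""
    match = re.search(r"lock[_ ]mode (X|S)(.*)", line)
    if match is None:
        raise ValueError("no lock mode found in line")
    mode, rest = match.group(1), match.group(2)
    waiting = rest.rstrip().endswith("waiting")
    # Check the most specific phrase first: an insert intention line
    # also contains "locks gap before rec".
    if "insert intention" in rest:
        kind = "insert intention lock"
    elif "locks rec but not gap" in rest:
        kind = "record lock"
    elif "locks gap before rec" in rest:
        kind = "gap lock"
    else:
        kind = "next-key lock"
    return f"{mode} {kind}" + (" (waiting)" if waiting else "")


print(classify_lock("trx id 29466672039 lock_mode X waiting"))
print(classify_lock("trx id 29466671925 lock_mode X locks rec but not gap"))
print(classify_lock("trx id 29466671925 lock_mode X locks gap before rec insert intention waiting"))
```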
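V. Locks held by each transaction at the time of the deadlock
_ | Transaction 29466671925 | Transaction 29466672039 |
---|---|---|
1 | insert ('pqt...', 'ab6...') on duplicate key update | _ |
2 | Holds an implicit lock on ('pqt...', 'ab6...') | _ |
3 | _ | insert ('pqt...', 'ab6...') on duplicate key update |
4 | Implicit lock on ('pqt...', 'ab6...') is converted to an X record lock | _ |
5 | _ | Acquires the X gap lock on ('pqt...', 'ab6...') |
6 | _ | Fails to acquire the X record lock on ('pqt...', 'ab6...') and starts waiting |
7 | insert ('pqt...', '7e8...') on duplicate key update | _ |
8 | The gap that ('pqt...', '7e8...') would be inserted into has a gap lock, so it requests an insert intention lock and waits for the gap lock to be released. Each transaction now holds a lock the other wants and waits for a lock the other holds: deadlock | _ |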
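The final state of the table can be checked as a waits-for graph: a deadlock is a cycle in that graph. The sketch below assumes each transaction waits on at most one other transaction, which holds here; `find_cycle` is a hypothetical helper, not something MySQL exposes.

```python
from __future__ import annotations


def find_cycle(waits_for: dict[str, str]) -> list[str]:
    """Follow waits-for edges from each transaction; return a cycle if one exists."""
    for start in waits_for:
        path = [start]
        current = waits_for.get(start)
        while current is not None and current not in path:
            path.append(current)
            current = waits_for.get(current)
        if current is not None:
            # We came back to a transaction already on the path: a cycle.
            return path[path.index(current):]
    return []


waits_for = {
    # Wants the X record lock on ('pqt...', 'ab6...') held by 29466671925.
    "29466672039": "29466671925",
    # Insert intention lock blocked by the gap lock of 29466672039.
    "29466671925": "29466672039",
}
print(find_cycle(waits_for))
```

With only the first edge present there is no cycle, which matches steps 5 and 6 of the table: transaction 29466672039 simply waits until step 8 adds the second edge.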
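VI. Fix
After reviewing the business code, I found that transaction 29466671925 could be committed right after the insert of ('pqt...', 'ab6...'), which avoids this deadlock. After the code change went live, the deadlock alerts stopped.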
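As an illustration of that change, the sketch below commits after every single upsert so that its locks are released before the next one starts. The original business code is not shown in this post, so everything here is assumed: `add_progress` is a hypothetical function, and `conn` is taken to be a PEP 249 connection with autocommit off and `%s` placeholders (as in PyMySQL or mysqlclient).

```python
UPSERT_SQL = (
    "insert into ep_lucky_bag_activity_user_task_schedule "
    "(user_id, task_id, rate, use_rate) values (%s, %s, 1, 0) "
    "on duplicate key update rate = rate + 1, use_rate = use_rate + 0, "
    "update_time = now()"
)


def add_progress(conn, user_id, task_id):
    """Run one upsert and commit at once, so its locks are released early."""
    cursor = conn.cursor()
    try:
        cursor.execute(UPSERT_SQL, (user_id, task_id))
        conn.commit()
    except Exception:
        # Leave no half-open transaction holding locks behind.
        conn.rollback()
        raise
    finally:
        cursor.close()
```

Committing per upsert is only appropriate when the upserts do not need to succeed or fail together, which was the case for this business logic.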
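VII. Summary
1. Sometimes the deadlock log alone is not enough to work out the cause, and the business logic has to be taken into account. In this case the log shows two transactions inserting records with different unique keys, yet each holds or waits for a lock on the same unique key entry, which was quite confusing at first sight.
2. Thinking carefully about how much work a transaction covers when writing code pays off. This deadlock happened because a transaction was not committed promptly.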