org.apache.ibatis.exceptions.PersistenceException:
### Error flushing statements. Cause: org.apache.ibatis.executor.BatchExecutorException: com.baturu.wms.business.outbound.dao.OutboundNoticeHeaderDao.updateById (batch index #1) failed. Cause: java.sql.BatchUpdateException: Deadlock found when trying to get lock; try restarting transaction
### Cause: org.apache.ibatis.executor.BatchExecutorException: com.baturu.wms.business.outbound.dao.OutboundNoticeHeaderDao.updateById (batch index #1) failed. Cause: java.sql.BatchUpdateException: Deadlock found when trying to get lock; try restarting transaction
at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
...
Caused by: org.apache.ibatis.executor.BatchExecutorException: com.baturu.wms.business.outbound.dao.OutboundNoticeHeaderDao.updateById (batch index #1) failed. Cause: java.sql.BatchUpdateException: Deadlock found when trying to get lock; try restarting transaction
at org.apache.ibatis.executor.BatchExecutor.doFlushStatements(BatchExecutor.java:148)
...
at org.apache.ibatis.session.defaults.DefaultSqlSession.flushStatements(DefaultSqlSession.java:253)
... 44 more
Caused by: java.sql.BatchUpdateException: Deadlock found when trying to get lock; try restarting transaction
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
...
at org.apache.ibatis.executor.BatchExecutor.doFlushStatements(BatchExecutor.java:122)
... 52 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
...
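This was my first time troubleshooting a deadlock exception, and at first I did not know where to start. Since I had been reading about InnoDB recently, I began by searching (on Baidu) for how to view the deadlock log in MySQL.
Command: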
mysql> show engine innodb status;
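This prints a lot of output. Don't panic: we only need the deadlock part, which appears under the heading LATEST DETECTED DEADLOCK.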
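Picking that section out of the full status text by hand gets tedious, so here is a minimal sketch that does it in Java. It assumes the status text has already been fetched as a single string, and that the section ends at the next line made up only of dashes. The class and method names (`DeadlockSectionExtractor`, `extract`) are hypothetical and not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: cuts the deadlock section out of SHOW ENGINE INNODB STATUS output.
class DeadlockSectionExtractor {
    static final String HEADER = "LATEST DETECTED DEADLOCK";

    /** Returns the lines under the deadlock header, or an empty string if there is none. */
    static String extract(String status) {
        List<String> section = new ArrayList<>();
        boolean inside = false;
        for (String line : status.split("\\R")) {
            String trimmed = line.trim();
            boolean dashes = !trimmed.isEmpty() && trimmed.chars().allMatch(c -> c == '-');
            if (!inside) {
                // Skip everything until the header line is found.
                if (trimmed.equals(HEADER)) {
                    inside = true;
                }
                continue;
            }
            if (dashes) {
                if (section.isEmpty()) {
                    continue; // the dashed underline directly below the header
                }
                break; // assumption: a dashed line starts the banner of the next section
            }
            section.add(line);
        }
        return String.join("\n", section);
    }
}
```

Calling `DeadlockSectionExtractor.extract(statusText)` returns only the transaction and lock lines, which is the part analyzed in the rest of this post.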
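A deadlock is usually caused by two transactions each waiting for a lock the other holds, which is why the log shows a transaction 1 and a transaction 2. The goal of reading the log is to find out which locks the two transactions are waiting on.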
*** (1) TRANSACTION:
TRANSACTION 468429219, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 7 lock struct(s), heap size 1136, 7 row lock(s), undo log entries 17
MySQL thread id 31567782, OS thread handle 140580182210304, query id 786068511 1.1.1.1 ops updating
UPDATE wms_outbound_notice_header SET status=10,wave_code='19089',wave_error_msg='',updater=null,update_date='2021-11-10 17:01:09.377' WHERE id=1458342344177786881
Transaction 1 ran into the deadlock while executing [UPDATE wms_outbound_notice_header …].
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 9028 page no 90 n bits 136 index PRIMARY of table `btr_wms`.`wms_outbound_notice_header` trx id 468429219 lock_mode X locks rec but not gap waiting
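What transaction 1 is waiting for:
Transaction id [468429219] is waiting for an X lock (exclusive lock) on a PRIMARY index record of table btr_wms.wms_outbound_notice_header. "locks rec but not gap" means it is a plain record (row) lock, not a gap lock and not a table lock.
So the id of transaction 1 is 468429219.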
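The fields of a RECORD LOCKS line can also be pulled apart programmatically. The sketch below is only an illustration: the regular expression is derived from the log lines shown in this post, so other MySQL versions may format the line differently, and the class name `RecordLockLine` is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for one "RECORD LOCKS ..." line of the InnoDB deadlock log.
class RecordLockLine {
    // Assumption: the layout matches the lines shown in this post.
    private static final Pattern LINE = Pattern.compile(
            "RECORD LOCKS space id (\\d+) page no (\\d+) n bits (\\d+) index (\\S+) of table "
            + "`([^`]+)`\\.`([^`]+)` trx id (\\d+) lock_mode (\\S+)(.*)");

    final String index;
    final String schema;
    final String table;
    final long trxId;
    final String lockMode;
    final String qualifiers; // e.g. "locks rec but not gap"
    final boolean waiting;   // true when the line ends with "waiting"

    private RecordLockLine(Matcher m) {
        index = m.group(4);
        schema = m.group(5);
        table = m.group(6);
        trxId = Long.parseLong(m.group(7));
        lockMode = m.group(8);
        String rest = m.group(9).trim();
        waiting = rest.equals("waiting") || rest.endsWith(" waiting");
        qualifiers = waiting
                ? rest.substring(0, rest.length() - "waiting".length()).trim()
                : rest;
    }

    static RecordLockLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("not a RECORD LOCKS line: " + line);
        }
        return new RecordLockLine(m);
    }
}
```

For the line above, `parse` yields table `wms_outbound_notice_header`, trx id 468429219, lock mode `X`, qualifiers `locks rec but not gap`, and `waiting == true`.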
*** (2) TRANSACTION:
TRANSACTION 468429218, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
248 lock struct(s), heap size 41168, 40534 row lock(s), undo log entries 1662
MySQL thread id 31568472, OS thread handle 140582284355328, query id 786068544 1.1.1.1 ops updating
UPDATE wms_wave_detail SET active=0,updater=null,update_date='2021-11-10 17:01:09.441' WHERE outbound_notice_id IN (1452477021064904706)
Transaction 2 was waiting for a lock while executing [UPDATE wms_wave_detail …].
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 9028 page no 90 n bits 136 index PRIMARY of table `btr_wms`.`wms_outbound_notice_header` trx id 468429218 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 8978 page no 395 n bits 88 index PRIMARY of table `btr_wms`.`wms_wave_detail` trx id 468429218 lock_mode X waiting
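Locks held by transaction 2:
Transaction id [468429218] holds an X record lock on the PRIMARY index of table btr_wms.wms_outbound_notice_header.
Locks transaction 2 is waiting for:
Transaction id [468429218] is waiting for an X lock on the PRIMARY index of table btr_wms.wms_wave_detail. Here the log shows "lock_mode X" without "locks rec but not gap", which denotes a next-key lock (the record plus the gap before it).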
*** WE ROLL BACK TRANSACTION (1)
In the end InnoDB rolled back transaction 1 to break the deadlock.
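The log entries above can be read as edges of a wait-for graph: transaction 468429219 waits for a lock held by 468429218, and 468429218 waits for a lock on wms_wave_detail. The log does not name the holder of that second lock; with only two transactions in the deadlock it is inferred to be 468429219. The following sketch shows the idea of finding a cycle in such a graph. It is a simplified illustration, not how InnoDB implements deadlock detection, and `WaitForGraph` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified wait-for graph: each transaction waits for at most one other transaction.
class WaitForGraph {
    // waiter trx id -> trx id of the transaction holding the lock it wants
    private final Map<Long, Long> waitsFor = new HashMap<>();

    void addWait(long waiter, long holder) {
        waitsFor.put(waiter, holder);
    }

    /** True if following the wait edges from 'start' eventually revisits a transaction. */
    boolean hasCycleFrom(long start) {
        Set<Long> seen = new HashSet<>();
        Long current = start;
        while (current != null && seen.add(current)) {
            current = waitsFor.get(current);
        }
        // The loop ended on a revisit (cycle) if current is non-null,
        // or on a transaction that waits for nothing if current is null.
        return current != null;
    }

    public static void main(String[] args) {
        WaitForGraph graph = new WaitForGraph();
        graph.addWait(468429219L, 468429218L); // trx 1 waits for the header row held by trx 2
        graph.addWait(468429218L, 468429219L); // trx 2 waits for wms_wave_detail (holder inferred)
        System.out.println(graph.hasCycleFrom(468429219L)); // prints true
    }
}
```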
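From the log above we can conclude that the two transactions are waiting on each other's locks:
Transaction 1 is waiting for a lock on wms_outbound_notice_header, which transaction 2 holds.
Transaction 2 is waiting for a lock on wms_wave_detail, which transaction 1 must be holding, since it is the only other transaction in the deadlock.
From the exception log we can locate the code that transaction 1 was executing: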
@Override
@Transactional(rollbackFor = Exception.class)
public Long createWave(Xxx xxx) {
    ...
    waveDetailService.insertBatch(saveWaveDetailList);
    // The update below is where the transaction waits for the lock
    outboundNoticeHeaderService.updateBatchById(updateOutboundNoticeHeaderList);
    ...
}
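Next, we look for code that updates outboundNoticeHeader first and then goes on to update waveDetail.
Based on the SQL that transaction 2 was executing, we can locate where waveDetail is updated: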
@Override
@Transactional(rollbackFor = Exception.class)
public void saveWaveErrorMsg(Xxx xxx) {
    ...
    waveDetailService.update(WaveDetailEntity.builder().active(BooleanStatus.FREEZE).build(), detailQueryWrapper);
    ...
    outboundNoticeHeaderService.update(
            OutboundNoticeHeaderEntity.builder()
                    .status(OutboundNoticeStatusEnum.CREATE.getType())
                    .waveErrorMsg(errorMsg)
                    .build(),
            outboundNoticeHeaderEntityQueryWrapper);
}
This looks fine: it also updates waveDetail first and then outboundNoticeHeader.
But a look at the calling code reveals the real problem:
@Transactional(rollbackFor = Exception.class)
public void execute(Xxx xxx){
    ...
    for (OutboundNoticeHeaderDTO outboundNoticeHeaderDTO : outboundNoticeHeaderDTOS) {
        ...
        waveHeaderService.saveWaveErrorMsg(errorMsg, Lists.newArrayList(outboundNoticeHeaderDTO));
        ...
    }
    ...
}
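The method [saveWaveErrorMsg] is called inside a for loop, and the method [execute] that contains the loop opens a transaction of its own. With Spring's default propagation (REQUIRED), each [saveWaveErrorMsg] call joins that outer transaction, so its locks are not released until [execute] finishes.
The first iteration of [saveWaveErrorMsg] acquires the lock on outboundNoticeHeader and keeps it, so when the second iteration reaches [waveDetailService.update] it has to wait for the lock on waveDetail while still holding the outboundNoticeHeader lock.
That is exactly the reverse of the order in which [createWave] acquires its locks, and so the deadlock occurs.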
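The analysis showed that opening a transaction on [execute] was a mistake, so removing its @Transactional fixed the problem: each [saveWaveErrorMsg] call now runs in its own transaction and releases its locks before the next iteration starts.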
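The reasoning can be checked with a small model. The sketch below treats each transaction as an ordered list of the tables it locks and reports a possible deadlock when two transactions lock the same pair of tables in opposite order. This is a deliberately coarse assumption: InnoDB locks rows rather than tables, so the model only illustrates the ordering argument. `LockOrderCheck` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Coarse table-level model of lock ordering within a transaction.
class LockOrderCheck {
    /** All pairs "a>b" where table a is locked before a different table b in one transaction. */
    static Set<String> orderedPairs(List<String> acquisitions) {
        Set<String> pairs = new HashSet<>();
        for (int i = 0; i < acquisitions.size(); i++) {
            for (int j = i + 1; j < acquisitions.size(); j++) {
                String a = acquisitions.get(i);
                String b = acquisitions.get(j);
                if (!a.equals(b)) {
                    pairs.add(a + ">" + b);
                }
            }
        }
        return pairs;
    }

    /** True if one transaction locks a before b while the other locks b before a. */
    static boolean mayDeadlock(List<String> tx1, List<String> tx2) {
        Set<String> first = orderedPairs(tx1);
        for (String pair : orderedPairs(tx2)) {
            String[] ab = pair.split(">");
            if (first.contains(ab[1] + ">" + ab[0])) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> createWave = Arrays.asList("wms_wave_detail", "wms_outbound_notice_header");
        List<String> oneCall = Arrays.asList("wms_wave_detail", "wms_outbound_notice_header");

        // Two loop iterations inside a single outer transaction keep all their locks.
        List<String> loopInOneTx = new ArrayList<>(oneCall);
        loopInOneTx.addAll(oneCall);

        System.out.println(mayDeadlock(createWave, loopInOneTx)); // prints true
        System.out.println(mayDeadlock(createWave, oneCall));     // prints false
    }
}
```

With the outer transaction, the second iteration locks wms_wave_detail after wms_outbound_notice_header, which conflicts with the order used by createWave. With one transaction per call, both code paths use the same order and the model reports no conflict.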
A simplified diagram: