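MySQL Production Deadlock Case Analysis

Project Scenario

The project has two tables, c_bill (bills) and c_bill_detail (bill details). Their structures are shown below (only the relevant columns are kept):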

CREATE TABLE `c_bill_detail` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `bill_detail_no` varchar(32)  NOT NULL DEFAULT '' COMMENT 'statement number',
  `receivable_date` datetime(3) DEFAULT NULL COMMENT 'receivable date',
  `order_type` varchar(20) NOT NULL DEFAULT '',
  `bill_no` varchar(32)  NOT NULL DEFAULT '' COMMENT 'bill number',
  `invoice_amount` decimal(12,4) NOT NULL COMMENT 'invoiced amount',
  `active` tinyint NOT NULL DEFAULT '1' COMMENT 'logical-delete flag',
  PRIMARY KEY (`id`) USING BTREE,
  KEY `idx_bill_no` (`bill_no`) USING BTREE
) ENGINE=InnoDB COMMENT='customer bill detail';

CREATE TABLE `c_bill` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `bill_no` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'statement number',
  `should_receive_amount` decimal(12,4) NOT NULL COMMENT 'total receivable',
  `actual_should_receive_amount` decimal(12,4) NOT NULL COMMENT 'actual receivable amount',
  `invoice_status` tinyint DEFAULT NULL COMMENT 'invoice status (dictionary: invoice-status)',
  `invoice_amount` decimal(12,4) NOT NULL COMMENT 'invoiced amount',
  PRIMARY KEY (`id`) USING BTREE,
  UNIQUE KEY `uk_bill_no` (`bill_no`) USING BTREE
) ENGINE=InnoDB COMMENT='customer bill';

c_bill and c_bill_detail have a one-to-many relationship: c_bill.invoice_amount is aggregated from the invoice_amount values of the matching c_bill_detail rows.
The aggregation SQL is:

UPDATE c_bill
      SET invoice_amount = (SELECT ifnull(sum(invoice_amount), 0)
                            FROM c_bill_detail
                            WHERE bill_no = #{billNo}
                              AND active = 1
                              AND order_type in ('sale_order', 'supplement_order', 'subject_sale_order')),
          invoice_date   = #{invoiceDate},
          invoice_status =
              CASE
                  WHEN invoice_amount = should_receive_amount THEN 1
                  WHEN invoice_amount = 0 THEN 0
                  ELSE 2
                  END
      where bill_no = #{billNo}
        and active = 1;
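
The aggregation rule in this statement can be sketched outside the database. The Python below is only an illustration under stated assumptions: the function names, the sample rows, and the order type other_order are hypothetical, and it assumes the CASE expression sees the freshly assigned invoice_amount, since MySQL generally evaluates single-table UPDATE assignments from left to right.

```python
from decimal import Decimal

# Order types counted by the subquery in the UPDATE statement.
COUNTED_TYPES = {"sale_order", "supplement_order", "subject_sale_order"}


def invoice_status(invoice_amount, should_receive_amount):
    """Mirror the CASE expression: 1 if equal, 0 if zero, otherwise 2."""
    if invoice_amount == should_receive_amount:
        return 1
    if invoice_amount == 0:
        return 0
    return 2


def recompute_bill(details, should_receive_amount):
    """Sum the active, counted detail rows and derive the status from the new sum."""
    total = sum(
        (d["invoice_amount"] for d in details
         if d["active"] == 1 and d["order_type"] in COUNTED_TYPES),
        Decimal("0"),  # plays the role of IFNULL(SUM(...), 0)
    )
    return total, invoice_status(total, should_receive_amount)


# Hypothetical detail rows for one bill.
details = [
    {"invoice_amount": Decimal("100.0000"), "active": 1, "order_type": "sale_order"},
    {"invoice_amount": Decimal("50.0000"), "active": 1, "order_type": "supplement_order"},
    {"invoice_amount": Decimal("70.0000"), "active": 0, "order_type": "sale_order"},   # logically deleted
    {"invoice_amount": Decimal("30.0000"), "active": 1, "order_type": "other_order"},  # type not counted
]

full = recompute_bill(details, Decimal("150"))
partial = recompute_bill(details[:1], Decimal("150"))
empty = recompute_bill([], Decimal("150"))
print(full, partial, empty)
```

The point to take away is that the new c_bill value depends on every detail row of the bill, which is why the statement has to read c_bill_detail while it updates c_bill.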

At the business level, after an invoice is issued for a bill, both c_bill_detail and c_bill are updated.

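Problem Description

One day an alert fired in production:
[Figure 1: alert log screenshot]
The log showed that a deadlock had occurred. Tracing it back to the code showed that it was related to the following SQL: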

UPDATE c_bill
      SET invoice_amount = (SELECT ifnull(sum(invoice_amount), 0)
                            FROM c_bill_detail
                            WHERE bill_no = #{billNo}
                              AND active = 1
                              AND order_type in ('sale_order', 'supplement_order', 'subject_sale_order')),
          invoice_date   = #{invoiceDate},
          invoice_status =
              CASE
                  WHEN invoice_amount = should_receive_amount THEN 1
                  WHEN invoice_amount = 0 THEN 0
                  ELSE 2
                  END
      where bill_no = #{billNo}
        and active = 1;

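The database lock analysis produced the following information:
[Figure 2: lock analysis output]
From it we can read the following:

  • Transaction 1 is waiting for an S lock on c_bill_detail; the lock is on the index PRIMARY (the primary key index, id)
  • Transaction 1 holds an X lock on c_bill; the lock is on the index uk_bill_no
  • Transaction 2 is waiting for an X lock on c_bill; the lock is on the index uk_bill_no
  • Transaction 2 holds an S lock on c_bill_detail; the lock is on the index PRIMARY (the primary key index, id)

So each transaction is waiting for a lock that the other one holds. The waits form a cycle, which is a deadlock.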
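
The cycle described by the four bullets can be checked mechanically with a wait-for graph. The sketch below is a simplified illustration, not how InnoDB implements deadlock detection: locks are modeled per (table, index) pair instead of per record, lock modes are ignored, and the names txn1, txn2, build_wait_for, and find_cycle are made up for this example.

```python
def build_wait_for(holds, waits):
    """Map each waiting transaction to the transactions holding what it wants."""
    return {
        waiter: [holder for holder, resources in holds.items()
                 if holder != waiter and resource in resources]
        for waiter, resource in waits.items()
    }


def find_cycle(graph):
    """Depth-first search; return one cycle as a list of nodes, or None."""
    visiting, done, path = set(), set(), []

    def dfs(node):
        visiting.add(node)
        path.append(node)
        for nxt in graph.get(node, []):
            if nxt in visiting:  # back edge: a cycle is closed
                return path[path.index(nxt):] + [nxt]
            if nxt not in done:
                cycle = dfs(nxt)
                if cycle:
                    return cycle
        visiting.discard(node)
        done.add(node)
        path.pop()
        return None

    for start in list(graph):
        if start not in done:
            cycle = dfs(start)
            if cycle:
                return cycle
    return None


# Lock state taken from the four bullets above.
holds = {
    "txn1": {("c_bill", "uk_bill_no")},
    "txn2": {("c_bill_detail", "PRIMARY")},
}
waits = {
    "txn1": ("c_bill_detail", "PRIMARY"),
    "txn2": ("c_bill", "uk_bill_no"),
}

graph = build_wait_for(holds, waits)
cycle = find_cycle(graph)
print(graph)  # {'txn1': ['txn2'], 'txn2': ['txn1']}
print(cycle)  # ['txn1', 'txn2', 'txn1']
```

When such a cycle exists, none of the transactions in it can make progress on its own, so the database has to roll one of them back.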

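Root Cause Analysis:

Analyzing the deadlock data

The figure above points to two indexes involved in the deadlock: the primary key index of c_bill_detail and the unique index uk_bill_no of c_bill. We know the c_bill lock comes from the aggregation SQL mentioned above, but which operation touched c_bill_detail?

First, use the audit log to find out what the two transactions were doing at the time:
[Figure 3: audit log for thread 3213915]
The above shows the operations of thread 3213915 (transaction A). It ran the following update on c_bill_detail: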

-- SQL
UPDATE
  c_bill_detail
SET
  receivable_date = '2023-11-15 00:00:00',
  invoice_status = 2,
  invoice_amount = 305412,
  updater = '管理员',
  updater_code = 'ADMINISTRATOR',
  update_time = '2023-12-08 17:47:52.382000'
WHERE id = 146947
  AND active = 1;

The operations of thread 3213754 (transaction B) are as follows:
[Figure 4: audit log for thread 3213754]

UPDATE
  c_bill_detail
SET
  receivable_date = '2023-11-15 00:00:00',
  invoice_status = 2,
  invoice_amount = 305412,
  updater = '管理员',
  updater_code = 'ADMINISTRATOR',
  update_time = '2023-12-08 17:47:52.381000'
WHERE id = 147471
  AND active = 1;

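As shown above, transaction A updated the c_bill_detail row with id = 146947, and transaction B updated the c_bill_detail row with id = 147471.

The audit log also shows that both transactions updated c_bill, and that both updated the same row, bill_no = XSZD202309070005.

Transaction A:
[Figure 5: audit log of transaction A updating c_bill]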

UPDATE
  c_bill
SET
  invoice_amount =
  (SELECT
    IFNULL(SUM(invoice_amount), 0)
  FROM
    c_bill_detail
  WHERE bill_no = 'XSZD202309070005'
    AND active = 1
    AND order_type IN (
      'sale_order',
      'supplement_order',
      'subject_sale_order'
    )),
  invoice_date = '2023-12-08 00:00:00',
  invoice_status =
  CASE
    WHEN invoice_amount = should_receive_amount
    THEN 1
    WHEN invoice_amount = 0
    THEN 0
    ELSE 2
  END
WHERE bill_no = 'XSZD202309070005'
  AND active = 1;

Transaction B:

[Figure 6: audit log of transaction B updating c_bill]

-- SQL
UPDATE
  c_bill
SET
  invoice_amount =
  (SELECT
    IFNULL(SUM(invoice_amount), 0)
  FROM
    c_bill_detail
  WHERE bill_no = 'XSZD202309070005'
    AND active = 1
    AND order_type IN (
      'sale_order',
      'supplement_order',
      'subject_sale_order'
    )),
  invoice_date = '2023-12-08 00:00:00',
  invoice_status =
  CASE
    WHEN invoice_amount = should_receive_amount
    THEN 1
    WHEN invoice_amount = 0
    THEN 0
    ELSE 2
  END
WHERE bill_no = 'XSZD202309070005'
  AND active = 1;

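As the figure above shows, transaction B's update of c_bill eventually failed and was rolled back (because of the deadlock).

Checking the data shows that the c_bill_detail rows with id = 146947 and id = 147471 both have bill_no = XSZD202309070005:
[Figure 7: query result for the two c_bill_detail rows]
At this point we have only found how the data is related. It is still unclear why the deadlock occurred, so the next step is to reproduce it in another environment.

The SELECT statement takes a shared read lock

To make the deadlock easier to reproduce, the order in which the SQL ran in production is summarized below:

[Figure 8: table of the SQL execution order]

In a local database, pick the c_bill_detail rows with id = 19380 and id = 19381. Both have bill_no = XSZD202211226768.

[Figure 9: query result for the two local rows]
Open two transactions, run the SQL in the order given in the table above, and watch the locks along the way:

Transaction A updates id = 19380: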

Database changed
mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> UPDATE
    ->   c_bill_detail
    -> SET
    ->   receivable_date = '2023-11-15 00:00:00',
    ->   invoice_status = 2,
    ->   invoice_amount = 305412,
    ->   updater = '管理员',
    ->   updater_code = 'ADMINISTRATOR',
    ->   update_time = '2023-12-08 17:47:52.382000'
    -> WHERE id = 19380
    ->   AND active = 1;
Query OK, 1 row affected (0.03 sec)
Rows matched: 1  Changed: 1  Warnings: 0

Transaction B updates the row with id = 19381:

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> UPDATE
    ->   c_bill_detail
    -> SET
    ->   receivable_date = '2023-11-15 00:00:00',
    ->   invoice_status = 2,
    ->   invoice_amount = 305412,
    ->   updater = '管理员',
    ->   updater_code = 'ADMINISTRATOR',
    ->   update_time = '2023-12-08 17:47:52.381000'
    -> WHERE id = 19381
    ->   AND active = 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

Now look at the locks:

mysql> select * from performance_schema.data_locks\G;
*************************** 1. row ***************************
  // table intention lock omitted
*************************** 2. row ***************************
               ENGINE: INNODB
       ENGINE_LOCK_ID: 139645347952208:230:8:12:139645358792496
ENGINE_TRANSACTION_ID: 65810
            THREAD_ID: 563867
             EVENT_ID: 34
        OBJECT_SCHEMA: fresh
          OBJECT_NAME: c_bill_detail
       PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
           INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 139645358792496
            LOCK_TYPE: RECORD
            LOCK_MODE: X,REC_NOT_GAP
          LOCK_STATUS: GRANTED
            LOCK_DATA: 19381
*************************** 3. row ***************************
       // table lock omitted
*************************** 4. row ***************************
               ENGINE: INNODB
       ENGINE_LOCK_ID: 139645347951400:230:8:9:139645358786480
ENGINE_TRANSACTION_ID: 65809
            THREAD_ID: 563866
             EVENT_ID: 34
        OBJECT_SCHEMA: fresh
          OBJECT_NAME: c_bill_detail
       PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
           INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 139645358786480
            LOCK_TYPE: RECORD
            LOCK_MODE: X,REC_NOT_GAP
          LOCK_STATUS: GRANTED
            LOCK_DATA: 19380
4 rows in set (0.00 sec)

As shown above, the c_bill_detail rows with id = 19381 and id = 19380 each have an X lock, as expected.

Next, transaction A runs the update on c_bill:

mysql> UPDATE
    ->   c_bill
    -> SET
    ->   invoice_amount =
    ->   (SELECT
    ->     IFNULL(SUM(invoice_amount), 0)
    ->   FROM
    ->     c_bill_detail
    ->   WHERE bill_no = 'XSZD202211226768'
    ->     AND active = 1
    ->     AND order_type IN (
    ->       'sale_order',
    ->       'supplement_order',
    ->       'subject_sale_order'
    ->     )),
    ->   invoice_date = '2023-12-08 00:00:00',
    ->   invoice_status =
    ->   CASE
    ->     WHEN invoice_amount = should_receive_amount
    ->     THEN 1
    ->     WHEN invoice_amount = 0
    ->     THEN 0
    ->     ELSE 2
    ->   END
    -> WHERE bill_no = 'XSZD202211226768'
    ->   AND active = 1;

Transaction A is now blocked:
[Figure 10: transaction A's session blocked]

Check the locks again:

mysql> select * from performance_schema.data_locks\G;
*************************** 1. row ***************************
           // table lock
*************************** 2. row ***************************
               ENGINE: INNODB
       ENGINE_LOCK_ID: 139645347952208:230:8:16:139645358792496
ENGINE_TRANSACTION_ID: 65820
            THREAD_ID: 563867
             EVENT_ID: 43
        OBJECT_SCHEMA: fresh
          OBJECT_NAME: c_bill_detail
       PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
           INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 139645358792496
            LOCK_TYPE: RECORD
            LOCK_MODE: X,REC_NOT_GAP
          LOCK_STATUS: GRANTED
            LOCK_DATA: 19381
*************************** 3. row ***************************
               // c_bill table intention lock
*************************** 4. row ***************************
              // c_bill_detail table intention lock
*************************** 5. row ***************************
               ENGINE: INNODB
       ENGINE_LOCK_ID: 139645347951400:230:8:15:139645358786480
ENGINE_TRANSACTION_ID: 65819
            THREAD_ID: 563866
             EVENT_ID: 44
        OBJECT_SCHEMA: fresh
          OBJECT_NAME: c_bill_detail
       PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
           INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 139645358786480
            LOCK_TYPE: RECORD
            LOCK_MODE: X,REC_NOT_GAP
          LOCK_STATUS: GRANTED
            LOCK_DATA: 19380
*************************** 6. row ***************************
               ENGINE: INNODB
       ENGINE_LOCK_ID: 139645347951400:229:5:6:139645358786824
ENGINE_TRANSACTION_ID: 65819
            THREAD_ID: 563866
             EVENT_ID: 45
        OBJECT_SCHEMA: fresh
          OBJECT_NAME: c_bill
       PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
           INDEX_NAME: uk_bill_no
OBJECT_INSTANCE_BEGIN: 139645358786824
            LOCK_TYPE: RECORD
            LOCK_MODE: X,REC_NOT_GAP
          LOCK_STATUS: GRANTED
            LOCK_DATA: 'XSZD202211226768', 5117
*************************** 7. row ***************************
               ENGINE: INNODB
       ENGINE_LOCK_ID: 139645347951400:229:7:6:139645358787168
ENGINE_TRANSACTION_ID: 65819
            THREAD_ID: 563866
             EVENT_ID: 45
        OBJECT_SCHEMA: fresh
          OBJECT_NAME: c_bill
       PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
           INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 139645358787168
            LOCK_TYPE: RECORD
            LOCK_MODE: X,REC_NOT_GAP
          LOCK_STATUS: GRANTED
            LOCK_DATA: 5117
*************************** 8. row ***************************
               ENGINE: INNODB
       ENGINE_LOCK_ID: 139645347951400:230:58:117:139645358787512
ENGINE_TRANSACTION_ID: 65819
            THREAD_ID: 563866
             EVENT_ID: 45
        OBJECT_SCHEMA: fresh
          OBJECT_NAME: c_bill_detail
       PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
           INDEX_NAME: idx_bill_no
OBJECT_INSTANCE_BEGIN: 139645358787512
            LOCK_TYPE: RECORD
            LOCK_MODE: S
          LOCK_STATUS: GRANTED
            LOCK_DATA: 'XSZD202211226768', 19380
*************************** 9. row ***************************
               ENGINE: INNODB
       ENGINE_LOCK_ID: 139645347951400:230:58:118:139645358787512
ENGINE_TRANSACTION_ID: 65819
            THREAD_ID: 563866
             EVENT_ID: 45
        OBJECT_SCHEMA: fresh
          OBJECT_NAME: c_bill_detail
       PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
           INDEX_NAME: idx_bill_no
OBJECT_INSTANCE_BEGIN: 139645358787512
            LOCK_TYPE: RECORD
            LOCK_MODE: S
          LOCK_STATUS: GRANTED
            LOCK_DATA: 'XSZD202211226768', 19381
*************************** 10. row ***************************
               ENGINE: INNODB
       ENGINE_LOCK_ID: 139645347951400:230:8:16:139645358787856
ENGINE_TRANSACTION_ID: 65819
            THREAD_ID: 563866
             EVENT_ID: 45
        OBJECT_SCHEMA: fresh
          OBJECT_NAME: c_bill_detail
       PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
           INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 139645358787856
            LOCK_TYPE: RECORD
            LOCK_MODE: S,REC_NOT_GAP
          LOCK_STATUS: WAITING
            LOCK_DATA: 19381
10 rows in set (0.00 sec)

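Row 10 shows that transaction A has requested an S lock on the c_bill_detail record with id = 19381 and that the request is in the WAITING state. Rows 8 and 9 show where the request comes from: the subquery scans idx_bill_no and takes S locks on the matching index entries, then needs the corresponding primary key record, which transaction B has already X-locked (row 2).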
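
Reading ten rows of output by eye is error-prone, so the matching can be written down as a small script. This is a hedged sketch: the rows are abbreviated copies of rows 2, 5, 6, and 10 above, the helper names conflicts and find_blockers are invented for the example, and the conflict rule looks only at S versus X and ignores gap-lock semantics.

```python
# Abbreviated copies of rows 2, 5, 6 and 10 from the data_locks output above.
rows = [
    {"ENGINE_TRANSACTION_ID": 65820, "OBJECT_NAME": "c_bill_detail", "INDEX_NAME": "PRIMARY",
     "LOCK_MODE": "X,REC_NOT_GAP", "LOCK_STATUS": "GRANTED", "LOCK_DATA": "19381"},
    {"ENGINE_TRANSACTION_ID": 65819, "OBJECT_NAME": "c_bill_detail", "INDEX_NAME": "PRIMARY",
     "LOCK_MODE": "X,REC_NOT_GAP", "LOCK_STATUS": "GRANTED", "LOCK_DATA": "19380"},
    {"ENGINE_TRANSACTION_ID": 65819, "OBJECT_NAME": "c_bill", "INDEX_NAME": "uk_bill_no",
     "LOCK_MODE": "X,REC_NOT_GAP", "LOCK_STATUS": "GRANTED", "LOCK_DATA": "'XSZD202211226768', 5117"},
    {"ENGINE_TRANSACTION_ID": 65819, "OBJECT_NAME": "c_bill_detail", "INDEX_NAME": "PRIMARY",
     "LOCK_MODE": "S,REC_NOT_GAP", "LOCK_STATUS": "WAITING", "LOCK_DATA": "19381"},
]


def conflicts(held_mode, requested_mode):
    """Simplified rule: two record locks conflict unless both are S."""
    return "X" in (held_mode.split(",")[0], requested_mode.split(",")[0])


def find_blockers(lock_rows):
    """Pair every WAITING lock with conflicting GRANTED locks on the same record."""
    pairs = []
    for w in lock_rows:
        if w["LOCK_STATUS"] != "WAITING":
            continue
        for g in lock_rows:
            same_record = all(g[k] == w[k] for k in ("OBJECT_NAME", "INDEX_NAME", "LOCK_DATA"))
            if (g["LOCK_STATUS"] == "GRANTED" and same_record
                    and g["ENGINE_TRANSACTION_ID"] != w["ENGINE_TRANSACTION_ID"]
                    and conflicts(g["LOCK_MODE"], w["LOCK_MODE"])):
                pairs.append((w["ENGINE_TRANSACTION_ID"], g["ENGINE_TRANSACTION_ID"],
                              w["OBJECT_NAME"], w["INDEX_NAME"], w["LOCK_DATA"]))
    return pairs


blockers = find_blockers(rows)
print(blockers)  # [(65819, 65820, 'c_bill_detail', 'PRIMARY', '19381')]
```

Transaction 65819 (transaction A) is waiting for transaction 65820 (transaction B) on the primary key record 19381. MySQL 8.0 also exposes performance_schema.data_lock_waits, which reports requesting and blocking lock pairs directly and may be more convenient than matching rows by hand.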

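Then transaction B also runs the update on c_bill, and a deadlock occurs.
[Figure 11: deadlock error in transaction B's session]
To summarize the locking sequence so far: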

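  • Transaction A first acquires the X lock on the c_bill_detail record with id = 19380.
  • Transaction B then acquires the X lock on the c_bill_detail record with id = 19381. The two locks are on different records, so there is no contention and both statements run normally.
  • Transaction A updates c_bill. This puts an X lock on the c_bill record with bill_no = XSZD202211226768, and it also requests an S lock on the c_bill_detail record with id = 19381, which has to wait.
  • Transaction B updates c_bill. It needs the X lock on the same c_bill record, which A holds, and its subquery would likewise request an S lock on the c_bill_detail record with id = 19380, which A has X-locked. A is waiting for B and B is waiting for A, so the deadlock forms.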
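
The four steps can be replayed against a toy lock table to see where each transaction stops. This is a minimal sketch with loud assumptions: LockTable is a hypothetical class, only the primary key record locks and the c_bill row lock are modeled (the S locks on idx_bill_no are left out), and transaction B is assumed to block on the c_bill row first, as the production lock analysis reported for transaction 2.

```python
from collections import defaultdict


class LockTable:
    """Toy record-lock table in which S is compatible only with S."""

    def __init__(self):
        self.granted = defaultdict(list)  # record -> [(txn, mode), ...]
        self.waiting = {}                 # txn -> (record, mode)

    def _blockers(self, txn, record, mode):
        # Another transaction blocks us if either lock is exclusive.
        return sorted({t for t, m in self.granted[record]
                       if t != txn and "X" in (m, mode)})

    def request(self, txn, record, mode):
        """Grant the lock if possible; otherwise queue it. Return the blockers."""
        blockers = self._blockers(txn, record, mode)
        if blockers:
            self.waiting[txn] = (record, mode)
        else:
            self.granted[record].append((txn, mode))
        return blockers

    def wait_for(self):
        """Wait-for graph: waiting transaction -> transactions blocking it."""
        return {txn: self._blockers(txn, record, mode)
                for txn, (record, mode) in self.waiting.items()}


locks = LockTable()
BILL = ("c_bill", "XSZD202211226768")
DETAIL_19380 = ("c_bill_detail", 19380)
DETAIL_19381 = ("c_bill_detail", 19381)

step1 = locks.request("A", DETAIL_19380, "X")  # A updates detail 19380
step2 = locks.request("B", DETAIL_19381, "X")  # B updates detail 19381
locks.request("A", BILL, "X")                  # A starts updating c_bill
step3 = locks.request("A", DETAIL_19381, "S")  # A's subquery reads detail 19381
step4 = locks.request("B", BILL, "X")          # B starts updating c_bill

print(step1, step2, step3, step4)  # [] [] ['B'] ['A']
print(locks.wait_for())            # {'A': ['B'], 'B': ['A']}
```

Steps 1 and 2 are granted immediately, step 3 leaves A waiting for B, and step 4 leaves B waiting for A, which is the same cycle seen in production.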

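A few points to note:

  1. S locks and X locks are incompatible, so one has to wait for the other.
  2. An ordinary SELECT statement takes no locks (under the default isolation level it is a consistent nonlocking read), but a SELECT used inside an UPDATE statement to compute the assigned value does take shared locks.
  3. The shared lock mainly ensures that each read sees the latest value (the rows cannot be modified while they are being read).

That concludes the analysis of how the deadlock formed in production. For more on X locks and S locks, see: Locks in InnoDB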
