技术分享 | 半一致性读对 Update 的优化

作者:赵黎明
爱可生 MySQL DBA 团队成员,Oracle 10g OCM,MySQL 5.7 OCP,擅长数据库性能问题诊断、事务与锁问题的分析等,负责处理客户 MySQL 及我司自研 DMP 平台日常运维中的问题,对开源数据库相关技术非常感兴趣。
本文来源:原创投稿
*爱可生开源社区出品,原创内容未经授权不得随意使用,转载请联系小编并注明来源。

什么是半一致性读

先看下官方的描述:

  • 是一种用在 Update 语句中的读操作(一致性读)的优化,是在 RC 事务隔离级别下与一致性读的结合。
  • 当 Update 语句的 where 条件中匹配到的记录已经上锁,会再次去 InnoDB 引擎层读取对应的行记录,判断是否真的需要上锁(第一次需要由 InnoDB 先返回一个最新的已提交版本)。
  • 只在 RC 事务隔离级别下或者是设置了 innodb_locks_unsafe_for_binlog=1 的情况下才会发生。
  • innodb_locks_unsafe_for_binlog 参数在 8.0 版本中已被去除(可见,这是一个可能会导致数据不一致的参数,官方也不建议使用了)。

测试案例

InnoDB 引擎的强大之处就在于它能完美地支持事务,而事务的一致性则是由事务隔离级别和并发事务锁来保证的。接下来,我们先通过 2 个测试案例来观察半一致性读会对事务产生哪些影响。

案例 1

  • RC 隔离级别,3 个 Session 执行事务语句
-- 创建测试表
root@localhost:mysqld.sock[zlm] <5.7.30-log>create table zlm.t(id int,sal int) engine innodb default character set utf8mb4;
Query OK, 0 rows affected (0.06 sec)

root@localhost:mysqld.sock[zlm] <5.7.30-log>show create table zlm.t\G
*************************** 1. row ***************************
       Table: t
Create Table: CREATE TABLE `t` (
  `id` int(11) DEFAULT NULL,
  `sal` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.01 sec)

-- 写入测试数据(创建插入数据的存储过程)
root@localhost:mysqld.sock[zlm] <5.7.30-log>drop procedure if exists zlm.proc_t;
Query OK, 0 rows affected (0.00 sec)

root@localhost:mysqld.sock[zlm] <5.7.30-log>delimiter $$
root@localhost:mysqld.sock[zlm] <5.7.30-log>create procedure zlm.proc_t()
    -> begin
    -> declare i int default 1;
    -> declare j int default 100;
    -> while i<11 do
    -> insert into t(id,sal) values(i,j);
    -> set i=i+1;
    -> set j=j+100;
    -> end while;
    -> end $$
Query OK, 0 rows affected (0.01 sec)

root@localhost:mysqld.sock[zlm] <5.7.30-log>delimiter ;
root@localhost:mysqld.sock[zlm] <5.7.30-log>select * from t;
+------+------+
| id | sal |
+------+------+
| 1 | 100 |
| 2 | 200 |
| 3 | 300 |
| 4 | 400 |
| 5 | 500 |
| 6 | 600 |
| 7 | 700 |
| 8 | 800 |
| 9 | 900 |
| 10 | 1000 |
+------+------+
10 rows in set (0.00 sec)

-- 开启RC隔离级别
root@localhost:mysqld.sock[zlm] <5.7.30-log>set @@global.tx_isolation='read-committed';
Query OK, 0 rows affected, 1 warning (0.00 sec)

注意,从8.0.3版本开始,去掉了tx_isolation参数,参数名只支持transaction_isolation

-- 开启两个新的Session(设置global参数后,仅对新连接生效)
-- 确认Session 1的隔离级别和线程ID
root@localhost:mysqld.sock[(none)] <5.7.30-log>show variables like 'tx_isolation';
+---------------+----------------+
| Variable_name | Value |
+---------------+----------------+
| tx_isolation | READ-COMMITTED |
+---------------+----------------+
1 row in set (0.01 sec)

root@localhost:mysqld.sock[(none)] <5.7.30-log>select connection_id();
+-----------------+
| connection_id() |
+-----------------+
| 8 |
+-----------------+
1 row in set (0.00 sec)

-- Session 1执行当前读的Select语句
root@localhost:mysqld.sock[zlm] <5.7.30-log>begin;select * from t where id>3 and id<6 for update;
Query OK, 0 rows affected (0.00 sec)

+------+------+
| id | sal |
+------+------+
| 4 | 400 |
| 5 | 500 |
+------+------+
2 rows in set (0.00 sec)

-- 查看加锁详情(需设置参数innodb_status_output_locks=on,否则看不到IX锁)
-- 线程8的事务1314,获取到了1个表级插入意向锁IX,2个记录锁,对应id=4,id=5的这两条记录
---TRANSACTION 1313, ACTIVE 7 sec
2 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 8, OS thread handle 140086065960704, query id 277 localhost root
TABLE LOCK table `zlm`.`t` trx id 1313 lock mode IX
RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1313 lock_mode X locks rec but not gap
Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000204; asc ;;
 1: len 6; hex 000000000515; asc ;;
 2: len 7; hex b0000001240110; asc $ ;;
 3: len 4; hex 80000004; asc ;;
 4: len 4; hex 80000190; asc ;;

Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000205; asc ;;
 1: len 6; hex 000000000516; asc ;;
 2: len 7; hex b1000001250110; asc % ;;
 3: len 4; hex 80000005; asc ;;
 4: len 4; hex 800001f4; asc ;;

-- 确认Session 2的隔离级别和线程ID
root@localhost:mysqld.sock[(none)] <5.7.30-log>show variables like 'tx_isolation';
+---------------+----------------+
| Variable_name | Value |
+---------------+----------------+
| tx_isolation | READ-COMMITTED |
+---------------+----------------+
1 row in set (0.00 sec)

root@localhost:mysqld.sock[(none)] <5.7.30-log>select connection_id();
+-----------------+
| connection_id() |
+-----------------+
| 9 |
+-----------------+
1 row in set (0.00 sec)

-- Session 2执行当前读的Select语句
root@localhost:mysqld.sock[zlm] <5.7.30-log>begin;select * from t where id = 7 for update;
Query OK, 0 rows affected (0.00 sec)

ERROR 1205 (HY000): Unknown error 1205    ## 1205表示锁等待超时,这里吐槽下最新GA的5.7.30,遇到错误仅会抛出一个代码,没有错误描述,不方便排查

-- 查看加锁详情
-- innodb锁等待超时前,可以看到线程9的1314事务正在请求并等待1个记录锁,id=4的这条记录
---TRANSACTION 1314, ACTIVE 3 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 9, OS thread handle 140086065690368, query id 282 localhost root Sending data
select * from t where id = 7 for update
------- TRX HAS BEEN WAITING 3 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1314 lock_mode X locks rec but not gap waiting
Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000204; asc ;;
 1: len 6; hex 000000000515; asc ;;
 2: len 7; hex b0000001240110; asc $ ;;
 3: len 4; hex 80000004; asc ;;
 4: len 4; hex 80000190; asc ;;

------------------
TABLE LOCK table `zlm`.`t` trx id 1314 lock mode IX
RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1314 lock_mode X locks rec but not gap
RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1314 lock_mode X locks rec but not gap waiting
Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000204; asc ;;
 1: len 6; hex 000000000515; asc ;;
 2: len 7; hex b0000001240110; asc $ ;;
 3: len 4; hex 80000004; asc ;;
 4: len 4; hex 80000190; asc ;;

-- innodb锁等待超时后再观察一次,线程9的事务1314的事务仍然没有结束,对t表持有IX锁,并且仍然在等待id=4的行锁释放
---TRANSACTION 1314, ACTIVE 453 sec
2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 9, OS thread handle 140086065690368, query id 282 localhost root
TABLE LOCK table `zlm`.`t` trx id 1314 lock mode IX
RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1314 lock_mode X locks rec but not gap

-- 确认Session 3的隔离级别和线程ID
+---------------+----------------+
| Variable_name | Value |
+---------------+----------------+
| tx_isolation | READ-COMMITTED |
+---------------+----------------+
1 row in set (0.01 sec)

root@localhost:mysqld.sock[(none)] <5.7.30-log>select connection_id();
+-----------------+
| connection_id() |
+-----------------+
| 10 |
+-----------------+
1 row in set (0.00 sec)

-- Session 3执行Update语句
root@localhost:mysqld.sock[zlm] <5.7.30-log>begin;update t set sal = sal + 1 where id = 7;
Query OK, 0 rows affected (0.00 sec)

Query OK, 1 row affected (0.00 sec)

在Session 1事务仍然未结束的情况下,Session 3的事务未被阻塞,可以正常执行

-- 查看3个语句的执行计划
root@localhost:mysqld.sock[zlm] <5.7.30-log>explain select * from t where id>3 and id<6 for update;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | t | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 11.11 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

root@localhost:mysqld.sock[zlm] <5.7.30-log>explain select * from t where id = 7 for update;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | t | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 10.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

root@localhost:mysqld.sock[zlm] <5.7.30-log>explain update t set sal=sal+1 where id=7;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | UPDATE | t | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set (0.00 sec)

意料之中,由于 t 表没有索引,执行计划必然是走全表扫描,也就是每条被读取到的记录,都会上行锁。
那为何 Session 1 只锁了id=4,id=5 的这两条,并没有锁全表呢?而同样是请求 id=7 的记录,为何 Session 2 无法获取锁资源,Session 3 却能成功执行?
也许大家从上面的锁分析可以很快得到结论,由于 Session 1 只占用了 id=4、id=5 的行锁,那么 Session 3 去请求 id=7 的自然不会有冲突(似乎挺有道理)
那么 Session 2 对 id=7 的请求,为何会被锁定呢?

带着这些疑问,我们继续看第 2 个案例

案例 2

  • RC 隔离级别,这次 Session 1 执行的 Select 语句不带 where 条件
-- Session 1 执行当前读的 Select 语句
root@localhost:mysqld.sock[zlm] <5.7.30-log>begin;select * from t for update;
Query OK, 0 rows affected (0.00 sec)

+------+------+
| id | sal |
+------+------+
| 1 | 100 |
| 2 | 200 |
| 3 | 300 |
| 4 | 400 |
| 5 | 500 |
| 6 | 600 |
| 7 | 700 |
| 8 | 800 |
| 9 | 900 |
| 10 | 1000 |
+------+------+
10 rows in set (0.00 sec)

-- 查看加锁详情
-- 线程8的1317事务获得了1个IX表锁和10个X记录锁,即:把表中的10条记录都锁定了
-- t表上没有索引,MySQL默认会创建GEN_CLUST_INDEX的聚簇索引,而语句没有加where条件,只能走全表扫描,每条被读取的记录,都要在聚簇索引上加上记录锁(全表记录锁,相当于一个表锁了)
---TRANSACTION 1317, ACTIVE 5 sec
2 lock struct(s), heap size 1136, 10 row lock(s)
MySQL thread id 8, OS thread handle 140086065960704, query id 312 localhost root
TABLE LOCK table `zlm`.`t` trx id 1317 lock mode IX
RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1317 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000201; asc ;;
 1: len 6; hex 00000000050e; asc ;;
 2: len 7; hex ab0000011f0110; asc ;;
 3: len 4; hex 80000001; asc ;;
 4: len 4; hex 80000064; asc d;;

Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000202; asc ;;
 1: len 6; hex 00000000050f; asc ;;
 2: len 7; hex ac000001200110; asc ;;
 3: len 4; hex 80000002; asc ;;
 4: len 4; hex 800000c8; asc ;;

Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000203; asc ;;
 1: len 6; hex 000000000514; asc ;;
 2: len 7; hex af000001230110; asc # ;;
 3: len 4; hex 80000003; asc ;;
 4: len 4; hex 8000012c; asc ,;;

Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000204; asc ;;
 1: len 6; hex 000000000515; asc ;;
 2: len 7; hex b0000001240110; asc $ ;;
 3: len 4; hex 80000004; asc ;;
 4: len 4; hex 80000190; asc ;;

Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000205; asc ;;
 1: len 6; hex 000000000516; asc ;;
 2: len 7; hex b1000001250110; asc % ;;
 3: len 4; hex 80000005; asc ;;
 4: len 4; hex 800001f4; asc ;;

Record lock, heap no 7 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000206; asc ;;
 1: len 6; hex 000000000517; asc ;;
 2: len 7; hex b2000001260110; asc & ;;
 3: len 4; hex 80000006; asc ;;
 4: len 4; hex 80000258; asc X;;

Record lock, heap no 8 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000207; asc ;;
 1: len 6; hex 000000000518; asc ;;
 2: len 7; hex b3000001270110; asc ' ;;
 3: len 4; hex 80000007; asc ;;
 4: len 4; hex 800002bc; asc ;;

Record lock, heap no 9 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000208; asc ;;
 1: len 6; hex 000000000519; asc ;;
 2: len 7; hex b4000001280110; asc ( ;;
 3: len 4; hex 80000008; asc ;;
 4: len 4; hex 80000320; asc ;;

Record lock, heap no 10 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000209; asc ;;
 1: len 6; hex 00000000051a; asc ;;
 2: len 7; hex b5000001290110; asc ) ;;
 3: len 4; hex 80000009; asc ;;
 4: len 4; hex 80000384; asc ;;

Record lock, heap no 11 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 00000000020a; asc ;;
 1: len 6; hex 00000000051b; asc ;;
 2: len 7; hex b60000012a0110; asc * ;;
 3: len 4; hex 8000000a; asc ;;
 4: len 4; hex 800003e8; asc ;;

-- Session 2执行当前读的Select语句
root@localhost:mysqld.sock[zlm] <5.7.30-log>begin;select * from t where id = 7 for update;
Query OK, 0 rows affected (0.00 sec)

ERROR 1205 (HY000): Unknown error 1205

与之前案例 1 相同,也是锁等待超时退出

-- 查看加锁详情
-- 这次线程9的事务1318从第1条记录就开始加锁了
---TRANSACTION 1318, ACTIVE 4 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 9, OS thread handle 140086065690368, query id 315 localhost root Sending data
select * from t where id = 7 for update
------- TRX HAS BEEN WAITING 4 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1318 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000201; asc ;;
 1: len 6; hex 00000000050e; asc ;;
 2: len 7; hex ab0000011f0110; asc ;;
 3: len 4; hex 80000001; asc ;;
 4: len 4; hex 80000064; asc d;;

------------------
TABLE LOCK table `zlm`.`t` trx id 1318 lock mode IX
RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1318 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000201; asc ;;
 1: len 6; hex 00000000050e; asc ;;
 2: len 7; hex ab0000011f0110; asc ;;
 3: len 4; hex 80000001; asc ;;
 4: len 4; hex 80000064; asc d;;

-- Session 3执行Update语句
root@localhost:mysqld.sock[zlm] <5.7.30-log>begin;update t set sal = sal + 1 where id = 7;
Query OK, 0 rows affected (0.00 sec)

ERROR 1205 (HY000): Unknown error 1205

与案例 1 不同的是,这次Update语句也遭遇锁等待超时退出了

-- 查看加锁详情
-- 这次Session 3请求id=7的记录锁,该锁被Session 1持有未释放,导致了Session 3锁等待超时
---TRANSACTION 1319, ACTIVE 14 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 8 row lock(s)
MySQL thread id 10, OS thread handle 140086066231040, query id 322 localhost root updating
update t set sal = sal + 1 where id = 7
------- TRX HAS BEEN WAITING 14 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1319 lock_mode X locks rec but not gap waiting
Record lock, heap no 8 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000207; asc ;;
 1: len 6; hex 000000000518; asc ;;
 2: len 7; hex b3000001270110; asc ' ;;
 3: len 4; hex 80000007; asc ;;
 4: len 4; hex 800002bc; asc ;;

------------------
TABLE LOCK table `zlm`.`t` trx id 1319 lock mode IX
RECORD LOCKS space id 23 page no 3 n bits 80 index GEN_CLUST_INDEX of table `zlm`.`t` trx id 1319 lock_mode X locks rec but not gap waiting
Record lock, heap no 8 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 6; hex 000000000207; asc ;;
 1: len 6; hex 000000000518; asc ;;
 2: len 7; hex b3000001270110; asc ' ;;
 3: len 4; hex 80000007; asc ;;
 4: len 4; hex 800002bc; asc ;;

案例分析

由于 t 表上不存在索引,3 个会话执行的语句都是全表扫描,在 RC 事务隔离级别下,这些语句都是需要发起当前读的操作(读取t表上最新的已提交事务版本),需要对读取到的全部记录加上记录锁(即行锁、也可称为 InnoDB 锁,大多数情况下,RC 隔离级别没有 Gap 锁,因此基本不太会出现 Next-Key 锁,对高并发场景比较友好)。

案例 1

  • Session 1:开始需要对每条记录加锁,由于不需要维护可重复读,也不需要锁 Gap,当返回 MySQL Server 层通过 where 条件过滤后,最终只对 id=4、id=5 的记录加了锁。
  • Session 2:从 id=1 开始读取记录并加锁,当读取到 id=4 的记录时,由于 Session 1 先对 id=4 的记录上了锁,就无法再对其进行加锁操作,我们看到它一直在等待 id=4 的 X 锁,直到锁等待超时报错,为何是 id=4,而不是 id=5?因为是按聚簇索引一条条读取记录的,所以锁也需要一条条加,当上一条记录的锁资源没获取到,就不会对下一条记录加锁。
  • Session 3:同样地,最开始也需要对读取到的记录一条条加锁,由于 id=7 的记录与 id=4、id=5 上的行锁并不冲突,此处可以利用半一致性读对 Update 的优化特性,提前将 id=7 上的行锁释放掉了,因此 Update 不会被阻塞,事务得以正常执行。

案例 2

  • Session 1:Select 语句没有用 where 条件,通过全表扫描访问到的所有记录都无法通过 MySQL Server 层过滤,因此将 t 表的全部记录都上了 X 锁。
  • Session 2:由于 Session 1 已经将全部记录都上了 X 锁,Session 2 当前读的 Select 操作由于无法获取任何记录的 X 锁,就被阻塞了。
  • Session 3:同样地,Session 1 持有的全记录 X 锁,使 Session 3 的 where 条件落到了匹配的区间内,表示 Session 1 对 id=7 的行确实需要更新,必须上锁,因此 Session 3 的 Update 被阻塞。

总结

  • 在 RC 事务隔离级别下,Update 语句可以利用到半一致性读的特性,会多进行一次判断,当 where 条件匹配到的记录与当前持有锁的事务中的记录不冲突时,就会提前释放 InnoDB 锁,虽然这样做违背了二阶段加锁协议,但却可以减少锁冲突,提高事务并发能力,是一种很好的优化行为。

参考链接

https://dev.mysql.com/doc/ref...
https://dev.mysql.com/doc/ref...
https://my.oschina.net/JKOPER...

你可能感兴趣的:(mysql,数据库)