mysql版本:8.0.20
调试IDE:Visual Studio Code
看如下一条sql语句:
# table T (id int, name varchar(20))
delete from T where id = 10;
MySQL在执行的过程中,是如何加锁呢?
再看下面这条语句:
select * from T where id = 5;
那这条语句呢?其实这其中包含太多知识点了。要回答这两个问题,首先需要了解一些知识。
在MySQL默认存储引擎InnoDB中,实现的是基于多版本的并发控制协议——MVCC(Multi-Version Concurrency Control)(注:与MVVC相对的,是基于锁的并发控制,Lock-Based Concurrency Control)。
其中MVCC最大的好处是:读不加锁,读写不冲突。在读多写少的OLTP应用中,读写不冲突是非常重要的,极大的提高了系统的并发性能,在现阶段,几乎所有的RDBMS,都支持MVCC。其实,MVCC就一句话总结:同一份数据临时保存多个版本的一种方式,进而实现并发控制。
在MVCC并发控制中,读操作可以分为两类:快照读与当前读。
也叫做聚簇索引。在InnoDB中,数据的组织方式就是聚簇索引:完整的记录,储存在主键索引中,通过主键索引,就可以获取记录中所有的列。
也就是最左优先,这条原则针对的是组合索引和前缀索引,理解:
1、在MySQL中,进行条件过滤时,是按照向右匹配直到遇到范围查询(>,<,between,like)就停止匹配,比如说a = 1 and b = 2 and c > 3 and d = 4 如果建立(a, b, c, d)顺序的索引,d是用不到索引的,如果建立(a, b, d, c)索引就都会用上,其中a,b,d的顺序可以任意调整。
2、= 和 in 可以乱序,比如 a = 1 and b = 2 and c = 3 建立(a, b, c)索引可以任意顺序,MySQL的查询优化器会优化索引可以识别的形式。
传统的RDMS加锁的一个原则,就是2PL(Two-Phase Locking,二阶段锁)。也就是说锁操作分为两个阶段:加锁阶段和解锁阶段,并且保证加锁阶段和解锁阶段不想交。也就是说在一个事务中,不管有多少条增删改,都是在加锁阶段加锁,在 commit 后,进入解锁阶段,才会全部解锁。
MySQL/InnoDB中,定义了四种隔离级别:
在InnoDB中完整行锁包含三部分:
SQL1: select * from t1 where id = 10;(不加锁。因为MySQL是使用多版本并发控制的,读不加锁。)
SQL2: delete from t1 where id = 10;(需根据多种情况进行分析)
假设t1表上有索引,执行计划一定会选择使用索引进行过滤 (索引扫描),根据以下组合,来进行分析。
注:在前面八种组合下,也就是RC,RR隔离级别下,SQL1:select操作均不加锁,采用的是快照读,因此在下面的讨论中就忽略了,主要讨论SQL2:delete操作的加锁。
id是主键,Read Committed隔离级别,给定SQL:delete from t1 where id = 10; 只需要将主键上,id = 10的记录加上X锁即可。如下图所示:
结论:id是主键时,此SQL只需要在id=10这条记录上加X锁即可。
示例:
#准备数据
mysql> create table t1 (id int,name varchar(10));
mysql> alter table t1 add primary key (id);
mysql> insert into t1 values(1,'a'),(4,'c'),(7,'b'),(10,'a'),(20,'d'),(30,'b');
mysql> select * from t1;
+----+------+
| id | name |
+----+------+
| 1 | a |
| 4 | c |
| 7 | b |
| 10 | a |
| 20 | d |
| 30 | b |
+----+------+
6 rows in set (0.00 sec)
会话1
mysql> select @@tx_isolation;
+----------------+
| @@tx_isolation |
+----------------+
| READ-COMMITTED |
+----------------+
1 row in set, 1 warning (0.00 sec)
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> delete from t1 where id=10;
Query OK, 1 row affected (0.00 sec)
会话2
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from t1;
+----+------+
| id | name |
+----+------+
| 1 | a |
| 4 | c |
| 7 | b |
| 10 | a |
| 20 | d |
| 30 | b |
+----+------+
6 rows in set (0.00 sec)
mysql> update t1 set name='a1' where id=10;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> update t1 set name='a1' where id=11;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 0 Changed: 0 Warnings: 0
mysql> update t1 set name='a1' where id=7;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
从示例中可以看到会话1执行的delete操作,只对id=10加了X锁。
id不是主键,而是一个Unique的二级索引键值。那么在RC隔离级别下,delete from t1 where id = 10; 需要加什么锁呢?见下图:
此组合中,id是unique索引,而主键是name列。此时,加锁的情况由于组合一有所不同。由于id是unique索引,因此delete语句会选择走id列的索引进行where条件的过滤,在找到id=10的记录后,首先会将unique索引上的id=10索引记录加上X锁,同时,会根据读取到的name列,回主键索引(聚簇索引),然后将聚簇索引上的name = ‘d’ 对应的主键索引项加X锁。
为什么聚簇索引上的记录也要加锁?试想一下,如果并发的一个SQL,是通过主键索引来更新:update t1 set id = 100 where name = 'd';此时,如果delete语句没有将主键索引上的记录加锁,那么并发的update就会感知不到delete语句的存在,违背了同一记录上的更新/删除需要串行执行的约束。
示例:
准备数据
mysql> create table t1 (id int,name varchar(10));
Query OK, 0 rows affected (0.06 sec)
mysql> ALTER TABLE test.t1 ADD UNIQUE INDEX idx_id (id);
Query OK, 0 rows affected (0.07 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE test.t1 ADD PRIMARY KEY (name);
Query OK, 0 rows affected (0.11 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> insert into t1 values(1,'f'),(2,'zz'),(3,'b'),(5,'a'),(6,'c'),(10,'d');
Query OK, 6 rows affected (0.01 sec)
Records: 6 Duplicates: 0 Warnings: 0
会话1
mysql> begin;
Query OK, 0 rows affected (0.01 sec)
mysql> delete from t1 where id=10;
Query OK, 1 row affected (0.00 sec)
会话2
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from t1;
+------+------+
| id | name |
+------+------+
| 1 | f |
| 2 | zz |
| 3 | b |
| 5 | a |
| 6 | c |
| 10 | d |
+------+------+
6 rows in set (0.00 sec)
mysql> update t1 set id =100 where name='d';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> update t1 set id =100 where name='c';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update t1 set id =101 where name='a';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
结论:若id列是unique列,其上有unique索引。那么SQL需要加两个X锁,一个对应于id unique索引上的id = 10的记录,另一把锁对应于聚簇索引上的[name=’d’,id=10]的记录。
id列是一个普通索引。假设delete from t1 where id = 10; 语句,仍旧选择id列上的索引进行过滤where条件,那么此时会持有哪些锁?同样见下图:
由上图可以看出,首先,id列索引上,满足id = 10查询的记录,均加上X锁。同时,这些记录对应的主键索引上的记录也加上X锁。与组合二的唯一区别,组合二最多只有一个满足条件的记录,而在组合三中会将所有满足条件的记录全部加上锁。
结论:若id列上有非唯一索引,那么对应的所有满足SQL查询条件的记录,都会加上锁。同时,这些记录在主键索引上也会加上锁。
示例:
准备数据
mysql> create table t1 (id int,name varchar(10));
mysql> ALTER TABLE test.t1 ADD PRIMARY KEY (name);
mysql> alter table t1 add index idx_id (id);
mysql> insert into t1 values(2,'zz'),(6,'c'),(10,'b'),(10,'d'),(11,'f'),(15,'a');
会话1
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> delete from t1 where id=10;
Query OK, 2 rows affected (0.00 sec)
会话2
mysql> select * from t1;
+------+------+
| id | name |
+------+------+
| 2 | zz |
| 6 | c |
| 10 | b |
| 10 | d |
| 11 | f |
| 15 | a |
+------+------+
6 rows in set (0.00 sec)
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> update t1 set id=11 where name='b';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> update t1 set id=11 where name='d';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> update t1 set id=11 where name='f';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1 Changed: 0 Warnings: 0
mysql> update t1 set id=11 where name='c';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
相对于前面的组合,该组合相对特殊,因为id列上无索引,所以在 where id = 10 这个查询条件下,没法通过索引来过滤,因此只能全表扫描做过滤。对于该组合,MySQL又会进行怎样的加锁呢?看下图:
由于id列上无索引,因此只能走聚簇索引,进行全表扫描。由图可以看出满足条件的记录只有两条,但是,聚簇索引上的记录都会加上X锁。但在实际操作中,MySQL进行了改进,在进行过滤条件时,发现不满足条件后,会调用 unlock_row 方法,把不满足条件的记录放锁(违背了2PL原则)。这样做,保证了最后满足条件的记录加上锁,但是每条记录的加锁操作是不能省略的。
结论:若id列上没有索引,MySQL会走聚簇索引进行全表扫描过滤。由于是在MySQl Server层面进行的,因此每条记录无论是否满足条件,都会加上X锁,但是,为了效率考虑,MySQL在这方面进行了改进,在扫描过程中,若记录不满足过滤条件,会进行解锁操作。同时优化违背了2PL原则。
示例:
准备数据
mysql> create table t1 (id int,name varchar(10));
mysql> ALTER TABLE test.t1 ADD PRIMARY KEY (name);
mysql> insert into t1 values(5,'a'),(3,'b'),(10,'d'),(2,'f'),(10,'g'),(9,'zz');
会话1
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> delete from t1 where id=10;
Query OK, 2 rows affected (0.00 sec)
会话2
mysql> select * from t1;
+------+------+
| id | name |
+------+------+
| 5 | a |
| 3 | b |
| 10 | d |
| 2 | f |
| 10 | g |
| 9 | zz |
+------+------+
6 rows in set (0.00 sec)
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> update t1 set id=6 where name='a';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update t1 set id=6 where name='b';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update t1 set id=6 where name='d';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> update t1 set id=6 where name='f';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update t1 set id=6 where name='g';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> update t1 set id=6 where name='zz';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update t1 set id=6 where name='zzf';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 0 Changed: 0 Warnings: 0
实验结果与推倒的结论不一致,
实验结果看出只锁住了id=10的两行。
id列是主键列,Repeatable Read隔离级别,针对delete from t1 where id = 10; 这条SQL,加锁与组合一:”id主键 + RC“一致。
结论:id是主键是,此SQL语句只需要在id = 10这条记录上加上X锁即可。
示例:
mysql> create table t1 (id int,name varchar(10));
mysql> alter table t1 add primary key (id);
mysql> insert into t1 values(1,'a'),(4,'c'),(7,'b'),(10,'a'),(20,'d'),(30,'b');
mysql> select * from t1;
+----+------+
| id | name |
+----+------+
| 1 | a |
| 4 | c |
| 7 | b |
| 10 | a |
| 20 | d |
| 30 | b |
+----+------+
6 rows in set (0.00 sec)
mysql> select @@tx_isolation;
+-----------------+
| @@tx_isolation |
+-----------------+
| REPEATABLE-READ |
+-----------------+
1 row in set, 1 warning (0.00 sec)
会话1
mysql> begin;
Query OK, 0 rows affected (0.01 sec)
mysql> delete from t1 where id=10;
Query OK, 1 row affected (0.00 sec)
会话2
mysql> select * from t1;
+----+------+
| id | name |
+----+------+
| 1 | a |
| 4 | c |
| 7 | b |
| 10 | a |
| 20 | d |
| 30 | b |
+----+------+
6 rows in set (0.00 sec)
mysql> update t1 set name='a1' where id=10;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> update t1 set name='a1' where id=11;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 0 Changed: 0 Warnings: 0
mysql> update t1 set name='a1' where id=7;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
id唯一索引 + RR的加锁与id唯一索引,RC一致。两个X锁,id唯一索引满足条件的记录上一个,对应的聚簇索引上的记录一个。
示例:
准备数据
mysql> create table t1 (id int,name varchar(10));
mysql> ALTER TABLE test.t1 ADD UNIQUE INDEX idx_id (id);
mysql> ALTER TABLE test.t1 ADD PRIMARY KEY (name);
mysql> insert into t1 values(1,'f'),(2,'zz'),(3,'b'),(5,'a'),(6,'c'),(10,'d');
会话1
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> delete from t1 where id=10;
Query OK, 1 row affected (0.01 sec)
会话2
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from t1;
+------+------+
| id | name |
+------+------+
| 1 | f |
| 2 | zz |
| 3 | b |
| 5 | a |
| 6 | c |
| 10 | d |
+------+------+
6 rows in set (0.00 sec)
mysql> update t1 set id =100 where name='d';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> update t1 set id =100 where name='c';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update t1 set id =101 where name='a';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
在组合一到组合四中,隔离级别是Read Committed下,会出现幻读情况,但是在该组合Repeatable Read级别下,不会出现幻读情况,这是怎么回事呢?而MySQL又是如何给上述语句加锁呢?看下图:
该组合和组合三看起来很相似,但差别很大,在该组合中加入了一个间隙锁(Gap锁)。这个Gap锁就是相对于RC级别下,RR级别下不会出现幻读情况的关键。实质上,Gap锁不是针对于记录本身的,而是记录之间的Gap。所谓幻读,就是同一事务下,连续进行多次当前读,且读取一个范围内的记录(包括直接查询所有记录结果或者做聚合统计),发现结果不一致(标准档案一般指记录增多, 记录的减少应该也算是幻读)。
那么该如何解决这个问题呢?如何保证多次当前读返回一致的记录,那么就需要在多个当前读之间,其他事务不会插入新的满足条件的记录并提交。为了实现该结果,Gap锁就应运而生。
如图所示,有些位置可以插入新的满足条件的记录,考虑到B+树的有序性,满足条件的记录一定是具有连续性的。因此会在 [4, b], [10, c], [10, d], [20, e] 之间加上Gap锁。
Insert操作时,如insert(10, aa),首先定位到 [4, b], [10, c]间,然后插入在插入之前,会检查该Gap是否加锁了,如果被锁上了,则Insert不能加入记录。因此通过第一次当前读,会把满足条件的记录加上X锁,还会加上三把Gap锁,将可能插入满足条件记录的3个Gap锁上,保证后续的Insert不能插入新的满足 id = 10 的记录,也就解决了幻读问题。
而在组合五,组合六中,同样是RR级别,但是不用加上Gap锁,在组合五中id是主键,组合六中id是Unique键,都能保证唯一性。一个等值查询,最多只能返回一条满足条件的记录,而且新的相同取值的记录是无法插入的。
结论:在RR隔离级别下,id列上有非唯一索引,对于上述的SQL语句;首先,通过id索引定位到第一条满足条件的记录,给记录加上X锁,并且给Gap加上Gap锁,然后在主键聚簇索引上满足相同条件的记录加上X锁,然后返回;之后读取下一条记录重复进行。直至第一条出现不满足条件的记录,此时,不需要给记录加上X锁,但是需要给Gap加上Gap锁,最后返回结果。
示例:
准备数据
mysql> create table t1 (id int,name varchar(10));
mysql> ALTER TABLE test.t1 ADD PRIMARY KEY (name);
mysql> alter table t1 add index idx_id (id);
mysql> insert into t1 values(1,'a'),(4,'b'),(10,'c'),(20,'e'),(10,'d');
会话1
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> delete from t1 where id=10;
Query OK, 2 rows affected (0.00 sec)
会话2
mysql> select * from t1;
+------+------+
| id | name |
+------+------+
| 1 | a |
| 4 | b |
| 10 | c |
| 10 | d |
| 20 | e |
+------+------+
5 rows in set (0.00 sec)
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into t1 values(6,'aa');
Query OK, 1 row affected (0.00 sec)
mysql> insert into t1 values(6,'bb');
Query OK, 1 row affected (0.01 sec)
mysql> insert into t1 values(6,'cc');
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> insert into t1 values(7,'cc');
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> insert into t1 values(8,'cc');
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> insert into t1 values(9,'cc');
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> insert into t1 values(10,'cc');
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> insert into t1 values(11,'cc');
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> insert into t1 values(11,'ff');
Query OK, 1 row affected (0.00 sec)
mysql> insert into t1 values(11,'g');
Query OK, 1 row affected (0.00 sec)
该组合中,id列上无索引,只能进行全表扫描,那么该如何加锁,看下图:
如图,可以看出这是一个很恐怖的事情,全表每条记录要加X锁,每个Gap加上Gap锁,如果表上存在大量数据时,又是什么情景呢?这种情况下,这个表,除了不加锁的快照读,其他任何加锁的并发SQL,均不能执行,不能更新,删除,插入,这样,全表锁死。
当然,和组合四一样,MySQL进行了优化,就是semi-consistent read。semi-consistent read开启的情况下,对于不满足条件的记录,MySQL会提前放锁,同时Gap锁也会释放。而semi-consistent read是如何触发:要么在Read Committed隔离级别下;要么在Repeatable Read隔离级别下,设置了 innodb_locks_unsafe_for_binlog 参数。
示例:
准备数据
mysql> create database cgwtest;
mysql> CREATE TABLE `t` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`d` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
//mysql> insert into t1 values(5,'a'),(3,'b'),(10,'d'),(2,'f'),(10,'g'),(9,'zz');
mysql> insert into t values(1,1),(5,5),(10,10);
会话1
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> delete from t where d=5;
Query OK, 1 rows affected (0.00 sec)
会话2
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from t;
+----+------+
| id | d |
+----+------+
| 1 | 1 |
| 5 | 5 |
| 10 | 10 |
+----+------+
3 rows in set (0.00 sec)
mysql> insert into t values(2,2);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
(注:以下流程和源码是主流程和重点关注的点!)
/* Basic lock modes */
enum lock_mode {
LOCK_IS = 0, /* intention shared */
LOCK_IX, /* intention exclusive */
LOCK_S, /* shared */
LOCK_X, /* exclusive */
LOCK_AUTO_INC, /* locks the auto-inc counter of a table
in an exclusive mode */
LOCK_NONE, /* this is used elsewhere to note consistent read */
LOCK_NUM = LOCK_NONE, /* number of lock modes */
LOCK_NONE_UNSET = 255
};
ut_ad(gap_mode == LOCK_ORDINARY || gap_mode == LOCK_GAP ||
gap_mode == LOCK_REC_NOT_GAP);
#define ut_ad(EXPR) ut_a(EXPR)
/** Debug statement. Does nothing unless UNIV_DEBUG is defined. */调试断言
核心方法:
/** Sets a lock on a record.
mostly due to we cannot reposition a record in R-Tree (with the
nature of splitting)
@param[in] pcur cursor
@param[in] rec record
@param[in] index index
@param[in] offsets rec_get_offsets(rec, index)
@param[in] sel_mode select mode: SELECT_ORDINARY,
SELECT_SKIP_LOKCED, or SELECT_NO_WAIT
@param[in] mode lock mode
@param[in] type LOCK_ORDINARY, LOCK_GAP, or LOC_REC_NOT_GAP
@param[in] thr query thread
@param[in] mtr mtr
@return DB_SUCCESS, DB_SUCCESS_LOCKED_REC, or error code */
UNIV_INLINE
dberr_t sel_set_rec_lock(btr_pcur_t *pcur, const rec_t *rec,
dict_index_t *index, const ulint *offsets,
select_mode sel_mode, ulint mode, ulint type,
que_thr_t *thr, mtr_t *mtr) {
trx_t *trx;
dberr_t err = DB_SUCCESS;
const buf_block_t *block;
block = btr_pcur_get_block(pcur);
trx = thr_get_trx(thr);
trx_mutex_enter(trx);
ut_ad(trx_can_be_handled_by_current_thread(trx));
bool too_many_locks = (UT_LIST_GET_LEN(trx->lock.trx_locks) > 10000);
trx_mutex_exit(trx);
if (too_many_locks) {
if (buf_LRU_buf_pool_running_out()) {
return (DB_LOCK_TABLE_FULL);
}
}
if (index->is_clustered()) {
err = lock_clust_rec_read_check_and_lock(
lock_duration_t::REGULAR, block, rec, index, offsets, sel_mode,
static_cast(mode), type, thr);
} else {
if (dict_index_is_spatial(index)) {
if (type == LOCK_GAP || type == LOCK_ORDINARY) {
ut_ad(0);
ib::error(ER_IB_MSG_1026) << "Incorrectly request GAP lock "
"on RTree";
return (DB_SUCCESS);
}
err = sel_set_rtr_rec_lock(pcur, rec, index, offsets, sel_mode, mode,
type, thr, mtr);
} else {
err = lock_sec_rec_read_check_and_lock(
lock_duration_t::REGULAR, block, rec, index, offsets, sel_mode,
static_cast(mode), type, thr);
}
}
return (err);
}
dberr_t lock_clust_rec_read_check_and_lock(
const lock_duration_t duration, const buf_block_t *block, const rec_t *rec,
dict_index_t *index, const ulint *offsets, const select_mode sel_mode,
const lock_mode mode, const ulint gap_mode, que_thr_t *thr) {
dberr_t err;
ulint heap_no;
ut_ad(rec_offs_validate(rec, index, offsets));
if (srv_read_only_mode || index->table->is_temporary()) {
return (DB_SUCCESS);
}
heap_no = page_rec_get_heap_no(rec);
if (heap_no != PAGE_HEAP_NO_SUPREMUM) {
lock_rec_convert_impl_to_expl(block, rec, index, offsets);//隐示锁转显示锁
}
DEBUG_SYNC_C("after_lock_clust_rec_read_check_and_lock_impl_to_expl");
lock_mutex_enter();//系统锁
if (duration == lock_duration_t::AT_LEAST_STATEMENT) {
lock_protect_locks_till_statement_end(thr);
}
ut_ad(mode != LOCK_X ||
lock_table_has(thr_get_trx(thr), index->table, LOCK_IX));
ut_ad(mode != LOCK_S ||
lock_table_has(thr_get_trx(thr), index->table, LOCK_IS));
err = lock_rec_lock(false, sel_mode, mode | gap_mode, block, heap_no, index,
thr);
MONITOR_INC(MONITOR_NUM_RECLOCK_REQ);
lock_mutex_exit();
ut_ad(lock_rec_queue_validate(false, block, rec, index, offsets));
DEBUG_SYNC_C("after_lock_clust_rec_read_check_and_lock");
ut_ad(err == DB_SUCCESS || err == DB_SUCCESS_LOCKED_REC ||
err == DB_LOCK_WAIT || err == DB_DEADLOCK || err == DB_SKIP_LOCKED ||
err == DB_LOCK_NOWAIT);
return (err);
}
/** Tries to lock the specified record in the mode requested. If not immediately
possible, enqueues a waiting lock request. This is a low-level function
which does NOT look at implicit locks! Checks lock compatibility within
explicit locks. This function sets a normal next-key lock, or in the case
of a page supremum record, a gap type lock.
@param[in] impl if true, no lock is set if no wait is
necessary: we assume that the caller will
set an implicit lock
@param[in] sel_mode select mode: SELECT_ORDINARY,
SELECT_SKIP_LOCKED, or SELECT_NO_WAIT
@param[in] mode lock mode: LOCK_X or LOCK_S possibly ORed to
either LOCK_GAP or LOCK_REC_NOT_GAP
@param[in] block buffer block containing the record
@param[in] heap_no heap number of record
@param[in] index index of record
@param[in,out] thr query thread
@return DB_SUCCESS, DB_SUCCESS_LOCKED_REC, DB_LOCK_WAIT, DB_DEADLOCK,
DB_SKIP_LOCKED, or DB_LOCK_NOWAIT */
static dberr_t lock_rec_lock(bool impl, select_mode sel_mode, ulint mode,
const buf_block_t *block, ulint heap_no,
dict_index_t *index, que_thr_t *thr) {
ut_ad(lock_mutex_own());
ut_ad(!srv_read_only_mode);
/* Implicit locks are equivalent to LOCK_X|LOCK_REC_NOT_GAP, so we can omit
creation of explicit lock only if the requested mode was LOCK_REC_NOT_GAP */
ut_ad(!impl || ((mode & LOCK_REC_NOT_GAP) == LOCK_REC_NOT_GAP));
/* We try a simplified and faster subroutine for the most
common cases */
switch (lock_rec_lock_fast(impl, mode, block, heap_no, index, thr)) {
case LOCK_REC_SUCCESS
return (DB_SUCCESS);
case LOCK_REC_SUCCESS_CREATED:
return (DB_SUCCESS_LOCKED_REC);
case LOCK_REC_FAIL:
return (
lock_rec_lock_slow(impl, sel_mode, mode, block, heap_no, index, thr));
default:
ut_error;
}
}
delete语句调用堆栈:
lock_rec_lock(bool impl, select_mode sel_mode, ulint mode, const buf_block_t * block, ulint heap_no, dict_index_t * index, que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\lock\lock0lock.cc:1667)
lock_clust_rec_read_check_and_lock(const lock_duration_t duration, const buf_block_t * block, const rec_t * rec, dict_index_t * index, const ulint * offsets, const select_mode sel_mode, const lock_mode mode, const ulint gap_mode, que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\lock\lock0lock.cc:5701)
sel_set_rec_lock(btr_pcur_t * pcur, const rec_t * rec, dict_index_t * index, const ulint * offsets, select_mode sel_mode, ulint mode, ulint type, que_thr_t * thr, mtr_t * mtr) (\root\mysql-8.0.20\storage\innobase\row\row0sel.cc:1184)
row_search_mvcc(unsigned char * buf, page_cur_mode_t mode, row_prebuilt_t * prebuilt, ulint match_mode, const ulint direction) (\root\mysql-8.0.20\storage\innobase\row\row0sel.cc:5214)
ha_innobase::general_fetch(ha_innobase * const this, uchar * buf, uint direction, uint match_mode) (\root\mysql-8.0.20\storage\innobase\handler\ha_innodb.cc:9949)
ha_innobase::rnd_next(ha_innobase * const this, uchar * buf) (\root\mysql-8.0.20\storage\innobase\handler\ha_innodb.cc:10226)
handler::ha_rnd_next(handler * const this, uchar * buf) (\root\mysql-8.0.20\sql\handler.cc:2966)
TableScanIterator::Read(TableScanIterator * const this) (\root\mysql-8.0.20\sql\records.cc:423)
Sql_cmd_delete::delete_from_single_table(Sql_cmd_delete * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_delete.cc:503)
Sql_cmd_delete::execute_inner(Sql_cmd_delete * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_delete.cc:823)
Sql_cmd_dml::execute(Sql_cmd_dml * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_select.cc:725)
mysql_execute_command(THD * thd, bool first_level) (\root\mysql-8.0.20\sql\sql_parse.cc:3471)
mysql_parse(THD * thd, Parser_state * parser_state) (\root\mysql-8.0.20\sql\sql_parse.cc:5306)
dispatch_command(THD * thd, const COM_DATA * com_data, enum_server_command command) (\root\mysql-8.0.20\sql\sql_parse.cc:1776)
do_command(THD * thd) (\root\mysql-8.0.20\sql\sql_parse.cc:1274)
handle_connection(void * arg) (\root\mysql-8.0.20\sql\conn_handler\connection_handler_per_thread.cc:302)
pfs_spawn_thread(void * arg) (\root\mysql-8.0.20\storage\perfschema\pfs.cc:2854)
libpthread.so.0!start_thread (未知源:0)
libc.so.6!clone (未知源:0)
执行删除:
row_upd_clust_step(upd_node_t * node, que_thr_t * const thr) (\root\mysql-8.0.20\storage\innobase\row\row0upd.cc:2982)
row_upd(upd_node_t * node, que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\row\row0upd.cc:3175)
row_upd_step(que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\row\row0upd.cc:3306)
row_update_for_mysql_using_upd_graph(const unsigned char * mysql_rec, row_prebuilt_t * prebuilt) (\root\mysql-8.0.20\storage\innobase\row\row0mysql.cc:2347)
row_update_for_mysql(const unsigned char * mysql_rec, row_prebuilt_t * prebuilt) (\root\mysql-8.0.20\storage\innobase\row\row0mysql.cc:2443)
ha_innobase::delete_row(ha_innobase * const this, const uchar * record) (\root\mysql-8.0.20\storage\innobase\handler\ha_innodb.cc:9374)
handler::ha_delete_row(handler * const this, const uchar * buf) (\root\mysql-8.0.20\sql\handler.cc:7894)
Sql_cmd_delete::delete_from_single_table(Sql_cmd_delete * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_delete.cc:528)
Sql_cmd_delete::execute_inner(Sql_cmd_delete * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_delete.cc:823)
Sql_cmd_dml::execute(Sql_cmd_dml * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_select.cc:725)
mysql_execute_command(THD * thd, bool first_level) (\root\mysql-8.0.20\sql\sql_parse.cc:3471)
mysql_parse(THD * thd, Parser_state * parser_state) (\root\mysql-8.0.20\sql\sql_parse.cc:5306)
dispatch_command(THD * thd, const COM_DATA * com_data, enum_server_command command) (\root\mysql-8.0.20\sql\sql_parse.cc:1776)
do_command(THD * thd) (\root\mysql-8.0.20\sql\sql_parse.cc:1274)
handle_connection(void * arg) (\root\mysql-8.0.20\sql\conn_handler\connection_handler_per_thread.cc:302)
pfs_spawn_thread(void * arg) (\root\mysql-8.0.20\storage\perfschema\pfs.cc:2854)
libpthread.so.0!start_thread (未知源:0)
libc.so.6!clone (未知源:0)
核心方法:
/** Checks if some other transaction has a conflicting explicit lock request
in the queue, so that we have to wait.
@return lock or NULL */
static const lock_t *lock_rec_other_has_conflicting(
ulint mode, /*!< in: LOCK_S or LOCK_X,
possibly ORed to LOCK_GAP or
LOC_REC_NOT_GAP,
LOCK_INSERT_INTENTION */
const buf_block_t *block, /*!< in: buffer block containing
the record */
ulint heap_no, /*!< in: heap number of the record */
const trx_t *trx) /*!< in: our transaction */
{
ut_ad(lock_mutex_own());
ut_ad(!(mode & ~(ulint)(LOCK_MODE_MASK | LOCK_GAP | LOCK_REC_NOT_GAP |
LOCK_INSERT_INTENTION)));
ut_ad(!(mode & LOCK_PREDICATE));
ut_ad(!(mode & LOCK_PRDT_PAGE));
RecID rec_id{block, heap_no};
const bool is_supremum = rec_id.is_supremum();
return (Lock_iter::for_each(rec_id, [=](const lock_t *lock) {
return (!(lock_rec_has_to_wait(trx, mode, lock, is_supremum)));
}));
}
/** Iterate over all the locks on a specific row
@param[in] rec_id Iterate over locks on this row
@param[in] f Function to call for each entry
@param[in] hash_table The hash table to iterate over
@return lock where the callback returned false */
template
static const lock_t *for_each(const RecID &rec_id, F &&f,
hash_table_t *hash_table = lock_sys->rec_hash) {
ut_ad(lock_mutex_own());
auto list = hash_get_nth_cell(hash_table,
hash_calc_hash(rec_id.m_fold, hash_table));
for (auto lock = first(list, rec_id); lock != nullptr;
lock = advance(rec_id, lock)) {
ut_ad(lock->is_record_lock());
if (!std::forward(f)(lock)) {
return (lock);
}
}
return (nullptr);
}
};
/** Checks if a lock request for a new lock has to wait for request lock2.
@return true if new lock has to wait for lock2 to be removed */
UNIV_INLINE
bool lock_rec_has_to_wait(
const trx_t *trx, /*!< in: trx of new lock */
ulint type_mode, /*!< in: precise mode of the new lock
to set: LOCK_S or LOCK_X, possibly
ORed to LOCK_GAP or LOCK_REC_NOT_GAP,
LOCK_INSERT_INTENTION */
const lock_t *lock2, /*!< in: another record lock; NOTE that
it is assumed that this has a lock bit
set on the same record as in the new
lock we are setting */
bool lock_is_on_supremum)
/*!< in: true if we are setting the
lock on the 'supremum' record of an
index page: we know then that the lock
request is really for a 'gap' type lock */
{
ut_ad(trx && lock2);
ut_ad(lock_get_type_low(lock2) == LOCK_REC);
const bool is_hp = trx_is_high_priority(trx);
if (trx != lock2->trx &&
!lock_mode_compatible(static_cast(LOCK_MODE_MASK & type_mode),
lock_get_mode(lock2))) {
/* If our trx is High Priority and the existing lock is WAITING and not
high priority, then we can ignore it. */
if (is_hp && lock2->is_waiting() && !trx_is_high_priority(lock2->trx)) {
return (false);
}
/* We have somewhat complex rules when gap type record locks
cause waits */
if ((lock_is_on_supremum || (type_mode & LOCK_GAP)) &&
!(type_mode & LOCK_INSERT_INTENTION)) {
/* Gap type locks without LOCK_INSERT_INTENTION flag
do not need to wait for anything. This is because
different users can have conflicting lock types
on gaps. */
return (false);
}
if (!(type_mode & LOCK_INSERT_INTENTION) && lock_rec_get_gap(lock2)) {
/* Record lock (LOCK_ORDINARY or LOCK_REC_NOT_GAP
does not need to wait for a gap type lock */
return (false);
}
if ((type_mode & LOCK_GAP) && lock_rec_get_rec_not_gap(lock2)) {
/* Lock on gap does not need to wait for
a LOCK_REC_NOT_GAP type lock */
return (false);
}
if (lock_rec_get_insert_intention(lock2)) {
/* No lock request needs to wait for an insert
intention lock to be removed. This is ok since our
rules allow conflicting locks on gaps. This eliminates
a spurious deadlock caused by a next-key lock waiting
for an insert intention lock; when the insert
intention lock was granted, the insert deadlocked on
the waiting next-key lock.
Also, insert intention locks do not disturb each
other. */
return (false);
}
return (true);
}
return (false);
}
调用堆栈
lock_rec_other_has_conflicting(ulint mode, const buf_block_t * block, ulint heap_no, const trx_t * trx) (\root\mysql-8.0.20\storage\innobase\lock\lock0lock.cc:805)
lock_rec_insert_check_and_lock(ulint flags, const rec_t * rec, buf_block_t * block, dict_index_t * index, que_thr_t * thr, mtr_t * mtr, ulint * inherit) (\root\mysql-8.0.20\storage\innobase\lock\lock0lock.cc:5291)
btr_cur_ins_lock_and_undo(ulint flags, btr_cur_t * cursor, dtuple_t * entry, que_thr_t * thr, mtr_t * mtr, ulint * inherit) (\root\mysql-8.0.20\storage\innobase\btr\btr0cur.cc:2621)
btr_cur_optimistic_insert(ulint flags, btr_cur_t * cursor, ulint ** offsets, mem_heap_t ** heap, dtuple_t * entry, rec_t ** rec, big_rec_t ** big_rec, que_thr_t * thr, mtr_t * mtr) (\root\mysql-8.0.20\storage\innobase\btr\btr0cur.cc:2841)
row_ins_clust_index_entry_low(uint32_t flags, ulint mode, dict_index_t * index, ulint n_uniq, dtuple_t * entry, que_thr_t * thr, bool dup_chk_only) (\root\mysql-8.0.20\storage\innobase\row\row0ins.cc:2515)
row_ins_clust_index_entry(dict_index_t * index, dtuple_t * entry, que_thr_t * thr, bool dup_chk_only) (\root\mysql-8.0.20\storage\innobase\row\row0ins.cc:3095)
row_ins_index_entry(dict_index_t * index, dtuple_t * entry, uint32_t & multi_val_pos, que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\row\row0ins.cc:3286)
row_ins_index_entry_step(ins_node_t * node, que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\row\row0ins.cc:3424)
row_ins(ins_node_t * node, que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\row\row0ins.cc:3542)
row_ins_step(que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\row\row0ins.cc:3666)
row_insert_for_mysql_using_ins_graph(const unsigned char * mysql_rec, row_prebuilt_t * prebuilt) (\root\mysql-8.0.20\storage\innobase\row\row0mysql.cc:1585)
row_insert_for_mysql(const unsigned char * mysql_rec, row_prebuilt_t * prebuilt) (\root\mysql-8.0.20\storage\innobase\row\row0mysql.cc:1715)
ha_innobase::write_row(ha_innobase * const this, uchar * record) (\root\mysql-8.0.20\storage\innobase\handler\ha_innodb.cc:8530)
handler::ha_write_row(handler * const this, uchar * buf) (\root\mysql-8.0.20\sql\handler.cc:7837)
write_record(THD * thd, TABLE * table, COPY_INFO * info, COPY_INFO * update) (\root\mysql-8.0.20\sql\sql_insert.cc:2111)
Sql_cmd_insert_values::execute_inner(Sql_cmd_insert_values * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_insert.cc:621)
Sql_cmd_dml::execute(Sql_cmd_dml * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_select.cc:725)
mysql_execute_command(THD * thd, bool first_level) (\root\mysql-8.0.20\sql\sql_parse.cc:3471)
mysql_parse(THD * thd, Parser_state * parser_state) (\root\mysql-8.0.20\sql\sql_parse.cc:5306)
dispatch_command(THD * thd, const COM_DATA * com_data, enum_server_command command) (\root\mysql-8.0.20\sql\sql_parse.cc:1776)
结论:在Repeatable Read隔离级别下,如果进行全表扫描的当前读,那么会锁上表上的所有记录,并且所有的Gap加上Gap锁,杜绝所有的 delete/update/insert 操作。当然在MySQL中,可以触发 semi-consistent read来缓解锁开销与并发影响,但是semi-consistent read本身也会带来其他的问题,不建议使用。
在最后组合中,对于上诉的删除SQL语句,加锁过程和组合八一致。但是,对于查询语句(比如select * from T1 where id = 10)来说,在RC,RR隔离级别下,都是快照读,不加锁。在Serializable隔离级别下,无论是查询语句也会加锁,也就是说快照读不存在了,MVCC降级为Lock-Based CC。
结论:在MySQL/InnoDB中,所谓的读不加锁,并不适用于所有的情况,而是和隔离级别有关。在Serializable隔离级别下,所有的操作都会加锁。
数据库事务的4个特性:
原子性(Atomic): 事务中的多个操作,不可分割,要么都成功,要么都失败; All or Nothing.
一致性(Consistency): 事务操作之后, 数据库所处的状态和业务规则是一致的; 比如a,b账户相互转账之后,总金额不变;
隔离性(Isolation): 多个事务之间就像是串行执行一样,不相互影响;
持久性(Durability): 事务提交后被持久化到永久存储.
其中 隔离性 分为了四种:
READ UNCOMMITTED:可以读取未提交的数据,未提交的数据称为脏数据,所以又称脏读。此时:幻读,不可重复读和脏读均允许;
READ COMMITTED:只能读取已经提交的数据;此时:允许幻读和不可重复读,但不允许脏读,所以RC隔离级别要求解决脏读;
REPEATABLE READ:同一个事务中多次执行同一个select,读取到的数据没有发生改变;此时:允许幻读,但不允许不可重复读和脏读,所以RR隔离级别要求解决不可重复读;
SERIALIZABLE: 幻读,不可重复读和脏读都不允许,所以serializable要求解决幻读;
脏读:可以读取未提交的数据。RC 要求解决脏读;
不可重复读:同一个事务中多次执行同一个select, 读取到的数据发生了改变(被其它事务update并且提交);
可重复读:同一个事务中多次执行同一个select, 读取到的数据没有发生改变(一般使用MVCC实现);RR各级级别要求达到可重复读的标准;
幻读:同一个事务中多次执行同一个select, 读取到的数据行发生改变。也就是行数减少或者增加了(被其它事务delete/insert并且提交)。SERIALIZABLE要求解决幻读问题;
这里一定要区分 不可重复读 和 幻读:
不可重复读的重点是修改:
同样的条件的select, 你读取过的数据, 再次读取出来发现值不一样了
幻读的重点在于新增或者删除:
同样的条件的select, 第1次和第2次读出来的记录数不一样
从结果上来看, 两者都是为多次读取的结果不一致。但如果你从实现的角度来看, 它们的区别就比较大:
对于前者, 在RC下只需要锁住满足条件的记录,就可以避免被其它事务修改,也就是 select for update, select in share mode; RR隔离下使用MVCC实现可重复读;
对于后者, 要锁住满足条件的记录及所有这些记录之间的gap,也就是需要 gap lock。
而ANSI SQL标准没有从隔离程度进行定义,而是定义了事务的隔离级别,同时定义了不同事务隔离级别解决的三大并发问题:
Isolation Level |
Dirty Read |
Unrepeatable Read |
Phantom Read |
Read UNCOMMITTED |
YES |
YES |
YES |
READ COMMITTED |
NO |
YES |
YES |
READ REPEATABLE |
NO |
NO |
YES |
SERIALIZABLE |
NO |
NO |
NO |
除了MySQL默认采用RR隔离级别之外,其它几大数据库都是采用RC隔离级别。
但是他们的实现也是极其不一样的。Oracle仅仅实现了RC 和 SERIALIZABLE隔离级别。默认采用RC隔离级别,解决了脏读。但是允许不可重复读和幻读。其SERIALIZABLE则解决了脏读、不可重复读、幻读。
MySQL的实现:MySQL默认采用RR隔离级别,SQL标准是要求RR解决不可重复读的问题,但是因为MySQL采用了gap lock,所以实际上MySQL的RR隔离级别也解决了幻读的问题。那么MySQL的SERIALIZABLE是怎么回事呢?其实MySQL的SERIALIZABLE采用了经典的实现方式,对读和写都加锁。
MySQL数据库中默认隔离级别为RR,但是实际情况是使用RC 和 RR隔离级别的都不少。好像淘宝、网易都是使用的 RC 隔离级别。那么在MySQL中 RC 和 RR有什么区别呢?我们该如何选择呢?为什么MySQL将RR作为默认的隔离级别呢?
1> 显然 RR 支持 gap lock(next-key lock),而RC则没有gap lock。因为MySQL的RR需要gap lock来解决幻读问题。而RC隔离级别则是允许存在不可重复读和幻读的。所以RC的并发一般要好于RR;
2> RC 隔离级别,通过 where 条件过滤之后,不符合条件的记录上的行锁,会释放掉(虽然这里破坏了“两阶段加锁原则”);但是RR隔离级别,即使不符合where条件的记录,也不会是否行锁和gap lock;所以从锁方面来看,RC的并发应该要好于RR;另外 insert into t select ... from s where 语句在s表上的锁也是不一样的。
2563 2048 + 512 + 3