MySQL8.0有一种复合索引可以跳过扫描范围访问方法,什么意思呢?简单的说,对于复合索引(a,b,c),我们以往常说“where a=xx and b=xx”是可以用到索引,但“where b=xx and c=xx”是用不到索引的。但MySQL8.0中,后者也是可以用到索引了。
我们建一张测试表,其测试数据是104万,如下:
root@localhost|test>select version();
+-----------+
| version() |
+-----------+
| 8.0.19 |
+-----------+
1 row in set (0.00 sec)
root@localhost|test>show create table t\G
*************************** 1. row ***************************
Table: t
Create Table: CREATE TABLE `t` (
`id` int NOT NULL AUTO_INCREMENT,
`dept` tinyint DEFAULT NULL comment '部门ID,1~10',
`name` varchar(30) COLLATE utf8mb4_bin DEFAULT NULL comment '姓名',
`create_time` datetime DEFAULT NULL comment '注册时间',
PRIMARY KEY (`id`),
KEY `idx_muti` (`create_time`,`name`,`dept`)
) ENGINE=InnoDB AUTO_INCREMENT=1441756 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)
root@localhost|test>select count(1) from t;
+----------+
| count(1) |
+----------+
| 1048576 |
+----------+
1 row in set (0.11 sec)
从上可看到这个表有一个复合索引idx_muti (create_time,name,dept),我们尝试直接查dept字段看是否能用到索引
root@localhost|test>explain select count(1) from t where dept=5;
+----+-------------+-------+------------+-------+---------------+----------+---------+------+---------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+---------+----------+--------------------------+
| 1 | SIMPLE | t | NULL | index | idx_muti | idx_muti | 131 | NULL | 1045864 | 10.00 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+---------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
root@localhost|test>select count(1) from t where dept=5;
+----------+
| count(1) |
+----------+
| 105271 |
+----------+
1 row in set (1.16 sec)
从执行计划可以看到对于复合索引(a,b,c),我们再也不需要非要先使用(a,b)为前提才能使用c了,感觉真的方便了不少!
慢着,这真正执行的耗时感觉有点不太对,我们尝试不走索引看看
root@localhost|test>explain select count(1) from t ignore index(idx_muti) where dept=5;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | t | NULL | ALL | NULL | NULL | NULL | NULL | 1045864 | 10.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
root@localhost|test>select count(1) from t ignore index(idx_muti) where dept=5;
+----------+
| count(1) |
+----------+
| 105271 |
+----------+
1 row in set (0.70 sec)
跳过扫描范围访问方法
请考虑以下情形:
CREATE TABLE t1 (f1 INT NOT NULL, f2 INT NOT NULL, PRIMARY KEY(f1, f2));
INSERT INTO t1 VALUES
(1,1), (1,2), (1,3), (1,4), (1,5),
(2,1), (2,2), (2,3), (2,4), (2,5);
INSERT INTO t1 SELECT f1, f2 + 5 FROM t1;
INSERT INTO t1 SELECT f1, f2 + 10 FROM t1;
INSERT INTO t1 SELECT f1, f2 + 20 FROM t1;
INSERT INTO t1 SELECT f1, f2 + 40 FROM t1;
ANALYZE TABLE t1;
EXPLAIN SELECT f1, f2 FROM t1 WHERE f2 > 40;
为了执行该查询,MySQL可以选择索引扫描以获取所有行(索引包括要选择的所有列),然后应用子句中的f2 > 40
条件WHERE
以产生最终结果集。
范围扫描比全索引扫描更有效,但是在这种情况下不能使用,因为f1
在第一个索引列上没有条件 。然而,如MySQL的8.0.13的,优化器可以执行多个范围扫描,一个用于的每个值f1
,使用一种称为跳跃扫描方法是类似于松散索引扫描(见节8.2.1.17,“优化GROUP BY”):
f1
(索引前缀)之间跳过 。f2 > 40
对其余索引部分上的条件, 对每个不同的前缀值执行子范围扫描。对于前面显示的数据集,算法的运行方式如下:
f1 = 1
)。f1 = 1 AND f2 > 40
)构造范围。f1 = 2
)。f1 = 2 AND f2 > 40
)构造范围。使用此策略可减少访问的行数,因为MySQL会跳过不符合每个构造范围的行。此跳过扫描访问方法适用于以下情况:
k
,] B_1,...,B_ m
,C [,D_1,...,D_ n
])。关键部分A和D可能为空,但B和C必须为非空。GROUP BY
或 DISTINCT
。k
必须是相等谓词,并且它们必须是常量。这包括 IN()
操作员。AND
的OR
条件: (cond1
(key_part1
) OR cond2
(key_part1
)) AND (cond1
(key_part2
) OR ...) AND ...
#删除原来的索引
root@localhost|test>drop index idx_muti on t;
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
#新建一个索引,以dept字段作为前缀字段。
root@localhost|test>create index idx_muti on t(dept, create_time);
Query OK, 0 rows affected (4.75 sec)
Records: 0 Duplicates: 0 Warnings: 0
root@localhost|test>explain select count(1) from t where create_time>='2020-01-01' and create_time<'2020-02-01';
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+----------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+----------------------------------------+
| 1 | SIMPLE | t | NULL | range | idx_muti | idx_muti | 8 | NULL | 116183 | 100.00 | Using where; Using index for skip scan |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+----------------------------------------+
1 row in set, 1 warning (0.00 sec)
root@localhost|test>select count(1) from t where create_time>='2020-01-01' and create_time<'2020-02-01';
+----------+
| count(1) |
+----------+
| 75932 |
+----------+
1 row in set (0.08 sec)
#重启MySQL
root@localhost|test>system systemctl restart mysql
#尝试使用全表扫描的方法
root@localhost|test>explain select count(1) from t ignore index(idx_muti) where create_time>='2020-01-01' and create_time<'2020-02-01';
ERROR 2006 (HY000): MySQL server has gone away #因为上面重启了mysql,因此此处链接丢失了需要重连
No connection. Trying to reconnect...
Connection id: 8
Current database: test
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | t | NULL | ALL | NULL | NULL | NULL | NULL | 1045864 | 11.11 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.03 sec)
root@localhost|test>select count(1) from t ignore index(idx_muti) where create_time>='2020-01-01' and create_time<'2020-02-01';
+----------+
| count(1) |
+----------+
| 75932 |
+----------+
1 row in set (0.71 sec)
可见此时使用索引的效率更高。因此使用新特性时要注意其适用范围,否则可能会适得其反。
官档参考:https://dev.mysql.com/doc/refman/8.0/en/range-optimization.html#range-access-skip-scan