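Performance matters a great deal to backend developers. API response time directly affects user experience, so how long a SQL statement takes to run is critical. We usually shorten execution time by restructuring the statement and by adding indexes, but before you can optimize a statement you need to know where its problem lies. For that you need a good tool: EXPLAIN.
Let's look at the table that the indexes belong to.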
CREATE TABLE `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(20) NOT NULL,
`age` int(3) NOT NULL,
`sex` int(1) NOT NULL COMMENT '0: female, 1: male',
`marital_status` int(1) DEFAULT NULL COMMENT '0: single, 1: married, 2: divorced, 3: remarried',
`create_date` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `marital_status_index` (`marital_status`)
) ENGINE=InnoDB AUTO_INCREMENT=149 DEFAULT CHARSET=utf8;
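As you can see, the table has only one secondary index, marital_status_index on marital_status. So why didn't I index age and sex?
Business logic involving age usually queries by range. A range condition does not disable an index by itself (a selective range still gives a range scan, as the BETWEEN example on age further down shows), but when the range covers a large share of the table the optimizer tends to skip the index, so an index on age pays off less often. As for sex, it has only two distinct values, 0 and 1, so its selectivity is too low for an index to be worth much.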
To check whether a query actually uses an index in MySQL, prefix it with the EXPLAIN statement. Let's see what it reports.
mysql> explain select age from user where marital_status = 1;
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | user | NULL | ref | marital_age_index | marital_age_index | 4 | const | 29 | 100.00 | Using index |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+-------------+
1 row in set (0.00 sec)
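The plan above uses marital_age_index, and later plans also use age_index, although neither appears in the CREATE TABLE statement. The post never shows how they were created, so the following is only a sketch under the assumption that they were added afterwards. The column lists are taken from the index definition quoted later in the post and from the index name.

```sql
-- Assumed DDL: the original post does not show these statements.
ALTER TABLE `user`
  ADD INDEX `marital_age_index` (`marital_status`, `age`),  -- composite index
  ADD INDEX `age_index` (`age`);                            -- single-column index

-- List the indexes that now exist on the table.
SHOW INDEX FROM `user`;
```

Since some plans list only one of these indexes in possible_keys, the set of indexes was probably changed between captures.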
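Let's go through what each column means.
no dependency
has a dependency
type (the access type, i.e. the level of the query)
Ranking, from best to worst:
system > const > eq_ref > ref > range > index > ALL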
mysql> explain select * from user where id = 35;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | user | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
1 row in set (0.00 sec)
Range conditions such as BETWEEN or IN give type = range:
mysql> explain select * from user where age between 20 and 21;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | user | NULL | range | age_index | age_index | 4 | NULL | 8 | 100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------+
1 row in set (0.00 sec)
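Note, however, that BETWEEN or IN on an indexed column does not guarantee that the index is used. When the range matches a large share of the table (filtered is about 46% in the plan below), the optimizer can decide that a full table scan is cheaper: possible_keys still lists the index, but key is NULL and type is ALL.
mysql> explain select * from user where marital_status between 0 and 1;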
+----+-------------+-------+------------+------+----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+----------------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | user | NULL | ALL | marital_status_index | NULL | NULL | NULL | 137 | 45.99 | Using where |
+----+-------------+-------+------------+------+----------------------+------+---------+------+------+----------+-------------+
1 row in set (0.00 sec)
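To check whether the full scan above is a cost decision and not a hard limitation, you can compare a few variants. This is a sketch only: the narrower IN list is a made-up predicate, and FORCE INDEX is used here purely as a diagnostic, not as a recommended fix.

```sql
-- Refresh index statistics so the optimizer's estimates are current.
ANALYZE TABLE `user`;

-- A narrower predicate (hypothetical): fewer matching rows make the index more attractive.
EXPLAIN SELECT * FROM `user` WHERE marital_status IN (2, 3);

-- Diagnostic only: force the index and compare its rows estimate with the full scan.
EXPLAIN SELECT * FROM `user` FORCE INDEX (`marital_status_index`)
WHERE marital_status BETWEEN 0 AND 1;
```

If the forced plan reports type = range with a rows estimate close to the table size, the optimizer's choice of a full scan was reasonable.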
mysql> explain select age from user;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
| 1 | SIMPLE | user | NULL | index | NULL | age_index | 4 | NULL | 137 | 100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
1 row in set (0.00 sec)
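For example, if the table has the composite index marital_age_index in addition to the single-column index marital_status_index, then possible_keys lists marital_status_index,marital_age_index whenever the query qualifies for both.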
mysql> explain select * from user where marital_status =1 ;
+----+-------------+-------+------------+------+----------------------------------------+----------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+----------------------------------------+----------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | user | NULL | ref | marital_status_index,marital_age_index | marital_status_index | 5 | const | 29 | 100.00 | NULL |
+----+-------------+-------+------------+------+----------------------------------------+----------------------+---------+-------+------+----------+-------+
1 row in set (0.00 sec)
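"Qualifies" here means satisfying the leftmost-prefix rule: a composite index can be used for a lookup only when the condition uses a leftmost prefix of its columns, that is, the leftmost column alone or the leftmost columns up to all of them. A detailed discussion follows below.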
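As a rough illustration of the leftmost-prefix rule, the statements below probe the composite index marital_age_index (marital_status, age) with three different conditions. The literal values are arbitrary, and the comments describe eligibility only; which index the optimizer finally picks depends on its cost estimates.

```sql
-- Leftmost column only: eligible for a lookup on marital_age_index.
EXPLAIN SELECT * FROM `user` WHERE marital_status = 1;

-- Both columns, a full prefix: eligible, and both key parts can be used.
EXPLAIN SELECT * FROM `user` WHERE marital_status = 1 AND age = 30;

-- Skips the leftmost column: not eligible for a lookup on this index.
EXPLAIN SELECT * FROM `user` WHERE age = 30;
```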
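key (the index the query actually uses)
key_len (the number of bytes of the index that are used): an INT takes 4 bytes, and a column that allows NULL needs one extra byte, giving 5. That is why, in the examples above, `key_len = 4` when the condition is on id and `key_len = 5` when it is on marital_status.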
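The same arithmetic extends to a composite index, where key_len is the sum over the key parts actually used. The sketch below assumes the column definitions from the CREATE TABLE statement and a composite index marital_age_index on (marital_status, age); the query is a made-up example.

```sql
-- key_len arithmetic, assuming the column definitions from CREATE TABLE:
--   id              INT NOT NULL   -> 4 bytes
--   marital_status  INT, nullable  -> 4 + 1 = 5 bytes
--   age             INT NOT NULL   -> 4 bytes
-- If the optimizer picks marital_age_index and uses both key parts,
-- key_len should be 5 + 4 = 9.
EXPLAIN SELECT age FROM `user` WHERE marital_status = 1 AND age = 30;
```

The plans in this post that use marital_age_index report key_len values of 4 and 8 instead, which would be consistent with marital_status having been NOT NULL when those plans were captured.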
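rows (the estimated number of rows examined; the smaller the better)
Extra
When MySQL cannot use an index to produce the requested order, it sorts the rows itself, and Extra shows Using filesort.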
mysql> explain select age from user where sex = 1 order by age;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| 1 | SIMPLE | user | NULL | ALL | NULL | NULL | NULL | NULL | 137 | 10.00 | Using where; Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
1 row in set (0.00 sec)
If you are interested, see the MySQL manual's discussion of filesort; I won't go into it here.
MySQL official documentation: ORDER BY Optimization
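One common way to avoid the filesort in the example above is an index whose column order matches the equality condition followed by the sort column. The index name sex_age_index below is hypothetical and not part of the original schema. Whether such an index is worth its write cost on a two-value column like sex is a separate question.

```sql
-- Hypothetical index: equality column first, sort column second.
ALTER TABLE `user` ADD INDEX `sex_age_index` (`sex`, `age`);

-- With sex fixed by the equality condition, index entries are already ordered by age,
-- so the optimizer can read them in order and skip the separate sort.
EXPLAIN SELECT age FROM `user` WHERE sex = 1 ORDER BY age;

-- Remove the index again if it was only an experiment.
ALTER TABLE `user` DROP INDEX `sex_age_index`;
```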
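Using index appears when a covering index is used: every column the query needs is contained in the index, so MySQL reads the values straight from the index without fetching the full row.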
mysql> explain select age from user where marital_status = 1;
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | user | NULL | ref | marital_age_index | marital_age_index | 4 | const | 29 | 100.00 | Using index |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+-------------+
1 row in set (0.00 sec)
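The case where Using index and Using where appear together is covered in detail further down.
Index Condition Pushdown (ICP, shown in Extra as Using index condition) is a feature added in MySQL 5.6. The storage engine filters rows using the index itself, which reduces both the number of times the storage engine must access the base table and the number of times the MySQL server must access the storage engine.
If you are interested, see the MySQL manual's section on Index Condition Pushdown Optimization.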
mysql> explain select marital_status from user where marital_status = 1 order by sex;
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+---------------------------------------+
| 1 | SIMPLE | user | NULL | ref | marital_age_index | marital_age_index | 4 | const | 29 | 100.00 | Using index condition; Using filesort |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+---------------------------------------+
1 row in set (0.00 sec)
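Using where appears when the server filters rows with the WHERE condition after the storage engine has returned them. It can appear whether or not an index is used.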
mysql> explain select sex from user where sex = 1;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | user | NULL | ALL | NULL | NULL | NULL | NULL | 137 | 10.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set (0.00 sec)
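With a join buffer (shown in Extra as Using join buffer), rows from tables read earlier in the join are loaded in portions into the join buffer, and the buffered rows are then used to perform the join with the current table.
When a table is large and not held in the storage engine's cache, reading rows through a range scan on a secondary index can cause many random disk accesses to the base table. With the Disk-Sweep Multi-Range Read (MRR) optimization, MySQL tries to reduce the random disk accesses of a range scan by first scanning only the index and collecting the keys of the relevant rows. The keys are then sorted, and finally the rows are retrieved from the base table in primary key order. The motivation for Disk-Sweep MRR is to reduce the number of random disk accesses and to read the base table data in a more sequential pattern.
Official documentation: Multi-Range Read Optimization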
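Using temporary typically appears when a query contains a GROUP BY or ORDER BY that cannot be resolved through an index. MySQL creates a temporary table to hold the intermediate result and then sorts or groups within that table.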
mysql> explain select marital_status from user where marital_status between 1 and 2 group by age;
+----+-------------+-------+------------+-------+-------------------+-------------------+---------+------+------+----------+-----------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+-------------------+-------------------+---------+------+------+----------+-----------------------------------------------------------+
| 1 | SIMPLE | user | NULL | range | marital_age_index | marital_age_index | 4 | NULL | 63 | 100.00 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+-------+------------+-------+-------------------+-------------------+---------+------+------+----------+-----------------------------------------------------------+
1 row in set (0.00 sec)
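There are too many Extra values to list them all here. If you are interested, see the official documentation, EXPLAIN Output Format.
Strictly speaking, the leftmost-prefix rule doesn't belong in a post about EXPLAIN, but looking at the rule through EXPLAIN shows a side of it you might not expect.
Earlier we created the composite index KEY `marital_age_index` (`marital_status`,`age`). Keeping the leftmost-prefix rule from above in mind, let's see when Extra shows Using index, when it shows Using where, and when it shows Using where; Using index.
We'll compare the statements in pairs, starting with the first group.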
# 1. Without ORDER BY
mysql> explain select age from user where marital_status = 1;
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | user | NULL | ref | marital_age_index | marital_age_index | 4 | const | 29 | 100.00 | Using index |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+-------------+
1 row in set (0.00 sec)
# 2. With ORDER BY
mysql> explain select age from user where marital_status = 1 order by age;
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+--------------------------+
| 1 | SIMPLE | user | NULL | ref | marital_age_index | marital_age_index | 4 | const | 29 | 100.00 | Using where; Using index |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+--------------------------+
1 row in set (0.00 sec)
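Extra is Using index for the first statement and Using where; Using index for the second.
Adding ORDER BY age to the second statement changed Extra to Using where; Using index. ORDER BY can use an index too, and here the sort column is the other column of the composite index. With marital_status fixed by the equality condition, the matching index entries are already ordered by age, so MySQL still uses the composite index, reads the data from it in order, and needs no filesort. The same applies when the other index column appears in GROUP BY.
Now let's look at the results for the second group.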
# 1. select marital_status
mysql> explain select marital_status from user where age = 1;
+----+-------------+-------+------------+-------+---------------+-------------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | user | NULL | index | NULL | marital_age_index | 8 | NULL | 137 | 10.00 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------+-------------------+---------+------+------+----------+--------------------------+
1 row in set (0.00 sec)
# 2. select *
mysql> explain select * from user where age = 1;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | user | NULL | ALL | NULL | NULL | NULL | NULL | 137 | 10.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set (0.00 sec)
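Extra is Using where; Using index for the first statement and Using where for the second.
This is the apparent exception to the leftmost-prefix rule. The condition on age does not match a leftmost prefix, so MySQL cannot look up rows through the index (possible_keys is NULL). But because every column the first statement touches, marital_status and age, is contained in marital_age_index, MySQL scans the whole index (type = index, rows = 137) and filters on age there, which avoids reading the table rows. With select *, the index no longer covers the query, and MySQL falls back to a full table scan (type = ALL).
Now the third group.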
# 1. where age
mysql> explain select age from user where age = 1;
+----+-------------+-------+------------+-------+---------------+-------------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | user | NULL | index | NULL | marital_age_index | 8 | NULL | 137 | 10.00 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------+-------------------+---------+------+------+----------+--------------------------+
1 row in set (0.00 sec)
# 2. where marital_status
mysql> explain select age from user where marital_status = 1 ;
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | user | NULL | ref | marital_age_index | marital_age_index | 4 | const | 29 | 100.00 | Using index |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+-------+------+----------+-------------+
1 row in set (0.00 sec)
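In the first statement of this group, both SELECT and WHERE use age, the second column of the composite index. The composite index is still used, as a full index scan, because every column referenced in SELECT and WHERE belongs to the composite index.
The conclusion is that a statement which does not satisfy the leftmost-prefix rule can still use the composite index, as a covering index scan instead of a lookup, provided that:
the SELECT list names only columns of the composite index, for example its leftmost column
every column referenced in SELECT and WHERE is a column of that composite index
In terms of performance, the three generally rank as follows:
Using index > Using index; Using where > Using where
References
Official EXPLAIN documentation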