MySQL 关于 reconsidering_access_paths_for_index_ordering

参考

  • reconsidering-access-paths-for-index-ordering-a-dangerous-optimization-and-a-fix
  • bugs.mysql.com: Dangerous optimization reconsidering_access_paths_for_index_ordering
  • MySQL Optimizer Trace

版本 5.7.24

问题复现

CREATE TABLE `test_tbl` (
      `id` bigint(20) NOT NULL AUTO_INCREMENT,
      `a` varchar(64) NOT NULL COMMENT '',
      `b` tinyint(4) NOT NULL DEFAULT '1' COMMENT '',
      `c` varchar(64) NOT NULL COMMENT '',
      `gmt_create` datetime DEFAULT CURRENT_TIMESTAMP,
      `gmt_update` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      PRIMARY KEY (`id`),
      UNIQUE KEY `uk_a_c` (`a`,`c`),
      KEY `idx_a_b` (`a`,`b`)
) ENGINE=InnoDB AUTO_INCREMENT=7011293 DEFAULT CHARSET=utf8
# 数据样本700w,b字段基本没有区分度
# Query 1
> SELECT id, a, b, c, gmt_create, gmt_update FROM test_tbl WHERE a = '163944888' AND b = 1 ORDER BY id LIMIT 0, 20
20 rows in set (2.79 sec)
# Query 2
> SELECT id, a, b, c, gmt_create, gmt_update FROM test_tbl WHERE a = '163944888' AND b = 1 ORDER BY id LIMIT 0, 50
50 rows in set (0.02 sec)
# Query 3
> SELECT id, a, b, c, gmt_create, gmt_update FROM test_tbl force index(`idx_a_b`) WHERE a = '163944888' AND b = 1 ORDER BY id LIMIT 0, 50;
50 rows in set (0.00 sec)

同样的查询当limit大小为20的时候,执行时间较长。当limit大小为50的时候,执行速度更快。当强制指定索引 idx_a_b时,结果更加快。

从Explain执行计划上看到
Query 1: 使用Primary全表扫描,不需要排序。
Query 2: 使用uk_a_c唯一索引,a字段生效,结果需要排序。
Query 3: 使用idx_a_b二级索引,在不考虑回表取数据过程,该索引理论上是最优方案。

Optimizer Trace

对比 Query 1Query 2 的Optimizer Trace结果

image.png

SQL 优化器对比,reconsidering_access_paths_for_index_ordering 阶段,Query 1 选择了主键扫描避免排序,Query 2选择uk_a_c索引。

  • 从官方了解到后续MySQL可能会提供改阶段行为的开关。
    bugs.mysql.com: Dangerous optimization reconsidering_access_paths_for_index_ordering

那选择idx_a_c不就好了吗?为什么Query 2会选择uk_a_c索引?

在前面的considered_execution_plans阶段,uk_a_c cost略小于idx_a_b,一般是由于b字段区分度不高产生。

image.png

我们对样本表数据重新处理,将b字段处理为主键值。

> EXPLAIN SELECT id, a, b, c, gmt_create, gmt_update FROM test_tbl_01 WHERE a = '163944888' AND b = 1 ORDER BY id LIMIT 0, 20;
+----+-------------+-------------+------------+------+----------------+---------+---------+-------------+------+----------+-----------------------+
| id | select_type | table       | partitions | type | possible_keys  | key     | key_len | ref         | rows | filtered | Extra                 |
+----+-------------+-------------+------------+------+----------------+---------+---------+-------------+------+----------+-----------------------+
|  1 | SIMPLE      | test_tbl_01 | NULL       | ref  | uk_a_c,idx_a_b | idx_a_b | 195     | const,const |    1 |   100.00 | Using index condition |
+----+-------------+-------------+------------+------+----------------+---------+---------+-------------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

总结

  1. idx_a_b索引b字段区分度不高,导致优化器选择uk_a_c索引, 从而触发reconsidering_access_paths_for_index_ordering选择PRIMARY。
  2. reconsidering_access_paths_for_index_ordering 会触发主键全表扫描,和以往 order by id limit N(当N特别大时,避免回表查询以及排序,会选择全表扫描)不同,这次是在N较小时,主动选择了全表扫描。
  3. 还有一种绕过方案:uk_a_c 顺序调整,即改成 uk_c_a,不让该唯一索引参与SQL优化过程,而是使用到idx_a_b,从而也就避免了reconsidering_access_paths_for_index_ordering 阶段选择PRIMARY。

你可能感兴趣的:(MySQL 关于 reconsidering_access_paths_for_index_ordering)