Composite index (a,b): overturning your impression of MySQL indexes

1. Table structure

CREATE TABLE `cd_happy_for_ni_deals` (
  `id` int(11) NOT NULL DEFAULT '0',
  `update_time` datetime DEFAULT NULL COMMENT 'update time',
  `publish_status` int(11) NOT NULL DEFAULT '4' COMMENT 'publish status',
  KEY `idx_of_publish_status_update_time` (`publish_status`,`update_time`,`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

2. Cardinality (number of distinct values)

mysql> select count(distinct(update_time)) from cd_happy_for_ni_deals;
+------------------------------+
| count(distinct(update_time)) |
+------------------------------+
|                      1845933 |
+------------------------------+
1 row in set (4.68 sec)

mysql> select count(distinct(publish_status)) from cd_happy_for_ni_deals;
+---------------------------------+
| count(distinct(publish_status)) |
+---------------------------------+
|                               2 |
+---------------------------------+
1 row in set (1.76 sec)

mysql> select count(id) from cd_happy_for_ni_deals;
+-----------+
| count(id) |
+-----------+
|   1907609 |
+-----------+
1 row in set (0.00 sec)

Selectivity of update_time: 1845933 / 1907609.to_f = 0.9676684268107353, close to 1

Selectivity of publish_status: 2 / 1907609.to_f = 1.0484328811617055e-06, close to 0
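
The same arithmetic can be reproduced outside the database. The following is a minimal Python sketch that takes the counts from the query results above and divides distinct values by total rows; the helper name `selectivity` is only an illustrative choice.

```python
# Counts copied from the query results above.
total_rows = 1907609
distinct_counts = {
    "update_time": 1845933,
    "publish_status": 2,
}


def selectivity(distinct_values: int, rows: int) -> float:
    """Selectivity = distinct values / total rows, a value in (0, 1]."""
    return distinct_values / rows


selectivities = {
    column: selectivity(count, total_rows)
    for column, count in distinct_counts.items()
}

for column, value in selectivities.items():
    print(f"{column}: {value:.6g}")
```

A value near 1 means almost every row has its own value (a good filter); a value near 0 means a single value matches a large share of the table.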

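3. Create an (a,b) index, then query by a, by b, by (a,b), and by (b,a), and compare the results

Going against convention, I deliberately put the low-selectivity column first in the index.

Create the index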

mysql> alter table cd_happy_for_ni_deals add index `idx_of_publish_status_update_time` (`publish_status`, `update_time`, `id`);
Query OK, 0 rows affected (14.69 sec)
Records: 0  Duplicates: 0  Warnings: 0

Query by a

mysql> explain select SQL_NO_CACHE id, publish_status from cd_happy_for_ni_deals where publish_status = 4 \G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: cd_happy_for_ni_deals
         type: ref
possible_keys: idx_of_publish_status_update_time
          key: idx_of_publish_status_update_time
      key_len: 4
          ref: const
         rows: 964056 <- querying on publish_status only
        Extra: Using index
1 row in set (0.00 sec)

Average query time:

mysql> select SQL_NO_CACHE count(id) from cd_happy_for_ni_deals where publish_status = 4 \G;
*************************** 1. row ***************************
count(id): 1858081
1 row in set (0.69 sec)

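In theory, this query can use the a part of index (a,b). The key_len of 4 confirms it: only the 4-byte publish_status column is used for the lookup.

Query by b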

mysql> explain select SQL_NO_CACHE id, publish_status from cd_happy_for_ni_deals where update_time = '2014-05-17 23:00:48' \G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: cd_happy_for_ni_deals
         type: index
possible_keys: NULL
          key: idx_of_publish_status_update_time
      key_len: 17
          ref: NULL
         rows: 1928113 <- querying on update_time only
        Extra: Using where; Using index
1 row in set (0.01 sec)

Average query time:

mysql> select SQL_NO_CACHE count(id) from cd_happy_for_ni_deals where update_time = '2014-05-17 23:00:48' \G;
*************************** 1. row ***************************
count(id): 1
1 row in set (1.06 sec)

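When querying by b alone, the index should in theory be unusable, so why does EXPLAIN show it here? Look at the details: possible_keys is NULL, type is index, key_len is 17 (the full length of the index), and rows is roughly the whole table. MySQL is not seeking into the index; it is scanning the entire index from start to end, because the index contains every column the query needs (Extra: Using index). That is why this query still takes about 1 second.

Query by (a,b)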
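
To see why b alone cannot be looked up directly, here is a toy Python model of an index on (publish_status, update_time, id). It is only a sketch: a sorted list of tuples stands in for the B-tree, the five rows are made-up sample data, and the function names are hypothetical. Binary search works for a and for (a,b) because the entries are sorted by that prefix, while b on its own is scattered across the list and needs a full pass.

```python
from bisect import bisect_left, bisect_right

# Toy stand-in for an index on (publish_status, update_time, id).
# Entries are kept sorted by the whole tuple, like B-tree leaf entries.
# The rows are invented sample data, not taken from the real table.
index = sorted([
    (0, "2014-05-16 10:00:00", 7),
    (4, "2014-05-17 23:00:48", 3),
    (4, "2014-05-15 08:30:00", 1),
    (0, "2014-05-17 23:00:48", 9),
    (4, "2014-05-18 12:00:00", 5),
])


def seek_by_a(a):
    """Binary-search the leading column (comparable to type=ref on a)."""
    lo = bisect_left(index, (a,))
    hi = bisect_left(index, (a + 1,))
    return index[lo:hi]


def seek_by_a_b(a, b):
    """Binary-search the (a, b) prefix (comparable to ref: const,const)."""
    lo = bisect_left(index, (a, b))
    hi = bisect_right(index, (a, b, float("inf")))
    return index[lo:hi]


def scan_by_b(b):
    """b is not globally sorted, so every entry is checked (type=index)."""
    return [entry for entry in index if entry[1] == b]


print(seek_by_a(4))
print(seek_by_a_b(4, "2014-05-17 23:00:48"))
print(scan_by_b("2014-05-17 23:00:48"))
```

The scan still reads only the index entries and never touches the table rows, which matches the "Using index" note in the EXPLAIN output, but its cost grows with the size of the whole index rather than with the number of matches.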

mysql> explain select SQL_NO_CACHE id, publish_status from cd_happy_for_ni_deals where publish_status = 4 and update_time = '2014-05-17 23:00:48' \G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: cd_happy_for_ni_deals
         type: ref
possible_keys: idx_of_publish_status_update_time
          key: idx_of_publish_status_update_time
      key_len: 13
          ref: const,const
         rows: 1
        Extra: Using where; Using index
1 row in set (0.01 sec)

Average query time:

mysql> select SQL_NO_CACHE count(id) from cd_happy_for_ni_deals where publish_status = 4 and update_time = '2014-05-17 23:00:48' \G;
*************************** 1. row ***************************
count(id): 1
1 row in set (0.00 sec)

This matches the theoretical expectation.

Query by (b,a)

mysql> explain select SQL_NO_CACHE id, publish_status from cd_happy_for_ni_deals where  update_time = '2014-05-17 23:00:48' and publish_status = 4 \G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: cd_happy_for_ni_deals
         type: ref
possible_keys: idx_of_publish_status_update_time
          key: idx_of_publish_status_update_time
      key_len: 13
          ref: const,const
         rows: 1
        Extra: Using where; Using index
1 row in set (0.00 sec)

Average query time:

mysql> select SQL_NO_CACHE count(id) from cd_happy_for_ni_deals where  update_time = '2014-05-17 23:00:48' and publish_status = 4 \G;
*************************** 1. row ***************************
count(id): 1
1 row in set (0.00 sec)

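In theory, only the a part of (a,b) should be usable here, so why is it just as fast? Note that the plan is identical to the (a,b) case: key_len is 13 and ref is const,const.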
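
The key_len values in the EXPLAIN outputs can be reproduced with a small leftmost-prefix matcher. This is a hedged Python sketch, not how MySQL is implemented: the per-column byte lengths (4, 9, 4) are assumptions inferred from the key_len values 4, 13, and 17 seen above, and `usable_prefix` is a hypothetical helper. The predicates are treated as a set, so the order in which they are written makes no difference.

```python
# Index columns in definition order with assumed key lengths in bytes.
# The lengths are inferred from the key_len values in the EXPLAIN
# output (4, 13, 17); they are not read from MySQL itself.
INDEX_COLUMNS = [("publish_status", 4), ("update_time", 9), ("id", 4)]


def usable_prefix(equality_columns):
    """Return (columns, key_len) for the leftmost index prefix that is
    fully covered by equality predicates, regardless of their order."""
    wanted = set(equality_columns)
    used = []
    key_len = 0
    for name, length in INDEX_COLUMNS:
        if name not in wanted:
            break  # the prefix ends at the first column without a predicate
        used.append(name)
        key_len += length
    return used, key_len


print(usable_prefix(["publish_status"]))
print(usable_prefix(["update_time"]))
print(usable_prefix(["publish_status", "update_time"]))
print(usable_prefix(["update_time", "publish_status"]))
```

For update_time alone the matcher finds no usable prefix, which corresponds to possible_keys: NULL; the key_len of 17 that EXPLAIN reports for that query is the full index being scanned, not a lookup prefix.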

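Conclusions:

1. A composite index is sensitive to the order of its columns, but not to the order in which conditions appear in the query: the MySQL query optimizer automatically rearranges the conditions in the WHERE clause to match a suitable index, so (a,b) and (b,a) queries get the same plan.

2. "Put the most selective column first in the index" is a rule of thumb that is not always accurate. Depending on the scenario, you may need to order the index columns according to the queries that run most frequently.

Reference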

http://www.programering.com/a/MTMwAzMwATM.html
