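Last week an interviewer asked me in which situations an index stops being used. I gave the answers found online, but I had never actually tested them myself. With a free afternoon, I created a table and inserted 3,549,000 rows (I had planned on 10 million, but this was enough).
Database: mysql Ver 14.14 Distrib 5.7.23, for Linux (x86_64) using EditLine wrapper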
CREATE TABLE `city_data` (
`city_id` varchar(100) DEFAULT NULL,
`data_type` varchar(255) DEFAULT NULL,
`data_value` varchar(255) DEFAULT NULL,
`data_date` date DEFAULT NULL,
`data_flag` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `t_city` (
`city_id` varchar(100) NOT NULL,
`city_name` varchar(255) NOT NULL,
`parent_id` varchar(100) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Queries without any index:
select * from city_data; took 10.589 s
select data_value from city_data where data_value='1.480'; took 1.428 s
With an index on data_value only:
select data_value from city_data where data_value='1.480'; took 0.027 s, so the index clearly helps.
Note: the original data covers dates from 2012-01-01 to 2018-12-31.
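Timings alone can be skewed by caching, so it helps to look at the execution plan as well. The following is a minimal sketch, assuming the index is named idx_data_value (a name made up here; the post does not record the real one):

```sql
-- idx_data_value is a hypothetical name chosen for this example.
CREATE INDEX idx_data_value ON city_data (data_value);

-- In the EXPLAIN output, the "key" column names the index the optimizer chose
-- (NULL means none), and type = ALL means a full table scan.
EXPLAIN SELECT data_value FROM city_data WHERE data_value = '1.480';
```

The same EXPLAIN prefix can be put in front of any of the queries in this post to see whether an index was picked.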
In some situations, however, the index is not used:
1.
Using not in, !=, <>
data_type takes 12 distinct values: "A","B","C","D","E","F","S","X","Y","Z","AY","LL". Now add an index on data_type.
select c.city_id,c.city_name,d.data_value,d.data_type
from t_city c,city_data d
where c.city_id=d.city_id and d.data_date=date_format('2012-08-08', '%Y%m%d')
and d.data_type in ('A','B','C','D','E','F')
select c.city_id,c.city_name,d.data_value,d.data_type
from t_city c,city_data d
where c.city_id=d.city_id and d.data_date=date_format('2012-08-08', '%Y%m%d')
and d.data_type not in ('S','X','Y','Z','AY','LL')
Both statements return the same result; the in version took 0.048 s and the not in version took 1.963 s.
Temporarily insert two rows:
abazhou A 15165.333 2019-01-01 F
abazhou B 23123.75 2019-01-01 F
Add an index on data_type, data_date of table city_data:
select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d
where c.city_id=d.city_id
and c.city_id='abazhou'
and d.data_type='A'
and d.data_date=date_format('2019-01-01', '%Y%m%d')
Took 0.029 s
select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d
where c.city_id=d.city_id
and c.city_id='abazhou'
and d.data_type!='B'
and d.data_date=date_format('2019-01-01', '%Y%m%d')
Took 2.782 s
With the index dropped:
select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d
where c.city_id=d.city_id
and c.city_id='abazhou'
and d.data_type='A'
and d.data_date=date_format('2019-01-01', '%Y%m%d')
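Took 1.874 s
All three queries return the same single row.
So != kept the index from being used here (the timing is close to the run with no index at all), and the same goes for <>.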
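The in / not in / != comparison above can be checked against the execution plan. This is a sketch only: it assumes a composite index on (data_type, data_date) under the made-up name idx_type_date, and it queries city_data alone to keep the plans easy to read.

```sql
-- idx_type_date is a hypothetical name; the column order is an assumption.
CREATE INDEX idx_type_date ON city_data (data_type, data_date);

-- Positive list: each listed value can be looked up in the index.
EXPLAIN SELECT data_value FROM city_data
WHERE data_type IN ('A','B','C','D','E','F') AND data_date = '2012-08-08';

-- Negative list: the condition describes "everything except ...".
EXPLAIN SELECT data_value FROM city_data
WHERE data_type NOT IN ('S','X','Y','Z','AY','LL') AND data_date = '2012-08-08';

-- Inequality on the leading index column.
EXPLAIN SELECT data_value FROM city_data
WHERE data_type != 'B' AND data_date = '2019-01-01';
```

Comparing the type, key and rows columns of the three plans shows what the optimizer decided. It makes a cost-based choice, so on a different data distribution it may still pick a range scan for not in or !=.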
2.
Using like for fuzzy matching
select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d
where c.city_id=d.city_id
and c.city_id='abazhou'
and d.data_type like '%A%'
and d.data_date>date_format('2018-12-30', '%Y%m%d')
Took 1.953 s
select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d
where c.city_id=d.city_id
and c.city_id='abazhou'
and d.data_type like '%A'
and d.data_date>date_format('2018-12-30', '%Y%m%d')
Took 1.875 s
select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d
where c.city_id=d.city_id
and c.city_id='abazhou'
and d.data_type like 'A%'
and d.data_date>date_format('2018-12-30', '%Y%m%d')
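Took 0.156 s
So with the % wildcard, a pattern that has % on the left keeps the index from being used, while a pattern with % only at the far right does not.
In short, a leading % defeats the index.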
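A quick way to see the difference is to compare the plans for a prefix pattern and a suffix pattern. A minimal sketch, assuming an index whose leading column is data_type already exists:

```sql
-- Prefix pattern: the fixed part 'A' comes first, so it maps to a range of index entries.
EXPLAIN SELECT data_value FROM city_data WHERE data_type LIKE 'A%';

-- Leading wildcard: there is no fixed prefix to seek on.
EXPLAIN SELECT data_value FROM city_data WHERE data_type LIKE '%A';
```

A B-tree index is sorted by the column value from left to right, which is why only a known prefix can be turned into a range lookup.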
3.
Applying certain functions to an indexed column in the where clause
select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d
where c.city_id=d.city_id
and c.city_id='abazhou'
and CONCAT(data_type,'Y')='AY'
and date_add(data_date , interval 1 day)>date_format('2018-12-30', '%Y%m%d')
concat() is applied to the indexed column, so the index is not used; took 2.361 s
select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d
where c.city_id=d.city_id
and c.city_id='abazhou'
and data_type='A'
and date_add(data_date , interval 1 day)=date_format('2018-12-30', '%Y%m%d')
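Took 0.111 s
date_add(), however, did not seem to disable the index, and I did not pin down why. A likely explanation is that the equality data_type='A' can still use the index through that column, whereas in the previous query concat() wrapped that very column; the date_add() condition on data_date is then just applied as a filter on the rows found. EXPLAIN would confirm this.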
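When a function wraps the column, one common workaround is to move the computation to the constant side so the column is compared bare. A sketch under the assumption that data_date is a DATE column, as in the table definition above, and that an index covering data_type and data_date exists:

```sql
-- Function applied to the column: data_date cannot be used for an index lookup.
EXPLAIN SELECT data_value FROM city_data
WHERE data_type = 'A'
  AND DATE_ADD(data_date, INTERVAL 1 DAY) > '2018-12-30';

-- Same condition with the arithmetic moved to the constant:
-- data_date + 1 day > 2018-12-30  is the same as  data_date > 2018-12-29.
EXPLAIN SELECT data_value FROM city_data
WHERE data_type = 'A'
  AND data_date > DATE_SUB('2018-12-30', INTERVAL 1 DAY);
```

Both statements select the same rows; only the second leaves data_date in a form the optimizer can match against an index.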
4.
Using arithmetic operators such as +, -, *, /
select c.city_id,c.city_name,d.data_value,d.data_type,d.data_flag,d.data_date
from t_city c,city_data d
where c.city_id=d.city_id
and c.city_id='abazhou' and d.data_value+1='47475.998'
The + operator is applied to the indexed column; took 2.082 s
select c.city_id,c.city_name,d.data_value,d.data_type,d.data_flag,d.data_date
from t_city c,city_data d
where c.city_id=d.city_id
and c.city_id='abazhou' and d.data_value='47474.998'
Without the operator the result is the same; took 0.029 s
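The two forms can be compared with EXPLAIN. A minimal sketch, assuming the index on data_value from the beginning of the post is in place:

```sql
-- Arithmetic on the column: every row's data_value has to be converted and incremented
-- before it can be compared, so the index on data_value cannot be used for a lookup.
EXPLAIN SELECT data_value FROM city_data WHERE data_value + 1 = '47475.998';

-- Bare column compared with a constant: a direct index lookup is possible.
EXPLAIN SELECT data_value FROM city_data WHERE data_value = '47474.998';
```

Because data_value is a varchar column, the first form compares numbers and the second compares strings, so the two are interchangeable only when the stored text is formatted exactly like the literal.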
5.
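Leftmost prefix. Creating an index on (A,B,C) effectively gives you three indexes: (A), (A,B) and (A,B,C), so the column used most often in where clauses should go leftmost. If the where clause is A='a' and B='b' and C='c', all three columns of the index are used. With A='a' and C='c', the C part is not used because B is missing from the clause and leaves a gap. Likewise, C='c' alone, or B='b' and C='c', cannot use the index, because the leading column A is missing. With A='a' and B>'b' and C='c', A and B are used but C is not, because the range condition on B stops the index from being used for the columns after it.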
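The leftmost-prefix cases above can be tried on a small throwaway table. Everything in this sketch (prefix_demo, idx_abc, the column names) is hypothetical and only serves to illustrate the rule:

```sql
-- Hypothetical demo table; none of these names come from the tables in this post.
CREATE TABLE prefix_demo (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  a VARCHAR(20),
  b VARCHAR(20),
  c VARCHAR(20),
  payload VARCHAR(100),          -- column outside the index, so SELECT * is not covered by it
  KEY idx_abc (a, b, c)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- All three index columns are constrained by equality.
EXPLAIN SELECT * FROM prefix_demo WHERE a = 'a' AND b = 'b' AND c = 'c';

-- b is missing: only the a part of the index can be used for the lookup.
EXPLAIN SELECT * FROM prefix_demo WHERE a = 'a' AND c = 'c';

-- Leading column a is missing: no index lookup on idx_abc is possible.
EXPLAIN SELECT * FROM prefix_demo WHERE b = 'b' AND c = 'c';

-- Range on b: a and b are used for the lookup, c is not.
EXPLAIN SELECT * FROM prefix_demo WHERE a = 'a' AND b > 'b' AND c = 'c';
```

The key_len column of the plan indicates how many bytes of the index are used for the lookup, which makes it a handy way to tell how many leading columns took part. Plans on an empty table are not representative, so load some rows before comparing.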
6.
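Low-selectivity columns. Take a sex column that contains only the two values M and F: an index on it is of little use, because whichever value you query, the matching set of rows is huge. The optimizer will usually prefer a full table scan over such an index.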
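Whether a column is selective enough to be worth indexing can be estimated before creating the index. A minimal sketch using data_type from city_data, which has 12 distinct values in this data set:

```sql
-- Ratio of distinct values to total rows; the closer to 0, the less an index helps.
SELECT COUNT(DISTINCT data_type)            AS distinct_values,
       COUNT(*)                             AS total_rows,
       COUNT(DISTINCT data_type) / COUNT(*) AS selectivity
FROM city_data;

-- For indexes that already exist, the Cardinality column gives the
-- optimizer's estimate of the number of distinct values.
SHOW INDEX FROM city_data;
```

Note that the first query itself scans the whole table, so on a table of this size it is something to run once, not on every request.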