SQL: When Indexes Stop Working, and Query Optimization

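Last week an interviewer asked me in which situations an index stops being used. I recited the answers found online, but I had never actually tested them myself. With a free afternoon, I created a table and inserted 3,549,000 rows (I originally planned for 10 million, but this was enough).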

Database: mysql  Ver 14.14 Distrib 5.7.23, for Linux (x86_64) using  EditLine wrapper

CREATE TABLE `city_data` (
  `city_id` varchar(100) DEFAULT NULL,
  `data_type` varchar(255) DEFAULT NULL,
  `data_value` varchar(255) DEFAULT NULL,
  `data_date` date DEFAULT NULL,
  `data_flag` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `t_city` (
  `city_id` varchar(100) NOT NULL,
  `city_name` varchar(255) NOT NULL,
  `parent_id` varchar(100) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Querying without any index:

select * from city_data; took 10.589 s

select data_value from city_data where data_value='1.480'; took 1.428 s

With an index on data_value only:

select data_value from city_data where data_value='1.480'; took 0.027 s, so the index clearly helps.
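
Timings alone do not show whether an index was used, so it helps to look at the execution plan as well. A minimal sketch of that check, assuming the index is created with the hypothetical name idx_data_value (the original post does not say what the index was called):

```sql
-- Hypothetical index name; the post only says an index was added on data_value.
ALTER TABLE city_data ADD INDEX idx_data_value (data_value);

-- Inspect the plan instead of relying on wall-clock time.
-- In the output, the "key" column names the index the optimizer picked
-- (NULL means none) and "type" shows the access method (ALL = full table scan).
EXPLAIN SELECT data_value FROM city_data WHERE data_value = '1.480';
```

The same EXPLAIN prefix can be put in front of any of the queries in this post to confirm whether the slowdown really comes from the index being skipped.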

Note: the original data covers dates from 2012-01-01 to 2018-12-31.

In some situations, however, an index will not be used:

1.

Using not in, !=, <>

data_type takes 12 distinct values: "A","B","C","D","E","F","S","X","Y","Z","AY","LL". Now add an index on data_type.

select c.city_id,c.city_name,d.data_value,d.data_type
from t_city c,city_data d 
where c.city_id=d.city_id and d.data_date=date_format('2012-08-08', '%Y%m%d')
and d.data_type in ('A','B','C','D','E','F')

select c.city_id,c.city_name,d.data_value,d.data_type
from t_city c,city_data d 
where c.city_id=d.city_id and d.data_date=date_format('2012-08-08', '%Y%m%d')
and d.data_type not in ('S','X','Y','Z','AY','LL')

Both statements return the same result; the in version took 0.048 s and the not in version took 1.963 s.
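
To see where the difference comes from, the two statements can be compared with EXPLAIN. This is only a sketch that reuses the queries above unchanged; which plan appears depends on the data distribution and on the indexes present at the time:

```sql
-- Plan for the IN version.
EXPLAIN
select c.city_id,c.city_name,d.data_value,d.data_type
from t_city c,city_data d
where c.city_id=d.city_id and d.data_date=date_format('2012-08-08', '%Y%m%d')
and d.data_type in ('A','B','C','D','E','F');

-- Plan for the NOT IN version; compare the "type", "key" and "rows" columns
-- of the city_data row with the plan above.
EXPLAIN
select c.city_id,c.city_name,d.data_value,d.data_type
from t_city c,city_data d
where c.city_id=d.city_id and d.data_date=date_format('2012-08-08', '%Y%m%d')
and d.data_type not in ('S','X','Y','Z','AY','LL');
```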
Temporarily insert two rows:
abazhou	A	15165.333	2019-01-01	F
abazhou	B	23123.75	2019-01-01	F
Add an index on data_type, data_date of table city_data:

select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d 
where c.city_id=d.city_id 
and c.city_id='abazhou'
and d.data_type='A' 
and d.data_date=date_format('2019-01-01', '%Y%m%d')
Took 0.029 s

select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d 
where c.city_id=d.city_id 
and c.city_id='abazhou'
and d.data_type!='B' 
and d.data_date=date_format('2019-01-01', '%Y%m%d')
Took 2.782 s

With the index dropped:
select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d 
where c.city_id=d.city_id 
and c.city_id='abazhou'
and d.data_type='A' 
and d.data_date=date_format('2019-01-01', '%Y%m%d')
Took 1.874 s

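All three queries return the same result.
So != kept the index from being used here, and the same applies to <>. Note that MySQL can still do a range scan for != or not in when few rows qualify; a negated condition usually matches most of the table, so the optimizer tends to fall back to a full scan.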

2.

Fuzzy matching with like

select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d 
where c.city_id=d.city_id 
and c.city_id='abazhou'
and d.data_type like '%A%'
and d.data_date>date_format('2018-12-30', '%Y%m%d')
Took 1.953 s

select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d 
where c.city_id=d.city_id 
and c.city_id='abazhou'
and d.data_type like '%A'
and d.data_date>date_format('2018-12-30', '%Y%m%d') 
Took 1.875 s

select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d 
where c.city_id=d.city_id 
and c.city_id='abazhou'
and d.data_type like 'A%'
and d.data_date>date_format('2018-12-30', '%Y%m%d') 
Took 0.156 s

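So with the % wildcard, putting % on the left side of the pattern prevents the index from being used, while putting it only at the far right does not.
In short, a pattern that starts with % cannot use the index.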
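
The two cases can be checked side by side on city_data alone. A minimal sketch, assuming an index whose first column is data_type exists at this point; the join is left out to keep the plans easy to read:

```sql
-- Leading wildcard: there is no fixed prefix to seek on in the B-tree,
-- so the index cannot be used for a range lookup.
EXPLAIN
SELECT d.data_value, d.data_type, d.data_date
FROM city_data d
WHERE d.data_type LIKE '%A';

-- Trailing wildcard only: the constant prefix 'A' can be turned into a range
-- on the index. Whether the optimizer actually picks it still depends on how
-- many rows start with 'A'.
EXPLAIN
SELECT d.data_value, d.data_type, d.data_date
FROM city_data d
WHERE d.data_type LIKE 'A%';
```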

3.

Applying certain functions to an indexed column in the where clause

select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d 
where c.city_id=d.city_id 
and c.city_id='abazhou'
and CONCAT(data_type,'Y')='AY'
and date_add(data_date , interval 1 day)>date_format('2018-12-30', '%Y%m%d') 
concat() is applied to the indexed column, so the index is not used; took 2.361 s

select c.city_id,c.city_name,d.data_value,d.data_type,d.data_date
from t_city c,city_data d 
where c.city_id=d.city_id 
and c.city_id='abazhou'
and data_type='A'
and date_add(data_date , interval 1 day)=date_format('2018-12-30', '%Y%m%d') 
Took 0.111 s

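However, date_add() did not appear to disable the index. I did not track down the cause. One likely explanation is that data_type='A' is now a plain comparison, so the index can still be entered through data_type, while date_add() only wraps data_date.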
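
Whatever the cause in this particular run, the usual way to keep a date function off the indexed column is to move the arithmetic to the constant side. A minimal sketch, assuming an index that can be entered through data_date exists; the join and the other filters are omitted for clarity:

```sql
-- The function wraps the column, so its value must be computed per row
-- and cannot be matched against the index on data_date.
EXPLAIN
SELECT d.data_value, d.data_type, d.data_date
FROM city_data d
WHERE date_add(d.data_date, interval 1 day) > '2018-12-30';

-- Logically equivalent condition with the arithmetic moved to the constant:
-- data_date + 1 day > 2018-12-30  is the same as  data_date > 2018-12-29.
EXPLAIN
SELECT d.data_value, d.data_type, d.data_date
FROM city_data d
WHERE d.data_date > date_sub('2018-12-30', interval 1 day);
```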

4.

Using arithmetic operators on an indexed column, for example +, -, *, /

select c.city_id,c.city_name,d.data_value,d.data_type,d.data_flag,d.data_date
from t_city c,city_data d 
where c.city_id=d.city_id 
and c.city_id='abazhou' and d.data_value+1='47475.998'
The + operator is applied to the indexed column; took 2.082 s

select c.city_id,c.city_name,d.data_value,d.data_type,d.data_flag,d.data_date
from t_city c,city_data d 
where c.city_id=d.city_id 
and c.city_id='abazhou' and d.data_value='47474.998'
Without the operator the same result is returned; took 0.029 s
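
When the constant has to be computed, the calculation can be done on the constant side instead of on the column. The sketch below is an illustration under two assumptions: an index on data_value exists, and the stored strings use the same formatting as the computed value. Because data_value is a varchar column, the result is cast back to a string; comparing the column with a bare number would force a numeric conversion of every row and defeat the index again.

```sql
-- Arithmetic on the column: the expression has to be evaluated for every row.
EXPLAIN
SELECT d.data_value
FROM city_data d
WHERE d.data_value + 1 = 47475.998;

-- Arithmetic moved to the constant side and cast to a string, so the
-- varchar column is compared string-to-string and the index stays usable.
-- Note: this matches on the exact text, so a stored value written with a
-- different number of decimals would not match.
EXPLAIN
SELECT d.data_value
FROM city_data d
WHERE d.data_value = CAST(47475.998 - 1 AS CHAR);
```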

5.

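The leftmost-prefix rule. Creating a composite index on (A,B,C) effectively gives you indexes on (A), (A,B) and (A,B,C), so put the column used most often in where clauses on the left. With A='a' and B='b' and C='c', all three parts of the index are used. With A='a' and C='c', only the A part is used; the C part is not, because B is missing from the clause. Likewise, C='c' alone or B='b' and C='c' cannot use the index, because the chain of leading columns is broken. With A='a' and B>'b' and C='c', the A and B parts are used but C is not, since a range condition on B stops the index from being used for the columns after it.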
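
The rule can be tried out on a small table. This is a sketch only: the table prefix_demo, its columns and the index name idx_a_b_c are hypothetical and not part of the test data used in this post. Load some rows before reading the plans, since plans on an empty table are not very informative.

```sql
-- Hypothetical demo table; all names are illustrative only.
CREATE TABLE prefix_demo (
  a varchar(20) NOT NULL,
  b varchar(20) NOT NULL,
  c varchar(20) NOT NULL,
  payload varchar(100) DEFAULT NULL,
  KEY idx_a_b_c (a, b, c)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- All three key parts can be used for the lookup.
EXPLAIN SELECT * FROM prefix_demo WHERE a = 'a' AND b = 'b' AND c = 'c';

-- Only the a part can be used; b is missing, so c cannot narrow the lookup.
EXPLAIN SELECT * FROM prefix_demo WHERE a = 'a' AND c = 'c';

-- The leading column a is missing, so no lookup through idx_a_b_c is possible.
EXPLAIN SELECT * FROM prefix_demo WHERE b = 'b' AND c = 'c';

-- a (equality) and b (range) can be used; c comes after a range condition.
EXPLAIN SELECT * FROM prefix_demo WHERE a = 'a' AND b > 'b' AND c = 'c';
```

Comparing the key_len column across the four plans shows how many key parts each query actually uses.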

6.

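Low-selectivity columns. Take a sex column that contains only the two values M and F: an index on it is of little use, because whichever value you query for, the matching set is a large share of the table. The optimizer will usually skip such an index and scan the table instead.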
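
A rough way to judge this before creating an index is to look at how selective the column is. A minimal sketch using data_type from the test table, which has only 12 distinct values across about 3.5 million rows:

```sql
-- Ratio of distinct values to total rows: values close to 0 mean each
-- index entry points at a large share of the table.
SELECT COUNT(DISTINCT data_type) / COUNT(*) AS selectivity
FROM city_data;

-- For indexes that already exist, the Cardinality column gives the
-- optimizer's estimate of the number of distinct values.
SHOW INDEX FROM city_data;
```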
