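SQL optimization often depends on the data actually in the tables. Table design can only anticipate the indexes that will be used in most cases; in some special cases, MySQL's query optimizer will not necessarily follow the execution plan envisioned at design time. For example, when an index has low selectivity the optimizer may fall back to a full table scan, and in a join it may use the small table's index rather than the main table's.
1. OFFSET optimization
SQL before optimization: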
SELECT SQL_NO_CACHE `broker`.*,`db`.`name`,db.branch
from broker
LEFT JOIN myuser mu on mu.UserGUID=broker.ModifiedBy
LEFT JOIN broker pb on pb.id=broker.parent_brokerId
LEFT JOIN broker_bank db on db.broker_id = broker.id
and isdel=0 and isdefault=1
where broker.`status` in (1,2)
and broker.token='rkqqnn1427611021'
and broker.is_delete=0
and IFNULL(broker.capacity_des,'') <> '9108'
ORDER BY CreatedOn desc
LIMIT 126000, 3000;
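Execution plan: (screenshot not preserved)
Optimization direction:
The main target is the pagination: replace the OFFSET with a condition on the index key taken from the previous page. In production testing this cut the query time from about 6 s to about 3 s. The EXPLAIN output shows little other room for improvement, since the filters and joins already use the indexes available to them. One further step would be to add a composite index on (token, CreatedOn).
After optimization: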
SELECT SQL_NO_CACHE `broker`.*,`db`.`name`,db.branch
from broker
LEFT JOIN myuser mu on mu.UserGUID=broker.ModifiedBy
LEFT JOIN broker pb on pb.id=broker.parent_brokerId
LEFT JOIN broker_bank db on db.broker_id = broker.id
and isdel=0 and isdefault=1
where broker.`status` in (1,2)
and broker.token='rkqqnn1427611021'
and broker.is_delete=0
and IFNULL(broker.capacity_des,'') <> '9108'
AND broker.CreatedOn <= '2016-07-23 14:19:04'
ORDER BY CreatedOn desc
LIMIT 3000;
Execution plan: (screenshot not preserved)
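One caveat with the <= boundary in the query above: rows that share the boundary CreatedOn value can show up again on the next page. A minimal sketch of a tie-breaker follows, assuming broker.id is a unique key and that the application remembers both CreatedOn and id from the last row of the previous page. The id value 98765 is made up for illustration, and the joins and remaining filters are omitted for brevity.

```sql
-- Sketch only: 98765 is a hypothetical id of the last row on the previous page.
-- The joins and the other WHERE filters of the full query are left out.
SELECT broker.*
FROM broker
WHERE broker.token = 'rkqqnn1427611021'
  AND broker.is_delete = 0
  AND (
        broker.CreatedOn < '2016-07-23 14:19:04'
        OR (broker.CreatedOn = '2016-07-23 14:19:04' AND broker.id < 98765)
      )
-- Sorting on the same (CreatedOn, id) pair keeps the page boundary unambiguous.
ORDER BY broker.CreatedOn DESC, broker.id DESC
LIMIT 3000;
```

The strict comparison on the (CreatedOn, id) pair means each row lands on exactly one page, even when many rows share a timestamp.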
2. Changing the execution plan with HAVING
SQL before optimization:
SELECT SQL_NO_CACHE
info.proj_id,
cst.broker_cstId,
LEFT ( save_enddate, 16 ) AS save_enddate,info.end_date, mobile,
name,
( CASE WHEN cpr.check_date IS NULL THEN cst.CreatedOn ELSE cpr.check_date END ) AS lasttime,
cpr.confirm_mode,
cpr.is_delay
FROM
broker_cst cst
LEFT JOIN broker_cst_proj cpr ON cst.broker_cstId = cpr.cst_id
LEFT JOIN broker_building_info info ON info.proj_id = cpr.proj_id
WHERE
cst.regbroker_id = '39db2bc2-85d5-b48f-bec9-64d1167e1824'
AND (
( cst_status = 2 AND check_status = 2 AND ( save_enddate IS NULL OR save_enddate > LEFT ( now( ), 10 ) ) )
OR ( cst_status = 2 AND check_status = 1 )
OR ( cst_status = 3 AND check_status = 1 )
)
AND end_date >= CURRENT_DATE()
GROUP BY
cst.broker_cstId
ORDER BY
lasttime DESC
LIMIT 0,
20;
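Execution plan: (screenshot not preserved)
Optimization direction:
The execution plan diverges from the intent of the SQL. The intent is to drive the query from broker_cst, filtering on the regbroker_id index. The plan instead drives from broker_building_info, uses the proj_id index as the join key, and only filters the rows afterwards. The likely reason is that the WHERE clause contains a condition on a broker_building_info column and that table is tiny (256 rows), so MySQL wrongly estimates that driving from it will filter out most of the data. (A WHERE condition on end_date also rejects the NULL-extended rows of the LEFT JOIN, which leaves the optimizer free to treat the joins as inner joins and reorder the tables.) In practice the join on proj_id pulls in almost all of broker_cst (759,456 rows), which are then filtered in a temporary table, so the query is very slow.
With the cause understood, the fix is to move the broker_building_info condition into HAVING so that it is applied late and the intended plan is used again. The original SQL took up to 19 s in the worst case; the optimized version takes about 1 s.
After optimization: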
SELECT SQL_NO_CACHE
info.proj_id,
cst.broker_cstId,
LEFT ( save_enddate, 16 ) AS save_enddate,info.end_date,
cst_name,
mobile,
( CASE WHEN cpr.check_date IS NULL THEN cst.CreatedOn ELSE cpr.check_date END ) AS lasttime,
cpr.confirm_mode,
cpr.is_delay
FROM
broker_cst cst
LEFT JOIN broker_cst_proj cpr ON cst.broker_cstId = cpr.cst_id
LEFT JOIN broker_building_info info ON info.proj_id = cpr.proj_id
WHERE
cst.regbroker_id = '39db2bc2-85d5-b48f-bec9-64d1167e1824'
AND (
( cst_status = 2 AND check_status = 2 AND ( save_enddate IS NULL OR save_enddate > LEFT ( now( ), 10 ) ) )
OR ( cst_status = 2 AND check_status = 1 )
OR ( cst_status = 3 AND check_status = 1 )
)
GROUP BY
cst.broker_cstId
HAVING end_date >= CURRENT_DATE()
ORDER BY
lasttime DESC
LIMIT 0,
20;
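Execution plan: (screenshot not preserved)
The plan still sorts with filesort, so there is room left. Adding a composite index on (regbroker_id, lasttime) would let the sort run on the index, which presumes lasttime is available as an indexable column rather than only as the computed alias. This requires a subquery: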
SELECT * from (SELECT
info.proj_id,
cst.broker_cstId,
LEFT ( save_enddate, 16 ) AS save_enddate,info.end_date,
cst_name,
mobile_tel,
tel2,
tel3,
tel4,
lp_name AS ProjName,
cst_status,
check_status,
cst.tel_type,
( CASE WHEN cpr.check_date IS NULL THEN cst.CreatedOn ELSE cpr.check_date END ) AS lasttime,
cpr.confirm_mode,
cpr.is_delay
FROM
broker_cst cst
LEFT JOIN broker_cst_proj cpr ON cst.broker_cstId = cpr.cst_id
LEFT JOIN broker_building_info info ON info.proj_id = cpr.proj_id
WHERE
cst.regbroker_id = '39db2bc2-85d5-b48f-bec9-64d1167e1824'
AND (
( cst_status = 2 AND check_status = 2 AND ( save_enddate IS NULL OR save_enddate > LEFT ( now( ), 10 ) ) )
OR ( cst_status = 2 AND check_status = 1 )
OR ( cst_status = 3 AND check_status = 1 )
)
GROUP BY
cst.broker_cstId
ORDER BY
lasttime DESC ) AS ttt
where end_date >= CURRENT_DATE()
LIMIT 0,
20;
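Because the previous step already returns only a small number of rows, this further change makes almost no measurable difference.
3. Sort optimization
Before optimization: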
SELECT * FROM `broker` WHERE (((`broker`.`status`=0) AND (`broker`.`token`='wulncm') AND (`broker`.`is_delete`=0)) AND (`broker`.`channel` IN (0, 'xmf', 'aczl', 'api', 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22, 24, 25, 26, 27, 28, 29))) AND (IFNULL(broker.capacity_des,'') <> '9108') ORDER BY `CreatedOn` DESC LIMIT 10;
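EXPLAIN: (screenshot not preserved)
The sort uses filesort, a clear sign that it is not served by an index. Next, look at the time spent in each stage of the query:
I viewed this in Navicat for MySQL; for the command line, see this article (the relevant part is at its end):
MySQL索引原理及慢查询优化--整理 (MySQL Index Principles and Slow Query Optimization, a compilation)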
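For readers without Navicat, a rough command-line equivalent is session profiling. The sketch below assumes a MySQL version that still ships SHOW PROFILE (it is deprecated in favor of the Performance Schema) and uses a simplified form of the query above; the query id 1 assumes it is the first statement profiled in the session.

```sql
-- Turn on per-session profiling (deprecated, but still handy for a quick look).
SET profiling = 1;

-- Simplified version of the slow query; the channel filter is omitted here.
SELECT * FROM broker
WHERE status = 0 AND token = 'wulncm' AND is_delete = 0
ORDER BY CreatedOn DESC
LIMIT 10;

-- List the profiled statements with their Query_ID and total duration.
SHOW PROFILES;

-- Per-stage breakdown; replace 1 with the Query_ID reported by SHOW PROFILES.
SHOW PROFILE FOR QUERY 1;
```

A large share of time in the sorting stage points at the same problem that "Using filesort" shows in EXPLAIN.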
Adding a composite index on (token, CreatedOn) is enough.
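As a minimal sketch, the index could be created and checked as follows. The index name idx_token_createdon is hypothetical, and the query is a simplified form of the one above.

```sql
-- idx_token_createdon is a made-up name; pick one that fits your naming scheme.
ALTER TABLE broker ADD INDEX idx_token_createdon (token, CreatedOn);

-- Re-run EXPLAIN: check which index appears in the key column and
-- whether Extra still lists "Using filesort".
EXPLAIN
SELECT * FROM broker
WHERE token = 'wulncm' AND status = 0 AND is_delete = 0
ORDER BY CreatedOn DESC
LIMIT 10;
```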
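The plan now uses the newly created index and the filesort is gone. Done!
One thing to note:
MySQL can sort from an index only when all ORDER BY columns belong to the same index, appear in the same order as in the index (leading index columns pinned by an equality condition in WHERE may be left out), and all sort in the same direction (all ascending or all descending).
An example:
Index: idx_sort(`token`,`status`,`created_at`)
For these SQL statements:
1. select * from test where token='wulncm' order by token, status,created_at ;
The index can be used for sorting.
2. select * from test where token='wulncm' order by status,created_at ;
The index can be used for sorting.
3. select * from test where token='wulncm' order by created_at ;
Only the index prefix (token) is used, for filtering. Because status is skipped, ordering by created_at falls back to filesort.
4. select * from test where token='wulncm' order by status desc, created_at ;
The columns sort in different directions, so the index cannot be used for sorting: filesort.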
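The four cases can be checked with EXPLAIN. The sketch below assumes a hypothetical definition of the test table (only the indexed columns matter); on a nearly empty table the optimizer may choose a different plan, so load representative data before drawing conclusions.

```sql
-- Hypothetical table definition, used only to try out the ORDER BY cases.
CREATE TABLE test (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  token      VARCHAR(32)  NOT NULL,
  status     TINYINT      NOT NULL,
  created_at DATETIME     NOT NULL,
  KEY idx_sort (token, status, created_at)
);

-- Case 2: token is pinned by equality, so (status, created_at) follows index order.
EXPLAIN SELECT * FROM test WHERE token = 'wulncm' ORDER BY status, created_at;

-- Case 3: status is skipped, so look for "Using filesort" in Extra.
EXPLAIN SELECT * FROM test WHERE token = 'wulncm' ORDER BY created_at;

-- Case 4: mixed directions, again look for "Using filesort" in Extra.
EXPLAIN SELECT * FROM test WHERE token = 'wulncm' ORDER BY status DESC, created_at;

-- MySQL 8.0+ only: a descending key part can match case 4.
-- Earlier versions parse DESC in an index definition but ignore it.
ALTER TABLE test ADD INDEX idx_sort_mixed (token, status DESC, created_at);
```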
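4. Join optimization
- How joins work
MySQL has only one join algorithm, the well-known Nested Loop Join; it has neither the Hash Join nor the Sort Merge Join that many other databases provide. As the name suggests, a Nested Loop Join uses the driving table's result set as the base data for a loop, takes the rows of that result set one at a time as filter conditions for querying the next table, and then merges the results. If a third table takes part in the join, the joined result of the first two tables becomes the base data for the loop, and the third table is queried in the same looping fashion, and so on.
-- 《MySQL 性能调优与架构设计》 (MySQL Performance Tuning and Architecture Design)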
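To see which table MySQL picks as the driving table, read the EXPLAIN rows top to bottom: they are listed in the order the tables are read. The sketch below reuses broker_cst and broker_cst_proj from section 2 and shows STRAIGHT_JOIN as one way to pin the order; it is an illustration, not something the original tuning used. Note also that the quoted description predates MySQL 8.0.18, which added a hash join for equi-joins that cannot use an index, so it applies to nested-loop plans.

```sql
-- Conceptually, a nested loop join works like this:
--   for each row r1 of the driving table that passes the WHERE filters
--     for each row r2 of the next table matching the join condition on r1
--       emit (r1, r2)

-- The first row of the EXPLAIN output is the driving table.
EXPLAIN
SELECT cst.broker_cstId, cpr.proj_id
FROM broker_cst cst
JOIN broker_cst_proj cpr ON cpr.cst_id = cst.broker_cstId
WHERE cst.regbroker_id = '39db2bc2-85d5-b48f-bec9-64d1167e1824';

-- SELECT STRAIGHT_JOIN makes the optimizer join tables in FROM-clause order,
-- so broker_cst is read first regardless of its row estimates.
EXPLAIN
SELECT STRAIGHT_JOIN cst.broker_cstId, cpr.proj_id
FROM broker_cst cst
JOIN broker_cst_proj cpr ON cpr.cst_id = cst.broker_cstId
WHERE cst.regbroker_id = '39db2bc2-85d5-b48f-bec9-64d1167e1824';
```

Forcing the order is a blunt tool: it helps when the optimizer's row estimates are clearly wrong, as in section 2, but it can hurt once the data distribution changes.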