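I did not write this material myself. It comes from a technical talk given by my team lead, Liu Guoqing (刘国庆), while I was working at Chelun (车轮). I am writing it up here as notes, a way to put spare moments to use.
MySQL's optimizer picks the index it estimates to be cheapest. When a statement could use more than one index, the optimizer can pick the wrong one, and in that case the index has to be forced with a hint. MySQL is a bit like PHP in this respect: it has its own lexer and parser, then an optimizer, and only after that does the statement reach the executor.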
Common pitfalls in MySQL
SQL example:
DELETE FROM testtable WHERE biz_date <= '2017-08-21 00:00:00' AND status = 2 limit 500
The table is about 200 MB and holds roughly 1,000,000 rows; biz_date and status share a composite index.
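To reproduce the case it helps to have a concrete table in mind. The following is only a sketch: the table name, the two indexed columns and the index name idx_bizdate_st appear in these notes, while the column types, the id and payload columns and the storage engine are assumptions made for illustration.

```sql
-- Hypothetical schema. Only testtable, biz_date, status and idx_bizdate_st
-- come from the notes; every type and every other column is assumed.
CREATE TABLE testtable (
    id       BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    biz_date DATETIME NULL,
    status   TINYINT NULL,
    payload  VARCHAR(255) NULL,          -- stand-in for the remaining columns
    PRIMARY KEY (id),
    -- Composite index with biz_date as the leading column, as described above.
    KEY idx_bizdate_st (biz_date, status)
) ENGINE=InnoDB;
```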
Index analysis
mysql > desc select * from testtable WHERE biz_date <= '2017-08-21 00:00:00';
+----+-------------+-----------+------+----------------+------+---------+------+--------+-------------+
| id | select_type | table     | type | possible_keys  | key  | key_len | ref  | rows   | Extra       |
+----+-------------+-----------+------+----------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | testtable | ALL  | idx_bizdate_st | NULL | NULL    | NULL | 980626 | Using where |
+----+-------------+-----------+------+----------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)
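-- query on biz_date only
-- key points: rows: 980626; type: ALL
However, once status and a LIMIT are added, the plan switches to the index and the query is somewhat faster: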
mysql > desc select * from testtable WHERE biz_date <= '2017-08-21 00:00:00' and status = 2 limit 100;
+----+-------------+-----------+-------+----------------+----------------+---------+------+--------+-----------------------+
| id | select_type | table     | type  | possible_keys  | key            | key_len | ref  | rows   | Extra                 |
+----+-------------+-----------+-------+----------------+----------------+---------+------+--------+-----------------------+
|  1 | SIMPLE      | testtable | range | idx_bizdate_st | idx_bizdate_st | 6       | NULL | 490319 | Using index condition |
+----+-------------+-----------+-------+----------------+----------------+---------+------+--------+-----------------------+
1 row in set (0.00 sec)
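-- query on biz_date + status + limit
-- key point: rows: 490319
The row estimate for biz_date alone and the estimate for biz_date plus status = 2 are almost identical, at about 980,000. In reality not a single row matches biz_date plus status = 2.
The whole table holds about 990,000 rows, and MySQL estimates that an index scan would touch about 980,000 of them. So when MySQL costs the plan from its statistics, it finds that the index scan would cover nearly the entire table, gives up on the index, and chooses a full table scan.
Were the statistics wrong? We re-collected the table statistics and the estimated row count in the plan stayed the same, so our guess is that the estimate can only be based on the first column of the composite index. That fits how composite indexes work: once the leading column is matched by a range condition, the later columns cannot narrow the scanned range any further.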
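The check described above can be sketched as follows. The notes do not say which command was used to re-collect statistics; ANALYZE TABLE is assumed here because it is the standard way to do so. The two COUNT queries are an addition for comparing the optimizer's estimate with the real row counts.

```sql
-- Assumption: statistics were refreshed with ANALYZE TABLE.
ANALYZE TABLE testtable;

-- Index cardinality as the optimizer sees it after the refresh.
SHOW INDEX FROM testtable;

-- Actual row counts, to compare against the "rows" column of the plans.
SELECT COUNT(*) FROM testtable WHERE biz_date <= '2017-08-21 00:00:00';
SELECT COUNT(*) FROM testtable WHERE biz_date <= '2017-08-21 00:00:00' AND status = 2;

-- Plan for the two-condition query without LIMIT.
EXPLAIN SELECT * FROM testtable WHERE biz_date <= '2017-08-21 00:00:00' AND status = 2;
```

If the first count is close to the table size and the second is zero, the estimate is following biz_date rather than the combined condition, which is what the notes describe.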
mysql > select * from testtable WHERE biz_date <= '2017-08-21 00:00:00' and status = 2;
Empty set (0.79 sec)
mysql > select * from testtable force index(idx_bizdate_st) WHERE biz_date <= '2017-08-21 00:00:00' and status = 2;
Empty set (0.16 sec)
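With the index forced, the query is indeed much faster than without it, since the unforced query is a full table scan. But it is still very slow. What else could be done to optimize a query that ought to be fast? Build a different index? Narrow the range?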
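The first of those options, a different index, might look like the sketch below. The notes do not say whether this was ever tried, and the index name idx_st_bizdate is made up for illustration. The idea is the usual rule of putting the equality column before the range column, so that the scan is confined to entries with status = 2.

```sql
-- Assumption: idx_st_bizdate is a hypothetical index, not one from the notes.
ALTER TABLE testtable ADD INDEX idx_st_bizdate (status, biz_date);

-- With status leading, the index range is (status = 2, biz_date <= ...),
-- so the row estimate should reflect both conditions instead of biz_date alone.
EXPLAIN SELECT * FROM testtable
WHERE status = 2 AND biz_date <= '2017-08-21 00:00:00';
```

Adding an index on a table this size has a write and storage cost, so it is worth comparing the plans before keeping it.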
mysql > select * from testtable WHERE biz_date >= '2017-08-20 00:00:00' and biz_date <= '2017-08-21 00:00:00' and status = 2;
Empty set (0.00 sec)
mysql > desc select * from testtable WHERE biz_date >= '2017-08-20 00:00:00' and biz_date <= '2017-08-21 00:00:00' and status = 2;
+----+-------------+-----------+-------+----------------+----------------+---------+------+------+-----------------------+
| id | select_type | table     | type  | possible_keys  | key            | key_len | ref  | rows | Extra                 |
+----+-------------+-----------+-------+----------------+----------------+---------+------+------+-----------------------+
|  1 | SIMPLE      | testtable | range | idx_bizdate_st | idx_bizdate_st | 6       | NULL |  789 | Using index condition |
+----+-------------+-----------+-------+----------------+----------------+---------+------+------+-----------------------+
1 row in set (0.00 sec)
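Applied to the original DELETE, the bounded range could be used as in this sketch. The one-day window is taken from the query above. Walking the window backwards until all older rows are gone is an assumption about how a cleanup job would use it, not something the notes spell out.

```sql
-- Delete one bounded window at a time instead of everything up to a date.
DELETE FROM testtable
WHERE biz_date >= '2017-08-20 00:00:00'
  AND biz_date <= '2017-08-21 00:00:00'
  AND status = 2
LIMIT 500;

-- Rows removed by the previous statement. Repeat the DELETE while this
-- returns 500, then move the window one day earlier.
SELECT ROW_COUNT();
```

Note that this only removes rows inside the window, so it matches the original statement only once every earlier window has been processed as well.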
SELECT count(*) FROM `qcp_ad_monitor_info` WHERE status='2' and updated_at >= '2020-07-14' and updated_at <= '2020-07-14 23:59:59' and channel='ayj01' order by id desc
The table holds about 1,000,000 rows; updated_at and channel each have their own single-column index.
select * from `qcp_ad_monitor_info` where status='2' and updated_at >= '2020-07-14' and updated_at <= '2020-07-14 23:59:59' and channel='ayj01' order by updated_at desc limit 0,10
Summary:
As this shows, ORDER BY also influences which index the MySQL optimizer chooses.
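One way to see this effect is to compare the plans of the same filter under the two orderings, and to pin the index if the plan is poor. This is a sketch only: the notes do not show the plans and do not give the index names, so idx_updated_at below is a hypothetical name for the index on updated_at.

```sql
-- Same filter, ordered by the primary key.
EXPLAIN SELECT * FROM `qcp_ad_monitor_info`
WHERE status = '2' AND updated_at >= '2020-07-14' AND updated_at <= '2020-07-14 23:59:59'
  AND channel = 'ayj01'
ORDER BY id DESC LIMIT 0, 10;

-- Same filter, ordered by the indexed range column.
EXPLAIN SELECT * FROM `qcp_ad_monitor_info`
WHERE status = '2' AND updated_at >= '2020-07-14' AND updated_at <= '2020-07-14 23:59:59'
  AND channel = 'ayj01'
ORDER BY updated_at DESC LIMIT 0, 10;

-- Assumption: idx_updated_at is a made-up name for the updated_at index.
SELECT * FROM `qcp_ad_monitor_info` FORCE INDEX (idx_updated_at)
WHERE status = '2' AND updated_at >= '2020-07-14' AND updated_at <= '2020-07-14 23:59:59'
  AND channel = 'ayj01'
ORDER BY id DESC LIMIT 0, 10;
```

If the key column differs between the first two plans, the ordering alone changed the index choice.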
Finally, a word on find_in_set.