最近为公司写后台程序,就是数据库检索,其实就是拼sql,在测试环境运行没问题,线上却没有数据显示,最后发现是sql运行超时,整整6s,最后给优化到0.5s
表结构:
| newreport | CREATE TABLE `newreport` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`user_id` varchar(32) NOT NULL DEFAULT '' COMMENT '...',
`report_id` varchar(32) NOT NULL DEFAULT '' COMMENT '...',
`report_type` tinyint(3) unsigned NOT NULL DEFAULT '0' COMMENT '...',
`reason_ids` varchar(256) NOT NULL DEFAULT '' COMMENT '...',
`description` varchar(256) NOT NULL DEFAULT '' COMMENT '...',
`pictures` varchar(256) NOT NULL DEFAULT '' COMMENT '...',
`createTs` int(10) unsigned NOT NULL DEFAULT '0',
`comment_id` bigint(20) NOT NULL DEFAULT '0' COMMENT '...',
`status` tinyint(2) NOT NULL DEFAULT '0' COMMENT '...',
`extra` text,
`chat_type` tinyint(3) unsigned NOT NULL DEFAULT '0' COMMENT '...',
`version` varchar(32) NOT NULL DEFAULT '' COMMENT '...',
`region` varchar(32) NOT NULL DEFAULT '' COMMENT '...',
`operator` varchar(32) NOT NULL DEFAULT '' COMMENT '...'
`update_time` int(10) unsigned NOT NULL DEFAULT '0' COMMENT '...',
PRIMARY KEY (`id`),
KEY `report_id` (`report_id`),
KEY `idx_user_id` (`user_id`),
) ENGINE=InnoDB AUTO_INCREMENT=1701203 DEFAULT CHARSET=utf8 |
原sql
select sql_no_cache report_id, count(*) as report_total, max(createTs) as recent_time from newreport where report_type=4 and createTs between 1566849758 and 1567454558 and 0='\'\'' and 2=2 and status=0 group by report_id order by report_total desc, createTs limit 0,20\G
就是这样的sql运行时间6s
为createTs加了索引:
KEY `idx_createTs` (`createTs`)
时间减少到0.5s ...
当然没有完,要不然这个就太水了,加个索引就解决了还有啥说的。
createTs小范围createTs between 1566849758 and 1567454558
对应的数据量占总数的0.87%。
大范围 createTs between 1556668800 and 1567454558
对应的行数占总数97% 时间跨度20190501-20190902。
每天新增数据量接近2k,数据库总数 1701186。
当范围查询超过总数一定比例,mysql会全局扫描。
问题来了,上面的查询时间范围为大概一周,当查询时间增加到4个月时,查询时间降到3s,继续优化:
select sql_no_cache report_id, count(*) as report_total, max(createTs) as recent_time from newreport s right join (select id from newreport where createTs > 1556668800 and createTs < 1567454558) t using(id) where report_type=4 and 0='\'\'' and 2=2 and status=0 group by report_id order by report_total desc, recent_time limit 0,20;
explain结果:
explain select sql_no_cache report_id, count(*) as report_total, max(createTs) as recent_time from newreport where id in (select id from newreport where createTs > 1556668800 and createTs < 1567454558) and report_type=4 and 0='\'\'' and 2=2 and status=0 group by report_id order by report_total desc, createTs limit 0,20\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: newreport
partitions: NULL
type: ALL
possible_keys: PRIMARY,report_id
key: NULL
key_len: NULL
ref: NULL
rows: 1655044
filtered: 1.00
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: newreport
partitions: NULL
type: eq_ref
possible_keys: PRIMARY,idx_createTs
key: PRIMARY
key_len: 8
ref: starmaker.newreport.id
rows: 1
filtered: 36.68
Extra: Using where
时间区间4个月,查询时间0.8s,勉强够用,可是看到right join的sql并没有使用createTs索引,继续优化,强制让mysql使用这个索引
explain select sql_no_cache report_id, count(*) as report_total, max(createTs) as recent_time from newreport s right join (select id from newreport use index(idx_createTs) where createTs > 1556668800 and createTs < 1567454558) t using(id) where report_type=4 and 0='\'\'' and 2=2 and status=0 group by report_id order by report_total desc, createTs limit 0,20\G
explain结果:
id: 1
select_type: SIMPLE
table: newreport
partitions: NULL
type: range
possible_keys: idx_createTs
key: idx_createTs
key_len: 4
ref: NULL
rows: 607124
filtered: 100.00
Extra: Using where; Using index; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: s
partitions: NULL
type: eq_ref
possible_keys: PRIMARY,report_id
key: PRIMARY
key_len: 8
ref: starmaker.newreport.id
rows: 1
filtered: 5.00
Extra: Using where
2 rows in set, 2 warnings (0.00 sec)
扫描行数由1655044减少到607124
时间区间4个月,查询时间0.5s
当然如果可以,还能继续优化,那就是修改索引createTs为联合索引,where子句中离散型字段有点多,也就是distinct field后,没有几个值的field,可以把这几个field放在createTs前面构建联合索引,比如(field, createTs),放在createTs的前面是因为最左前缀,当只需要联合索引的后面字段时,可以把联合索引靠前的字段通过 field in (...) and createTs = ...
利用起来。
但是索引不是越多越好,因为维护索引也是需要成本的嘛,索引多了,插入更新就会困难,要做的是将已有的资源发挥到极限。
至于为什么要right join
一下,因为btree原理,createTs是非主键索引,底层维护存储的是createTs和对应记录行的主键,通过这个非聚簇索引找到满足条件的id,这个id就是主键索引,即聚簇索引,聚簇索引底层除了保存了主键,还有主键对应的记录行,也就是聚簇索引保存了数据表,直接利用聚簇索引的话,不就省了回表操作嘛。