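PostgreSQL's query planner does not always use the index we created.
For example, suppose we create a user activity table and fill it with test data: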
create table td_user_active(usr_id int,usr_name text,upd_time timestamp with time zone);
insert into td_user_active select id,md5(now()::text),(current_date - interval '1000 days')::date+generate_series(1,id%1000) from generate_series(1,10000) id;
create index idx_td_user_active on td_user_active(usr_id,upd_time desc);
Now let's query the latest activity time of each user:
select usr_id,max(upd_time) from td_user_active group by usr_id;
The query took 1702 ms.
explain analyze select usr_id,max(upd_time) from td_user_active group by usr_id;
HashAggregate (cost=121607.57..121691.54 rows=8397 width=12) (actual time=2413.421..2416.425 rows=9990 loops=1)
  -> Seq Scan on td_user_active (cost=0.00..96632.71 rows=4994971 width=12) (actual time=0.330..779.988 rows=4995000 loops=1)
Total runtime: 2417.269 ms
(3 rows)
Time: 2417.833 ms
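The query did not use the index. The planner reads all 4,995,000 rows with a Seq Scan and aggregates them with HashAggregate; on its own it does not jump through the index from one usr_id to the next.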
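By contrast, each of the two single lookups below can usually be answered by reading one entry of idx_td_user_active, which is what the Limit over Index Only Scan nodes in the plan further down reflect. This is only a sketch: the value 42 is an arbitrary example usr_id, not something from the test above, and the plans you get may differ with your version and statistics.

```sql
-- Latest upd_time for one user (42 is an arbitrary example value).
explain select max(upd_time) from td_user_active where usr_id = 42;

-- Smallest usr_id greater than a given one, i.e. the "next" user in index order.
explain select min(usr_id) from td_user_active where usr_id > 42;
```

Repeating these two cheap lookups once per user is the idea behind the recursive query that follows.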
Next, let's optimize it with a recursive CTE:
with recursive td_scan as (
  -- anchor: the smallest usr_id together with its latest upd_time
  (select usr_id,upd_time from td_user_active order by usr_id , upd_time desc limit 1)
  union all
  -- recursive step: the next larger usr_id and its latest upd_time
  (select (select min(usr_id) from td_user_active where usr_id > td_scan.usr_id) as usr_id,
          (select max(upd_time) from td_user_active
            where usr_id=(select min(usr_id) from td_user_active where usr_id > td_scan.usr_id)) as upd_time
     from td_scan
    where td_scan.usr_id is not null)
)
select * from td_scan;
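This took only 501 ms. Note that it returns 9991 rows, one more than the group by query: after the last user, the recursive step finds no larger usr_id and emits one final (NULL, NULL) row, which can be dropped by adding where usr_id is not null to the outer select.
explain analyze ...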
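To check that the rewrite returns the same answer as the original query, the two results can be compared with except. This is a minimal sketch rather than part of the original test: the CTE name grouped and the side label column are names introduced here for illustration, and the final NULL row of td_scan is filtered out before comparing.

```sql
with recursive td_scan as (
  (select usr_id, upd_time from td_user_active order by usr_id, upd_time desc limit 1)
  union all
  (select (select min(usr_id) from td_user_active where usr_id > td_scan.usr_id),
          (select max(upd_time) from td_user_active
            where usr_id = (select min(usr_id) from td_user_active where usr_id > td_scan.usr_id))
     from td_scan
    where td_scan.usr_id is not null)
),
grouped as (  -- hypothetical name: the original group by query
  select usr_id, max(upd_time) as upd_time from td_user_active group by usr_id
)
-- rows present in one result but missing from the other
select 'only in grouped'::text as side, a.*
  from (select * from grouped
        except
        select * from td_scan where usr_id is not null) a
union all
select 'only in td_scan'::text as side, b.*
  from (select * from td_scan where usr_id is not null
        except
        select * from grouped) b;
```

If both queries agree, this should return no rows.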
CTE Scan on td_scan (cost=589.02..591.04 rows=101 width=12) (actual time=0.026..488.336 rows=9991 loops=1)
  CTE td_scan
    -> Recursive Union (cost=0.43..589.02 rows=101 width=12) (actual time=0.024..482.111 rows=9991 loops=1)
          -> Limit (cost=0.43..1.78 rows=1 width=12) (actual time=0.023..0.023 rows=1 loops=1)
                -> Index Only Scan using idx_td_user_active on td_user_active td_user_active_3 (cost=0.43..6749407.37 rows=4994971 width=12) (actual time=0.021..0.021 rows=1 loops=1)
                      Heap Fetches: 1
          -> WorkTable Scan on td_scan td_scan_1 (cost=0.00..58.52 rows=10 width=4) (actual time=0.047..0.047 rows=1 loops=9991)
                Filter: (usr_id IS NOT NULL)
                Rows Removed by Filter: 0
                SubPlan 2
                  -> Result (cost=1.79..1.80 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=9990)
                        InitPlan 1 (returns $2)
                          -> Limit (cost=0.43..1.79 rows=1 width=4) (actual time=0.021..0.021 rows=1 loops=9990)
                                -> Index Only Scan using idx_td_user_active on td_user_active (cost=0.43..2260052.23 rows=1664990 width=4) (actual time=0.020..0.020 rows=1 loops=9990)
                                      Index Cond: ((usr_id IS NOT NULL) AND (usr_id > td_scan_1.usr_id))
                                      Heap Fetches: 9989
                SubPlan 6
                  -> Result (cost=4.02..4.03 rows=1 width=0) (actual time=0.023..0.023 rows=1 loops=9990)
                        InitPlan 4 (returns $5)
                          -> Result (cost=1.79..1.80 rows=1 width=0) (actual time=0.012..0.012 rows=1 loops=9990)
                                InitPlan 3 (returns $4)
                                  -> Limit (cost=0.43..1.79 rows=1 width=4) (actual time=0.011..0.011 rows=1 loops=9990)
                                        -> Index Only Scan using idx_td_user_active on td_user_active td_user_active_1 (cost=0.43..2260052.23 rows=1664990 width=4) (actual time=0.010..0.010 rows=1 loops=9990)
                                              Index Cond: ((usr_id IS NOT NULL) AND (usr_id > td_scan_1.usr_id))
                                              Heap Fetches: 9989
                        InitPlan 5 (returns $6)
                          -> Limit (cost=0.43..2.22 rows=1 width=8) (actual time=0.022..0.022 rows=1 loops=9990)
                                -> Index Only Scan using idx_td_user_active on td_user_active td_user_active_2 (cost=0.43..1065.39 rows=595 width=8) (actual time=0.009..0.009 rows=1 loops=9990)
                                      Index Cond: ((usr_id = $5) AND (upd_time IS NOT NULL))
                                      Heap Fetches: 9989
Total runtime: 489.845 ms
(31 rows)
Time: 491.155 ms
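The plan above computes min(usr_id) twice per step (SubPlan 2 and InitPlan 3). One possible variation, sketched here and not timed in this test, fetches the next usr_id and its latest upd_time in a single index lookup with a lateral subquery. It assumes PostgreSQL 9.3 or later (where lateral is available) and that upd_time contains no NULLs, since order by upd_time desc would sort NULLs first; the aliases s, t and nxt are names chosen for illustration.

```sql
with recursive td_scan as (
  -- anchor: first usr_id in index order with its latest upd_time
  (select usr_id, upd_time
     from td_user_active
    order by usr_id, upd_time desc
    limit 1)
  union all
  -- recursive step: one lookup returns both the next usr_id and its latest upd_time
  select nxt.usr_id, nxt.upd_time
    from td_scan s
   cross join lateral (
         select t.usr_id, t.upd_time
           from td_user_active t
          where t.usr_id > s.usr_id
          order by t.usr_id, t.upd_time desc
          limit 1) nxt
)
select usr_id, upd_time from td_scan;
```

Because the lateral subquery returns no row once there is no larger usr_id, the recursion stops by itself and no trailing (NULL, NULL) row is produced. Whether it is actually faster than the version above should be checked with explain analyze on your own data.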