分页技术(一),已经证明limit-offset带来的性能下降,即使增加了order by 和索引,也同样性能下降的问题。稍后我将例子给证明。
解决这个问题的方法是:排序+过滤条件分页
创建测试表,插入1千万记录,并创建索引:
highgo=# drop table testhyl;
DROP TABLE
highgo=# create table testhyl(id serial,name varchar);
CREATE TABLE
highgo=# insert into testhyl (name) values (generate_series(1,10000000)||'hyl');
INSERT 0 10000000
highgo=# analyze test;
ANALYZE
highgo=#create index testhyl_i_id on testhyl (id);
highgo=# \d testhyl
数据表 "public.testhyl"
栏位 | 类型 | 修饰词
------+-------------------+-----------------------------------------------
id | integer | 非空 默认 nextval('testhyl_id_seq'::regclass)
name | character varying |
索引:
"testhyl_i_id" btree (id)
highgo=# select * from testhyl order by id limit 5 offset 100;
id | name
-----+--------
101 | 101hyl
102 | 102hyl
103 | 103hyl
104 | 104hyl
105 | 105hyl
(5 行记录)
highgo=# select * from testhyl where id>100 order by id limit 5;
id | name
-----+--------
101 | 101hyl
102 | 102hyl
103 | 103hyl
104 | 104hyl
105 | 105hyl
(5 行记录)
1、取id=100后的5条数据:
highgo=# explain analyze select * from testhyl order by id limit 5 offset 100;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Limit (cost=5.19..5.43 rows=5 width=36) (actual time=0.036..0.037 rows=5 loops=1)
-> Index Scan using testhyl_i_id on testhyl (cost=0.44..475908.44 rows=10000000 width=36) (actual time=0.017..0.034 rows=105 loops=1)
Planning time: 0.089 ms
Execution time: 0.055 ms
(4 行记录)
highgo=# explain analyze select * from testhyl where id>100 order by id limit 5;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.44..0.90 rows=5 width=36) (actual time=0.032..0.033 rows=5 loops=1)
-> Index Scan using testhyl_i_id on testhyl (cost=0.44..311117.76 rows=3333333 width=36) (actual time=0.030..0.031 rows=5 loops=1)
Index Cond: (id > 100)
Planning time: 0.115 ms
Execution time: 0.058 ms
(5 行记录)
2、取id=1000后的5条数据:
highgo=# explain analyze select * from testhyl order by id limit 5 offset 1000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Limit (cost=48.03..48.26 rows=5 width=36) (actual time=0.319..0.321 rows=5 loops=1)
-> Index Scan using testhyl_i_id on testhyl (cost=0.44..475908.44 rows=10000000 width=36) (actual time=0.022..0.297 rows=1005 loops=1)
Planning time: 0.099 ms
Execution time: 0.344 ms
(4 行记录)
highgo=# explain analyze select * from testhyl where id>1000 order by id limit 5;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.44..0.90 rows=5 width=36) (actual time=0.020..0.021 rows=5 loops=1)
-> Index Scan using testhyl_i_id on testhyl (cost=0.44..311117.76 rows=3333333 width=36) (actual time=0.019..0.020 rows=5 loops=1)
Index Cond: (id > 1000)
Planning time: 0.093 ms
Execution time: 0.041 ms
(5 行记录)
3、取id=10000后的5条数据:
highgo=# explain analyze select * from testhyl order by id limit 5 offset 10000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Limit (cost=476.34..476.58 rows=5 width=36) (actual time=1.720..1.720 rows=5 loops=1)
-> Index Scan using testhyl_i_id on testhyl (cost=0.44..475908.44 rows=10000000 width=36) (actual time=0.020..1.577 rows=10005 loops=1)
Planning time: 0.084 ms
Execution time: 1.739 ms
(4 行记录)
highgo=# explain analyze select * from testhyl where id>10000 order by id limit 5;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Limit (cost=0.44..0.90 rows=5 width=36) (actual time=0.026..0.028 rows=5 loops=1)
-> Index Scan using testhyl_i_id on testhyl (cost=0.44..311117.76 rows=3333333 width=36) (actual time=0.025..0.026 rows=5 loops=1)
Index Cond: (id > 10000)
Planning time: 0.099 ms
Execution time: 0.049 ms
(5 行记录)
4、取id=100000后的5条数据:
highgo=# explain analyze select * from testhyl order by id limit 5 offset 100000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Limit (cost=4759.52..4759.75 rows=5 width=36) (actual time=15.336..15.338 rows=5 loops=1)
-> Index Scan using testhyl_i_id on testhyl (cost=0.44..475908.44 rows=10000000 width=36) (actual time=0.021..13.956 rows=100005 loops=1)
Planning time: 0.103 ms
Execution time: 15.361 ms
(4 行记录)
highgo=# explain analyze select * from testhyl where id>100000 order by id limit 5;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.44..0.90 rows=5 width=36) (actual time=0.028..0.030 rows=5 loops=1)
-> Index Scan using testhyl_i_id on testhyl (cost=0.44..311117.76 rows=3333333 width=36) (actual time=0.027..0.028 rows=5 loops=1)
Index Cond: (id > 100000)
Planning time: 0.106 ms
Execution time: 0.054 ms
(5 行记录)
5、取id=1000000后的5条数据:
highgo=# explain analyze select * from testhyl order by id limit 5 offset 1000000;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Limit (cost=47591.24..47591.47 rows=5 width=36) (actual time=142.140..142.141 rows=5 loops=1)
-> Index Scan using testhyl_i_id on testhyl (cost=0.44..475908.44 rows=10000000 width=36) (actual time=0.031..130.140 rows=1000005 loops=1)
Planning time: 0.104 ms
Execution time: 142.166 ms
(4 行记录)
highgo=# explain analyze select * from testhyl where id>1000000 order by id limit 5;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Limit (cost=0.44..0.90 rows=5 width=36) (actual time=0.024..0.025 rows=5 loops=1)
-> Index Scan using testhyl_i_id on testhyl (cost=0.44..311117.76 rows=3333333 width=36) (actual time=0.022..0.023 rows=5 loops=1)
Index Cond: (id > 1000000)
Planning time: 0.086 ms
Execution time: 0.043 ms
(5 行记录)
对testhyl表(数据条数:一千万条)
分别使用order by+limit-offset和where+order by+limit方法来实现分页,所耗用的时间对比分析:
说明:以上内容来源于瀚高基础软件 韩永利老师 分页技术分享部分章节,特此鸣谢!
By 波罗