PostgreSQL 9.6 内核优化 - sort 性能增强

作者

digoal

日期

2016-10-09

背景

PostgreSQL 9.6在排序这块做了一些性能增强，前面一篇主要讲了排序算法的变更。

《PostgreSQL 9.6 内核优化 - sort性能增强(batch化quicksort代替replacement selection when work_mem small)》

本文针对另外几个SORT增强的点再简单介绍一下。

PostgreSQL 9.6 排序增强点

1. 文本排序性能增强，当相同的文本多次出现时，性能有所增强。

Speed up text sorts where the same string occurs multiple times (Peter Geoghegan)

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=0e57b4d8bd9674adaf5747421b3255b85e385534

Speed up text sorts where the same strings occur multiple times.

Cache strxfrm() blobs across calls made to the text SortSupport abbreviation routine.  
This can speed up sorting if the same string needs to be abbreviated many times in a row.

Also, cache the result of the previous strcoll() comparison, so that if we're asked to compare the same strings agin, we do need to call strcoll() again.

Perhaps surprisingly, these optimizations don't seem to hurt even when they don't help.  
memcmp() is really cheap compared to strcoll() or strxfrm().

Peter Geoghegan, reviewed by me.

2. 对uuid,bytea,char(n)的排序性能增强，使用abb keys, 整型比较算法取代memcmp。

Speed up sorting of uuid, bytea, and char(n) fields by using "abbreviated" keys (Peter Geoghegan)

Support for abbreviated keys has also been added to the non-default operator classes text_pattern_ops, varchar_pattern_ops, and bpchar_pattern_ops.

Processing of ordered-set aggregates can also now exploit abbreviated keys.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a76ef15d9fc9207a0758e8d6f6700dc8c931a934

Add sort support routine for the UUID data type.

This introduces a simple encoding scheme to produce abbreviated keys:
pack as many bytes of each UUID as will fit into a Datum.  
On little-endian machines, a byteswap is also performed; 
the abbreviated comparator can therefore just consist of a simple 3-way unsigned integer comparison.

The purpose of this change is to speed up sorting data on a column of type UUID.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=bfb54ff15a447fb22e9deae096e0d45b3e4bd56f

Make abbreviated key comparisons for text a bit cheaper.

If we do some byte-swapping while abbreviating, 
we can do comparisons using integer arithmetic rather than memcmp.

3. 并行创建索引(非堵塞式)的性能增强

Speed up CREATE INDEX CONCURRENTLY by treating TIDs as 64-bit integers during sorting (Peter Geoghegan)

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b648b70342fbe712383e8cd76dc8f7feaba9aaa3

Speed up CREATE INDEX CONCURRENTLY's TID sort. 

Encode TIDs as 64-bit integers to speed up comparisons.   
This seems to speed things up on all platforms, but is even more beneficial when 8-byte integers are passed by value. 

Peter Geoghegan.  Design suggestions and review by Tom Lane.  
Review also by Simon Riggs and by me.

测试例子

1. 高度重复文本排序

1000万记录

postgres=# create table test(id int, info text);
CREATE TABLE
postgres=# insert into test select generate_series(1,10000000), '111111111111111111111111111111111111111111111111111111111111';
INSERT 0 10000000

1.1 PostgreSQL 9.5

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test order by info ;
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=1854886.10..1879886.14 rows=10000017 width=65) (actual time=9035.319..10444.277 rows=10000000 loops=1)
   Output: id, info
   Sort Key: test.info
   Sort Method: external sort  Disk: 733144kB
   Buffers: shared hit=123457, temp read=91643 written=91643
   ->  Seq Scan on public.test  (cost=0.00..223457.17 rows=10000017 width=65) (actual time=0.007..1135.602 rows=10000000 loops=1)
         Output: id, info
         Buffers: shared hit=123457
 Planning time: 0.095 ms
 Execution time: 11151.855 ms
(10 rows)

排序耗时7.9秒

1.2 PostgreSQL 9.6

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test order by info ;
                                                           QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=4544431.28..4612332.63 rows=27160540 width=9) (actual time=8141.378..9660.753 rows=10000000 loops=1)
   Output: id, info
   Sort Key: test.info
   Sort Method: external sort  Disk: 733144kB
   Buffers: shared hit=123457, temp read=91643 written=91643
   ->  Seq Scan on public.test  (cost=0.00..395062.40 rows=27160540 width=9) (actual time=0.011..1188.940 rows=10000000 loops=1)
         Output: id, info
         Buffers: shared hit=123457
 Planning time: 0.053 ms
 Execution time: 10387.210 ms
(10 rows)

排序耗时6.9秒

2. 离散文本排序

1000万记录

postgres=# create table test_rand(id int, info text);
CREATE TABLE
postgres=# insert into test_rand select generate_series(1,10000000), md5(random()::text);
INSERT 0 10000000

2.1 PostgreSQL 9.5

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test_rand order by info ;
                                                              QUERY PLAN                                                               
---------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=1754736.47..1781195.02 rows=10583418 width=36) (actual time=21266.183..26347.778 rows=10000000 loops=1)
   Output: id, info
   Sort Key: test_rand.info
   Sort Method: external merge  Disk: 459464kB
   Buffers: shared hit=83334, temp read=150352 written=150352
   ->  Seq Scan on public.test_rand  (cost=0.00..189168.18 rows=10583418 width=36) (actual time=0.013..1363.646 rows=10000000 loops=1)
         Output: id, info
         Buffers: shared hit=83334
 Planning time: 0.069 ms
 Execution time: 27005.070 ms
(10 rows)

排序耗时19.9秒

2.2 PostgreSQL 9.6

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test_rand order by info ;
                                                              QUERY PLAN                                                              
--------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=1658485.88..1683485.52 rows=9999858 width=37) (actual time=16424.010..21024.141 rows=10000000 loops=1)
   Output: id, info
   Sort Key: test_rand.info
   Sort Method: external merge  Disk: 459464kB
   Buffers: shared hit=83334, temp read=179142 written=179142
   ->  Seq Scan on public.test_rand  (cost=0.00..183332.58 rows=9999858 width=37) (actual time=0.011..1025.182 rows=10000000 loops=1)
         Output: id, info
         Buffers: shared hit=83334
 Planning time: 0.053 ms
 Execution time: 21691.354 ms
(10 rows)

排序耗时15秒

3. char(n)排序

1000万记录

postgres=# create table char_32(id char(32));
CREATE TABLE
postgres=# insert into char_32 select md5(random()::text) from generate_series(1,10000000);
INSERT 0 10000000

3.1 PostgreSQL 9.5

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from char_32 order by id;
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=925778.75..936403.83 rows=4250034 width=132) (actual time=29156.470..35327.059 rows=10000000 loops=1)
   Output: id
   Sort Key: char_32.id
   Sort Method: external merge  Disk: 420408kB
   Buffers: shared hit=83339, temp read=137591 written=137591
   ->  Seq Scan on public.char_32  (cost=0.00..125834.34 rows=4250034 width=132) (actual time=0.009..1324.482 rows=10000000 loops=1)
         Output: id
         Buffers: shared hit=83334
 Planning time: 0.085 ms
 Execution time: 35961.840 ms
(10 rows)

排序耗时27.8秒

3.2 PostgreSQL 9.6

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from char_32 order by id;
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=1658523.51..1683523.71 rows=10000080 width=33) (actual time=17423.580..22348.735 rows=10000000 loops=1)
   Output: id
   Sort Key: char_32.id
   Sort Method: external merge  Disk: 420432kB
   Buffers: shared hit=83339, temp read=163911 written=163911
   ->  Seq Scan on public.char_32  (cost=0.00..183334.80 rows=10000080 width=33) (actual time=0.019..1203.205 rows=10000000 loops=1)
         Output: id
         Buffers: shared hit=83336
 Planning time: 0.200 ms
 Execution time: 23013.343 ms
(10 rows)

排序耗时16.2秒

4. 非堵塞式创建索引

5000万随机数据

postgres=# create table curr_idx_test(id int);
CREATE TABLE
postgres=# insert into curr_idx_test select 50000000*random() from generate_series(1,50000000);
INSERT 0 50000000
postgres=# \dt+ curr_idx_test 
                         List of relations
 Schema |     Name      | Type  |  Owner   |  Size   | Description 
--------+---------------+-------+----------+---------+-------------
 public | curr_idx_test | table | postgres | 1729 MB | 
(1 row)

4.1 PostgreSQL 9.5

postgres=# set maintenance_work_mem='16GB';
SET
postgres=# \timing
Timing is on.
postgres=# create index CONCURRENTLY idx1_curr_idx_test on curr_idx_test(id);
CREATE INDEX
Time: 92591.646 ms

4.2 PostgreSQL 9.6

postgres=# set maintenance_work_mem='16GB';
SET
postgres=# \timing
Timing is on.
postgres=# create index CONCURRENTLY idx1_curr_idx_test on curr_idx_test(id);
CREATE INDEX
Time: 56483.672 ms

5. 堵塞式创建索引

5.1 PostgreSQL 9.5

postgres=# create index idx2_curr_idx_test on curr_idx_test(id);
CREATE INDEX
Time: 40733.880 ms

5.2 PostgreSQL 9.6

postgres=# create index idx2_curr_idx_test on curr_idx_test(id);
CREATE INDEX
Time: 40713.340 ms

结果对比

1. 高度重复文本排序(1000万记录)
PostgreSQL 9.5 排序耗时7.9秒
PostgreSQL 9.6 排序耗时6.9秒

2. 离散文本排序(1000万记录)，quicksort的优化
PostgreSQL 9.5 排序耗时19.9秒
PostgreSQL 9.6 排序耗时15秒

3. char(n)排序(1000万记录)
PostgreSQL 9.5 排序耗时27.8秒
PostgreSQL 9.6 排序耗时16.2秒

4. 非堵塞式创建索引(5000万记录)
PostgreSQL 9.5 排序耗时92.6秒
PostgreSQL 9.6 排序耗时56.5秒

5. 堵塞式创建索引(5000万记录)
PostgreSQL 9.5 排序耗时40.7秒
PostgreSQL 9.6 排序耗时40.7秒

Count

PostgreSQL 9.6 内核优化 - sort 性能增强