第1章 简介
1.1 参考文档
https://www.percona.com/blog/postgresql-tuple-level-statistics-with-pgstattuple/
1.2 关于pgstattuble
由于Postgres表膨胀会降低数据库性能,因此我们可以通过消除表膨胀来提高其性能。我们可以使用pgstattuple扩展来识别膨胀的表。
这个扩展提供了几个函数来获取元级统计信息。因为pgstattuple函数产生大量的页面级信息,所以默认情况下对它们的访问是有限的。默认情况下,只有pg_stat_scan_tables角色有能力执行pgstattuple函数。
使用pgstattuple函数,我们可以列出死元组百分比高的表,并运行手动VACUUM来回收死元组占用的空间。
第2章 pgstattuble 插件安装 & 测试
2.1 PG15 & pg_tde编译
2.1.1 操作系统信息
我的操作系统是CentOS 8.5,其内核信息如下
[root@pg-server01 ~]# uname -a
Linux pg-server01 4.18.0-348.el8.x86_64 #1 SMP Tue Oct 19 15:14:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
[root@pg-server01 ~]# cat /etc/redhat-release
CentOS Linux release 8.5.2111
2.1.2 配置阿里云 yum 源
https://www.cnblogs.com/hunttown/p/16287988.html
2.1.3 安装相关依赖
yum -y install gcc readline readline-devel zlib-devel openssl-devel libicu-devel make json-c-devel git wget
2.1.4 下载PG15 源码&编译
mkdir -p /data/software/pg/;
cd /data/software/pg/;
wget https://ftp.postgresql.org/pub/source/v15.4/postgresql-15.4.tar.gz
tar -xvf postgresql-15.4.tar.gz;
cd ./postgresql-15.4;
./configure
–prefix=/opt/pgsql
–with-openssl
make && make install
2.1.5 编译 pgstattuble 插件
cd /data/software/pg/postgresql-15.4/;
cd ./contrib/pgstattuple;
make && make install
2.1.6 编译 btree_gin 插件
cd /data/software/pg/postgresql-15.4/;
cd ./contrib/btree_gin;
make && make install
2.2 安装配置PG
2.2.1 添加 postgres用户并授权
useradd postgres
chown -R postgres:postgres /home/postgres;
chown -R postgres:postgres /opt/pgsql;
2.2.2 配置环境变量
vi /etc/profile # 添加如下内容
export PGHOME=/opt/pgsql
export PATH= P A T H : PATH: PATH:PGHOME/bin
2.2.3 初始化数据库
mkdir -p /data/pgdata/;
chown postgres:postgres /data/pgdata/;
su - postgres
initdb -D /data/pgdata/ -U postgres --encoding=UTF8 --lc-collate=en_US.UTF-8 --lc-ctype=en_US.UTF-8
pg_ctl -D /data/pgdata/ start
2.3 测试 pgstattuble 插件
2.3.1 开启 pgstattuble 插件
su - postgres
psql
select * from pg_available_extensions where name=‘pgstattuple’;
create extension pgstattuple;
select * from pg_available_extensions where name=‘pgstattuple’;
[root@pg-server01 opt]# su - postgres
[postgres@pg-server01 ~]$ psql
psql (15.4)
Type “help” for help.
postgres=# select * from pg_available_extensions where name=‘pgstattuple’;
name | default_version | installed_version | comment
-------------±----------------±------------------±----------------------------
pgstattuple | 1.5 | | show tuple-level statistics
(1 row)
postgres=# create extension pgstattuple;
CREATE EXTENSION
postgres=# select * from pg_available_extensions where name=‘pgstattuple’;
name | default_version | installed_version | comment
-------------±----------------±------------------±----------------------------
pgstattuple | 1.5 | 1.5 | show tuple-level statistics
(1 row)
postgres=#
grant pg_stat_scan_tables to ;
2.3.2 pgstattuble测试前准备
接下来,在使用 pgstattuple 函数之前,让我们创建一个表和索引来进行演示。
psql
create table workshop (jobno int);
insert into workshop values (generate_series(1,80000));
create index workshop_index on workshop (jobno);
[postgres@pg-server01 ~]$ psql
psql (15.4)
Type “help” for help.
postgres=# create table workshop (jobno int);
CREATE TABLE
postgres=# insert into workshop values (generate_series(1,80000));
INSERT 0 80000
postgres=# create index workshop_index on workshop (jobno);
CREATE INDEX
postgres=#
2.3.3 pgstattuple 功能
2.3.3.1 pgstattuple(regclass)
这个函数返回一个关系的物理长度,“死亡”元组的百分比,以及其他信息。这可以帮助用户确定是否需要真空。参数是目标关系的名称(可选的模式限定)或OID。例如:
SELECT * FROM pgstattuple(‘pg_catalog.pg_proc’);
SELECT * FROM pgstattuple(‘workshop’);
postgres=# \x
Expanded display is on.
postgres=# SELECT * FROM pgstattuple(‘pg_catalog.pg_proc’);
-[ RECORD 1 ]------±------
table_len | 786432
tuple_count | 3253
tuple_len | 731586
tuple_percent | 93.03
dead_tuple_count | 12
dead_tuple_len | 4790
dead_tuple_percent | 0.61
free_space | 22564
free_percent | 2.87
postgres=# SELECT * FROM pgstattuple(‘workshop’);
-[ RECORD 1 ]------±-------
table_len | 2899968
tuple_count | 80000
tuple_len | 2240000
tuple_percent | 77.24
dead_tuple_count | 0
dead_tuple_len | 0
dead_tuple_percent | 0
free_space | 10056
free_percent | 0.35
2.3.3.3 让我们在删除或更新行时检查元组统计信息
postgres=# SELECT * FROM pgstattuple(‘workshop’);
-[ RECORD 1 ]------±-------
table_len | 2899968
tuple_count | 80000
tuple_len | 2240000
tuple_percent | 77.24
dead_tuple_count | 0
dead_tuple_len | 0
dead_tuple_percent | 0
free_space | 10056
free_percent | 0.35
DELETE FROM workshop WHERE jobno % 8 = 0;
postgres=# SELECT * FROM pgstattuple(‘workshop’);
-[ RECORD 1 ]------±-------
table_len | 2899968
tuple_count | 70000
tuple_len | 1960000
tuple_percent | 67.59
dead_tuple_count | 10000
dead_tuple_len | 280000
dead_tuple_percent | 9.66
free_space | 10056
free_percent | 0.35
2.3.3.4 在表上执行VACUUM命令
vacuum workshop;
SELECT * FROM pgstattuple(‘workshop’);
vacuum full workshop;
SELECT * FROM pgstattuple(‘workshop’);
postgres=# vacuum workshop;
VACUUM
postgres=# SELECT * FROM pgstattuple(‘workshop’);
-[ RECORD 1 ]------±-------
table_len | 2899968
tuple_count | 70000
tuple_len | 1960000
tuple_percent | 67.59
dead_tuple_count | 0
dead_tuple_len | 0
dead_tuple_percent | 0
free_space | 330412
free_percent | 11.39
postgres=# vacuum full workshop;
VACUUM
postgres=# SELECT * FROM pgstattuple(‘workshop’);
-[ RECORD 1 ]------±-------
table_len | 2539520
tuple_count | 70000
tuple_len | 1960000
tuple_percent | 77.18
dead_tuple_count | 0
dead_tuple_len | 0
dead_tuple_percent | 0
free_space | 10840
free_percent | 0.43
2.3.3.5 pgstattuple 查询检查表膨胀
select relname,(pgstattuple(oid)).dead_tuple_percent
from pg_class
where relkind = ‘r’
order by dead_tuple_percent desc limit 10;
postgres=# \x
Expanded display is off.
postgres=#
postgres=# select relname,(pgstattuple(oid)).dead_tuple_percent
postgres-# from pg_class
postgres-# where relkind = ‘r’
postgres-# order by dead_tuple_percent desc limit 10;
relname | dead_tuple_percent
-----------------------±-------------------
pg_init_privs | 2.56
pg_extension | 1.28
pg_class | 0.75
pg_proc | 0.61
pg_depend | 0.47
pg_statistic | 0.36
pg_type | 0.29
pg_attribute | 0.06
pg_user_mapping | 0
pg_statistic_ext_data | 0
(10 rows)
SELECT relname, oid, relowner,
(pgstattuple(oid)).*
FROM pg_class
WHERE relkind = ‘r’ order by dead_tuple_percent desc limit 5;
postgres=# SELECT relname, oid, relowner,
postgres-# (pgstattuple(oid)).*
postgres-# FROM pg_class
postgres-# WHERE relkind = ‘r’ order by dead_tuple_percent desc limit 5;
-[ RECORD 1 ]------±-------------
relname | pg_init_privs
oid | 3394
relowner | 10
table_len | 24576
tuple_count | 222
tuple_len | 17316
tuple_percent | 70.46
dead_tuple_count | 9
dead_tuple_len | 630
dead_tuple_percent | 2.56
free_space | 4568
free_percent | 18.59
-[ RECORD 2 ]------±-------------
relname | pg_extension
oid | 3079
relowner | 10
table_len | 8192
tuple_count | 2
tuple_len | 210
tuple_percent | 2.56
dead_tuple_count | 1
dead_tuple_len | 105
dead_tuple_percent | 1.28
free_space | 7816
free_percent | 95.41
-[ RECORD 3 ]------±-------------
relname | pg_class
oid | 1259
relowner | 10
table_len | 114688
tuple_count | 412
tuple_len | 79912
tuple_percent | 69.68
dead_tuple_count | 5
dead_tuple_len | 860
dead_tuple_percent | 0.75
free_space | 28656
free_percent | 24.99
-[ RECORD 4 ]------±-------------
relname | pg_proc
oid | 1255
relowner | 10
table_len | 786432
tuple_count | 3253
tuple_len | 731586
tuple_percent | 93.03
dead_tuple_count | 12
dead_tuple_len | 4790
dead_tuple_percent | 0.61
free_space | 22564
free_percent | 2.87
-[ RECORD 5 ]------±-------------
relname | pg_depend
oid | 2608
relowner | 10
table_len | 114688
tuple_count | 1854
tuple_len | 90846
tuple_percent | 79.21
dead_tuple_count | 11
dead_tuple_len | 539
dead_tuple_percent | 0.47
free_space | 2288
free_percent | 1.99
2.3.4 pgstatindex(regclass)
2.3.4.1 这个函数返回一条显示b树索引信息的记录
SELECT * FROM pgstatindex (‘workshop_index’);
postgres=# SELECT * FROM pgstatindex (‘workshop_index’);
-[ RECORD 1 ]------±-------
version | 4
tree_level | 1
index_size | 1589248
root_block_no | 3
internal_pages | 1
leaf_pages | 192
empty_pages | 0
deleted_pages | 0
avg_leaf_density | 89.74
leaf_fragmentation | 0
2.3.5 pgstatginindex(regclass)
2.3.5.1 这个函数返回一条显示GIN索引信息的记录
GIN索引参考:https://pganalyze.com/blog/gin-index
– DROP TABLE IF EXISTS test;
CREATE TABLE test (
id bigserial PRIMARY KEY,
data jsonb
);
INSERT INTO test(data) VALUES (‘{“field”: “value1”}’);
INSERT INTO test(data) VALUES (‘{“field”: “value2”}’);
INSERT INTO test(data) VALUES (‘{“other_field”: “value42”}’);
– CREATE INDEX ON test USING gin(data jsonb_path_ops);
CREATE INDEX test_data_gin_idx ON test USING gin(data jsonb_path_ops);
EXPLAIN SELECT * FROM test WHERE data @> ‘{“field”: “value1”}’;
SELECT * FROM pgstatginindex(‘test_data_gin_idx’);
postgres=# SELECT * FROM pgstatginindex(‘test_data_gin_idx’);
version | pending_pages | pending_tuples
---------±--------------±---------------
2 | 0 | 0
(1 row)
select *
from pg_indexes
where tablename = ‘test’;
dnf install perf
sudo perf probe -x /opt/pgsql/bin/postgres ginInsertCleanup
sudo perf stat -a -e probe_postgres:ginInsertCleanup – sleep 60
– DROP TABLE IF EXISTS records;
CREATE TABLE records (
id bigserial PRIMARY KEY,
customer_id int4,
data jsonb
);
CREATE EXTENSION btree_gin;
CREATE INDEX ON records USING gin (data, customer_id);
EXPLAIN SELECT * FROM records WHERE customer_id = 123 AND data @> ‘{ “location”: “New York” }’;
EXPLAIN SELECT * FROM records WHERE customer_id = 123;
select *
from pg_indexes
where tablename = ‘records’;
SELECT * FROM pgstatginindex(‘records_data_customer_id_idx’);
CREATE TABLE post(
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
content TEXT NOT NULL,
author_id INT NOT NULL,
language text NOT NULL DEFAULT(‘english’),
document tsvector
);
SELECT id, title FROM post
WHERE post.document @@ to_tsquery(‘english’, ‘Endangered & Species’)
ORDER BY ts_rank(post.document, to_tsquery(‘english’, ‘Endangered & Species’)) DESC
2.3.6 pgstathashindex(regclass)
2.3.6.1 参考文档
https://postgrespro.com/blog/pgsql/4161321
https://postgrespro.com/docs/postgrespro/15/demodb-bookings-installation
2.3.6.2 测试
cd /data/software/pg/demo-data/;
psql -f demo-small-en-20170815.sql
psql
\c demo
create index on flights using hash(flight_no);
create extension pgstattuple;
select * from pg_available_extensions where name=‘pgstattuple’;
select *
from pg_indexes
where tablename = ‘flights’;
explain (costs off) select * from flights where flight_no = ‘PG0001’;
select * from pgstathashindex(‘flights_flight_no_idx’);
Bitmap Heap Scan on flights
Recheck Cond: (flight_no = ‘PG0001’::bpchar)
-> Bitmap Index Scan on flights_flight_no_idx
Index Cond: (flight_no = ‘PG0001’::bpchar)
(4 rows)
demo=# select * from pgstathashindex(‘flights_flight_no_idx’);
version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | live_items | dead_items | free_percent
---------±-------------±---------------±-------------±-------------±-----------±-----------±------------------
4 | 128 | 13 | 1 | 0 | 33121 | 0 | 42.36979656038809
(1 row)