A walkthrough of the pageinspect source code

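pageinspect provides functions for inspecting the contents of database pages at a low level, which is useful for debugging. All of its functions can be called only by superusers.
Its source code is in the contrib/pageinspect directory of the openGauss source tree.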

[gs@sgnode ~]$ cd /data/openGauss-server/contrib/pageinspect
[gs@sgnode pageinspect]$ ll
total 76
-rw-rw-r--. 1 gs gs 15099 Jun 22 13:23 btreefuncs.cpp
-rw-rw-r--. 1 gs gs  1637 Jun 22 13:24 fsmfuncs.cpp
-rw-rw-r--. 1 gs gs  9185 Jun 22 13:13 ginfuncs.cpp
-rwxrwxr-x. 1 gs gs  6743 Jun 22 13:23 heapfuncs.cpp
-rw-rw-r--. 1 gs gs   458 Jun 22 13:13 Makefile
-rw-rw-r--. 1 gs gs  3581 Jun 22 13:13 pageinspect--1.0.sql
-rw-rw-r--. 1 gs gs   173 Jun 22 13:13 pageinspect.control
-rw-rw-r--. 1 gs gs  1255 Jun 22 13:13 pageinspect--unpackaged--1.0.sql
-rw-rw-r--. 1 gs gs 17263 Jun 22 13:23 rawpage.cpp

Building pageinspect

Build directly in the source directory. After make install, the resulting files are placed under the corresponding paths in $GAUSSHOME.

[gs@sgnode pageinspect]$ make
g++ -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0 -fsigned-char -DSTREAMPLAN -DPGXC -mcx16 -msse4.2 -O0 -Wall -Wpointer-arith -Wno-write-strings -fnon-call-exceptions -fno-common -freg-struct-return -pipe -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -DENABLE_GSTRACE -fno-aggressive-loop-optimizations -Wno-attributes -fno-omit-frame-pointer -fno-expensive-optimizations -Wno-unused-but-set-variable -fstack-protector -Wl,-z,relro,-z,now -Wl,-z,noexecstack -std=c++14 -pthread  -D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -fpic -I. -I. -I../../src/include -I../../src/lib/gstrace -D_GNU_SOURCE  -I/data/openGauss-third_party_binarylibs/dependency/install_tools_centos7.6_x86_64/unixodbc/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libobs/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libcgroup/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/openssl/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/liborc/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libparquet/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/protobuf/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/grpc/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/boost/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/llvm/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/kerberos/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/postgresql-hll/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/cjson/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libedit/comm/include 
-I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/numactl/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/zlib1.2.11/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/lz4/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libcurl/comm/include  -c -o rawpage.o rawpage.cpp
g++ -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0 -fsigned-char -DSTREAMPLAN -DPGXC -mcx16 -msse4.2 -O0 -Wall -Wpointer-arith -Wno-write-strings -fnon-call-exceptions -fno-common -freg-struct-return -pipe -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -DENABLE_GSTRACE -fno-aggressive-loop-optimizations -Wno-attributes -fno-omit-frame-pointer -fno-expensive-optimizations -Wno-unused-but-set-variable -fstack-protector -Wl,-z,relro,-z,now -Wl,-z,noexecstack -std=c++14 -pthread  -D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -fpic -I. -I. -I../../src/include -I../../src/lib/gstrace -D_GNU_SOURCE  -I/data/openGauss-third_party_binarylibs/dependency/install_tools_centos7.6_x86_64/unixodbc/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libobs/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libcgroup/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/openssl/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/liborc/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libparquet/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/protobuf/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/grpc/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/boost/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/llvm/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/kerberos/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/postgresql-hll/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/cjson/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libedit/comm/include 
-I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/numactl/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/zlib1.2.11/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/lz4/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libcurl/comm/include  -c -o heapfuncs.o heapfuncs.cpp
g++ -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0 -fsigned-char -DSTREAMPLAN -DPGXC -mcx16 -msse4.2 -O0 -Wall -Wpointer-arith -Wno-write-strings -fnon-call-exceptions -fno-common -freg-struct-return -pipe -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -DENABLE_GSTRACE -fno-aggressive-loop-optimizations -Wno-attributes -fno-omit-frame-pointer -fno-expensive-optimizations -Wno-unused-but-set-variable -fstack-protector -Wl,-z,relro,-z,now -Wl,-z,noexecstack -std=c++14 -pthread  -D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -fpic -I. -I. -I../../src/include -I../../src/lib/gstrace -D_GNU_SOURCE  -I/data/openGauss-third_party_binarylibs/dependency/install_tools_centos7.6_x86_64/unixodbc/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libobs/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libcgroup/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/openssl/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/liborc/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libparquet/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/protobuf/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/grpc/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/boost/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/llvm/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/kerberos/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/postgresql-hll/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/cjson/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libedit/comm/include 
-I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/numactl/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/zlib1.2.11/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/lz4/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libcurl/comm/include  -c -o btreefuncs.o btreefuncs.cpp
g++ -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0 -fsigned-char -DSTREAMPLAN -DPGXC -mcx16 -msse4.2 -O0 -Wall -Wpointer-arith -Wno-write-strings -fnon-call-exceptions -fno-common -freg-struct-return -pipe -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -DENABLE_GSTRACE -fno-aggressive-loop-optimizations -Wno-attributes -fno-omit-frame-pointer -fno-expensive-optimizations -Wno-unused-but-set-variable -fstack-protector -Wl,-z,relro,-z,now -Wl,-z,noexecstack -std=c++14 -pthread  -D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -fpic -I. -I. -I../../src/include -I../../src/lib/gstrace -D_GNU_SOURCE  -I/data/openGauss-third_party_binarylibs/dependency/install_tools_centos7.6_x86_64/unixodbc/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libobs/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libcgroup/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/openssl/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/liborc/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libparquet/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/protobuf/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/grpc/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/boost/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/llvm/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/kerberos/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/postgresql-hll/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/cjson/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libedit/comm/include 
-I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/numactl/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/zlib1.2.11/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/lz4/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libcurl/comm/include  -c -o fsmfuncs.o fsmfuncs.cpp
g++ -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0 -fsigned-char -DSTREAMPLAN -DPGXC -mcx16 -msse4.2 -O0 -Wall -Wpointer-arith -Wno-write-strings -fnon-call-exceptions -fno-common -freg-struct-return -pipe -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -DENABLE_GSTRACE -fno-aggressive-loop-optimizations -Wno-attributes -fno-omit-frame-pointer -fno-expensive-optimizations -Wno-unused-but-set-variable -fstack-protector -Wl,-z,relro,-z,now -Wl,-z,noexecstack -std=c++14 -pthread  -D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -fpic -I. -I. -I../../src/include -I../../src/lib/gstrace -D_GNU_SOURCE  -I/data/openGauss-third_party_binarylibs/dependency/install_tools_centos7.6_x86_64/unixodbc/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libobs/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libcgroup/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/openssl/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/liborc/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libparquet/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/protobuf/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/grpc/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/boost/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/llvm/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/kerberos/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/postgresql-hll/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/cjson/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libedit/comm/include 
-I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/numactl/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/zlib1.2.11/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/lz4/comm/include -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libcurl/comm/include  -c -o ginfuncs.o ginfuncs.cpp
g++ -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0 -fsigned-char -DSTREAMPLAN -DPGXC -mcx16 -msse4.2 -O0 -Wall -Wpointer-arith -Wno-write-strings -fnon-call-exceptions -fno-common -freg-struct-return -pipe -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -DENABLE_GSTRACE -fno-aggressive-loop-optimizations -Wno-attributes -fno-omit-frame-pointer -fno-expensive-optimizations -Wno-unused-but-set-variable -fstack-protector -Wl,-z,relro,-z,now -Wl,-z,noexecstack -std=c++14 -pthread  -D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -fpic -shared -o pageinspect.so rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o ginfuncs.o -L../../src/common/port -pthread -L/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/zlib1.2.11/comm/lib -I/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/zlib1.2.11/comm/include -L/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libedit/comm/lib -L/data/openGauss-third_party_binarylibs/platform/centos7.6_x86_64/Huawei_Secure_C/comm/lib -L/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/openssl/comm/lib -L/data/openGauss-third_party_binarylibs/buildtools/centos7.6_x86_64/libstd/gcc7.3.0/comm/lib -L/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libcgroup/comm/lib -L -L/data/openGauss-third_party_binarylibs/dependency/install_tools_centos7.6_x86_64/unixodbc/lib -L/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libobs/comm/lib -L/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/kerberos/comm/lib -L../../src/gstrace//common -L/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/numactl/comm/lib -L/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/libcurl/comm/lib  -L/data/openGauss-third_party_binarylibs/dependency/centos7.6_x86_64/jemalloc/debug/lib
[gs@sgnode pageinspect]$ make install
/usr/bin/mkdir -p '/data/openGauss-server/dest/lib/postgresql'
/usr/bin/mkdir -p '/data/openGauss-server/dest/share/postgresql/extension'
/usr/bin/mkdir -p '/data/openGauss-server/dest/share/postgresql/extension'
/bin/sh ../../config/install-sh -c -m 755  pageinspect.so '/data/openGauss-server/dest/lib/postgresql/pageinspect.so'
/bin/sh ../../config/install-sh -c -m 644 ./pageinspect.control '/data/openGauss-server/dest/share/postgresql/extension/'
/bin/sh ../../config/install-sh -c -m 644 ./pageinspect--1.0.sql ./pageinspect--unpackaged--1.0.sql  '/data/openGauss-server/dest/share/postgresql/extension/'
[gs@sgnode pageinspect]$

The functions become available after running create extension in the gsql command line.

cc1=# create extension pageinspect;
CREATE EXTENSION
// List the functions provided by pageinspect
cc1=# \x
Expanded display is on.
cc1=# \df
List of functions
-[ RECORD 1 ]-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | bt_metap
Result data type    | record
Argument data types | relname text, OUT magic integer, OUT version integer, OUT root integer, OUT level integer, OUT fastroot integer, OUT fastlevel integer
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f
-[ RECORD 2 ]-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | bt_page_items
Result data type    | SETOF record
Argument data types | relname text, blkno integer, OUT itemoffset smallint, OUT ctid tid, OUT itemlen smallint, OUT nulls boolean, OUT vars boolean, OUT data text
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f
-[ RECORD 3 ]-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | bt_page_stats
Result data type    | record
Argument data types | relname text, blkno integer, OUT blkno integer, OUT type "char", OUT live_items integer, OUT dead_items integer, OUT avg_item_size integer, OUT page_size integer, OUT free_size integer, OUT btpo_prev integer, OUT btpo_next integer, OUT btpo integer, OUT btpo_flags integer
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f
-[ RECORD 4 ]-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | fsm_page_contents
Result data type    | text
Argument data types | page bytea
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f
-[ RECORD 5 ]-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | get_raw_page
Result data type    | bytea
Argument data types | text, integer
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f
-[ RECORD 6 ]-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | get_raw_page
Result data type    | bytea
Argument data types | text, text, integer
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f
-[ RECORD 7 ]-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | gin_leafpage_items
Result data type    | SETOF record
Argument data types | page bytea, OUT first_tid tid, OUT nbytes smallint, OUT tids tid[]
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f
-[ RECORD 8 ]-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | gin_metapage_info
Result data type    | record
Argument data types | page bytea, OUT pending_head bigint, OUT pending_tail bigint, OUT tail_free_size integer, OUT n_pending_pages bigint, OUT n_pending_tuples bigint, OUT n_total_pages bigint, OUT n_entry_pages bigint, OUT n_data_pages bigint, OUT n_entries bigint, OUT version integer
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f
-[ RECORD 9 ]-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | gin_page_opaque_info
Result data type    | record
Argument data types | page bytea, OUT rightlink bigint, OUT maxoff integer, OUT flags text[]
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f
-[ RECORD 10 ]------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | heap_page_items
Result data type    | SETOF record
Argument data types | page bytea, OUT lp smallint, OUT lp_off smallint, OUT lp_flags smallint, OUT lp_len smallint, OUT t_xmin xid, OUT t_xmax xid, OUT t_field3 integer, OUT t_ctid tid, OUT t_infomask2 integer, OUT t_infomask integer, OUT t_hoff smallint, OUT t_bits text, OUT t_oid oid
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f
-[ RECORD 11 ]------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | page_compress_meta
Result data type    | text
Argument data types | relation_name bytea, blkno integer, blknum integer, OUT compress_meta text
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f
-[ RECORD 12 ]------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | page_compress_meta_usage
Result data type    | text
Argument data types | OUT help text
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f
-[ RECORD 13 ]------+------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------
Schema              | public
Name                | page_header
Result data type    | record
Argument data types | page bytea, OUT lsn text, OUT tli smallint, OUT flags smallint, OUT lower smallint, OUT upper smallint, OUT special smallint, OUT pagesize smallint, OUT version smallint, OUT prune_xid xid
Type                | normal
fencedmode          | f
propackage          | f
prokind             | f

pageinspect source code

  • Function get_raw_page(text, int4) RETURNS bytea
    Returns the raw data of a page. It is implemented by get_raw_page in rawpage.cpp.
/*
 * get_raw_page
 *
 * Returns a copy of a page from shared buffers as a bytea
 */
PG_FUNCTION_INFO_V1(get_raw_page);

Datum get_raw_page(PG_FUNCTION_ARGS)
{
    text* relname = PG_GETARG_TEXT_P(0);
    uint32 blkno = PG_GETARG_UINT32(1);
    bytea* raw_page = NULL;

    /*
     * We don't normally bother to check the number of arguments to a C
     * function, but here it's needed for safety because early 8.4 beta
     * releases mistakenly redefined get_raw_page() as taking three arguments.
     */
    if (PG_NARGS() != 2)
        ereport(ERROR,
            (errmsg("wrong number of arguments to get_raw_page()"),
                errhint("Run the updated pageinspect.sql script.")));

    // Fetch the page data from the main fork
    raw_page = get_raw_page_internal(relname, MAIN_FORKNUM, blkno);

    // Return the page data
    PG_RETURN_BYTEA_P(raw_page);
}

/*
 * relname: name of the relation
 * forknum: fork number (main: 0, fsm: 1, vm: 2, init: 4)
 * blkno: block (page) number
 */
static bytea* get_raw_page_internal(text* relname, ForkNumber forknum, BlockNumber blkno)
{
    bytea* raw_page = NULL;
    RangeVar* relrv = NULL;
    Relation rel;
    char* raw_page_data = NULL;
    Buffer buf;

    // Raise an error for non-superusers
    if (!superuser())
        ereport(
            ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be system admin to use raw functions"))));
    // Parse relname into a RangeVar; accepted forms are relname, schemaname.relname, and catalogname.schemaname.relname
    relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname));
    // Open the relation with a shared lock
    rel = relation_openrv(relrv, AccessShareLock);

    /* Check that this relation has storage */
    if (rel->rd_rel->relkind == RELKIND_VIEW)
        ereport(ERROR,
            (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                errmsg("cannot get raw page from view \"%s\"", RelationGetRelationName(rel))));
    if (rel->rd_rel->relkind == RELKIND_CONTQUERY)
        ereport(ERROR,
            (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                errmsg("cannot get raw page from contview for streaming engine \"%s\"", 
                       RelationGetRelationName(rel))));
    if (rel->rd_rel->relkind == RELKIND_COMPOSITE_TYPE)
        ereport(ERROR,
            (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                errmsg("cannot get raw page from composite type \"%s\"", RelationGetRelationName(rel))));
    if (rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
        ereport(ERROR,
            (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                errmsg("cannot get raw page from foreign table \"%s\"", RelationGetRelationName(rel))));

    if (rel->rd_rel->relkind == RELKIND_STREAM)
        ereport(ERROR,
            (errcode(ERRCODE_WRONG_OBJECT_TYPE),
                errmsg("cannot get raw page from stream for streaming engine \"%s\"", 
                       RelationGetRelationName(rel))));

    /*
     * Reject attempts to read non-local temporary relations; we would be
     * likely to get wrong data since we have no visibility into the owning
     * session's local buffers.
     */
    if (RELATION_IS_OTHER_TEMP(rel))
        ereport(ERROR,
            (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot access temporary tables of other sessions")));

    // Raise an error if blkno (the block number) is >= the number of blocks in the relation
    if (blkno >= RelationGetNumberOfBlocks(rel))
        elog(ERROR, "block number %u is out of range for relation \"%s\"", blkno, RelationGetRelationName(rel));

    /* Initialize buffer to copy to */
    // Allocate raw_page: BLCKSZ bytes of data plus the VARHDRSZ header
    raw_page = (bytea*)palloc(BLCKSZ + VARHDRSZ);
    SET_VARSIZE(raw_page, BLCKSZ + VARHDRSZ);
    raw_page_data = VARDATA(raw_page);

    /* Take a verbatim copy of the page */
    // Read block blkno of the requested fork and copy the whole page
    buf = ReadBufferExtended(rel, forknum, blkno, RBM_NORMAL, NULL);
    LockBuffer(buf, BUFFER_LOCK_SHARE);

    memcpy(raw_page_data, BufferGetPage(buf), BLCKSZ);

    LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    ReleaseBuffer(buf);

    relation_close(rel, AccessShareLock);

    return raw_page;
}
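
The relname handling above accepts one, two, or three dotted parts. As a rough illustration only, the sketch below splits such a name in plain C++. The struct QualifiedName and the function parse_qualified_name are hypothetical names made up for this example; they are not part of openGauss, and unlike the real textToQualifiedNameList this sketch does not handle quoted identifiers or case folding.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for RangeVar: only the three name parts.
struct QualifiedName {
    std::string catalog;   // empty when not given
    std::string schema;    // empty when not given
    std::string relation;
};

// Split "a", "a.b" or "a.b.c" on dots.
// Assumption: no quoted identifiers, no case folding (the real parser does more).
QualifiedName parse_qualified_name(const std::string& name)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (true) {
        std::string::size_type dot = name.find('.', start);
        if (dot == std::string::npos) {
            parts.push_back(name.substr(start));
            break;
        }
        parts.push_back(name.substr(start, dot - start));
        start = dot + 1;
    }

    for (const std::string& p : parts) {
        if (p.empty()) {
            throw std::invalid_argument("empty name component");
        }
    }

    QualifiedName q;
    switch (parts.size()) {
        case 1:
            q.relation = parts[0];
            break;
        case 2:
            q.schema = parts[0];
            q.relation = parts[1];
            break;
        case 3:
            q.catalog = parts[0];
            q.schema = parts[1];
            q.relation = parts[2];
            break;
        default:
            throw std::invalid_argument("too many dotted names");
    }
    return q;
}
```

With this sketch, parse_qualified_name("public.m") fills schema with "public" and relation with "m", while a name with four or more parts is rejected.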
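  • Function get_raw_page(text, text, int4) RETURNS bytea
    Returns the raw data of a page. It is implemented by get_raw_page_fork in rawpage.cpp. It works like get_raw_page, but additionally lets the caller choose the fork.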
/*
 * get_raw_page_fork
 *
 * Same, for any fork
 */
PG_FUNCTION_INFO_V1(get_raw_page_fork);

Datum get_raw_page_fork(PG_FUNCTION_ARGS)
{
    text* relname = PG_GETARG_TEXT_P(0);
    text* forkname = PG_GETARG_TEXT_P(1);
    uint32 blkno = PG_GETARG_UINT32(2);
    bytea* raw_page = NULL;
    ForkNumber forknum;

    // Map forkname to its fork number (main: 0, fsm: 1, vm: 2, init: 4)
    forknum = forkname_to_number(text_to_cstring(forkname));

    // Fetch the page data through the same helper that get_raw_page uses
    raw_page = get_raw_page_internal(relname, forknum, blkno);

    PG_RETURN_BYTEA_P(raw_page);
}
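
To make the name-to-number step concrete, here is a minimal standalone sketch of such a lookup. It uses only the four fork names and numbers listed in the comments above. The names ForkNumberSketch and fork_name_to_number_sketch are hypothetical, and returning a sentinel for an unknown name is a simplification chosen for this example, not a description of how forkname_to_number itself behaves.

```cpp
#include <cassert>
#include <cstring>

// Fork numbers as listed in the annotated comments above.
enum ForkNumberSketch {
    kInvalidFork = -1,  // sentinel used only by this sketch
    kMainFork = 0,
    kFsmFork = 1,
    kVmFork = 2,
    kInitFork = 4
};

// Look up a fork name in a small static table.
ForkNumberSketch fork_name_to_number_sketch(const char* name)
{
    struct Entry {
        const char* name;
        ForkNumberSketch number;
    };
    static const Entry table[] = {
        {"main", kMainFork},
        {"fsm", kFsmFork},
        {"vm", kVmFork},
        {"init", kInitFork},
    };

    for (const Entry& entry : table) {
        if (std::strcmp(entry.name, name) == 0) {
            return entry.number;
        }
    }
    return kInvalidFork;
}
```

Under these assumptions, passing 'main' as the fork name resolves to the same fork number 0 (MAIN_FORKNUM) that the two-argument get_raw_page passes to get_raw_page_internal.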

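Run it in gsql and look at the returned page data. The first 40 bytes are the page header, followed by 5 line pointers, one per tuple. The 5 tuples sit at the end of the page, and the space in between is free space that has not been filled.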

cc1=# create table m(id int);
CREATE TABLE
cc1=# insert into m  select generate_series(1,5);
cc1=# \x
Expanded display is on.
cc1=# select * from get_raw_page('m',0);
-[ RECORD 1 ]+--------------------------------------------------------------------------------------
get_raw_page | \x0000000010fdf602000000003c00601f00200620000000005d380000000000000000000000000000e09f3800c09f3800a09f3800809f3800609f3800000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000003000000000000000000000000000000050001000008180005000000000000000300000000000000000000000000000004000100000818000400000000000000030000000000000000000000000000000300010000081800030000000000000003000000000000000000000000000000020001000008180002000000000000000300000000000000000000000000000001000100000818000100000000000000

cc1=#
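  • Function page_header(IN page bytea,OUT lsn text,OUT tli smallint,OUT flags smallint,OUT lower smallint,OUT upper smallint,OUT special smallint,OUT pagesize smallint,OUT version smallint,OUT prune_xid xid)
    Shows the page header fields of a raw page. It maps to the page_header function in rawpage.cpp. Note that in the code below the column named tli is filled from pd_checksum.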
/*
 * page_header
 *
 * Allows inspection of page header fields of a raw page
 */

PG_FUNCTION_INFO_V1(page_header);

Datum page_header(PG_FUNCTION_ARGS)
{
    // raw page data, passed in as bytea
    bytea* raw_page = PG_GETARG_BYTEA_P(0);
    uint32 raw_page_size;

    TupleDesc tupdesc;

    Datum result;
    HeapTuple tuple;
    Datum values[11];
    bool nulls[11];

    PageHeader page;
    XLogRecPtr lsn;
    char lsnchar[64];
    errno_t rc = EOK;

    // only superusers may call this function
    if (!superuser())
        ereport(ERROR,
            (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be system admin to use raw page functions"))));

    raw_page_size = VARSIZE(raw_page) - VARHDRSZ;

    page = (PageHeader)VARDATA(raw_page);

    /*
     * Check that enough data was supplied, so that we don't try to access
     * fields outside the supplied buffer.
     * If the input is smaller than the page header, there is not enough data, so raise an error.
     */
    if (raw_page_size < GetPageHeaderSize(page))
        ereport(ERROR,
            (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("input page too small (%d bytes)", raw_page_size)));

    
    /* Build a tuple descriptor for our result type */
    // the function declares OUT parameters, so the descriptor is derived from their types
    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
        elog(ERROR, "return type must be a row type");

    /* Extract information from the page header */
    lsn = PageGetLSN(page);
    rc = snprintf_s(lsnchar, sizeof(lsnchar), sizeof(lsnchar) - 1, "%X/%X", (uint32)(lsn >> 32), (uint32)lsn);
    securec_check_ss(rc, "\0", "\0");

    rc = memset_s(nulls, sizeof(nulls), 0, sizeof(nulls));
    securec_check_c(rc, "\0", "\0");
    values[0] = CStringGetTextDatum(lsnchar);
    values[1] = UInt16GetDatum(page->pd_checksum);
    values[2] = UInt16GetDatum(page->pd_flags);
    values[3] = UInt16GetDatum(page->pd_lower);
    values[4] = UInt16GetDatum(page->pd_upper);
    values[5] = UInt16GetDatum(page->pd_special);
    values[6] = UInt16GetDatum(PageGetPageSize(page));
    values[7] = UInt16GetDatum(PageGetPageLayoutVersion(page));
    if (PageIs8BXidHeapVersion(page)) {
        values[8] = TransactionIdGetDatum(page->pd_prune_xid + ((HeapPageHeader)page)->pd_xid_base);
        values[9] = TransactionIdGetDatum(((HeapPageHeader)page)->pd_xid_base);
        values[10] = TransactionIdGetDatum(((HeapPageHeader)page)->pd_multi_base);
        nulls[8] = false;
        nulls[9] = false;
        nulls[10] = false;
    } else {
        values[8] = ShortTransactionIdGetDatum(page->pd_prune_xid);
        nulls[9] = true;
        nulls[10] = true;
    }

    /* Build the tuple from the descriptor and the values, then return it. */
    tuple = heap_form_tuple(tupdesc, values, nulls);
    result = HeapTupleGetDatum(tuple);

    PG_RETURN_DATUM(result);
}
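
Two of the conversions in page_header are easy to try outside the server: the LSN is printed as its high and low 32-bit halves in hexadecimal, and the page size and layout version are read from the page header. The following is a minimal standalone sketch, not the openGauss code. The helper names are hypothetical, and the single 16-bit field with the size in the high byte and the version in the low byte is an assumption borrowed from PostgreSQL's pd_pagesize_version layout.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Format a 64-bit LSN as "high/low" in upper-case hex, like the "%X/%X" call above.
std::string format_lsn(uint64_t lsn)
{
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%X/%X",
                  static_cast<unsigned>(lsn >> 32),
                  static_cast<unsigned>(lsn & 0xFFFFFFFFu));
    return std::string(buf);
}

// Assumed layout: page size in the high byte, layout version in the low byte.
uint16_t page_size_of(uint16_t pagesize_version)
{
    return static_cast<uint16_t>(pagesize_version & 0xFF00);
}

uint16_t layout_version_of(uint16_t pagesize_version)
{
    return static_cast<uint16_t>(pagesize_version & 0x00FF);
}

// Checks against the values that appear in the gsql session in this article.
void check_examples()
{
    assert(format_lsn(0x2F6FD10ULL) == "0/2F6FD10");
    assert(page_size_of(0x2006) == 8192);
    assert(layout_version_of(0x2006) == 6);
}
```

Calling check_examples() from any main runs the assertions. Under the stated assumption, a combined field of 0x2006 corresponds to the pagesize 8192 and version 6 shown in the gsql output.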

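Viewing the page header with gsql shows the LSN and the layout fields: lower = 60 is the end of the line pointer array and upper = 8032 is the start of the tuple data, both consistent with the raw page dump above, and the page size is 8192.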

cc1=# select * from page_header(get_raw_page('m',0));
    lsn    | tli | flags | lower | upper | special | pagesize | version | prune_xid
-----------+-----+-------+-------+-------+---------+----------+---------+-----------
 0/2F6FD10 |   0 |     0 |    60 |  8032 |    8192 |     8192 |       6 |     14429
(1 row)
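
The numbers in this output can be cross-checked with a little arithmetic. The sketch below is standalone C++ with hypothetical helper names. The 4-byte line pointer and the 8-byte alignment follow PostgreSQL conventions, and the 40-byte header size is only inferred from lower = 60 on a page for which heap_page_items reports 5 rows of lp_len 28, so treat all three as assumptions rather than values read from the openGauss headers.

```cpp
#include <cassert>
#include <cstdint>

const uint32_t kLinePointerSize = 4; // assumed size of one line pointer
const uint32_t kMaxAlign = 8;        // assumed tuple alignment boundary

// Round a length up to the next multiple of kMaxAlign.
uint32_t max_align(uint32_t len)
{
    return (len + kMaxAlign - 1) & ~(kMaxAlign - 1);
}

// Number of line pointers stored between the page header and pd_lower.
uint32_t line_pointer_count(uint32_t lower, uint32_t header_size)
{
    return (lower - header_size) / kLinePointerSize;
}

// Size of the gap between the line pointer array and the tuple data.
uint32_t free_space(uint32_t lower, uint32_t upper)
{
    return upper - lower;
}

// Offset of the n-th tuple (1-based) when equally sized tuples are packed
// downward from the start of the special space.
uint32_t tuple_offset(uint32_t special, uint32_t tuple_len, uint32_t n)
{
    return special - n * max_align(tuple_len);
}

void check_examples()
{
    assert(line_pointer_count(60, 40) == 5);   // lower = 60, assumed 40-byte header
    assert(free_space(60, 8032) == 7972);      // upper - lower
    assert(max_align(28) == 32);               // a 28-byte tuple occupies 32 bytes
    assert(tuple_offset(8192, 28, 5) == 8032); // fifth tuple starts at upper
}
```

With these assumptions the fifth tuple lands exactly at upper = 8032, which is why lower and upper move by 4 and 32 bytes respectively for every additional row of this table.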
  • Function page_compress_meta(IN relation_name bytea,IN blkno int4,IN blknum int4,OUT compress_meta text)
    Returns the compression metadata of a relation's pages. It maps to the page_compress_meta function in rawpage.cpp.
// arg1: relation name
// arg2: start blockno
// arg3: number of parsing block, default is 1
//
PG_FUNCTION_INFO_V1(page_compress_meta);

Datum page_compress_meta(PG_FUNCTION_ARGS)
{
    text* relname = PG_GETARG_TEXT_P(0);
    uint32 blkno = PG_GETARG_UINT32(1);
    uint32 blknum = PG_GETARG_UINT32(2);
    uint32 real_blknum;
    RangeVar* relrv = NULL;
    Relation rel;
    StringInfo output = makeStringInfo();

    super_user();
    relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname));
    rel = relation_openrv(relrv, AccessShareLock);
    check(rel);

    // get block number, then check start blkno and input blknum
    //
    real_blknum = RelationGetNumberOfBlocks(rel);
    if (blkno >= real_blknum) {
        // the start block number is past the end of the relation
        appendStringInfo(output, "start blkno %u >= real block number %u \n", blkno, real_blknum);
    } else {
        if (blkno + blknum > real_blknum) {  // clamp blknum so the range stops at the last block
            blknum = real_blknum - blkno;
        }

        appendStringInfo(output,
            "relfinenode (space=%u, db=%u, rel=%u) \n",
            rel->rd_node.spcNode, // tablespace OID
            rel->rd_node.dbNode,  // database OID
            rel->rd_node.relNode); // relfilenode of the relation

        // parse and output the compression metadata
        // read each page and parse its compression metadata
        for (uint32 i = 0; i < blknum; ++i) {
            char* raw_page = read_raw_page(rel, MAIN_FORKNUM, blkno + i);
            appendStringInfo(output, "Block #%d \n", blkno + i);
            parse_compress_meta(output, raw_page, rel);
            appendStringInfo(output, "\n");
            pfree(raw_page);
        }
    }

    relation_close(rel, AccessShareLock);
    
    // build the return value
    bytea* dumpVal = (bytea*)palloc(VARHDRSZ + output->len);
    SET_VARSIZE(dumpVal, VARHDRSZ + output->len);
    memcpy(VARDATA(dumpVal), output->data, output->len);
    pfree(output->data);
    pfree(output);

    PG_RETURN_TEXT_P(dumpVal);
}

static void parse_compress_meta(StringInfo outputBuf, char* page_content, Relation rel)
{
    PageHeader page_header = (PageHeader)page_content;

    // the page is not compressed
    if (!PageIsCompressed(page_header)) {
        appendStringInfo(outputBuf, "\t This page is not compressed \n");
        return;
    }

    char* start = page_content + page_header->pd_special;
    // char* current = start;
    int size = PageGetSpecialSize(page_header); // size of the special space

    TupleDesc desc = RelationGetDescr(rel); // tuple descriptor of the relation
    Form_pg_attribute* att = desc->attrs; // array of attribute definitions
    int attrno;
    int attrnum = desc->natts; // number of attributes

    int cmprsOff = 0;
    void* metaInfo = NULL;
    char mode = 0;

    // walk the attributes and parse the compression metadata of each one
    for (attrno = 0; attrno < attrnum && cmprsOff < size; ++attrno) {
        Form_pg_attribute thisatt = att[attrno];
        int metaSize = 0;

        metaInfo = PageCompress::FetchAttrCmprMeta(start + cmprsOff, thisatt->attlen, &metaSize, &mode);
        switch (mode) {
            case CMPR_DELTA: {
                DeltaCmprMeta* deltaInfo = (DeltaCmprMeta*)metaInfo;
                appendStringInfo(outputBuf, "\t Col #%d: Delta, attr-len %d, min-val ", attrno, deltaInfo->bytes);

                int min_val_start = cmprsOff + sizeof(mode) + sizeof(unsigned char);
                formatBytes(outputBuf, (start + min_val_start), thisatt->attlen);
                appendStringInfo(outputBuf, "\n");
                break;
            }

            case CMPR_DICT: {
                DictCmprMeta* dictMeta = (DictCmprMeta*)metaInfo;
                appendStringInfo(outputBuf, "\t Col #%d: dictionary, items %d \n", attrno, dictMeta->dictItemNum);

                for (int i = 0; i < dictMeta->dictItemNum; ++i) {
                    DictItemData* item = dictMeta->dictItems + i;
                    appendStringInfo(outputBuf, "\t\t Item #%d, len %d, data ", i, item->itemSize);
                    formatBytes(outputBuf, item->itemData, item->itemSize);
                    appendStringInfo(outputBuf, "\n");
                }

                break;
            }

            case CMPR_PREFIX: {
                PrefixCmprMeta* prefixMeta = (PrefixCmprMeta*)metaInfo;
                appendStringInfo(outputBuf, "\t Col #%d: prefix, len %d, data ", attrno, prefixMeta->len);
                formatBytes(outputBuf, prefixMeta->prefixStr, prefixMeta->len);
                appendStringInfo(outputBuf, "\n");

                break;
            }

            case CMPR_NUMSTR: {
                appendStringInfo(outputBuf, "\t Col #%d: number string compression \n", attrno);
                break;
            }

            case CMPR_NONE: {
                appendStringInfo(outputBuf, "\t Col #%d: none compression \n", attrno);
                break;
            }
        }
        cmprsOff += metaSize;
    }
}
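
The range check in page_compress_meta amounts to clamping [blkno, blkno + blknum) to the relation's block count. A minimal standalone sketch of that logic follows. BlockRange and clamp_block_range are hypothetical names, and comparing against total - start instead of computing start + count is a choice made here to avoid 32-bit wraparound, not something the source does.

```cpp
#include <cassert>
#include <cstdint>

struct BlockRange {
    bool valid;     // false when the start block is past the end of the relation
    uint32_t start; // first block to parse
    uint32_t count; // number of blocks to parse after clamping
};

// Clamp the requested range to a relation that has `total` blocks.
BlockRange clamp_block_range(uint32_t start, uint32_t count, uint32_t total)
{
    BlockRange r = {false, start, 0};
    if (start >= total)
        return r; // nothing to parse, the caller reports an error message instead
    r.valid = true;
    r.count = (count > total - start) ? total - start : count;
    return r;
}

void check_examples()
{
    // A 4-block relation: asking for 5 blocks from block 2 yields blocks 2 and 3.
    BlockRange r = clamp_block_range(2, 5, 4);
    assert(r.valid && r.start == 2 && r.count == 2);

    // Starting at block 4 of a 4-block relation is out of range.
    assert(!clamp_block_range(4, 1, 4).valid);
}
```

Calling check_examples() from any main runs the assertions.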
  • Function page_compress_meta_usage(OUT help text)
    Prints the usage help for page_compress_meta.
cc1=# select page_compress_meta_usage();
             page_compress_meta_usage
---------------------------------------------------
 usage: page_compress_meta name blkno blknum      +
                                                  +
         name, relation/table name, only for heap +
         blkno, the start blockno                 +
         blknum, how many blocks to parse         +

(1 row)

cc1=#
  • Function heap_page_items(IN page bytea,OUT lp smallint,OUT lp_off smallint,OUT lp_flags smallint,OUT lp_len smallint,OUT t_xmin xid,OUT t_xmax xid,OUT t_field3 int4,OUT t_ctid tid,OUT t_infomask2 integer,OUT t_infomask integer,OUT t_hoff smallint,OUT t_bits text,OUT t_oid oid)
    Shows all line pointers on a heap page, together with the tuple header fields of the line pointers that are in use. It maps to the heap_page_items function in heapfuncs.cpp.
/*
 * heap_page_items
 *
 * Allows inspection of line pointers and tuple headers of a heap page.
 */
PG_FUNCTION_INFO_V1(heap_page_items);

typedef struct heap_page_items_state {
    TupleDesc tupd;
    Page page;
    uint16 offset;
} heap_page_items_state;

Datum heap_page_items(PG_FUNCTION_ARGS)
{
    bytea* raw_page = PG_GETARG_BYTEA_P(0);
    heap_page_items_state* inter_call_data = NULL;
    FuncCallContext* fctx = NULL;
    uint32 raw_page_size;

    // only superusers may call this function
    if (!superuser())
        ereport(ERROR,
            (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be system admin to use raw page functions"))));

    // size of the supplied page data
    raw_page_size = VARSIZE(raw_page) - VARHDRSZ;

    // first call
    if (SRF_IS_FIRSTCALL()) {
        TupleDesc tupdesc;
        MemoryContext mctx;

        if (raw_page_size < SizeOfPageHeaderData)
            ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("input page too small (%d bytes)", raw_page_size)));

        // Create an empty FuncCallContext and do the basic multi-call setup and error checks, including the long-lived multi_call_memory_ctx that holds data across calls
        fctx = SRF_FIRSTCALL_INIT();
        mctx = MemoryContextSwitchTo(fctx->multi_call_memory_ctx);

        inter_call_data = (heap_page_items_state*)palloc(sizeof(heap_page_items_state));

        /* Build a tuple descriptor for our result type */
        if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
            elog(ERROR, "return type must be a row type");

        inter_call_data->tupd = tupdesc;

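        // start at the first line pointer (offset number 1)
        inter_call_data->offset = FirstOffsetNumber;
        // page data
        inter_call_data->page = VARDATA(raw_page);

        // highest line pointer number on the page, which is also the number of rows to return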
        fctx->max_calls = PageGetMaxOffsetNumber(inter_call_data->page);
        fctx->user_fctx = inter_call_data;

        MemoryContextSwitchTo(mctx);
    }

    // per-call setup: fetch the FuncCallContext and clear any leftover TupleTableSlot
    fctx = SRF_PERCALL_SETUP();
    inter_call_data = (heap_page_items_state*)(fctx->user_fctx);

    if (fctx->call_cntr < fctx->max_calls) {
        Page page = inter_call_data->page;
        HeapTuple resultTuple;
        Datum result;
        ItemId id;
        Datum values[13];
        bool nulls[13];
        uint16 lp_offset;
        uint16 lp_flags;
        uint16 lp_len;

        memset(nulls, 0, sizeof(nulls));

        /* Extract information from the line pointer */
        id = PageGetItemId(page, inter_call_data->offset);

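        lp_offset = ItemIdGetOffset(id); // byte offset of the tuple within the page
        lp_flags = ItemIdGetFlags(id); // line pointer state: 0 unused, 1 normal, 2 redirect, 3 dead
        lp_len = ItemIdGetLength(id); // tuple length in bytes

        values[0] = UInt16GetDatum(inter_call_data->offset); // line pointer number (1-based)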
        values[1] = UInt16GetDatum(lp_offset);
        values[2] = UInt16GetDatum(lp_flags);
        values[3] = UInt16GetDatum(lp_len);

        /*
         * We do just enough validity checking to make sure we don't reference
         * data outside the page passed to us. The page could be corrupt in
         * many other ways, but at least we won't crash.
         */
        if (ItemIdHasStorage(id) && lp_len >= MinHeapTupleSize && lp_offset == MAXALIGN(lp_offset) &&
            lp_offset + lp_len <= raw_page_size) {
            // the line pointer refers to a tuple stored inside the supplied page
            HeapTupleData tup;
            HeapTupleHeader tuphdr;
            int bits_len;

            /* Extract information from the tuple header */
            tuphdr = (HeapTupleHeader)PageGetItem(page, id);
            tup.t_data = tuphdr;
            HeapTupleCopyBaseFromPage(&tup, page);

            values[4] = UInt32GetDatum(HeapTupleGetRawXmin(&tup));
            values[5] = UInt32GetDatum(HeapTupleGetRawXmax(&tup));
            values[6] = UInt32GetDatum(HeapTupleHeaderGetRawCommandId(tuphdr)); /* shared with xvac */
            values[7] = PointerGetDatum(&tuphdr->t_ctid); // ctid
            values[8] = UInt32GetDatum(tuphdr->t_infomask2);
            values[9] = UInt32GetDatum(tuphdr->t_infomask);
            values[10] = UInt8GetDatum(tuphdr->t_hoff);

            /*
             * We already checked that the item as is completely within the
             * raw page passed to us, with the length given in the line
             * pointer.. Let's check that t_hoff doesn't point over lp_len,
             * before using it to access t_bits and oid.
             */
            // t_hoff must cover at least the fixed tuple header and must not exceed lp_len
            if (tuphdr->t_hoff >= sizeof(HeapTupleHeader) && tuphdr->t_hoff <= lp_len) {
                if (tuphdr->t_infomask & HEAP_HASNULL) {
                    bits_len = tuphdr->t_hoff - (((char*)tuphdr->t_bits) - ((char*)tuphdr));

                    values[11] = CStringGetTextDatum(bits_to_text(tuphdr->t_bits, bits_len * 8));
                } else
                    nulls[11] = true;

                if (tuphdr->t_infomask & HEAP_HASOID)
                    values[12] = HeapTupleHeaderGetOid(tuphdr);
                else
                    nulls[12] = true;
            } else {
                nulls[11] = true;
                nulls[12] = true;
            }
        } else {
            /*
             * The line pointer is not used, or it's invalid. Set the rest of
             * the fields to NULL.
             */
            int i;

            for (i = 4; i <= 12; i++)
                nulls[i] = true;
        }

        /* Build and return the result tuple. */
        resultTuple = heap_form_tuple(inter_call_data->tupd, values, nulls);
        result = HeapTupleGetDatum(resultTuple);

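        inter_call_data->offset++; // advance to the next line pointer

        // return the current row; the function is called again for the next one
        SRF_RETURN_NEXT(fctx, result);
    } else
        // all rows returned: end the multi-call sequence and release its state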
        SRF_RETURN_DONE(fctx);
}
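
Before heap_page_items touches a tuple it validates the line pointer against the supplied page. The sketch below reproduces that check in standalone C++. It assumes the PostgreSQL-style ItemIdData layout (lp_off in the low 15 bits, lp_flags in the next 2, lp_len in the top 15, as laid out on little-endian GCC builds), 8-byte MAXALIGN, and a minimum heap tuple size of 24 bytes. The type and function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

struct LinePointer {
    uint32_t off;   // lp_off: byte offset of the tuple in the page
    uint32_t flags; // lp_flags: line pointer state
    uint32_t len;   // lp_len: tuple length in bytes
};

const uint32_t kMaxAlign = 8;          // assumed MAXALIGN boundary
const uint32_t kMinHeapTupleSize = 24; // assumed: fixed tuple header rounded up to kMaxAlign

// Split a raw 32-bit line pointer into its three fields (assumed bit layout).
LinePointer decode_line_pointer(uint32_t raw)
{
    LinePointer lp;
    lp.off = raw & 0x7FFFu;
    lp.flags = (raw >> 15) & 0x3u;
    lp.len = raw >> 17;
    return lp;
}

// Mirror of the four conditions checked before the tuple header is read.
bool tuple_is_readable(const LinePointer& lp, uint32_t raw_page_size)
{
    return lp.len != 0                     // ItemIdHasStorage
        && lp.len >= kMinHeapTupleSize     // long enough to hold a tuple header
        && lp.off % kMaxAlign == 0         // lp_off == MAXALIGN(lp_off)
        && lp.off + lp.len <= raw_page_size; // tuple ends inside the supplied buffer
}

void check_examples()
{
    // 0x00389FE0 encodes lp_off 8160, lp_flags 1, lp_len 28 under the assumed layout.
    LinePointer lp = decode_line_pointer(0x00389FE0u);
    assert(lp.off == 8160 && lp.flags == 1 && lp.len == 28);
    assert(tuple_is_readable(lp, 8192));

    // An all-zero line pointer has no storage and is skipped.
    assert(!tuple_is_readable(decode_line_pointer(0), 8192));
}
```

The decoded values match row lp 1 of the heap_page_items output in this article. A line pointer that fails the check still produces a row, with only the first four columns filled in.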

Running it in gsql gives the following output:

cc1=# SELECT * FROM heap_page_items(get_raw_page('m',0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------
  1 |   8160 |        1 |     28 |  14432 |      0 |        0 | (0,1)  |           1 |       2304 |     24 |        |
  2 |   8128 |        1 |     28 |  14432 |      0 |        0 | (0,2)  |           1 |       2304 |     24 |        |
  3 |   8096 |        1 |     28 |  14432 |      0 |        0 | (0,3)  |           1 |       2304 |     24 |        |
  4 |   8064 |        1 |     28 |  14432 |      0 |        0 | (0,4)  |           1 |       2304 |     24 |        |
  5 |   8032 |        1 |     28 |  14432 |      0 |        0 | (0,5)  |           1 |       2304 |     24 |        |
(5 rows)

cc1=# insert into m select generate_series(6,10);
INSERT 0 5
cc1=# SELECT * FROM heap_page_items(get_raw_page('m',0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------
  1 |   8160 |        1 |     28 |  14432 |      0 |        0 | (0,1)  |           1 |       2304 |     24 |        |
  2 |   8128 |        1 |     28 |  14432 |      0 |        0 | (0,2)  |           1 |       2304 |     24 |        |
  3 |   8096 |        1 |     28 |  14432 |      0 |        0 | (0,3)  |           1 |       2304 |     24 |        |
  4 |   8064 |        1 |     28 |  14432 |      0 |        0 | (0,4)  |           1 |       2304 |     24 |        |
  5 |   8032 |        1 |     28 |  14432 |      0 |        0 | (0,5)  |           1 |       2304 |     24 |        |
  6 |   8000 |        1 |     28 |  14444 |      0 |        0 | (0,6)  |           1 |       2048 |     24 |        |
  7 |   7968 |        1 |     28 |  14444 |      0 |        0 | (0,7)  |           1 |       2048 |     24 |        |
  8 |   7936 |        1 |     28 |  14444 |      0 |        0 | (0,8)  |           1 |       2048 |     24 |        |
  9 |   7904 |        1 |     28 |  14444 |      0 |        0 | (0,9)  |           1 |       2048 |     24 |        |
 10 |   7872 |        1 |     28 |  14444 |      0 |        0 | (0,10) |           1 |       2048 |     24 |        |
(10 rows)

cc1=#
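
The t_infomask column is a bit mask, so the difference between 2304 and 2048 in the two groups of rows is easier to read once decoded. The sketch below is standalone C++ and assumes the bit values of PostgreSQL's heap tuple header flags. openGauss may define further bits that are not listed here, and describe_infomask is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

struct InfomaskBit {
    uint16_t bit;
    const char* name;
};

// Assumed flag values, taken from PostgreSQL's heap tuple header definitions.
static const InfomaskBit kInfomaskBits[] = {
    {0x0001, "HASNULL"},
    {0x0002, "HASVARWIDTH"},
    {0x0004, "HASEXTERNAL"},
    {0x0008, "HASOID"},
    {0x0100, "XMIN_COMMITTED"},
    {0x0200, "XMIN_INVALID"},
    {0x0400, "XMAX_COMMITTED"},
    {0x0800, "XMAX_INVALID"},
};

// Return the names of all known bits set in the mask, joined with '|'.
std::string describe_infomask(uint16_t mask)
{
    std::string out;
    for (const InfomaskBit& b : kInfomaskBits) {
        if (mask & b.bit) {
            if (!out.empty())
                out += '|';
            out += b.name;
        }
    }
    return out;
}

void check_examples()
{
    assert(describe_infomask(2304) == "XMIN_COMMITTED|XMAX_INVALID"); // rows 1 to 5
    assert(describe_infomask(2048) == "XMAX_INVALID");                // rows 6 to 10
}
```

Under these assumptions 2304 (0x0900) decodes to XMIN_COMMITTED|XMAX_INVALID and 2048 (0x0800) to XMAX_INVALID alone. That would fit hint bits being set lazily: the freshly inserted rows 6 to 10 had not yet been marked as committed when the page was read.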
  • Function bt_metap(IN relname text,OUT magic int4,OUT version int4,OUT root int4,OUT level int4,OUT fastroot int4,OUT fastlevel int4)
    Returns information about the metapage of a B-tree index. It maps to the bt_metap function in btreefuncs.cpp.
/* ------------------------------------------------
 * bt_metap()
 *
 * Get a btree's meta-page information
 *
 * Usage: SELECT * FROM bt_metap('t1_pkey')
 * ------------------------------------------------
 */
Datum bt_metap(PG_FUNCTION_ARGS)
{
    text* relname = PG_GETARG_TEXT_P(0);
    Datum result;
    Relation rel;
    RangeVar* relrv = NULL;
    BTMetaPageData* metad = NULL;
    TupleDesc tupleDesc;
    int j;
    char* values[6];
    Buffer buffer;
    Page page;
    HeapTuple tuple;

    if (!superuser())
        ereport(ERROR,
            (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be system admin to use pageinspect functions"))));

    relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname));
    rel = relation_openrv(relrv, AccessShareLock);

    // reject anything that is not a btree index
    if (!IS_INDEX(rel) || !IS_BTREE(rel))
        elog(ERROR, "relation \"%s\" is not a btree index", RelationGetRelationName(rel));

    /*
     * Reject attempts to read non-local temporary relations; we would be
     * likely to get wrong data since we have no visibility into the owning
     * session's local buffers.
     */
    if (RELATION_IS_OTHER_TEMP(rel))
        ereport(ERROR,
            (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot access temporary tables of other sessions")));

    // read the index metapage (block 0)
    buffer = ReadBuffer(rel, 0);
    LockBuffer(buffer, BUFFER_LOCK_SHARE);

    // view the metapage contents as a BTMetaPageData struct
    page = BufferGetPage(buffer);
    metad = BTPageGetMeta(page);

    /* Build a tuple descriptor for our result type */
    if (get_call_result_type(fcinfo, NULL, &tupleDesc) != TYPEFUNC_COMPOSITE)
        elog(ERROR, "return type must be a row type");

    // render each field as a C string
    j = 0;
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", metad->btm_magic);
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", metad->btm_version);
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", metad->btm_root); // block number of the current root page
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", metad->btm_level); // 索引树的高度
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", metad->btm_fastroot);
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", metad->btm_fastlevel);

    // Build the result tuple
    tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupleDesc), values);

    result = HeapTupleGetDatum(tuple);

    UnlockReleaseBuffer(buffer);
    relation_close(rel, AccessShareLock);

    PG_RETURN_DATUM(result);
}

Run it in gsql:

cc1=# create table l(id int primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "l_pkey" for table "l"
CREATE TABLE
cc1=# insert into l select generate_series(1,10000);
INSERT 0 10000
cc1=# analyze l;
ANALYZE
cc1=# select * from bt_metap('l_pkey');
 magic  | version | root | level | fastroot | fastlevel
--------+---------+------+-------+----------+-----------
 340322 |       2 |    3 |     1 |        3 |         1
(1 row)

cc1=#
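
The magic value 340322 in the output above is 0x053162 in hex, which is the B-tree magic number (BTREE_MAGIC) in the PostgreSQL code base that openGauss derives from. Below is a minimal sketch of the checks one might run on a bt_metap row. MetaRow and looks_sane are hypothetical names introduced for illustration, and the constant is assumed to be unchanged in openGauss.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of one bt_metap() result row (not an openGauss type).
struct MetaRow {
    std::uint32_t magic;
    std::uint32_t version;
    std::uint32_t root;
    std::uint32_t level;
    std::uint32_t fastroot;
    std::uint32_t fastlevel;
};

// Assumed value of BTREE_MAGIC, taken from PostgreSQL's nbtree.h.
constexpr std::uint32_t kBtreeMagic = 0x053162;

// True when the row looks like a plausible btree meta page: the magic
// matches and the fast root is not above the true root.
bool looks_sane(const MetaRow& m)
{
    return m.magic == kBtreeMagic && m.fastlevel <= m.level;
}
```

For the row shown above, looks_sane(MetaRow{340322, 2, 3, 1, 3, 1}) holds; root and fastroot are both block 3 because the tree has only one level above the leaves.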
  • Function bt_page_stats(IN relname text, IN blkno int4, OUT blkno int4, OUT type "char", OUT live_items int4, OUT dead_items int4, OUT avg_item_size int4, OUT page_size int4, OUT free_size int4, OUT btpo_prev int4, OUT btpo_next int4, OUT btpo int4, OUT btpo_flags int4)
    Returns summary information about a single B-tree index page; implemented by bt_page_stats in btreefuncs.cpp.
/* -------------------------------------------------
 * GetBTPageStatistics()
 *
 * Collect statistics of single b-tree page
 * -------------------------------------------------
 */
static void GetBTPageStatistics(BlockNumber blkno, Buffer buffer, BTPageStat* stat)
{
    Page page = BufferGetPage(buffer);
    PageHeader phdr = (PageHeader)page;
    // The special area holds the sibling links, level, and flags
    BTPageOpaqueInternal opaque = (BTPageOpaqueInternal)PageGetSpecialPointer(page);
    int item_size = 0;
    int off;

    stat->blkno = blkno; // block number

    stat->max_avail = BLCKSZ - (BLCKSZ - phdr->pd_special + SizeOfPageHeaderData);

    stat->dead_items = stat->live_items = 0; // reset the dead and live item counters

    stat->page_size = PageGetPageSize(page); // page size

    /* page type (flags) */
    if (P_ISDELETED(opaque)) { // page has been deleted from the tree
        stat->type = 'd';

        if (PageIs4BXidVersion(page))
            stat->btpo.xact = opaque->btpo.xact_old;
        else
            stat->btpo.xact = ((BTPageOpaque)opaque)->xact;
        return;
    } else if (P_IGNORE(opaque))  // page to be ignored (deleted or half-dead)
        stat->type = 'e';
    else if (P_ISLEAF(opaque)) // leaf page
        stat->type = 'l';
    else if (P_ISROOT(opaque)) // root page (has no parent)
        stat->type = 'r';
    else
        stat->type = 'i'; // internal page

    /* btpage opaque data */
    stat->btpo_prev = opaque->btpo_prev; // left sibling
    stat->btpo_next = opaque->btpo_next; // right sibling
    stat->btpo.level = opaque->btpo.level; // tree level (0 = leaf)
    stat->btpo_flags = opaque->btpo_flags; // page flags
    stat->btpo_cycleid = opaque->btpo_cycleid;

    /* count live and dead tuples, and free space */
    // Walk the line pointers
    for (off = FirstOffsetNumber; off <= PageGetMaxOffsetNumber(page); off++) {
        IndexTuple itup;

        ItemId id = PageGetItemId(page, off);

        itup = (IndexTuple)PageGetItem(page, id);

        item_size += IndexTupleSize(itup);

        if (!ItemIdIsDead(id))
            stat->live_items++;
        else
            stat->dead_items++;
    }
    stat->free_size = PageGetFreeSpace(page);

    if ((stat->live_items + stat->dead_items) > 0)
        stat->avg_item_size = item_size / (stat->live_items + stat->dead_items);
    else
        stat->avg_item_size = 0;
}
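
The counting loop and the integer-division average in GetBTPageStatistics can be reproduced outside the server. The following is a minimal sketch over mock data; Item, PageStat, and summarize are hypothetical names, and nothing here reads a real page.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one line pointer plus the size of its tuple.
struct Item {
    std::size_t size;  // what IndexTupleSize() would report
    bool dead;         // what ItemIdIsDead() would report
};

struct PageStat {
    int live_items = 0;
    int dead_items = 0;
    int avg_item_size = 0;
};

// Same counting and integer-division average as the loop in
// GetBTPageStatistics(): dead items still count toward the average.
PageStat summarize(const std::vector<Item>& items)
{
    PageStat stat;
    std::size_t total = 0;
    for (const Item& it : items) {
        total += it.size;
        if (it.dead)
            stat.dead_items++;
        else
            stat.live_items++;
    }
    const int n = stat.live_items + stat.dead_items;
    stat.avg_item_size = (n > 0) ? static_cast<int>(total / n) : 0;
    return stat;
}
```

Because the division truncates, three items of 16, 16, and 15 bytes report an average of 15, not 16.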

/* -----------------------------------------------
 * bt_page_stats()
 *
 * Usage: SELECT * FROM bt_page_stats('t1_pkey', 1);
 * -----------------------------------------------
 */
Datum bt_page_stats(PG_FUNCTION_ARGS)
{
    text* relname = PG_GETARG_TEXT_P(0);
    uint32 blkno = PG_GETARG_UINT32(1);
    Buffer buffer;
    Relation rel;
    RangeVar* relrv = NULL;
    Datum result;
    HeapTuple tuple;
    TupleDesc tupleDesc;
    int j;
    char* values[11];
    BTPageStat stat = {0};

    if (!superuser())
        ereport(ERROR,
            (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be system admin to use pageinspect functions"))));

    relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname));
    rel = relation_openrv(relrv, AccessShareLock);

    // Raise an error if the relation is not a btree index
    if (!IS_INDEX(rel) || !IS_BTREE(rel))
        elog(ERROR, "relation \"%s\" is not a btree index", RelationGetRelationName(rel));

    /*
     * Reject attempts to read non-local temporary relations; we would be
     * likely to get wrong data since we have no visibility into the owning
     * session's local buffers.
     * In short: temporary tables owned by other sessions are rejected.
     */
    if (RELATION_IS_OTHER_TEMP(rel))
        ereport(ERROR,
            (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot access temporary tables of other sessions")));

    // Block 0 is the meta page, which this function does not handle
    if (blkno == 0)
        elog(ERROR, "block 0 is a meta page");

    // Raise an error if the block number is out of range
    CHECK_RELATION_BLOCK_RANGE(rel, blkno);

    // Read the requested page under a shared buffer lock
    buffer = ReadBuffer(rel, blkno);
    LockBuffer(buffer, BUFFER_LOCK_SHARE);

    /* keep compiler quiet */
    // Initialize the output fields
    stat.btpo_prev = stat.btpo_next = InvalidBlockNumber;
    stat.btpo_flags = stat.free_size = stat.avg_item_size = 0;

    // Collect the page statistics
    GetBTPageStatistics(blkno, buffer, &stat);

    UnlockReleaseBuffer(buffer);
    relation_close(rel, AccessShareLock);

    /* Build a tuple descriptor for our result type */
    if (get_call_result_type(fcinfo, NULL, &tupleDesc) != TYPEFUNC_COMPOSITE)
        elog(ERROR, "return type must be a row type");

    j = 0;
    // Format each field as a C string
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", stat.blkno);  // block number
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%c", stat.type); // page type
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", stat.live_items); // number of live items
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", stat.dead_items); // number of dead items
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", stat.avg_item_size); // average item size
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", stat.page_size); // page size
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", stat.free_size); // free space
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", stat.btpo_prev); // left sibling
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", stat.btpo_next); // right sibling
    values[j] = (char*)palloc(32);
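    if (stat.type == 'd')
        // deleted page: btpo holds a transaction ID; note that the bound of 64
        // is larger than the 32 bytes allocated above
        snprintf(values[j++], 64, XID_FMT, stat.btpo.xact);
    else
        snprintf(values[j++], 32, "%d", stat.btpo.level); // otherwise btpo holds the tree level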
    values[j] = (char*)palloc(32);
    snprintf(values[j++], 32, "%d", stat.btpo_flags);

    // Build the result tuple
    tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupleDesc), values);

    result = HeapTupleGetDatum(tuple);

    PG_RETURN_DATUM(result);
}

Run it in gsql:

cc1=# select * from bt_page_stats('l_pkey',1);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
     1 | l    |        367 |          0 |            16 |      8192 |       800 |         0 |         2 |    0 |          1
(1 row)

cc1=#  select * from bt_page_stats('l_pkey',2);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
     2 | l    |        367 |          0 |            16 |      8192 |       800 |         1 |         4 |    0 |          1
(1 row)

cc1=#  select * from bt_page_stats('l_pkey',3);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
     3 | r    |         28 |          0 |            15 |      8192 |      7588 |         0 |         0 |    1 |          2
(1 row)

cc1=#  select * from bt_page_stats('l_pkey',4);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
     4 | l    |        367 |          0 |            16 |      8192 |       800 |         2 |         5 |    0 |          1
(1 row)

cc1=#  select * from bt_page_stats('l_pkey',5);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags
-------+------+------------+------------+---------------+-----------+-----------+-----------+-----------+------+------------
     5 | l    |        367 |          0 |            16 |      8192 |       800 |         4 |         6 |    0 |          1
(1 row)

cc1=#
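
The btpo_prev and btpo_next columns above link the pages of one level into a doubly linked list; the leaf rows shown form the chain 1, 2, 4, 5, and then 6. Below is a sketch of walking that chain over rows that have already been fetched; in a real session each step would be another bt_page_stats call. Row and walk_right are hypothetical names, and 0 is treated as "no sibling" (P_NONE in PostgreSQL's nbtree.h, assumed unchanged in openGauss).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical subset of a bt_page_stats() row.
struct Row {
    std::uint32_t btpo_prev;
    std::uint32_t btpo_next;
};

// Follow btpo_next from `start` until the chain ends (0 = no sibling, assumed)
// or leaves the set of fetched rows. The size guard stops the walk if the
// links ever formed a cycle. Returns the visited block numbers in order.
std::vector<std::uint32_t> walk_right(const std::map<std::uint32_t, Row>& rows, std::uint32_t start)
{
    std::vector<std::uint32_t> visited;
    std::uint32_t blkno = start;
    while (blkno != 0 && visited.size() < rows.size()) {
        auto it = rows.find(blkno);
        if (it == rows.end())
            break;  // this page has not been fetched
        visited.push_back(blkno);
        blkno = it->second.btpo_next;
    }
    return visited;
}
```

Fed with the four leaf rows above, walk_right starting at block 1 visits 1, 2, 4, 5 and stops because block 6 was not fetched. Block 3 is the root (type r, level 1), so it is not part of the leaf chain.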
  • Function bt_page_items(IN relname text, IN blkno int4, OUT itemoffset smallint, OUT ctid tid, OUT itemlen smallint, OUT nulls bool, OUT vars bool, OUT data text)
    Returns detailed information about every item on a B-tree index page; implemented by bt_page_items in btreefuncs.cpp.
/*-------------------------------------------------------
 * bt_page_items()
 *
 * Get IndexTupleData set in a btree page
 *
 * Usage: SELECT * FROM bt_page_items('t1_pkey', 1);
 *-------------------------------------------------------
 */

/*
 * cross-call data structure for SRF
 */
struct user_args {
    Page page;
    OffsetNumber offset;
};

Datum bt_page_items(PG_FUNCTION_ARGS)
{
    text* relname = PG_GETARG_TEXT_P(0);
    uint32 blkno = PG_GETARG_UINT32(1);
    Datum result;
    char* values[6];
    HeapTuple tuple;
    FuncCallContext* fctx = NULL;
    MemoryContext mctx;
    struct user_args* uargs;

    if (!superuser())
        ereport(ERROR,
            (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be system admin to use pageinspect functions"))));

    // First call: set up the cross-call state
    if (SRF_IS_FIRSTCALL()) {
        RangeVar* relrv = NULL;
        Relation rel;
        Buffer buffer;
        BTPageOpaqueInternal opaque;
        TupleDesc tupleDesc;

        fctx = SRF_FIRSTCALL_INIT();

        relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname));
        rel = relation_openrv(relrv, AccessShareLock);
        
        // Raise an error if the relation is not a btree index
        if (!IS_INDEX(rel) || !IS_BTREE(rel))
            elog(ERROR, "relation \"%s\" is not a btree index", RelationGetRelationName(rel));

        /*
         * Reject attempts to read non-local temporary relations; we would be
         * likely to get wrong data since we have no visibility into the
         * owning session's local buffers.
         * In short: temporary tables owned by other sessions are rejected.
         */
        if (RELATION_IS_OTHER_TEMP(rel))
            ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot access temporary tables of other sessions")));

        if (blkno == 0)
            elog(ERROR, "block 0 is a meta page");

        // Raise an error if the block number is out of range
        CHECK_RELATION_BLOCK_RANGE(rel, blkno);

        // Read the requested page under a shared buffer lock
        buffer = ReadBuffer(rel, blkno);
        LockBuffer(buffer, BUFFER_LOCK_SHARE);

        /*
         * We copy the page into local storage to avoid holding pin on the
         * buffer longer than we must, and possibly failing to release it at
         * all if the calling query doesn't fetch all rows.
         * The copy of the page is kept in user_args.
         */
        mctx = MemoryContextSwitchTo(fctx->multi_call_memory_ctx);

        uargs = (user_args*)palloc(sizeof(struct user_args));

        uargs->page = (char*)palloc(BLCKSZ);
        memcpy(uargs->page, BufferGetPage(buffer), BLCKSZ);

        UnlockReleaseBuffer(buffer);
        relation_close(rel, AccessShareLock);

        uargs->offset = FirstOffsetNumber; // start at the first offset, 1

        opaque = (BTPageOpaqueInternal)PageGetSpecialPointer(uargs->page);

        if (P_ISDELETED(opaque))
            elog(NOTICE, "page is deleted");

        fctx->max_calls = PageGetMaxOffsetNumber(uargs->page); // number of line pointers on the page

        /* Build a tuple descriptor for our result type */
        if (get_call_result_type(fcinfo, NULL, &tupleDesc) != TYPEFUNC_COMPOSITE)
            elog(ERROR, "return type must be a row type");

        // Build the AttInMetadata used to form result tuples
        fctx->attinmeta = TupleDescGetAttInMetadata(tupleDesc);

        fctx->user_fctx = uargs;

        MemoryContextSwitchTo(mctx);
    }

    // Per-call setup: fetch the cross-call state
    fctx = SRF_PERCALL_SETUP();
    uargs = (user_args*)(fctx->user_fctx);

    if (fctx->call_cntr < fctx->max_calls) {
        ItemId id;
        IndexTuple itup;
        int j;
        int off;
        int dlen;
        char* dump = NULL;
        char* ptr = NULL;

        id = PageGetItemId(uargs->page, uargs->offset);

        if (!ItemIdIsValid(id))
            elog(ERROR, "invalid ItemId");

        itup = (IndexTuple)PageGetItem(uargs->page, id); // fetch the index tuple

        j = 0;
        values[j] = (char*)palloc(32);
        snprintf(values[j++], 32, "%d", uargs->offset); // line pointer offset
        values[j] = (char*)palloc(32);
        snprintf(values[j++], 32, "(%u,%u)", BlockIdGetBlockNumber(&(itup->t_tid.ip_blkid)), itup->t_tid.ip_posid); // ctid
        values[j] = (char*)palloc(32);
        snprintf(values[j++], 32, "%d", (int)IndexTupleSize(itup)); // index tuple size
        values[j] = (char*)palloc(32);
        snprintf(values[j++], 32, "%c", IndexTupleHasNulls(itup) ? 't' : 'f'); // whether the tuple contains NULLs
        values[j] = (char*)palloc(32);
        snprintf(values[j++], 32, "%c", IndexTupleHasVarwidths(itup) ? 't' : 'f'); // whether the tuple has variable-width attributes

        // Hex dump of the index tuple's key data
        ptr = (char*)itup + IndexInfoFindDataOffset(itup->t_info);
        dlen = IndexTupleSize(itup) - IndexInfoFindDataOffset(itup->t_info);
        dump = (char*)palloc0(dlen * 3 + 1);
        values[j] = dump;
        for (off = 0; off < dlen; off++) {
            if (off > 0)
                *dump++ = ' ';
            sprintf(dump, "%02x", *(ptr + off) & 0xff);
            dump += 2;
        }

        // Build the result tuple
        tuple = BuildTupleFromCStrings(fctx->attinmeta, values);
        result = HeapTupleGetDatum(tuple);

        uargs->offset = uargs->offset + 1;

        // Return the current row and keep the SRF state for the next call
        SRF_RETURN_NEXT(fctx, result);
    } else {
        // Free the cross-call state
        pfree(uargs->page);
        pfree(uargs);
        // End the set-returning call sequence and clean up its context
        SRF_RETURN_DONE(fctx);
    }
}
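
The data column is produced by the small hex-dump loop near the end of bt_page_items. Here is a standalone sketch of the same formatting. It swaps the palloc0 buffer and sprintf of the original for std::string and snprintf, and hex_dump is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Format `len` bytes as two-digit lowercase hex values separated by single
// spaces, the same layout bt_page_items() produces for its data column.
std::string hex_dump(const unsigned char* data, std::size_t len)
{
    std::string out;
    out.reserve(len * 3);
    char buf[3];
    for (std::size_t i = 0; i < len; i++) {
        if (i > 0)
            out.push_back(' ');
        std::snprintf(buf, sizeof(buf), "%02x", static_cast<unsigned>(data[i]));
        out.append(buf, 2);
    }
    return out;
}
```

Each byte needs two hex digits plus a separator, which is why the original allocates dlen * 3 + 1 bytes for the dump.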

Run it in gsql:

cc1=# select * from bt_page_items('l_pkey',2);
 itemoffset |  ctid   | itemlen | nulls | vars |          data
------------+---------+---------+-------+------+-------------------------
          1 | (3,55)  |      16 | f     | f    | dd 02 00 00 00 00 00 00
          2 | (1,141) |      16 | f     | f    | 6f 01 00 00 00 00 00 00
          3 | (1,142) |      16 | f     | f    | 70 01 00 00 00 00 00 00
          4 | (1,143) |      16 | f     | f    | 71 01 00 00 00 00 00 00
          5 | (1,144) |      16 | f     | f    | 72 01 00 00 00 00 00 00
          6 | (1,145) |      16 | f     | f    | 73 01 00 00 00 00 00 00
          7 | (1,146) |      16 | f     | f    | 74 01 00 00 00 00 00 00
          8 | (1,147) |      16 | f     | f    | 75 01 00 00 00 00 00 00
          9 | (1,148) |      16 | f     | f    | 76 01 00 00 00 00 00 00
         10 | (1,149) |      16 | f     | f    | 77 01 00 00 00 00 00 00
         11 | (1,150) |      16 | f     | f    | 78 01 00 00 00 00 00 00
         12 | (1,151) |      16 | f     | f    | 79 01 00 00 00 00 00 00
         13 | (1,152) |      16 | f     | f    | 7a 01 00 00 00 00 00 00
         14 | (1,153) |      16 | f     | f    | 7b 01 00 00 00 00 00 00
         15 | (1,154) |      16 | f     | f    | 7c 01 00 00 00 00 00 00
         16 | (1,155) |      16 | f     | f    | 7d 01 00 00 00 00 00 00
         17 | (1,156) |      16 | f     | f    | 7e 01 00 00 00 00 00 00
         18 | (1,157) |      16 | f     | f    | 7f 01 00 00 00 00 00 00
         19 | (1,158) |      16 | f     | f    | 80 01 00 00 00 00 00 00
         20 | (1,159) |      16 | f     | f    | 81 01 00 00 00 00 00 00
         21 | (1,160) |      16 | f     | f    | 82 01 00 00 00 00 00 00
         22 | (1,161) |      16 | f     | f    | 83 01 00 00 00 00 00 00
         23 | (1,162) |      16 | f     | f    | 84 01 00 00 00 00 00 00
         24 | (1,163) |      16 | f     | f    | 85 01 00 00 00 00 00 00
         25 | (1,164) |      16 | f     | f    | 86 01 00 00 00 00 00 00
         26 | (1,165) |      16 | f     | f    | 87 01 00 00 00 00 00 00
         27 | (1,166) |      16 | f     | f    | 88 01 00 00 00 00 00 00
         28 | (1,167) |      16 | f     | f    | 89 01 00 00 00 00 00 00
         29 | (1,168) |      16 | f     | f    | 8a 01 00 00 00 00 00 00
         30 | (1,169) |      16 | f     | f    | 8b 01 00 00 00 00 00 00
         31 | (1,170) |      16 | f     | f    | 8c 01 00 00 00 00 00 00
         32 | (1,171) |      16 | f     | f    | 8d 01 00 00 00 00 00 00
         33 | (1,172) |      16 | f     | f    | 8e 01 00 00 00 00 00 00
         34 | (1,173) |      16 | f     | f    | 8f 01 00 00 00 00 00 00
         35 | (1,174) |      16 | f     | f    | 90 01 00 00 00 00 00 00
         36 | (1,175) |      16 | f     | f    | 91 01 00 00 00 00 00 00
         37 | (1,176) |      16 | f     | f    | 92 01 00 00 00 00 00 00
         38 | (1,177) |      16 | f     | f    | 93 01 00 00 00 00 00 00
         39 | (1,178) |      16 | f     | f    | 94 01 00 00 00 00 00 00
         40 | (1,179) |      16 | f     | f    | 95 01 00 00 00 00 00 00
         41 | (1,180) |      16 | f     | f    | 96 01 00 00 00 00 00 00
         42 | (1,181) |      16 | f     | f    | 97 01 00 00 00 00 00 00
         43 | (1,182) |      16 | f     | f    | 98 01 00 00 00 00 00 00
         44 | (1,183) |      16 | f     | f    | 99 01 00 00 00 00 00 00
cc1=#
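
The data column is the raw key in host byte order, so it can be decoded by hand. Below is a sketch that turns a dump such as "6f 01 00 00 00 00 00 00" back into the int4 key. It assumes an int4 index column and a little-endian host (the build in this article is x86_64), and it treats the trailing four bytes as alignment padding; parse_hex and decode_int4_le are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Parse "6f 01 00 00" style text into raw bytes.
std::vector<std::uint8_t> parse_hex(const std::string& text)
{
    std::vector<std::uint8_t> bytes;
    std::istringstream in(text);
    std::string tok;
    while (in >> tok)
        bytes.push_back(static_cast<std::uint8_t>(std::stoul(tok, nullptr, 16)));
    return bytes;
}

// Interpret the first four bytes as a little-endian int4 key.
// Assumes an int4 index column and a little-endian host; throws
// std::out_of_range if fewer than four bytes are supplied.
std::int32_t decode_int4_le(const std::vector<std::uint8_t>& bytes)
{
    std::uint32_t v = 0;
    for (int i = 3; i >= 0; i--)
        v = (v << 8) | bytes.at(i);
    return static_cast<std::int32_t>(v);
}
```

Item 2 decodes to 367 and item 44 to 409, while item 1 decodes to 733. That is consistent with PostgreSQL's B-tree layout, where the first item on a page that has a right sibling is the page's high key, not an ordinary entry.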
  • Function fsm_page_contents(IN page bytea)
    Shows the internal node structure of an FSM page; implemented by fsm_page_contents in fsmfuncs.cpp.
/*
 * Dumps the contents of a FSM page.
 */
PG_FUNCTION_INFO_V1(fsm_page_contents);

Datum fsm_page_contents(PG_FUNCTION_ARGS)
{
    bytea* raw_page = PG_GETARG_BYTEA_P(0);
    StringInfoData sinfo;
    FSMPage fsmpage;
    uint32 i;

    if (!superuser())
        ereport(ERROR,
            (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be system admin to use raw page functions"))));

    // Locate the FSM data that follows the page header
    fsmpage = (FSMPage)PageGetContents(VARDATA(raw_page));

    // Initialize the output string buffer
    initStringInfo(&sinfo);

    // Append one "index: value" line for every nonzero node
    for (i = 0; i < NodesPerPage; i++) {
        if (fsmpage->fp_nodes[i] != 0)
            appendStringInfo(&sinfo, "%d: %d\n", i, fsmpage->fp_nodes[i]);
    }
    // Append fp_next_slot, the hint for where the next search on this page starts
    appendStringInfo(&sinfo, "fp_next_slot: %d\n", fsmpage->fp_next_slot);

    PG_RETURN_TEXT_P(cstring_to_text(sinfo.data));
}

Run it in gsql:

cc1=# SELECT fsm_page_contents(get_raw_page('pg_class', 'fsm', 0));
 fsm_page_contents
-------------------
 0: 250           +
 1: 250           +
 3: 250           +
 7: 250           +
 15: 250          +
 31: 250          +
 63: 250          +
 127: 250         +
 255: 250         +
 511: 250         +
 1023: 250        +
 2047: 250        +
 4095: 250        +
 fp_next_slot: 0  +

(1 row)
cc1=#
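
The node indexes in the output above (0, 1, 3, 7, ..., 4095) are all of the form 2^d - 1, which is the leftmost root-to-leaf path of a binary tree stored in an array. The sketch below reproduces that path and converts a category value into bytes. It follows the conventions of PostgreSQL's fsmpage.c (left child at 2i + 1, 256 categories per block) and assumes they are unchanged in openGauss with 8 kB blocks; all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kBlockSize = 8192;                    // assumed BLCKSZ
constexpr std::size_t kCategories = 256;                    // FSM_CATEGORIES in PostgreSQL
constexpr std::size_t kCatStep = kBlockSize / kCategories;  // 32 bytes per category

// Array-embedded binary tree navigation, as in PostgreSQL's fsmpage.c.
constexpr std::size_t left_child(std::size_t i) { return 2 * i + 1; }
constexpr std::size_t parent_of(std::size_t i) { return (i - 1) / 2; }

// Lower bound, in bytes, of the free space that a category value stands for.
constexpr std::size_t category_to_bytes(unsigned cat) { return cat * kCatStep; }

// Node indexes on the leftmost root-to-leaf path with `depth` edges.
std::vector<std::size_t> leftmost_path(std::size_t depth)
{
    std::vector<std::size_t> path;
    path.push_back(0);
    for (std::size_t d = 0; d < depth; d++)
        path.push_back(left_child(path.back()));
    return path;
}
```

Under these assumptions, leftmost_path(12) ends at node 4095, the first leaf slot, and a value of 250 means at least 250 * 32 = 8000 free bytes. Every inner node holds the maximum of its children, which is why the same 250 appears on the whole path up to the root.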
  • Function gin_metapage_info(IN page bytea, OUT pending_head bigint, OUT pending_tail bigint, OUT tail_free_size int4, OUT n_pending_pages bigint, OUT n_pending_tuples bigint, OUT n_total_pages bigint, OUT n_entry_pages bigint, OUT n_data_pages bigint, OUT n_entries bigint, OUT version int4)
    Returns information about a GIN index meta page; implemented by gin_metapage_info in ginfuncs.cpp.
Datum gin_metapage_info(PG_FUNCTION_ARGS)
{
    bytea* raw_page = PG_GETARG_BYTEA_P(0);
    int raw_page_size;
    TupleDesc tupdesc;
    Page page;
    GinPageOpaque opaq;
    GinMetaPageData* metadata = NULL;
    HeapTuple resultTuple;
    Datum values[10];
    bool nulls[10];

    if (!superuser())
        ereport(
            ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be superuser to use raw page functions"))));

    // Check the size of the input page
    raw_page_size = VARSIZE(raw_page) - VARHDRSZ;
    if (raw_page_size < BLCKSZ)
        ereport(ERROR,
            (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("input page too small (%d bytes)", raw_page_size)));
    page = VARDATA(raw_page);

    // Check that the page is a GIN meta page
    opaq = (GinPageOpaque)PageGetSpecialPointer(page);
    if (opaq->flags != GIN_META)
        ereport(ERROR,
            (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                errmsg("input page is not a GIN metapage"),
                errdetail("Flags %04X, expected %04X", opaq->flags, GIN_META)));

    /* Build a tuple descriptor for our result type */
    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
        elog(ERROR, "return type must be a row type");

    // Get the GIN meta page data
    metadata = GinPageGetMeta(page);

    memset(nulls, 0, sizeof(nulls));

    values[0] = Int64GetDatum(metadata->head);
    values[1] = Int64GetDatum(metadata->tail);
    values[2] = Int32GetDatum(metadata->tailFreeSize);
    values[3] = Int64GetDatum(metadata->nPendingPages);
    values[4] = Int64GetDatum(metadata->nPendingHeapTuples);

    /* statistics, updated by VACUUM */
    values[5] = Int64GetDatum(metadata->nTotalPages);
    values[6] = Int64GetDatum(metadata->nEntryPages);
    values[7] = Int64GetDatum(metadata->nDataPages);
    values[8] = Int64GetDatum(metadata->nEntries);

    values[9] = Int32GetDatum(metadata->ginVersion);

    /* Build and return the result tuple. */
    resultTuple = heap_form_tuple(tupdesc, values, nulls);

    return HeapTupleGetDatum(resultTuple);
}

Run it in gsql:

cc1=# \d+ g
                          Table "public.g"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 id     | integer |           | plain   |              |
Indexes:
    "g_idx" gin (id) TABLESPACE pg_default
Has OIDs: no
Options: orientation=row, compression=no
cc1=# select * from gin_metapage_info(get_raw_page('g_idx',0));
 pending_head | pending_tail | tail_free_size | n_pending_pages | n_pending_tuples | n_total_pages | n_entry_pages | n_data_pages | n_entries | version
--------------+--------------+----------------+-----------------+------------------+---------------+---------------+--------------+-----------+---------
            2 |            2 |           8040 |               1 |                6 |             2 |             1 |            0 |         0 |       2
(1 row)

cc1=#
  • Function gin_page_opaque_info(IN page bytea, OUT rightlink bigint, OUT maxoff int4, OUT flags text[])
    Returns information from the opaque area of a GIN index page, such as the page type; implemented by gin_page_opaque_info in ginfuncs.cpp.
Datum gin_page_opaque_info(PG_FUNCTION_ARGS)
{
    bytea* raw_page = PG_GETARG_BYTEA_P(0);
    int raw_page_size;
    TupleDesc tupdesc;
    Page page;
    GinPageOpaque opaq;
    HeapTuple resultTuple;
    Datum values[3];
    bool nulls[10];
    Datum flags[16];
    int nflags = 0;
    uint16 flagbits;

    if (!superuser())
        ereport(
            ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be superuser to use raw page functions"))));

    raw_page_size = VARSIZE(raw_page) - VARHDRSZ;
    if (raw_page_size < BLCKSZ)
        ereport(ERROR,
            (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("input page too small (%d bytes)", raw_page_size)));
    page = VARDATA(raw_page);

    // Get the GIN opaque data from the page's special area
    opaq = (GinPageOpaque)PageGetSpecialPointer(page);

    /* Build a tuple descriptor for our result type */
    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
        elog(ERROR, "return type must be a row type");

    /* Convert the flags bitmask to an array of human-readable names */
    flagbits = opaq->flags;
    if (flagbits & GIN_DATA)
        flags[nflags++] = CStringGetTextDatum("data");
    if (flagbits & GIN_LEAF)
        flags[nflags++] = CStringGetTextDatum("leaf"); // leaf page
    if (flagbits & GIN_DELETED)
        flags[nflags++] = CStringGetTextDatum("deleted");
    if (flagbits & GIN_META)
        flags[nflags++] = CStringGetTextDatum("meta"); // meta page
    if (flagbits & GIN_LIST)
        flags[nflags++] = CStringGetTextDatum("list");
    if (flagbits & GIN_LIST_FULLROW)
        flags[nflags++] = CStringGetTextDatum("list_fullrow");
    if (flagbits & GIN_INCOMPLETE_SPLIT)
        flags[nflags++] = CStringGetTextDatum("incomplete_split");
    if (flagbits & GIN_COMPRESSED)
        flags[nflags++] = CStringGetTextDatum("compressed");
    flagbits &= ~(GIN_DATA | GIN_LEAF | GIN_DELETED | GIN_META | GIN_LIST | GIN_LIST_FULLROW | GIN_INCOMPLETE_SPLIT |
                  GIN_COMPRESSED);
    if (flagbits) {
        /* any flags we don't recognize are printed in hex */
        flags[nflags++] = DirectFunctionCall1(to_hex32, Int32GetDatum(flagbits));
    }

    memset(nulls, 0, sizeof(nulls));

    values[0] = Int64GetDatum(opaq->rightlink); // right sibling, if any
    values[1] = Int64GetDatum(opaq->maxoff);
    values[2] = PointerGetDatum(construct_array(flags, nflags, TEXTOID, -1, false, 'i'));

    /* Build and return the result tuple. */
    resultTuple = heap_form_tuple(tupdesc, values, nulls);

    return HeapTupleGetDatum(resultTuple);
}

Run it in gsql:

cc1=# SELECT * FROM gin_page_opaque_info(get_raw_page('g_idx',25));
 rightlink | maxoff |        flags
-----------+--------+---------------------
        26 |    408 | {list,list_fullrow}
(1 row)

cc1=#
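
The flags column is a bitmask turned into names, with any unrecognized bits printed in hex. Below is a standalone sketch of that decoding. The bit values are the ones defined in PostgreSQL's ginblock.h and are assumed to be unchanged in openGauss; FlagName, kGinFlags, and decode_gin_flags are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

struct FlagName {
    std::uint16_t bit;
    const char* name;
};

// Bit values as defined in PostgreSQL's ginblock.h (assumed, not verified
// against the openGauss headers).
static const FlagName kGinFlags[] = {
    {1 << 0, "data"}, {1 << 1, "leaf"}, {1 << 2, "deleted"}, {1 << 3, "meta"},
    {1 << 4, "list"}, {1 << 5, "list_fullrow"}, {1 << 6, "incomplete_split"},
    {1 << 7, "compressed"},
};

// Translate a flags bitmask into names; leftover bits are reported as one
// lowercase hex string, mirroring the to_hex32 fallback in the original.
std::vector<std::string> decode_gin_flags(std::uint16_t flagbits)
{
    std::vector<std::string> names;
    for (const FlagName& f : kGinFlags) {
        if (flagbits & f.bit) {
            names.push_back(f.name);
            flagbits = static_cast<std::uint16_t>(flagbits & ~f.bit);
        }
    }
    if (flagbits) {
        char buf[8];
        std::snprintf(buf, sizeof(buf), "%x", static_cast<unsigned>(flagbits));
        names.push_back(buf);
    }
    return names;
}
```

With these values, the {list,list_fullrow} result above corresponds to a flags word of 0x30, which marks a page of the pending list.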
  • Function gin_leafpage_items(IN page bytea, OUT first_tid tid, OUT nbytes int2, OUT tids tid[])
    Returns information about the data stored in a GIN leaf page; implemented by gin_leafpage_items in ginfuncs.cpp.
typedef struct gin_leafpage_items_state {
    TupleDesc tupd;
    GinPostingList* seg;
    GinPostingList* lastseg;
} gin_leafpage_items_state;

Datum gin_leafpage_items(PG_FUNCTION_ARGS)
{
    bytea* raw_page = PG_GETARG_BYTEA_P(0);
    int raw_page_size;
    FuncCallContext* fctx = NULL;
    gin_leafpage_items_state* inter_call_data = NULL;

    if (!superuser())
        ereport(
            ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), (errmsg("must be superuser to use raw page functions"))));

    raw_page_size = VARSIZE(raw_page) - VARHDRSZ;

    // First call: validate the page and set up the cross-call state
    if (SRF_IS_FIRSTCALL()) {
        TupleDesc tupdesc;
        MemoryContext mctx;
        Page page;
        GinPageOpaque opaq;

        // Check the size of the input page
        if (raw_page_size < BLCKSZ)
            ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("input page too small (%d bytes)", raw_page_size)));
        page = VARDATA(raw_page);

        // Check the size of the special (opaque) area
        if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
            ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                    errmsg("input page is not a valid GIN data leaf page"),
                    errdetail("Special size %d, expected %d",
                        (int)PageGetSpecialSize(page),
                        (int)MAXALIGN(sizeof(GinPageOpaqueData)))));

        opaq = (GinPageOpaque)PageGetSpecialPointer(page);
        // Raise an error unless this is a compressed GIN data leaf page
        if (opaq->flags != (GIN_DATA | GIN_LEAF | GIN_COMPRESSED))
            ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                    errmsg("input page is not a compressed GIN data leaf page"),
                    errdetail("Flags %04X, expected %04X", opaq->flags, (GIN_DATA | GIN_LEAF | GIN_COMPRESSED))));

        // Initialize the FuncCallContext
        fctx = SRF_FIRSTCALL_INIT();
        mctx = MemoryContextSwitchTo(fctx->multi_call_memory_ctx);

        inter_call_data = (gin_leafpage_items_state*)palloc(sizeof(gin_leafpage_items_state));

        /* Build a tuple descriptor for our result type */
        if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
            elog(ERROR, "return type must be a row type");

        inter_call_data->tupd = tupdesc;

        // First posting list segment and the end of the segment area
        inter_call_data->seg = GinDataLeafPageGetPostingList(page);
        inter_call_data->lastseg =
            (GinPostingList*)(((char*)inter_call_data->seg) + GinDataLeafPageGetPostingListSize(page));

        fctx->user_fctx = inter_call_data;

        MemoryContextSwitchTo(mctx);
    }

    // Per-call setup: fetch the cross-call state
    fctx = SRF_PERCALL_SETUP();
    inter_call_data = (gin_leafpage_items_state*)fctx->user_fctx;

    if (inter_call_data->seg != inter_call_data->lastseg) {
        GinPostingList* cur = inter_call_data->seg;
        HeapTuple resultTuple;
        Datum result;
        Datum values[3];
        bool nulls[3];
        int ndecoded, i;
        ItemPointer tids;
        Datum* tids_datum = NULL;

        memset(nulls, 0, sizeof(nulls));

        values[0] = ItemPointerGetDatum(&cur->first); // first TID in the segment
        values[1] = UInt16GetDatum(cur->nbytes); // number of encoded bytes that follow

        /* build an array of decoded item pointers */
        tids = ginPostingListDecode(cur, &ndecoded);
        tids_datum = (Datum*)palloc(ndecoded * sizeof(Datum));
        for (i = 0; i < ndecoded; i++)
            tids_datum[i] = ItemPointerGetDatum(&tids[i]);
        values[2] = PointerGetDatum(construct_array(tids_datum, ndecoded, TIDOID, sizeof(ItemPointerData), false, 's'));
        pfree(tids_datum);
        pfree(tids);

        /* Build and return the result tuple. */
        resultTuple = heap_form_tuple(inter_call_data->tupd, values, nulls);
        result = HeapTupleGetDatum(resultTuple);

        inter_call_data->seg = GinNextPostingListSegment(cur);

        // Return the current row and keep the SRF state for the next call
        SRF_RETURN_NEXT(fctx, result);
    } else
        // End the set-returning call sequence and clean up its context
        SRF_RETURN_DONE(fctx);
}
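
gin_leafpage_items emits one row per posting list segment: it starts at the first segment, hops forward by each segment's own size, and stops when the cursor reaches the end of the segment area. Below is a sketch of that walk over a plain byte buffer. The layout (a 6-byte first TID, a 2-byte nbytes field, then nbytes of payload padded to an even length) follows GinPostingList in PostgreSQL's ginblock.h and is an assumption about openGauss; all names are hypothetical. Unlike the original, which compares pointers with !=, this version also stops if a header would run past the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed segment header: 6-byte first TID followed by a 2-byte nbytes field.
constexpr std::size_t kHeaderSize = 8;

// Round up to a multiple of 2, like SHORTALIGN.
constexpr std::size_t short_align(std::size_t n) { return (n + 1) & ~static_cast<std::size_t>(1); }

// Return the nbytes value of every segment in `area`, in order.
std::vector<std::uint16_t> segment_sizes(const std::vector<std::uint8_t>& area)
{
    std::vector<std::uint16_t> sizes;
    std::size_t pos = 0;
    while (pos + kHeaderSize <= area.size()) {
        std::uint16_t nbytes;
        std::memcpy(&nbytes, area.data() + pos + 6, sizeof(nbytes));  // host byte order
        sizes.push_back(nbytes);
        pos += kHeaderSize + short_align(nbytes);  // hop to the next segment
    }
    return sizes;
}

// Helper for building mock input: append one segment with a zeroed TID and
// `nbytes` zero payload bytes (padded to an even length).
void append_segment(std::vector<std::uint8_t>& area, std::uint16_t nbytes)
{
    std::uint8_t header[kHeaderSize] = {0};
    std::memcpy(header + 6, &nbytes, sizeof(nbytes));
    area.insert(area.end(), header, header + kHeaderSize);
    area.insert(area.end(), short_align(nbytes), static_cast<std::uint8_t>(0));
}
```

Keeping only the cursor and the end position between calls is what lets the set-returning function resume where it left off, which is the role of seg and lastseg in gin_leafpage_items_state.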

Run it in gsql:

cc1=# CREATE TABLE test1 (x int, y int[]);
CREATE TABLE
cc1=# INSERT INTO test1 VALUES (1, ARRAY[11, 111]);
INSERT 0 1
cc1=# CREATE INDEX test1_y_idx ON test1 USING gin (y) WITH (fastupdate = off);
CREATE INDEX
cc1=# select * FROM gin_leafpage_items(get_raw_page('test1_y_idx',
cc1(#                         (pg_relation_size('test1_y_idx') /
cc1(#                          current_setting('block_size')::bigint)::int - 1));
-[ RECORD 1 ]-------------------------------------------------------------------------------------------
first_tid | (0,2)
nbytes    | 372
tids      | {"(0,2)","(0,3)","(0,4)","(0,5)","(0,6)","(0,7)","(0,8)","(0,9)","(0,10)","(0,11)","(0,12)","(0,13)","(0,14)","(0,15)","(0,16)","(0,17)","(0,18)","(0,19)","(0,20)","(0,21)
","(0,22)","(0,23)","(0,24)","(0,25)","(0,26)","(0,27)","(0,28)","(0,29)","(0,30)","(0,31)","(0,32)","(0,33)","(0,34)","(0,35)","(0,36)","(0,37)","(0,38)","(0,39)","(0,40)","(0,41)","
(0,42)","(0,43)","(0,44)","(0,45)","(0,46)","(0,47)","(0,48)","(0,49)","(0,50)","(0,51)","(0,52)","(0,53)","(0,54)","(0,55)","(0,56)","(0,57)","(0,58)","(0,59)","(0,60)","(0,61)","(0,
62)","(0,63)","(0,64)","(0,65)","(0,66)","(0,67)","(0,68)","(0,69)","(0,70)","(0,71)","(0,72)","(0,73)","(0,74)","(0,75)","(0,76)","(0,77)","(0,78)","(0,79)","(0,80)","(0,81)","(0,82)
","(0,83)","(0,84)","(0,85)","(0,86)","(0,87)","(0,88)","(0,89)","(0,90)","(0,91)","(0,92)","(0,93)","(0,94)","(0,95)","(0,96)","(0,97)","(0,98)","(0,99)","(0,100)","(0,101)","(0,102)
","(0,103)","(0,104)","(0,105)","(0,106)","(0,107)","(0,108)","(0,109)","(0,110)","(0,111)","(0,112)","(0,113)","(0,114)","(0,115)","(0,116)","(0,117)","(0,118)","(0,119)","(1,1)","(1
,2)","(1,3)","(1,4)","(1,5)","(1,6)","(1,7)","(1,8)","(1,9)","(1,10)","(1,11)","(1,12)","(1,13)","(1,14)","(1,15)","(1,16)","(1,17)","(1,18)","(1,19)","(1,20)","(1,21)","(1,22)","(1,2
3)","(1,24)","(1,25)","(1,26)","(1,27)","(1,28)","(1,29)","(1,30)","(1,31)","(1,32)","(1,33)","(1,34)","(1,35)","(1,36)","(1,37)","(1,38)","(1,39)","(1,40)","(1,41)","(1,42)","(1,43)"
,"(1,44)","(1,45)","(1,46)","(1,47)","(1,48)","(1,49)","(1,50)","(1,51)","(1,52)","(1,53)","(1,54)","(1,55)","(1,56)","(1,57)","(1,58)","(1,59)","(1,60)","(1,61)","(1,62)","(1,63)","(
1,64)","(1,65)","(1,66)","(1,67)","(1,68)","(1,69)","(1,70)","(1,71)","(1,72)","(1,73)","(1,74)","(1,75)","(1,76)","(1,77)","(1,78)","(1,79)","(1,80)","(1,81)","(1,82)","(1,83)","(1,8
4)","(1,85)","(1,86)","(1,87)","(1,88)","(1,89)","(1,90)","(1,91)","(1,92)","(1,93)","(1,94)","(1,95)","(1,96)","(1,97)","(1,98)","(1,99)","(1,100)","(1,101)","(1,102)","(1,103)","(1,
104)","(1,105)","(1,106)","(1,107)","(1,108)","(1,109)","(1,110)","(1,111)","(1,112)","(1,113)","(1,114)","(1,115)","(1,116)","(1,117)","(1,118)","(1,119)","(2,1)","(2,2)","(2,3)","(2
,4)","(2,5)","(2,6)","(2,7)","(2,8)","(2,9)","(2,10)","(2,11)","(2,12)","(2,13)","(2,14)","(2,15)","(2,16)","(2,17)","(2,18)","(2,19)","(2,20)","(2,21)","(2,22)","(2,23)","(2,24)","(2
,25)","(2,26)","(2,27)","(2,28)","(2,29)","(2,30)","(2,31)","(2,32)","(2,33)","(2,34)","(2,35)","(2,36)","(2,37)","(2,38)","(2,39)","(2,40)","(2,41)","(2,42)","(2,43)","(2,44)","(2,45
)","(2,46)","(2,47)","(2,48)","(2,49)","(2,50)","(2,51)","(2,52)","(2,53)","(2,54)","(2,55)","(2,56)","(2,57)","(2,58)","(2,59)","(2,60)","(2,61)","(2,62)","(2,63)","(2,64)","(2,65)",
"(2,66)","(2,67)","(2,68)","(2,69)","(2,70)","(2,71)","(2,72)","(2,73)","(2,74)","(2,75)","(2,76)","(2,77)","(2,78)","(2,79)","(2,80)","(2,81)","(2,82)","(2,83)","(2,84)","(2,85)","(2
,86)","(2,87)","(2,88)","(2,89)","(2,90)","(2,91)","(2,92)","(2,93)","(2,94)","(2,95)","(2,96)","(2,97)","(2,98)","(2,99)","(2,100)","(2,101)","(2,102)","(2,103)","(2,104)","(2,105)",
"(2,106)","(2,107)","(2,108)","(2,109)","(2,110)","(2,111)","(2,112)","(2,113)","(2,114)","(2,115)","(2,116)","(2,117)","(2,118)","(2,119)","(3,1)","(3,2)","(3,3)","(3,4)","(3,5)","(3
,6)","(3,7)","(3,8)","(3,9)","(3,10)","(3,11)","(3,12)","(3,13)","(3,14)"}