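1) The pageinspect extension is used to inspect the low-level storage of data.
1) get_raw_page: returns the contents of the specified page of a table's file. param1: table name; param2: fork (main/fsm/vm); param3: page (block) number, starting at 0;
2) page_header: returns the page header fields of a page. param1: the return value of get_raw_page;
3) heap_page_items: shows all line pointers on a heap page. param1: the return value of get_raw_page.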
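Inserting one row in pgsql adds tuple data + a tupleHeader + a line pointer (itemPointer) to a single page. The tuple data size varies, the tupleHeader occupies 24 bytes, and the line pointer occupies 4 bytes.
Hypothesis:
1) When a row with one int column is inserted, line pointers are written from the front of the page towards the back, 4 bytes each, so lower should grow by 4 bytes per insert;
2) When a row with one int column is inserted, tuples are written from the back of the page towards the front. Each one needs the tupleHeader (24 bytes) plus the data (4 bytes for an int) = 28 bytes, which is rounded up to a multiple of 8, so upper should shrink by 32 bytes per insert.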
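The hypothesis can be turned into numbers before running anything. The following is a minimal Python sketch of that arithmetic, not PostgreSQL code: the helper names `maxalign` and `expected_bounds` are made up for illustration, and it assumes the default 8192-byte page, 8-byte alignment, and an empty special space.

```python
PAGE_SIZE = 8192       # default block size (assumed)
PAGE_HEADER = 24       # size of the page header
LINE_POINTER = 4       # size of one line pointer
TUPLE_HEADER = 24      # tuple header size used in this article
MAXIMUM_ALIGNOF = 8    # assumed alignment on a typical 64-bit build


def maxalign(size):
    """Round size up to the next multiple of MAXIMUM_ALIGNOF."""
    return (size + MAXIMUM_ALIGNOF - 1) // MAXIMUM_ALIGNOF * MAXIMUM_ALIGNOF


def expected_bounds(rows, data_len=4):
    """Return the (lower, upper) offsets predicted after `rows` inserts."""
    lower = PAGE_HEADER + rows * LINE_POINTER
    upper = PAGE_SIZE - rows * maxalign(TUPLE_HEADER + data_len)
    return lower, upper


# Predicted values for the first three inserts of a single int column.
for n in range(1, 4):
    print(n, expected_bounds(n))
```

If the hypothesis holds, the lower and upper columns returned by page_header should match these pairs after each insert.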
Verification:
-- pageinspect must be installed first (requires superuser)
create extension if not exists pageinspect;
drop table if EXISTS test1;
create table test1(id int);
vacuum analyze test1;
-- 1. Insert the first row
insert into test1 values(1);
-- 2. Look at the page header
select * from page_header(get_raw_page('test1', 'main', 0));
-- 3. Insert the second row
insert into test1 values(2);
-- 4. Look at the page header
select * from page_header(get_raw_page('test1', 'main', 0));
-- 5. Insert the third row
insert into test1 values(300);
select * from public.test1;
-- 6. Look at the page header
select * from page_header(get_raw_page('test1', 'main', 0));
-- 7. List the line pointers and tuples on the page
select * from heap_page_items(get_raw_page('test1',0));
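Result:
The page headers show the following:
Each inserted int row raises lower by 4 bytes (the itemId array is written from the front of the page, right after the header) and lowers upper by 32 bytes (tuples are written from the end of the page; tupleHeader + data = 24 + 4 = 28, which is rounded up to a multiple of 8, giving 32 bytes).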
Field reference:
The five parts of a page:
Item | Description |
---|---|
PageHeaderData | 24 bytes long. Contains general information about the page, including free space pointers. |
ItemIdData | Array of item identifiers pointing to the actual items. Each entry is an (offset,length) pair. 4 bytes per item. |
Free space | The unallocated space. New item identifiers are allocated from the start of this area, new items from the end. |
Items | The actual items themselves. |
Special space | Index access method specific data. Different methods store different data. Empty in ordinary tables. |
Page header (24 bytes)
Field | Type | Length | Description |
---|---|---|---|
pd_lsn | PageXLogRecPtr | 8 bytes | LSN: next byte after last byte of WAL record for last change to this page |
pd_checksum | uint16 | 2 bytes | Page checksum |
pd_flags | uint16 | 2 bytes | Flag bits |
pd_lower | LocationIndex | 2 bytes | Offset to start of free space |
pd_upper | LocationIndex | 2 bytes | Offset to end of free space |
pd_special | LocationIndex | 2 bytes | Offset to start of special space |
pd_pagesize_version | uint16 | 2 bytes | Page size and layout version number information |
pd_prune_xid | TransactionId | 4 bytes | Oldest unpruned XMAX on page, or zero if none |
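To make the table concrete, here is a hedged Python sketch that decodes the first 24 bytes of a page using the field order and lengths listed above. It is illustrative only: `parse_page_header` is a made-up helper, the `<` prefix assumes the page came from a little-endian server, and the sample bytes are synthetic values built in the script, not output from a real database.

```python
import struct

# pd_lsn is stored as two 32-bit halves (high part first), followed by
# six 2-byte fields and the 4-byte pd_prune_xid. "<" = little-endian (assumed).
PAGE_HEADER_FORMAT = "<IIHHHHHHI"


def parse_page_header(page):
    """Decode the 24-byte header at the start of a raw page."""
    (lsn_hi, lsn_lo, checksum, flags, lower, upper,
     special, pagesize_version, prune_xid) = struct.unpack_from(
        PAGE_HEADER_FORMAT, page)
    return {
        "lsn": f"{lsn_hi:X}/{lsn_lo:X}",
        "checksum": checksum,
        "flags": flags,
        "lower": lower,
        "upper": upper,
        "special": special,
        # The high byte carries the page size, the low byte the layout version.
        "pagesize": pagesize_version & 0xFF00,
        "version": pagesize_version & 0x00FF,
        "prune_xid": prune_xid,
    }


# Synthetic header resembling a page that holds one int row.
sample = struct.pack(PAGE_HEADER_FORMAT, 0, 0x16B3748, 0, 0, 28, 8160,
                     8192, 8192 | 4, 0)
header = parse_page_header(sample)
print(struct.calcsize(PAGE_HEADER_FORMAT), header)
```

The format string adds up to 24 bytes, which matches the header size given above. In practice page_header already does this decoding on the server side, so the sketch is only meant to show how the fields are packed.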
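HeapTupleHeaderData Layout (23 bytes: t_cid and t_xvac share the same 4 bytes, so they are counted once; the header is padded to 24 bytes so that the user data starts on an 8-byte boundary)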
Field | Type | Length | Description |
---|---|---|---|
t_xmin | TransactionId | 4 bytes | insert XID stamp |
t_xmax | TransactionId | 4 bytes | delete XID stamp |
t_cid | CommandId | 4 bytes | insert and/or delete CID stamp (overlays with t_xvac) |
t_xvac | TransactionId | 4 bytes | XID for VACUUM operation moving a row version |
t_ctid | ItemPointerData | 6 bytes | current TID of this or newer row version |
t_infomask2 | uint16 | 2 bytes | number of attributes, plus various flag bits |
t_infomask | uint16 | 2 bytes | various flag bits |
t_hoff | uint8 | 1 byte | offset to user data |
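The header size can be checked by adding up the lengths in the table. A small Python sketch of that sum, assuming 8-byte alignment; the names `FIELDS` and `maxalign` are only for illustration:

```python
MAXIMUM_ALIGNOF = 8  # assumed alignment on a typical 64-bit build

# (field, length in bytes). t_cid and t_xvac overlay the same 4 bytes,
# so they appear as a single entry.
FIELDS = [
    ("t_xmin", 4),
    ("t_xmax", 4),
    ("t_cid / t_xvac", 4),
    ("t_ctid", 6),
    ("t_infomask2", 2),
    ("t_infomask", 2),
    ("t_hoff", 1),
]


def maxalign(size):
    """Round size up to the next multiple of MAXIMUM_ALIGNOF."""
    return (size + MAXIMUM_ALIGNOF - 1) // MAXIMUM_ALIGNOF * MAXIMUM_ALIGNOF


raw_size = sum(length for _, length in FIELDS)
aligned_size = maxalign(raw_size)
print(raw_size, aligned_size)  # 23 24
```

The aligned value is what the t_hoff column of heap_page_items reports for rows without a null bitmap.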
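Hypothesis:
pageHeader: 24 bytes
Line pointer (ItemPointer): 4 bytes
tupleHeader: 24 bytes
tuple data: an int here, 4 bytes; header + data = 28 bytes, rounded up to 32 bytes
Single page: 8192 bytes
Estimated capacity: (8192-24)/(4+32)=226.88, i.e. at most 226 rows per page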
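The same estimate, written as a short Python sketch so it can be repeated for other column widths. The helper `rows_per_page` is hypothetical, and the model assumes fixed-width columns, no null bitmap, 8-byte alignment, and the default fillfactor for a heap table.

```python
PAGE_SIZE = 8192       # default block size (assumed)
PAGE_HEADER = 24       # size of the page header
LINE_POINTER = 4       # size of one line pointer
TUPLE_HEADER = 24      # padded tuple header size
MAXIMUM_ALIGNOF = 8    # assumed alignment on a typical 64-bit build


def maxalign(size):
    """Round size up to the next multiple of MAXIMUM_ALIGNOF."""
    return (size + MAXIMUM_ALIGNOF - 1) // MAXIMUM_ALIGNOF * MAXIMUM_ALIGNOF


def rows_per_page(data_len):
    """Return (rows that fit, bytes left over) for rows of data_len bytes."""
    usable = PAGE_SIZE - PAGE_HEADER
    per_row = LINE_POINTER + maxalign(TUPLE_HEADER + data_len)
    return usable // per_row, usable % per_row


print(rows_per_page(4))  # (226, 32): one int column
print(rows_per_page(8))  # (226, 32): 8 bytes of data needs no padding
```

Under this model a 4-byte and an 8-byte column give the same capacity, because the 4 bytes of padding after the int are simply used by the wider value.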
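The experiment:
drop table if EXISTS test1;
create table test1(id int);
vacuum analyze test1;
-- Fill the first page with exactly 226 rows
insert into test1 select generate_series(1,226);
select * from page_header(get_raw_page('test1', 'main', 0));
select * from heap_page_items(get_raw_page('test1',0));
select count(*) from heap_page_items(get_raw_page('test1',0));
-- Page 1 does not exist yet, so this call fails with an out of range error
select * from page_header(get_raw_page('test1', 'main', 1));
-- The 227th row does not fit in page 0 and goes to a new page
insert into test1 values(300);
-- Page 1 now exists and the call succeeds
select * from page_header(get_raw_page('test1', 'main', 1));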
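Each row needs 36 bytes. After the 226th row, lower is 928 and upper is 960, so the free space left is 960-928=32 bytes, which cannot hold another row. The next insert therefore starts a new page, and querying the second page no longer raises the out of range error, which shows that the first page was full.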
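The observed behaviour can be reproduced with a toy model that appends rows one by one and opens a new page when the free space is too small. This is a simplified Python sketch, not how PostgreSQL chooses pages internally: `fill_pages` is a made-up helper, and the model ignores the free space map, fillfactor, and any rows other than fixed-width ones.

```python
PAGE_SIZE = 8192       # default block size (assumed)
PAGE_HEADER = 24       # size of the page header
LINE_POINTER = 4       # size of one line pointer
TUPLE_HEADER = 24      # padded tuple header size
MAXIMUM_ALIGNOF = 8    # assumed alignment on a typical 64-bit build


def maxalign(size):
    """Round size up to the next multiple of MAXIMUM_ALIGNOF."""
    return (size + MAXIMUM_ALIGNOF - 1) // MAXIMUM_ALIGNOF * MAXIMUM_ALIGNOF


def fill_pages(rows, data_len=4):
    """Return [lower, upper] for each page after appending `rows` rows."""
    tuple_size = maxalign(TUPLE_HEADER + data_len)
    needed = LINE_POINTER + tuple_size
    pages = []
    for _ in range(rows):
        # Start a new page when the last one cannot hold pointer + tuple.
        if not pages or pages[-1][1] - pages[-1][0] < needed:
            pages.append([PAGE_HEADER, PAGE_SIZE])
        pages[-1][0] += LINE_POINTER   # line pointer array grows forward
        pages[-1][1] -= tuple_size     # tuples grow backward from the end
    return pages


print(fill_pages(226))  # [[928, 960]]
print(fill_pages(227))  # [[928, 960], [28, 8160]]
```

The second call matches the experiment: page 0 stays at lower 928 and upper 960, and the extra row lands in page 1.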
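One last note: when I ran the same test with varchar, the numbers did not match what I expected. I am not sure whether this comes from the client/server encoding or from a gap somewhere else in my understanding. I will read the documentation and come back to fill this in; for now I am just recording it here.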