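This question comes from question 11 of 《PostgreSQL面试题集锦》学习与回答 (Hehuyi_In's blog on CSDN).
In the PostgreSQL tuple header there is a t_bits array that stores the null bitmap. When a tuple contains no NULL values, t_bits can be treated as empty. When a tuple has at least one NULL column, t_bits uses one bit per column to record whether that column is NULL.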
htup_details.h
struct HeapTupleHeaderData
{
…
ItemPointerData t_ctid; /* current TID of this or newer tuple (or a
* speculative insertion token) */
…
bits8 t_bits[FLEXIBLE_ARRAY_MEMBER]; /* bitmap of NULLs */
};
FLEXIBLE_ARRAY_MEMBER is defined as empty by default, which makes t_bits a C99 flexible array member:
#define FLEXIBLE_ARRAY_MEMBER /* empty */
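To illustrate why an empty array size works here, the following is a minimal sketch of a C99 flexible array member. DemoHeader and the demo_* functions are hypothetical stand-ins written for this post, not PostgreSQL code. The only thing taken from the text above is the idea of one bitmap bit per column, rounded up to whole bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for HeapTupleHeaderData. */
typedef struct DemoHeader
{
    uint16_t natts;     /* number of attributes (illustrative field) */
    uint16_t has_nulls; /* stand-in for a "tuple has NULLs" flag */
    uint8_t  bits[];    /* flexible array member: adds nothing to sizeof */
} DemoHeader;

/* Bytes needed for a bitmap with one bit per attribute. */
static size_t demo_bitmap_len(int natts)
{
    return (size_t) (natts + 7) / 8;
}

/* Space to allocate: the bitmap is only present when the row has NULLs. */
static size_t demo_alloc_size(int natts, int has_nulls)
{
    return sizeof(DemoHeader) + (has_nulls ? demo_bitmap_len(natts) : 0);
}

/* Allocate a zeroed header with room for the optional bitmap. */
static DemoHeader *demo_header_alloc(int natts, int has_nulls)
{
    DemoHeader *h = (DemoHeader *) calloc(1, demo_alloc_size(natts, has_nulls));

    if (h != NULL)
    {
        h->natts = (uint16_t) natts;
        h->has_nulls = (uint16_t) (has_nulls != 0);
    }
    return h;
}
```

With this layout a row without NULLs pays nothing for the bitmap, while a 12-column row with NULLs gets 2 extra bytes after the fixed fields.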
The code that checks whether a field is NULL lives in tupmacs.h; we will study it later in this post.
/*
* Check a tuple's null bitmap to determine whether the attribute is null.
* Note that a 0 in the null bitmap indicates a null, while 1 indicates
* non-null.
*/
#define att_isnull(ATT, BITS) (!((BITS)[(ATT) >> 3] & (1 << ((ATT) & 0x07))))
Create a test table
create table t(a int,b int,c int);
insert into t values(4,7,1);
insert into t values(6,NULL,3);
postgres=# select * from t;
 a | b | c
---+---+---
 4 | 7 | 1
 6 |   | 3
(2 rows)
The pageinspect extension lets us observe how NULL values are stored. The infomask helper function used below is defined in: pg事务篇(三)—— 事务状态与Hint Bits(t_infomask) (Hehuyi_In's blog on CSDN)
select lp,infomask(t_infomask, 1) as infomask,t_bits,t_data from heap_page_items(get_raw_page('t',0));
To see the t_bits array more clearly, let's use a table with more columns
create table t0(i1 int,i2 int,i3 int,i4 int,i5 int,i6 int, i7 int,i8 int,i9 int,i10 int,i11 int,i12 int);
insert into t0 values(1,2,3,4,5,6,7,8,9,10,NULL,12);
insert into t0 values(1,2,3,4,5,6,7,8,9,NULL,NULL,12);
insert into t0 values(1,2,3,4,5,6,7,8,NULL,NULL,NULL,12);
insert into t0 values(1,2,3,4,5,6,7,8,9,10,11,12);
select t_bits from heap_page_items(get_raw_page('t0', 0));
t_bits
------------------
1111111111010000
1111111110010000
1111111100010000

(4 rows)
The fourth row contains no NULLs, so its t_bits is empty and prints as a blank line.
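The strings above can be reproduced with a small sketch. It assumes that a set bit means "not NULL" (as the att_isnull comment states) and that heap_page_items prints each byte least-significant bit first. The demo_* functions are hypothetical helpers for illustration, not part of PostgreSQL or pageinspect.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill bits[] from isnull[]: a set bit means NOT NULL, a clear bit means NULL. */
static void demo_build_bitmap(const int *isnull, int natts, uint8_t *bits)
{
    memset(bits, 0, (size_t) (natts + 7) / 8);
    for (int i = 0; i < natts; i++)
    {
        if (!isnull[i])
            bits[i >> 3] |= (uint8_t) (1 << (i & 0x07));
    }
}

/* Render the bitmap least-significant bit first; out needs nbytes * 8 + 1 chars. */
static void demo_bits_to_text(const uint8_t *bits, int nbytes, char *out)
{
    for (int i = 0; i < nbytes * 8; i++)
        out[i] = (bits[i >> 3] & (1 << (i & 0x07))) ? '1' : '0';
    out[nbytes * 8] = '\0';
}
```

For the first row of t0 (only i11 is NULL), the 12 flags pack into the two bytes 0xFF and 0x0B, and rendering them gives the 16-character string shown in the query output. The last four characters are padding bits of the second byte.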
Now observe the effect of dropping one of the columns
alter table t0 drop column i1;
select t_bits from heap_page_items(get_raw_page('t0', 0));
The bitmaps of the existing rows do not change.
Next, insert a row in which every value is non-NULL
insert into t0 values(2,3,4,5,6,7,8,9,10,11,12);
select t_bits from heap_page_items(get_raw_page('t0', 0));
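As can be seen, a column that has been dropped from the table is treated as a NULL column. Existing rows are not rewritten, and every row inserted afterwards still reserves a bit for the dropped column, so even a row with no NULLs in its visible columns has to carry a null bitmap. On a table with many columns, that bitmap (one bit per column, rounded up to whole bytes) is added to every such row, which leads to storage bloat.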
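To put a rough number on that overhead, here is a back-of-the-envelope sketch. It assumes a 23-byte fixed tuple header (the offset of t_bits in HeapTupleHeaderData) and 8-byte MAXALIGN padding, which is typical for 64-bit builds. Both constants and the function name are assumptions for illustration, so verify them against your own build.

```c
#include <stddef.h>

#define DEMO_FIXED_HEADER 23 /* assumed offset of t_bits in the tuple header */
#define DEMO_MAXALIGN      8 /* assumed alignment on a 64-bit platform */

/* Header size for a row with natts columns, with or without a null bitmap. */
static size_t demo_tuple_header_size(int natts, int has_nulls)
{
    size_t len = DEMO_FIXED_HEADER;

    if (has_nulls)
        len += (size_t) (natts + 7) / 8;

    /* Round up to the next multiple of DEMO_MAXALIGN. */
    return (len + DEMO_MAXALIGN - 1) & ~((size_t) (DEMO_MAXALIGN - 1));
}
```

Under these assumptions a 12-column row without NULLs has a 24-byte header, while the same row with a bitmap needs 32 bytes. With 8 columns or fewer the single bitmap byte fits into the padding, so the header stays at 24 bytes.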
As mentioned earlier, the code that checks whether a field is NULL is as follows
/*
* Check a tuple's null bitmap to determine whether the attribute is null.
* Note that a 0 in the null bitmap indicates a null, while 1 indicates
* non-null.
*/
#define att_isnull(ATT, BITS) (!((BITS)[(ATT) >> 3] & (1 << ((ATT) & 0x07))))
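Take table t as an example. heap_page_items shows the t_bits of its second row as 10100000. Note that this text lists the bits least-significant first, i.e. reversed relative to the stored value, so in conventional binary notation the stored byte is 00000101 (0x05).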
postgres=# select * from t;
 a | b | c
---+---+---
 4 | 7 | 1
 6 |   | 3
Take the first column as an example. ATT is the column number, counted from 0.
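The check can be tried outside PostgreSQL with a small standalone sketch. The macro body is copied from the att_isnull definition quoted above. The demo_* names and the hard-coded byte 0x05 (the bitmap of the row 6, NULL, 3) are illustrative assumptions.

```c
#include <stdint.h>

/* Same expression as att_isnull in tupmacs.h, renamed for a standalone demo. */
#define demo_att_isnull(ATT, BITS) (!((BITS)[(ATT) >> 3] & (1 << ((ATT) & 0x07))))

/* Null bitmap of the row (6, NULL, 3): bits 0 and 2 are set, giving 0x05. */
static const uint8_t demo_row2_bits[1] = { 0x05 };

/* Returns 1 if the zero-based column att of that row is NULL, else 0. */
static int demo_column_is_null(int att)
{
    return demo_att_isnull(att, demo_row2_bits);
}
```

For ATT = 0 the mask is 1 and 0x05 & 1 is non-zero, so column a is not NULL. For ATT = 1 the mask is 2 and 0x05 & 2 is 0, so column b is NULL.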
insert into t0 values(1,2,3,4,5,6,7,8,9,10,NULL,12);
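Its t_bits is shown as 1111111111010000. Reversed, the value is 0000101111111111, i.e. the bytes 0xFF and 0x0B.
Take column 11 as an example: its ATT is 10, which is 00001010 in binary. ATT >> 3 is 1, so the second byte (0x0B) is checked, and ATT & 0x07 is 2, so the mask is 00000100. That bit is 0 in 0x0B, which means column 11 is NULL.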
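The individual steps of the macro for this row can be spelled out in a short sketch. DemoSteps and demo_att_isnull_steps are hypothetical names used only to expose the intermediate values. The logic follows the att_isnull expression quoted earlier.

```c
#include <stdint.h>

/* Intermediate values of the att_isnull computation. */
typedef struct DemoSteps
{
    int byte_index; /* ATT >> 3: which bitmap byte holds the bit */
    int bit_mask;   /* 1 << (ATT & 0x07): which bit inside that byte */
    int masked;     /* bitmap byte AND mask; zero means the bit is clear */
    int is_null;    /* a clear bit means the attribute is NULL */
} DemoSteps;

/* Break the null check for zero-based attribute att into its steps. */
static DemoSteps demo_att_isnull_steps(int att, const uint8_t *bits)
{
    DemoSteps s;

    s.byte_index = att >> 3;
    s.bit_mask = 1 << (att & 0x07);
    s.masked = bits[s.byte_index] & s.bit_mask;
    s.is_null = !s.masked;
    return s;
}
```

Called with ATT = 10 and the bytes { 0xFF, 0x0B }, this yields byte index 1, mask 4, a masked value of 0, and therefore is_null = 1. For ATT = 11 (column i12) the mask is 8, the masked value is non-zero, and the column is reported as not NULL.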
References
https://www.cnblogs.com/abclife/p/13855150.html
Postgres是如何管理空值的 (How Postgres manages NULL values)