PostgreSQL的pg_xlog文件数计算和在线清理

postgresql的pg_xlog是记录数据库事务信息用的,类似oracle的redo信息,也叫wal日志(write ahead log),就是在写数据到磁盘里成为固定数据之前,先写入到日志里,然后一定条件下触发调用fsync()将此数据刷到磁盘。好处不多说,下面主要看看postgresql通常会保留多少个pg_xlog文件,以及如何在线清理。

一、环境及参数
环境:
postgresql 9.2.4(单台,未做流复制)
vmware 6.0
centos 6.2

参数:
postgres=# show checkpoint_segments ;   --设置相对大一点,默认太小,防止频繁checkpoint
checkpoint_segments
---------------------
30
(1 row)

postgres=# show checkpoint_timeout ;
checkpoint_timeout
--------------------
5min
(1 row)

postgres=# show checkpoint_completion_target ;
checkpoint_completion_target
------------------------------
0.5
(1 row)

postgres=# show wal_keep_segments ;     --未开启流复制预留xlog参数
 wal_keep_segments 
-------------------
 0
(1 row)

postgres=# show archive_mode ;             --关闭归档
archive_mode 
--------------
off
(1 row)
理论上合理的pg_xlog一般在(2+checkpoint_completion_target)*checkpoint_segment+1左右浮动,但是高并发环境下如果checkpoint_segment设置比较大,且checkpoint较少被触发的时候,pg_xlog下会存有很多的事务日志,严重会耗掉磁盘存储,所以设置checkpoint_segment的大小要视磁盘容量和每个pg_xlog文件的大小来看,有些时候需要手工调整参数来收缩。

二、测试过程:

1.模拟pg_xlog文件增长
模拟大数据量载入,开启多个窗口插入数据,每个窗口插入1000W数据,可以看到pg_xlog增长很多,原来只有8个文件
postgres=# create table tbl_kenyon(id int,cname varchar(50),remark text);
CREATE TABLE

postgres=# insert into tbl_kenyon select generate_series(1,10000000),'kenyon good boy',md5('kenyon good boy');
INSERT 0 10000000

postgres=# insert into tbl_kenyon select generate_series(1,10000000),'kenyon good boy',md5('kenyon good boy');
INSERT 0 10000000

postgres=# insert into tbl_kenyon select generate_series(1,10000000),'kenyon good boy',md5('kenyon good boy');
INSERT 0 10000000

postgres=# insert into tbl_kenyon select generate_series(1,10000000),'kenyon good boy',md5('kenyon good boy');
INSERT 0 10000000

postgres=# insert into tbl_kenyon select generate_series(1,10000000),'kenyon good boy',md5('kenyon good boy');
INSERT 0 10000000

postgres=# insert into tbl_kenyon select generate_series(1,10000000),'kenyon good boy',md5('kenyon good boy');
INSERT 0 10000000
此时检查pg_xlog下面的文件数量
[postgres@localhost pg_xlog]$ ll
total 4063240
-rw-------  1 postgres postgres 67108864 Jul 14 23:05 00000001000000020000002F
-rw-------  1 postgres postgres 67108864 Jul 14 23:13 000000010000000200000030
-rw-------  1 postgres postgres 67108864 Jul 14 23:18 000000010000000200000031
-rw-------  1 postgres postgres 67108864 Jul 14 23:25 000000010000000200000032
-rw-------  1 postgres postgres 67108864 Jul 14 22:42 000000010000000200000033
-rw-------  1 postgres postgres 67108864 Jul 14 22:42 000000010000000200000034
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 000000010000000200000035
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 000000010000000200000036
-rw-------  1 postgres postgres 67108864 Jul 14 22:43 000000010000000200000037
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 000000010000000200000038
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 000000010000000200000039
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 00000001000000020000003A
-rw-------  1 postgres postgres 67108864 Jul 14 22:43 00000001000000020000003B
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 00000001000000020000003C
-rw-------  1 postgres postgres 67108864 Jul 14 22:43 00000001000000020000003D
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 00000001000000020000003E
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 000000010000000300000000
-rw-------  1 postgres postgres 67108864 Jul 14 22:43 000000010000000300000001
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 000000010000000300000002
-rw-------  1 postgres postgres 67108864 Jul 14 22:43 000000010000000300000003
-rw-------. 1 postgres postgres 67108864 Jul 14 22:44 000000010000000300000004
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 000000010000000300000005
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 000000010000000300000006
-rw-------  1 postgres postgres 67108864 Jul 14 22:43 000000010000000300000007
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 000000010000000300000008
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000009
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 00000001000000030000000A
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 00000001000000030000000B
-rw-------  1 postgres postgres 67108864 Jul 14 22:43 00000001000000030000000C
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 00000001000000030000000D
-rw-------  1 postgres postgres 67108864 Jul 14 22:43 00000001000000030000000E
-rw-------  1 postgres postgres 67108864 Jul 14 22:43 00000001000000030000000F
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 000000010000000300000010
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 000000010000000300000011
-rw-------  1 postgres postgres 67108864 Jul 14 22:44 000000010000000300000012
-rw-------  1 postgres postgres 67108864 Jul 14 22:43 000000010000000300000013
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000014
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000015
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000016
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000017
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000018
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000019
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 00000001000000030000001A
-rw-------  1 postgres postgres 67108864 Jul 14 22:46 00000001000000030000001B
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 00000001000000030000001C
-rw-------  1 postgres postgres 67108864 Jul 14 22:46 00000001000000030000001D
-rw-------  1 postgres postgres 67108864 Jul 14 22:46 00000001000000030000001E
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 00000001000000030000001F
-rw-------  1 postgres postgres 67108864 Jul 14 22:46 000000010000000300000020
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000021
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000022
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000023
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000024
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000025
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000026
-rw-------  1 postgres postgres 67108864 Jul 14 22:46 000000010000000300000027
-rw-------  1 postgres postgres 67108864 Jul 14 22:45 000000010000000300000028
-rw-------  1 postgres postgres 67108864 Jul 14 22:46 000000010000000300000029
-rw-------. 1 postgres postgres 67108864 Jul 14 22:46 00000001000000030000002A
-rw-------  1 postgres postgres 67108864 Jul 14 22:46 00000001000000030000002B
-rw-------  1 postgres postgres 67108864 Jul 14 22:46 00000001000000030000002C
-rw-------  1 postgres postgres 67108864 Jul 14 22:56 00000001000000030000002D
drwx------. 2 postgres postgres     4096 Jun 12 23:19 archive_status
[postgres@localhost pg_xlog]$ ll|wc -l
64
2.模拟pg_xlog文件清理
这类文件对DB是很重要的,不能手工直接清理。每个文件64MB,是源码编译时--with-wal-segsize时指定的,或者登陆查看
postgres=# show wal_segment_size ;  
wal_segment_size 
------------------
 64MB
(1 row)
要收缩这个事务日志空间,减少事务日志数,可以调整参数 checkpoint_segments,本次调为3
postgres=# show checkpoint_segments ;
 checkpoint_segments 
---------------------
 3
(1 row)


postgres=# checkpoint;
CHECKPOINT
 xlog文件有可能不是立即减少的,如果想模拟立即减少,可以模拟数据库频繁checkpoint,如重新做上述的大数据库加载或者更新操作,此时去看pg_xlog的文件数,能看到在减少了,开启两个窗口操作:
postgres=# insert into tbl_kenyon select generate_series(1,10000000),'kenyon good boy',md5('kenyon good boy');
INSERT 0 10000000

postgres=# insert into tbl_kenyon select generate_series(1,2000000),'kenyon good boy',md5('kenyon good boy');
INSERT 0 2000000

[postgres@localhost pg_xlog]$ ll|wc -l
62
[postgres@localhost pg_xlog]$ ll|wc -l
49
[postgres@localhost pg_xlog]$ ll|wc -l
46
[postgres@localhost pg_xlog]$ ll|wc -l
46
[postgres@localhost pg_xlog]$ ll|wc -l
43
[postgres@localhost pg_xlog]$ ll|wc -l
43
[postgres@localhost pg_xlog]$ ll|wc -l
43
[postgres@localhost pg_xlog]$ ll|wc -l
40
[postgres@localhost pg_xlog]$ ll|wc -l
40
[postgres@localhost pg_xlog]$ ll|wc -l
31
[postgres@localhost pg_xlog]$ ll|wc -l
25
后续发生的checkpoint,将会逐步收缩pg_xlog的文件数量,直到满足条件为止。之前写过另外一种较为激进的清理pg_xlog的方式,需要停机清理,参考 http://my.oschina.net/Kenyon/blog/101432

3.其他
如果是流复制,还有wal_keep_segments这个参数也要考虑一下。后续。

4.说明
checkpoint_segments (integer)
Maximum number of log file segments between automatic WAL checkpoints (each segment
is normally 16 megabytes). The default is three segments. Increasing this parameter can increase
the amount of time needed for crash recovery. This parameter can only be set in the
postgresql.conf file or on the server command line.
5.参考:
http://my.oschina.net/Kenyon/blog/101432
http://francs3.blog.163.com/blog/static/4057672720112182158386/

你可能感兴趣的:(PostgreSQL,pg_xlog)