PostgreSQL streaming replication WAL error

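PostgreSQL streaming replication works by shipping WAL records from the primary (master) to the standby (slave), which then replays them. This has a latent problem: if the primary is busy and recycles a WAL segment that the standby has not yet received, whether because of network trouble or some other cause, the standby falls out of sync with the primary and replication breaks. The primary's log then fills with streaming replication errors; from what I saw, the standby retries the connection roughly every 5 seconds.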

Version: PostgreSQL 9.3.2

The log on the primary looks like this: requested WAL segment 00000001000000010000005A has already been removed
2014-01-23 17:11:45.783 CST,,,30471,"",52e0dcd1.7707,1,"",2014-01-23 17:11:45 CST,,0,LOG,00000,"connection received: host=10.1.11.72 port=52077",,,,,,,,,""
2014-01-23 17:11:45.786 CST,"repuser","",30471,"10.1.11.72:52077",52e0dcd1.7707,2,"authentication",2014-01-23 17:11:45 CST,2/3265,0,LOG,00000,"replication connection authorized: user=repuser",,,,,,,,,""
2014-01-23 17:11:45.788 CST,"repuser","",30471,"10.1.11.72:52077",52e0dcd1.7707,3,"idle",2014-01-23 17:11:45 CST,2/0,0,ERROR,58P01,"requested WAL segment 00000001000000010000005A has already been removed",,,,,,,,,"walreceiver"
2014-01-23 17:11:45.789 CST,"repuser","",30471,"10.1.11.72:52077",52e0dcd1.7707,4,"idle",2014-01-23 17:11:45 CST,,0,LOG,00000,"disconnection: session time: 0:00:00.005 user=repuser database= host=10.1.11.72 port=52077",,,,,,,,,"walreceiver"
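To spot this condition automatically, the csvlog rows above can be scanned for the error message. The following is a minimal sketch, not an official tool: the function names are made up for illustration, the field positions follow the csvlog layout seen in the rows above, and the segment-number decoding assumes the default 16 MB segment size on 9.3 or later.

```python
import csv
import io
import re

# Pattern for the error message shown in the log above.
REMOVED_RE = re.compile(
    r"requested WAL segment ([0-9A-F]{24}) has already been removed"
)

# Assumption: default 16 MB segments, i.e. 256 segments per 4 GB "log" unit.
SEGMENTS_PER_XLOGID = 0x100000000 // (16 * 1024 * 1024)


def find_removed_segments(csvlog_text):
    """Return (client address, segment name) for each matching ERROR row."""
    hits = []
    for row in csv.reader(io.StringIO(csvlog_text)):
        # Field 5 is host:port, field 12 the severity, field 14 the message.
        if len(row) > 13 and row[11] == "ERROR":
            match = REMOVED_RE.search(row[13])
            if match:
                hits.append((row[4], match.group(1)))
    return hits


def decode_segment(name):
    """Split a 24-hex-digit segment file name into (timeline, segment number)."""
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    seg_id = int(name[16:24], 16)
    return timeline, log_id * SEGMENTS_PER_XLOGID + seg_id
```

For the segment in the log above, decode_segment("00000001000000010000005A") returns (1, 346): timeline 1, and segment number 1 * 256 + 0x5A.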
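Impact and resolution
This usually does not affect the primary, but it matters if queries are run on the standby, because the standby stops receiving new changes. When it happens, copy the missing WAL files from the archive into pg_xlog on the standby. If the primary has already recycled the WAL and WAL archiving was not set up, the only option is to rebuild the streaming replication standby from scratch.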
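The copy step can be sketched as follows. This is only an illustration under several assumptions: the archive is reachable from the standby as a plain directory, archived segments are stored uncompressed under their original names, and the script runs as a user allowed to write to pg_xlog. The function name and both directory arguments are hypothetical.

```python
import os
import re
import shutil

# Plain WAL segment files have 24-hex-digit names (no .backup/.history suffix).
SEGMENT_NAME_RE = re.compile(r"[0-9A-F]{24}")


def copy_archived_segments(archive_dir, pg_xlog_dir, first_missing):
    """Copy segments on the same timeline, from first_missing onward.

    Returns the names of the files that were actually copied.
    """
    timeline = first_missing[:8]
    copied = []
    for name in sorted(os.listdir(archive_dir)):
        if not SEGMENT_NAME_RE.fullmatch(name):
            continue  # skip backup labels, history files, anything else
        if not name.startswith(timeline) or name < first_missing:
            continue  # other timeline, or older than what the standby needs
        target = os.path.join(pg_xlog_dir, name)
        if not os.path.exists(target):  # never overwrite an existing segment
            shutil.copy2(os.path.join(archive_dir, name), target)
            copied.append(name)
    return copied
```

Here first_missing would be the segment named in the error message, for example 00000001000000010000005A. Fixed-width uppercase hex names sort in the same order as the segments themselves, which is why a plain string comparison is enough.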

Official description of the wal_keep_segments parameter
wal_keep_segments (integer)
Specifies the minimum number of past log file segments kept in the pg_xlog directory, in case a standby server needs to fetch them for streaming replication. Each segment is normally 16 megabytes. If a standby server connected to the sending server falls behind by more than wal_keep_segments segments, the sending server might remove a WAL segment still needed by the standby, in which case the replication connection will be terminated. Downstream connections will also eventually fail as a result. (However, the standby server can recover by fetching the segment from archive, if WAL archiving is in use.)

This sets only the minimum number of segments retained in pg_xlog; the system might need to retain more segments for WAL archival or to recover from a checkpoint. If wal_keep_segments is zero (the default), the system doesn't keep any extra segments for standby purposes, so the number of old WAL segments available to standby servers is a function of the location of the previous checkpoint and status of WAL archiving. This parameter can only be set in the postgresql.conf file or on the server command line.
Summary:
Increase wal_keep_segments, and enable WAL archiving where appropriate.
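How far to raise it depends on how much WAL the primary produces and how long the standby may be out of reach. A rough sizing sketch, assuming the default 16 MB segment size from the description above; the function names and the example rate are hypothetical, and the real WAL rate has to be measured on the system in question:

```python
import math

SEGMENT_MB = 16  # default WAL segment size, per the description quoted above


def suggest_wal_keep_segments(wal_mb_per_minute, outage_minutes):
    """Segments needed to cover outage_minutes of WAL at the given rate."""
    return math.ceil(wal_mb_per_minute * outage_minutes / SEGMENT_MB)


def retained_mb(wal_keep_segments):
    """Minimum disk space in MB that pg_xlog keeps for this setting."""
    return wal_keep_segments * SEGMENT_MB
```

For example, a primary writing about 32 MB of WAL per minute that should tolerate a 60-minute standby outage needs suggest_wal_keep_segments(32, 60) = 120 segments, which keeps at least 1920 MB in pg_xlog.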
