PostgreSQL Primary/Standby Switchover Test

Overview

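With PostgreSQL hot standby, when the primary fails the standby has to be activated (promoted) so that it can take over. There are two ways to do this:

  • The standby's recovery.conf (PostgreSQL 11 and earlier) has a trigger_file setting that names a trigger file. As soon as that file exists, the standby is promoted.
  • Run pg_ctl promote on the standby.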
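
The walkthrough below uses pg_ctl promote, so here is a minimal sketch of the trigger-file method for comparison. It assumes PostgreSQL 11 or earlier; the function name write_recovery_conf, the connection string and the trigger path are illustrative, not taken from the original setup.

```shell
#!/bin/sh
# Sketch: write a standby recovery.conf that names a trigger file.
# Arguments: data directory, primary_conninfo value, trigger file path.
# All three values are supplied by the caller (assumptions, not defaults).
write_recovery_conf() {
    datadir=$1
    conninfo=$2
    trigger=$3
    cat > "$datadir/recovery.conf" <<EOF
standby_mode = 'on'
primary_conninfo = '$conninfo'
trigger_file = '$trigger'
EOF
}

# Promotion then amounts to creating the named file on the standby host:
#   touch /tmp/pg_promote.trigger
```

The standby checks for the file while it is in recovery; once it has promoted, it renames recovery.conf to recovery.done.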

Demonstration

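This walkthrough simulates an unexpected shutdown of the primary, promotes the standby to primary, and then, once the original primary has been repaired, brings it back as the new standby.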

Environment

Hostname  IP address      Role     Data directory
master    192.168.20.133  primary  /var/lib/pgsql/11/data
slave     192.168.20.134  standby  /var/lib/pgsql/11/data

Check the current state

Primary

lei=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 3274
usesysid         | 16774
usename          | repuser
application_name | walreceiver
client_addr      | 192.168.20.134
client_hostname  | slave
client_port      | 49896
backend_start    | 2019-05-30 02:40:58.253032-04
backend_xmin     |
state            | streaming
sent_lsn         | 0/180003C8
write_lsn        | 0/180003C8
flush_lsn        | 0/180003C8
replay_lsn       | 0/180003C8
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async
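
Before switching, it helps to confirm that replay_lsn has caught up with sent_lsn. As a rough sketch, the two helpers below (hypothetical names lsn_to_bytes and replay_gap) turn the LSN strings from pg_stat_replication into byte positions and subtract them.

```shell
#!/bin/sh
# Convert a PostgreSQL LSN such as 0/180003C8 into a byte position.
# An LSN is two hexadecimal numbers: the high 32 bits, then the low 32 bits.
lsn_to_bytes() {
    hi=${1%/*}
    lo=${1#*/}
    echo $(( (0x$hi << 32) + 0x$lo ))
}

# Bytes the standby still has to replay: sent_lsn minus replay_lsn.
replay_gap() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# sent_lsn and replay_lsn from the pg_stat_replication output
replay_gap 0/180003C8 0/180003C8   # prints 0, the standby is caught up
```

On the server side, pg_wal_lsn_diff(sent_lsn, replay_lsn) returns the same difference without leaving psql.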

Shut down the primary

[root@master data]# systemctl stop postgresql-11

Promote the standby

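Promote the standby so that it runs as the new primary; table test in database lei will then be dropped and table tt created on it.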

[postgres@slave ~]$ pg_ctl -D /var/lib/pgsql/11/data/ promote
waiting for server to promote.... done
server promoted
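
One way to double-check the promotion is to ask the server whether it is still in recovery. The sketch below wraps that query in a hypothetical helper, server_role; it assumes psql can connect locally as the current OS user.

```shell
#!/bin/sh
# Report whether the local server is a primary or a standby.
# pg_is_in_recovery() returns 't' on a standby and 'f' on a primary.
# Argument: database name to connect to (defaults to postgres).
server_role() {
    state=$(psql -At -d "${1:-postgres}" -c 'select pg_is_in_recovery()') || return 1
    case $state in
        f) echo primary ;;
        t) echo standby ;;
        *) return 1 ;;
    esac
}
```

Running server_role lei on slave after the promotion should report primary.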

On the new primary, drop table test and create table tt

[postgres@slave ~]$ psql lei;
psql (11.3)
Type "help" for help.

lei=# \dt
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | lei  | table | postgres
 public | t    | table | postgres
 public | test | table | postgres
(3 rows)

lei=# drop table test;
DROP TABLE
lei=# create table tt(id int);
CREATE TABLE

Manually force a few WAL segment switches

lei=# select pg_switch_wal();
 pg_switch_wal
---------------
 0/19019058
(1 row)

lei=# select pg_switch_wal();
 pg_switch_wal
---------------
 0/1A000078
(1 row)

lei=# select pg_switch_wal();
 pg_switch_wal
---------------
 0/1B000000
(1 row)

Recover the original primary

Use pg_rewind to resynchronize the original primary, which becomes the new standby, from the new primary

[postgres@master ~]$ pg_rewind --target-pgdata /var/lib/pgsql/11/data/ --source-server='host=slave port=5432 user=postgres dbname=postgres' -P
connected to server
servers diverged at WAL location 0/19000098 on timeline 3
rewinding from last common checkpoint at 0/19000028 on timeline 3
reading source file list
reading target file list
reading WAL in target
need to copy 133 MB (total source directory size is 165 MB)
136230/136230 kB (100%) copied
creating backup label and updating control file
syncing target data directory
Done!
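
pg_rewind only works when the target was shut down cleanly and has either wal_log_hints = on or data checksums enabled. A minimal sketch of a pre-flight check follows; the helper names are hypothetical, and the field labels are those printed by pg_controldata in PostgreSQL 11.

```shell
#!/bin/sh
# Sketch: check two pg_rewind preconditions on the target data directory
# by reading pg_controldata output (labels assumed as in PostgreSQL 11).

# Print the value of one "Label:   value" line from pg_controldata output.
controldata_field() {
    printf '%s\n' "$1" | sed -n "s/^$2: *//p"
}

# Argument: target data directory. Returns 0 when both checks pass.
rewind_ready() {
    info=$(pg_controldata "$1") || return 1

    # The target must have been shut down cleanly.
    case $(controldata_field "$info" 'Database cluster state') in
        "shut down"|"shut down in recovery") ;;
        *) echo "target was not shut down cleanly" >&2; return 1 ;;
    esac

    # The target needs wal_log_hints = on or data checksums.
    hints=$(controldata_field "$info" 'wal_log_hints setting')
    checksums=$(controldata_field "$info" 'Data page checksum version')
    if [ "$hints" = on ]; then
        return 0
    fi
    if [ -n "$checksums" ] && [ "$checksums" != 0 ]; then
        return 0
    fi
    echo "enable wal_log_hints or data checksums first" >&2
    return 1
}
```

If rewind_ready /var/lib/pgsql/11/data succeeds, running pg_rewind with --dry-run first shows what would be copied without modifying the target.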

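Edit recovery.conf
pg_rewind copies the configuration over from the new primary, including the recovery.done file that the promoted standby left behind. Rename it back to recovery.conf and change primary_conninfo so that it points at the new primary (slave). The user must be a role with replication rights on the new primary; note that pg_stat_replication in this walkthrough reports repuser.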

[postgres@master ~]$ mv /var/lib/pgsql/11/data/recovery.done /var/lib/pgsql/11/data/recovery.conf
[postgres@master ~]$ vi /var/lib/pgsql/11/data/recovery.conf
primary_conninfo = 'host=slave port=5432 user=replica password=replica'
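
If the edit needs to be scripted instead of done in vi, something along these lines could work. The helper name set_primary_conninfo is hypothetical; it replaces any existing primary_conninfo line and leaves the other settings untouched.

```shell
#!/bin/sh
# Sketch: set primary_conninfo in a recovery.conf without an editor.
# Arguments: path to recovery.conf, new connection string.
set_primary_conninfo() {
    conf=$1
    conninfo=$2
    tmp=$(mktemp) || return 1
    # Keep every line except an existing primary_conninfo setting.
    grep -v '^[[:space:]]*primary_conninfo' "$conf" > "$tmp" || true
    # Append the new setting.
    printf "primary_conninfo = '%s'\n" "$conninfo" >> "$tmp"
    # Write back in place so ownership and permissions are preserved.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

For this walkthrough the call would be set_primary_conninfo /var/lib/pgsql/11/data/recovery.conf 'host=slave port=5432 user=replica password=replica'.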

Start the new standby

[root@master data]# systemctl start postgresql-11

Check that the data has been synchronized
Table test is gone and table tt is now present:

postgres=# \c lei;
You are now connected to database "lei" as user "postgres".
lei=# \dt
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | lei  | table | postgres
 public | t    | table | postgres
 public | tt   | table | postgres
(3 rows)

Check replication status on the new primary (slave)

lei=# \x
Expanded display is on.
lei=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 8625
usesysid         | 16774
usename          | repuser
application_name | walreceiver
client_addr      | 192.168.20.133
client_hostname  | master
client_port      | 55306
backend_start    | 2019-05-30 03:26:14.645623-04
backend_xmin     |
state            | streaming
sent_lsn         | 0/1E0000D0
write_lsn        | 0/1E0000D0
flush_lsn        | 0/1E0000D0
replay_lsn       | 0/1E0000D0
write_lag        | 00:00:00.001552
flush_lag        | 00:00:00.002167
replay_lag       | 00:00:00.002169
sync_priority    | 0
sync_state       | async

If anything goes wrong, check the database log to locate the problem. The cause is usually in one of these configuration files:

  • pg_hba.conf
  • postgresql.conf
  • recovery.conf

The PostgreSQL primary/standby switchover is now complete.
