PostgreSQL high availability with repmgr, part 4: switchover with 1 primary + 1 standby

OS: Ubuntu 16.04
PostgreSQL: 9.6.8
repmgr: 4.1.1

192.168.56.101 node1
192.168.56.102 node2

Set up 1 primary + 1 standby as described in the previous post.

Edit /etc/sudoers so that the postgres user can run pg_ctlcluster without a password:

# vi /etc/sudoers

postgres ALL = NOPASSWD: /usr/bin/pg_ctlcluster

Or, more permissively (passwordless sudo for every command):

# vi /etc/sudoers

postgres ALL=(ALL:ALL) NOPASSWD:ALL
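
Before going further it can be worth confirming that the rule actually took effect. The sketch below is not from the original setup; can_sudo_without_password is a hypothetical helper name, and the SUDO variable exists only so the command can be swapped out. It relies on sudo's -n (never prompt) and -l (list/check a command) options.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the current user may run the given
# command through sudo without being asked for a password.
SUDO=${SUDO:-sudo}

can_sudo_without_password() {
    # -n: fail instead of prompting; -l CMD: succeed only if CMD is permitted.
    $SUDO -n -l "$1" >/dev/null 2>&1
}
```

Run as the postgres user, for example: can_sudo_without_password /usr/bin/pg_ctlcluster && echo "sudo rule is active"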

service_*_command

Set these on both node1 and node2. The repmgr documentation has this note for Debian/Ubuntu users: instead of calling sudo systemctl directly, use sudo pg_ctlcluster.
https://repmgr.org/docs/4.1/configuration-file-service-commands.html

# service_start_command
service_start_command   = 'sudo pg_ctlcluster 9.6 main start'
service_stop_command    = 'sudo pg_ctlcluster 9.6 main stop'
service_restart_command = 'sudo pg_ctlcluster 9.6 main restart'
service_reload_command  = 'sudo pg_ctlcluster 9.6 main reload'
service_promote_command = 'sudo pg_ctlcluster 9.6 main promote'

The service_*_command settings above go in /etc/repmgr.conf.
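
Since a missing entry only shows up later, when repmgr tries to stop or promote a server, a quick check of the file may save some time. This is a minimal sketch; missing_service_commands is a hypothetical helper, not something repmgr ships.

```shell
#!/bin/sh
# Hypothetical helper: print every service_*_command setting that is NOT
# defined in the given repmgr configuration file.
missing_service_commands() {
    conf_file=$1
    for action in start stop restart reload promote; do
        # Look for "service_<action>_command", optional spaces, then "=".
        if ! grep -Eq "^[[:space:]]*service_${action}_command[[:space:]]*=" "$conf_file"; then
            echo "service_${action}_command"
        fi
    done
}
```

Calling missing_service_commands /etc/repmgr.conf on each node should print nothing when all five settings are present.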

replication slots

Set this on both node1 and node2:

$ vi /etc/repmgr.conf
use_replication_slots=true

Check on node1:

$ psql -U repmgr
repmgr=# SELECT node_id, upstream_node_id, active, node_name, type, priority, slot_name
           FROM repmgr.nodes ORDER BY node_id;
 node_id | upstream_node_id | active | node_name |  type   | priority |   slot_name   
---------+------------------+--------+-----------+---------+----------+---------------
       1 |                  | t      | node1     | primary |      100 | repmgr_slot_1
       2 |                1 | t      | node2     | standby |      100 | repmgr_slot_2
(2 rows)
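
The repmgr.nodes table only records the slot names. Whether a slot physically exists and is in use can be read from PostgreSQL's pg_replication_slots view. The following is a sketch under assumptions: inactive_slots is a hypothetical helper that expects rows in the unaligned "slot_name|active" form that psql -At produces.

```shell
#!/bin/sh
# Hypothetical helper: read "slot_name|active" rows on stdin and print the
# names of slots whose active flag is not "t".
inactive_slots() {
    awk -F'|' 'NF == 2 && $2 != "t" { print $1 }'
}

# Example invocation on the primary (needs a running server, so left commented):
#   psql -U repmgr -d repmgr -Atc \
#     "SELECT slot_name, active FROM pg_replication_slots" | inactive_slots
```

An empty result means every slot on that node currently has a standby attached.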
		   

authorized_keys

Performing a switchover with repmgr

On node2, check the current cluster state:

$ repmgr -f /etc/repmgr.conf cluster show

 ID | Name  | Role    | Status    | Upstream | Location | Connection string                                              
----+-------+---------+-----------+----------+----------+-----------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | host=192.168.56.101 user=repmgr dbname=repmgr connect_timeout=2
 2  | node2 | standby |   running | node1    | default  | host=192.168.56.102 user=repmgr dbname=repmgr connect_timeout=2

Still on node2, try a dry run of the switchover:

$ repmgr -f /etc/repmgr.conf standby switchover --siblings-follow --dry-run --force-rewind

NOTICE: checking switchover on node "node2" (ID: 2) in --dry-run mode
INFO: prerequisites for using pg_rewind are met
WARNING: unable to connect to remote host "192.168.56.101" via SSH
ERROR: unable to connect via SSH to host "192.168.56.101", user ""

The error is about SSH. Does a switchover really need SSH? Try again without the --siblings-follow option.

$ repmgr -f /etc/repmgr.conf standby switchover --dry-run --force-rewind

NOTICE: checking switchover on node "node2" (ID: 2) in --dry-run mode
INFO: prerequisites for using pg_rewind are met
WARNING: unable to connect to remote host "192.168.56.101" via SSH
ERROR: unable to connect via SSH to host "192.168.56.101", user ""

Same error. So what does --siblings-follow actually do? Check the help output:

$ repmgr -f /etc/repmgr.conf standby switchover --help

STANDBY SWITCHOVER

  "standby switchover" promotes a standby node to primary, and demotes the previous primary to a standby.

  --always-promote                    promote standby even if behind original primary
  --dry-run                           perform checks etc. but don't actually execute switchover
  -F, --force                         ignore warnings and continue anyway
  --force-rewind[=VALUE]              use "pg_rewind" to reintegrate the old primary if necessary
                                        (9.3 and 9.4 - provide "pg_rewind" path)
  -R, --remote-user=USERNAME          database server username for SSH operations (default: "postgres")
  --repmgrd-no-pause                  don't pause repmgrd
  --siblings-follow                   have other standbys follow new primary
  

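So SSH is required regardless of --siblings-follow: the switchover has to shut down the current primary remotely, and repmgr does that over SSH. Set up passwordless SSH login first; another post covers the steps:
https://blog.csdn.net/ctypyb2002/article/details/80572181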
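
For reference, a rough outline of the SSH setup plus a check that it works without a prompt. This is a sketch, not the procedure from the linked post: it assumes the postgres OS user on both nodes and that ssh-copy-id can authenticate once by password; check_passwordless_ssh is a hypothetical helper and the SSH variable exists only so the command can be swapped out.

```shell
#!/bin/sh
# One-time setup, run as postgres on each node, pointing at the other node
# (left commented because it changes ~/.ssh):
#   ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
#   ssh-copy-id postgres@192.168.56.101

SSH=${SSH:-ssh}

# Hypothetical helper: succeeds only if a non-interactive login works.
check_passwordless_ssh() {
    # BatchMode=yes makes ssh fail instead of asking for a password.
    $SSH -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}
```

On node2, check_passwordless_ssh 192.168.56.101 && echo "SSH OK" should print the message without any prompt; repeat in the other direction from node1.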

With passwordless SSH in place, the dry run on node2 completes cleanly:

$ repmgr -f /etc/repmgr.conf standby switchover --siblings-follow --dry-run --force-rewind
NOTICE: checking switchover on node "node2" (ID: 2) in --dry-run mode
INFO: prerequisites for using pg_rewind are met
INFO: SSH connection to host "192.168.56.101" succeeded
INFO: able to execute "repmgr" on remote host "localhost"
INFO: 1 walsenders required, 10 available
INFO: demotion candidate is able to make replication connection to promotion candidate
INFO: 0 pending archive files
INFO: replication lag on this standby is 0 seconds
NOTICE: local node "node2" (ID: 2) would be promoted to primary; current primary "node1" (ID: 1) would be demoted to standby
INFO: following shutdown command would be run on node "node1":
  "sudo pg_ctlcluster 9.6 main stop"
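
The dry run and the real run can be chained so that the switchover is only attempted when the checks pass. A minimal sketch, assuming repmgr exits with a non-zero status whenever the dry run reports an ERROR; safe_switchover is a hypothetical wrapper and REPMGR is only there so the binary can be overridden.

```shell
#!/bin/sh
REPMGR=${REPMGR:-repmgr}

# Hypothetical wrapper: run the dry run first and execute the real
# switchover only if the dry run exits with status 0.
safe_switchover() {
    conf=${1:-/etc/repmgr.conf}
    if $REPMGR -f "$conf" standby switchover --siblings-follow --force-rewind --dry-run; then
        $REPMGR -f "$conf" standby switchover --siblings-follow --force-rewind
    else
        echo "dry run failed; switchover not attempted" >&2
        return 1
    fi
}
```

It would be called on the standby that is to be promoted, e.g. safe_switchover /etc/repmgr.conf on node2.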
  

Now run the real switchover on node2:

$ repmgr -f /etc/repmgr.conf standby switchover --siblings-follow --force-rewind

NOTICE: executing switchover on node "node2" (ID: 2)
NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby
NOTICE: stopping current primary node "node1" (ID: 1)
NOTICE: issuing CHECKPOINT
DETAIL: executing server command "sudo pg_ctlcluster 9.6 main stop"
INFO: checking primary status; 1 of 6 attempts
NOTICE: current primary has been cleanly shut down at location 0/C000028
NOTICE: promoting standby to primary
DETAIL: promoting server "node2" (ID: 2) using "sudo pg_ctlcluster 9.6 main promote"
DETAIL: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node2" (ID: 2) was successfully promoted to primary
NOTICE: issuing CHECKPOINT
NOTICE: setting node 1's slot name to "repmgr_slot_1"
NOTICE: setting node 1's primary to node 2
NOTICE: starting server using "sudo pg_ctlcluster 9.6 main start"
NOTICE: replication slot "repmgr_slot_2" deleted on node 1
NOTICE: NODE REJOIN successful
DETAIL: node 1 is now attached to node 2
NOTICE: switchover was successful
DETAIL: node "node2" is now primary and node "node1" is attached as standby
NOTICE: STANDBY SWITCHOVER has completed successfully

$ repmgr -f /etc/repmgr.conf cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Connection string                                              
----+-------+---------+-----------+----------+----------+-----------------------------------------------------------------
 1  | node1 | standby |   running | node2    | default  | host=192.168.56.101 user=repmgr dbname=repmgr connect_timeout=2
 2  | node2 | primary | * running |          | default  | host=192.168.56.102 user=repmgr dbname=repmgr connect_timeout=2
 

The roles have been swapped, as expected.
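
If the result needs to be checked from a script rather than by eye, the cluster show table can be parsed. This is only a sketch that assumes the column layout shown above (ID, Name, Role, Status, ...); current_primary is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: read "repmgr cluster show" output on stdin and print
# the name of the node listed as a running primary.
current_primary() {
    # Columns are "|"-separated: $2 = Name, $3 = Role, $4 = Status.
    awk -F'|' '$3 ~ /primary/ && $4 ~ /\* running/ { gsub(/ /, "", $2); print $2 }'
}

# Example (needs a running cluster, so left commented):
#   repmgr -f /etc/repmgr.conf cluster show | current_primary
```

Fed the table above, it prints node2.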

References:
https://repmgr.org/docs/4.1/repmgr-administration-manual.html
https://blog.csdn.net/ctypyb2002/article/details/80572181
