PostgreSQL 11.5 streaming replication and keepalived failover on CentOS 7.6

Architecture:
Server A: 192.168.173.145 postgresql primary keepalived primary
Server B: 192.168.173.122 postgresql standby keepalived slave
VIP: 192.168.173.188

Note: unless stated otherwise, every step must be performed on both machines.

1. Install PostgreSQL 11.5

Configure kernel parameters and resource limits

cat /usr/lib/sysctl.d/00-system.conf
Add the following kernel parameters
kernel.shmmni = 4096
kernel.sem = 50100 64128000 50100 1280
fs.file-max = 7672460
net.core.rmem_default = 1048576
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
net.core.netdev_max_backlog = 10000
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_keepalive_time=72
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_intvl = 7
vm.overcommit_memory = 0
vm.swappiness=0
vm.dirty_background_bytes=102400000
vm.dirty_bytes=102400000
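
The list above mixes "key = value" and "key=value" spellings, so it can be worth confirming what the file actually assigns before reloading it. The following is a minimal sketch and not part of the original setup: conf_value is a hypothetical helper that prints the value a sysctl-style file gives to one key.

```shell
# conf_value FILE KEY
# Print the value assigned to KEY in a sysctl-style file (last assignment wins).
# Lines starting with '#' are treated as comments and skipped.
conf_value() {
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }
        {
            k = $1
            gsub(/[ \t]/, "", k)              # key without surrounding blanks
            if (k == key) {
                v = $0
                sub(/^[^=]*=[ \t]*/, "", v)   # drop "key =" prefix
                sub(/[ \t]+$/, "", v)         # drop trailing blanks
                val = v
                found = 1
            }
        }
        END { if (found) print val }
    ' "$1"
}
```

After editing the file, running sysctl --system as root reloads the configuration, and sysctl -n kernel.shmmni prints the running value, which can be compared with conf_value /usr/lib/sysctl.d/00-system.conf kernel.shmmni.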

Add the following entries to the system resource limits file
/etc/security/limits.conf
* soft    nofile  131072
* hard    nofile  131072
* soft    nproc   131072
* hard    nproc   131072
* soft    core    unlimited
* hard    core    unlimited
* soft    memlock 50000000
* hard    memlock 50000000

Configure environment variables

--Create the system user
useradd postgres

Add environment variables for the postgres user
export PS1="$USER@`/bin/hostname -s`-> "
export PGUSER=postgres
export PGPORT=1921
export PGDATA=/opt/pgdata/pg_root
export LANG=en_US.utf8
export PGHOME=/opt/pgsql11.5
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:
export DATE=`date +"%Y%m%d%H%M"`
export PATH=$PGHOME/bin:$PATH:.
export MANPATH=$PGHOME/share/man:$MANPATH
alias rm='rm -i'
alias ll='ls -lh'

Download the source package
wget https://ftp.postgresql.org/pub/source/v11.5/postgresql-11.5.tar.gz
Install dependencies, then compile and install the database

--Install dependencies
yum -y install coreutils glib2 lrzsz mpstat dstat sysstat e4fsprogs xfsprogs ntp readline-devel zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel python-devel tcl-devel gcc make smartmontools flex bison perl-devel perl-ExtUtils* openldap-devel jadetex  openjade bzip2 nc mutt


--Stop the firewall
systemctl stop  firewalld.service

--Unpack the source, create the directories, and give the postgres user ownership
tar -zxvf postgresql-11.5.tar.gz
mkdir -p /opt/pgsql11.5
mkdir -p /opt/pgdata/pg_root
chown -R postgres:postgres /opt/pgsql11.5 /opt/pgdata/pg_root

--Compile and install
cd postgresql-11.5/
./configure --prefix=/opt/pgsql11.5 --with-pgport=1921 --with-segsize=8 --with-perl --with-python --with-openssl --with-pam --with-ldap --with-libxml --with-libxslt --enable-thread-safety
gmake world && gmake install-world

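Connect to the database and create the streaming replication user (on the primary only). The .pgpass file, however, must exist on both machines.

As the postgres user, initialize the database cluster and start the server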
initdb -D /opt/pgdata/pg_root
pg_ctl start 

--Create the streaming replication user
CREATE USER repuser
  REPLICATION 
  LOGIN
  CONNECTION LIMIT 5
  ENCRYPTED PASSWORD 'LKJKisdh767GHGHJshhsdh';
 
--Add .pgpass for password-free login
cat ~/.pgpass
192.168.173.122:1921:replication:repuser:LKJKisdh767GHGHJshhsdh
192.168.173.145:1921:replication:repuser:LKJKisdh767GHGHJshhsdh
192.168.173.188:1921:replication:repuser:LKJKisdh767GHGHJshhsdh
chmod 0600 ~/.pgpass
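
Each .pgpass entry has the form host:port:database:username:password, and a colon or backslash inside a field has to be escaped with a backslash. The sketch below shows one way to generate such lines safely. pgpass_escape and pgpass_line are hypothetical helper names introduced here for illustration; they are not part of PostgreSQL.

```shell
# pgpass_escape VALUE
# Escape backslashes and colons, as .pgpass requires inside a field.
pgpass_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/:/\\:/g'
}

# pgpass_line HOST PORT DATABASE USER PASSWORD
# Print one .pgpass entry; only the password is escaped in this sketch.
pgpass_line() {
    printf '%s:%s:%s:%s:%s\n' "$1" "$2" "$3" "$4" "$(pgpass_escape "$5")"
}
```

For example, pgpass_line 192.168.173.188 1921 replication repuser 'the-password' >> ~/.pgpass appends an entry for the VIP; the chmod 0600 step is still required, because libpq ignores a .pgpass file with looser permissions.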

--sky_pg_cluster database setup
--Initial data deployment
create role sky_pg_cluster superuser nocreatedb nocreaterole noinherit login encrypted password 'shakjhsaduy2uieyuJKHKJsd3';
create database sky_pg_cluster with template template1 encoding 'UTF8' owner sky_pg_cluster;
\c sky_pg_cluster  sky_pg_cluster
create schema sky_pg_cluster ;
create table cluster_status (id int unique default 1, last_alive timestamp(0) without time zone);

--Restrict the cluster_status table to exactly one row:
CREATE FUNCTION cannt_delete ()
RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
RAISE EXCEPTION 'You can not delete!';
END; $$;
--Create the triggers
CREATE TRIGGER cannt_delete BEFORE DELETE ON cluster_status FOR EACH ROW EXECUTE PROCEDURE cannt_delete();
CREATE TRIGGER cannt_truncate BEFORE TRUNCATE ON cluster_status FOR STATEMENT EXECUTE PROCEDURE cannt_delete();
-- Insert the initial row
insert into cluster_status values (1, now());

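Database configuration parameters in postgresql.conf are as follows (adjust them to your needs). pg_hba.conf must also allow replication connections for repuser from both nodes.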

listen_addresses = '*'          # what IP address(es) to listen on;
port = 1921                             # (change requires restart)
max_connections = 1000                  # (change requires restart)
superuser_reserved_connections = 13     # (change requires restart)
tcp_keepalives_idle = 60                # TCP_KEEPIDLE, in seconds;
tcp_keepalives_interval = 20            # TCP_KEEPINTVL, in seconds;
shared_buffers = 16384MB                        # min 128kB
work_mem = 4MB                          # min 64kB
maintenance_work_mem = 1024MB           # min 1MB
dynamic_shared_memory_type = posix      # the default is the first option
wal_level = replica                     # minimal, replica, or logical
synchronous_commit = off                # synchronization level;
full_page_writes = on                   # recover from partial page writes
wal_log_hints = on                      # also do full page writes of non-critical updates
wal_writer_delay = 20ms         # 1-10000 milliseconds
max_wal_size = 8GB
min_wal_size = 128MB
archive_mode = on               # enables archiving; off, on, or always
archive_command = '/usr/bin/date'               # command to use to archive a logfile segment
wal_keep_segments = 256         # in logfile segments; 0 disables
hot_standby = on                        # "off" disallows queries during recovery
effective_cache_size = 60GB
log_destination = 'csvlog'              # Valid values are combinations of
logging_collector = on          # Enable capturing of stderr and csvlog
log_directory = 'pg_log'                        # directory where log files are written,
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
log_file_mode = 0600                    # creation mode for log files,
log_rotation_age = 1d                   # Automatic rotation of logfiles will
log_rotation_size = 10MB                # Automatic rotation of logfiles will
log_checkpoints = on
log_connections = on
log_statement = 'ddl'                   # none, ddl, mod, all
log_timezone = 'Asia/Shanghai'
track_activities = on
track_counts = on
track_functions = none                  # none, pl, all
autovacuum = on                 # Enable autovacuum subprocess?  'on'
autovacuum_max_workers = 4              # max number of autovacuum subprocesses
autovacuum_naptime = 1min               # time between autovacuum runs
autovacuum_vacuum_threshold = 50        # min number of row updates before
autovacuum_analyze_threshold = 50       # min number of row updates before
autovacuum_vacuum_scale_factor = 0.2    # fraction of table size before vacuum
autovacuum_analyze_scale_factor = 0.1   # fraction of table size before analyze
autovacuum_freeze_max_age = 1500000000  # maximum XID age before forced vacuum
datestyle = 'iso, mdy'
timezone = 'Asia/Shanghai'
lc_messages = 'en_US.utf8'                      # locale for system error message
lc_monetary = 'en_US.utf8'                      # locale for monetary formatting
lc_numeric = 'en_US.utf8'                       # locale for number formatting
lc_time = 'en_US.utf8'                          # locale for time formatting
default_text_search_config = 'pg_catalog.english'
shared_preload_libraries = ''   # (change requires restart)

2. Set up streaming replication (run as the postgres user on the standby node)
pg_basebackup -D $PGDATA -Fp -Xs -v -P -h 192.168.173.145 -p 1921 -U repuser
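cd $PGDATA
# If the backup contains a recovery.done (the source was promoted earlier), rename it;
# otherwise create recovery.conf by hand with the contents shown below.
[ -f recovery.done ] && mv recovery.done recovery.conf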

--Contents of recovery.conf
cat $PGDATA/recovery.conf 
standby_mode = 'on'
recovery_target_timeline='latest'
primary_conninfo = 'host=192.168.173.188 port=1921 user=repuser keepalives_idle=60'
#restore_command = 'cp /path/to/archive/%f %p'
#archive_cleanup_command = 'pg_archivecleanup /path/to/archive %r'
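
Because recovery.conf has to be identical in structure on whichever node is currently the standby, it can help to generate it rather than edit it by hand. The following is a minimal sketch under that assumption; write_recovery_conf is a hypothetical helper, and the three settings it writes are exactly the uncommented ones shown above.

```shell
# write_recovery_conf FILE PRIMARY_HOST PORT REPL_USER
# Write a PostgreSQL 11 style recovery.conf for a streaming standby.
# PRIMARY_HOST is the address the standby connects to (the VIP in this article).
write_recovery_conf() {
    cat > "$1" <<EOF
standby_mode = 'on'
recovery_target_timeline = 'latest'
primary_conninfo = 'host=$2 port=$3 user=$4 keepalives_idle=60'
EOF
}
```

A call such as write_recovery_conf "$PGDATA/recovery.conf" 192.168.173.188 1921 repuser reproduces the file above without the commented-out archive settings. The password is not written here because it is read from .pgpass.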

3. Deploy keepalived for automatic failover

Download keepalived, then compile and install it

wget https://www.keepalived.org/software/keepalived-2.0.18.tar.gz
tar -zxvf keepalived-2.0.18.tar.gz
cd keepalived-2.0.18
./configure --prefix=/usr/local/keepalived --sysconf=/etc
make && make install

The configuration file is as follows:

cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
  notification_email {
    [email protected]
  }
  smtp_server 127.0.0.1
  smtp_connect_timeout 30
  router_id DB1_PG_HA
}

vrrp_script check_pg_alived {
   script "/usr/local/bin/pg_moniter.sh"
   interval 10 # run the check script every 10 seconds
   fall 5    # 5 consecutive failures mark the check as failed
}

vrrp_instance VI_1 {
   state BACKUP  # both nodes are set to BACKUP; with MASTER/BACKUP the MASTER would preempt the VIP
   nopreempt  # do not preempt the VIP
   interface eth0
   virtual_router_id 12
   priority 100
   advert_int 1
   authentication {
       auth_type PASS
       auth_pass t9rveMP0Z9S1
   }
   track_script { 
       check_pg_alived
   }
   virtual_ipaddress {
       192.168.173.188
   }
   smtp_alert
   notify_master /usr/local/bin/active_standby.sh
}
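
The interval and fall settings in vrrp_script together determine how quickly a dead primary is detected. As a rough worked example (a sketch that ignores the run time of the script itself and the VRRP advertisement timeout on the peer), the two hypothetical helpers below compute the delay from the values in the configuration above.

```shell
# first_fail_to_fault INTERVAL FALL
# Seconds between the first failed check and the check being marked as failed:
# the remaining (FALL - 1) checks, one per INTERVAL.
first_fail_to_fault() {
    echo $(( ($2 - 1) * $1 ))
}

# outage_to_fault_max INTERVAL FALL
# Approximate upper bound from the real outage to the fault, for the case
# where the database stops just after a successful check.
outage_to_fault_max() {
    echo $(( $2 * $1 ))
}
```

With interval 10 and fall 5 this gives 40 seconds from the first failed check and roughly 50 seconds in the worst case. Lowering either value shortens detection at the cost of reacting to brief hiccups.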

Contents of /usr/local/bin/pg_moniter.sh

#!/bin/bash

# Load Env
source /home/postgres/.bash_profile
export PGPORT=1921
export PGUSER=sky_pg_cluster
export PGDBNAME=sky_pg_cluster
export PGDATA=/opt/pgdata/pg_root
export LANG=en_US.utf8
export PGHOME=/opt/pgsql11.5
export PATH=$PGHOME/bin:$PATH:.
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib

MONITOR_LOG="/tmp/pg_monitor.log"
SQL1="update cluster_status set last_alive = now();"
SQL2='select 1;'

# Exit if this is a standby; this script does not check standby liveness
standby_flg=`psql -p $PGPORT -U postgres -At -c "select pg_is_in_recovery();"`
if [ "${standby_flg}" == 't' ]; then
    echo -e "`date +%F\ %T`: This is a standby database, exit!\n" >> $MONITOR_LOG
    exit 0
fi

# On the primary, update the cluster_status table
echo $SQL1 | psql -At -h 127.0.0.1 -p $PGPORT -U $PGUSER -d $PGDBNAME >> $MONITOR_LOG


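# Check whether this node's own port is reachable
# (192.168.173.145 is server A; on server B use that node's own address)
CMD=`nc -v -z 192.168.173.145 1921`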
if [ $? -eq 0 ];then
        ret="yes"
else
        ret="no"
fi

COUNT=`ps -ef|grep postgres|wc -l`
if [ $COUNT -ge 5 ];then
        process="yes"
else
        process="no"
fi

echo $SQL2 | psql -At -h 127.0.0.1 -p $PGPORT -U $PGUSER -d $PGDBNAME 
if [[ $? -eq 0 ]] && [[ $ret = "yes" ]] && [[ $process = "yes" ]]; then
   echo -e "`date +%F\ %T`:  Primary db is health."  >> $MONITOR_LOG
   exit 0
else
   echo -e "`date +%F\ %T`:  Attention: Primary db is not health!" >> $MONITOR_LOG
   exit 1
fi

Contents of /usr/local/bin/active_standby.sh

#!/bin/bash

# Environment variables
source /home/postgres/.bash_profile
export PGPORT=1921
export PGUSER=sky_pg_cluster
export PG_OS_USER=postgres
export PGDBNAME=sky_pg_cluster
export PGDATA=/opt/pgdata/pg_root
export LANG=en_US.utf8
export PGHOME=/opt/pgsql11.5
export PATH=$PGHOME/bin:$PATH:.
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib

# Settings; LAG_MINUTES is the maximum allowed replication lag in minutes
LAG_MINUTES=3
HOST_IP=`hostname -i`
NOTICE_EMAIL="[email protected]"
FAILOVE_LOG='/tmp/failover.log'

SQL1="select 'this_is_standby' as cluster_role from ( select pg_is_in_recovery() as std ) t where t.std is true;"
SQL2="select 'standby_in_allowed_lag' as cluster_lag from cluster_status where now()-last_alive < interval '$LAG_MINUTES min';"


# The VIP has moved; record it in the log file
echo -e "`date +%F\ %T`: keepalived VIP switchover!" >> $FAILOVE_LOG

# The VIP has moved; send an email notification
echo -e "`date +%F\ %T`: ${HOST_IP}/${PGPORT} VIP has moved, please investigate!" | mutt -s "Error: database VIP has moved" ${NOTICE_EMAIL}


# pg_failover function: promotes the standby when the primary fails
pg_failover()
{
# PROMOTE_STATUS flags whether the promotion succeeded: 1 = failure, 0 = success
PROMOTE_STATUS=1

# Promote the standby
su - $PG_OS_USER -c "pg_ctl promote"
if [ $? -eq 0 ]; then
   echo -e "`date +%F\ %T`: `hostname` promote standby success. " 
   PROMOTE_STATUS=0
fi

if [ $PROMOTE_STATUS -ne 0 ]; then
  echo -e "`date +%F\ %T`: promote standby failed."
  return $PROMOTE_STATUS
fi

 echo -e "`date +%F\ %T`: pg_failover() function call success."
 return 0
}


# Failover procedure
# Flag for whether the standby is healthy (is in recovery); CNT=1 means OK.
CNT=`echo $SQL1 | psql -At -h 127.0.0.1 -p $PGPORT -U $PGUSER -d $PGDBNAME -f - | grep -c this_is_standby`
#echo -e "CNT: $CNT"
# Flag for whether the standby lag is within the accepted range; LAG=1 means OK.
LAG=`echo $SQL2 | psql -At -h 127.0.0.1 -p $PGPORT -U $PGUSER -d $PGDBNAME | grep -c standby_in_allowed_lag`

if [ $CNT -eq 1 ] && [ $LAG -eq 1 ] ; then
  pg_failover >> $FAILOVE_LOG
  if [ $? -ne 0 ]; then
    echo -e "`date +%F\ %T`: pg_failover failed." >> $FAILOVE_LOG
    exit 1
  fi 
else
  echo -e "`date +%F\ %T`: `hostname` standby is not ok or lagged more than $LAG_MINUTES minutes behind primary, failover not allowed!" >> $FAILOVE_LOG
  exit 1
fi
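
The lag test in SQL2 compares the standby's clock with the last heartbeat replicated from the primary: promotion is allowed only if now() minus last_alive is less than LAG_MINUTES. The sketch below restates that condition in plain shell arithmetic so the boundary behaviour is easy to see. within_lag is a hypothetical helper, and the timestamps are assumed to be Unix epoch seconds.

```shell
# within_lag LAST_ALIVE NOW LAG_MINUTES
# Succeed (exit 0) when the heartbeat is newer than LAG_MINUTES,
# mirroring: now() - last_alive < interval 'LAG_MINUTES min'
within_lag() {
    [ $(( $2 - $1 )) -lt $(( $3 * 60 )) ]
}
```

With LAG_MINUTES=3, a heartbeat that is 100 seconds old passes, while one that is exactly 180 seconds old does not. Because the comparison uses the standby's own clock against a timestamp written on the primary, the two servers need synchronized time for the check to be meaningful.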

4. keepalived primary/standby failover example
--Check eth0: the VIP is on 192.168.173.145
ip addr show eth0
2: eth0:  mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether b8:ac:6f:12:fd:b4 brd ff:ff:ff:ff:ff:ff
    inet 192.168.173.145/24 brd 192.168.173.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.173.188/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::baac:6fff:fe12:fdb4/64 scope link 
       valid_lft forever preferred_lft forever
--Simulate a failure by shutting down the database
pg_ctl stop -m fast

--Watch the log
tailf /tmp/pg_monitor.log 

UPDATE 1
2019-09-20 14:39:11:  Primary db is health.
2019-09-20 14:39:21:  Attention: Primary db is not health!
2019-09-20 14:39:31:  Attention: Primary db is not health!
2019-09-20 14:39:41:  Attention: Primary db is not health!
2019-09-20 14:39:51:  Attention: Primary db is not health!
2019-09-20 14:40:01:  Attention: Primary db is not health!

After five consecutive failures the check is marked as failed, and the log on 192.168.173.122 shows it taking over as master
tailf /var/log/messages 
Sep 20 14:40:01 192-168-173-122 Keepalived_vrrp[29340]: (VI_1) Backup received priority 0 advertisement
Sep 20 14:40:01 192-168-173-122 Keepalived_vrrp[29340]: (VI_1) Backup received priority 0 advertisement
Sep 20 14:40:02 192-168-173-122 Keepalived_vrrp[29340]: (VI_1) Receive advertisement timeout
Sep 20 14:40:02 192-168-173-122 Keepalived_vrrp[29340]: (VI_1) Entering MASTER STATE
Sep 20 14:40:02 192-168-173-122 Keepalived_vrrp[29340]: (VI_1) setting VIPs.
Sep 20 14:40:02 192-168-173-122 Keepalived_vrrp[29340]: Sending gratuitous ARP on eth0 for 192.168.173.188
Sep 20 14:40:02 192-168-173-122 Keepalived_vrrp[29340]: (VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.173.188
Sep 20 14:40:02 192-168-173-122 Keepalived_vrrp[29340]: Sending gratuitous ARP on eth0 for 192.168.173.188
Sep 20 14:40:02 192-168-173-122 Keepalived_vrrp[29340]: Sending gratuitous ARP on eth0 for 192.168.173.188
Sep 20 14:40:02 192-168-173-122 Keepalived_vrrp[29340]: Sending gratuitous ARP on eth0 for 192.168.173.188
Sep 20 14:40:02 192-168-173-122 Keepalived_vrrp[29340]: Sending gratuitous ARP on eth0 for 192.168.173.188
Sep 20 14:40:02 192-168-173-122 Keepalived_vrrp[29340]: Remote SMTP server [127.0.0.1]:25 connected.
Sep 20 14:40:02 192-168-173-122 Keepalived_vrrp[29340]: SMTP alert successfully sent.

The output below shows that the database has been promoted to primary
su - postgres
pg_controldata | grep 'cluster state'
Database cluster state:               in production
...
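If the old primary was shut down cleanly, as in this example, it can be turned into a standby directly. If it was not shut down cleanly, it cannot rejoin as a standby as it is; use pg_rewind in that case. pg_rewind has prerequisites: either data checksums were enabled when the cluster was initialized, or wal_log_hints is set to on; full_page_writes must also be on.
For example, run pg_rewind on 192.168.173.145 to rebuild the standby: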
pg_rewind --target-pgdata /opt/pgdata/pg_root --source-server='host=192.168.173.122 port=1921 user=postgres dbname=postgres' -P

cd $PGDATA
mv recovery.done recovery.conf
pg_ctl start
pg_controldata | grep 'cluster state'
Database cluster state:               in archive recovery

--On the new primary 192.168.173.122, check the streaming replication status and create a table as a test
psql
psql (11.5)
Type "help" for help.
--Check the streaming replication status
 \x
Expanded display is on.
postgres=# select * from pg_stat_replication ;
-[ RECORD 1 ]----+------------------------------
pid              | 17876
usesysid         | 16384
usename          | repuser
application_name | walreceiver
client_addr      | 192.168.173.145
client_hostname  | 
client_port      | 17519
backend_start    | 2019-09-20 14:43:16.554519+08
backend_xmin     | 
state            | streaming
sent_lsn         | 0/1401D388
write_lsn        | 0/1401D388
flush_lsn        | 0/1401D388
replay_lsn       | 0/1401D388
write_lag        | 00:00:00.0006
flush_lag        | 00:00:00.000757
replay_lag       | 00:00:00.000985
sync_priority    | 0
sync_state       | async
--Create a test table
postgres=# create table tb1 (a int);
CREATE TABLE
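
The sent_lsn, write_lsn, flush_lsn and replay_lsn columns in pg_stat_replication are WAL positions written as two hexadecimal halves, so their difference is the number of bytes the standby is behind. Inside the database, pg_wal_lsn_diff() does this calculation. The sketch below does the same arithmetic in shell for values copied from the output; lsn_to_bytes and lsn_lag are hypothetical helper names.

```shell
# lsn_to_bytes LSN
# Convert an LSN such as 0/1401D388 into an absolute byte position:
# high half * 2^32 + low half (both halves are hexadecimal).
lsn_to_bytes() {
    hi=${1%%/*}
    lo=${1##*/}
    echo $(( (0x$hi << 32) + 0x$lo ))
}

# lsn_lag SENT REPLAYED
# Bytes of WAL that were sent but not yet replayed by the standby.
lsn_lag() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}
```

For the record shown earlier, lsn_lag 0/1401D388 0/1401D388 prints 0, meaning the standby has replayed everything that was sent.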

--Verify on the new standby 192.168.173.145 that tb1 exists there as well
psql
psql (11.5)
Type "help" for help.

postgres=# \d tb1
                Table "public.tb1"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 

References:
https://www.keepalived.org/manpage.html
https://www.postgresql.org/docs/11/high-availability.html
https://github.com/francs/PostgreSQL-Keepalived-HA/blob/master/install.txt
