PostgreSQL+Keepalived主备高可用环境搭建

说明:本文档基于CentOS 7.9 、postgresql 11.9 、keepalived-1.3.5 搭建撰写。

一、服务器规划

服务器规划 IP及端口规划 VIP
PgMaster 192.168.11.3 192.168.11.5
PgSlave 192.168.11.4

二、准备相应的安装包

  • postgresql11-11.9-1PGDG.rhel7.x86_64.rpm
  • postgresql11-libs-11.9-1PGDG.rhel7.x86_64.rpm
  • postgresql11-contrib-11.9-1PGDG.rhel7.x86_64.rpm
  • postgresql11-server-11.9-1PGDG.rhel7.x86_64.rpm
  • ipset-libs-7.1-1.el7.x86_64.rpm
  • net-snmp-5.7.2-49.el7.x86_64.rpm
  • net-snmp-libs-5.7.2-49.el7.x86_64.rpm
  • net-snmp-agent-libs-5.7.2-49.el7.x86_64.rpm
  • keepalived-1.3.5-19.el7.x86_64.rpm
  • perl-Data-Dumper-2.145-3.el7.x86_64.rpm

三、配置防火墙、关闭selinux

  1. 使用root账号按以下命令操作

    所有服务器均需配置防火墙

    firewall-cmd --permanent --add-port=5432/tcp
    firewall-cmd --reload
    

    所有服务器均需重启防火墙,检查开发端口信息

     systemctl restart firewalld.service
     firewall-cmd --state
     firewall-cmd --list-ports
    
    1. 关闭SELinux
      永久关闭,修改/etc/selinux/config ⽂件,将SELINUX=enforcing改为SELINUX=disabled,重启机器即可

四、搭建主从环境

  1. 上传rpm包到服务器,安装postgresql包
    创建数据存储路径
mkdir -p /home/data

创建上传文件路径

mkdir -p /home/software

创建数据文件备份路径

mkdir -p /mnt/data/backups/

进入上传路径

cd /home/software/

执行下面命令进行安装

rpm -ivh postgresql11-libs-11.9-1PGDG.rhel7.x86_64.rpm 
rpm -ivh postgresql11-11.9-1PGDG.rhel7.x86_64.rpm
rpm -ivh postgresql11-server-11.9-1PGDG.rhel7.x86_64.rpm 
rpm -ivh postgresql11-contrib-11.9-1PGDG.rhel7.x86_64.rpm
  1. 指定数据存储路径
    pg数据存储路径默认是/var/lib/pgsql/11/data/,如需自定义存储路径,请按以下操作:
    创建自定义数据存储路径。
mkdir -p /home/data/postgresdb/11
chown -R postgres:postgres /home/data/postgresdb

创建归档路径

mkdir -p /home/postgres/arch/
chown -R postgres:postgres /home/postgres/arch/

创建postgresql-11.service.d目录

mkdir -p /etc/systemd/system/postgresql-11.service.d

创建override.conf,添加以下内容

vi /etc/systemd/system/postgresql-11.service.d/override.conf
[Service]
Environment=PGDATA=/home/data/postgresdb/11

其中/home/data/postgresdb/11,表示数据存储路径,修改完成重启载入。

systemctl daemon-reload
  1. 初始化数据库
/usr/pgsql-11/bin/postgresql-11-setup initdb
systemctl start postgresql-11
  1. 初始化postgres数据库用户密码(仅主库操作)
su - postgres
psql -U postgres
\password

创建同步账号

create role repl login replication encrypted password '123456';

修改完 \q 退出

exit

退出postgres账号。

  1. 修改postgresql配置(仅主库操作)
    进入自定义的数据存储路径
cd /home/data/postgresdb/11/
  1. 修改自定义的数据存储路径下的postgresql.conf文件。
vim postgresql.conf

查找并编辑以下配置项

port = 5432
max_connections = 2000
listen_addresses = '*'   
wal_level = replica
max_wal_senders = 20
wal_keep_segments = 2048
archive_mode = on
archive_timeout=3600   #每小时归档一次
archive_command = 'test ! -f /home/postgres/arch/%f && cp %p  /home/postgres/arch/%f'
  1. 修改pg_hba.conf文件。
vi pg_hba.conf

编辑配置如下

local   all             all                                     trust
host    all             all            0.0.0.0/0                md5
host    all             all             ::1/128                 md5
local   replication     all                                     trust
host    replication     all             0.0.0.0/0            md5
host    replication     all             ::1/128                 md5
  1. 编辑主库同步配置文件,recovery.done,如果没有则创建,找到以下配置项并修改
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=192.168.11.4 port=5432 user=repl password=123456'

其中host为另一台服务器地址,port为另一台服务器端口,user为同步账号,password为同步账号密码。
授权

chown -R postgres:postgres /home/data/postgresdb/11/recovery.done
chmod 700 /home/data/postgresdb/11/

然后重启主库

systemctl restart postgresql-11
  1. 搭建从库环境
    1)首先按上面的第五步的1、2步骤安装好从库
    2)切换到postgres
su - postgres

3)执行同步操作
先删除/home/data/postgresdb/11/下的内容

rm -rf  /home/data/postgresdb/11/*

执行同步操作

pg_basebackup -h 192.168.11.3 -p 5432 -U repl -F p -P -D /home/data/postgresdb/11/

其中:ip地址192.168.11.3为主库地址,/home/data/postgresdb/11/为从库的数据存储路径
4)编辑从库同步配置文件,将recovery.done改为recovery.conf,如果没有则创建,找到以下配置项并修改

cd /home/data/postgresdb/11/
mv recovery.done recovery.conf
vi recovery.conf

修改host配置

recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=192.168.11.3 port=5432 user=repl password=123456'

其中host为另一台服务器地址,port为另一台服务器端口,user为同步账号,password为同步账号密码。
授权

chown -R postgres:postgres /home/data/postgresdb/11/recovery.conf
chmod 700 /home/data/postgresdb/11/

5)使用root账号,启动从库

systemctl start postgresql-11

6)验证主从关系
切换到postgres账号

su - postgres

登录进入psql客户端

psql -U postgres 

执行下面sql,查询主从关系,f表示主库,t表示从库。

select pg_is_in_recovery(); 
  1. 配置环境变量(主从都执行)
    编辑环境变量
vim /etc/profile

添加如下内容:

export PATH=$PATH:/usr/pgsql-11/bin

执行以下命令

source /etc/profile

五、安装keepalived

  1. 上传rpm包到服务器,安装相关包
    进入到software
cd /home/software/

执行下面的命令安装

rpm -ivh perl-Data-Dumper-2.145-3.el7.x86_64.rpm  
rpm -ivh --force net-snmp-libs-5.7.2-49.el7.x86_64.rpm
rpm -ivh --force net-snmp-agent-libs-5.7.2-49.el7.x86_64.rpm
rpm -ivh --force ipset-libs-7.1-1.el7.x86_64.rpm
rpm -ivh libnl3-3.2.28-4.el7.x86_64.rpm --force
rpm -ivh keepalived-1.3.5-19.el7.x86_64.rpm
  1. keepalived配置
    创建脚本存放路径
mkdir -p /etc/keepalived/scripts/

创建脚本日志路径

mkdir -p /etc/keepalived/log/

编辑配置文件,具体内容参考入下

vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id 192.168.11.3
   script_user root
   enable_script_security
   vrrp_skip_check_adv_addr
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_script check_pg_alived {
  script "/etc/keepalived/scripts/check_pg.sh"
  interval 2
  weight -5
  fall 2
  rise 1
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens160
    virtual_router_id 66
    priority 99
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
      check_pg_alived
    }
   virtual_ipaddress {
        192.168.11.5
    }
    notify_master "/etc/keepalived/scripts/failover.sh"
    notify_fault "/etc/keepalived/scripts/fault.sh"
}

注意:
1.备库的priority设置要比主库的小,同时大于主库priority + weight 的值 。
比如这里主库的priority 为99,备库可以设置为98、97、96、95这几个值。
2.备库不需要配置nopreempt。

3.编写相关脚本
1)编写检查pg的脚本

vim /etc/keepalived/scripts/check_pg.sh
#!/bin/bash

export PGDATABASE=postgres
export PGPORT=5432
export PGUSER=postgres
export PGHOME=/usr/bin
export PATH=$PATH:$PGHOME

LOGFILE=/etc/keepalived/log/pg_status.log
SQL2='update sr_delay set sr_date = now() where id =1;'
SQL1='SELECT pg_is_in_recovery from pg_is_in_recovery();'
SQL3='SELECT 1;'

#db_role=`echo $SQL1  | $PGHOME/psql  -d $PGDATABASE -U $PGUSER -At -w`
db_role=`echo $SQL1  | psql -p $PGPORT -d $PGDATABASE -U $PGUSER -At -w`

if [ $db_role == 't' ];then
   echo -e `date +"%F %T"` "Attention1:the current database is standby DB!" >> $LOGFILE
   exit 0
fi

#备库不检查存活,主库更新状态
echo $SQL3 | psql -p $PGPORT -d $PGDATABASE -U $PGUSER -At -w

if [ $? -eq 0 ] ;then
   echo $SQL2 | psql -p $PGPORT -d $PGDATABASE -U $PGUSER -At -w
   echo -e `date +"%F %T"` "Success: update the master sr_delay successed!" >> $LOGFILE
   exit 0
else
   echo -e `date +"%F %T"` "Error:Is the server is running?" >> $LOGFILE
   exit 1
fi

2)编写备转为主时触发的脚本

vim /etc/keepalived/scripts/failover.sh
#!/bin/bash

export PGPORT=5432
export PGUSER=postgres
export PG_OS_USER=postgres
export PGDBNAME=postgres
export LANG=zh_CN.UTF-8
export PGPATH=/usr/bin
export PATH=$PATH:$PGPATH
LOGFILE=/etc/keepalived/log/failover.log

# 主备数据库同步时延,单位为秒
sr_allowed_delay_time=10000
PGDBPATH=/usr/local/postgresql/data

SQL1='select pg_is_in_recovery  from pg_is_in_recovery();'

SQL2="select sr_date as delay_time from sr_delay where now()-sr_date < interval \' $sr_allowed_delay_time \';"
#SQL2="select sr_date as delay_time from sr_delay where now()-sr_date < interval '100';"

db_role=`echo $SQL1 | psql -At -p $PGPORT -U $PGUSER -d $PGDBNAME -w`
db_sr_delaytime=`echo $SQL2 | psql -p $PGPORT -d $PGDBNAME -U $PGUSER -At -w`

SWITCH_COMMAND='pg_ctl promote -D '$PGDBPATH

# 如果为备库,且延迟大于指定时间则切换为主库
if [ $db_role == f ];then
   echo -e `date +"%F %T"` "Attention:The current postgreSQL DB is master database,cannot switched!" >> $LOGFILE
   exit 0
fi

if [ $db_sr_delaytime -gt 0 ];then
     echo -e `date +"%F %T"` "Attention:The current master database is health,the standby DB cannot switched!" >> $LOGFILE
     exit 0
fi

if [ !$db_sr_delaytime ];then
      echo -e `date +"%F %T"` "Attention:The current database is statndby,ready to switch master database!" >> $LOGFILE
      su - $PG_OS_USER -c "$SWITCH_COMMAND"
elif [ $? eq 0 ];then
          echo -e `date +"%F %T"` "success:The current standby database successed to switched the primary PG database !" >> $LOGFILE
          exit 0
else
          echo -e `date +"%F %T"` "Error: the standby database failed to switch the primary PG database !,pelease checked it!" >> $LOGFILE
          exit 1
fi

3)编写状态转换失败时触发的脚本

vim /etc/keepalived/scripts/fault.sh
#!/bin/bash

LOGGFILE=/etc/keepalived/log/pg_db_fault.log
PGPORT=5432

echo -e `date +"%F %T"` "Error:Because of the priamry DB happend some unknown problem,So turn off the PostgreSQL Database!" >> $LOGFILE

PGPID="`netstat -anp|grep $PGPORT |awk '{printf $7}'|cut -d/ -f1`"
kill -9 $PGPID
echo -e `date +"%F %T"` "Error:Because of the priamry DB happend some unknown problem,So turn off the PostgreSQL Database!" >> $LOGFILE
systemctl stop keepalived

4)授予可执行权限

cd /etc/keepalived/scripts
chmod 755 ./*.sh

5)创建用户检查检查心跳的数据库脚本
在主库上执行,plsql进入会话执行以下两条sql语句。

create table sr_delay(id serial ,sr_date timestamp default now());
insert into sr_delay values(1,now());

6)主备服务器分别修改配置
编辑配置

vim /etc/sysconfig/keepalived

把KEEPALIVED_OPTIONS="-D" 修改为KEEPALIVED_OPTIONS="-D -d -S 0"

vim /etc/rsyslog.conf

加入如下配置:

#keepalived -S 0
local0.* /var/log/keepalived.log

7)主备分别配置防火墙策略

firewall-cmd --direct --permanent --add-rule ipv4 filter INPUT 0  --destination 224.0.0.18 --protocol vrrp -j ACCEPT
firewall-cmd --direct --permanent --add-rule ipv4 filter OUTPUT 0  --destination 224.0.0.18 --protocol vrrp -j ACCEPT
firewall-cmd --reload
systemctl restart firewalld.service

8)启动keepalived服务
使用下面的命令依次启动主库和从库的keepalived服务

systemctl start keepalived.service

并用下面的命令检查是否绑定了指定的vip。

ip a

检查/etc/keepalived/log/pg_status.log,该日志会定时向数据库更新时间戳,通过内容可知是否正常。

六、测试主备切换

1)关闭主库服务,切换为postgres账号,执行如下命令:

systemctl stop postgresql-11

2)从库上执行ip a 命令,显示vip已经飘过来。
3)检查从库是否变为主库。
登录进入psql客户端,执行下面命令

psql 

执行下面sql,查询主从关系,f表示主库,t表示从库。

select pg_is_in_recovery();

六、恢复主库(原主库变为从库)

1)检查recovery.conf配置文件
将data下recovery.done改为recovery.conf
2)启动从库服务(原主库)

systemctl restart postgresql-11 

3)检查从库(原主库)的keepalived状态

ps -ef | grep keepalived

如果服务不存在,使用root账号用systemctl start keepalived命令启动keepalived。

七、备份

方案采用每周全量备份加开启日志归档操作,保证当灾难发生时,比如主从数据库都损坏时,保证根据全量数据加归档的日志文件能够最大程度还原数据。
1)编写全量备份的脚本

mkdir /etc/keepalived/scripts/tools/
vim /etc/keepalived/scripts/tools/backup_pg_all.sh
#!/bin/bash
export PGPORT=5432
export PGUSER=postgres
export PGHOME=/usr/bin
export PATH=$PATH:$PGHOME
export PASS='123456'

LOGFILE=/etc/keepalived/log/backup_pg.log
BACKPATH=/mnt/data/backups/

backtime=`date +"%F %T"`
temppath=`date +"%Y%m%d"`
echo ”备份时间为${backtime},备份数据库开始” >> ${LOGFILE}
source=`pg_basebackup -Ft -Pv -Xf -z -Z5 -p ${PGPORT} -U ${PGUSER} -w -D ${BACKPATH}${temppath}` 2>> ${LOGFILE}
if [ "$?" == 0 ];then

echo ”备份数据库成功!!” >> ${LOGFILE}
else
echo ”备份数据库失败!!” >> ${LOGFILE}
fi

如果端口号、密码等信息有改动,需要根据实际情况调整。
授予可执行权限

chmod 755 /etc/keepalived/scripts/tools/backup_pg_all.sh

2)每周一凌晨1点执行全量备份
执行crontab -e,进行定时任务编辑,增加如下配置:

1 1 * * 1       /etc/keepalived/scripts/tools/backup_pg_all.sh

你可能感兴趣的:(PostgreSQL+Keepalived主备高可用环境搭建)