PostgreSQL 11 installation
1. Create the postgres user and grant it root (sudo) privileges
Perform these steps on both master and slave:
1.1 Create the user
groupadd postgres
useradd -g postgres postgres
1.2 Grant root (sudo) privileges
visudo
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
postgres ALL=(ALL) ALL # the newly added user
1.3 Create the installation directory
mkdir /opt/pgsql11/
1.4 Give ownership of the installation directory to the postgres user
chown -R postgres:postgres /opt/pgsql11/
[root@bogon ~]# vim /etc/hosts
# add the following entries:
192.168.8.60 master
192.168.8.61 slave
192.168.9.62 vip
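Because the same three entries have to be added on both nodes, the edit can also be scripted. The following is only a sketch: `add_host_entry` is a hypothetical helper name, and it assumes one host name per line, as in the entries above.

```shell
#!/bin/sh
# Append "IP name" to a hosts file unless the name is already listed.
add_host_entry() {
    # $1 = hosts file, $2 = IP address, $3 = host name
    if grep -Eq "[[:space:]]$3([[:space:]]|\$)" "$1" 2>/dev/null; then
        echo "$3 already present in $1"
    else
        printf '%s %s\n' "$2" "$3" >> "$1"
    fi
}

# Example (run as root on both nodes):
#   add_host_entry /etc/hosts 192.168.8.60 master
#   add_host_entry /etc/hosts 192.168.8.61 slave
#   add_host_entry /etc/hosts 192.168.9.62 vip
```

Running the helper a second time leaves the file unchanged, so it is safe to repeat.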
2. Install PostgreSQL 11
2.1 Install build dependencies
yum install gcc
yum install zlib-devel
yum -y install readline-devel
2.2 Download and build PostgreSQL 11
PostgreSQL 11 download URL:
https://ftp.postgresql.org/pub/source/v11.5/postgresql-11.5.tar.gz
tar -zxvf postgresql-11.5.tar.gz
cd postgresql-11.5/
./configure --prefix=/opt/pgsql11/
make
make install
2.3 Configure environment variables
su - postgres
vim ~/.bashrc
PGHOME=/opt/pgsql11
export PGHOME
PGDATA=$PGHOME/data
export PGDATA
PATH=$PATH:$HOME/.local/bin:$HOME/bin:$PGHOME/bin
export PATH
source ~/.bashrc
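Configuration on master:
Initialize the database cluster:
initdb -D $PGDATA
Start the database:
pg_ctl start -D $PGDATA
Log in to the database:
psql -U postgres
Set the password for the postgres user (123456 in this example):
\password
123456
Create the streaming replication user: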
CREATE USER repuser replication LOGIN CONNECTION LIMIT 3 ENCRYPTED PASSWORD 'repuser';
vim /opt/pgsql11/data/pg_hba.conf
host all all 0.0.0.0/0 md5
host replication repuser slave md5
Configure postgresql.conf:
listen_addresses = '*'
port = 5432
max_wal_senders = 16
wal_level = replica
archive_mode = on
archive_command = 'cd ./'
hot_standby = on
wal_keep_segments = 64
full_page_writes = on
wal_log_hints = on
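Before restarting the server it can help to confirm that the edited file really contains the values listed above. The sketch below is an assumption-laden helper, not part of PostgreSQL: `conf_value` and `check_replication_settings` are hypothetical names, and the parser treats everything after `#` as a comment, so it is only suitable for simple values like these.

```shell
#!/bin/sh
# Print the last uncommented value assigned to a parameter in a postgresql.conf-style file.
conf_value() {
    # $1 = config file, $2 = parameter name
    awk -v key="$2" -v q="'" '
        /^[[:space:]]*#/ { next }
        {
            line = $0
            sub(/#.*/, "", line)                 # drop trailing comments
            n = index(line, "=")
            if (n == 0) next
            k = substr(line, 1, n - 1)
            v = substr(line, n + 1)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", k)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", v)
            gsub("^" q "|" q "$", "", v)         # strip surrounding single quotes
            if (k == key) found = v
        }
        END { if (found != "") print found }
    ' "$1"
}

# Compare the settings that streaming replication and pg_rewind rely on.
check_replication_settings() {
    # $1 = config file
    status=0
    for pair in wal_level=replica hot_standby=on wal_log_hints=on archive_mode=on; do
        key=${pair%%=*}
        want=${pair#*=}
        got=$(conf_value "$1" "$key")
        if [ "$got" != "$want" ]; then
            echo "MISMATCH $key: want '$want', got '$got'"
            status=1
        fi
    done
    return $status
}

# Example:
#   check_replication_settings "$PGDATA/postgresql.conf" && echo "settings look right"
```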
Prepare recovery.done on master (it is renamed to recovery.conf if this node later rejoins as a standby):
[postgres@bogon ~]$ cp $PGHOME/share/recovery.conf.sample $PGDATA/recovery.done
[postgres@bogon ~]$ vim $PGDATA/recovery.done
# edit as follows
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=slave port=5432 user=repuser password=repuser'
trigger_file = '/opt/pgsql11/data/trigger_file'
[postgres@bogon ~]$ vim ~/.pgpass
slave:5432:replication:repuser:repuser
[postgres@bogon ~]$ chmod 600 ~/.pgpass
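libpq ignores a password file whose permissions are looser than 0600, so the file mode matters as much as the entry itself. A minimal sketch that writes the entry and fixes the mode in one step follows; `add_pgpass_entry` is a hypothetical helper name.

```shell
#!/bin/sh
# Add one host:port:database:user:password line to a password file and keep it private.
add_pgpass_entry() {
    # $1 = password file, $2 = complete entry line
    touch "$1" && chmod 600 "$1" || return 1
    # Append only when the exact line is not there yet.
    grep -qxF "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example, as the postgres user on master:
#   add_pgpass_entry "$HOME/.pgpass" 'slave:5432:replication:repuser:repuser'
```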
Check the standby status (run on master):
select pid,application_name,client_addr,client_port,state,sync_state from pg_stat_replication;
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)
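Configuration on slave:
pg_basebackup -D $PGDATA -Fp -Xs -v -P -h master -p 5432 -U repuser
chown -R postgres:postgres /opt/pgsql11/
Edit $PGDATA/pg_hba.conf (the copy taken from master names slave as the replication peer; on slave the peer is master):
host all all 0.0.0.0/0 md5
host replication repuser master md5
Configure recovery.conf:
[postgres@bogon ~]$ cp $PGHOME/share/recovery.conf.sample $PGDATA/recovery.conf
[postgres@bogon ~]$ vim $PGDATA/recovery.conf
# edit as follows
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=master port=5432 user=repuser password=repuser'
trigger_file = '/opt/pgsql11/data/trigger_file'
[postgres@bogon ~]$ vim ~/.pgpass
master:5432:replication:repuser:repuser
[postgres@bogon ~]$ chmod 600 ~/.pgpass
Start the standby and confirm that it is in recovery:
[postgres@bogon ~]$ pg_ctl start -D $PGDATA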
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
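The same query can be wrapped so that each node reports its role in one word: `f` means the node is the primary and `t` means it is a standby. This is a sketch only; `role_from_recovery_flag` and `node_role` are hypothetical names, and `node_role` assumes the postgres password is available (for example through ~/.pgpass).

```shell
#!/bin/sh
# Map the single-letter result of pg_is_in_recovery() to a role name.
role_from_recovery_flag() {
    case "$1" in
        t) echo standby ;;
        f) echo primary ;;
        *) echo unknown; return 1 ;;
    esac
}

# Ask a server for its role.
node_role() {
    # $1 = host, $2 = port (default 5432), $3 = user (default postgres)
    flag=$(psql -h "$1" -p "${2:-5432}" -U "${3:-postgres}" -d postgres \
                -Atc 'select pg_is_in_recovery()') || return 1
    role_from_recovery_flag "$flag"
}

# Example:
#   node_role master
#   node_role slave
```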
If master goes down, promote the standby to master by creating the trigger file on slave:
touch /opt/pgsql11/data/trigger_file
Configure pgpool-II
First set up passwordless SSH for the postgres user between the two nodes.
master:
[root@localhost ~]# su - postgres
[postgres@localhost ~]$ ssh-keygen -t rsa
[postgres@localhost ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[postgres@localhost ~]$ chmod 600 ~/.ssh/authorized_keys
[postgres@localhost ~]$ scp ~/.ssh/authorized_keys postgres@slave:~/.ssh/
slave:
[root@localhost ~]# su - postgres
[postgres@localhost ~]$ ssh-keygen -t rsa
[postgres@localhost ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[postgres@localhost ~]$ chmod 600 ~/.ssh/authorized_keys
[postgres@localhost ~]$ scp ~/.ssh/authorized_keys postgres@master:~/.ssh/
Verify: ssh postgres@master
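The scp commands above copy the whole authorized_keys file over the peer's copy, replacing whatever it held. An alternative sketch uses ssh-copy-id, which appends the key instead. `run` and `trust_peer` are hypothetical helper names, and DRY_RUN=1 only prints the commands.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Install this user's public key on the peer, then confirm key-based login works.
trust_peer() {
    # $1 = peer host name (slave when run on master, master when run on slave)
    run ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "postgres@$1" || return 1
    # BatchMode=yes makes ssh fail instead of asking for a password.
    run ssh -o BatchMode=yes "postgres@$1" true
}

# Example, as the postgres user after ssh-keygen:
#   trust_peer slave
```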
pgpool-II download URL:
http://www.pgpool.net/mediawiki/images/pgpool-II-3.7.2.tar.gz
tar -zxvf pgpool-II-3.7.2.tar.gz
mkdir -p /opt/pgpool
chown -R postgres:postgres /opt/pgpool
su postgres
cd pgpool-II-3.7.2/
./configure --prefix=/opt/pgpool --with-pgsql=/opt/pgsql11/
make
make install
[postgres@master ~]$ cd /home/postgres
[postgres@master ~]$ vim .bashrc
# edit as follows
PGPOOLHOME=/opt/pgpool
export PGPOOLHOME
PATH=$PATH:$HOME/.local/bin:$HOME/bin:$PGHOME/bin:$PGPOOLHOME/bin
export PATH
[postgres@master ~]$ cd /opt/pgpool/etc
[postgres@master etc]$ cp pool_hba.conf.sample pool_hba.conf
[postgres@master etc]$ vim pool_hba.conf
# edit as follows
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 0.0.0.0/0 md5
host all all 0/0 md5
[postgres@master ~]$ cd /opt/pgpool/etc
[postgres@master etc]$ cp pcp.conf.sample pcp.conf
[postgres@master etc]$ cp pgpool.conf.sample pgpool.conf
# Use pg_md5 to generate the MD5 hash of the password
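[postgres@master etc]$ pg_md5 123456
e10adc3949ba59abbe56e057f20f883e
# pcp.conf holds the user name and password of pgpool's own administration interface (PCP), which is used to manage the cluster.
[postgres@master etc]$ vim pcp.conf
# edit as follows
postgres:e10adc3949ba59abbe56e057f20f883e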
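# Save and exit.
# Add the PostgreSQL database user name and password to pgpool
[postgres@master etc]$ pg_md5 -p -m -u postgres pool_passwd
# The database login user is postgres; enter its login password here and make sure it is correct.
# After the password is entered, a pool_passwd file is generated in the pgpool etc directory.
To add another database user, repeat the pg_md5 step above for that user,
and add the following to pg_hba.conf of the PostgreSQL database:
master:
host kong kong slave md5
slave:
host kong kong master md5
For example, if kong needs to connect to PostgreSQL, create a kong user and a kong database. If the kong user is not added to pool_passwd,
connecting to the database through the VIP fails with an error.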
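The two files store different hashes: pcp.conf uses the plain MD5 of the password, while a pool_passwd entry follows the PostgreSQL md5 format, that is the prefix md5 followed by the MD5 of the password concatenated with the user name. The sketch below reproduces both with md5sum for illustration. The function names are hypothetical, and pg_md5 remains the supported way to generate the entries.

```shell
#!/bin/sh
# Hex MD5 of a string (no trailing newline is hashed).
md5_hex() {
    printf '%s' "$1" | md5sum | cut -d' ' -f1
}

# Line format for pcp.conf: user:md5(password)
pcp_conf_line() {
    # $1 = PCP user name, $2 = password
    printf '%s:%s\n' "$1" "$(md5_hex "$2")"
}

# Line format for pool_passwd: user:md5 + md5(password + user)
pool_passwd_line() {
    # $1 = database user name, $2 = password
    printf '%s:md5%s\n' "$1" "$(md5_hex "$2$1")"
}

# Example:
#   pcp_conf_line postgres 123456
#   pool_passwd_line kong 'kong-password'
```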
pgpool.conf on master:
# CONNECTIONS
listen_addresses = '*'
port = 9999
pcp_listen_addresses = '*'
pcp_port = 9898
# - Backend Connection Settings -
backend_hostname0 = 'master'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/opt/pgsql11/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'slave'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/opt/pgsql11/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'
# - Authentication -
enable_pool_hba = on
pool_passwd = 'pool_passwd'
# FILE LOCATIONS
pid_file_name = '/opt/pgpool/pgpool.pid'
replication_mode = off
load_balance_mode = on
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 5
sr_check_user = 'repuser'
sr_check_password = 'repuser'
sr_check_database = 'postgres'
#------------------------------------------------------------------------------
# HEALTH CHECK
#------------------------------------------------------------------------------
health_check_period = 10 # Health check period
# Disabled (0) by default
health_check_timeout = 20
# Health check timeout
# 0 means no timeout
health_check_user = 'postgres'
# Health check user
health_check_password = '123456' # database password of health_check_user
# Password for health check user
health_check_database = 'postgres'
# Must be set. Otherwise pgpool does not notice when the primary goes down and cannot fail over promptly,
# while streaming replication on the standby keeps trying to connect and reports connection failures.
# pgpool would only discover the failure, and fail over, the next time a client logs in through it and the connection fails.
#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
# Command executed on failover
failover_command = '/opt/pgpool/failover_stream.sh %H '
#------------------------------------------------------------------------------
# WATCHDOG
#------------------------------------------------------------------------------
# - Enabling -
use_watchdog = on
# - Watchdog communication Settings -
wd_hostname = 'master'
# Host name or IP address of this watchdog
# (change requires restart)
wd_port = 9000
# port number for watchdog service
# (change requires restart)
# - Virtual IP control Setting -
delegate_IP = 'vip'
# delegate IP address
# If this is empty, virtual IP never bring up.
# (change requires restart)
if_cmd_path = '/sbin'
# path to the directory where if_up/down_cmd exists
# (change requires restart)
if_up_cmd = 'ifconfig eth1:0 inet $_IP_$ netmask 255.255.255.0'
# startup delegate IP command
# (change requires restart)
# Replace eth1 with the NIC name of the local machine.
# This command brings up the virtual IP.
if_down_cmd = 'ifconfig eth1:0 down'
# shutdown delegate IP command
# (change requires restart)
# Replace eth1 with the NIC name of the local machine.
# This command removes the virtual IP.
# -- heartbeat mode --
wd_heartbeat_port = 9694
# Port number for receiving heartbeat signal
# (change requires restart)
wd_heartbeat_keepalive = 2
# Interval time of sending heartbeat signal (sec)
# (change requires restart)
wd_heartbeat_deadtime = 30
# Deadtime interval for heartbeat signal (sec)
# (change requires restart)
heartbeat_destination0 = 'slave'
# Host name or IP address of destination 0
# for sending heartbeat signal.
# (change requires restart)
heartbeat_destination_port0 = 9694
# Port number of destination 0 for sending
# heartbeat signal. Usually this is the
# same as wd_heartbeat_port.
# (change requires restart)
heartbeat_device0 = 'eth1'
# Name of NIC device (such like 'eth0')
# used for sending/receiving heartbeat
# signal to/from destination 0.
# This works only when this is not empty
# and pgpool has root privilege.
# (change requires restart)
# Replace eth1 with the NIC name of the local machine.
# - Other pgpool Connection Settings -
other_pgpool_hostname0 = 'slave' # the peer node
# Host name or IP address to connect to for other pgpool 0
# (change requires restart)
other_pgpool_port0 = 9999
# Port number for other pgpool 0
# (change requires restart)
other_wd_port0 = 9000
# Port number for other watchdog 0
# (change requires restart)
pgpool.conf on slave:
# CONNECTIONS
listen_addresses = '*'
port = 9999
pcp_listen_addresses = '*'
pcp_port = 9898
# - Backend Connection Settings -
backend_hostname0 = 'master'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/opt/pgsql11/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'slave'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/opt/pgsql11/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'
# - Authentication -
enable_pool_hba = on
pool_passwd = 'pool_passwd'
# FILE LOCATIONS
pid_file_name = '/opt/pgpool/pgpool.pid'
replication_mode = off
load_balance_mode = on
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 5
sr_check_user = 'repuser'
sr_check_password = 'repuser'
sr_check_database = 'postgres'
#------------------------------------------------------------------------------
# HEALTH CHECK
#------------------------------------------------------------------------------
health_check_period = 10 # Health check period
# Disabled (0) by default
health_check_timeout = 20
# Health check timeout
# 0 means no timeout
health_check_user = 'postgres'
# Health check user
health_check_password = '123456' # database password of health_check_user
# Password for health check user
health_check_database = 'postgres'
# Must be set. Otherwise pgpool does not notice when the primary goes down and cannot fail over promptly,
# while streaming replication on the standby keeps trying to connect and reports connection failures.
# pgpool would only discover the failure, and fail over, the next time a client logs in through it and the connection fails.
#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
# Command executed on failover
failover_command = '/opt/pgpool/failover_stream.sh %H '
#------------------------------------------------------------------------------
# WATCHDOG
#------------------------------------------------------------------------------
# - Enabling -
use_watchdog = on
# - Watchdog communication Settings -
wd_hostname = 'slave' # this node
# Host name or IP address of this watchdog
# (change requires restart)
wd_port = 9000
# port number for watchdog service
# (change requires restart)
# - Virtual IP control Setting -
delegate_IP = 'vip'
# delegate IP address
# If this is empty, virtual IP never bring up.
# (change requires restart)
if_cmd_path = '/sbin'
# path to the directory where if_up/down_cmd exists
# (change requires restart)
if_up_cmd = 'ifconfig eth1:0 inet $_IP_$ netmask 255.255.255.0'
# startup delegate IP command
# (change requires restart)
# Replace eth1 with the NIC name of the local machine.
# Use the interface name that ifconfig reports locally.
if_down_cmd = 'ifconfig eth1:0 down'
# shutdown delegate IP command
# (change requires restart)
# Replace eth1 with the NIC name of the local machine.
# Use the interface name that ifconfig reports locally.
# -- heartbeat mode --
wd_heartbeat_port = 9694
# Port number for receiving heartbeat signal
# (change requires restart)
wd_heartbeat_keepalive = 2
# Interval time of sending heartbeat signal (sec)
# (change requires restart)
wd_heartbeat_deadtime = 30
# Deadtime interval for heartbeat signal (sec)
# (change requires restart)
heartbeat_destination0 = 'master' # the peer node
# Host name or IP address of destination 0
# for sending heartbeat signal.
# (change requires restart)
heartbeat_destination_port0 = 9694
# Port number of destination 0 for sending
# heartbeat signal. Usually this is the
# same as wd_heartbeat_port.
# (change requires restart)
heartbeat_device0 = 'eth1'
# Name of NIC device (such like 'eth0')
# used for sending/receiving heartbeat
# signal to/from destination 0.
# This works only when this is not empty
# and pgpool has root privilege.
# (change requires restart)
# Replace eth1 with the NIC name of the local machine.
# Use the interface name that ifconfig reports locally.
# - Other pgpool Connection Settings -
other_pgpool_hostname0 = 'master' # the peer node
# Host name or IP address to connect to for other pgpool 0
# (change requires restart)
other_pgpool_port0 = 9999
# Port number for other pgpool 0
# (change requires restart)
other_wd_port0 = 9000
# Port number for other watchdog 0
# (change requires restart)
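The two pgpool.conf files above differ only in three node-specific parameters: wd_hostname, heartbeat_destination0 and other_pgpool_hostname0. One way to keep them in step is to maintain a single file and derive the other from it. The following is a sketch under that assumption; `make_peer_conf` is a hypothetical helper, and it expects the values to be single-quoted as shown above.

```shell
#!/bin/sh
# Print a copy of a pgpool.conf with the node-specific host names swapped.
make_peer_conf() {
    # $1 = source pgpool.conf, $2 = host name of this node, $3 = host name of the peer
    sed -E \
        -e "s/^(wd_hostname[[:space:]]*=[[:space:]]*)'$2'/\1'$3'/" \
        -e "s/^(heartbeat_destination0[[:space:]]*=[[:space:]]*)'$3'/\1'$2'/" \
        -e "s/^(other_pgpool_hostname0[[:space:]]*=[[:space:]]*)'$3'/\1'$2'/" \
        "$1"
}

# Example, on master:
#   make_peer_conf /opt/pgpool/etc/pgpool.conf master slave > /tmp/pgpool.conf.slave
#   scp /tmp/pgpool.conf.slave postgres@slave:/opt/pgpool/etc/pgpool.conf
```

The backend_hostname settings are left untouched because both nodes must list the backends in the same order.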
Create /opt/pgpool/failover_stream.sh:
#! /bin/sh
# Failover command for streaming replication.
# Arguments: $1: new master hostname.
new_master=$1
trigger_command="$PGHOME/bin/pg_ctl promote -D $PGDATA"
# Promote the standby database.
/usr/bin/ssh -T $new_master $trigger_command
exit 0;
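pgpool runs failover_command whenever a backend is detached, including when the node that failed is the standby, and the script above attempts a promotion every time. A more cautious variant could promote only when the failed node was the primary. The sketch below assumes failover_command is changed to pass three placeholders, '/opt/pgpool/failover_stream.sh %d %P %H' (failed node id, old primary node id, new master host name). The function names and the SSH override variable are hypothetical.

```shell
#!/bin/sh
# Assumes: failover_command = '/opt/pgpool/failover_stream.sh %d %P %H'

# True when the node that failed was the primary.
should_promote() {
    # $1 = failed node id (%d), $2 = old primary node id (%P)
    [ "$1" = "$2" ]
}

failover() {
    failed_node=$1
    old_primary=$2
    new_master=$3
    if should_promote "$failed_node" "$old_primary"; then
        # SSH can be overridden for testing; it defaults to the real client.
        ${SSH:-/usr/bin/ssh} -T "$new_master" "$PGHOME/bin/pg_ctl promote -D $PGDATA"
    else
        echo "node $failed_node was a standby; nothing to promote"
    fi
}

# When invoked by pgpool with three arguments, act on them.
if [ $# -eq 3 ]; then
    failover "$@"
fi
```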
Set file ownership and permissions:
[root@master ~]# chown -R postgres:postgres /opt/pgpool
[root@master ~]# chmod 755 /opt/pgpool/failover_stream.sh
Create the log and run directories and give them to the postgres user:
[root@master ~]# mkdir /var/log/pgpool
[root@master ~]# chown -R postgres:postgres /var/log/pgpool
[root@master ~]# mkdir /var/run/pgpool
[root@master ~]# chown -R postgres:postgres /var/run/pgpool
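Start pgpool-II:
pgpool -n -d > /var/log/pgpool/pgpool.log 2>&1 &
Stop:
pgpool -m fast stop
Attach a node:
pcp_attach_node -d -U postgres -h vip -p 9898 -n 0
After a failover, once the failed node has been repaired and restarted, rejoining the cluster in streaming replication mode is likely to fail with a timeline mismatch error, because the data on the primary, or on the repaired node, has changed in the meantime:
# After the slave machine restarts, master and slave are out of sync, which produces: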
[postgres@slave data]$ mv recovery.done recovery.conf
[postgres@slave data]$ pg_ctl start
waiting for server to start....2017-07-24 19:31:44.563 PDT [2663] LOG: listening on IPv4 address "0.0.0.0", port 5432
2017-07-24 19:31:44.563 PDT [2663] LOG: listening on IPv6 address "::", port 5432
2017-07-24 19:31:44.565 PDT [2663] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2017-07-24 19:31:44.584 PDT [2664] LOG: database system was shut down at 2017-07-24 19:31:30 PDT
2017-07-24 19:31:44.618 PDT [2664] LOG: entering standby mode
2017-07-24 19:31:44.772 PDT [2664] LOG: consistent recovery state reached at 0/2D000098
2017-07-24 19:31:44.772 PDT [2663] LOG: database system is ready to accept read only connections
2017-07-24 19:31:44.772 PDT [2664] LOG: invalid record length at 0/2D000098: wanted 24, got 0
2017-07-24 19:31:44.798 PDT [2668] LOG: fetching timeline history file for timeline 11 from primary server
2017-07-24 19:31:44.826 PDT [2668] FATAL: could not start WAL streaming: ERROR: requested starting point 0/2D000000 on timeline 10 is not in this server's history
DETAIL: This server's history forked from timeline 10 at 0/2B0001B0.
2017-07-24 19:31:44.826 PDT [2664] LOG: new timeline 11 forked off current database system timeline 10 before current recovery point 0/2D000098
done
When this happens, use pg_rewind to resynchronize the timeline. The procedure has five steps.
[postgres@slave ~]$ pg_ctl stop
[postgres@slave data]$ pg_rewind --target-pgdata=/opt/pgsql11/data --source-server='host=master port=5432 user=postgres dbname=postgres password=123456'
servers diverged at WAL location 0/2B0001B0 on timeline 10
rewinding from last common checkpoint at 0/2B000108 on timeline 10
Done!
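# pg_hba.conf and recovery.done were both copied over from master; change them to slave's own settings
[postgres@slave ~]$ cd $PGDATA
[postgres@slave data]$ mv recovery.done recovery.conf
[postgres@slave data]$ vi pg_hba.conf
# change slave to master (the streaming replication peer from slave's point of view)
host replication repuser master md5
[postgres@slave data]$ vi recovery.conf
# change slave to master (the streaming replication peer from slave's point of view)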
primary_conninfo = 'host=master port=5432 user=repuser password=repuser'
[postgres@slave data]$ pg_ctl start
pcp_attach_node -d -U postgres -h vip -p 9898 -n 1
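The recovery sequence above can be collected into two small functions, with the manual edits of pg_hba.conf and recovery.conf happening between them. This is a sketch, not a tested procedure: `run`, `rewind_standby` and `start_and_attach` are hypothetical names, the connection string carries no password on the assumption that one is available through ~/.pgpass, and DRY_RUN=1 only prints the commands.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop the node, rewind it against the current primary, and restore recovery.conf.
rewind_standby() {
    # $1 = data directory of the node being repaired
    # $2 = connection string for the current primary
    run pg_ctl stop -D "$1" -m fast      # may fail harmlessly if already stopped
    run pg_rewind --target-pgdata="$1" --source-server="$2" || return 1
    run mv "$1/recovery.done" "$1/recovery.conf"
}

# After pg_hba.conf and recovery.conf have been edited by hand: start and re-attach.
start_and_attach() {
    # $1 = data directory, $2 = pgpool node id of this backend
    run pg_ctl start -D "$1" || return 1
    run pcp_attach_node -U postgres -h vip -p 9898 -n "$2"
}

# Example, on the repaired slave:
#   rewind_standby "$PGDATA" 'host=master port=5432 user=postgres dbname=postgres'
#   (edit pg_hba.conf and recovery.conf)
#   start_and_attach "$PGDATA" 1
```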
Pitfalls:
The failover_stream.sh file must end with a newline.
The VIP is bound as an alias of the local network interface: if the NIC is eth0, the alias is eth0:0; if the NIC is ens192, it must be set to ens192:0.
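Both pitfalls can be checked from the shell before pgpool is started. The helpers below are a sketch with hypothetical names: `default_nic` reads the output of `ip route show default` on standard input and prints the interface name, and `ends_with_newline` tests the last byte of a file.

```shell
#!/bin/sh
# Extract the interface name from a line such as
# "default via 192.168.8.1 dev ens192 proto static metric 100".
default_nic() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++)
            if ($i == "dev") { print $(i + 1); exit }
    }'
}

# True when the file is non-empty and its last byte is a newline.
ends_with_newline() {
    # Command substitution strips a trailing newline, so an empty result means the last byte was one.
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# Example:
#   nic=$(ip route show default | default_nic)
#   echo "use ${nic}:0 in if_up_cmd and if_down_cmd"
#   ends_with_newline /opt/pgpool/failover_stream.sh || echo >> /opt/pgpool/failover_stream.sh
```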