笔者近来遇到一个任务:在多台Docker主机之间搭建PostgreSQL12的主从复制、读写分离高可用集群运行环境。确保即便在极端情况下(某台Docker主机宕机甚至磁盘损坏)依然能够读取PostgreSQL中的数据,而不至于导致数据丢失。
在两台linux主机(或虚拟机)上分别安装好Docker运行环境,建议Docker使用最新版。笔者使用了两个CentOS 8.3虚拟机。一个虚拟机(命名为dockerserver1)由Hyper-V提供,另一个(命名为虚拟机dockerserver2)由VMWare来提供。
在每个虚拟机里准备好PostgreSQL 12.3的镜像。
在dockerserver1上为Master和Slave节点创建数据文件夹master、slave和repl,并设置chown 999:999权限。这一步不是必须的,但是建议这样做。因为数据放在容器中在极端情况下很难被恢复。
[root@dockerserver1 postgredata]# mkdir master
[root@dockerserver1 postgredata]# mkdir slave1
[root@dockerserver1 postgredata]# mkdir repl
[root@dockerserver1 postgredata]# chown 999:999 master
[root@dockerserver1 postgredata]# chown 999:999 slave1
[root@dockerserver1 postgredata]# chown 999:999 repl
通过PostgreSQL 12.3镜像创建master节点,使用上一步创建的master文件夹作为PostgreSQL的data文件夹:
[root@dockerserver1 postgredata]# docker run -d -p 5433:5432 -e POSTGRES_PASSWORD=postgres -v /DataBaseDatas/postgredata/master/:/var/lib/postgresql/data --name pg12master -h pg12master -e LANG="C.UTF-8" -e 'TZ=Asia/Shanghai' -m 256m --memory-swap -1 postgres:12.3
238b7234bd4b9662b90c695cbf0ba448a44276928b80deeed5d23382ef48e8c6
[root@dockerserver1 postgredata]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
238b7234bd4b postgres:12.3 "docker-entrypoint.s…" 27 hours ago Up 3 hours 0.0.0.0:5433->5432/tcp pg12master
说明:网上已有在Docker下创建PostgreSQL主从复制集群的教程多是基于网桥bridge来创建的,这里笔者不想这么复杂,所以省略了创建docker bridge这一步,直接基于默认的网桥172.17.0.1来创建容器。
[root@dockerserver1 postgredata]# docker exec -it -u postgres pg12master /bin/bash
postgres@pg12master:/$ psql
psql (12.3 (Debian 12.3-1.pgdg100+1))
Type "help" for help.
postgres=# CREATE ROLE repuser WITH LOGIN REPLICATION CONNECTION LIMIT 5
PASSWORD 'repuser';
CREATE ROLE
在dockerserver1上创建slave1节点:
[root@dockerserver1 postgredata]# docker run -d -p 5434:5432 --name pg12slave1 -h pg12slave1 -e LANG="C.UTF-8" -e 'TZ=Asia/Shanghai' -e "POSTGRES_DB=postgres" -e "POSTGRES_USER=postgres" -e "POSTGRES_PASSWORD=postgres" -v /DataBaseDatas/postgredata/slave:/var/lib/postgresql/data -v /DataBaseDatas/postgredata/repl:/var/lib/postgresql/repl -m 256m --memory-swap -1 postgres:12.3
a8b1f661c9739e49b2432148e6c003c560d01bd1ae4019142681a877064c02d7
查看slave1容器的IP地址:
[root@dockerserver1 postgredata]# docker inspect --format '{{.NetworkSettings.IPAddress}}' pg12slave1
172.17.0.3
打开master节点的pg_hba.conf,在最下面加入如下内容:
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 trust
# IPv6 local connections:
host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host replication all 0.0.0.0/0 md5
host replication repuser 172.17.0.0/24 md5
上述配置中的最后一行
host replication repuser 172.17.0.0/24 md5
即为新添加的配置。意为只要是同一个网段下的容器都可以访问。网上的教程基本都是配置的固定IP(即pg12slave1节点的IP地址172.17.0.3)。这里为了能够在同一个服务器的Docker环境下创建多个slave容器,所以直接将容器所在的网段172.17.0.*添加到了master的信任域里。
查找文件中以下几个参数,并调整如下:
archive_mode = on
archive_command = '/bin/date'
max_wal_senders = 10
wal_keep_segments = 16
synchronous_standby_names = '*'
将上述两个文件保存后,需要重启master节点:
[root@dockerserver1 postgredata]# docker restart pg12master
pg12master
进入pg12slave1容器内部,执行如下命令:
[root@dockerserver1 postgredata]# docker exec -it -u postgres pg12slave1 /bin/bash
postgres@pg12slave1:/$ pg_basebackup -R -D /var/lib/postgresql/repl -Fp -Xs -v -P -h 172.17.0.1 -p 5433 -U repuser
Password:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/90000D8 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_86"
78736/78736 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/90001E8
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed
说明:经笔者反复测试,上述pg_basebackup命令中的-h和-p必须使用pg12master容器所在主机的IP地址,以及映射到主机的端口号,否则会提示连接不到服务器。
[root@dockerserver1 postgredata]# docker stop pg12slave1 && docker rm pg12slave1
pg12slave1
[root@dockerserver1 postgredata]# rm -rf slave1
[root@dockerserver1 postgredata]# mv repl slave1
[root@dockerserver1 postgredata]# docker run -d --ip 172.17.0.5 -p 5434:5432 --name pg12slave -h pg12slave -e LANG="C.UTF-8" -e 'TZ=Asia/Shanghai' -e "POSTGRES_DB=postgres" -e "POSTGRES_USER=postgres" -e "POSTGRES_PASSWORD=postgres" -v /DataBaseDatas/postgredata/slave1:/var/lib/postgresql/data -m 256m --memory-swap -1 postgres:12.3
a8b1f661c9739e49b2432148e6c003c560d01bd1ae4019142681a877064c02d7
说明:注意此时pg12slave1容器已经不需要再挂载repl目录了。
用postgres用户登入pg12master容器内部,执行如下SQL语句,查看关联的从节点信息:
[root@dockerserver1 postgredata]# docker exec -it -u postgres pg12master /bin/bash
postgres@pg12master:/$ psql
psql (12.3 (Debian 12.3-1.pgdg100+1))
Type "help" for help.
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_l
ag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-----+----------+---------+------------------+--------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+--------
---+-----------+------------+---------------+------------+-------------------------------
149 | 16384 | repuser | walreceiver | 172.17.0.1 | | 51752 | 2021-11-25 15:21:28.80868+08 | | streaming | 0/A004518 | 0/A004518 | 0/A004518 | 0/A004518 |
| | | 1 | sync | 2021-11-25 18:42:51.27092+08
(2 rows)
postgres=#
能够看到上述记录时说明pg12master-pg12slave1主从复制集群环境搭建成功了。
可以按照上述的操作再在同一个服务器(虚拟机)dockerserver1上再创建更多的slave节点。
在dockerserver2服务器上创建pg12slave2的步骤与在dockerserver1上创建pg12slave1的步骤完全相同。只是在pg12slave2内执行pg_basebackup命令之前需要将dockerserver2的IP地址添加到pg12master的信任域中,即在pg_hba.conf中添加如下内容并重启pg12master容器:
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 trust
# IPv6 local connections:
host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host replication all 0.0.0.0/0 md5
host replication repuser 172.17.0.0/24 md5
host replication repuser 192.169.33.0/24 md5
说明:上述内容中最下面一行
host replication repuser 192.169.33.0/24 md5
其中的192.169.33.0即为dockerserver2的主机IP地址所在的网段。
然后按照同样的步骤执行pg_basebackup命令:
postgres@pg12slave2:/$ pg_basebackup -R -D /var/lib/postgresql/repl -Fp -Xs -v -P -h 192.169.33.2 -p 5433 -U repuser
Password:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/90000D8 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_86"
78736/78736 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/90001E8
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed
然后重新创建pg12slave2容器。成功后在pg12master节点内执行pg_stat_replication SQL查询,就可以看到如下信息:
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_l
ag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-----+----------+---------+------------------+--------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+--------
---+-----------+------------+---------------+------------+-------------------------------
99 | 16384 | repuser | walreceiver | 192.169.33.3 | | 47044 | 2021-11-25 14:59:24.511927+08 | 1099 | streaming | 0/A004518 | 0/A004518 | 0/A004518 | 0/A004518 |
| | | 1 | potential | 2021-11-26 02:42:57.577931+08
149 | 16384 | repuser | walreceiver | 172.17.0.1 | | 51752 | 2021-11-25 15:21:28.80868+08 | | streaming | 0/A004518 | 0/A004518 | 0/A004518 | 0/A004518 |
| | | 1 | sync | 2021-11-25 18:42:51.27092+08
(2 rows)
postgres=#
OK,大功告成!
在dockerserver1和dockerserver2内将对应的端口从防火墙中开放出来。然后使用数据库可视化工具去连接后,在主库pg12master中进行数据库创建,即可看到从库pg12slave1和pg12slave2均可实时同步变化后的信息:
以上就是在多个Docker主机之间创建一主多从PostgreSQL集群的搭建步骤。其中,只有master节点是可写的,其他的slave节点都是只读的。从而搭建了读写分离主从同步集群。
大家可以按照这样的步骤扩展其他的slave节点。
参考过的网址:
Docker创建postgesql主从复制保姆级示例_无问西东j的博客-CSDN博客
(57条消息) 使用 Docker 搭建 PostgreSQL 12 主从环境_寒冰屋的专栏-CSDN博客_docker postgres