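Operating system: CentOS Linux release 7.3.1611 (Core)
Database version: PostgreSQL 10.6
The environment is a streaming-replication cluster with one primary and two standbys, managed for high availability by corosync + pacemaker. One standby uses synchronous replication and takes part of the read load; the other uses asynchronous replication and serves as a real-time backup.
This morning a monitoring alert came in: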
Stack: corosync
Current DC: sh01-oscar-cmp-pp-pg03 (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition with quorum
Last updated: Wed Oct 16 10:17:02 2019
Last change: Wed Oct 16 10:15:17 2019 by root via crm_attribute on sh01-oscar-cmp-pp-pg03
3 nodes configured
11 resources configured
Online: [ sh01-oscar-cmp-pp-pg01 sh01-oscar-cmp-pp-pg02 sh01-oscar-cmp-pp-pg03 ]
Full list of resources:
fence-sh01-oscar-cmp-pp-pg01 (ocf::heartbeat:fence_check): Started sh01-oscar-cmp-pp-pg01
fence-sh01-oscar-cmp-pp-pg02 (ocf::heartbeat:fence_check): Started sh01-oscar-cmp-pp-pg02
fence-sh01-oscar-cmp-pp-pg03 (ocf::heartbeat:fence_check): Started sh01-oscar-cmp-pp-pg03
Resource Group: master-group
vip-master (ocf::heartbeat:IPaddr2): Started sh01-oscar-cmp-pp-pg03
Resource Group: slave-group
vip-slave (ocf::heartbeat:IPaddr2): Started sh01-oscar-cmp-pp-pg02
Master/Slave Set: msPostgresql [pgsql]
Masters: [ sh01-oscar-cmp-pp-pg03 ]
Slaves: [ sh01-oscar-cmp-pp-pg02 ]
Stopped: [ sh01-oscar-cmp-pp-pg01 ]
Clone Set: clnPingCheck [pingCheck]
Started: [ sh01-oscar-cmp-pp-pg01 sh01-oscar-cmp-pp-pg02 sh01-oscar-cmp-pp-pg03 ]
Failed Actions:
* pgsql_start_0 on sh01-oscar-cmp-pp-pg01 'unknown error' (1): call=84, status=complete, exitreason='My data may be inconsistent. You have to remove /var/lib/pgsql/tmp/PGSQL.lock file to force start.',
last-rc-change='Wed Oct 16 10:15:06 2019', queued=0ms, exec=167ms
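The output shows a fault on node 01. At the same time, Prometheus raised an alert that the service port on node 01 was no longer reachable.
After logging in to the host, I found that PostgreSQL on node 01 had shut down. The database log showed the problem: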
2019-10-16 10:15:02.651 CST [55400] LOG: server process (PID 16342) was terminated by signal 9: Killed
2019-10-16 10:15:02.651 CST [55400] LOG: terminating any other active server processes
2019-10-16 10:15:02.651 CST [20414] WARNING: terminating connection because of crash of another server process
2019-10-16 10:15:02.651 CST [20414] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-10-16 10:15:02.681 CST [20523] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2019-10-16 10:15:02.694 CST [55400] LOG: all server processes terminated; reinitializing
2019-10-16 10:15:02.785 CST [20551] LOG: database system was interrupted; last known up at 2019-10-16 10:12:42 CST
2019-10-16 10:15:02.787 CST [20552] FATAL: the database system is in recovery mode
2019-10-16 10:15:02.971 CST [20572] FATAL: the database system is in recovery mode
2019-10-16 10:15:03.050 CST [20551] LOG: database system was not properly shut down; automatic recovery in progress
2019-10-16 10:15:03.058 CST [20551] LOG: redo starts at 0/65CBB4A0
2019-10-16 10:15:03.164 CST [20686] FATAL: the database system is in recovery mode
2019-10-16 10:15:03.190 CST [20688] FATAL: the database system is in recovery mode
2019-10-16 10:15:03.195 CST [20689] FATAL: the database system is in recovery mode
2019-10-16 10:15:03.214 CST [20690] FATAL: the database system is in recovery mode
2019-10-16 10:15:03.260 CST [20551] LOG: invalid record length at 0/67D224D8: wanted 24, got 0
2019-10-16 10:15:03.260 CST [20551] LOG: redo done at 0/67D224B0
2019-10-16 10:15:03.260 CST [20551] LOG: last completed transaction was at log time 2019-10-16 10:15:02.434297+08
2019-10-16 10:15:03.391 CST [55400] LOG: database system is ready to accept connections
2019-10-16 10:15:03.419 CST [55400] LOG: received fast shutdown request
2019-10-16 10:15:03.421 CST [55400] LOG: aborting any active transactions
2019-10-16 10:15:03.423 CST [55400] LOG: worker process: logical replication launcher (PID 20811) exited with exit code 1
2019-10-16 10:15:03.425 CST [20804] LOG: shutting down
2019-10-16 10:15:03.462 CST [20859] FATAL: the database system is shutting down
2019-10-16 10:15:03.465 CST [20860] FATAL: the database system is shutting down
2019-10-16 10:15:03.491 CST [55400] LOG: database system is shut down
As the log shows, at 10:15:02 a PostgreSQL server process was killed outright with signal 9 (SIGKILL).
Next, the system log (/var/log/messages):
Oct 16 10:15:02 sh01-oscar-cmp-pp-pg01 kernel: Out of memory: Kill process 16342 (postgres) score 843 or sacrifice child
Oct 16 10:15:02 sh01-oscar-cmp-pp-pg01 kernel: Killed process 16342 (postgres) total-vm:8494044kB, anon-rss:3399704kB, file-rss:400kB, shmem-rss:21080kB
Oct 16 10:15:02 sh01-oscar-cmp-pp-pg01 kernel: postgres: page allocation failure: order:0, mode:0x2015a
So memory was exhausted and the kernel's OOM killer fired, killing the PostgreSQL process that was using the most memory.
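To see how much memory the killed process actually held, the "Killed process" line can be parsed and its kB figures converted. The following is a minimal sketch, assuming the line format shown in the excerpt above (the format differs between kernel versions); the function name `parse_oom_kill` is made up for illustration.

```python
import re

# Sample line copied from the /var/log/messages excerpt above.
LINE = ("Oct 16 10:15:02 sh01-oscar-cmp-pp-pg01 kernel: Killed process 16342 (postgres) "
        "total-vm:8494044kB, anon-rss:3399704kB, file-rss:400kB, shmem-rss:21080kB")

# Assumption: the kernel prints "Killed process <pid> (<name>) key:NNNkB, ..." as above.
PATTERN = re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) (?P<fields>.*)$")

def parse_oom_kill(line):
    """Return pid, process name and the per-field sizes in kB, or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    fields = re.findall(r"([\w-]+):(\d+)kB", m.group("fields"))
    return {
        "pid": int(m.group("pid")),
        "name": m.group("name"),
        "kb": {key: int(value) for key, value in fields},
    }

info = parse_oom_kill(LINE)
anon_gib = info["kb"]["anon-rss"] / 1024 / 1024  # kB -> GiB
print(f"pid={info['pid']} name={info['name']} anon-rss={anon_gib:.2f} GiB")
# prints "pid=16342 name=postgres anon-rss=3.24 GiB"
```

Roughly 3.24 GiB of anonymous memory in a single process is most of what this host has, which matches the OOM killer's choice of victim.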
Check the machine's total memory:
[root@sh01-oscar-cmp-pp-pg01 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           3.7G        194M        2.6G        149M        971M        3.1G
Swap:          2.0G        290M        1.7G
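The machine has less than 4 GB of memory,
yet shared_buffers in postgresql.conf was set to 3GB. Add the memory used by each session on top of that and the system ran out. After lowering shared_buffers to 1GB, the problem was resolved.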
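A quick way to sanity-check the setting is to express shared_buffers as a share of total RAM. This is a minimal sketch using the figures above (3.7G total from free -h, 3GB before and 1GB after the change); the helper name `share_of_ram` is made up for illustration. For comparison, the PostgreSQL documentation suggests about 25% of RAM as a reasonable starting value on a dedicated server.

```python
def share_of_ram(setting_gb, total_ram_gb):
    """Return a memory setting as a fraction of total RAM."""
    return setting_gb / total_ram_gb

TOTAL_RAM_GB = 3.7  # "total" column of free -h on node 01

before = share_of_ram(3.0, TOTAL_RAM_GB)  # shared_buffers = 3GB
after = share_of_ram(1.0, TOTAL_RAM_GB)   # shared_buffers = 1GB

print(f"before: {before:.0%} of RAM")  # prints "before: 81% of RAM"
print(f"after:  {after:.0%} of RAM")   # prints "after:  27% of RAM"
```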
For how to work out the memory budget, see this blog post:
How to calculate the memory consumed by PostgreSQL
The formula is:
max_connections*work_mem
+ max_connections*temp_buffers
+ shared_buffers
+ (autovacuum_max_workers * maintenance_work_mem)
Suppose PostgreSQL is configured as follows:
max_connections = 100
temp_buffers=32MB
work_mem=32MB
shared_buffers=19GB
autovacuum_max_workers = 3
maintenance_work_mem=1GB # default is 64MB
The estimated memory is then:
select(
(100*(32*1024*1024)::bigint)
+ (100*(32*1024*1024)::bigint)
+ (19*(1024*1024*1024)::bigint)
+ (3 * (1024*1024*1024)::bigint )
)::float8 / 1024 / 1024 / 1024;
-- output
28.25
At full load PostgreSQL would then peak at about 28.25 GB; with 32 GB of physical memory, 3.75 GB remains for the operating system.
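The same arithmetic can be written as a small helper, which makes it easy to try other settings. This is a minimal sketch and not part of the referenced post: the function name `estimate_peak_gib` and its parameters are made up for illustration. Like the formula itself, it is only a rough estimate, because a single query can use work_mem more than once (for example, once per sort or hash node).

```python
MB = 1024 ** 2
GB = 1024 ** 3

def estimate_peak_gib(max_connections, work_mem, temp_buffers,
                      shared_buffers, autovacuum_max_workers,
                      maintenance_work_mem):
    """Rough peak-memory estimate in GiB; all sizes are given in bytes."""
    total = (max_connections * work_mem
             + max_connections * temp_buffers
             + shared_buffers
             + autovacuum_max_workers * maintenance_work_mem)
    return total / GB

# Settings from the example configuration above.
peak = estimate_peak_gib(
    max_connections=100,
    work_mem=32 * MB,
    temp_buffers=32 * MB,
    shared_buffers=19 * GB,
    autovacuum_max_workers=3,
    maintenance_work_mem=1 * GB,
)
print(f"{peak:.2f} GiB")  # prints "28.25 GiB"
```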
With wal_buffers included, the formula becomes:
max_connections*work_mem
+ max_connections*temp_buffers
+ shared_buffers+wal_buffers
+ (autovacuum_max_workers * autovacuum_work_mem)
Suppose PostgreSQL is configured as follows:
max_connections = 100
temp_buffers=32MB
work_mem=32MB
shared_buffers=19GB
wal_buffers=16MB # the automatic default is capped at one WAL segment (--with-wal-segsize, 16MB by default)
autovacuum_max_workers = 3
maintenance_work_mem=1GB # autovacuum_work_mem defaults to -1, which falls back to this value
The estimated memory is then:
select(
(100*(32*1024*1024)::bigint)
+ (100*(32*1024*1024)::bigint)
+ (19*(1024*1024*1024)::bigint)
+ (16*1024*1024)::bigint
+ (3 * (1024*1024*1024)::bigint )
)::float8 / 1024 / 1024 / 1024;
-- output
28.265625
At full load PostgreSQL would then peak at about 28.27 GB; with 32 GB of physical memory, about 3.73 GB remains for the operating system.
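With autovacuum_work_mem set explicitly and one maintenance_work_mem allocation counted on top, the formula becomes: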
max_connections*work_mem
+ max_connections*temp_buffers
+ shared_buffers+wal_buffers
+ (autovacuum_max_workers * autovacuum_work_mem)
+ maintenance_work_mem
Suppose PostgreSQL is configured as follows:
max_connections = 100
temp_buffers=32MB
work_mem=32MB
shared_buffers=19GB
wal_buffers=262143kB
autovacuum_max_workers = 3
autovacuum_work_mem=256MB
maintenance_work_mem=2GB
The estimated memory is then:
select(
(100*(32*1024*1024)::bigint)
+ (100*(32*1024*1024)::bigint)
+ (19*(1024*1024*1024)::bigint)
+ (262143*1024)::bigint
+ (3 * (256*1024*1024)::bigint )
+ ( 2 * (1024*1024*1024)::bigint )
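)::float8 / 1024 / 1024 / 1024;
-- output (rounded)
28.25
At full load PostgreSQL would then peak at about 28.25 GB; with 32 GB of physical memory, 3.75 GB remains for the operating system. Size all memory-related settings against the actual hardware in this way.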
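The last two variants differ only in where the autovacuum memory comes from and in whether one extra maintenance_work_mem allocation is counted. The sketch below folds both into one function and reports what is left for the operating system. It is an illustration under stated assumptions, not the referenced post's code: the class `PgMemSettings`, the function `estimate_peak_gib` and the parameter `extra_maintenance_ops` are hypothetical names, and `autovacuum_work_mem=None` stands in for the -1 default that falls back to maintenance_work_mem.

```python
from dataclasses import dataclass
from typing import Optional

KB, MB, GB = 1024, 1024 ** 2, 1024 ** 3

@dataclass
class PgMemSettings:
    """Hypothetical container for the settings in the formulas; sizes in bytes."""
    max_connections: int
    work_mem: int
    temp_buffers: int
    shared_buffers: int
    wal_buffers: int
    autovacuum_max_workers: int
    maintenance_work_mem: int
    autovacuum_work_mem: Optional[int] = None  # None models the -1 default

def estimate_peak_gib(s, extra_maintenance_ops=0):
    """Rough peak-memory estimate in GiB for the given settings."""
    # Autovacuum workers use maintenance_work_mem when autovacuum_work_mem is unset.
    av_mem = s.maintenance_work_mem if s.autovacuum_work_mem is None else s.autovacuum_work_mem
    total = (s.max_connections * (s.work_mem + s.temp_buffers)
             + s.shared_buffers + s.wal_buffers
             + s.autovacuum_max_workers * av_mem
             + extra_maintenance_ops * s.maintenance_work_mem)
    return total / GB

# The second and third example configurations from the text above.
second = PgMemSettings(100, 32 * MB, 32 * MB, 19 * GB, 16 * MB, 3, 1 * GB)
third = PgMemSettings(100, 32 * MB, 32 * MB, 19 * GB, 262143 * KB, 3, 2 * GB,
                      autovacuum_work_mem=256 * MB)

RAM_GIB = 32
for name, settings, ops in (("second", second, 0), ("third", third, 1)):
    peak = estimate_peak_gib(settings, ops)
    print(f"{name}: peak={peak:.2f} GiB, left for OS={RAM_GIB - peak:.2f} GiB")
# prints "second: peak=28.27 GiB, left for OS=3.73 GiB"
# prints "third: peak=28.25 GiB, left for OS=3.75 GiB"
```

Keeping the settings in one structure makes it straightforward to plug in the values from a real postgresql.conf and compare the result with the output of free before changing anything.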