Environment:
more /etc/redhat-release
Red Hat Enterprise Linux Server release 7.5 (Maipo)
The OS was installed with the minimal installation option.
greenplum-db-5.16.0-rhel7-x86_64.zip
more /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.15.201 rhmdw
192.168.15.202 rhsdw1
192.168.15.203 rhsdw2
192.168.15.205 rhsdw03
192.168.15.206 rhsdw04
Greenplum: how to recover failed segments and switch the mirror and primary roles
-- Start Greenplum
[gpadmin@rhmdw ~]$ gpssh -f /gp/app/config/hostlist -e 'netstat -nltp | grep postgres'
[ rhmdw] netstat -nltp | grep postgres
[ rhmdw] (Not all processes could be identified, non-owned process info
[ rhmdw] will not be shown, you would have to be root to see it all.)
[rhsdw03] netstat -nltp | grep postgres
[rhsdw03] (Not all processes could be identified, non-owned process info
[rhsdw03] will not be shown, you would have to be root to see it all.)
[ rhsdw2] netstat -nltp | grep postgres
[ rhsdw2] (No info could be read for "-p": geteuid()=1000 but you should be root.)
[rhsdw04] netstat -nltp | grep postgres
[rhsdw04] (Not all processes could be identified, non-owned process info
[rhsdw04] will not be shown, you would have to be root to see it all.)
[ rhsdw1] netstat -nltp | grep postgres
[ rhsdw1] (No info could be read for "-p": geteuid()=1000 but you should be root.)
[gpadmin@rhmdw ~]$ gpstart -a
20190404:13:44:18:002757 gpstart:rhmdw:gpadmin-[INFO]:-Starting gpstart with args: -a
20190404:13:44:18:002757 gpstart:rhmdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20190404:13:44:18:002757 gpstart:rhmdw:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:13:44:18:002757 gpstart:rhmdw:gpadmin-[INFO]:-Greenplum Catalog Version: '301705051'
20190404:13:44:18:002757 gpstart:rhmdw:gpadmin-[INFO]:-Starting Master instance in admin mode
20190404:13:44:19:002757 gpstart:rhmdw:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20190404:13:44:19:002757 gpstart:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:13:44:19:002757 gpstart:rhmdw:gpadmin-[INFO]:-Setting new master era
20190404:13:44:19:002757 gpstart:rhmdw:gpadmin-[INFO]:-Master Started...
20190404:13:44:19:002757 gpstart:rhmdw:gpadmin-[INFO]:-Shutting down master
20190404:13:44:20:002757 gpstart:rhmdw:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
...
20190404:13:44:23:002757 gpstart:rhmdw:gpadmin-[INFO]:-Process results...
20190404:13:44:23:002757 gpstart:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:13:44:23:002757 gpstart:rhmdw:gpadmin-[INFO]:- Successful segment starts = 5 <<=============== 5 segments started successfully
20190404:13:44:23:002757 gpstart:rhmdw:gpadmin-[INFO]:- Failed segment starts = 0
20190404:13:44:23:002757 gpstart:rhmdw:gpadmin-[INFO]:- Skipped segment starts (segments are marked down in configuration) = 0
20190404:13:44:23:002757 gpstart:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:13:44:23:002757 gpstart:rhmdw:gpadmin-[INFO]:-Successfully started 5 of 5 segment instances
20190404:13:44:23:002757 gpstart:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:13:44:23:002757 gpstart:rhmdw:gpadmin-[INFO]:-Starting Master instance rhmdw directory /gp/gpdata/master/gpseg-1
20190404:13:44:25:002757 gpstart:rhmdw:gpadmin-[INFO]:-Command pg_ctl reports Master rhmdw instance active
20190404:13:44:25:002757 gpstart:rhmdw:gpadmin-[INFO]:-No standby master configured. skipping...
20190404:13:44:25:002757 gpstart:rhmdw:gpadmin-[INFO]:-Database successfully started
[gpadmin@rhmdw ~]$
-- Check the status:
[gpadmin@rhmdw ~]$ gpstate -m
20190404:13:45:20:002985 gpstate:rhmdw:gpadmin-[INFO]:-Starting gpstate with args: -m
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:--------------------------------------------------------------
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:--Current GPDB mirror list and status
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:--Type = Spread
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:--------------------------------------------------------------
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:- Mirror Datadir Port Status Data Status
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:- rhsdw2 /gp/gpdata/mirror/gpseg0 7000 Acting as Primary Change Tracking
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:- rhsdw1 /gp/gpdata/mirror/gpseg1 7000 Acting as Primary Change Tracking
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:- rhsdw04 /gp/gpdata/mirror/gpseg2 7000 Passive Synchronized
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:- rhsdw03 /gp/gpdata/mirror/gpseg3 7000 Acting as Primary Change Tracking
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[INFO]:--------------------------------------------------------------
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[WARNING]:-3 segment(s) configured as mirror(s) are acting as primaries
20190404:13:45:21:002985 gpstate:rhmdw:gpadmin-[WARNING]:-3 mirror segment(s) acting as primaries are in change tracking
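With many segments it is easy to misread the mirror table. A minimal sketch, assuming the gpstate -m row layout shown above (host, data directory, port, status, data status); the function name and the regular expression are illustrative and not part of Greenplum:

```python
import re

# Rows copied from the gpstate -m output above (log prefix trimmed).
SAMPLE = """\
rhsdw2 /gp/gpdata/mirror/gpseg0 7000 Acting as Primary Change Tracking
rhsdw1 /gp/gpdata/mirror/gpseg1 7000 Acting as Primary Change Tracking
rhsdw04 /gp/gpdata/mirror/gpseg2 7000 Passive Synchronized
rhsdw03 /gp/gpdata/mirror/gpseg3 7000 Acting as Primary Change Tracking
"""

# host, absolute data directory, numeric port, then the free-form status text.
ROW = re.compile(
    r"(?P<host>\S+)\s+(?P<datadir>/\S+)\s+(?P<port>\d+)\s+(?P<rest>.+)$"
)


def mirrors_acting_as_primary(text):
    """Return (host, datadir) for each mirror row reported as 'Acting as Primary'."""
    hits = []
    for line in text.splitlines():
        m = ROW.search(line)  # search() also tolerates the timestamp/log prefix
        if m and m.group("rest").startswith("Acting as Primary"):
            hits.append((m.group("host"), m.group("datadir")))
    return hits


if __name__ == "__main__":
    for host, datadir in mirrors_acting_as_primary(SAMPLE):
        print(host, datadir)
```

The header row and separator lines contain no absolute path, so they are skipped without special handling.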
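Status summary:
On hosts rhsdw2, rhsdw1 and rhsdw03 the mirrors have started up acting as primaries. Checking the ports shows that port 6000 (the primary port) is not listening on the failed nodes.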
[gpadmin@rhmdw ~]$ gpssh -f /gp/app/config/hostlist -e 'netstat -nltp | grep postgres'
[ rhsdw2] netstat -nltp | grep postgres
[ rhsdw2] (Not all processes could be identified, non-owned process info
[ rhsdw2] will not be shown, you would have to be root to see it all.)
[ rhsdw2] tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 13492/postgres
[ rhsdw2] tcp6 0 0 :::7000 :::* LISTEN 13492/postgres
[ rhmdw] netstat -nltp | grep postgres
[ rhmdw] (Not all processes could be identified, non-owned process info
[ rhmdw] will not be shown, you would have to be root to see it all.)
[ rhmdw] tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN 2833/postgres
[ rhmdw] tcp6 0 0 :::55763 :::* LISTEN 2840/postgres: 543
[ rhmdw] tcp6 0 0 :::5432 :::* LISTEN 2833/postgres
[rhsdw03] netstat -nltp | grep postgres
[rhsdw03] (Not all processes could be identified, non-owned process info
[rhsdw03] will not be shown, you would have to be root to see it all.)
[rhsdw03] tcp 0 0 192.168.15.205:9000 0.0.0.0:* LISTEN 25726/postgres: 60
[rhsdw03] tcp 0 0 0.0.0.0:6000 0.0.0.0:* LISTEN 25714/postgres
[rhsdw03] tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 25713/postgres
[rhsdw03] tcp6 0 0 :::6000 :::* LISTEN 25714/postgres
[rhsdw03] tcp6 0 0 :::7000 :::* LISTEN 25713/postgres
[ rhsdw1] netstat -nltp | grep postgres
[ rhsdw1] (Not all processes could be identified, non-owned process info
[ rhsdw1] will not be shown, you would have to be root to see it all.)
[ rhsdw1] tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 19957/postgres
[ rhsdw1] tcp6 0 0 :::7000 :::* LISTEN 19957/postgres
[rhsdw04] netstat -nltp | grep postgres
[rhsdw04] (Not all processes could be identified, non-owned process info
[rhsdw04] will not be shown, you would have to be root to see it all.)
[rhsdw04] tcp 0 0 192.168.15.206:8000 0.0.0.0:* LISTEN 13713/postgres: 70
[rhsdw04] tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 13707/postgres
[rhsdw04] tcp6 0 0 :::7000 :::* LISTEN 13707/postgres
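The same port check can be scripted. A minimal sketch, assuming the gpssh-prefixed netstat -nltp format shown above; the helper names are illustrative. The sample keeps only segment hosts, because the master listens on 5432 and would otherwise be reported as well, and a host with no postgres listener at all would not appear in the result.

```python
import re
from collections import defaultdict

# IPv4 listener lines copied from the gpssh/netstat output above (segment hosts only).
SAMPLE = """\
[ rhsdw2] tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 13492/postgres
[rhsdw03] tcp 0 0 0.0.0.0:6000 0.0.0.0:* LISTEN 25714/postgres
[rhsdw03] tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 25713/postgres
[ rhsdw1] tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 19957/postgres
[rhsdw04] tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 13707/postgres
"""

# "[host] tcp|tcp6 recv-q send-q local-addr:port foreign-addr LISTEN ..."
LINE = re.compile(
    r"^\[\s*(?P<host>[^\]\s]+)\]\s+tcp6?\s+\d+\s+\d+\s+\S*:(?P<port>\d+)\s+\S+\s+LISTEN"
)


def listening_ports(text):
    """Map each host to the set of TCP ports it is listening on."""
    ports = defaultdict(set)
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            ports[m.group("host")].add(int(m.group("port")))
    return dict(ports)


def hosts_missing_port(text, port):
    """Hosts that have at least one listener but none on the given port."""
    return sorted(h for h, p in listening_ports(text).items() if port not in p)


if __name__ == "__main__":
    print(hosts_missing_port(SAMPLE, 6000))
```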
gpstate -b shows that the postmaster.pid files of three primary segments are missing:
[gpadmin@rhmdw ~]$ gpstate -b
20190404:13:45:35:003051 gpstate:rhmdw:gpadmin-[INFO]:-Starting gpstate with args: -b
20190404:13:45:36:003051 gpstate:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:13:45:36:003051 gpstate:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:13:45:36:003051 gpstate:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:13:45:36:003051 gpstate:rhmdw:gpadmin-[INFO]:-Gathering data from segments...
..
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:-Greenplum instance status summary
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Master instance = Active
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Master standby = No master standby configured
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total segment instance count from metadata = 8
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Primary Segment Status
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segments = 4
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segment valid (at master) = 1
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[WARNING]:-Total primary segment failures (at master) = 3 <<<<<<<<
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[WARNING]:-Total number of postmaster.pid files missing = 3 <<<<<<<<
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files found = 1
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[WARNING]:-Total number of postmaster.pid PIDs missing = 3 <<<<<<<<
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs found = 1
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[WARNING]:-Total number of /tmp lock files missing = 3 <<<<<<<<
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files found = 1
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[WARNING]:-Total number postmaster processes missing = 3 <<<<<<<<
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes found = 1
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Mirror Segment Status
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segments = 4
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segment valid (at master) = 4
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segment failures (at master) = 0
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files missing = 0
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files found = 4
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs missing = 0
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs found = 4
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files missing = 0
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files found = 4
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes missing = 0
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes found = 4
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[WARNING]:-Total number mirror segments acting as primary segments = 3 <<<<<<<<
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total number mirror segments acting as mirror segments = 1
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
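To pull just the problem counters out of a long gpstate -b summary, the WARNING lines can be filtered. A minimal sketch, assuming the log line format shown above; the function name is illustrative. A list is returned rather than a dict because the same counter name appears in both the primary and the mirror section.

```python
import re

# A few lines copied from the gpstate -b output above.
SAMPLE = """\
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segments = 4
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segment valid (at master) = 1
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[WARNING]:-Total primary segment failures (at master) = 3 <<<<<<<<
20190404:13:45:38:003051 gpstate:rhmdw:gpadmin-[WARNING]:-Total number of postmaster.pid files missing = 3 <<<<<<<<
"""

# "-[LEVEL]:- <counter name> = <integer>"
LINE = re.compile(
    r"-\[(?P<level>[A-Z]+)\]:-\s*(?P<key>.+?)\s*=\s*(?P<value>\d+)\b"
)


def warning_counters(text):
    """Return (counter name, value) for every WARNING line carrying a counter."""
    found = []
    for line in text.splitlines():
        m = LINE.search(line)
        if m and m.group("level") == "WARNING":
            found.append((m.group("key"), int(m.group("value"))))
    return found


if __name__ == "__main__":
    for key, value in warning_counters(SAMPLE):
        print(f"{key}: {value}")
```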
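Check the processes on a healthy node: each segment is started by the postgres binary, so try starting the failed segments the same way on their own hosts.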
[root@rhsdw03 ~]# ps -ef |grep postgre
gpadmin 25713 1 0 13:43 ? 00:00:00 /gp/app/bin/postgres -D /gp/gpdata/mirror/gpseg3 -p 7000 --gp_dbid=8 --gp_num_contents_in_cluster=4 --silent-mode=true -i -M quiescent --gp_contentid=3
gpadmin 25714 1 0 13:43 ? 00:00:00 /gp/app/bin/postgres -D /gp/gpdata/primary/gpseg2 -p 6000 --gp_dbid=6 --gp_num_contents_in_cluster=4 --silent-mode=true -i -M quiescent --gp_contentid=2
gpadmin 25715 25713 0 13:43 ? 00:00:00 postgres: 7000, logger process
gpadmin 25716 25714 0 13:43 ? 00:00:00 postgres: 6000, logger process
gpadmin 25723 25713 0 13:43 ? 00:00:00 postgres: 7000, primary process
gpadmin 25724 25714 0 13:43 ? 00:00:00 postgres: 6000, primary process
gpadmin 25725 25723 0 13:43 ? 00:00:00 postgres: 7000, primary recovery process
gpadmin 25726 25724 0 13:43 ? 00:00:00 postgres: 6000, primary receiver ack process
gpadmin 25727 25724 0 13:43 ? 00:00:00 postgres: 6000, primary sender process
gpadmin 25728 25724 0 13:43 ? 00:00:00 postgres: 6000, primary consumer ack process
gpadmin 25729 25724 0 13:43 ? 00:00:00 postgres: 6000, primary recovery process
gpadmin 25732 25713 0 13:43 ? 00:00:00 postgres: 7000, stats collector process
gpadmin 25733 25713 0 13:43 ? 00:00:00 postgres: 7000, writer process
gpadmin 25734 25713 0 13:43 ? 00:00:00 postgres: 7000, checkpointer process
gpadmin 25735 25713 0 13:43 ? 00:00:00 postgres: 7000, sweeper process
gpadmin 25736 25713 0 13:43 ? 00:00:00 postgres: 7000, stats sender process
gpadmin 25737 25713 0 13:43 ? 00:00:00 postgres: 7000, wal writer process
gpadmin 25740 25714 0 13:43 ? 00:00:00 postgres: 6000, stats collector process
gpadmin 25741 25714 0 13:43 ? 00:00:00 postgres: 6000, writer process
gpadmin 25742 25714 0 13:43 ? 00:00:00 postgres: 6000, checkpointer process
gpadmin 25743 25714 0 13:43 ? 00:00:00 postgres: 6000, sweeper process
gpadmin 25744 25714 0 13:43 ? 00:00:00 postgres: 6000, stats sender process
gpadmin 25745 25714 0 13:43 ? 00:00:00 postgres: 6000, wal writer process
root 26091 24220 0 13:51 pts/8 00:00:00 grep --color=auto postgre
Startup options (contents of postmaster.opts):
[gpadmin@rhsdw03 gpseg2]$ more /gp/gpdata/primary/gpseg2/postmaster.opts
/gp/app/bin/postgres "-D" "/gp/gpdata/primary/gpseg2" "-p" "6000" "--gp_dbid=6" "--gp_num_contents_in_cluster=4" "--silent-mode=true" "-i" "-M" "quiescent" "--gp_contentid=2"
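Each segment's postmaster.opts holds the exact command line it was last started with, so the values for -D, -p, --gp_dbid and --gp_contentid should be read from the failed segment's own file rather than copied from another segment. A minimal sketch of splitting that line, using the sample above; the function name is illustrative:

```python
import shlex

# Line copied from /gp/gpdata/primary/gpseg2/postmaster.opts above.
OPTS_LINE = (
    '/gp/app/bin/postgres "-D" "/gp/gpdata/primary/gpseg2" "-p" "6000" '
    '"--gp_dbid=6" "--gp_num_contents_in_cluster=4" "--silent-mode=true" '
    '"-i" "-M" "quiescent" "--gp_contentid=2"'
)


def parse_postmaster_opts(line):
    """Split a postmaster.opts line into argv and a dict of the notable options."""
    argv = shlex.split(line)  # honours the double quotes around every argument
    info = {"binary": argv[0]}
    i = 1
    while i < len(argv):
        arg = argv[i]
        if arg in ("-D", "-p", "-M") and i + 1 < len(argv):
            info[arg] = argv[i + 1]  # option followed by a separate value
            i += 2
        elif arg.startswith("--") and "=" in arg:
            key, value = arg[2:].split("=", 1)  # --name=value form
            info[key] = value
            i += 1
        else:
            i += 1  # bare flag such as -i
    return argv, info


if __name__ == "__main__":
    argv, info = parse_postmaster_opts(OPTS_LINE)
    print(info["-D"], info["-p"], info["gp_dbid"], info["gp_contentid"])
```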
Start the failed segment:
[gpadmin@rhsdw04 gpseg3]$ /gp/app/bin/postgres "-D" "/gp/gpdata/primary/gpseg3" "-p" "6000" "--gp_dbid=7" "--gp_num_contents_in_cluster=4" "--silent-mode=true" "-i" "-M" "quiescent" "--gp_contentid=3"
[gpadmin@rhsdw04 gpseg3]$ ps -ef | grep postgres
gpadmin 13707 1 0 13:42 ? 00:00:00 /gp/app/bin/postgres -D /gp/gpdata/mirror/gpseg2 -p 7000 --gp_dbid=9 --gp_num_contents_in_cluster=4 --silent-mode=true -i -M quiescent --gp_contentid=2
gpadmin 13708 13707 0 13:42 ? 00:00:00 postgres: 7000, logger process
gpadmin 13712 13707 0 13:42 ? 00:00:00 postgres: 7000, mirror process
gpadmin 13713 13712 0 13:42 ? 00:00:00 postgres: 7000, mirror receiver process
gpadmin 13714 13712 0 13:42 ? 00:00:00 postgres: 7000, mirror consumer process
gpadmin 13715 13712 0 13:42 ? 00:00:00 postgres: 7000, mirror consumer writer process
gpadmin 13716 13712 0 13:42 ? 00:00:00 postgres: 7000, mirror consumer append only process
gpadmin 13717 13712 0 13:42 ? 00:00:00 postgres: 7000, mirror sender ack process
gpadmin 13867 1 1 13:56 ? 00:00:00 /gp/app/bin/postgres -D /gp/gpdata/primary/gpseg3 -p 6000 --gp_dbid=7 --gp_num_contents_in_cluster=4 --silent-mode=true -i -M quiescent --gp_contentid=3
gpadmin 13868 13867 0 13:56 ? 00:00:00 postgres: 6000, logger process
gpadmin 13870 12742 0 13:56 pts/7 00:00:00 grep --color=auto postgres
Start the other two failed segments in the same way:
[gpadmin@rhsdw1 pg_log]$ /gp/app/bin/postgres "-D" "/gp/gpdata/primary/gpseg0" "-p" "6000" "--gp_dbid=2" "--gp_num_contents_in_cluster=4" "--silent-mode=true" "-i" "-M" "quiescent" "--gp_contentid=0"
[gpadmin@rhsdw2 ~]$ /gp/app/bin/postgres "-D" "/gp/gpdata/primary/gpseg1" "-p" "6000" "--gp_dbid=3" "--gp_num_contents_in_cluster=4" "--silent-mode=true" "-i" "-M" "quiescent" "--gp_contentid=1"
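Check the status from the master node. The postmaster.pid files are now found, but the master still reports three failed primaries: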
[gpadmin@rhmdw ~]$ gpstate -b
20190404:14:01:08:005024 gpstate:rhmdw:gpadmin-[INFO]:-Starting gpstate with args: -b
20190404:14:01:08:005024 gpstate:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:14:01:08:005024 gpstate:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:14:01:08:005024 gpstate:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:01:08:005024 gpstate:rhmdw:gpadmin-[INFO]:-Gathering data from segments...
..
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:-Greenplum instance status summary
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Master instance = Active
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Master standby = No master standby configured
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total segment instance count from metadata = 8
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Primary Segment Status
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segments = 4
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segment valid (at master) = 1
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[WARNING]:-Total primary segment failures (at master) = 3 <<<<<<<<
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files missing = 0
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files found = 4
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs missing = 0
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs found = 4
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files missing = 0
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files found = 4
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes missing = 0
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes found = 4
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Mirror Segment Status
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segments = 4
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segment valid (at master) = 4
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segment failures (at master) = 0
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files missing = 0
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files found = 4
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs missing = 0
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs found = 4
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files missing = 0
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files found = 4
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes missing = 0
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes found = 4
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[WARNING]:-Total number mirror segments acting as primary segments = 3 <<<<<<<<
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:- Total number mirror segments acting as mirror segments = 1
20190404:14:01:10:005024 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
Try to rebalance; gprecoverseg -r still reports that down segments exist:
[gpadmin@rhmdw ~]$ gprecoverseg -r
20190404:14:01:51:005157 gprecoverseg:rhmdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -r
20190404:14:01:51:005157 gprecoverseg:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:14:01:51:005157 gprecoverseg:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:14:01:51:005157 gprecoverseg:rhmdw:gpadmin-[INFO]:-Checking if segments are ready to connect
20190404:14:01:51:005157 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:01:52:005157 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:01:52:005157 gprecoverseg:rhmdw:gpadmin-[CRITICAL]:-gprecoverseg failed. (Reason='Down segments still exist. All segments must be up to rebalance.') exiting...
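The error above says that a rebalance needs every segment to be up. gprecoverseg run without options performs an incremental recovery of segments marked down, and gprecoverseg -r only swaps roles back afterwards. A minimal sketch of that ordering; the function and its inputs are hypothetical, the counts would come from gpstate, and it does not check that resynchronization has finished:

```python
def next_recovery_step(down_segments, mirrors_acting_as_primary):
    """Suggest the next gprecoverseg invocation from two gpstate counters."""
    if down_segments > 0:
        # Down segments must be recovered first; -r refuses to run otherwise.
        return "gprecoverseg"
    if mirrors_acting_as_primary > 0:
        # Everything is up, but some roles are still swapped.
        return "gprecoverseg -r"
    return "nothing to do"


if __name__ == "__main__":
    # State at this point in the log: 3 primaries down, 3 mirrors acting as primary.
    print(next_recovery_step(3, 3))
```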
Stop the whole cluster and start it again:
[gpadmin@rhmdw ~]$ gpstop -a
20190404:14:03:05:005432 gpstop:rhmdw:gpadmin-[INFO]:-Starting gpstop with args: -a
20190404:14:03:05:005432 gpstop:rhmdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20190404:14:03:05:005432 gpstop:rhmdw:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20190404:14:03:05:005432 gpstop:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:03:05:005432 gpstop:rhmdw:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:14:03:05:005432 gpstop:rhmdw:gpadmin-[INFO]:-There are 0 connections to the database
20190404:14:03:05:005432 gpstop:rhmdw:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='smart'
20190404:14:03:05:005432 gpstop:rhmdw:gpadmin-[INFO]:-Master host=rhmdw
20190404:14:03:05:005432 gpstop:rhmdw:gpadmin-[INFO]:-Commencing Master instance shutdown with mode=smart
20190404:14:03:05:005432 gpstop:rhmdw:gpadmin-[INFO]:-Master segment instance directory=/gp/gpdata/master/gpseg-1
20190404:14:03:06:005432 gpstop:rhmdw:gpadmin-[INFO]:-Attempting forceful termination of any leftover master process
20190404:14:03:06:005432 gpstop:rhmdw:gpadmin-[INFO]:-Terminating processes for segment /gp/gpdata/master/gpseg-1
20190404:14:03:06:005432 gpstop:rhmdw:gpadmin-[ERROR]:-Failed to kill processes for segment /gp/gpdata/master/gpseg-1: ([Errno 3] No such process)
20190404:14:03:06:005432 gpstop:rhmdw:gpadmin-[INFO]:-No standby master host configured
20190404:14:03:06:005432 gpstop:rhmdw:gpadmin-[INFO]:-Targeting dbid [4, 2, 5, 3, 6, 9, 8, 7] for shutdown
20190404:14:03:07:005432 gpstop:rhmdw:gpadmin-[INFO]:-Commencing parallel primary segment instance shutdown, please wait...
20190404:14:03:07:005432 gpstop:rhmdw:gpadmin-[INFO]:-0.00% of jobs completed
20190404:14:03:08:005432 gpstop:rhmdw:gpadmin-[INFO]:-100.00% of jobs completed
20190404:14:03:08:005432 gpstop:rhmdw:gpadmin-[INFO]:-Commencing parallel mirror segment instance shutdown, please wait...
20190404:14:03:08:005432 gpstop:rhmdw:gpadmin-[INFO]:-0.00% of jobs completed
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[INFO]:-100.00% of jobs completed
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[INFO]:- Segments stopped successfully = 8
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[INFO]:- Segments with errors during stop = 0
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[INFO]:-
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[WARNING]:-Segments that are currently marked down in configuration = 3 <<<<<<<<
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[INFO]:- (stop was still attempted on these segments)
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[INFO]:-Successfully shutdown 8 of 8 segment instances
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[INFO]:-Database successfully shutdown with no errors reported
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[INFO]:-Cleaning up leftover gpmmon process
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[INFO]:-No leftover gpmmon process found
20190404:14:03:10:005432 gpstop:rhmdw:gpadmin-[INFO]:-Cleaning up leftover gpsmon processes
20190404:14:03:11:005432 gpstop:rhmdw:gpadmin-[INFO]:-No leftover gpsmon processes on some hosts. not attempting forceful termination on these hosts
20190404:14:03:11:005432 gpstop:rhmdw:gpadmin-[INFO]:-Cleaning up leftover shared memory
[gpadmin@rhmdw ~]$ gpssh -f /gp/app/config/hostlist -e 'netstat -nltp | grep postgres'
[rhsdw04] netstat -nltp | grep postgres
[rhsdw04] (Not all processes could be identified, non-owned process info
[rhsdw04] will not be shown, you would have to be root to see it all.)
[ rhmdw] netstat -nltp | grep postgres
[ rhmdw] (Not all processes could be identified, non-owned process info
[ rhmdw] will not be shown, you would have to be root to see it all.)
[ rhsdw2] netstat -nltp | grep postgres
[ rhsdw2] (No info could be read for "-p": geteuid()=1000 but you should be root.)
[ rhsdw1] netstat -nltp | grep postgres
[ rhsdw1] (No info could be read for "-p": geteuid()=1000 but you should be root.)
[rhsdw03] netstat -nltp | grep postgres
[rhsdw03] (Not all processes could be identified, non-owned process info
[rhsdw03] will not be shown, you would have to be root to see it all.)
[gpadmin@rhmdw ~]$ gpstart -a
20190404:14:03:34:005658 gpstart:rhmdw:gpadmin-[INFO]:-Starting gpstart with args: -a
20190404:14:03:34:005658 gpstart:rhmdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20190404:14:03:34:005658 gpstart:rhmdw:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:14:03:34:005658 gpstart:rhmdw:gpadmin-[INFO]:-Greenplum Catalog Version: '301705051'
20190404:14:03:34:005658 gpstart:rhmdw:gpadmin-[INFO]:-Starting Master instance in admin mode
20190404:14:03:35:005658 gpstart:rhmdw:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20190404:14:03:35:005658 gpstart:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:03:35:005658 gpstart:rhmdw:gpadmin-[INFO]:-Setting new master era
20190404:14:03:36:005658 gpstart:rhmdw:gpadmin-[INFO]:-Master Started...
20190404:14:03:36:005658 gpstart:rhmdw:gpadmin-[INFO]:-Shutting down master
20190404:14:03:37:005658 gpstart:rhmdw:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
...
20190404:14:03:40:005658 gpstart:rhmdw:gpadmin-[INFO]:-Process results...
20190404:14:03:40:005658 gpstart:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:03:40:005658 gpstart:rhmdw:gpadmin-[INFO]:- Successful segment starts = 5
20190404:14:03:40:005658 gpstart:rhmdw:gpadmin-[INFO]:- Failed segment starts = 0
20190404:14:03:40:005658 gpstart:rhmdw:gpadmin-[INFO]:- Skipped segment starts (segments are marked down in configuration) = 0
20190404:14:03:40:005658 gpstart:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:03:40:005658 gpstart:rhmdw:gpadmin-[INFO]:-Successfully started 5 of 5 segment instances
20190404:14:03:40:005658 gpstart:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:03:40:005658 gpstart:rhmdw:gpadmin-[INFO]:-Starting Master instance rhmdw directory /gp/gpdata/master/gpseg-1
20190404:14:03:41:005658 gpstart:rhmdw:gpadmin-[INFO]:-Command pg_ctl reports Master rhmdw instance active
20190404:14:03:41:005658 gpstart:rhmdw:gpadmin-[INFO]:-No standby master configured. skipping...
20190404:14:03:41:005658 gpstart:rhmdw:gpadmin-[INFO]:-Database successfully started
[gpadmin@rhmdw ~]$ gpstate -b
20190404:14:03:45:005828 gpstate:rhmdw:gpadmin-[INFO]:-Starting gpstate with args: -b
20190404:14:03:45:005828 gpstate:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:14:03:45:005828 gpstate:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:14:03:45:005828 gpstate:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:03:45:005828 gpstate:rhmdw:gpadmin-[INFO]:-Gathering data from segments...
..
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:-Greenplum instance status summary
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Master instance = Active
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Master standby = No master standby configured
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total segment instance count from metadata = 8
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Primary Segment Status
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segments = 4
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segment valid (at master) = 1
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[WARNING]:-Total primary segment failures (at master) = 3 <<<<<<<<
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[WARNING]:-Total number of postmaster.pid files missing = 3 <<<<<<<<
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files found = 1
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[WARNING]:-Total number of postmaster.pid PIDs missing = 3 <<<<<<<<
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs found = 1
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[WARNING]:-Total number of /tmp lock files missing = 3 <<<<<<<<
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files found = 1
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[WARNING]:-Total number postmaster processes missing = 3 <<<<<<<<
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes found = 1
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Mirror Segment Status
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segments = 4
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segment valid (at master) = 4
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segment failures (at master) = 0
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files missing = 0
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files found = 4
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs missing = 0
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs found = 4
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files missing = 0
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files found = 4
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes missing = 0
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes found = 4
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[WARNING]:-Total number mirror segments acting as primary segments = 3 <<<<<<<<
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:- Total number mirror segments acting as mirror segments = 1
20190404:14:03:47:005828 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
[gpadmin@rhmdw ~]$ gpssh -f /gp/app/config/hostlist -e 'netstat -nltp | grep postgres'
[ rhmdw] netstat -nltp | grep postgres
[ rhmdw] (Not all processes could be identified, non-owned process info
[ rhmdw] will not be shown, you would have to be root to see it all.)
[ rhmdw] tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN 5734/postgres
[ rhmdw] tcp6 0 0 :::50430 :::* LISTEN 5741/postgres: 543
[ rhmdw] tcp6 0 0 :::5432 :::* LISTEN 5734/postgres
[ rhsdw2] netstat -nltp | grep postgres
[ rhsdw2] (Not all processes could be identified, non-owned process info
[ rhsdw2] will not be shown, you would have to be root to see it all.)
[ rhsdw2] tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 14227/postgres
[ rhsdw2] tcp6 0 0 :::7000 :::* LISTEN 14227/postgres
[ rhsdw1] netstat -nltp | grep postgres
[ rhsdw1] (Not all processes could be identified, non-owned process info
[ rhsdw1] will not be shown, you would have to be root to see it all.)
[ rhsdw1] tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 21653/postgres
[ rhsdw1] tcp6 0 0 :::7000 :::* LISTEN 21653/postgres
[rhsdw03] netstat -nltp | grep postgres
[rhsdw03] (Not all processes could be identified, non-owned process info
[rhsdw03] will not be shown, you would have to be root to see it all.)
[rhsdw03] tcp 0 0 192.168.15.205:9000 0.0.0.0:* LISTEN 27363/postgres: 60
[rhsdw03] tcp 0 0 0.0.0.0:6000 0.0.0.0:* LISTEN 27349/postgres
[rhsdw03] tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 27350/postgres
[rhsdw03] tcp6 0 0 :::6000 :::* LISTEN 27349/postgres
[rhsdw03] tcp6 0 0 :::7000 :::* LISTEN 27350/postgres
[rhsdw04] netstat -nltp | grep postgres
[rhsdw04] (Not all processes could be identified, non-owned process info
[rhsdw04] will not be shown, you would have to be root to see it all.)
[rhsdw04] tcp 0 0 192.168.15.206:8000 0.0.0.0:* LISTEN 14304/postgres: 70
[rhsdw04] tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 14298/postgres
[rhsdw04] tcp6 0 0 :::7000 :::* LISTEN 14298/postgres
Check the status of the database segments:
[gpadmin@rhmdw ~]$ psql -d postgres
psql (8.3.23)
Type "help" for help.
postgres=# select * from gp_segment_configuration order by hostname;
dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port
------+---------+------+----------------+------+--------+------+----------+---------+------------------
1 | -1 | p | p | s | u | 5432 | rhmdw | rhmdw |
6 | 2 | p | p | s | u | 6000 | rhsdw03 | rhsdw03 | 9000
8 | 3 | p | m | c | u | 7000 | rhsdw03 | rhsdw03 | 8000
9 | 2 | m | m | s | u | 7000 | rhsdw04 | rhsdw04 | 8000
7 | 3 | m | p | s | d | 6000 | rhsdw04 | rhsdw04 | 9000
2 | 0 | m | p | s | d | 6000 | rhsdw1 | rhsdw1 | 9000
5 | 1 | p | m | c | u | 7000 | rhsdw1 | rhsdw1 | 8000
3 | 1 | m | p | s | d | 6000 | rhsdw2 | rhsdw2 | 9000
4 | 0 | p | m | c | u | 7000 | rhsdw2 | rhsdw2 | 8000
(9 rows)
Meaning of the mode and status columns:
mode = s, c, r: synced, change logging, resyncing (fully synchronized, tracking changed blocks, resynchronization in progress)
status = u, d: up, down
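The single-letter codes are easy to misread in a wide table. As a minimal sketch (the function name `describe` and the lookup tables are illustrative, not part of Greenplum), the codes from the query output above can be expanded like this:

```python
# Lookup tables for the single-letter codes in gp_segment_configuration,
# as they appear in the Greenplum 5.16 output of this walkthrough.
ROLE = {"p": "primary", "m": "mirror"}
MODE = {"s": "synced", "c": "change logging", "r": "resyncing"}
STATUS = {"u": "up", "d": "down"}


def describe(role, preferred_role, mode, status):
    """Return a human-readable summary of one segment row (hypothetical helper)."""
    return "{} (preferred {}), {}, {}".format(
        ROLE[role], ROLE[preferred_role], MODE[mode], STATUS[status])


# dbid 8 in the table above: a mirror that is currently acting as primary.
print(describe("p", "m", "c", "u"))
# dbid 7 in the table above: the original primary, now marked down.
print(describe("m", "p", "s", "d"))
```

A row whose role differs from its preferred role is a segment that has failed over; a status of d marks the instance that still needs recovery.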
View the detailed cluster information; the excerpt below shows one of the problem hosts:
[gpadmin@rhmdw ~]$ gpstate -s
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Segment Info
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Hostname = rhsdw04
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Address = rhsdw04
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Datadir = /gp/gpdata/mirror/gpseg2
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Port = 7000
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Mirroring Info
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Current role = Mirror
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Preferred role = Mirror
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Mirror status = Synchronized
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Status
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- PID = 14619
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Configuration reports status as = Up
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Segment status = Up
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Segment Info
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Hostname = rhsdw04
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Address = rhsdw04
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Datadir = /gp/gpdata/primary/gpseg3
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Port = 6000
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Mirroring Info
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Current role = Mirror
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Preferred role = Primary
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[WARNING]:- Mirror status = Out of Sync <<<<<<<<
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Status
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[WARNING]:- PID = Not found <<<<<<<<
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[WARNING]:- Configuration reports status as = Down <<<<<<<<
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[WARNING]:- Segment status = Down in configuration <<<<<<<<
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[WARNING]:-*****************************************************
On the problem host the mirror instance is healthy, but the original primary instance is stopped and cannot synchronize data.
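With more hosts, scanning gpstate -s output by eye gets tedious. The following is a rough sketch that pulls the "key = value" pairs out of log lines shaped like the excerpt above and lists the instances whose segment status is not Up. The regular expression and the helper names (`parse_segments`, `down_segments`) are assumptions based only on the lines shown here, not a documented gpstate format.

```python
import re

# Matches "<key> = <value>" after the "-[INFO]:-" / "-[WARNING]:-" prefix and
# ignores the trailing "<<<<<<<<" marker that gpstate appends to warnings.
KV = re.compile(r"-\[(?:INFO|WARNING)\]:-\s*(.*?)\s*=\s*(.*?)\s*(?:<<<<<<<<)?\s*$")


def parse_segments(log_text):
    """Group key/value lines into one dict per 'Segment Info' block."""
    segments = []
    for line in log_text.splitlines():
        if line.rstrip().endswith("Segment Info"):
            segments.append({})
            continue
        match = KV.search(line)
        if match and segments:
            segments[-1][match.group(1)] = match.group(2)
    return segments


def down_segments(segments):
    """Return the blocks whose 'Segment status' is anything other than Up."""
    return [s for s in segments if s.get("Segment status") != "Up"]


# Shortened copy of the two blocks shown above.
SAMPLE = """\
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Segment Info
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Hostname = rhsdw04
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Datadir = /gp/gpdata/mirror/gpseg2
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Segment status = Up
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Segment Info
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Hostname = rhsdw04
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[INFO]:- Datadir = /gp/gpdata/primary/gpseg3
20190404:14:31:13:009593 gpstate:rhmdw:gpadmin-[WARNING]:- Segment status = Down in configuration <<<<<<<<
"""

for seg in down_segments(parse_segments(SAMPLE)):
    print(seg["Hostname"], seg["Datadir"], seg["Segment status"])
```

In practice the text would come from a saved gpstate -s run rather than an inline string.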
Recover the data with gprecoverseg. First write the recovery configuration to a file:
[gpadmin@rhmdw ~]$ gprecoverseg -o ./recov
20190404:14:45:08:011212 gprecoverseg:rhmdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -o ./recov
20190404:14:45:08:011212 gprecoverseg:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:14:45:08:011212 gprecoverseg:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:14:45:08:011212 gprecoverseg:rhmdw:gpadmin-[INFO]:-Checking if segments are ready to connect
20190404:14:45:08:011212 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:45:08:011212 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:45:10:011212 gprecoverseg:rhmdw:gpadmin-[INFO]:-Configuration file output to ./recov successfully.
Check the generated file:
[gpadmin@rhmdw ~]$ more ./recov
filespaceOrder=
rhsdw1:6000:/gp/gpdata/primary/gpseg0
rhsdw2:6000:/gp/gpdata/primary/gpseg1
rhsdw04:6000:/gp/gpdata/primary/gpseg3
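Before feeding the file back with -i, it can be worth checking that it lists exactly the instances you expect. A small sketch, assuming only the three-field host:port:datadir lines seen above (other line shapes are not handled, and `parse_recovery_config` is a hypothetical helper name):

```python
def parse_recovery_config(text):
    """Parse host:port:datadir lines, skipping blanks and the filespaceOrder header."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("filespaceOrder="):
            continue
        # Split on the first two colons only, so the data directory stays intact.
        host, port, datadir = line.split(":", 2)
        entries.append({"host": host, "port": int(port), "datadir": datadir})
    return entries


# Contents of ./recov as shown above.
RECOV = """\
filespaceOrder=
rhsdw1:6000:/gp/gpdata/primary/gpseg0
rhsdw2:6000:/gp/gpdata/primary/gpseg1
rhsdw04:6000:/gp/gpdata/primary/gpseg3
"""

for entry in parse_recovery_config(RECOV):
    print(entry["host"], entry["port"], entry["datadir"])
```

The three entries correspond to the three rows with status d in gp_segment_configuration.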
Start the recovery:
[gpadmin@rhmdw ~]$ gprecoverseg -i ./recov
20190404:14:46:02:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -i ./recov
20190404:14:46:02:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:14:46:02:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:14:46:02:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Checking if segments are ready to connect
20190404:14:46:02:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:46:03:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Heap checksum setting is consistent between master and the segments that are candidates for recoverseg
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Recovery from configuration -i option supplied
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Recovery 1 of 3
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Synchronization mode = Incremental
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance host = rhsdw1
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance address = rhsdw1
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance directory = /gp/gpdata/primary/gpseg0
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance port = 6000
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance replication port = 9000
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance host = rhsdw2
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance address = rhsdw2
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance directory = /gp/gpdata/mirror/gpseg0
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance port = 7000
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance replication port = 8000
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Target = in-place
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Recovery 2 of 3
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Synchronization mode = Incremental
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance host = rhsdw2
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance address = rhsdw2
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance directory = /gp/gpdata/primary/gpseg1
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance port = 6000
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance replication port = 9000
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance host = rhsdw1
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance address = rhsdw1
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance directory = /gp/gpdata/mirror/gpseg1
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance port = 7000
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance replication port = 8000
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Target = in-place
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Recovery 3 of 3
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Synchronization mode = Incremental
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance host = rhsdw04
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance address = rhsdw04
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance directory = /gp/gpdata/primary/gpseg3
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance port = 6000
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance replication port = 9000
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance host = rhsdw03
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance address = rhsdw03
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance directory = /gp/gpdata/mirror/gpseg3
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance port = 7000
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance replication port = 8000
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Target = in-place
20190404:14:46:04:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
Continue with segment recovery procedure Yy|Nn (default=N):
> Y
20190404:14:46:10:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-3 segment(s) to recover
20190404:14:46:10:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Ensuring 3 failed segment(s) are stopped
20190404:14:46:12:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-24373: /gp/gpdata/primary/gpseg0
20190404:14:46:14:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-15082: /gp/gpdata/primary/gpseg1
20190404:14:46:16:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-14825: /gp/gpdata/primary/gpseg3
20190404:14:46:18:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Ensuring that shared memory is cleaned up for stopped segments
updating flat files
20190404:14:46:19:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Updating configuration with new mirrors
20190404:14:46:19:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Updating mirrors
.
20190404:14:46:20:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Starting mirrors
20190404:14:46:20:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-era is 39675b482db44909_190404141816
20190404:14:46:20:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
..
20190404:14:46:22:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Process results...
20190404:14:46:22:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Updating configuration to mark mirrors up
20190404:14:46:22:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Updating primaries
20190404:14:46:22:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Commencing parallel primary conversion of 3 segments, please wait...
.
20190404:14:46:23:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Process results...
20190404:14:46:23:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Done updating primaries
20190404:14:46:23:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-******************************************************************
20190404:14:46:23:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Updating segments for resynchronization is completed.
20190404:14:46:23:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-For segments updated successfully, resynchronization will continue in the background.
20190404:14:46:23:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-
20190404:14:46:23:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-Use gpstate -s to check the resynchronization progress.
20190404:14:46:23:011358 gprecoverseg:rhmdw:gpadmin-[INFO]:-******************************************************************
Check the status:
[gpadmin@rhmdw ~]$ gpstate
20190404:14:51:16:012093 gpstate:rhmdw:gpadmin-[INFO]:-Starting gpstate with args:
20190404:14:51:16:012093 gpstate:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:14:51:16:012093 gpstate:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:14:51:16:012093 gpstate:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:51:16:012093 gpstate:rhmdw:gpadmin-[INFO]:-Gathering data from segments...
..
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:-Greenplum instance status summary
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Master instance = Active
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Master standby = No master standby configured
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total segment instance count from metadata = 8
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Primary Segment Status
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segments = 4
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segment valid (at master) = 4
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segment failures (at master) = 0
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files missing = 0
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files found = 4
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs missing = 0
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs found = 4
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files missing = 0
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files found = 4
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes missing = 0
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes found = 4
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Mirror Segment Status
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segments = 4
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segment valid (at master) = 4
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segment failures (at master) = 0
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files missing = 0
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files found = 4
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs missing = 0
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs found = 4
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files missing = 0
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files found = 4
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes missing = 0
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes found = 4
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[WARNING]:-Total number mirror segments acting as primary segments = 3 <<<<<<<<
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:- Total number mirror segments acting as mirror segments = 1
20190404:14:51:19:012093 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
[gpadmin@rhmdw ~]$ psql -d postgres
psql (8.3.23)
Type "help" for help.
postgres=# select * from gp_segment_configuration order by hostname;
dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port
------+---------+------+----------------+------+--------+------+----------+---------+------------------
1 | -1 | p | p | s | u | 5432 | rhmdw | rhmdw |
6 | 2 | p | p | s | u | 6000 | rhsdw03 | rhsdw03 | 9000
8 | 3 | p | m | s | u | 7000 | rhsdw03 | rhsdw03 | 8000
9 | 2 | m | m | s | u | 7000 | rhsdw04 | rhsdw04 | 8000
7 | 3 | m | p | s | u | 6000 | rhsdw04 | rhsdw04 | 9000
2 | 0 | m | p | s | u | 6000 | rhsdw1 | rhsdw1 | 9000
5 | 1 | p | m | s | u | 7000 | rhsdw1 | rhsdw1 | 8000
4 | 0 | p | m | s | u | 7000 | rhsdw2 | rhsdw2 | 8000
3 | 1 | m | p | s | u | 6000 | rhsdw2 | rhsdw2 | 9000
(9 rows)
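After recovery every instance is up and synced, but several still run in a role other than their preferred one. As an illustrative sketch (the helpers `parse_rows` and `unbalanced` are made up for this example), the psql output above can be parsed to list those instances:

```python
def parse_rows(text):
    """Parse psql's aligned output into a list of dicts keyed by column name."""
    # The "----+----" separator line contains no "|" and is dropped here.
    lines = [ln for ln in text.strip().splitlines() if "|" in ln]
    header = [cell.strip() for cell in lines[0].split("|")]
    return [dict(zip(header, (cell.strip() for cell in ln.split("|"))))
            for ln in lines[1:]]


def unbalanced(rows):
    """Return rows whose current role differs from the preferred role."""
    return [r for r in rows if r["role"] != r["preferred_role"]]


# Query result copied from the output above.
TABLE = """
dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port
------+---------+------+----------------+------+--------+------+----------+---------+------------------
1 | -1 | p | p | s | u | 5432 | rhmdw | rhmdw |
6 | 2 | p | p | s | u | 6000 | rhsdw03 | rhsdw03 | 9000
8 | 3 | p | m | s | u | 7000 | rhsdw03 | rhsdw03 | 8000
9 | 2 | m | m | s | u | 7000 | rhsdw04 | rhsdw04 | 8000
7 | 3 | m | p | s | u | 6000 | rhsdw04 | rhsdw04 | 9000
2 | 0 | m | p | s | u | 6000 | rhsdw1 | rhsdw1 | 9000
5 | 1 | p | m | s | u | 7000 | rhsdw1 | rhsdw1 | 8000
4 | 0 | p | m | s | u | 7000 | rhsdw2 | rhsdw2 | 8000
3 | 1 | m | p | s | u | 6000 | rhsdw2 | rhsdw2 | 9000
(9 rows)
"""

for row in unbalanced(parse_rows(TABLE)):
    print(row["dbid"], row["hostname"], row["port"], row["role"], "->", row["preferred_role"])
```

Six rows qualify (dbid 2, 3, 4, 5, 7 and 8), which agrees with the three mirrors acting as primaries that gpstate reports, each paired with its former primary.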
Switch the primaries and mirrors back to their preferred roles; the switch failed with an error:
[gpadmin@rhmdw ~]$ gprecoverseg -r
20190404:14:52:10:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -r
20190404:14:52:10:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:14:52:10:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:14:52:10:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Checking if segments are ready to connect
20190404:14:52:10:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Recovery type = Rebalance
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Unbalanced segment 1 of 6
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance host = rhsdw2
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance address = rhsdw2
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance directory = /gp/gpdata/mirror/gpseg0
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance port = 7000
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance replication port = 8000
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Balanced role = Mirror
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Current role = Primary
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Unbalanced segment 2 of 6
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance host = rhsdw1
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance address = rhsdw1
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance directory = /gp/gpdata/primary/gpseg0
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance port = 6000
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance replication port = 9000
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Balanced role = Primary
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Current role = Mirror
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Unbalanced segment 3 of 6
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance host = rhsdw1
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance address = rhsdw1
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance directory = /gp/gpdata/mirror/gpseg1
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance port = 7000
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance replication port = 8000
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Balanced role = Mirror
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Current role = Primary
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Unbalanced segment 4 of 6
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance host = rhsdw2
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance address = rhsdw2
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance directory = /gp/gpdata/primary/gpseg1
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance port = 6000
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance replication port = 9000
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Balanced role = Primary
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Current role = Mirror
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Unbalanced segment 5 of 6
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance host = rhsdw03
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance address = rhsdw03
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance directory = /gp/gpdata/mirror/gpseg3
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance port = 7000
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance replication port = 8000
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Balanced role = Mirror
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Current role = Primary
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Unbalanced segment 6 of 6
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance host = rhsdw04
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance address = rhsdw04
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance directory = /gp/gpdata/primary/gpseg3
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance port = 6000
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Unbalanced instance replication port = 9000
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Balanced role = Primary
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:- Current role = Mirror
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[WARNING]:-This operation will cancel queries that are currently executing.
20190404:14:52:11:012278 gprecoverseg:rhmdw:gpadmin-[WARNING]:-Connections to the database however will not be interrupted.
Continue with segment rebalance procedure Yy|Nn (default=N):
> y
20190404:14:52:13:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Getting unbalanced segments
20190404:14:52:13:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Stopping unbalanced primary segments...
..
20190404:14:52:15:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Triggering segment reconfiguration
20190404:14:52:19:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Starting segment synchronization
20190404:14:52:19:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-=============================START ANOTHER RECOVER=========================================
20190404:14:52:19:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:14:52:19:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:14:52:19:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Checking if segments are ready to connect
20190404:14:52:19:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:52:19:012278 gprecoverseg:rhmdw:gpadmin-[ERROR]:-[Errno 12] Cannot allocate memory
Traceback (most recent call last):
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 243, in run
self.cmd.run()
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 711, in run
self.exec_context.execute(self)
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 655, in execute
LocalExecutionContext.execute(self, cmd)
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 433, in execute
stdout=subprocess.PIPE, close_fds=True)
File "/gp/greenplum-db/ext/python/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/gp/greenplum-db/ext/python/lib/python2.7/subprocess.py", line 1235, in _execute_child
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
20190404:14:52:19:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Unable to connect to database. Retrying 1
20190404:14:52:24:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Checking if segments are ready to connect
20190404:14:52:24:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:52:25:012278 gprecoverseg:rhmdw:gpadmin-[ERROR]:-[Errno 12] Cannot allocate memory
Traceback (most recent call last):
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 243, in run
self.cmd.run()
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 711, in run
self.exec_context.execute(self)
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 655, in execute
LocalExecutionContext.execute(self, cmd)
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 433, in execute
stdout=subprocess.PIPE, close_fds=True)
File "/gp/greenplum-db/ext/python/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/gp/greenplum-db/ext/python/lib/python2.7/subprocess.py", line 1235, in _execute_child
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
20190404:14:52:25:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Unable to connect to database. Retrying 2
20190404:14:52:30:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Checking if segments are ready to connect
20190404:14:52:30:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:52:30:012278 gprecoverseg:rhmdw:gpadmin-[ERROR]:-[Errno 12] Cannot allocate memory
Traceback (most recent call last):
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 243, in run
self.cmd.run()
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 711, in run
self.exec_context.execute(self)
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 655, in execute
LocalExecutionContext.execute(self, cmd)
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 433, in execute
stdout=subprocess.PIPE, close_fds=True)
File "/gp/greenplum-db/ext/python/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/gp/greenplum-db/ext/python/lib/python2.7/subprocess.py", line 1235, in _execute_child
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
20190404:14:52:30:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Unable to connect to database. Retrying 3
20190404:14:52:35:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Checking if segments are ready to connect
20190404:14:52:35:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:52:35:012278 gprecoverseg:rhmdw:gpadmin-[ERROR]:-[Errno 12] Cannot allocate memory
Traceback (most recent call last):
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 243, in run
self.cmd.run()
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 711, in run
self.exec_context.execute(self)
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 655, in execute
LocalExecutionContext.execute(self, cmd)
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 433, in execute
stdout=subprocess.PIPE, close_fds=True)
File "/gp/greenplum-db/ext/python/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/gp/greenplum-db/ext/python/lib/python2.7/subprocess.py", line 1235, in _execute_child
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
20190404:14:52:36:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Unable to connect to database. Retrying 4
20190404:14:52:41:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Checking if segments are ready to connect
20190404:14:52:41:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:52:41:012278 gprecoverseg:rhmdw:gpadmin-[ERROR]:-[Errno 12] Cannot allocate memory
Traceback (most recent call last):
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 243, in run
self.cmd.run()
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 711, in run
self.exec_context.execute(self)
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 655, in execute
LocalExecutionContext.execute(self, cmd)
File "/gp/greenplum-db/lib/python/gppylib/commands/base.py", line 433, in execute
stdout=subprocess.PIPE, close_fds=True)
File "/gp/greenplum-db/ext/python/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/gp/greenplum-db/ext/python/lib/python2.7/subprocess.py", line 1235, in _execute_child
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
20190404:14:52:41:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-Unable to connect to database. Retrying 5
20190404:14:52:46:012278 gprecoverseg:rhmdw:gpadmin-[INFO]:-==============================END ANOTHER RECOVER==========================================
20190404:14:52:46:012278 gprecoverseg:rhmdw:gpadmin-[CRITICAL]:-gprecoverseg failed. (Reason='Unable to connect to database and start transaction') exiting...
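Each traceback above ends in os.fork() raising Errno 12 (ENOMEM), meaning the master host could not allocate memory for a child process at that moment. The following is a minimal sketch for checking available memory before retrying gprecoverseg. It assumes a Linux host whose /proc/meminfo exposes a MemAvailable line; the helper names and any threshold passed to them are illustrative, not Greenplum settings.

```shell
# Print MemAvailable, in MB, from a meminfo-style file (default: /proc/meminfo).
mem_available_mb() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Return 0 if at least $1 MB is available, non-zero otherwise.
# $2 optionally names a meminfo-style file, which keeps the check testable.
has_free_memory() {
    avail=$(mem_available_mb "${2:-/proc/meminfo}")
    [ -n "$avail" ] && [ "$avail" -ge "$1" ]
}
```

One possible use is `has_free_memory 512 || free -m` on the master host before running gprecoverseg again; the 512 MB figure is an arbitrary example.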
[gpadmin@rhmdw ~]$ gpstate -m
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[INFO]:-Starting gpstate with args: -m
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[INFO]:--------------------------------------------------------------
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[INFO]:--Current GPDB mirror list and status
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[INFO]:--Type = Spread
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[INFO]:--------------------------------------------------------------
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[INFO]:- Mirror Datadir Port Status Data Status
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[WARNING]:-rhsdw2 /gp/gpdata/mirror/gpseg0 7000 Failed <<<<<<<<
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[WARNING]:-rhsdw1 /gp/gpdata/mirror/gpseg1 7000 Failed <<<<<<<<
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[INFO]:- rhsdw04 /gp/gpdata/mirror/gpseg2 7000 Passive Synchronized
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[WARNING]:-rhsdw03 /gp/gpdata/mirror/gpseg3 7000 Failed <<<<<<<<
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[INFO]:--------------------------------------------------------------
20190404:14:53:11:012542 gpstate:rhmdw:gpadmin-[WARNING]:-3 segment(s) configured as mirror(s) have failed
The check shows that three mirror segments are down:
postgres=# select * from gp_segment_configuration order by hostname;
 dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port
------+---------+------+----------------+------+--------+------+----------+---------+------------------
    1 |      -1 | p    | p              | s    | u      | 5432 | rhmdw    | rhmdw   |
    6 |       2 | p    | p              | s    | u      | 6000 | rhsdw03  | rhsdw03 |             9000
    8 |       3 | m    | m              | s    | d      | 7000 | rhsdw03  | rhsdw03 |             8000
    9 |       2 | m    | m              | s    | u      | 7000 | rhsdw04  | rhsdw04 |             8000
    7 |       3 | p    | p              | c    | u      | 6000 | rhsdw04  | rhsdw04 |             9000
    2 |       0 | p    | p              | c    | u      | 6000 | rhsdw1   | rhsdw1  |             9000
    5 |       1 | m    | m              | s    | d      | 7000 | rhsdw1   | rhsdw1  |             8000
    4 |       0 | m    | m              | s    | d      | 7000 | rhsdw2   | rhsdw2  |             8000
    3 |       1 | p    | p              | c    | u      | 6000 | rhsdw2   | rhsdw2  |             9000
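The catalog output above can be narrowed to the failed instances only. This is a minimal sketch, assuming psql is on the PATH and the postgres database is reachable as in the session above. `down_segments` is an illustrative helper, not a Greenplum utility: it reads unaligned, pipe-separated rows on stdin and prints those whose sixth column (status) is `d`.

```shell
# Read rows shaped like
#   dbid|content|role|preferred_role|mode|status|port|hostname|...
# on stdin and print dbid, content and hostname for rows with status = d.
down_segments() {
    awk -F'|' '$6 == "d" { print $1, $2, $8 }'
}

# Example call (needs a running cluster, so it is left commented out):
# psql -d postgres -At -c "select dbid, content, role, preferred_role, mode, status, port, hostname from gp_segment_configuration" | down_segments
```

The same filter could also be pushed into the query itself with `where status = 'd'`; the stdin form is used here only so the helper can be tried against saved output.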
Retry the mirror recovery:
[gpadmin@rhmdw ~]$ rm ./recov
[gpadmin@rhmdw ~]$ gprecoverseg -o ./recov
20190404:15:02:21:014355 gprecoverseg:rhmdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -o ./recov
20190404:15:02:21:014355 gprecoverseg:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:15:02:21:014355 gprecoverseg:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:15:02:21:014355 gprecoverseg:rhmdw:gpadmin-[INFO]:-Checking if segments are ready to connect
20190404:15:02:21:014355 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:15:02:21:014355 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:15:02:22:014355 gprecoverseg:rhmdw:gpadmin-[INFO]:-Configuration file output to ./recov successfully.
[gpadmin@rhmdw ~]$ more ./recov
filespaceOrder=
rhsdw2:7000:/gp/gpdata/mirror/gpseg0
rhsdw1:7000:/gp/gpdata/mirror/gpseg1
rhsdw03:7000:/gp/gpdata/mirror/gpseg3
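The file written by `gprecoverseg -o` lists one failed instance per line as host:port:datadir, after a `filespaceOrder=` header. Below is a small sketch for reviewing it before passing it to `gprecoverseg -i`. `list_recovery_targets` is an illustrative name, and the parsing assumes only the three-field in-place form shown above.

```shell
# Print each recovery target in a labelled form.
# Lines without exactly three colon-separated fields (such as the
# filespaceOrder= header) are skipped.
list_recovery_targets() {
    awk -F: 'NF == 3 { printf "host=%s port=%s datadir=%s\n", $1, $2, $3 }' "$1"
}
```

For the file above, `list_recovery_targets ./recov` would print one line for each of the three failed mirrors.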
[gpadmin@rhmdw ~]$ gprecoverseg -i ./recov
20190404:15:02:41:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -i ./recov
20190404:15:02:41:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:15:02:41:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:15:02:41:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Checking if segments are ready to connect
20190404:15:02:41:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:15:02:41:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Heap checksum setting is consistent between master and the segments that are candidates for recoverseg
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Recovery from configuration -i option supplied
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Recovery 1 of 3
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Synchronization mode = Incremental
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance host = rhsdw2
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance address = rhsdw2
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance directory = /gp/gpdata/mirror/gpseg0
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance port = 7000
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance replication port = 8000
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance host = rhsdw1
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance address = rhsdw1
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance directory = /gp/gpdata/primary/gpseg0
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance port = 6000
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance replication port = 9000
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Target = in-place
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Recovery 2 of 3
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Synchronization mode = Incremental
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance host = rhsdw1
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance address = rhsdw1
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance directory = /gp/gpdata/mirror/gpseg1
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance port = 7000
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance replication port = 8000
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance host = rhsdw2
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance address = rhsdw2
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance directory = /gp/gpdata/primary/gpseg1
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance port = 6000
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance replication port = 9000
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Target = in-place
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Recovery 3 of 3
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Synchronization mode = Incremental
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance host = rhsdw03
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance address = rhsdw03
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance directory = /gp/gpdata/mirror/gpseg3
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance port = 7000
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Failed instance replication port = 8000
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance host = rhsdw04
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance address = rhsdw04
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance directory = /gp/gpdata/primary/gpseg3
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance port = 6000
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Source instance replication port = 9000
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:- Recovery Target = in-place
20190404:15:02:43:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:----------------------------------------------------------
Continue with segment recovery procedure Yy|Nn (default=N):
> Y
20190404:15:02:45:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-3 segment(s) to recover
20190404:15:02:45:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Ensuring 3 failed segment(s) are stopped
20190404:15:02:48:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-26787: /gp/gpdata/mirror/gpseg1
20190404:15:02:50:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Ensuring that shared memory is cleaned up for stopped segments
updating flat files
20190404:15:02:51:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Updating configuration with new mirrors
20190404:15:02:51:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Updating mirrors
.
20190404:15:02:52:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Starting mirrors
20190404:15:02:52:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-era is 39675b482db44909_190404145828
20190404:15:02:52:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
..
20190404:15:02:54:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Process results...
20190404:15:02:54:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Updating configuration to mark mirrors up
20190404:15:02:54:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Updating primaries
20190404:15:02:54:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Commencing parallel primary conversion of 3 segments, please wait...
.
20190404:15:02:55:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Process results...
20190404:15:02:55:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Done updating primaries
20190404:15:02:55:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-******************************************************************
20190404:15:02:55:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Updating segments for resynchronization is completed.
20190404:15:02:55:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-For segments updated successfully, resynchronization will continue in the background.
20190404:15:02:55:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-
20190404:15:02:55:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-Use gpstate -s to check the resynchronization progress.
20190404:15:02:55:014496 gprecoverseg:rhmdw:gpadmin-[INFO]:-******************************************************************
[gpadmin@rhmdw ~]$ gpstate -s
20190404:15:03:00:014621 gpstate:rhmdw:gpadmin-[INFO]:-Starting gpstate with args: -s
20190404:15:03:00:014621 gpstate:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:15:03:00:014621 gpstate:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:15:03:00:014621 gpstate:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:15:03:00:014621 gpstate:rhmdw:gpadmin-[INFO]:-Gathering data from segments...
..
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:--Master Configuration & Status
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Master host = rhmdw
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Master postgres process ID = 13500
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Master data directory = /gp/gpdata/master/gpseg-1
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Master port = 5432
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Master current role = dispatch
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Greenplum initsystem version = 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Greenplum current version = PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Postgres version = 8.3.23
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Master standby = No master standby configured
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:-Segment Instance Status Report
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Segment Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Hostname = rhsdw1
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Address = rhsdw1
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Datadir = /gp/gpdata/primary/gpseg0
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Port = 6000
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirroring Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Current role = Primary
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Preferred role = Primary
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirror status = Resynchronizing
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Change Tracking Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Change tracking data size = 32.1 kB
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Resynchronization Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Resynchronization mode = Incremental
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Data synchronized = 1.03 MB
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Estimated total data to synchronize = Sync complete; awaiting config change
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Estimated resync progress with mirror = 100%
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Total resync objects = 0
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Objects to resync = 0
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Estimated resync end time =
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Status
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- PID = 26441
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Configuration reports status as = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Database status = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Segment Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Hostname = rhsdw2
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Address = rhsdw2
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Datadir = /gp/gpdata/mirror/gpseg0
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Port = 7000
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirroring Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Current role = Mirror
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Preferred role = Mirror
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirror status = Resynchronizing
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Status
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- PID = 16754
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Configuration reports status as = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Segment status = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Segment Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Hostname = rhsdw2
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Address = rhsdw2
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Datadir = /gp/gpdata/primary/gpseg1
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Port = 6000
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirroring Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Current role = Primary
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Preferred role = Primary
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirror status = Resynchronizing
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Change Tracking Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Change tracking data size = 32.1 kB
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Resynchronization Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Resynchronization mode = Incremental
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Data synchronized = 1.03 MB
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Estimated total data to synchronize = Sync complete; awaiting config change
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Estimated resync progress with mirror = 100%
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Total resync objects = 0
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Objects to resync = 0
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Estimated resync end time =
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Status
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- PID = 16240
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Configuration reports status as = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Database status = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Segment Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Hostname = rhsdw1
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Address = rhsdw1
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Datadir = /gp/gpdata/mirror/gpseg1
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Port = 7000
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirroring Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Current role = Mirror
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Preferred role = Mirror
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirror status = Resynchronizing
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Status
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- PID = 27324
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Configuration reports status as = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Segment status = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Segment Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Hostname = rhsdw03
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Address = rhsdw03
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Datadir = /gp/gpdata/primary/gpseg2
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Port = 6000
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirroring Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Current role = Primary
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Preferred role = Primary
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirror status = Synchronized
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Status
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- PID = 32131
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Configuration reports status as = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Database status = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Segment Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Hostname = rhsdw04
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Address = rhsdw04
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Datadir = /gp/gpdata/mirror/gpseg2
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Port = 7000
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirroring Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Current role = Mirror
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Preferred role = Mirror
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirror status = Synchronized
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Status
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- PID = 15707
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Configuration reports status as = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Segment status = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Segment Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Hostname = rhsdw04
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Address = rhsdw04
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Datadir = /gp/gpdata/primary/gpseg3
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Port = 6000
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirroring Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Current role = Primary
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Preferred role = Primary
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirror status = Resynchronizing
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Change Tracking Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Change tracking data size = 32.1 kB
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Resynchronization Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Resynchronization mode = Incremental
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Data synchronized = 1.03 MB
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Estimated total data to synchronize = Sync complete; awaiting config change
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Estimated resync progress with mirror = 100%
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Total resync objects = 0
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Objects to resync = 0
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Estimated resync end time =
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Status
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- PID = 15706
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Configuration reports status as = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Database status = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Segment Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Hostname = rhsdw03
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Address = rhsdw03
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Datadir = /gp/gpdata/mirror/gpseg3
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Port = 7000
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirroring Info
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Current role = Mirror
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Preferred role = Mirror
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Mirror status = Resynchronizing
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Status
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- PID = 315
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Configuration reports status as = Up
20190404:15:03:02:014621 gpstate:rhmdw:gpadmin-[INFO]:- Segment status = Up
[gpadmin@rhmdw ~]$ gpstate -m
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:-Starting gpstate with args: -m
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:--------------------------------------------------------------
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:--Current GPDB mirror list and status
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:--Type = Spread
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:--------------------------------------------------------------
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:- Mirror Datadir Port Status Data Status
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:- rhsdw2 /gp/gpdata/mirror/gpseg0 7000 Passive Resynchronizing
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:- rhsdw1 /gp/gpdata/mirror/gpseg1 7000 Passive Resynchronizing
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:- rhsdw04 /gp/gpdata/mirror/gpseg2 7000 Passive Synchronized
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:- rhsdw03 /gp/gpdata/mirror/gpseg3 7000 Passive Resynchronizing
20190404:15:03:24:015067 gpstate:rhmdw:gpadmin-[INFO]:--------------------------------------------------------------
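Resynchronization continues in the background, so the `gpstate -m` check above may need repeating until every mirror reports Synchronized. A rough sketch of such a wait loop follows. It assumes the `gpstate -m` line format shown in this session; `pending_mirrors` and `wait_for_mirrors` are illustrative names, and the 10-second interval is an arbitrary choice.

```shell
# Count lines of `gpstate -m` output (on stdin) that mention a mirror
# in the Failed or Resynchronizing state. grep -c prints 0 and exits
# non-zero when nothing matches, hence the `|| true`.
pending_mirrors() {
    grep -c -E 'Failed|Resynchronizing' || true
}

# Poll gpstate until no mirror is pending.
wait_for_mirrors() {
    while [ "$(gpstate -m | pending_mirrors)" -gt 0 ]; do
        sleep 10
    done
}
```

Note that the loop never gives up on a mirror that stays Failed, so in practice a retry limit would be a sensible addition.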
[gpadmin@rhmdw ~]$ gpstate
20190404:15:03:54:015153 gpstate:rhmdw:gpadmin-[INFO]:-Starting gpstate with args:
20190404:15:03:54:015153 gpstate:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:15:03:54:015153 gpstate:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:15:03:54:015153 gpstate:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:15:03:54:015153 gpstate:rhmdw:gpadmin-[INFO]:-Gathering data from segments...
..
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:-Greenplum instance status summary
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Master instance = Active
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Master standby = No master standby configured
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total segment instance count from metadata = 8
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Primary Segment Status
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segments = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segment valid (at master) = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total primary segment failures (at master) = 0
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files missing = 0
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files found = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs missing = 0
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs found = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files missing = 0
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files found = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes missing = 0
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes found = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Mirror Segment Status
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segments = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segment valid (at master) = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total mirror segment failures (at master) = 0
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files missing = 0
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid files found = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs missing = 0
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number of postmaster.pid PIDs found = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files missing = 0
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number of /tmp lock files found = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes missing = 0
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number postmaster processes found = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number mirror segments acting as primary segments = 0
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:- Total number mirror segments acting as mirror segments = 4
20190404:15:03:56:015153 gpstate:rhmdw:gpadmin-[INFO]:-----------------------------------------------------
[gpadmin@rhmdw ~]$ gprecoverseg -r
20190404:15:04:32:015266 gprecoverseg:rhmdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -r
20190404:15:04:32:015266 gprecoverseg:rhmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190404:15:04:33:015266 gprecoverseg:rhmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190404:15:04:33:015266 gprecoverseg:rhmdw:gpadmin-[INFO]:-Checking if segments are ready to connect
20190404:15:04:33:015266 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:15:04:33:015266 gprecoverseg:rhmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20190404:15:04:33:015266 gprecoverseg:rhmdw:gpadmin-[INFO]:-No segments are running in their non-preferred role and need to be rebalanced.
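At this point the cluster is back to normal. To summarize:
1. When a mirror or primary segment is down, generate a recovery configuration file with gprecoverseg -o, then recover with gprecoverseg -i ./recov.
2. When a mirror and its primary have swapped roles, run gprecoverseg -r to switch them back to their preferred roles.
3. While a recovery is in progress, do not manually start the segment that is being recovered.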
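The gpstate -m output above still showed three mirrors in Resynchronizing shortly before gprecoverseg -r was run. As far as I know, a rebalance expects all segments to be up and synchronized first, so a small wait loop can help. The following is only a sketch: the function names count_unsynced_mirrors and rebalance_when_synced are made up for illustration, the 30-second interval is arbitrary, and the row detection assumes the gpstate -m column layout shown in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the function names are illustrative and not part of Greenplum.

# Read `gpstate -m` output on stdin and print how many mirror rows have a
# Data Status other than "Synchronized". A mirror row is recognised by a
# data directory (field 4, starts with "/") followed by a numeric port (field 5).
count_unsynced_mirrors() {
    awk '$4 ~ /^\// && $5 ~ /^[0-9]+$/ && $NF != "Synchronized" { n++ }
         END { print n + 0 }'
}

# Poll until every mirror reports Synchronized, then rebalance roles.
# Assumption: a 30-second polling interval is acceptable for this cluster.
rebalance_when_synced() {
    local out pending
    while true; do
        # Stop if gpstate itself fails, so an empty result is not read as "all synced".
        out=$(gpstate -m) || return 1
        pending=$(printf '%s\n' "$out" | count_unsynced_mirrors)
        [ "$pending" -eq 0 ] && break
        echo "$pending mirror(s) not yet synchronized; waiting..."
        sleep 30
    done
    # Same command as used in this post; it asks for confirmation interactively.
    gprecoverseg -r
}
```

To use it, source the file as gpadmin on the master host and call rebalance_when_synced. The parsing step can also be tried on its own with gpstate -m | count_unsynced_mirrors.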
References:
https://gp-docs-cn.github.io/docs/admin_guide/highavail/topics/g-when-a-segment-host-is-not-recoverable.html
https://yq.aliyun.com/articles/193