从greenplum集群中移除节点,并用串行恢复的方法将删除节点的数据重分布到剩余节点...

       昨天在把GP数据库从单机装的3.3.6版本升级到4.0.4版本时,老报内存不足的问题,考虑到4.0.4对内存的要求更高(目前虚拟机配的是2G内存),想删除4个节点(原来有8个节点)试试,以节省内存。
       在网上找到一篇文章(原贴地址:http://www.alidw.com/?p=1637),写得相当不错,但唯一欠缺的是在数据还原的过程中有点麻烦,特别是当数据库很大时,它这种方案基本是不可行的,考虑到GP数据库可以实现串行恢复,因此想通过串行恢复把删除节点后的数据还原过来。

第一步:对删除节点前数据的采样
[gpadmin@mdw-1 data]$ psql -d cust_utf8
psql (8.2.13)
Type "help" for help.


cust_utf8=# select gp_segment_id,count(1) from mi_boss.tb_dim_sys_param group by 1 order by 1;
 gp_segment_id | count
---------------+-------
             0 | 53737
             1 | 53727
             2 | 53770
             3 | 53654
             4 | 53858
             5 | 53805
             6 | 53690
             7 | 53756

第二步:查看当前节点情况:
cust_utf8=# select * from gp_configuration;
 content | definedprimary | dbid | isprimary | valid | hostname | port  |      datadir     
---------+----------------+------+-----------+-------+----------+-------+-------------------
      -1 | t              |    1 | t         | t     | mdw-1    |  5432 | /data/master/gp-1
       0 | t              |    2 | t         | t     | sdw1-1   | 50001 | /data/vol1/gp0
       1 | t              |    3 | t         | t     | sdw1-2   | 50002 | /data/vol2/gp1
       2 | t              |    4 | t         | t     | sdw1-3   | 50003 | /data/vol1/gp2
       3 | t              |    5 | t         | t     | sdw1-4   | 50004 | /data/vol2/gp3
       4 | t              |    6 | t         | t     | sdw2-1   | 50005 | /data/vol1/gp4
       5 | t              |    7 | t         | t     | sdw2-2   | 50006 | /data/vol2/gp5
       6 | t              |    8 | t         | t     | sdw2-3   | 50007 | /data/vol1/gp6
       7 | t              |    9 | t         | t     | sdw2-4   | 50008 | /data/vol2/gp7
(9 rows)

cust_utf8=# select * from gp_id;
       gpname       | numsegments | dbid | content
--------------------+-------------+------+---------
 Greenplum Database |           8 |    1 |      -1

第三步:做数据库备份
gp_dump --gp-d=/data/backup2 cust_utf8
这一步已经运行,这儿就不帮额外的操作

第四步:
4.1:以数据库维护模式对数据库的字典文件进行修改,将节点6,7,8,9踢除
[gpadmin@mdw-1 ~]$ gpstart -m
20110107:09:29:04:gpstart:mdw-1:gpadmin-[INFO]:-Starting gpstart with args: '-m'
20110107:09:29:04:gpstart:mdw-1:gpadmin-[INFO]:-Gathering information and validating the environment...
20110107:09:29:04:gpstart:mdw-1:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 3.3.6.1 build 1'
20110107:09:29:04:gpstart:mdw-1:gpadmin-[INFO]:-Starting Master instance in admin mode
20110107:09:29:06:gpstart:mdw-1:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20110107:09:29:06:gpstart:mdw-1:gpadmin-[INFO]:-Obtaining Segment details from master...
20110107:09:29:06:gpstart:mdw-1:gpadmin-[INFO]:-Master Started...
[gpadmin@mdw-1 ~]$ psql -d cust_utf8
psql: FATAL:  System was started in master-only utility mode - only utility mode connections are allowed
[gpadmin@mdw-1 ~]$ psql -d postgres
psql: FATAL:  System was started in master-only utility mode - only utility mode connections are allowed
[gpadmin@mdw-1 ~]$ psql
psql: FATAL:  System was started in master-only utility mode - only utility mode connections are allowed
[gpadmin@mdw-1 ~]$  PGOPTIONS=”-c gp_session_role=utility” psql
psql: FATAL:  System was started in master-only utility mode - only utility mode connections are allowed

[gpadmin@mdw-1 ~]$ PGOPTIONS='-c gp_session_role=utility' psql -d cust_utf8
psql (8.2.13)
Type "help" for help.

cust_utf8=# select * from gp_configuration;
 content | definedprimary | dbid | isprimary | valid | hostname | port  |      datadir     
---------+----------------+------+-----------+-------+----------+-------+-------------------
      -1 | t              |    1 | t         | t     | mdw-1    |  5432 | /data/master/gp-1
       0 | t              |    2 | t         | t     | sdw1-1   | 50001 | /data/vol1/gp0
       1 | t              |    3 | t         | t     | sdw1-2   | 50002 | /data/vol2/gp1
       2 | t              |    4 | t         | t     | sdw1-3   | 50003 | /data/vol1/gp2
       3 | t              |    5 | t         | t     | sdw1-4   | 50004 | /data/vol2/gp3
       4 | t              |    6 | t         | t     | sdw2-1   | 50005 | /data/vol1/gp4
       5 | t              |    7 | t         | t     | sdw2-2   | 50006 | /data/vol2/gp5
       6 | t              |    8 | t         | t     | sdw2-3   | 50007 | /data/vol1/gp6
       7 | t              |    9 | t         | t     | sdw2-4   | 50008 | /data/vol2/gp7
(9 rows)

cust_utf8=# select * from gp_id;
       gpname       | numsegments | dbid | content
--------------------+-------------+------+---------
 Greenplum Database |           8 |    1 |      -1
(1 row)

cust_utf8=# delete from gp_configuration where content in(4,5,6,7);
DELETE 4
cust_utf8=# update gp_id set numsegments=4;
UPDATE 1
cust_utf8=# select * from gp_configuration;
 content | definedprimary | dbid | isprimary | valid | hostname | port  |      datadir     
---------+----------------+------+-----------+-------+----------+-------+-------------------
      -1 | t              |    1 | t         | t     | mdw-1    |  5432 | /data/master/gp-1
       0 | t              |    2 | t         | t     | sdw1-1   | 50001 | /data/vol1/gp0
       1 | t              |    3 | t         | t     | sdw1-2   | 50002 | /data/vol2/gp1
       2 | t              |    4 | t         | t     | sdw1-3   | 50003 | /data/vol1/gp2
       3 | t              |    5 | t         | t     | sdw1-4   | 50004 | /data/vol2/gp3
(5 rows)

cust_utf8=# select * from gp_id;
       gpname       | numsegments | dbid | content
--------------------+-------------+------+---------
 Greenplum Database |           4 |    1 |      -1
(1 row)
4.2:再以维护模式关闭数据库
[gpadmin@mdw-1 ~]$ gpstop -m
20110107:09:43:27:gpstop:mdw-1:gpadmin-[INFO]:-Starting gpstop with args: '-m'
20110107:09:43:27:gpstop:mdw-1:gpadmin-[INFO]:-Gathering information and validating the environment...
20110107:09:43:28:gpstop:mdw-1:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20110107:09:43:28:gpstop:mdw-1:gpadmin-[INFO]:-Obtaining Segment details from master...
20110107:09:43:28:gpstop:mdw-1:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 3.3.6.1 build 1'
20110107:09:43:28:gpstop:mdw-1:gpadmin-[INFO]:-There are 0 connections to the database
20110107:09:43:28:gpstop:mdw-1:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='smart'
20110107:09:43:28:gpstop:mdw-1:gpadmin-[INFO]:-Master host=mdw-1
20110107:09:43:28:gpstop:mdw-1:gpadmin-[INFO]:-Commencing Master instance shutdown with mode=smart
20110107:09:43:28:gpstop:mdw-1:gpadmin-[INFO]:-Master segment instance directory=/data/master/gp-1

4.3:正常打开数据库试试,看是否能打开数据库:
[gpadmin@mdw-1 ~]$ gpstart
20110107:09:45:58:gpstart:mdw-1:gpadmin-[INFO]:-Starting gpstart with args: ''
20110107:09:45:58:gpstart:mdw-1:gpadmin-[INFO]:-Gathering information and validating the environment...
20110107:09:45:59:gpstart:mdw-1:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 3.3.6.1 build 1'
20110107:09:45:59:gpstart:mdw-1:gpadmin-[INFO]:-Starting Master instance in admin mode
20110107:09:46:00:gpstart:mdw-1:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20110107:09:46:00:gpstart:mdw-1:gpadmin-[INFO]:-Obtaining Segment details from master...
20110107:09:46:01:gpstart:mdw-1:gpadmin-[INFO]:-Master Started...
20110107:09:46:01:gpstart:mdw-1:gpadmin-[INFO]:-Shutting down master
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:---------------------------
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:-Master instance parameters
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:---------------------------
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:-Database                 = template1
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:-Master Port              = 5432
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:-Master directory         = /data/master/gp-1
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:-Master standby           = Off
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:---------------------------------------
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:-Segment instances that will be started
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:---------------------------------------
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:-Host    Datadir         Port   Status
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:-sdw1-1  /data/vol1/gp0  50001  Valid
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:-sdw1-2  /data/vol2/gp1  50002  Valid
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:-sdw1-3  /data/vol1/gp2  50003  Valid
20110107:09:46:04:gpstart:mdw-1:gpadmin-[INFO]:-sdw1-4  /data/vol2/gp3  50004  Valid

Continue with Greenplum instance startup Yy|Nn (default=N):
> y
20110107:09:46:05:gpstart:mdw-1:gpadmin-[INFO]:-No standby master configured.  skipping...
20110107:09:46:06:gpstart:mdw-1:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
.............
20110107:09:46:19:gpstart:mdw-1:gpadmin-[INFO]:-Process results...
20110107:09:46:19:gpstart:mdw-1:gpadmin-[INFO]:-----------------------------------------------------
20110107:09:46:19:gpstart:mdw-1:gpadmin-[INFO]:-Total processes marked as completed             = 4
20110107:09:46:19:gpstart:mdw-1:gpadmin-[INFO]:-Total processes marked as failed                = 0
20110107:09:46:19:gpstart:mdw-1:gpadmin-[INFO]:-----------------------------------------------------
20110107:09:46:19:gpstart:mdw-1:gpadmin-[INFO]:-Total instances marked invalid and bypassed     = 0
20110107:09:46:19:gpstart:mdw-1:gpadmin-[INFO]:-Successfully started 4 of 4 segment instances
20110107:09:46:19:gpstart:mdw-1:gpadmin-[INFO]:-----------------------------------------------------
20110107:09:46:19:gpstart:mdw-1:gpadmin-[INFO]:-Starting Master instance mdw-1 directory /data/master/gp-1
20110107:09:46:22:gpstart:mdw-1:gpadmin-[INFO]:-Command pg_ctl reports Master mdw-1 instance active
20110107:09:46:22:gpstart:mdw-1:gpadmin-[INFO]:-Database successfully started with no errors reported


OK,成功启动。
4.4:验证数据分布情况:
[gpadmin@mdw-1 ~]$ psql -d cust_utf8
psql (8.2.13)
Type "help" for help.
cust_utf8=# select gp_segment_id,count(1) from mi_boss.tb_dim_sys_param group by 1 order by 1;
 gp_segment_id | count
---------------+-------
             0 | 53737
             1 | 53727
             2 | 53770
             3 | 53654
(4 rows)

对比第一步的结果,发现果然是后面4个节点的数据已经被删除了。下面用已经备份的数据对已经删除的四个节点数据进行串行恢复。


第五步:用串行恢复将删除的4个节点的数据追加到现有节点上
5.1:进行串行恢复:
[gpadmin@mdw-1 backup2]$ psql cust_utf8 -f gp_dump_0_6_20110106102229
SET
SET
SET
SET
SET
SET
[gpadmin@mdw-1 backup2]$ psql cust_utf8 -f gp_dump_0_7_20110106102229
SET
SET
SET
SET
SET
SET
[gpadmin@mdw-1 backup2]$ psql cust_utf8 -f gp_dump_0_8_20110106102229
SET
SET
SET
SET
SET
SET
[gpadmin@mdw-1 backup2]$ psql cust_utf8 -f gp_dump_0_9_20110106102229
SET
SET
SET
SET
SET
SET

5.2:验证数据是否已经恢复成功:
[gpadmin@mdw-1 backup2]$ psql -d cust_utf8
psql (8.2.13)
Type "help" for help.

cust_utf8=# select gp_segment_id,count(1) from mi_boss.tb_dim_sys_param group by 1 order by 1;
 gp_segment_id | count 
---------------+--------
             0 | 107595
             1 | 107532
             2 | 107460
             3 | 107410
(4 rows)

53737+53727+53770+53654+53858+53805+53690+53756=107595+107532+107460+107410=429997


OK,整个测试过程圆满完成。

 

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/12350275/viewspace-683416/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/12350275/viewspace-683416/

你可能感兴趣的:(数据库,运维)