Greenplum Distributed Cluster Deployment in Practice


Hello everyone! I'm 【IT邦德】, a.k.a. jeames007, with 10+ years of DBA and big data experience
and a highly motivated blogger in the big data field!
Member of the China DBA Union (ACDU), currently working in industrial IoT.
Skilled in operations and development for mainstream databases — Oracle, MySQL, PostgreSQL, GaussDB, and Greenplum — including backup and recovery, installation and migration, performance tuning, and emergency troubleshooting.
✨ If you're interested in databases, feel free to follow 【IT邦德】!
❤️❤️❤️ Thank you all! ❤️❤️❤️

Table of Contents

  • Preface
    • 1. Environment Preparation
      • ✨ 1.1 Download the Installation Package
      • ✨ 1.2 IP and Instance Planning
      • ✨ 1.3 Operating System
    • 2. Pre-installation Setup
      • ✨ 2.1 Set Hostnames
      • ✨ 2.2 /etc/hosts
      • ✨ 2.3 Create the User
      • ✨ 2.4 Create Host Files
      • ✨ 2.5 Configure Passwordless SSH
      • ✨ 2.6 Disable the Firewall
      • ✨ 2.7 Disable SELinux
    • 3. Install Greenplum
      • ✨ 3.1 Install Dependencies
      • ✨ 3.2 Install the RPM Package
      • ✨ 3.3 Create Directories
      • ✨ 3.4 Initialize the Database
    • 4. Verify the Cluster
    • 5. Cluster Management Commands

Preface

Greenplum is a distributed relational database built for data warehouse workloads. This article walks through the full deployment of a distributed Greenplum cluster.

1. Environment Preparation

✨ 1.1 Download the Installation Package

1. Greenplum releases on GitHub:
https://github.com/greenplum-db/gpdb/releases
2. Pivotal (VMware) official site:
https://network.pivotal.io/products/vmware-greenplum

✨ 1.2 IP and Instance Planning

IP            Hostname  Ports                                   Notes
192.168.6.12  mdw1      5432                                    Master host
192.168.6.13  mdw2      5432                                    Standby master host
192.168.6.14  sdw1      primary: 6000-6001, mirror: 7000-7001   Segment host 1, 2 primary + 2 mirror instances
192.168.6.15  sdw2      primary: 6000-6001, mirror: 7000-7001   Segment host 2, 2 primary + 2 mirror instances
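As a quick sanity check on this plan: the per-instance ports follow directly from the base ports set later in initgp_config (PORT_BASE=6000, MIRROR_PORT_BASE=7000) and the two instances per host. A minimal sketch:

```shell
# Sketch: derive per-instance ports from the base ports used later in
# initgp_config; INSTANCES_PER_HOST matches the 2-primary/2-mirror plan above.
PORT_BASE=6000
MIRROR_PORT_BASE=7000
INSTANCES_PER_HOST=2

i=0
while [ "$i" -lt "$INSTANCES_PER_HOST" ]; do
    echo "primary instance $i -> port $((PORT_BASE + i)), mirror -> port $((MIRROR_PORT_BASE + i))"
    i=$((i + 1))
done
```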


✨ 1.3 Operating System

[root@jeames ~]# cat /etc/centos-release
CentOS Linux release 7.6.1810 (Core)
[root@jeames ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           3771         159        3407          11         204        3365
Swap:          3967           0        3967

2. Pre-installation Setup

✨ 2.1 Set Hostnames

This example cluster consists of 1 master + 1 standby master and 2 segment hosts, all running CentOS 7.6.1810.
# Set the hostname on each of the 4 nodes (run the matching command on its host)
hostnamectl set-hostname mdw1   # on 192.168.6.12
hostnamectl set-hostname mdw2   # on 192.168.6.13
hostnamectl set-hostname sdw1   # on 192.168.6.14
hostnamectl set-hostname sdw2   # on 192.168.6.15

✨ 2.2 /etc/hosts

In Greenplum, the master host is conventionally called mdw and segment hosts sdw; "dw" stands for Data Warehouse.
Note: the entries must be identical on all 4 nodes.
cat >> /etc/hosts <<"EOF"
192.168.6.12 mdw1
192.168.6.13 mdw2
192.168.6.14 sdw1
192.168.6.15 sdw2
EOF
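Since a typo in /etc/hosts is painful to debug later, one cautious approach (a sketch, not part of the official procedure) is to stage the entries in a temporary file and check them before appending:

```shell
# Sketch: stage the host entries in a temp file and sanity-check them before
# appending to /etc/hosts (the final append must run as root on all 4 nodes).
tmp=$(mktemp)
cat > "$tmp" <<"EOF"
192.168.6.12 mdw1
192.168.6.13 mdw2
192.168.6.14 sdw1
192.168.6.15 sdw2
EOF

lines=$(wc -l < "$tmp")
echo "staged $lines host entries"
# cat "$tmp" >> /etc/hosts    # uncomment and run on each node, as root
```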

✨ 2.3 Create the User

groupadd -g 1530 gpadmin
useradd -g 1530 -u 1530 -m -d /home/gpadmin -s /bin/bash gpadmin
chown -R gpadmin:gpadmin /home/gpadmin
echo "gpadmin:jeames" | chpasswd

✨ 2.4 Create Host Files

1. Create an all_hosts file listing the hostnames of all nodes:
su - gpadmin
mkdir -p /home/gpadmin/conf/
cat > /home/gpadmin/conf/all_hosts <<"EOF"
mdw1
mdw2
sdw1
sdw2
EOF
2. Create a seg_hosts file listing the hostnames of the segment hosts only:
su - gpadmin
cat > /home/gpadmin/conf/seg_hosts <<"EOF"
sdw1
sdw2
EOF
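These host files are plain one-hostname-per-line lists, which Greenplum utilities such as gpssh fan out over. A minimal sketch of that pattern, with a temp file standing in for seg_hosts and echo standing in for the real ssh call:

```shell
# Sketch: iterate a seg_hosts-style file the way gpssh fans commands out.
# A temp file stands in for /home/gpadmin/conf/seg_hosts, and echo stands in
# for the real "ssh gpadmin@$host <command>".
seg_hosts=$(mktemp)
printf 'sdw1\nsdw2\n' > "$seg_hosts"

while read -r host; do
    echo "would run on $host: mkdir -p /greenplum/data/primary"
done < "$seg_hosts"
```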

✨ 2.5 Configure Passwordless SSH

Set up passwordless SSH trust across the cluster; run these steps only on the master node (192.168.6.12) as gpadmin.
su - gpadmin
1. Generate the key pair:
ssh-keygen -t rsa
2. Distribute the public key:
ssh-copy-id gpadmin@mdw1
ssh-copy-id gpadmin@mdw2
ssh-copy-id gpadmin@sdw1
ssh-copy-id gpadmin@sdw2
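ssh-keygen prompts interactively by default; for scripted setups it can run non-interactively with an empty passphrase (-N ""). The sketch below writes to a temporary directory purely for illustration — in practice, generate the default ~/.ssh/id_rsa as gpadmin:

```shell
# Sketch: non-interactive key generation (-N "" = empty passphrase, -q = quiet).
# A temp dir stands in for ~/.ssh here, for illustration only.
keydir=$(mktemp -d)
ssh-keygen -t rsa -b 2048 -N "" -f "$keydir/id_rsa" -q

ls "$keydir"    # id_rsa and id_rsa.pub
```

Greenplum also bundles gpssh-exkeys (e.g. `gpssh-exkeys -f /home/gpadmin/conf/all_hosts`), which exchanges keys among all listed hosts in one step once initial password access works.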

✨ 2.6 Disable the Firewall

Run as root on every node:
systemctl status firewalld
systemctl stop firewalld
systemctl disable firewalld

✨ 2.7 Disable SELinux

cat /etc/selinux/config

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=enforcing
# SELINUXTYPE= can take one of three values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted

sed -i "s#^SELINUX=.*#SELINUX=disabled#g"  /etc/selinux/config
Note: a reboot is required for the change to take effect.
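To avoid surprises when editing a system file, the sed command can be rehearsed on a copy first; note also that `setenforce 0` switches SELinux to permissive immediately, without waiting for the reboot. A sketch:

```shell
# Sketch: rehearse the sed edit on a copy before touching /etc/selinux/config,
# then verify the result.
tmp=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$tmp"

sed -i "s#^SELINUX=.*#SELINUX=disabled#g" "$tmp"
grep '^SELINUX=' "$tmp"    # SELINUX=disabled

# setenforce 0    # as root: switch to permissive now, before the reboot
```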

3. Install Greenplum

✨ 3.1 Install Dependencies

## Install dependency packages
yum install -y net-tools
yum install -y libcgroup-tools

yum install -y apr apr-util bash bzip2 curl krb5 libcurl libevent libxml2 libyaml \
zlib openldap openssh openssl openssl-libs perl readline rsync R sed tar zip krb5-devel

✨ 3.2 Install the RPM Package

## Run on all nodes, as root

1. Installs to /usr/local by default:
rpm -ivh open-source-greenplum-db-6.24.3-rhel7-x86_64.rpm

2. Grant gpadmin ownership of the install path:
chown -R gpadmin:gpadmin /usr/local/greenplum*

✨ 3.3 Create Directories

## Create the cluster data directory on all nodes
mkdir -p /greenplum/data/
chown -R gpadmin:gpadmin /greenplum


# All nodes
echo ". /usr/local/greenplum-db/greenplum_path.sh" >> /home/gpadmin/.bashrc

# Master node only
echo "export MASTER_DATA_DIRECTORY=/greenplum/data/master/gpseg-1" >> /home/gpadmin/.bashrc
echo "export PGDATABASE=postgres" >> /home/gpadmin/.bashrc

# Apply the changes
source /home/gpadmin/.bashrc
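One caveat with plain `echo ... >> .bashrc`: rerunning the setup appends duplicate lines. An idempotent variant (a sketch, demonstrated on a temp file standing in for /home/gpadmin/.bashrc):

```shell
# Sketch: append an env line only if it is not already present, so rerunning
# the setup does not duplicate entries. A temp file stands in for .bashrc.
rc=$(mktemp)
line='export MASTER_DATA_DIRECTORY=/greenplum/data/master/gpseg-1'

append_once() {
    # -x: match the whole line, -F: treat it as a fixed string
    grep -qxF "$1" "$2" || echo "$1" >> "$2"
}

append_once "$line" "$rc"
append_once "$line" "$rc"    # second call is a no-op
wc -l < "$rc"                # 1
```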

✨ 3.4 Initialize the Database

On the master node, create an initialization config file initgp_config and set its parameters.

# Create the data directories first (master directory on the master and
# standby nodes, primary and mirror directories on the segment nodes);
# creating all three directories on every node also works.
su - gpadmin

# Master and standby nodes
mkdir -p /greenplum/data/master

# Segment nodes
mkdir -p /greenplum/data/primary
mkdir -p /greenplum/data/mirror


# On the master node; list one DATA_DIRECTORY entry per primary instance per segment host
cat > /home/gpadmin/conf/initgp_config <<"EOF"
# Primary segment data directories; multiple entries mean multiple segment
# instances per host
declare -a DATA_DIRECTORY=(/greenplum/data/primary /greenplum/data/primary)
# Mirror data directories, matched one-to-one with the primary directories
declare -a MIRROR_DATA_DIRECTORY=(/greenplum/data/mirror /greenplum/data/mirror)
# Cluster (array) name
ARRAY_NAME="rptgp"
# Segment prefix
SEG_PREFIX=gpseg
# Starting port number for primary segments
PORT_BASE=6000
MIRROR_PORT_BASE=7000
MASTER_PORT=5432
MASTER_HOSTNAME=mdw1
MASTER_DIRECTORY=/greenplum/data/master
DATABASE_NAME=rptgpdb
MACHINE_LIST_FILE=/home/gpadmin/conf/seg_hosts
EOF
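Since initgp_config is ordinary shell syntax, it can be sanity-checked before running gpinitsystem — in particular, MIRROR_DATA_DIRECTORY must have exactly one entry per DATA_DIRECTORY entry. A minimal sketch (the config is staged in a temp file here for illustration):

```shell
# Sketch: check that the primary and mirror directory lists in an
# initgp_config have the same length (one mirror per primary).
cfg=$(mktemp)
cat > "$cfg" <<"EOF"
declare -a DATA_DIRECTORY=(/greenplum/data/primary /greenplum/data/primary)
declare -a MIRROR_DATA_DIRECTORY=(/greenplum/data/mirror /greenplum/data/mirror)
EOF

# Extract the contents of each (...) list and count its entries.
primaries=$(sed -n 's/^declare -a DATA_DIRECTORY=(\(.*\))/\1/p' "$cfg" | wc -w)
mirrors=$(sed -n 's/^declare -a MIRROR_DATA_DIRECTORY=(\(.*\))/\1/p' "$cfg" | wc -w)
echo "primaries per host: $primaries, mirrors per host: $mirrors"
```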


# Run the initialization command on the master node
su - gpadmin
gpinitsystem -c /home/gpadmin/conf/initgp_config -e=jeames -s mdw2 -P 5432 -S /greenplum/data/master/gpseg-1 -m 200 -b 256MB

4. Verify the Cluster

If initialization succeeds, Greenplum starts automatically: port 5432 is listening on the master node, and you can connect with psql to start exploring Greenplum.

[gpadmin@mdw1 ~]$ netstat -tulnp | grep 5432
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 0.0.0.0:5432            0.0.0.0:*               LISTEN      58954/postgres
tcp6       0      0 :::5432                 :::*                    LISTEN      58954/postgres


[gpadmin@mdw1 ~]$ psql postgres -c 'SELECT * FROM pg_stat_replication;'
  pid  | usesysid | usename | application_name | client_addr  | client_hostname | client_port |         backend_start         | backend_xmin |   state
 | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
-------+----------+---------+------------------+--------------+-----------------+-------------+-------------------------------+--------------+----------
-+---------------+----------------+----------------+-----------------+---------------+------------
 60403 |       10 | gpadmin | gp_walreceiver   | 192.168.6.13 |                 |       33428 | 2023-06-11 15:37:38.811796+08 |              | streaming
 | 0/C000000     | 0/C000000      | 0/C000000      | 0/C000000       |             1 | sync
(1 row)


[gpadmin@mdw1 ~]$ psql -d rptgpdb
psql (9.4.26)
Type "help" for help.

rptgpdb=# \l
                               List of databases
   Name    |  Owner  | Encoding |  Collate   |   Ctype    |  Access privileges
-----------+---------+----------+------------+------------+---------------------
 postgres  | gpadmin | UTF8     | en_US.utf8 | en_US.utf8 |
 rptgpdb   | gpadmin | UTF8     | en_US.utf8 | en_US.utf8 |
 template0 | gpadmin | UTF8     | en_US.utf8 | en_US.utf8 | =c/gpadmin         +
           |         |          |            |            | gpadmin=CTc/gpadmin
 template1 | gpadmin | UTF8     | en_US.utf8 | en_US.utf8 | =c/gpadmin         +
           |         |          |            |            | gpadmin=CTc/gpadmin

rptgpdb=# show port;
 port
------
 5432
(1 row)

rptgpdb=# show listen_addresses;
 listen_addresses
------------------
 *
(1 row)

rptgpdb=# select * from gp_segment_configuration order by 1;
 dbid | content | role | preferred_role | mode | status | port | hostname | address |            datadir
------+---------+------+----------------+------+--------+------+----------+---------+--------------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | mdw1     | mdw1    | /greenplum/data/master/gpseg-1
    2 |       0 | p    | p              | s    | u      | 6000 | sdw1     | sdw1    | /greenplum/data/primary/gpseg0
    3 |       1 | p    | p              | s    | u      | 6001 | sdw1     | sdw1    | /greenplum/data/primary/gpseg1
    4 |       2 | p    | p              | s    | u      | 6000 | sdw2     | sdw2    | /greenplum/data/primary/gpseg2
    5 |       3 | p    | p              | s    | u      | 6001 | sdw2     | sdw2    | /greenplum/data/primary/gpseg3
    6 |       0 | m    | m              | s    | u      | 7000 | sdw2     | sdw2    | /greenplum/data/mirror/gpseg0
    7 |       1 | m    | m              | s    | u      | 7001 | sdw2     | sdw2    | /greenplum/data/mirror/gpseg1
    8 |       2 | m    | m              | s    | u      | 7000 | sdw1     | sdw1    | /greenplum/data/mirror/gpseg2
    9 |       3 | m    | m              | s    | u      | 7001 | sdw1     | sdw1    | /greenplum/data/mirror/gpseg3
   10 |      -1 | m    | m              | s    | u      | 5432 | mdw2     | mdw2    | /greenplum/data/master/gpseg-1
(10 rows)

rptgpdb=# select * from gp_segment_configuration order by hostname,port;
 dbid | content | role | preferred_role | mode | status | port | hostname | address |            datadir
------+---------+------+----------------+------+--------+------+----------+---------+--------------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | mdw1     | mdw1    | /greenplum/data/master/gpseg-1
   10 |      -1 | m    | m              | s    | u      | 5432 | mdw2     | mdw2    | /greenplum/data/master/gpseg-1
    2 |       0 | p    | p              | s    | u      | 6000 | sdw1     | sdw1    | /greenplum/data/primary/gpseg0
    3 |       1 | p    | p              | s    | u      | 6001 | sdw1     | sdw1    | /greenplum/data/primary/gpseg1
    8 |       2 | m    | m              | s    | u      | 7000 | sdw1     | sdw1    | /greenplum/data/mirror/gpseg2
    9 |       3 | m    | m              | s    | u      | 7001 | sdw1     | sdw1    | /greenplum/data/mirror/gpseg3
    4 |       2 | p    | p              | s    | u      | 6000 | sdw2     | sdw2    | /greenplum/data/primary/gpseg2
    5 |       3 | p    | p              | s    | u      | 6001 | sdw2     | sdw2    | /greenplum/data/primary/gpseg3
    6 |       0 | m    | m              | s    | u      | 7000 | sdw2     | sdw2    | /greenplum/data/mirror/gpseg0
    7 |       1 | m    | m              | s    | u      | 7001 | sdw2     | sdw2    | /greenplum/data/mirror/gpseg1
(10 rows)

5. Cluster Management Commands

1. Cluster status
[gpadmin@mdw1 ~]$ gpstate

20230611:15:46:55:063875 gpstate:mdw1:gpadmin-[INFO]:-Starting gpstate with args:
20230611:15:46:55:063875 gpstate:mdw1:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.24.3 build commit:25d3498a400ca5230e81abb94861f23389315213 Open Source'
20230611:15:46:55:063875 gpstate:mdw1:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.26 (Greenplum Database 6.24.3 build commit:25d3498a400ca5230e81abb94861f23389315213 Open Source) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on May  3 2023 21:05:45'
20230611:15:46:55:063875 gpstate:mdw1:gpadmin-[INFO]:-Obtaining Segment details from master...
20230611:15:46:55:063875 gpstate:mdw1:gpadmin-[INFO]:-Gathering data from segments...
..
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-Greenplum instance status summary
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-----------------------------------------------------
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Master instance                                           = Active
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Master standby                                            = mdw2
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Standby master state                                      = Standby host passive
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total segment instance count from metadata                = 8
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-----------------------------------------------------
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Primary Segment Status
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-----------------------------------------------------
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total primary segments                                    = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total primary segment valid (at master)                   = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total primary segment failures (at master)                = 0
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number of postmaster.pid files found                = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number of /tmp lock files found                     = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number postmaster processes found                   = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-----------------------------------------------------
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Mirror Segment Status
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-----------------------------------------------------
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total mirror segments                                     = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total mirror segment valid (at master)                    = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total mirror segment failures (at master)                 = 0
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number of postmaster.pid files found                = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number of /tmp lock files found                     = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number postmaster processes found                   = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number mirror segments acting as primary segments   = 0
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-   Total number mirror segments acting as mirror segments    = 4
20230611:15:46:57:063875 gpstate:mdw1:gpadmin-[INFO]:-----------------------------------------------------

gpstate -s shows detailed status for every instance.
gpstate -f shows the standby master's replication details.



2. Stop and start the cluster
[gpadmin@mdw1 ~]$ gpstop -a
[gpadmin@mdw1 ~]$ gpstart -a
