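I. Architecture overview
1. Network structure
2. Deployment architecture
II. Base configuration of the cluster hosts
1. Preparing the machines
2. Configuring the machines
1⃣️ Configure host name resolution
2⃣️ Set system parameters
3⃣️ Configure Linux file descriptor limits
4⃣️ Mount the data disk
5⃣️ Disable the firewall
6⃣️ Configure the system clock
7⃣️ Reboot so that all settings take effect
III. Install the Greenplum database and configure the gpadmin user
1. Add the machines to /etc/ansible/hosts on the Ansible host
2. Sample Ansible playbook
3. Download the matching Greenplum package
4. Run the ansible-playbook command
IV. Configure passwordless SSH
1. Log in to the master host and switch to the gpadmin user
2. Source the Greenplum path file
3. Generate an SSH key on the master node
4. Use ssh-copy-id to add the master's public key to the other hosts
5. Create the host list file on the master node
6. Run gpssh-exkeys to enable n-to-n passwordless SSH
7. Verify
V. Create the data storage areas
1. Create the storage directory on the master
2. Use gpssh to create the directories on the segment hosts
VI. Initialize the Greenplum cluster
1. Log in to the master host and switch to the gpadmin user
2. Create the host file for initialization
3. Copy the initialization configuration file to the user's home directory
4. Edit the initialization configuration
5. Save and exit the file
6. Run the initialization command
1⃣️ Change to gpadmin's home directory on the master and run the initialization command
2⃣️ Confirm the installation
3⃣️ Troubleshoot a failed initialization
VII. Configure the Greenplum environment variables
1. Log in to the master host as the gpadmin user
2. Edit the .bashrc file
3. Add the Greenplum path script and MASTER_DATA_DIRECTORY to the file
4. Save and exit
5. Source the file so that the changes take effect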
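Greenplum (GP) processes large volumes of data across multiple hosts. The master node is the entry point to the whole GP cluster: users connect to it and submit SQL statements through it. Segment nodes store and process the data, and the master coordinates the workload among them, as shown in the figure below:
The deployment in this article uses a single master (no standby) and segment hosts without mirrors. For a highly available cluster with a standby master and mirrored segments, see the official documentation at https://gpdb.docs.pivotal.io/6-0/main/index.html, or leave a question in the comments.
This example uses five Tencent Cloud machines with 16 cores and 32 GB of memory each; every machine has one 100 GB data disk attached.
Machine configuration covers the following steps:
Log in to each host as root, edit /etc/hosts, and append the IP-to-host-name mappings so that the five machines can reach one another by name, for example: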
# master
10.0.0.1 mdw
# segments
10.0.0.2 sdw1
10.0.0.3 sdw2
10.0.0.4 sdw3
10.0.0.5 sdw4
To test, ping another machine's host name from any machine; for example, run ping sdw1 on the master.
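Before pinging each name by hand, it can help to confirm that every host name actually has an entry in the file. The sketch below is one way to do that; the function name missing_hosts is hypothetical, and the demo runs against a temporary file rather than the real /etc/hosts.

```shell
#!/usr/bin/env bash
# List the cluster host names that have no entry in a hosts file.
# Usage: missing_hosts HOSTS_FILE NAME...
missing_hosts() {
  local hosts_file=$1
  shift
  local name
  for name in "$@"; do
    # Skip comment lines, then look for the name in any column after the IP.
    if ! awk -v n="$name" '
          /^[[:space:]]*#/ { next }
          { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
          END { exit !found }' "$hosts_file"; then
      echo "$name"
    fi
  done
}

# Demo against a temporary copy of the mapping above (sdw4 left out on purpose).
demo_hosts=$(mktemp)
printf '%s\n' '# master' '10.0.0.1 mdw' '# segments' \
  '10.0.0.2 sdw1' '10.0.0.3 sdw2' '10.0.0.4 sdw3' > "$demo_hosts"
missing_hosts "$demo_hosts" mdw sdw1 sdw2 sdw3 sdw4   # prints: sdw4
rm -f "$demo_hosts"
```

On a real host the call would be missing_hosts /etc/hosts mdw sdw1 sdw2 sdw3 sdw4; no output means every name is mapped.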
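Add the following settings to /etc/sysctl.conf. The values marked with notes <1>, <2>, <3>, and <5> must be adapted to each host:
# Note <1>: kernel.shmall = _PHYS_PAGES / 2
kernel.shmall = 4000000000
# Note <2>: kernel.shmmax = kernel.shmall * PAGE_SIZE
kernel.shmmax = 500000000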
kernel.shmmni = 4096
vm.overcommit_memory = 2
vm.overcommit_ratio = 95
# Note <3>
net.ipv4.ip_local_port_range = 10000 65535
kernel.sem = 500 2048000 200 40960
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
# Note <5>
vm.dirty_background_ratio = 0
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736
vm.dirty_bytes = 4294967296
Notes:
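Notes <1> and <2>
kernel.shmall is the total number of shared memory pages.
kernel.shmmax is the maximum size of a single shared memory segment, in bytes.
As a rule, both should correspond to half of physical memory. They can be calculated from the operating system values _PHYS_PAGES and PAGE_SIZE:
kernel.shmall = ( _PHYS_PAGES / 2)
kernel.shmmax = ( _PHYS_PAGES / 2) * PAGE_SIZE
The following two commands print the values: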
$ echo $(expr $(getconf _PHYS_PAGES) / 2)
$ echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
If the calculated kernel.shmmax is smaller than the system default, keep the system default.
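To make the arithmetic concrete, here is a small sketch that applies the two formulas to explicit inputs. The function name shm_settings is hypothetical; the sample numbers assume 4096-byte pages and 32 GiB of RAM, which roughly matches the machines in this example.

```shell
#!/usr/bin/env bash
# Print kernel.shmall and kernel.shmmax for a given page count and page size.
# Usage: shm_settings PHYS_PAGES PAGE_SIZE
shm_settings() {
  local phys_pages=$1 page_size=$2
  local shmall=$(( phys_pages / 2 ))       # half of all physical pages
  local shmmax=$(( shmall * page_size ))   # the same amount, in bytes
  echo "kernel.shmall = $shmall"
  echo "kernel.shmmax = $shmmax"
}

# Worked example: 8388608 pages of 4096 bytes each (32 GiB of RAM).
shm_settings 8388608 4096
# kernel.shmall = 4194304
# kernel.shmmax = 17179869184
```

On a real host the inputs would come from getconf, for example shm_settings "$(getconf _PHYS_PAGES)" "$(getconf PAGE_SIZE)".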
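Note <3>
Primary segments use ports starting at 6000.
Mirror segments use ports starting at 7000.
The local (ephemeral) port range must not overlap these ports, so the default value can be kept: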
net.ipv4.ip_local_port_range = 10000 65535
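The overlap rule can be checked mechanically. The sketch below tests the port bases used in this article (5432 for the master, 6000 for primaries, 7000 for mirrors) against the range 10000 to 65535; the function name port_in_range is hypothetical.

```shell
#!/usr/bin/env bash
# Succeed when PORT lies inside the range LOW..HIGH (inclusive).
# Usage: port_in_range PORT LOW HIGH
port_in_range() {
  local port=$1 low=$2 high=$3
  [ "$port" -ge "$low" ] && [ "$port" -le "$high" ]
}

# Check the port bases used in this article against the range 10000-65535.
for port in 5432 6000 7000; do
  if port_in_range "$port" 10000 65535; then
    echo "port $port collides with the local port range"
  else
    echo "port $port is outside the local port range"
  fi
done
```

On a running Linux host the active range can be read from /proc/sys/net/ipv4/ip_local_port_range and passed in instead of the literal values.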
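Note <5>
For hosts with more than 64 GB of memory, the recommended values are:
vm.dirty_background_ratio = 0
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736 # 1.5GB
vm.dirty_bytes = 4294967296 # 4GB
For hosts with 64 GB of memory or less, such as the 32 GB machines in this example, remove vm.dirty_background_bytes and vm.dirty_bytes and set:
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10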
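The choice between the two variants can be expressed as a short function. This is only a sketch: the name dirty_settings is hypothetical, and treating a host with exactly 64 GB as the smaller case is an assumption.

```shell
#!/usr/bin/env bash
# Print the vm.dirty_* lines for a host with MEM_GB gigabytes of RAM.
# Usage: dirty_settings MEM_GB
dirty_settings() {
  local mem_gb=$1
  if [ "$mem_gb" -gt 64 ]; then
    # Large hosts: cap dirty pages by absolute size (1.5 GB and 4 GB).
    echo "vm.dirty_background_ratio = 0"
    echo "vm.dirty_ratio = 0"
    echo "vm.dirty_background_bytes = 1610612736"
    echo "vm.dirty_bytes = 4294967296"
  else
    # Smaller hosts (assumed here to include exactly 64 GB): use percentages.
    echo "vm.dirty_background_ratio = 3"
    echo "vm.dirty_ratio = 10"
  fi
}

dirty_settings 32   # the 32 GB machines used in this article
```

For the 32 GB machines the call prints the two ratio lines only.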
Add the following limits to /etc/security/limits.conf:
* soft nofile 524288
* hard nofile 524288
* soft nproc 131072
* hard nproc 131072
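The official documentation recommends the XFS file system, although other file systems also work.
A sample configuration:
To mount the /dev/data device on /data automatically at boot, add the following line to /etc/fstab: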
/dev/data /data xfs nodev,noatime,nobarrier,inode64 0 0
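A quick way to confirm what is recorded for a mount point is to read the entry back. The sketch below does that with awk; the function name fstab_entry is hypothetical, and the demo uses a temporary file instead of the real /etc/fstab.

```shell
#!/usr/bin/env bash
# Print the filesystem type and mount options recorded for MOUNT_POINT.
# Usage: fstab_entry FSTAB_FILE MOUNT_POINT
fstab_entry() {
  awk -v mp="$2" '
    /^[[:space:]]*#/ { next }          # ignore comment lines
    $2 == mp { print $3, $4 }' "$1"
}

# Demo against a temporary file holding the line shown above.
demo_fstab=$(mktemp)
printf '%s\n' '# data disk' \
  '/dev/data /data xfs nodev,noatime,nobarrier,inode64 0 0' > "$demo_fstab"
fstab_entry "$demo_fstab" /data   # prints: xfs nodev,noatime,nobarrier,inode64
rm -f "$demo_fstab"
```

On a configured host, fstab_entry /etc/fstab /data should report xfs together with the options from the sample line.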
Check the firewall status, then stop and disable it:
# systemctl status firewalld
# systemctl stop firewalld.service
# systemctl disable firewalld.service
# /sbin/chkconfig iptables off
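Synchronize the clocks of the segment hosts with the master.
On each segment host, add the following line to /etc/ntp.conf:
server mdw prefer
Here mdw is the master host name configured earlier.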
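This step installs the Greenplum software package, creates the gpadmin user, and sets the directory ownership.
The example below installs with ansible-playbook; a package manager such as yum or apt works as well: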
[greenplum]
10.0.0.1
10.0.0.2
10.0.0.3
10.0.0.4
10.0.0.5
A connection user name and password can also be set per host, for example:
[greenplum]
10.0.0.1 ansible_ssh_user=root ansible_ssh_pass=xxx
10.0.0.2 ansible_ssh_user=root ansible_ssh_pass=xxx
10.0.0.3 ansible_ssh_user=root ansible_ssh_pass=xxx
10.0.0.4 ansible_ssh_user=root ansible_ssh_pass=xxx
10.0.0.5 ansible_ssh_user=root ansible_ssh_pass=xxx
Ansible Playbook - Greenplum Database Installation for CentOS 7
---
- hosts: greenplum
  vars:
    - version: "6.0.0"
    - greenplum_admin_user: "gpadmin"
    - greenplum_admin_password: "$changeme"
    # - package_path: passed via the command line with: -e package_path=./greenplum-db-6.0.0-rhel7-x86_64.rpm
  remote_user: root
  become: yes
  become_method: sudo
  connection: ssh
  gather_facts: yes
  tasks:
    - name: create greenplum admin user
      user:
        name: "{{ greenplum_admin_user }}"
        password: "{{ greenplum_admin_password | password_hash('sha512', 'DvkPtCtNH+UdbePZfm9muQ9pU') }}"
    - name: copy package to host
      copy:
        src: "{{ package_path }}"
        dest: /tmp
    - name: install package
      yum:
        name: "/tmp/{{ package_path | basename }}"
        state: present
    - name: cleanup package file from host
      file:
        path: "/tmp/{{ package_path | basename }}"
        state: absent
    - name: find install directory
      find:
        paths: /usr/local
        patterns: 'greenplum*'
        file_type: directory
      register: installed_dir
    - name: change install directory ownership
      file:
        path: '{{ item.path }}'
        owner: "{{ greenplum_admin_user }}"
        group: "{{ greenplum_admin_user }}"
        recurse: yes
      with_items: "{{ installed_dir.files }}"
    - name: update pam_limits
      pam_limits:
        domain: "{{ greenplum_admin_user }}"
        limit_type: '-'
        limit_item: "{{ item.key }}"
        value: "{{ item.value }}"
      with_dict:
        nofile: 524288
        nproc: 131072
    - name: find installed greenplum version
      shell: . /usr/local/greenplum-db/greenplum_path.sh && /usr/local/greenplum-db/bin/postgres --gp-version
      register: postgres_gp_version
    - name: fail if the correct greenplum version is not installed
      fail:
        msg: "Expected greenplum version {{ version }}, but found '{{ postgres_gp_version.stdout }}'"
      when: "version is not defined or version not in postgres_gp_version.stdout"
https://network.pivotal.io/products/pivotal-gpdb/#/releases/449820/file_groups/2047
ansible-playbook ansible-playbook.yml -i hosts -e package_path=./greenplum-db-6.0.0-rhel7-x86_64.rpm
$ source /usr/local/greenplum-db-/greenplum_path.sh
$ ssh-keygen
Press Enter at every prompt to accept the defaults.
$ ssh-copy-id sdw1
$ ssh-copy-id sdw2
$ ssh-copy-id sdw3
$ ssh-copy-id sdw4
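When prompted, enter the password of the gpadmin user on the target host.
This completes 1-to-n passwordless SSH from the master to the other hosts.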
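With more segment hosts, the repeated ssh-copy-id calls are easier to manage as a loop. The sketch below wraps them in a function with a dry-run switch; the function name copy_key_to_hosts and the DRY_RUN variable are hypothetical conveniences, not part of any Greenplum tooling.

```shell
#!/usr/bin/env bash
# Run ssh-copy-id once per host. With DRY_RUN=1, only print the commands.
# Usage: copy_key_to_hosts HOST...
copy_key_to_hosts() {
  local host
  for host in "$@"; do
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "ssh-copy-id $host"
    else
      # Keep going if one host fails, but report it on stderr.
      ssh-copy-id "$host" || echo "failed: $host" >&2
    fi
  done
}

# Dry run: shows what would be executed without contacting any host.
DRY_RUN=1 copy_key_to_hosts sdw1 sdw2 sdw3 sdw4
```

Dropping DRY_RUN=1 runs the real commands, each of which still prompts for the gpadmin password on the target host.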
In gpadmin's home directory, create a file named hostfile_exkeys that lists the host names of all nodes, including the master:
mdw
sdw1
sdw2
sdw3
sdw4
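Tip: make sure /etc/hosts on every machine contains the host name mappings (see section II, 2, step 1); otherwise the hosts cannot reach one another by name.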
From gpadmin's home directory, run:
$ gpssh-exkeys -f hostfile_exkeys
Run the following command. If every host returns the same listing, the setup works:
$ gpssh -f hostfile_exkeys -e 'ls -l /usr/local/greenplum-db-'
Log in as root and run the following commands:
# mkdir -p /data/master
# chown gpadmin:gpadmin /data/master
Create a file named hostfile_gpssh_segonly that contains only the segment host names:
sdw1
sdw2
sdw3
sdw4
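Because the segment-only list is the full host list minus the master, it can be derived rather than retyped. The sketch below shows one way; the function name segment_hosts_only is hypothetical, and the demo works in a scratch directory instead of gpadmin's home directory.

```shell
#!/usr/bin/env bash
# Print a host file with the master's name removed, leaving segment hosts only.
# Usage: segment_hosts_only ALL_HOSTS_FILE MASTER_NAME > SEGMENT_FILE
segment_hosts_only() {
  # -v inverts the match, -x matches whole lines, -F treats the name literally.
  grep -vxF -- "$2" "$1"
}

# Demo in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' mdw sdw1 sdw2 sdw3 sdw4 > "$workdir/hostfile_exkeys"
segment_hosts_only "$workdir/hostfile_exkeys" mdw > "$workdir/hostfile_gpssh_segonly"
cat "$workdir/hostfile_gpssh_segonly"
rm -rf "$workdir"
```

The demo prints sdw1 through sdw4, one per line.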
Use gpssh to create the primary and mirror directories:
# source /usr/local/greenplum-db/greenplum_path.sh
# gpssh -f hostfile_gpssh_segonly -e 'mkdir -p /data/primary'
# gpssh -f hostfile_gpssh_segonly -e 'mkdir -p /data/mirror'
# gpssh -f hostfile_gpssh_segonly -e 'chown -R gpadmin /data/*'
$ su - gpadmin
Create the gpconfigs directory in gpadmin's home directory:
$ mkdir -p ~/gpconfigs
Create the hostfile_gpinitsystem file and add the host name of each segment node, one per line:
$ vim ~/gpconfigs/hostfile_gpinitsystem
sdw1
sdw2
sdw3
sdw4
$ cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config \
/home/gpadmin/gpconfigs/gpinitsystem_config
Open the file copied in the previous step and adjust the following settings as needed:
ARRAY_NAME="Greenplum Data Platform"
SEG_PREFIX=gpseg
PORT_BASE=6000
declare -a DATA_DIRECTORY=(/data/primary /data/primary /data/primary /data/primary)
MASTER_HOSTNAME=mdw
MASTER_DIRECTORY=/data/master
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
This example configures no mirror segments; they can be added later with the gpaddmirrors utility if needed.
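Each entry in DATA_DIRECTORY produces one primary segment on every segment host, so the configuration above yields 4 x 4 = 16 primaries. The sketch below computes that number from the two files; the function name primary_segment_count is hypothetical, and it assumes the configuration file can be sourced as plain bash.

```shell
#!/usr/bin/env bash
# Count primary segments: one per DATA_DIRECTORY entry on every segment host.
# Usage: primary_segment_count CONFIG_FILE HOST_FILE
primary_segment_count() {
  local config=$1 hostfile=$2
  (
    # Source in a subshell so the config's variables do not leak out.
    . "$config"
    local hosts
    hosts=$(grep -c . "$hostfile")   # non-empty lines = segment hosts
    echo $(( hosts * ${#DATA_DIRECTORY[@]} ))
  )
}

# Demo with the values from this article, written to a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' sdw1 sdw2 sdw3 sdw4 > "$workdir/hostfile_gpinitsystem"
echo 'declare -a DATA_DIRECTORY=(/data/primary /data/primary /data/primary /data/primary)' \
  > "$workdir/gpinitsystem_config"
primary_segment_count "$workdir/gpinitsystem_config" "$workdir/hostfile_gpinitsystem"   # prints 16
rm -rf "$workdir"
```

Pass the function a path that contains a slash (for example ~/gpconfigs/gpinitsystem_config), since the shell's source command looks up bare file names in PATH.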
$ cd ~
$ gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem
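The utility validates the initialization configuration file, checks network connectivity between the nodes, and verifies that the configured directories are accessible. If all checks pass, it prompts: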
=> Continue with Greenplum creation? Yy/Nn
Enter Y to start the initialization.
After a successful initialization, the console prints:
=> Greenplum Database instance successfully created.
If the initialization fails, the output resembles the following:
...
.......................
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[INFO]:------------------------------------------------
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[INFO]:-Parallel process exit status
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[INFO]:------------------------------------------------
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[INFO]:-Total processes marked as completed = 6
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[INFO]:-Total processes marked as killed = 0
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[WARN]:-Total processes marked as failed = 3 <<<<<
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[INFO]:------------------------------------------------
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[FATAL]:-Errors generated from parallel processes
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[INFO]:-Dumped contents of status file to the log file
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[INFO]:-Building composite backout file
20170124:16:53:30:gpinitsystem:htcom:postgres-[FATAL]:-Failures detected, see log file /home/postgresql/gpAdminLogs/gpinitsystem_20170124.log for more detail Script Exiting!
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[WARN]:-Script has left Greenplum Database in an incomplete state
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[WARN]:-Run command /bin/bash /home/postgresql/gpAdminLogs/backout_gpinitsystem_postgres_20170124_165242 to remove these changes
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[INFO]:-Start Function BACKOUT_COMMAND
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[INFO]:-End Function BACKOUT_COMMAND
Use grep to search the log for error messages and locate the problem:
$ grep FATAL /home/postgresql/gpAdminLogs/gpinitsystem_20170124.log
Run the backout script named in the output to undo the changes left by the failed initialization:
$ /bin/bash /home/postgresql/gpAdminLogs/backout_gpinitsystem_postgres_20170124_165242
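The backout script name changes with every run, so it is convenient to pull it out of the saved output instead of copying it by hand. The sketch below assumes the gpinitsystem console output was saved to a file and that the message has the same wording as the sample above; the function name backout_script_from_log is hypothetical.

```shell
#!/usr/bin/env bash
# Print the backout script path reported in saved gpinitsystem output.
# Usage: backout_script_from_log OUTPUT_FILE
backout_script_from_log() {
  # Capture the path between "Run command /bin/bash " and " to remove these changes".
  sed -n 's|.*Run command /bin/bash \(.*\) to remove these changes.*|\1|p' "$1" | tail -n 1
}

# Demo with the warning line from the sample output above.
demo_log=$(mktemp)
cat > "$demo_log" <<'EOF'
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[WARN]:-Script has left Greenplum Database in an incomplete state
20170124:16:53:30:031785 gpinitsystem:htcom:postgres-[WARN]:-Run command /bin/bash /home/postgresql/gpAdminLogs/backout_gpinitsystem_postgres_20170124_165242 to remove these changes
EOF
backout_script_from_log "$demo_log"
rm -f "$demo_log"
```

The demo prints /home/postgresql/gpAdminLogs/backout_gpinitsystem_postgres_20170124_165242, which can then be passed to /bin/bash after reviewing it.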
$ su - gpadmin
$ vi ~/.bashrc
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1
$ source ~/.bashrc
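When this step is scripted rather than done in an editor, appending the two lines unconditionally would duplicate them on every run. The sketch below adds a line only if it is not already present; the function name append_once is hypothetical, and the demo writes to a temporary file rather than the real ~/.bashrc.

```shell
#!/usr/bin/env bash
# Append LINE to FILE unless an identical line is already there.
# Usage: append_once FILE LINE
append_once() {
  local file=$1 line=$2
  touch "$file"
  # -q quiet, -x whole-line match, -F literal string.
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo: running the same additions twice leaves exactly two lines.
demo_rc=$(mktemp)
for run in 1 2; do
  append_once "$demo_rc" 'source /usr/local/greenplum-db/greenplum_path.sh'
  append_once "$demo_rc" 'export MASTER_DATA_DIRECTORY=/data/master/gpseg-1'
done
cat "$demo_rc"
rm -f "$demo_rc"
```

On the master the target file would be ~/.bashrc of the gpadmin user.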