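Author: 潘永雷
Date: April 13, 2017
Notes: This document describes how to install Greenplum (greenplum-db-4.3.9.1-build-1-rhel5-x86_64.bin) on CentOS. Details may differ slightly between environments; the installation was verified in the environment described here.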
[gpadmin@sdw1 ~]$ uname -a
Linux sdw1 2.6.32-642.el6.x86_64 #1 SMP Tue May 10 17:27:01 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[gpadmin@sdw1 ~]$ cat /etc/issue
CentOS release 6.8 (Final)
Kernel \r on an \m
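192.168.0.223 Master
192.168.0.224 Segment
192.168.0.225 Segment
After these parameters are changed, each machine must be rebooted for them to take effect.
The following is the minimum configuration for the kernel parameters, which are normally set in /etc/sysctl.conf (add any entries that are missing and correct any that differ):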
kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.overcommit_memory = 2
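The "add what is missing, fix what differs" step can be checked with a small script. The following is a minimal sketch, not part of the original procedure: the function name check_sysctl_file is hypothetical, /etc/sysctl.conf is assumed as the default location, and only four of the settings above are included to keep it short.

```shell
# Hypothetical helper: report which required kernel settings are missing
# from, or differ in, a sysctl-style file (assumed default: /etc/sysctl.conf).
# Only a subset of the settings from this guide is listed, for brevity.
check_sysctl_file() {
    local file="${1:-/etc/sysctl.conf}"
    local status=0 key want have
    while IFS='=' read -r key want; do
        # Take the last assignment of this key and normalise its whitespace.
        have=$(awk -F= -v k="$key" '
            { name = $1; gsub(/[ \t]/, "", name) }
            name == k {
                val = $2
                gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim both ends
                gsub(/[ \t]+/, " ", val)           # collapse inner whitespace
                last = val
            }
            END { print last }' "$file")
        if [ -z "$have" ]; then
            echo "MISSING $key = $want"; status=1
        elif [ "$have" != "$want" ]; then
            echo "DIFFERS $key = $have (expected $want)"; status=1
        fi
    done <<EOF
kernel.shmmax=500000000
kernel.shmmni=4096
kernel.sem=250 512000 100 2048
vm.overcommit_memory=2
EOF
    return $status
}
```

Calling check_sysctl_file with no argument reads /etc/sysctl.conf; a non-zero return status means at least one setting still needs attention.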
Edit /etc/security/limits.conf and add the following lines (note that the leading * must be included):
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
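Because this edit is repeated on every machine, it helps if it can be re-run without creating duplicate lines. The following is a sketch under that assumption; ensure_line and add_gp_limits are hypothetical helper names, and the match is on the exact line text, so an existing entry with different spacing would not be recognised.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so the edit can be re-run safely.
ensure_line() {
    local file="$1" line="$2"
    # -x matches the whole line, -F treats it as a fixed string
    # (so the leading * is literal, not a pattern).
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Add the four limits from this guide to the given file
# (default: /etc/security/limits.conf, which needs root to modify).
add_gp_limits() {
    local file="${1:-/etc/security/limits.conf}"
    ensure_line "$file" "* soft nofile 65536"
    ensure_line "$file" "* hard nofile 65536"
    ensure_line "$file" "* soft nproc 131072"
    ensure_line "$file" "* hard nproc 131072"
}
```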
Check the firewall:
# /sbin/chkconfig --list iptables
When the firewall is disabled, the output is: iptables 0:off 1:off 2:off 3:off 4:off 5:off 6:off
Disable it with the following command:
/sbin/chkconfig iptables off
Edit /etc/selinux/config:
SELINUX=disabled
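If the following appears, SELinux has been disabled successfully (it must be disabled on every machine).
Add the kernel boot parameter elevator=deadline.
On the master host, change the first line of /etc/hosts from "localhost localhost.localdomain localhost4 localhost4.localdomain4" to "localhost mdw localhost4 localhost4.localdomain4",
and add: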
192.168.0.223 mdw
192.168.0.224 sdw1
192.168.0.225 sdw2
On the other two segment hosts, only the first line of /etc/hosts needs to change: use sdw1 and sdw2 respectively in place of mdw.
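A quick way to confirm that a hosts file carries the three cluster entries is sketched below. The function name check_cluster_hosts is hypothetical, and the addresses are the ones listed in this guide; it only inspects the file and does not test real name resolution.

```shell
# Hypothetical helper: check that a hosts file contains a line mapping
# each cluster IP address from this guide to its hostname.
check_cluster_hosts() {
    local file="${1:-/etc/hosts}" status=0 ip name
    while read -r ip name; do
        # Succeeds if some line starts with $ip and lists $name after it.
        if ! awk -v ip="$ip" -v n="$name" '
                $1 == ip { for (i = 2; i <= NF; i++) if ($i == n) ok = 1 }
                END { exit(ok ? 0 : 1) }' "$file"; then
            echo "missing entry: $ip $name"; status=1
        fi
    done <<EOF
192.168.0.223 mdw
192.168.0.224 sdw1
192.168.0.225 sdw2
EOF
    return $status
}
```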
Then edit the file to match the figure below, and after exiting run the following command:
hostname mdw
(mdw is the hostname of the master node; the segment nodes are sdw1 and sdw2)
passwd gpadmin
(run on all three machines)
On the master, create the file all_hosts in /home/gpadmin with the following content:
mdw
sdw1
sdw2
On the master, create the file all_segs in /home/gpadmin with the following content:
sdw1
sdw2
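The two host files differ only by the master entry, so they can be generated together. This is a small sketch with a hypothetical function name (write_host_files); the target directory defaults to /home/gpadmin as used in this guide.

```shell
# Hypothetical helper: write all_hosts and all_segs into a directory
# (default: /home/gpadmin, as used throughout this guide).
write_host_files() {
    local dir="${1:-/home/gpadmin}"
    printf '%s\n' mdw sdw1 sdw2 > "$dir/all_hosts"
    # all_segs is all_hosts without the master (mdw).
    grep -vx 'mdw' "$dir/all_hosts" > "$dir/all_segs"
}
```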
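(a) Obtain the greenplum-db-*.*.*.*-build-1-RHEL5-x86_64.bin file and copy it to /usr/local for installation (the official installer uses this directory by default, so installing here avoids extra parameter changes later).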
/bin/bash greenplum-db-*.*.*.*-build-1-RHEL5-x86_64.bin
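Run the command above to install; you may need to answer yes several times when prompted.
(b) Change the owner of the Greenplum installation
# chown -R gpadmin /usr/local/greenplum-db
# chgrp -R gpadmin /usr/local/greenplum-db
# chown -R gpadmin /usr/local/greenplum-db-*.*.*.*
# chgrp -R gpadmin /usr/local/greenplum-db-*.*.*.*
(c) Configure the environment variables:
The same variables are also defined in /usr/local/greenplum-db/greenplum_path.sh, which can serve as a reference.
Open /etc/profile to modify the environment variables:
vim /etc/profile
Add the following lines (GPHOME should point to the directory of the version actually installed):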
GPHOME=/usr/local/greenplum-db-4.3.6.2
PATH=$GPHOME/bin:$GPHOME/ext/python/bin:$PATH
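export GPHOME
export PATH
(d) Set up the data directory on the master host. This is where the data is stored, so make sure it has enough space.
# mkdir /gpmaster
# chown -R gpadmin /gpmaster
# chgrp -R gpadmin /gpmaster
Important: after creating this directory, update the gpadmin user's environment variables; otherwise GP initialization cannot find the master's data directory.
As the gpadmin user: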
vim ~/.bashrc
Append at the end:
MASTER_DATA_DIRECTORY=/gpmaster
export MASTER_DATA_DIRECTORY
Afterwards, remember to run source ~/.bashrc
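Since a wrong or missing variable only shows up later as a failed initialization or startup, it can be worth checking the environment right after sourcing .bashrc. The following is a sketch; check_gp_env is a hypothetical name, and it only verifies that GPHOME and MASTER_DATA_DIRECTORY are set and point to existing directories.

```shell
# Hypothetical helper: verify that the variables this guide relies on
# are set in the current shell and point to existing directories.
check_gp_env() {
    local status=0 var val
    for var in GPHOME MASTER_DATA_DIRECTORY; do
        # Indirect lookup of the variable named in $var.
        eval "val=\${$var:-}"
        if [ -z "$val" ]; then
            echo "$var is not set"; status=1
        elif [ ! -d "$val" ]; then
            echo "$var points to a missing directory: $val"; status=1
        fi
    done
    return $status
}
```

Run check_gp_env as gpadmin in a fresh login shell; no output and a zero status mean both variables look sane.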
(a) On the master node, create a tar file of the GP installation
cd /usr/local
gtar -cvf /home/gpadmin/gp.tar greenplum-db-*.*.*.*
(b) Copy it to the segments
scp /home/gpadmin/gp.tar sdw1:/usr/local
scp /home/gpadmin/gp.tar sdw2:/usr/local
(c) On each segment node, run:
gtar --directory /usr/local -xvf /usr/local/gp.tar
Create a symlink to the current GP version directory:
ln -s /usr/local/greenplum-db-*.*.*.* /usr/local/greenplum-db
Change the directory owner:
chown -R gpadmin /usr/local/greenplum-db
chgrp -R gpadmin /usr/local/greenplum-db
chown -R gpadmin /usr/local/greenplum-db-*.*.*.*
chgrp -R gpadmin /usr/local/greenplum-db-*.*.*.*
(d) Create the storage areas on the segments
mkdir /home/gpadmin/primary # primary data
mkdir /home/gpadmin/mirror # mirror data
Change the permissions and owner (as in step (c) above).
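With more than two segments, typing the copy command from step (b) once per host becomes tedious. The sketch below derives the commands from the all_segs file instead; print_copy_commands is a hypothetical name, and the function only prints the scp commands so they can be reviewed before being run.

```shell
# Hypothetical helper: print (rather than run) the scp command from
# step (b) for every host listed in a segment host file.
print_copy_commands() {
    local segs="${1:-/home/gpadmin/all_segs}" host
    while read -r host; do
        # Skip blank lines.
        [ -n "$host" ] || continue
        echo "scp /home/gpadmin/gp.tar $host:/usr/local"
    done < "$segs"
}
```

Once the printed commands look right, they can be executed by piping the output to sh.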
(1) Synchronize the clocks
As gpadmin:
Check the clocks: gpssh -f /home/gpadmin/all_hosts -v date
Synchronize: gpssh -f /home/gpadmin/all_hosts -v ntpd
(2) System check:
As gpadmin:
gpcheckos -f /home/gpadmin/all_hosts
(3) Copy the template configuration file, then edit gpinitsystem_config:
cp /usr/local/greenplum-db/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/
# FILE NAME: gpinitsystem_config
# Configuration file needed by the gpinitsystem
################################################
#### REQUIRED PARAMETERS
################################################
#### Name of this Greenplum system enclosed in quotes.
ARRAY_NAME="EMC Greenplum DW"
#### Naming convention for utility-generated data directories.
SEG_PREFIX=gpseg
#### Base number by which primary segment port numbers
#### are calculated.
PORT_BASE=40000
#### File system location(s) where primary segment data directories
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses).
declare -a DATA_DIRECTORY=(/home/gpadmin/primary)
#### OS-configured hostname or IP address of the master host.
MASTER_HOSTNAME=mdw
#### File system location where the master data directory
#### will be created.
MASTER_DIRECTORY=/gpmaster
#### Port number for the master instance.
MASTER_PORT=5432
#### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh
#### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8
#### Default server-side character set encoding.
ENCODING=UNICODE
################################################
#### OPTIONAL MIRROR PARAMETERS
################################################
#### Base number by which mirror segment port numbers
#### are calculated.
#MIRROR_PORT_BASE=50000
#### Base number by which primary file replication port
#### numbers are calculated.
#MIRROR_REPLICATION_PORT_BASE=51000
#### File system location(s) where mirror segment data directories
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the
#### DATA_DIRECTORY parameter.
##declare -a MIRROR_DATA_DIRECTORY=(/home/gpadmin/mirror /home/gpadmin/mirror /data1/mirror /data2/mirror /data2/mirror /data2/mirror)
declare -a MIRROR_DATA_DIRECTORY=(/home/gpadmin/mirror)
################################################
#### Create a database of this name after initialization.
#DATABASE_NAME=name_of_database
DATABASE_NAME=gpexmp
#### Specify the location of the host address file here instead of
#### with the -h option of gpinitsystem.
MACHINE_LIST_FILE=/home/gpadmin/all_segs
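The comments in the file state that the number of mirror locations must equal the number of primary locations. That rule can be checked before running gpinitsystem, as in the sketch below. The name check_mirror_count is hypothetical. Because the file is bash syntax, the sketch sources it in a separate bash process; this executes the file, so it should only be used on a config you trust.

```shell
# Hypothetical helper: confirm that a gpinitsystem_config file lists the
# same number of mirror locations as primary locations.
# The file uses bash syntax (declare -a ...), so it is sourced in a
# separate bash process to keep its variables out of the current shell.
check_mirror_count() {
    local cfg="$1" counts primaries mirrors
    [ -r "$cfg" ] || { echo "cannot read $cfg" >&2; return 2; }
    counts=$(bash -c '. "$1"; echo "${#DATA_DIRECTORY[@]} ${#MIRROR_DATA_DIRECTORY[@]}"' _ "$cfg") || return 2
    primaries=${counts% *}
    mirrors=${counts#* }
    echo "primary locations: $primaries, mirror locations: $mirrors"
    # The return status reflects whether the two counts match.
    [ "$primaries" -eq "$mirrors" ]
}
```

For the configuration above, check_mirror_count /home/gpadmin/gpconfigs/gpinitsystem_config should report one location of each kind.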
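(4) Initialize the database:
gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config
Many INFO and WARN messages are printed. If the following information appears, initialization has succeeded;
then enter Y.
Initialization also creates a database named gpexmp, which can be used for testing.
As gpadmin:
Run the following: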
psql -d gpexmp
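error while loading shared libraries: libnetsnmp.so.20: cannot open shared object file: No such file or directory
This error means the environment variables are not configured correctly. Copy the contents of /usr/local/greenplum-db/greenplum_path.sh into .bashrc; the libraries are located through the LD_LIBRARY_PATH environment variable that it sets.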
[gpadmin@mdw gpseg-1]$ gpstart -v
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[INFO]:-Starting gpstart with args: -v
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[DEBUG]:-Setting level of parallelism to: 64
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[DEBUG]:-Checking if GPHOME env variable is set.
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[DEBUG]:-Checking if MASTER_DATA_DIRECTORY env variable is set.
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[DEBUG]:-Checking if LOGNAME or USER env variable is set.
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[DEBUG]:---Checking that current user can use GP binaries
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[DEBUG]:-Obtaining master's port from master data directory
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[DEBUG]:-Read from postgresql.conf port=5432
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[DEBUG]:-Read from postgresql.conf max_connections=250
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[DEBUG]:-gp_external_grant_privileges is None
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[INFO]:-Reading the gp_dbid file - /gpmaster/gp_dbid...
20170412:16:23:44:003224 gpstart:mdw:gpadmin-[ERROR]:-gpstart failed. exiting...
Traceback (most recent call last):
  File "/usr/local/greenplum-db/lib/python/gppylib/mainUtils.py", line 281, in simple_main_locked
    exitCode = commandObject.run()
  File "/usr/local/greenplum-db/./bin/gpstart", line 95, in run
    self._prepare()
  File "/usr/local/greenplum-db/./bin/gpstart", line 196, in _prepare
    self._basic_setup()
  File "/usr/local/greenplum-db/./bin/gpstart", line 212, in _basic_setup
    self.dbidfile = GpDbidFile(self.master_datadir, do_read=True, logger=get_logger_if_verbose())
  File "/usr/local/greenplum-db/lib/python/gppylib/gp_dbid.py", line 39, in __init__
    self.read_gp_dbid()
  File "/usr/local/greenplum-db/lib/python/gppylib/gp_dbid.py", line 49, in read_gp_dbid
    with open(self.filepath) as f:
IOError: [Errno 2] No such file or directory: '/gpmaster/gp_dbid'
Run it with the master data directory given explicitly instead:
gpstart -d /gpmaster/gpseg-1/ -v
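The traceback shows gpstart looking for /gpmaster/gp_dbid, while the command that works points -d at /gpmaster/gpseg-1/, which suggests the file lives one level below the directory configured earlier. The sketch below locates such directories; find_gp_datadirs is a hypothetical name, and the search depth of two levels is an assumption based on the layout seen here.

```shell
# Hypothetical helper: list directories under a base path that contain
# a gp_dbid file, i.e. candidates for MASTER_DATA_DIRECTORY or gpstart -d.
# Assumes the file sits at most two levels below the base directory.
find_gp_datadirs() {
    local base="${1:-/gpmaster}"
    find "$base" -maxdepth 2 -type f -name gp_dbid 2>/dev/null |
        while read -r f; do dirname "$f"; done
}
```

If the function prints /gpmaster/gpseg-1, that path is the value to pass to -d, or to use for MASTER_DATA_DIRECTORY in .bashrc.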
gpstop -u
20170412:16:28:07:003451 gpstop:mdw:gpadmin-[INFO]:-Starting gpstop with args: -u
20170412:16:28:07:003451 gpstop:mdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20170412:16:28:07:003451 gpstop:mdw:gpadmin-[ERROR]:-gpstop error: postmaster.pid file does not exist. is Greenplum instance already stopped?
Run it as follows instead: gpstop -u -d /gpmaster/gpseg-1/
Command was: 'env GPSESSID=0000000000 GPKILL=NEVER GPERA=None $GPHOME/bin/pg_ctl -D /data/master/gpseg-1 -l /data/master/gpseg-1/pg_log/startup.log -w -t 600 -o " -p 5432 -b 1 -z 0 --silent-mode=true -i -M master -C -1 -x 0 -c gp_role=utility " start'
rc=1, stdout='waiting for server to start......could not start server
', stderr='pg_ctl: PID file "/data/master/gpseg-1/postmaster.pid" does not exist
Be careful when editing pg_hba.conf: a mistake in this file causes startup to fail with the error above.
If the port is changed after GP has been installed, both .bashrc and postgresql.conf must be updated.
.bashrc needs: export PGPORT=15432
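Because the port now lives in two places, a mismatch between them is easy to introduce. The sketch below compares the two values; check_port_consistency is a hypothetical name, the path to postgresql.conf is passed in by the caller, and the parser only handles a plain unquoted "port = N" line.

```shell
# Hypothetical helper: read the port from a postgresql.conf file and
# compare it with the PGPORT environment variable.
check_port_consistency() {
    local conf="$1" conf_port
    # Take the last uncommented "port = N" line in the file.
    conf_port=$(sed -n 's/^[[:space:]]*port[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1)
    echo "postgresql.conf port=${conf_port:-unset} PGPORT=${PGPORT:-unset}"
    # The return status reflects whether both values are set and equal.
    [ -n "$conf_port" ] && [ "$conf_port" = "${PGPORT:-}" ]
}
```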
This installation followed this guide: http://blog.csdn.net/yxlllll/article/details/50266269
/usr/local/greenplum-db
/gpmaster/gpseg-1/pg_hba.conf
Once gpfdist is installed, gpload is available as well and can be used directly.
Prerequisite components
wget http://pyyaml.org/download/libyaml/yaml-0.1.7.tar.gz
tar -xvf yaml-0.1.7.tar.gz
cd yaml-0.1.7
./configure
make
make install
unzip greenplum-loaders-4.3.8.2-build-1-RHEL5-x86_64.zip
sh greenplum-loaders-4.3.8.2-build-1-RHEL5-x86_64.bin -y
By default it is installed under /usr/local/greenplum-loaders-4.3.8.1-build-1
Edit /etc/profile:
PATH=/usr/local/greenplum-loaders-4.3.8.1-build-1/bin:$JAVA_HOME/bin:$PATH
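source /usr/local/greenplum-loaders-4.3.8.1-build-1/greenplum_loaders_path.sh
Then run: source /usr/local/greenplum-loaders-4.3.8.1-build-1/greenplum_loaders_path.sh
You can create an external table that points at gpfdist. The external table only specifies the gpfdist path, so the data is not moved into GP. Alternatively, import the data with gpload.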
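To make the external-table option concrete, the sketch below prints a CREATE EXTERNAL TABLE statement for a gpfdist location. Everything specific in it is an assumption for illustration: the function name external_table_sql, the table ext_demo with its two columns, and the host, port, and file name passed by the caller. It presumes a gpfdist process is already serving the directory that contains the file.

```shell
# Hypothetical helper: print a CREATE EXTERNAL TABLE statement that reads
# a CSV file served by gpfdist. The table name and columns are examples only.
external_table_sql() {
    local host="$1" port="$2" file="$3"
    cat <<EOF
CREATE EXTERNAL TABLE ext_demo (id int, name text)
LOCATION ('gpfdist://$host:$port/$file')
FORMAT 'CSV' (DELIMITER ',');
EOF
}
```

For example, external_table_sql mdw 8081 demo.csv | psql -d gpexmp would create the table in the test database; querying ext_demo then reads the file through gpfdist without loading it into GP.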
gpload download link: https://network.pivotal.io/products/pivotal-gpdb/#/releases/2146
Reference links
http://www.infocool.net/kb/OtherDB/201705/360938.html
http://blog.csdn.net/mchdba/article/details/72540806
https://discuss.pivotal.io/hc/en-us/articles/115002064167-GPLOAD-Unable-to-Import-the-PyGreSQL-Python-Module-pg-py-