GPDB-gphdfs

Greenplum natively supports loading data from HDFS into the database in parallel via the gphdfs protocol. This post briefly covers the deployment and testing details.

1. Install Java on the master and all segment hosts
1.1 Remove any previously installed Java packages

  yum -y remove java
  Or:
  rpm -qa | grep java                                       # find the installed Java packages
  rpm -e --nodeps java-1.4.2-gcj-compat-1.4.2.0-40jpp.115   # remove each package individually

1.2 Install Java; here we install Java 6

  cd /usr
  chmod 777 ./jdk-6u45-linux-x64-rpm.bin
  ./jdk-6u45-linux-x64-rpm.bin

1.3 Configure environment variables

vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.6.0_45
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

1.4 Verify

source /etc/profile
java -version        # confirm the new JDK is active

java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)

2. Hadoop-related setup

2.1 Copy the Hadoop installation directory from a Hadoop node to the GPDB master and every segment host

2.2 Make sure the JAVA_HOME setting in hadoop/etc/hadoop/hadoop-env.sh matches the Java installation on the GPDB hosts; if it does not, edit hadoop-env.sh

2.3 Add the Hadoop namenode and datanode hosts to /etc/hosts
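
For example, entries like the following (the IP addresses and the extra hostnames are placeholders; c9test91 is the namenode host used later in this post):

```
192.168.1.91   c9test91     # namenode (placeholder address)
192.168.1.92   c9test92     # datanode (placeholder)
192.168.1.93   c9test93     # datanode (placeholder)
```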

3. Environment settings

3.1 Add the following on the master and every segment host

  su - gpadmin
  vi .bashrc
  export JAVA_HOME=/usr/java/jdk1.6.0_45
  export CLASSPATH=$JAVA_HOME/lib/tools.jar
  export HADOOP_HOME=/home/hadoop/hadoop
  export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

3.2 Set the configuration parameters on the master

  gpconfig -c gp_hadoop_target_version -v "'hadoop2'"
  gpconfig -c gp_hadoop_home -v "'/home/hadoop/hadoop'"

  gpstop -M fast -ra        # restart, then verify the parameters:
  gpconfig -s gp_hadoop_target_version
  gpconfig -s gp_hadoop_home

4. Basic verification

  su - gpadmin
  hdfs dfs -ls /

5. Grant privileges

psql dbname
-- write (INSERT) privilege
GRANT INSERT ON PROTOCOL gphdfs TO gpadmin;
-- read (SELECT) privilege
GRANT SELECT ON PROTOCOL gphdfs TO gpadmin;
-- all privileges
GRANT ALL ON PROTOCOL gphdfs TO gpadmin;

6. Accessing data through an external table

CREATE EXTERNAL TABLE external_test
(
       id int,
       name text
)
LOCATION ('gphdfs://c9test91:9000/test.txt')
FORMAT 'TEXT' (delimiter '\t');

select * from external_test;
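
The INSERT privilege granted in section 5 also allows writing to HDFS in parallel through a writable external table. A minimal sketch, assuming the same namenode host and port as the readable example; the output path `/external_test_out` and the table name are illustrative:

```sql
-- Writable external table: rows INSERTed here are written to HDFS
-- in parallel by the segments.
CREATE WRITABLE EXTERNAL TABLE external_test_out
(
       id int,
       name text
)
LOCATION ('gphdfs://c9test91:9000/external_test_out')
FORMAT 'TEXT' (delimiter '\t');

INSERT INTO external_test_out SELECT * FROM external_test;
```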

Note: the port in the LOCATION URL is the namenode's IPC (ClientProtocol) port, set by the fs.default.name parameter (commonly 9000).
If the wrong port is specified, an error like the following is raised:

ERROR: external table gphdfs protocol command ended with error. 17/03/28 12:15:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable  (seg0 slice1 viltual2:40000 pid=27368)
DETAIL:

17/03/28 12:15:12 WARN conf.Configuration: hdfs-site.xml:an attempt to override final parameter: dfs.namenode.name.dir;  Ignoring.
17/03/28 12:15:12 WARN conf.Configuration: hdfs-site.xml:an attempt to override final parameter: dfs.datanode.data.dir;  Ignoring.
17/03/28 12:15:12 WARN conf.Configuration: hdfs-site.xml:an attempt to override final paramete
Command: 'gphdfs://c9test91:8020/test.txt'
External table external_test, file gphdfs://c9test91:8020/test.txt
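
To avoid this mismatch, the NameNode address can be read on the Hadoop side with `hdfs getconf -confKey fs.defaultFS` and the port extracted before writing the LOCATION URL. A small sketch using shell parameter expansion; the sample value mirrors this post's cluster and should be replaced with the real command output:

```shell
# Sample value as would be returned by `hdfs getconf -confKey fs.defaultFS`;
# substitute the actual output from your cluster.
fs_default="hdfs://c9test91:9000"

# Strip the scheme, then split into host and port.
hostport="${fs_default#hdfs://}"
host="${hostport%%:*}"
port="${hostport##*:}"

echo "NameNode host: $host, IPC port: $port"
```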
