Hive Extensions (6): HPL/SQL (Stored Procedure Support)

Software environment:

Linux: CentOS 6.7
Hadoop: 2.6.5
ZooKeeper: 3.4.8


Host configuration:

There are three machines, m1, m2, and m3; the username on every host is centos
192.168.179.201: m1
192.168.179.202: m2
192.168.179.203: m3

m1: Zookeeper, Namenode, DataNode, ResourceManager, NodeManager, Master, Worker
m2: Zookeeper, Namenode, DataNode, ResourceManager, NodeManager, Worker
m3: Zookeeper, DataNode, NodeManager, Worker

Resources:

Setup tutorial:
    http://lxw1234.com/archives/2015/09/487.htm
HPL/SQL download:
    http://www.hplsql.org/download
HPL/SQL documentation:
    http://www.hplsql.org/doc


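Notes:

Never write Hive-syntax statements in HPL/SQL. Use the syntax of MySQL, Oracle, or another database that HPL/SQL supports (see the official site for the list). Otherwise you will get errors saying that the dual table cannot be found or that the dual table has no such column.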


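Version selection:

HPL/SQL 0.3.17 fixed the problem of being forced to read FROM the dual table, so you need 0.3.17 or later. This installation uses Hive 2.1.1, which bundles HPL/SQL 0.3.31, so the problem is already fixed here.
To fix the forced FROM dual read in a setup that still has it, download HPL/SQL 0.3.17 or later, put the extracted hplsql-0.3.17.jar into $HIVE_HOME/lib, and rename it to match the hive-hplsql-*.jar pattern, for example hive-hplsql-0.3.17.jar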
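
The copy-and-rename step above can be sketched as a small shell helper. This is only an illustration: the function name install_hplsql_jar and its two arguments are hypothetical, and it assumes the downloaded jar is named hplsql-<version>.jar. It does not touch any hive-hplsql jar that is already in the lib directory.

```shell
# Hypothetical helper (not part of Hive or HPL/SQL): copy a downloaded
# hplsql-<version>.jar into <hive_home>/lib as hive-hplsql-<version>.jar.
install_hplsql_jar() {
    src_jar="$1"      # path to the extracted jar, e.g. ./hplsql-0.3.17.jar
    hive_home="$2"    # Hive installation directory, e.g. "$HIVE_HOME"

    base=$(basename "$src_jar")
    case "$base" in
        hplsql-*.jar) ;;   # expected naming pattern
        *) echo "unexpected jar name: $base" >&2; return 1 ;;
    esac

    if [ ! -d "$hive_home/lib" ]; then
        echo "no lib directory under: $hive_home" >&2
        return 1
    fi

    # Prefixing "hive-" yields the hive-hplsql-*.jar name described above.
    cp "$src_jar" "$hive_home/lib/hive-$base"
}

# Usage (paths are examples only):
#   install_hplsql_jar ./hplsql-0.3.17.jar "$HIVE_HOME"
```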




1. Edit hive-site.xml

HPL/SQL connects to Hive over Thrift. Edit hive-site.xml and add the following properties

<property>
    <name>hive.server2.thrift.bind.host</name>
    <value>m1</value>
</property>
<property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
</property>


2. Edit hplsql-site.xml

To configure the connection between HPL/SQL and Hive, create hplsql-site.xml (skip the creation if the file already exists) and copy the following configuration into it

<configuration>
<property>
  <name>hplsql.conn.default</name>
  <value>hive2conn</value>
  <description>The default connection profile</description>
</property>
<property>
  <name>hplsql.conn.hiveconn</name>
  <value>org.apache.hadoop.hive.jdbc.HiveDriver;jdbc:hive://</value>
  <description>Hive embedded JDBC (not requiring HiveServer)</description>
</property>

<property>
  <name>hplsql.conn.init.hiveconn</name>
  <value>
     set mapred.job.queue.name=default;
     set hive.execution.engine=mr;
     use default;
  </value>
  <description>Statements for execute after connection to the database</description>
</property>
<property>
  <name>hplsql.conn.convert.hiveconn</name>
  <value>true</value>
  <description>Convert SQL statements before execution</description>
</property>
<property>
  <name>hplsql.conn.hive2conn</name>
  <value>org.apache.hive.jdbc.HiveDriver;jdbc:hive2://m1:10000</value>
  <description>HiveServer2 JDBC connection</description>
</property>

<property>
  <name>hplsql.conn.init.hive2conn</name>
  <value>
     set mapred.job.queue.name=default;
     set hive.execution.engine=mr;
     use default;
  </value>
  <description>Statements for execute after connection to the database</description>
</property>
<property>
  <name>hplsql.conn.convert.hive2conn</name>
  <value>true</value>
  <description>Convert SQL statements before execution</description>
</property>
<property>
  <name>hplsql.conn.db2conn</name>
  <value>com.ibm.db2.jcc.DB2Driver;jdbc:db2://localhost:50001/dbname;user;password</value>
  <description>IBM DB2 connection</description>
</property>
<property>
  <name>hplsql.conn.tdconn</name>
  <value>com.teradata.jdbc.TeraDriver;jdbc:teradata://localhost/database=dbname,logmech=ldap;user;password</value>
  <description>Teradata connection</description>
</property>
<property>
  <name>hplsql.conn.mysqlconn</name>
  <value>com.mysql.jdbc.Driver;jdbc:mysql://localhost/test;user;password</value>
  <description>MySQL connection</description>
</property>
<property>
  <name>hplsql.dual.table</name>
  <value>default.dual</value>
  <description>Single row, single column table for internal operations</description>
</property>
<property>
  <name>hplsql.insert.values</name>
  <value>native</value>
  <description>How to execute INSERT VALUES statement: native (default) and select</description>
</property>
<property>
  <name>hplsql.onerror</name>
  <value>exception</value>
  <description>Error handling behavior: exception (default), seterror and stop</description>
</property>
<property>
  <name>hplsql.temp.tables</name>
  <value>native</value>
  <description>Temporary tables: native (default) and managed</description>
</property>
<property>
  <name>hplsql.temp.tables.schema</name>
  <value></value>
  <description>Schema for managed temporary tables</description>
</property>
<property>
  <name>hplsql.temp.tables.location</name>
  <value>/home/centos/soft/hive/tmp/plhql</value>
  <description>Location for managed temporary tables in HDFS</description>
</property>

<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>m1</value>
</property>
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>
</property>
</configuration>


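3. Configure the dual table (this step can be skipped)

Start the Hive service and create the table that hplsql-site.xml points to (with the configuration above, that is the dual table in the default database)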

use default;
create table dual(DUMMY VARCHAR(1));


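4. Start the HiveServer2 and Metastore services before using HPL/SQL stored procedures

Each command below runs in the foreground, so run them in separate terminals

sh $HIVE_HOME/bin/hive --service metastore
sh $HIVE_HOME/bin/hive --service hiveserver2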
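
Once both services are up, a first stored procedure can be tried roughly as follows. This is a minimal sketch, not a step from the original setup: the script name hello_proc.sql and its location are arbitrary, the procedure set_message is only an illustration, and it assumes the hplsql launcher sits at $HIVE_HOME/bin/hplsql. The script uses only procedural statements, so it does not query any Hive table.

```shell
# Write a small HPL/SQL script; the file name and location are arbitrary.
script="${TMPDIR:-/tmp}/hello_proc.sql"
cat > "$script" <<'EOF'
CREATE PROCEDURE set_message(IN name STRING, OUT result STRING)
BEGIN
  SET result = 'Hello, ' || name || '!';
END;

-- Call the procedure and print the value it returned
DECLARE str STRING;
CALL set_message('world', str);
PRINT str;
EOF

# Run the script with the hplsql launcher if it is present
# (assumed location: $HIVE_HOME/bin/hplsql); otherwise just report the path.
if [ -x "${HIVE_HOME:-}/bin/hplsql" ]; then
    "$HIVE_HOME/bin/hplsql" -f "$script"
else
    echo "hplsql launcher not found under HIVE_HOME; script written to $script"
fi
```

The -f option runs a script file; a short statement can also be passed inline with -e.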



