I integrated Hive 0.10 with HBase 0.96; the process is much the same for other versions.
1. Edit Hive's configuration file conf/hive-site.xml
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.35.59:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive123</value>
</property>
<property>
<name>hive.querylog.location</name>
<value>/usr/local/test/hive-0.10.0-bin/logs</value>
</property>
<property>
<name>hive.aux.jars.path</name>
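<!-- Important: the value must be written on a single line, with no line breaks and no spaces between the file paths. The line breaks and spaces shown here are only for readability. -->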
<value>file:///usr/local/test//lib/hive-hbase-handler-0.13.0.jar,
file:///usr/local/test/hive-0.10.0-bin/lib/hbase-client-0.96.1.1-hadoop2.jar,
file:///usr/local/test/hive-0.10.0-bin/lib/hbase-common-0.96.1.1-hadoop2.jar,
file:///usr/local/test/hive-0.10.0-bin/lib/hbase-common-0.96.1.1-hadoop2-tests.jar,
file:///usr/local/test/hive-0.10.0-bin/lib/hbase-protocol-0.96.1.1-hadoop2.jar,
file:///usr/local/test/hive-0.10.0-bin/lib/hbase-server-0.96.1.1-hadoop2.jar,
file:///usr/local/test/hive-0.10.0-bin/lib/htrace-core-2.01.jar,
file:///usr/local/test/hive-0.10.0-bin/lib/zookeeper-3.4.5.jar,
file:///usr/local/test/hive-0.10.0-bin/lib/protobuf-java-2.5.0.jar,
file:///usr/local/test/hive-0.10.0-bin/lib/guava-12.0.1.jar
</value>
</property>
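Since the value of hive.aux.jars.path has to end up on one line, it can be easier to generate it than to type it by hand. The following is a minimal sketch, assuming absolute jar paths; join_aux_jars is a hypothetical helper name and is not part of Hive.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): joins absolute jar paths into a
# single comma-separated line with a file:// prefix and no spaces or newlines.
join_aux_jars() {
    result=""
    for jar in "$@"; do
        if [ -z "$result" ]; then
            result="file://$jar"
        else
            result="$result,file://$jar"
        fi
    done
    printf '%s\n' "$result"
}
```

For example, join_aux_jars /usr/local/test/hive-0.10.0-bin/lib/hbase-*.jar prints one line that can be pasted between <value> and </value>.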
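2. Copy the jars listed in hive.aux.jars.path from HBase's lib directory into Hive's lib directory. Also copy every other jar in HBase's lib directory whose name starts with hbase. Then delete the hbase-0.92.0-tests.jar and hbase-0.92.0.jar that originally shipped in Hive's lib directory, along with any other jars that conflict with the newly added HBase jars.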
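The copy step can be scripted roughly as follows. This is only a sketch: sync_hbase_jars is a hypothetical function name, the two directories are passed in as arguments, and it removes only the two 0.92.0 jars named above. Any other conflicting jars still have to be identified and removed by hand.

```shell
#!/bin/sh
# Hypothetical helper: sync_hbase_jars HBASE_LIB_DIR HIVE_LIB_DIR
sync_hbase_jars() {
    hbase_lib=$1
    hive_lib=$2
    # Remove the HBase 0.92.0 jars that originally sit in Hive's lib directory.
    rm -f "$hive_lib/hbase-0.92.0.jar" "$hive_lib/hbase-0.92.0-tests.jar"
    # Copy every hbase-*.jar plus the non-HBase jars named in hive.aux.jars.path.
    for jar in "$hbase_lib"/hbase-*.jar \
               "$hbase_lib/htrace-core-2.01.jar" \
               "$hbase_lib/zookeeper-3.4.5.jar" \
               "$hbase_lib/protobuf-java-2.5.0.jar" \
               "$hbase_lib/guava-12.0.1.jar"; do
        # Skip names that do not exist (including an unmatched glob).
        if [ -f "$jar" ]; then
            cp "$jar" "$hive_lib/"
        fi
    done
    return 0
}
```

It would be called with the two lib directories, for example sync_hbase_jars /path/to/hbase/lib /usr/local/test/hive-0.10.0-bin/lib, where the HBase path is a placeholder.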
3. Start Hive against the cluster. zookeeper1, zookeeper2, and zookeeper3 are the ZooKeeper hosts that HBase connects to; the default port is 2181.
From Hive's bin directory, run: ./hive -hiveconf hbase.zookeeper.quorum=zookeeper1,zookeeper2,zookeeper3
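If ZooKeeper does not listen on the default port, the port can presumably be passed the same way through HBase's hbase.zookeeper.property.clientPort property. The sketch below only builds the argument string; hive_hbase_args is a hypothetical helper name, and passing the port through -hiveconf is an assumption that has not been verified on this setup.

```shell
#!/bin/sh
# Hypothetical helper: prints the -hiveconf arguments for a given quorum.
# $1 = comma-separated ZooKeeper hosts, $2 = client port (defaults to 2181).
hive_hbase_args() {
    quorum=$1
    port=${2:-2181}
    printf '%s\n' "-hiveconf hbase.zookeeper.quorum=$quorum -hiveconf hbase.zookeeper.property.clientPort=$port"
}
```

Usage from Hive's bin directory: ./hive $(hive_hbase_args zookeeper1,zookeeper2,zookeeper3)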
4. Create a Hive table that is mapped to an HBase table
CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
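Notes: hbase.table.name is the name of the table in HBase.
hbase.columns.mapping maps each Hive column, in order, either to the HBase row key (:key) or to an HBase column written as family:qualifier (here cf1:val).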
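To make the positional nature of hbase.columns.mapping concrete, the small sketch below pairs a list of Hive column names with a mapping string and prints what each column corresponds to. show_mapping is a hypothetical helper written for illustration only; it does not talk to Hive or HBase.

```shell
#!/bin/sh
# Hypothetical helper: show_mapping "hive,cols" ":key,family:qualifier"
# Pairs the N-th Hive column with the N-th mapping entry.
show_mapping() {
    cols=$1
    mapping=$2
    while [ -n "$cols" ] && [ -n "$mapping" ]; do
        col=${cols%%,*}         # first remaining Hive column
        target=${mapping%%,*}   # first remaining mapping entry
        if [ "$target" = ":key" ]; then
            printf '%s -> HBase row key\n' "$col"
        else
            printf '%s -> family %s, qualifier %s\n' "$col" "${target%%:*}" "${target#*:}"
        fi
        # Drop the consumed entry from each list; stop when nothing is left.
        case $cols in *,*) cols=${cols#*,} ;; *) cols="" ;; esac
        case $mapping in *,*) mapping=${mapping#*,} ;; *) mapping="" ;; esac
    done
}
```

For the table above, show_mapping "key,value" ":key,cf1:val" reports that key is the row key and that value is stored in family cf1 under qualifier val.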
For subsequent data operations, see this blog post: http://blog.csdn.net/aaaaaaaa2000/article/details/7565456
5. Problems encountered during the integration
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/MasterNotRunningException
This error occurs when the HBase jars have not been copied into Hive's lib directory.
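One way to check whether the missing class is actually present is to search the jars in Hive's lib directory for it. The following is a sketch that assumes the unzip command is installed; find_class_jar is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: find_class_jar LIB_DIR some/package/ClassName
# Prints every jar in LIB_DIR that contains the class; returns 1 if none does.
find_class_jar() {
    lib_dir=$1
    class_path=$2
    found=1
    for jar in "$lib_dir"/*.jar; do
        # Skip the literal pattern when the directory has no jars.
        [ -f "$jar" ] || continue
        if unzip -l "$jar" 2>/dev/null | grep -Fq "$class_path.class"; then
            printf '%s\n' "$jar"
            found=0
        fi
    done
    return "$found"
}
```

For example: find_class_jar /usr/local/test/hive-0.10.0-bin/lib org/apache/hadoop/hbase/MasterNotRunningException. If nothing is printed, the jars from step 2 are probably missing.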