Syncing Hive and HBase
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
1. Copy hive-hbase-handler-1.2.1.jar from the Hive directory into hbase/lib (it must be present on h15, h16, and h17)
#scp /home/hive-1.2.1/lib/hive-hbase-handler-1.2.1.jar root@h15:/home/hbase-0.98/lib/
#scp /home/hive-1.2.1/lib/hive-hbase-handler-1.2.1.jar root@h16:/home/hbase-0.98/lib/
#scp /home/hive-1.2.1/lib/hive-hbase-handler-1.2.1.jar root@h17:/home/hbase-0.98/lib/
Also copy all of the HBase jars into hive/lib
#cp /home/hbase-0.98/lib/*.jar /home/hive-1.2.1/lib/
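2. Add the property to the Hive configuration file:
3. In the Hive client on h15, run the following command to create a temporary table (a real project needs a Hive external table; this test uses a managed, i.e. internal, table)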
CREATE TABLE t_phone_cdr(key string, dest string,type int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:dest,cf1:type")
TBLPROPERTIES ("hbase.table.name" = "t_phone_cdr", "hbase.mapred.output.outputtable" = "t_phone_cdr");
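-- The first column of the intermediate table is named key (required)
4. Open the HBase client and check whether the table created from Hive exists
5. Add data in HBase (type the commands by hand rather than pasting them)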
#put 't_phone_cdr','186_1234','cf1:dest','139'
#put 't_phone_cdr','186_1234','cf1:type','1'
6. Open the Hive client and check whether the data has arrived
#select * from t_phone_cdr;
If Hive returns the same data that was put into HBase, the Hive-HBase integration works!
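Steps 5 and 6 can also be scripted instead of typed. The sketch below is one possible way to do that and is not part of the original procedure. hbase_puts is a hypothetical helper that only prints put commands in the same form as step 5. The commented usage lines assume that hbase shell accepts commands on standard input and that ZooKeeper runs on h15, h16, and h17; both are assumptions, since these notes do not give the property value for step 2.

```shell
#!/bin/bash
# hbase_puts (hypothetical helper): print one HBase shell put command
# per column=value pair, all for the same table, row key, and column family.
hbase_puts() {
  local table="$1" rowkey="$2" family="$3"
  shift 3
  local pair
  for pair in "$@"; do
    # ${pair%%=*} is the column qualifier, ${pair#*=} is the cell value
    printf "put '%s','%s','%s:%s','%s'\n" \
      "$table" "$rowkey" "$family" "${pair%%=*}" "${pair#*=}"
  done
}

# Print the two puts from step 5 (nothing is sent to HBase here)
hbase_puts t_phone_cdr 186_1234 cf1 dest=139 type=1

# On a running cluster the same output could be piped into the HBase shell
# and the result checked from Hive (assumed usage, not verified here):
#   hbase_puts t_phone_cdr 186_1234 cf1 dest=139 type=1 | hbase shell
#   hive --hiveconf hbase.zookeeper.quorum=h15,h16,h17 -e 'select * from t_phone_cdr;'
```

Printing the commands first makes it easy to review them before anything is written to HBase.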
The following are other ways of loading data, for reference (not used here):
1. Create the external table:
CREATE EXTERNAL TABLE tmp_order
(key string,id string,user_id string,order_amount double,order_status int,order_create_time timestamp)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" =
":key,order:order_id,order:user_id,order:order_amount,order:order_status,order:order_create_time")
TBLPROPERTIES ("hbase.table.name" = "t_order");
2. Import data from the Hive table t_hive_order into the Hive temporary table (tmp_order) by running this command in Hive:
insert into table tmp_order
select concat(user_id,'_',date_format(order_createtime,'yyyy'),'_',id),
id,
user_id,
order_amount,
order_status,
order_createtime
from t_hive_order where day='20160113';
3. Alternatively, import the data with a shell script:
#!/bin/bash
# Day to import (value of the day column in t_hive_order)
label="2015-04-01"
hive -e "insert into table tmp_order
select concat(user_id,'_',order_createtime,'_',id),
id,
user_id,
order_amount,
order_status,
order_createtime
from t_hive_order where day='$label'" && echo 'import order data success!'
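
The script above hard-codes the day. A small variation, sketched here under assumptions, takes the day as an argument and checks its shape before the statement is handed to hive. valid_day and build_import_sql are hypothetical helper names, and the YYYY-MM-DD pattern is inferred from the label value used above.

```shell
#!/bin/bash
# valid_day (hypothetical helper): succeed only for strings shaped like YYYY-MM-DD.
# This checks the shape only, not whether the date exists in the calendar.
valid_day() {
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# build_import_sql (hypothetical helper): print the INSERT statement for one day.
build_import_sql() {
  cat <<EOF
insert into table tmp_order
select concat(user_id,'_',order_createtime,'_',id),
id,
user_id,
order_amount,
order_status,
order_createtime
from t_hive_order where day='$1'
EOF
}

label="${1:-2015-04-01}"   # day to import; defaults to the value used above
if valid_day "$label"; then
  # Print the statement so it can be reviewed
  build_import_sql "$label"
  # On a machine with Hive installed, the statement could be run with:
  #   hive -e "$(build_import_sql "$label")" && echo 'import order data success!'
else
  echo "invalid day: $label" >&2
fi
```

Rejecting anything that is not digits and hyphens also keeps stray quotes out of the SQL string that is passed to hive -e.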