Hive Extended Features (9): Row-Level Update Operations in Hive (Update)

Software environment:

Linux: CentOS 6.7
Hadoop version: 2.6.5
ZooKeeper version: 3.4.8


Host configuration:

There are three machines, m1, m2 and m3; the user name on each host is centos.
192.168.179.201: m1 
192.168.179.202: m2 
192.168.179.203: m3 

m1: Zookeeper, Namenode, DataNode, ResourceManager, NodeManager, Master, Worker
m2: Zookeeper, Namenode, DataNode, ResourceManager, NodeManager, Worker
m3: Zookeeper, DataNode, NodeManager, Worker

References:

Official documentation:
Update  <=>      https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
Join    <=>      https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins

Online reference:
Update  <=>      http://www.aboutyun.com/thread-12155-1-1.html


I. Configuring Hive for Update

1. Edit the hive-site.xml file:

<property>
    <name>hive.optimize.sort.dynamic.partition</name>
    <value>false</value>
</property>
<property>
    <name>hive.support.concurrency</name>
    <value>true</value>
</property>
<property>
    <name>hive.enforce.bucketing</name>
    <value>true</value>
</property>
<property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>nonstrict</value>
</property>
<property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
</property>
<property>
    <name>hive.compactor.worker.threads</name>
    <value>1</value>
</property>
<property>
    <name>hive.in.test</name>
    <value>true</value>
</property>
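
For a quick experiment before editing hive-site.xml, the client-side properties from the list above can also be set for a single session. This is only a sketch, assuming your Hive version allows these properties to be changed at runtime; the two hive.compactor.* properties are read by the metastore service, so they still belong in hive-site.xml.

```sql
-- Session-level equivalents of the client-side properties above.
-- Assumption: these properties are not locked down on your cluster.
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Print the current value to confirm the setting took effect
SET hive.txn.manager;
```

These settings are lost when the session ends, so hive-site.xml remains the place for a permanent configuration.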


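II. Update Syntax

1. Table creation statement

Hive places specific requirements on any table that will be updated:
(1) The table must be created with buckets (a CLUSTERED BY ... INTO n BUCKETS clause).
(2) The table must use a supported storage format. Other formats, such as Parquet, are not supported yet; currently only ORCFileformat and AcidOutputFormat are supported.
(3) The table must be created with the table property 'transactional'='true'.
Example: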

create table student (id bigint,name string) clustered by (name) into 2 buckets stored as orc TBLPROPERTIES('transactional'='true');
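
To check that the table really meets the three requirements, its metadata can be inspected after creation. A minimal sketch, using the student table from the example above:

```sql
-- Lists the table properties; 'transactional' should be reported as 'true'
SHOW TBLPROPERTIES student;

-- Shows the storage format (ORC), the number of buckets and the bucket column
DESCRIBE FORMATTED student;
```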

2. Update statement:

update student set id=444 where name='tom';
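
The following sketch puts the update into context: it loads two rows, updates one and reads the result back. The row values are made up for illustration, and it assumes the student table and the configuration described earlier are in place.

```sql
-- Illustrative sample rows (not from the original article)
INSERT INTO TABLE student VALUES (1, 'tom'), (2, 'jerry');

-- Row-level update: only rows matching the WHERE clause are changed
UPDATE student SET id = 444 WHERE name = 'tom';

-- tom's id should now be 444; jerry's row is unchanged
SELECT id, name FROM student;

-- Row-level delete is available on the same kind of table
DELETE FROM student WHERE name = 'jerry';
```

Note that Hive does not allow bucketing columns to be updated. Since student is clustered by name, only id can appear in the SET clause here.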
