hive分区以及动态分区

hive静态分区

(hive分区主要是为了提高检索效率,内部表和外部表都可以创建分区)
hive分区以及动态分区_第1张图片

hive静态分区操作

静态分区操作数据

1,小明1,lol-book-move,beijing:shangxuetang-shanghai:pudong
2,小明2,lol-book-move,beijing:shangxuetang-shanghai:pudong
3,小明3,lol-book-move,beijing:shangxuetang-shanghai:pudong
4,小明4,lol-book,beijing:shangxuetang-shanghai:pudong
5,小明5,lol-book-move,beijing:shangxuetang-shanghai:pudong
6,小明6,lol-book-move,beijing:shangxuetang-shanghai:pudong
7,小明7,lol-book-move,beijing:shangxuetang-shanghai:pudong

Hive 在创建表时添加分区partition
在表定义时指定对应的partition字段
a、不分区建表

create table psn(
id int,
name string,
likes array,
address map
)
ROW FORMAT delimited 
fields terminated by ','            //hdfs存放的数据字段之间的分隔符(记得是存储在hdfs里的,不是select* from 查询显示的分割)
COLLECTION ITEMS TERMINATED BY '-'  //hdfs存放的数据集合之间的分割符
map keys terminated by ':' ;        //hdfs存放的数据map之间的分割符
location '/root/data/'              //hdfs集群存放的数据

b、单分区建表语句:

--单分区建表
create  table psn5(
id int,
name string,
likes array,
addres map
)
partitioned by (age int)
row format delimited
fields terminated by ','
collection items terminated by '-'
map keys terminated by ':'

--添加数据
load data local inpath '/root/data/data' into table psn5 partition (age =10);

c、 双分区建表语句:

--多分区建表
create  table psn6(
id int,
name string,
likes array,
addres map
)
partitioned by (age int,sex string)
row format delimited
fields terminated by ','
collection items terminated by '-'
map keys terminated by ':'

--添加数据
load data local inpath '/root/data/data' into table psn6 partition (age=10,sex='body');
--增加分区
alter table psn6 add partition (age =30,sex='girl');
--多分区查询
select * from psn6 where  age=30 and  sex='girl';

(表已创建,在此基础上添加分区):

ALTER TABLE table_name ADD [IF NOT EXISTS] PARTITION partition_spec  [LOCATION 'location1'] partition_spec [LOCATION 'location2'] ...;
partition_spec:
: (partition_column = partition_col_value, partition_column = partition_col_value, ...)
例:
ALTER TABLE day_table ADD PARTITION (dt='2008-08-08', hour='08')

Hive删除分区语法:

ALTER TABLE table_name DROP partition_spec, partition_spec,...
partition_spec:
: (partition_column = partition_col_value, partition_column = partition_col_value, ...)
用户可以用 ALTER TABLE DROP PARTITION 来删除分区。
内部表中、对应分区的元数据和数据将被一并删除。
例:
ALTER TABLE day_hour_table DROP PARTITION (dt='2008-08-08', hour='09');

hive动态分区操作

动态分区操作数据

1,小明1,12,man,book-move,beijing:shangxuetang-shanghai:pudong
2,小明2,13,man,book-move,beijing:shangxuetang-shanghai:pudong
3,小明3,12,boy,book-move,beijing:shangxuetang-shanghai:pudong
4,小明4,12,man,book,beijing:shangxuetang-shanghai:pudong
5,小明5,13,boy,book-move,beijing:shangxuetang-shanghai:pudong
6,小明6,12,boy,book-move,beijing:shangxuetang-shanghai:pudong
7,小明7,13,man,book-move,beijing:shangxuetang-shanghai:pudong

普通建表

create table psn7(
id int,
name string ,
age int,
sex string,
likes array,
addres map
)
row format delimited 
fields terminated by ','
collection items terminated by '-'
map keys terminated by ':'

建分区表

create table psn_part(
id int,
name string ,
likes array,
addres map
)
partitioned by (age int,sex string)
row format delimited 
fields terminated by ','
collection items terminated by '-'
map keys terminated by ':'

插入数据

使用模式
from 
insert
select 

动态分区语句
from psn7
insert overwrite  table psn_part partition (age,sex)
select id,name,likes,addres,age,sex

在插入数据的时候,会报错如下:

FAILED: SemanticException [Error 10096]: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec
.dynamic.partition.mode=nonstrict

解决方案:

set hive.exec.dynamic.partition.mode=nonstrict;

参考文档链接 https://hugh-wangp.iteye.com/blog/1612268
https://www.cnblogs.com/luogankun/p/4111145.html

你可能感兴趣的:(hive)