当使用静态分区时,在向分区表中插入数据时,我们需要指定具体分区列的值。此外,hive还支持动态提供分区值(即在插入数据时,不指定具体的分区列值,而是仅仅指定分区字段)。动态分区在默认情况下是禁用的(在hive2.3.4版本中默认是开启的,在hive-default.xml.template文件中进行配置),所以需要将hive.exec.dynamic.partition设为true。默认情况下,用户必须至少指定一个静态分区列,这是为了避免意外覆盖分区。要禁用此限制,可以设置分区模式为非严格模式(即将hive.exec.dynamic.partition.mode设为nonstrict,默认值为strict)。可以选择在命令行终端方式设置:
>SET hive.exec.dynamic.partition=true;
>SET hive.exec.dynamic.partition.mode=nonstrict;
或者在hive-site.xml中设置:
hive.exec.dynamic.partition
true
Whether or not to allow dynamic partitions in DML/DDL.
hive.exec.dynamic.partition.mode
nonstrict
In strict mode, the user must specify at least one static partition
in case the user accidentally overwrites all partitions.
In nonstrict mode all partitions are allowed to be dynamic.
001,tom,23,2019-03-16
002,jack,12,2019-03-13
003,robin,14,2018-08-13
004,justin,34,2018-10-12
005,jarry,24,2017-11-11
006,jasper,24,2017-12-12
(1)创建普通表,用于load数据
create table dynamic_tbl_temp(id int,name string,age int,start_date date)
row format delimited
fields terminated by ','
(2)创建分区表
create table dynamic_tbl(id int,name string,age int,start_date date)
partitioned by (year string,month string)
row format delimited
fields terminated by ','
(1)向普通表加载数据
load data local inpath '/opt/modules/data/dynamic.txt' into table dynamic_tbl_temp;
(2)向分区表中插入数据
insert into dynamic_tbl
partition(year,month)
select
id,
name,
age,
start_date,
year(start_date),
month(start_date)
from
dynamic_tbl_temp;
(1)查询year=2018的数据
select * from dynamic_tbl where year = 2018;
--结果
4 justin 34 2018-10-12 2018 10
3 robin 14 2018-08-13 2018 8
(2)查询year=2017,month=11的数据
select * from dynamic_tbl where year = 2017 and month = 11;
--结果:
5 jarry 24 2017-11-11 2017 11