Partitioned tables in Hive

In Hive, create the simplest possible table, then load data into it from a file. The table uses the partitioned by clause.

1. Single column, no delimiter specified:


# Create the table so_popu_281
hadoop fs -mkdir '/home/helh/solog/so_popu_281'
hive -e "
drop table if exists so_popu_281;
create external table so_popu_281(
str string
)
partitioned by (pdate string)
location '/data/log/so/re/popu_281/'
"

# Yesterday's date, used as the partition value
pre_day=$(date --date '-1 days' +%Y-%m-%d)
hadoop fs -cat "/data/log/re/pdate=$pre_day/re.$pre_day.117.122.217.6" | grep 'popu_281' > ./popu_281.txt
# Load the local file into yesterday's partition
hive -e "load data local inpath './popu_281.txt' into table so_popu_281 partition (pdate='$pre_day')"
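
By default Hive keeps each partition in its own subdirectory of the table location, named column=value. The sketch below only builds that expected path as a string, so it can be handed to hadoop fs -ls to check where the loaded file ended up. The helper name partition_dir is hypothetical and the date is illustrative; it assumes the default layout, with no custom partition location set.

```shell
# Hypothetical helper: build the directory Hive uses by default for one
# partition, i.e. <table location>/<partition column>=<value>.
partition_dir() {
  # ${1%/} strips one trailing slash from the table location, if present
  printf '%s/%s=%s\n' "${1%/}" "$2" "$3"
}

# Example with the table location used above (illustrative date).
partition_dir '/data/log/so/re/popu_281/' pdate 2016-01-01

# With a Hadoop client available, the result could be inspected like this:
# hadoop fs -ls "$(partition_dir '/data/log/so/re/popu_281/' pdate "$pre_day")"
```

Running show partitions so_popu_281 in Hive lists the same pdate values from the metastore side.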

2. Specifying delimiters

Table definition with a partition column:

create table people(
id STRING,
name STRING,
likes ARRAY<STRING>,
addr MAP<STRING,STRING>
)
partitioned by (dt string)
ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '\t'
 COLLECTION ITEMS TERMINATED BY ','
 MAP KEYS TERMINATED BY ':'
STORED AS TEXTFILE;
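
To make the three delimiters concrete, here is a sketch of one input row that fits this definition, assuming the array and map hold strings. The sample values and the file name people_sample.txt are made up for illustration; the script only splits the row in the shell the way the row format describes and does not call Hive.

```shell
# Write one sample row: columns separated by TAB, array items by ',',
# map entries by ',' and each key from its value by ':'.
sample_file=./people_sample.txt
printf '1\tTom\treading,music\tcity:Beijing,zip:100000\n' > "$sample_file"

# cut splits on TAB by default, matching FIELDS TERMINATED BY '\t'.
likes=$(cut -f3 "$sample_file")   # array column: reading,music
addr=$(cut -f4 "$sample_file")    # map column: city:Beijing,zip:100000

first_like=${likes%%,*}           # first array item
first_pair=${addr%%,*}            # first map entry, still "key:value"

echo "likes[0]=$first_like"
echo "addr key=${first_pair%%:*} value=${first_pair#*:}"
```

The partition column dt does not appear in the file at all; its value comes from the partition clause of the load statement.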

The load command differs as well: the data is partitioned by the dt column, so the target partition is named in the statement.

load data local inpath '/tmp/test.txt' into table people PARTITION (dt='2016-1-1');
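
Because dt is declared as a string, Hive treats '2016-1-1' and '2016-01-01' as two different partitions, while the first example generates zero-padded dates with +%Y-%m-%d. If both spellings might occur, one option is to normalize the value before building the load statement. This is a minimal sketch; the helper name normalize_dt is hypothetical and it assumes input of the form year-month-day.

```shell
# Hypothetical helper: zero-pad a year-month-day date so every partition
# value uses the same spelling.
normalize_dt() {
  old_ifs=$IFS
  IFS=-
  set -- $1            # split "2016-1-1" into year, month, day
  IFS=$old_ifs
  # ${2#0} drops a leading zero so printf does not read "08" as octal
  printf '%04d-%02d-%02d\n' "$1" "${2#0}" "${3#0}"
}

dt=$(normalize_dt '2016-1-1')
echo "$dt"

# The normalized value could then be used in the load statement:
# hive -e "load data local inpath '/tmp/test.txt' into table people partition (dt='$dt')"
```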

 

For more details, see the official Hive documentation.

