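Hive provides SQL-like statements for querying data stored on HDFS. These statements are translated into MapReduce programs, so simple MR jobs can be run without writing them by hand.
After running ./hive to enter the command line, you can use commands that look much like SQL statements.
I. Create a database:
1. Statement to create a database: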
hive> create database student;
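After the database is created, a typical next step is to confirm that it exists and switch to it. The two statements below are a minimal sketch of that; they only assume the student database created above.

```sql
-- List the databases known to the metastore; student should appear in the output.
show databases;
-- Switch to the new database so that tables created later are placed in it.
use student;
```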
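II. Create tables (Hive has several kinds of tables; each is explained in turn):
Example 1: Basic table creation format
Suppose there is a student table students.txt in the following form (stored locally or on HDFS):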
95001,李勇,男,20,CS
95002,刘晨,女,19,IS
95003,王敏,女,22,MA
95004,张立,男,19,IS
95005,刘刚,男,18,MA
95006,孙庆,男,23,CS
95007,易思玲,女,19,MA
95008,李娜,女,18,CS
95009,梦圆圆,女,18,MA
95010,孔小涛,男,19,CS
95011,包小柏,男,18,MA
95012,孙花,女,20,CS
95013,冯伟,男,21,CS
95014,王小丽,女,19,CS
95015,王君,男,18,MA
95016,钱国,男,21,MA
95017,王风娟,女,18,IS
95018,王一,女,19,IS
95019,邢小丽,女,19,IS
95020,赵钱,男,21,IS
95021,周二,男,17,MA
95022,郑明,男,20,MA
create table stu(
id int,
name string,
gender string,
age int,
master string
)
row format delimited
fields terminated by ','
stored as textfile;
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)];
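As a concrete instance of the syntax above, the following sketch loads the sample file into stu and checks the result. The path 'students.txt' is an assumption (a file in the directory the CLI was started from); adjust it to wherever the file actually lives.

```sql
-- LOCAL reads the file from the local filesystem; without LOCAL the path is
-- treated as an HDFS path and the file is moved into the table's directory.
-- OVERWRITE replaces any data already in the table.
load data local inpath 'students.txt' overwrite into table stu;
-- Quick check that the comma-separated fields were parsed into the five columns.
select * from stu limit 5;
```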
Example 2: External table
create external table stu_external(
id int,
name string,
gender string,
age int,
master string
)
row format delimited
fields terminated by ','
stored as textfile
location '/user/hive_external_table';
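When an internal (managed) table is dropped, both the actual data stored on HDFS and the metadata in the metastore (MySQL in this setup) are deleted.
When an external table is dropped, only the metadata in MySQL is deleted; the data files on HDFS under the table's location are left in place.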
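The difference can be observed directly from the Hive CLI. The sketch below assumes the stu and stu_external tables defined above already exist; note that the drop statement really does remove the table definition, so run it only on a test table.

```sql
-- The "Table Type" field in the output shows MANAGED_TABLE for stu
-- and EXTERNAL_TABLE for stu_external.
describe formatted stu;
describe formatted stu_external;
-- Dropping the external table removes only its metadata.
drop table stu_external;
-- The files under the external location should still be listed on HDFS.
dfs -ls /user/hive_external_table;
```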
######################################################################
Example 3: Partitioned table
There are two files, the student lists for class 1 and class 2:
students1.txt
1,jimmy,20
2,tim,22
3,jerry,19
students2.txt
1,tom,23
2,angela,19
3,cat,20
create table stu_partition(
id int,
name string,
age int
)
partitioned by(classId int)
row format delimited
fields terminated by ','
stored as textfile;
load data local inpath 'students1.txt' into table stu_partition partition(classId=1);
load data local inpath 'students2.txt' into table stu_partition partition(classId=2);
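Once both files are loaded, the partitions can be inspected and queried. This is a minimal sketch that uses only the stu_partition table and the classId values from the two load statements above.

```sql
-- Each load created one partition, stored as its own subdirectory of the table.
show partitions stu_partition;
-- The partition column can be filtered like an ordinary column;
-- only the matching partition directory is read.
select id, name, age from stu_partition where classId = 1;
```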
####################################################################
Example 4: Bucketed table
Given the following data:
1,jimmy
2,henry
3,tom
4,jerry
5,angela
6,lucy
7,lili
8,lilei
9,hanmeimei
10,timmy
11,jenef
12,alice
13,anna
14,donna
15,ella
16,fiona
17,grace
18,hebe
19,jean
20,joy
21,kelly
22,lydia
23,mary
1. Create the bucketed table
-- Bucketing enforcement must be enabled first
set hive.enforce.bucketing = true;
create table stu_list(
id int,
name string
)
row format delimited
fields terminated by ','
stored as textfile;
load data local inpath '.../list.txt' into table stu_list;
create table stu_buckets(
id int,
name string
)
clustered by(id) sorted by(id) into 3 buckets
row format delimited
fields terminated by ','
stored as textfile;
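A likely next step, sketched here under the assumption that stu_list has been loaded as shown above, is to populate stu_buckets from stu_list with an insert ... select (a plain file load does not by itself split rows into buckets) and then read back one bucket.

```sql
-- Rows are distributed by a hash of id into the 3 buckets declared on stu_buckets.
insert overwrite table stu_buckets
select id, name from stu_list cluster by id;
-- Read back a single bucket (bucket 1 of 3) to check how the rows were distributed.
select * from stu_buckets tablesample(bucket 1 out of 3 on id);
```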