Hive之DML操作


Hive系列第六章


第六章 DML操作

基本上都是针对表的操作了。

DDL: Data Definition Language     数据定义语言
DML: Data Manipulation Language   数据操纵语言
DQL :Data Query Language 数据查询语言

6.1 数据导入

6.1.1 装载数据

6.1.1.1 装载数据理论

语法结构:

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]

解释:

1、LOAD DATA:加载数据
2、LOCAL:从本地加载数据到 hive 表;不加local表示从 HDFS 移动(剪切)数据到 hive 表。注意:在分区表中,当使用的是hive3.x版本,那么如果不指定具体的分区,也会从hdfs上面复制一份到默认的分区__HIVE_DEFAULT_PARTITION__中间。这是3.x版本新特性。
3、INPATH: 加载数据的路径
4、'filepath' : 具体的待上传的数据的路径
5、OVERWRITE:  覆盖表中已有数据,否则表示追加
6、INTO TABLE: 加载到哪张表
7、tablename: 具体的表
8、PARTITION: 上传到指定分区

6.1.1.2 装载数据示例

// 查看当前正在使用的库:
select current_database();

// 创建表
create table student(id int, name string, sex string, age int, department string) row format delimited fields terminated by ",";

// 查看表
show tables;

// 往表中加载数据:
load data local inpath "/home/data/student.txt" into table student;

// 查询数据
select id,name,sex,age,department from student;

6.1.2 插入数据

1、插入一条数据:

INSERT INTO TABLE tablename VALUES(a,b,c)

示例:

insert into table student (id, name, sex, age, department) values (101,"张三2","M",222,"IT");

2、利用查询语句将结果导入新表:

INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1 FROM from_statement;

示例:

student2表需要先创建
create table student2(id int, name string, sex string, age int, department string) row format delimited fields terminated by ",";

hive (mydb)> insert into table student2 select id,sex,name,age,department from student where department = 'CS';
hive (mydb)> select * from student2;
OK
student2.id	student2.name	student2.sex	student2.age	student2.department
95013	男	李四	21	CS
95014	女	李五	19	CS
95012	女	李十	20	CS
95010	男	王一	19	CS
95006	男	王三	23	CS
95008	女	王五	18	CS
95001	男	王九	20	CS
Time taken: 0.097 seconds, Fetched: 7 row(s)
insert into 插入多次会多次在后面追加数据。

INSERT OVERWRITE 插入数据会覆盖原来的表里面的所有的数据,不仅仅是重复的那几行,是所有都覆盖。

3、多重插入

FROM from_statement  
INSERT OVERWRITE TABLE table_name1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1  
INSERT OVERWRITE TABLE table_name2 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement2] ... 

示例(上面2和3一起):

创建分区表:
create table student_ptn (id int, name string, sex string, age int) partitioned by (department string) row format delimited fields terminated by ",";

单重插入:
insert into table student_ptn partition (department = 'IS') select id,sex,name,age from student where department  = 'IS';
insert into table student_ptn partition (department = 'CS') select id,sex,name,age from student where department  = 'CS';
insert into table student_ptn partition (department = 'MA') select id,sex,name,age from student where department  = 'MA';


多重插入:
from student 
insert into table student_ptn partition (department = 'IS') select id,sex,name,age where department = 'IS' 
insert into table student_ptn partition (department = 'CS') select id,sex,name,age where department = 'CS' 
insert into table student_ptn partition (department = 'MA') select id,sex,name,age where department = 'MA' 

6.2 数据导出

6.2.1 INSERT OVERWRITE导出

语法结构:

INSERT OVERWRITE [LOCAL] DIRECTORY directory1 select_statement 

示例:

insert overwrite local directory "/home/data/cs_student" select * from student where department = 'CS';

insert overwrite directory "/home/data/cs_student5" select * from student where department = 'CS';

6.2.2 Hive Shell命令导出

[root@hadoop10 ~]# hive -e 'select * from mydb.student;' > /home/data/student2.txt;
which: no hbase in (.:.:.:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/software/jdk/bin:/software/zk/bin:/software/hadoop/bin:/software/hadoop/sbin:/software/hive/bin:/root/bin)
Hive Session ID = 36ee5fc2-cdfc-4057-86ee-f4dad8fbddbc

Logging initialized using configuration in jar:file:/software/hive/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Hive Session ID = fd035b16-f727-490f-beed-c4fb58122c0f
OK
Time taken: 2.273 seconds, Fetched: 23 row(s)
[root@hadoop10 ~]# cd /home/data/
[root@hadoop10 data]# cat student2.txt 

6.2.3 HDFS命令导出

hive (mydb)> dfs -get /user/hive/warehouse/mydb.db/student/student.txt /home/data/student3.txt;
hive (mydb)> 


声明:
        文章中代码及相关语句为自己根据相应理解编写,文章中出现的相关图片为自己实践中的截图和相关技术对应的图片,若有相关异议,请联系删除。感谢。转载请注明出处,感谢。


By luoyepiaoxue2014

微博地址: http://weibo.com/luoyepiaoxue2014 点击打开链接

你可能感兴趣的:(Hive,Hive,大数据)