Fun with Big Data #18: Hive Experiment 1 (importing data into internal and external tables with LOAD DATA)

I. Start the Hive client


            hive

II. Create the tables


At the hive prompt:

CREATE TABLE IF NOT EXISTS `test_01` (
  `id` INT, `name` STRING, `age` INT, `score` FLOAT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

CREATE EXTERNAL TABLE IF NOT EXISTS `test_02` (
  `id` INT, `name` STRING, `age` INT, `score` FLOAT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
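To verify what kind of table each statement produced, you can inspect the table metadata: in the DESCRIBE FORMATTED output, the Table Type field reads MANAGED_TABLE for test_01 and EXTERNAL_TABLE for test_02.

```sql
-- Run at the hive prompt; the Table Type field distinguishes the two kinds
DESCRIBE FORMATTED test_01;
DESCRIBE FORMATTED test_02;
```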

III. Prepare the data


In an Ubuntu terminal:

vi /home/hadoop/share/mydata/hive/score.txt

with the following content:

1,'zhang',20,120

2,'zhao',19,119

3,'qian',18,118

4,'li',21,121

vi /home/hadoop/share/mydata/hive/score02.txt

with the following content:

5,'wang',20,120

6,'zhou',19,119

7,'wu',18,118

8,'hu',21,121
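Instead of typing the rows in vi, the two files can also be created non-interactively. The sketch below writes them to a temporary directory so it is self-contained; in the experiment itself the target directory is /home/hadoop/share/mydata/hive.

```shell
# Create the two sample files without an editor (temporary dir used here;
# the tutorial's real path is /home/hadoop/share/mydata/hive)
DATA_DIR="$(mktemp -d)"

cat > "$DATA_DIR/score.txt" <<'EOF'
1,'zhang',20,120
2,'zhao',19,119
3,'qian',18,118
4,'li',21,121
EOF

cat > "$DATA_DIR/score02.txt" <<'EOF'
5,'wang',20,120
6,'zhou',19,119
7,'wu',18,118
8,'hu',21,121
EOF

# Show the line counts of both files
wc -l "$DATA_DIR/score.txt" "$DATA_DIR/score02.txt"
```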

IV. Load the data


1. At the hive prompt:

load data local inpath '/home/hadoop/share/mydata/hive/score.txt' overwrite into table test_01;

load data local inpath '/home/hadoop/share/mydata/hive/score.txt' overwrite into table test_02;

select * from test_01;

select * from test_02;

(Screenshot: query results after loading.)

2. In an Ubuntu terminal:

hadoop fs -ls /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db/test_01

hadoop fs -ls /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db/test_02

hadoop fs -cat /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db/test_01/score.txt

hadoop fs -cat /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db/test_02/score.txt

(Screenshot: the data files on HDFS.)

V. Drop the tables


1. At the hive prompt:

drop table test_01;

drop table test_02;

2. In an Ubuntu terminal:

hadoop fs -ls /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db

The test_01 directory has disappeared, while the test_02 directory and its data file are still there.

VI. Recreate the tables


At the hive prompt:

CREATE TABLE IF NOT EXISTS `test_01` (
  `id` INT, `name` STRING, `age` INT, `score` FLOAT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

CREATE EXTERNAL TABLE IF NOT EXISTS `test_02` (
  `id` INT, `name` STRING, `age` INT, `score` FLOAT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

select * from test_01;

select * from test_02;

(Screenshot: after recreating the tables.)

VII. Reload the data


1. At the hive prompt:

load data local inpath '/home/hadoop/share/mydata/hive/score02.txt' overwrite into table test_01;

load data local inpath '/home/hadoop/share/mydata/hive/score02.txt' overwrite into table test_02;

select * from test_01;

select * from test_02;

(Screenshot: after reloading the data.)

2. In an Ubuntu terminal:

hadoop fs -ls /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db

hadoop fs -ls /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db/test_01

hadoop fs -ls /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db/test_02

hadoop fs -cat /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db/test_02/*

(Screenshot: HDFS files after reloading.)

VIII. Load more data

1. At the hive prompt:

Note that OVERWRITE is not used here.

load data local inpath '/home/hadoop/share/mydata/hive/score02.txt' into table test_02;


2. In an Ubuntu terminal:

hadoop fs -cat /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db/test_02/*

hadoop fs -ls /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db/test_02

3. At the hive prompt:

Note that OVERWRITE is used this time.

load data local inpath '/home/hadoop/share/mydata/hive/score02.txt' overwrite into table test_02;

select * from test_02;

4. In an Ubuntu terminal:

hadoop fs -ls /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db/test_02

hadoop fs -cat /mylab/soft/apache-hive-3.1.2-bin/working/metastore.warehouse/testdb.db/test_02/*

IX. Conclusions


When no table type is specified, Hive creates a managed (internal) table by default; creating an external table requires the EXTERNAL keyword.

When an external table is dropped, only its metadata is deleted and the data files are kept. When an internal table is dropped, both the metadata and the data are deleted.
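As a side note (not part of the original experiment), the managed/external distinction is recorded as a table property, so a table can be switched between the two behaviours:

```sql
-- Hypothetical follow-up: turn the external table test_02 into a managed table.
-- After this change, DROP TABLE test_02 would delete its HDFS data as well.
ALTER TABLE test_02 SET TBLPROPERTIES ('EXTERNAL'='FALSE');
```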

With LOAD DATA, for internal and external tables alike, a source file that already lives on HDFS is moved rather than copied: it is relocated from its original HDFS path into the table's directory under the Hive warehouse. (With LOCAL INPATH, as used in this experiment, the file is copied from the local filesystem instead.)

With LOAD DATA ... OVERWRITE, the files already in the table directory are cleared and replaced by the file being loaded. Without OVERWRITE, the new file is added alongside the existing ones; if its name clashes with an existing file, Hive automatically renames the new file to keep the name unique (appending a suffix such as _copy_1).


X. References


    https://blog.csdn.net/henrrywan/article/details/90612741
