1、在原来表emp的路径下再创建一个emp1的表:
create table if not exists emp1(
empno int,
enname string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
row format delimited fields terminated by '\t'
LOCATION '/user/hive/warehouse/db_emp.db/emp';
2、查询emp1:
hive (db_emp)> select * from emp1;
OK
emp1.empno emp1.enname emp1.job emp1.mgr emp1.hiredate emp1.sal emp1.comm emp1.deptno
7499 ALLEN SALESMAN 7698 1981-02-20 1600.0 300.0 30
7521 WARD SALESMAN 7698 1981-02-22 1250.0 500.0 30
7566 JONES MANAGER 7839 1981-04-02 2975.0 0.0 20
7654 MARTIN SALESMAN 7698 1981-09-28 1250.0 1400.0 30
7698 BLAKE MANAGER 7839 1981-05-01 2850.0 0.0 30
7782 CLARK MANAGER 7839 1981-06-09 2450.0 0.0 10
7788 SCOTT ANALYST 7566 1987-07-13 3000.0 0.0 20
7839 KING PRESIDENT NULL 1981-11-07 5000.0 0.0 10
7844 TURNER SALESMAN 7698 1981-09-08 1500.0 0.0 30
7876 ADAMS CLERK 7788 1987-07-13 1100.0 0.0 20
7900 JAMES CLERK 7698 1981-12-03 950.0 0.0 30
7902 FORD ANALYST 7566 1981-12-03 3000.0 0.0 20
7934 MILLER CLERK 7782 1982-01-23 1300.0 0.0 10
Time taken: 0.503 seconds, Fetched: 13 row(s)
emp1和emp共用Hdfs中的数据:/user/hive/warehouse/db_emp.db/emp
3、删除表emp1:
hive (db_emp)> drop table emp1;
OK
Time taken: 0.592 seconds
hive (db_emp)> show tables;
OK
tab_name
dept
emp
Time taken: 0.083 seconds, Fetched: 2 row(s)
4、查询表emp,可以发现emp中数据已经被清除,只剩下表结构了。
hive (db_emp)> select * from emp;
OK
emp.empno emp.enname emp.job emp.mgr emp.hiredate emp.sal emp.comm emp.deptno
Time taken: 0.121 seconds
5、为了规避这样的问题,创建外部表:
create EXTERNAL table if not exists dept1(
deptno int,
dnname string,
loc string
)
row format delimited fields terminated by '\t'
LOCATION '/user/hive/warehouse/db_emp.db/dept';
6、查询dept1表中数据:
hive (db_emp)> select * from dept1;
OK
dept1.deptno dept1.dnname dept1.loc
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
Time taken: 0.166 seconds, Fetched: 4 row(s)
7、查看表dept1 的信息,表类型为:EXTERNAL_TABLE
hive (db_emp)> desc formatted dept1;
OK
col_name data_type comment
# col_name data_type comment
deptno int
dnname string
loc string
# Detailed Table Information
Database: db_emp
Owner: root
CreateTime: Wed Jul 25 04:13:36 CST 2018
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://ns1/user/hive/warehouse/db_emp.db/dept
Table Type: EXTERNAL_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE false
EXTERNAL TRUE
numFiles 0
numRows -1
rawDataSize -1
totalSize 0
transient_lastDdlTime 1532463216
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim \t
serialization.format \t
Time taken: 0.087 seconds, Fetched: 35 row(s)
8、删除表dept1,并查询表dept的数据,可以发现dept的数据并没有再次被删除了。
hive (db_emp)> drop table dept1;
OK
Time taken: 0.086 seconds
hive (db_emp)> show tables;
OK
tab_name
dept
emp
Time taken: 0.02 seconds, Fetched: 2 row(s)
hive (db_emp)> select * from dept;
OK
dept.deptno dept.dnname dept.loc
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
Time taken: 0.098 seconds, Fetched: 4 row(s)
总结:外部表可以解决一份数据多人共用数据的场景,保证了数据的安全性。