Hive: a small hands-on walkthrough, from creating a database to loading data

DDL

  1. Databases (Create/Drop/Alter/Use Database)
  • Create a database

Note: if no location is specified, the database is created under /user/hive/warehouse on HDFS by default.

hive> create database if not exists ruozedata_test comment 'hive test database' location '/user/hadoop';
OK
Time taken: 0.325 seconds
  • Describe a database
hive> desc database ruozedata_test;
OK
ruozedata_test	hive test database	hdfs://192.168.52.130:9000/user/hive/warehouse/ruozedata_test.db	hadoop	USER	
Time taken: 0.126 seconds, Fetched: 1 row(s)
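
The heading above also covers Use and Drop, which this log does not exercise; a minimal sketch, reusing the same database name:

```sql
-- Make ruozedata_test the current database for the session
use ruozedata_test;
-- Drop it again; CASCADE also drops any tables it still contains, so use with care
drop database if exists ruozedata_test cascade;
```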
  • Alter a database
    Note: changing a database's storage location is only supported in Hive 2.2.1, 2.4.0 and later.
hive> alter database ruozedata_test set owner  user  root;
OK
Time taken: 0.198 seconds
hive> desc database ruozedata_test;
OK
ruozedata_test	hive test database	hdfs://192.168.52.130:9000/user/hive/warehouse/ruozedata_test.db	root	USER	
Time taken: 0.061 seconds, Fetched: 1 row(s)
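
On Hive 2.2.1 / 2.4.0 and later (per the version note above), the storage path itself can also be changed; a hedged sketch with a hypothetical target directory:

```sql
-- Only affects tables created afterwards; existing table locations are unchanged
alter database ruozedata_test set location 'hdfs://192.168.52.130:9000/user/hive/ruozedata_test_new';
```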

  2. Tables (Create/Drop/Truncate Table)
  • Create
  • Managed table (the default, also called an internal table)
    Method 1:
hive> create table  MANAGED_TABLE_emp1(
    > empno int,
    > ename string,
    > job string,
    > mgr int,
    > hiredate string,
    > sal int,
    > comm int,
    > deptno int
    > )comment 'This is a emp test table'
    > partitioned by(dt string)
    > row format delimited fields terminated by '\t';
OK
Time taken: 0.203 seconds

Method 2: copy the structure of an existing table:

hive> create table managed_table_emp2 like managed_table_emp1;
OK
Time taken: 0.494 seconds
hive> show tables;
OK

Method 3: copy both the structure and the data of an existing table:

hive> create table managed_table_emp3 as select * from  managed_table_emp1;
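
The two copies differ in more than just data: like reproduces the full definition (including any partition spec) but no rows, while create table ... as select materializes the query output through a job, and in most Hive versions the CTAS target cannot itself be partitioned or external. A quick check:

```sql
-- emp2: same columns and partition spec as emp1, zero rows
show create table managed_table_emp2;
-- emp3: holds whatever rows the select returned from emp1
select count(*) from managed_table_emp3;
```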

  • External table
    Note: use the external keyword and specify a location.
hive> create  external table  external_TABLE_emp1(
    > empno int,
    > ename string,
    > job string,
    > mgr int,
    > hiredate string,
    > sal int,
    > comm int,
    > deptno int
    > )comment 'This is a emp test external table'
    > row format delimited fields terminated by '\t'
    > location '/user/hadoop/external';
OK
Time taken: 0.415 seconds
  3. Partitions
  • Static partitions
hive> create  external table  user(
    > id int,
    > name string,
    > address string,
    > birthday string
    > )partitioned by (dt string)
    > row format delimited fields terminated by '\t';
OK
Time taken: 0.123 seconds
hive> show create table user;
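
The table above is partitioned by dt, but the log stops before loading anything into it; a hedged sketch of working with one static partition (the local file path is hypothetical):

```sql
-- Each static load names the target partition value explicitly
load data local inpath '/home/hadoop/data/user.txt'
into table user partition (dt='2018-10-30');
-- An empty partition can also be registered without loading any data
alter table user add if not exists partition (dt='2018-10-31');
```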

  • Dynamic partitions:
  • Set hive.exec.dynamic.partition=true; when every partition column is dynamic, as in the insert below, hive.exec.dynamic.partition.mode=nonstrict is also required.
hive> insert overwrite table ruoze_emp_partition PARTITION(deptno)
    > select empno,ename,job,mgr,hiredate,sal,comm,deptno from ruoze_emp;
Query ID = hadoop_20181029235151_8ea2d815-d0a8-4dd8-b4f7-f3b9e6a53b47
Total jobs = 3
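
Spelled out as a full session, the dynamic insert above typically needs both settings first; a minimal sketch:

```sql
set hive.exec.dynamic.partition=true;
-- 'nonstrict' lets every partition column be dynamic; the default 'strict'
-- mode insists on at least one static partition value in the insert
set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table ruoze_emp_partition partition (deptno)
select empno, ename, job, mgr, hiredate, sal, comm, deptno from ruoze_emp;
```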
  • Resulting directory layout in HDFS
[hadoop@hadoop001 data]$ hadoop fs  -ls /user/hive/warehouse/ruoze_emp_partition
18/10/30 03:33:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 4 items
drwxr-xr-x   - hadoop supergroup          0 2018-10-30 03:28 /user/hive/warehouse/ruoze_emp_partition/deptno=10
drwxr-xr-x   - hadoop supergroup          0 2018-10-30 03:28 /user/hive/warehouse/ruoze_emp_partition/deptno=20
drwxr-xr-x   - hadoop supergroup          0 2018-10-30 03:28 /user/hive/warehouse/ruoze_emp_partition/deptno=30
drwxr-xr-x   - hadoop supergroup          0 2018-10-30 03:28 /user/hive/warehouse/ruoze_emp_partition/deptno=__HIVE_DEFAULT_PARTITION__
[hadoop@hadoop001 data]$ hadoop fs  -ls /user/hive/warehouse/ruoze_emp_partition/deptno=__HIVE_DEFAULT_PARTITION__
  4. Loading data
  • From the local filesystem:
load data [local] inpath 'file-path'  [overwrite] into table emp [partition (deptno=10)]
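Applied to the partitioned managed table created earlier, a concrete load might look like this (the local file path is hypothetical):

```sql
-- 'overwrite' replaces any existing data in the dt='2018-10-30' partition
load data local inpath '/home/hadoop/data/emp.txt'
overwrite into table MANAGED_TABLE_emp1 partition (dt='2018-10-30');
```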
  • From HDFS:
load data inpath 'file-path' [overwrite] into table emp [partition (deptno=10)]
Note: without the local keyword the path is resolved on HDFS, and the source files are moved (not copied) into the table's directory.
  • From MySQL:
Importing from MySQL is normally done with Sqoop rather than load data; a typical import looks like this (connection details are placeholders):
sqoop import --connect jdbc:mysql://mysql-host:3306/dbname --username user --password '***' --table emp --hive-import --hive-table ruozedata_test.emp -m 1
  5. Dropping data
  • Managed table:
truncate table managed_table_emp1;   -- removes the data, keeps the table definition
drop table managed_table_emp1;       -- removes both the data and the metadata
  • External table:
drop table external_TABLE_emp1;   -- removes only the metadata; the files under its location stay in HDFS
hadoop fs -rm -r /user/hadoop/external   -- only if the data itself should really be deleted
  6. Describing a table
  • Basic schema
hive> desc  ruoze_emp;
OK
empno               	int                 	                    
ename               	string              	                    
job                 	string              	                    
mgr                 	int                 	                    
hiredate            	string              	                    
sal                 	double              	                    
comm                	double              	                    
deptno              	int                 	                    
Time taken: 0.296 seconds, Fetched: 8 row(s)
hive> 

  • Detailed information
hive> desc formatted ruoze_emp;
OK
# col_name            	data_type           	comment             
	 	 
empno               	int                 	                    
ename               	string              	                    
job                 	string              	                    
mgr                 	int                 	                    
hiredate            	string              	                    
sal                 	double              	                    
comm                	double              	                    
deptno              	int                 	                    
	 	 
# Detailed Table Information	 	 
Database:           	default             	 
Owner:              	hadoop              	 
CreateTime:         	Mon Oct 22 15:32:07 CST 2018	 
LastAccessTime:     	UNKNOWN             	 
Protect Mode:       	None                	 
Retention:          	0                   	 
Location:           	hdfs://hadoop001:9000/user/hive/warehouse/ruoze_emp	 
Table Type:         	MANAGED_TABLE       	 
Table Parameters:	 	 
	COLUMN_STATS_ACCURATE	true                
	numFiles            	1                   
	numRows             	0                   
	rawDataSize         	0                   
	totalSize           	700                 
	transient_lastDdlTime	1540817159          
	 	 
# Storage Information	 	 
SerDe Library:      	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe	 
InputFormat:        	org.apache.hadoop.mapred.TextInputFormat	 
OutputFormat:       	org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat	 
Compressed:         	No                  	 
Num Buckets:        	-1                  	 
Bucket Columns:     	[]                  	 
Sort Columns:       	[]                  	 
Storage Desc Params:	 	 
	field.delim         	\t                  
	serialization.format	\t                  
Time taken: 1.14 seconds, Fetched: 39 row(s)

  7. Altering a table
For example:
alter table managed_table_emp1 add columns (bonus int comment 'appended column');
alter table managed_table_emp1 rename to managed_table_emp1_bak;

Work in progress...

Tips

  • Setting set hive.cli.print.header=true in the Hive CLI prints column headers with query results.
  • Setting set hive.cli.print.current.db=true shows the current database in the prompt.
  • Both settings can also be made permanent as key/value properties in hive-site.xml.
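
For example, the same two settings as permanent properties (a configuration fragment to merge into the existing hive-site.xml):

```xml
<property>
  <name>hive.cli.print.header</name>
  <value>true</value>
</property>
<property>
  <name>hive.cli.print.current.db</name>
  <value>true</value>
</property>
```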
