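Reference: https://cwiki.apache.org/confluence/display/Hive/Home#Home-UserDocumentation
Hive version: Hive 1.1.0
These are notes taken while working through the official wiki. For what each parameter does, consult the official documentation directly.
Syntax for creating a database (parts in "[]" are optional):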
CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
[COMMENT database_comment]
[LOCATION hdfs_path]
[WITH DBPROPERTIES (property_name=property_value, ...)];
The Hive CLI starts in the default database. Run show databases to list the databases that exist.
Example:
CREATE DATABASE IF NOT EXISTS soctt
COMMENT "learn hive."
WITH DBPROPERTIES ("creator"="ran","date"="20180107");
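Run use database_name to switch to a given database. In this setup the database metadata is stored in MySQL, so it can be inspected by querying the dbs and database_params tables in Hive's metastore database in MySQL.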
mysql> select a.desc,a.name,b.PARAM_KEY,b.PARAM_VALUE from dbs a, database_params b where a.db_id = b.db_id and a.name='soctt';
+-------------+-------+-----------+-------------+
| desc | name | PARAM_KEY | PARAM_VALUE |
+-------------+-------+-----------+-------------+
| learn hive. | soctt | creator | ran |
| learn hive. | soctt | date | 20180107 |
+-------------+-------+-----------+-------------+
2 rows in set (0.00 sec)
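For comparison, the same comment and properties can usually be read from inside Hive, without querying the metastore tables directly. A minimal sketch, assuming the soctt database created above still exists:

```sql
-- Switch the current session to the soctt database.
USE soctt;

-- Show the comment, location, owner and DBPROPERTIES of soctt.
DESCRIBE DATABASE EXTENDED soctt;
```

The exact output layout depends on the Hive version, so it is not reproduced here.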
Syntax for dropping a database:
DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];
Example:
DROP DATABASE IF EXISTS soctt ;
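Note: RESTRICT is the default, and under it the drop fails if the database still contains tables. Use CASCADE to drop a database together with its tables.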
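The difference between the two modes can be sketched as follows, assuming soctt still contains at least one table. The two statements are alternatives, not a sequence to run together:

```sql
-- Default behaviour spelled out: fails if soctt still contains tables.
DROP DATABASE IF EXISTS soctt RESTRICT;

-- Drops the tables in soctt first, then the database itself.
DROP DATABASE IF EXISTS soctt CASCADE;
```

CASCADE is destructive, so it is worth checking with show tables what the database holds before running it.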
Syntax for altering a database:
ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...); -- (Note: SCHEMA added in Hive 0.14.0)
ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role; -- (Note: Hive 0.13.0 and later; SCHEMA added in Hive 0.14.0)
ALTER (DATABASE|SCHEMA) database_name SET LOCATION hdfs_path; -- (Note: Hive 2.2.1, 2.4.0 and later)
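As a rough illustration of the first two forms against the soctt database from earlier (the property key "editor" and the user name hadoop are made up for this sketch; the SET LOCATION form is left out because, per the note above, it is not available in Hive 1.1.0):

```sql
-- Add or overwrite a database property ("editor" is a hypothetical key).
ALTER DATABASE soctt SET DBPROPERTIES ("editor"="ran");

-- Hand ownership of the database to a user (the user name is an assumption).
ALTER DATABASE soctt SET OWNER USER hadoop;
```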
Syntax for creating a table:
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name -- (Note: TEMPORARY available in Hive 0.14.0 and later)
[(col_name data_type [COMMENT col_comment], ... [constraint_specification])]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
[SKEWED BY (col_name, col_name, ...) -- (Note: Available in Hive 0.10.0 and later)]
ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
[STORED AS DIRECTORIES]
[
[ROW FORMAT row_format]
[STORED AS file_format]
| STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later)
]
[LOCATION hdfs_path]
[TBLPROPERTIES (property_name=property_value, ...)] -- (Note: Available in Hive 0.6.0 and later)
[AS select_statement]; -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)
Example:
CREATE TABLE emp (
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
) ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t";
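Load data into the emp table. Loads are usually done with OVERWRITE, and INSERT is rarely used in later maintenance. The LOCAL keyword means the path is on the local file system. Without it the path is taken to be on HDFS, so an HDFS path must be given.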
LOAD DATA LOCAL INPATH "/home/hadoop/hiveData/emp.txt" OVERWRITE INTO TABLE emp;
[root@hadoop001 hiveData]# cat emp.txt
7369 SMITH CLERK 7902 1980/12/17 800.00 20
7499 ALLEN SALESMAN 7698 1981/2/20 1600.00 300.00 30
7521 WARD SALESMAN 7698 1981/2/22 1250.00 500.00 30
7566 JONES MANAGER 7839 1981/4/2 2975.00 20
7654 MARTIN SALESMAN 7698 1981/9/28 1250.00 1400.00 30
7698 BLAKE MANAGER 7839 1981/5/1 2850.00 30
7782 CLARK MANAGER 7839 1981/6/9 2450.00 10
7788 SCOTT ANALYST 7566 1987/4/19 3000.00 20
7839 KING PRESIDENT 1981/11/17 5000.00 10
7844 TURNER SALESMAN 7698 1981/9/8 1500.00 0.00 30
7876 ADAMS CLERK 7788 1987/5/23 1100.00 20
7900 JAMES CLERK 7698 1981/12/3 950.00 30
7902 FORD ANALYST 7566 1981/12/3 3000.00 20
7934 MILLER CLERK 7782 1982/1/23 1300.00 10
hive> LOAD DATA LOCAL INPATH "/home/hadoop/hiveData/emp.txt" OVERWRITE INTO TABLE emp;
Loading data to table soctt.emp
Table soctt.emp stats: [numFiles=1, numRows=0, totalSize=657, rawDataSize=0]
OK
Time taken: 1.019 seconds
hive> select * from emp;
OK
7369 SMITH CLERK 7902 1980/12/17 800.0 NULL 20
7499 ALLEN SALESMAN 7698 1981/2/20 1600.0 300.0 30
7521 WARD SALESMAN 7698 1981/2/22 1250.0 500.0 30
7566 JONES MANAGER 7839 1981/4/2 2975.0 NULL 20
7654 MARTIN SALESMAN 7698 1981/9/28 1250.0 1400.0 30
7698 BLAKE MANAGER 7839 1981/5/1 2850.0 NULL 30
7782 CLARK MANAGER 7839 1981/6/9 2450.0 NULL 10
7788 SCOTT ANALYST 7566 1987/4/19 3000.0 NULL 20
7839 KING PRESIDENT NULL 1981/11/17 5000.0 NULL 10
7844 TURNER SALESMAN 7698 1981/9/8 1500.0 0.0 30
7876 ADAMS CLERK 7788 1987/5/23 1100.0 NULL 20
7900 JAMES CLERK 7698 1981/12/3 950.0 NULL 30
7902 FORD ANALYST 7566 1981/12/3 3000.0 NULL 20
7934 MILLER CLERK 7782 1982/1/23 1300.0 NULL 10
Time taken: 0.41 seconds, Fetched: 14 row(s)
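The load above uses both LOCAL and OVERWRITE. The two variations below are a sketch of what changes when either keyword is dropped. The HDFS path is hypothetical, and the statements are alternatives rather than a sequence:

```sql
-- No LOCAL keyword: the path is resolved on HDFS (hypothetical path).
-- The source file is moved into the table's directory, not copied.
LOAD DATA INPATH "/user/hadoop/hiveData/emp.txt" OVERWRITE INTO TABLE emp;

-- No OVERWRITE keyword: the file is added to whatever the table already holds,
-- so loading the same file twice would produce duplicate rows.
LOAD DATA LOCAL INPATH "/home/hadoop/hiveData/emp.txt" INTO TABLE emp;
```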