SHOW (DATABASES|SCHEMAS) [LIKE 'identifier_with_wildcards'];
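#SHOW DATABASES or SHOW SCHEMAS lists all of the databases defined in the metastore. SCHEMAS and DATABASES are interchangeable; they mean the same thing.
#The optional LIKE clause filters the list of databases with a regular expression. The only wildcards allowed are '*' for any character(s) and '|' for a choice. Examples are 'employees', 'emp*', and 'emp*|*ees', all of which match the database named 'employees'.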
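As a small illustration of the LIKE filter described above, the statements below assume a database named employees exists in the metastore. The patterns are the ones quoted in the text; nothing here is specific to a particular Hive version.

```sql
-- List every database defined in the metastore.
SHOW DATABASES;

-- Only databases whose names start with "emp".
SHOW DATABASES LIKE 'emp*';

-- Names that start with "emp" or end with "ees"; SCHEMAS is a synonym for DATABASES.
SHOW SCHEMAS LIKE 'emp*|*ees';
```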
SHOW TABLES [IN database_name] ['identifier_with_wildcards'];
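#SHOW TABLES lists all base tables and views in the current database (or the one explicitly named with the IN clause) whose names match the optional regular expression. The only wildcards allowed are '*' for any character(s) and '|' for a choice. Examples are 'page_view', 'page_v*', and '*view|page*', all of which match the 'page_view' table. Matching tables are listed in alphabetical order. It is not an error if no matching tables are found in the metastore. If no regular expression is given, all tables in the selected database are listed.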
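A short sketch of the filtering rules above, using the page_view patterns from the text. The database name test_db is only an assumption for illustration.

```sql
-- All base tables and views in the current database.
SHOW TABLES;

-- Tables whose names start with "page_v".
SHOW TABLES 'page_v*';

-- The same filter against an explicitly named database (test_db is hypothetical).
SHOW TABLES IN test_db 'page_v*';

-- Names ending in "view" or starting with "page".
SHOW TABLES '*view|page*';
```

If nothing matches, the statement simply returns an empty list rather than an error.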
SHOW VIEWS [IN/FROM database_name] [LIKE 'pattern_with_wildcards'];
SHOW VIEWS; -- show all views in the current database
SHOW VIEWS 'test_*'; -- show all views that start with "test_"
SHOW VIEWS '*view2'; -- show all views that end in "view2"
SHOW VIEWS LIKE 'test_view1|test_view2'; -- show views named either "test_view1" or "test_view2"
SHOW VIEWS FROM test1; -- show views from database test1
SHOW VIEWS IN test1; -- show views from database test1 (FROM and IN are same)
SHOW VIEWS IN test1 "test_*"; -- show views from database test1 that start with "test_"
SHOW MATERIALIZED VIEWS [IN/FROM database_name] [LIKE 'pattern_with_wildcards'];
SHOW PARTITIONS table_name;
SHOW PARTITIONS table_name PARTITION(ds='2010-03-03'); -- (Note: Hive 0.6 and later)
SHOW PARTITIONS table_name PARTITION(hr='12'); -- (Note: Hive 0.6 and later)
SHOW PARTITIONS table_name PARTITION(ds='2010-03-03', hr='12'); -- (Note: Hive 0.6 and later)
SHOW PARTITIONS databaseFoo.tableBar PARTITION(ds='2010-03-03', hr='12'); -- (Note: Hive 0.13.0 and later)
SHOW PARTITIONS databaseFoo.tableBar LIMIT 10; -- (Note: Hive 4.0.0 and later)
SHOW PARTITIONS databaseFoo.tableBar PARTITION(ds='2010-03-03') LIMIT 10; -- (Note: Hive 4.0.0 and later)
SHOW PARTITIONS databaseFoo.tableBar PARTITION(ds='2010-03-03') ORDER BY hr DESC LIMIT 10; -- (Note: Hive 4.0.0 and later)
SHOW PARTITIONS databaseFoo.tableBar PARTITION(ds='2010-03-03') WHERE hr >= 10 ORDER BY hr DESC LIMIT 10; -- (Note: Hive 4.0.0 and later)
SHOW PARTITIONS databaseFoo.tableBar WHERE hr >= 10 AND ds='2010-03-03' ORDER BY hr DESC LIMIT 10;
SHOW TABLE EXTENDED [IN|FROM database_name] LIKE 'identifier_with_wildcards' [PARTITION(partition_spec)];
SHOW TABLE EXTENDED LIKE part_table;
-- SHOW COLUMNS
CREATE DATABASE test_db;
USE test_db;
CREATE TABLE foo(col1 INT, col2 INT, col3 INT, cola INT, colb INT, colc INT, a INT, b INT, c INT);
-- SHOW COLUMNS basic syntax
SHOW COLUMNS FROM foo; -- show all columns in foo
SHOW COLUMNS FROM foo "*"; -- show all columns in foo
SHOW COLUMNS IN foo "col*"; -- show columns in foo starting with "col" OUTPUT col1,col2,col3,cola,colb,colc
SHOW COLUMNS FROM foo '*c'; -- show columns in foo ending with "c" OUTPUT c,colc
SHOW COLUMNS FROM foo LIKE "col1|cola"; -- show columns in foo either col1 or cola OUTPUT col1,cola
SHOW COLUMNS FROM foo FROM test_db LIKE 'col*'; -- show columns in foo starting with "col" OUTPUT col1,col2,col3,cola,colb,colc
SHOW COLUMNS IN foo IN test_db LIKE 'col*'; -- show columns in foo starting with "col" (FROM/IN same) OUTPUT col1,col2,col3,cola,colb,colc
-- Non existing column pattern resulting in no match
SHOW COLUMNS IN foo "nomatch*";
SHOW COLUMNS IN foo "col+"; -- + wildcard not supported
SHOW COLUMNS IN foo "nomatch";
HiveQL DDL statements are documented here, including:
CREATE DATABASE/SCHEMA, TABLE, VIEW, FUNCTION, INDEX
DROP DATABASE/SCHEMA, TABLE, VIEW, INDEX
TRUNCATE TABLE
ALTER DATABASE/SCHEMA, TABLE, VIEW
MSCK REPAIR TABLE (or ALTER TABLE RECOVER PARTITIONS)
SHOW DATABASES/SCHEMAS, TABLES, TBLPROPERTIES, VIEWS, PARTITIONS, FUNCTIONS, INDEX[ES], COLUMNS, CREATE TABLE
DESCRIBE DATABASE/SCHEMA, table_name, view_name, materialized_view_name
PARTITION statements are usually options of TABLE statements, except for SHOW PARTITIONS.
CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
[COMMENT database_comment]
[LOCATION hdfs_path]
[MANAGEDLOCATION hdfs_path]
[WITH DBPROPERTIES (property_name=property_value, ...)];
SCHEMA and DATABASE are interchangeable; they mean the same thing.
DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];
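The default behavior is RESTRICT, where DROP DATABASE fails if the database is not empty. To drop the tables in the database as well, use DROP DATABASE ... CASCADE. Support for RESTRICT and CASCADE was added in Hive 0.8.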
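The difference between RESTRICT and CASCADE can be sketched as follows. The names scratch_db and t1 are hypothetical and exist only for this illustration.

```sql
-- Hypothetical database containing a single table.
CREATE DATABASE IF NOT EXISTS scratch_db;
CREATE TABLE scratch_db.t1 (id INT);

-- RESTRICT is the default, so this statement is expected to fail
-- while scratch_db still contains t1.
DROP DATABASE scratch_db;

-- CASCADE drops the contained tables along with the database.
DROP DATABASE IF EXISTS scratch_db CASCADE;
```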
ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...); -- (Note: SCHEMA added in Hive 0.14.0)
ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role; -- (Note: Hive 0.13.0 and later; SCHEMA added in Hive 0.14.0)
ALTER (DATABASE|SCHEMA) database_name SET LOCATION hdfs_path; -- (Note: Hive 2.2.1, 2.4.0 and later)
ALTER (DATABASE|SCHEMA) database_name SET MANAGEDLOCATION hdfs_path; -- (Note: Hive 4.0.0 and later)
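The ALTER DATABASE ... SET LOCATION statement does not move the contents of the database's current directory to the newly specified location. It does not change the locations associated with any tables or partitions under the specified database. It only changes the default parent directory where new tables will be added for this database. This behavior is analogous to how changing a table directory does not move existing partitions to a different location.
The ALTER DATABASE ... SET MANAGEDLOCATION statement does not move the contents of the database's managed table directories to the newly specified location. It does not change the locations associated with any tables or partitions under the specified database. It only changes the default parent directory where new tables will be added for this database. This behavior is analogous to how changing a table directory does not move existing partitions to a different location.
No other metadata about a database can be changed.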
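One way to observe the SET LOCATION behavior described above is sketched below. The database name, table name, namenode address, and paths are all assumptions; where a newly created table actually lands can also depend on the Hive version and on whether the table is managed or external.

```sql
-- Hypothetical database and HDFS paths.
CREATE DATABASE IF NOT EXISTS sales_db
  LOCATION 'hdfs://namenode:8020/warehouse/old/sales_db';
CREATE TABLE sales_db.orders_old (id INT);

-- Point the database at a new default parent directory.
ALTER DATABASE sales_db
  SET LOCATION 'hdfs://namenode:8020/warehouse/new/sales_db';

-- The database-level location should now show the new path...
DESCRIBE DATABASE sales_db;

-- ...while the existing table keeps the location it was created with.
DESCRIBE FORMATTED sales_db.orders_old;
```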
USE database_name;
USE DEFAULT;
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
LIKE existing_table_or_view_name
[LOCATION hdfs_path];
create table table_name (
id int,
dtDontQuery string,
name string
)
partitioned by (date string);
CREATE TABLE page_view(viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
STORED AS SEQUENCEFILE;
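CREATE TABLE creates a table with the given name. An error is thrown if a table or view with the same name already exists. You can use IF NOT EXISTS to skip the error.
Partitioned tables can be created using the PARTITIONED BY clause. A table can have one or more partition columns, and a separate data directory is created for each distinct value combination in the partition columns. Further, tables or partitions can be bucketed using CLUSTERED BY columns, and data can be sorted within each bucket via SORTED BY columns. This can improve performance on certain kinds of queries.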
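A partitioned and bucketed variant of the page_view table could look like the sketch below. The table name page_view_bucketed and the bucket count of 32 are arbitrary choices for illustration.

```sql
-- Partitioned by (dt, country); within each partition, rows are
-- hashed on userid into 32 buckets and sorted by viewTime in each bucket.
CREATE TABLE IF NOT EXISTS page_view_bucketed(
  viewTime INT,
  userid BIGINT,
  page_url STRING,
  referrer_url STRING,
  ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS
STORED AS SEQUENCEFILE;
```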
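If, when creating a partitioned table, you get the error "FAILED: Error in semantic analysis: Column repeated in partitioning columns", it means you are trying to include the partition column in the data of the table itself. You probably do have that column defined. However, the partition you create makes a pseudocolumn on which you can query, so you must rename your table column to something else (one that users should not query on!).
For example, suppose your original unpartitioned table had three columns: id, date, and name, and you now want to partition on date. Your Hive definition could use "dtDontQuery" as the column name so that "date" can be used for partitioning (and querying).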
DROP TABLE [IF EXISTS] table_name [PURGE];
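If PURGE is specified, the table data does not go to the .Trash/Current directory and so cannot be retrieved in the event of a mistaken DROP. The purge option can also be specified with the table property auto.purge.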
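The two ways of bypassing the trash mentioned above can be sketched like this. The table names tmp_events and staging_events are hypothetical.

```sql
-- Skip the trash for this one drop; the data cannot be recovered afterwards.
DROP TABLE IF EXISTS tmp_events PURGE;

-- Alternatively, mark the table so that a later drop bypasses the trash
-- without needing the PURGE keyword.
ALTER TABLE staging_events SET TBLPROPERTIES ('auto.purge'='true');
DROP TABLE staging_events;
```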
TRUNCATE [TABLE] table_name [PARTITION partition_spec];
partition_spec:
: (partition_column = partition_col_value, partition_column = partition_col_value, ...)
Removes all rows from a table or partition(s). The rows are moved to Trash if the filesystem Trash is enabled; otherwise they are deleted.
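For instance, assuming the partitioned page_view table defined earlier is a managed table, truncation could be scoped as follows. The partition values are made up for illustration.

```sql
-- Remove the rows of a single partition only.
TRUNCATE TABLE page_view PARTITION (dt='2008-06-08', country='US');

-- Remove all rows from every partition of the table.
TRUNCATE TABLE page_view;
```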
#Rename Table
ALTER TABLE table_name RENAME TO new_table_name;
#Alter Table Properties
ALTER TABLE table_name SET TBLPROPERTIES table_properties;
table_properties:
: (property_name = property_value, property_name = property_value, ... )
#Alter Table Comment
ALTER TABLE table_name SET TBLPROPERTIES ('comment' = new_comment);
#Add SerDe Properties
ALTER TABLE table_name [PARTITION partition_spec] SET SERDE serde_class_name [WITH SERDEPROPERTIES serde_properties];
ALTER TABLE table_name [PARTITION partition_spec] SET SERDEPROPERTIES serde_properties;
serde_properties:
: (property_name = property_value, property_name = property_value, ... )
CREATE TABLE test_change (a int, b int, c int);
-- First change column a's name to a1.
ALTER TABLE test_change CHANGE a a1 INT;
-- Next change column a1's name to a2, its data type to string, and put it after column b.
ALTER TABLE test_change CHANGE a1 a2 STRING AFTER b;
-- The new table's structure is: b int, a2 string, c int.
-- Then change column c's name to c1, and put it as the first column.
ALTER TABLE test_change CHANGE c c1 INT FIRST;
-- The new table's structure is: c1 int, b int, a2 string.
-- Add a comment to column a2 (a1 was renamed to a2 above).
ALTER TABLE test_change CHANGE a2 a2 STRING COMMENT 'this is column a2';
CREATE VIEW psn_view
AS
SELECT id, name FROM psn;
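A view is a table derived from one or more base tables. It saves a complex or frequently repeated query under a name.
A view is not stored independently in the database; it is a virtual table. The database stores only the view's definition, not its data, which remains in the base tables the view is derived from. Accordingly, there is no table directory for a view in HDFS. In the metastore, a view is stored as its SQL statement and its table type is shown as VIRTUAL_VIEW.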
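A quick way to check this, assuming a base table psn with id and name columns exists:

```sql
-- Define the view; only the definition is recorded in the metastore.
CREATE VIEW IF NOT EXISTS psn_view AS
SELECT id, name FROM psn;

-- The "Table Type" row of the output should read VIRTUAL_VIEW.
DESCRIBE FORMATTED psn_view;

-- Shows the stored CREATE VIEW statement for the view.
SHOW CREATE TABLE psn_view;
```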
DROP VIEW [IF EXISTS] view_name;
ALTER VIEW [db_name.]view_name SET TBLPROPERTIES table_properties;
table_properties:
: (property_name = property_value, property_name = property_value, ...)
ALTER VIEW [db_name.]view_name AS select_statement;
CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db_name.]materialized_view_name
[DISABLE REWRITE]
[COMMENT materialized_view_comment]
[PARTITIONED ON (col_name, ...)]
[CLUSTERED ON (col_name, ...) | DISTRIBUTED ON (col_name, ...) SORTED ON (col_name, ...)]
[
[ROW FORMAT row_format]
[STORED AS file_format]
| STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
]
[LOCATION hdfs_path]
[TBLPROPERTIES (property_name=property_value, ...)]
AS SELECT ...;
DROP MATERIALIZED VIEW [db_name.]materialized_view_name;
ALTER MATERIALIZED VIEW [db_name.]materialized_view_name ENABLE|DISABLE REWRITE;
CREATE INDEX index_name
ON TABLE base_table_name (col_name, ...)
AS index_type
[WITH DEFERRED REBUILD]
[IDXPROPERTIES (property_name=property_value, ...)]
[IN TABLE index_table_name]
[
[ ROW FORMAT ...] STORED AS ...
| STORED BY ...
]
[LOCATION hdfs_path]
[TBLPROPERTIES (...)]
[COMMENT "index comment"];
DROP INDEX [IF EXISTS] index_name ON table_name;
ALTER INDEX index_name ON table_name [PARTITION partition_spec] REBUILD;
Index usage guide