Learning HQL (HiveQL)

Show

SHOW (DATABASES|SCHEMAS) [LIKE 'identifier_with_wildcards'];
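#SHOW DATABASES or SHOW SCHEMAS lists all of the databases defined in the metastore. SCHEMAS and DATABASES are interchangeable; they mean the same thing.
#The optional LIKE clause filters the list of databases with a pattern. The only wildcards allowed in the pattern are '*' for any character(s) and '|' for a choice. Examples are 'employees', 'emp*', and 'emp*|*ees', all of which match the database named 'employees'.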
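
As a concrete illustration of the wildcard rules above, here is a minimal sketch. The database names employees and emp_archive are hypothetical, and it assumes a Hive version that uses the '*' and '|' wildcards described in this section.

```sql
-- Hypothetical databases, created only for this illustration.
CREATE DATABASE IF NOT EXISTS employees;
CREATE DATABASE IF NOT EXISTS emp_archive;

SHOW DATABASES;                     -- every database defined in the metastore
SHOW SCHEMAS;                       -- same statement under its other name
SHOW DATABASES LIKE 'employees';    -- exact name, no wildcard
SHOW DATABASES LIKE 'emp*';         -- '*' stands for any run of characters
SHOW DATABASES LIKE 'emp*|*ees';    -- '|' separates alternative patterns
```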

SHOW TABLES [IN database_name] ['identifier_with_wildcards'];
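#SHOW TABLES lists all base tables and views in the current database (or in the database explicitly named with the IN clause) whose names match the optional pattern. The only wildcards allowed in the pattern are '*' for any character(s) and '|' for a choice. Examples are 'page_view', 'page_v*', and '*view|page*', all of which match the 'page_view' table. Matching tables are listed in alphabetical order. It is not an error if no matching table is found in the metastore. If no pattern is given, all tables in the selected database are listed.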
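
A short sketch of these variants, assuming a table named page_view exists in the default database (the table name comes from the examples above; the database choice is an assumption):

```sql
SHOW TABLES;                            -- all tables and views in the current database
SHOW TABLES 'page_v*';                  -- names starting with "page_v"
SHOW TABLES IN default '*view|page*';   -- names ending in "view" or starting with "page"
SHOW TABLES 'no_such_prefix*';          -- no match: empty result, not an error
```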

SHOW VIEWS [IN/FROM database_name] [LIKE 'pattern_with_wildcards'];

SHOW VIEWS;                                -- show all views in the current database
SHOW VIEWS 'test_*';                       -- show all views that start with "test_"
SHOW VIEWS '*view2';                       -- show all views that end in "view2"
SHOW VIEWS LIKE 'test_view1|test_view2';   -- show views named either "test_view1" or "test_view2"
SHOW VIEWS FROM test1;                     -- show views from database test1
SHOW VIEWS IN test1;                       -- show views from database test1 (FROM and IN are same)
SHOW VIEWS IN test1 "test_*";              -- show views from database test1 that start with "test_"

SHOW MATERIALIZED VIEWS [IN/FROM database_name] [LIKE 'pattern_with_wildcards'];

SHOW PARTITIONS table_name;
SHOW PARTITIONS table_name PARTITION(ds='2010-03-03');            -- (Note: Hive 0.6 and later)
SHOW PARTITIONS table_name PARTITION(hr='12');                    -- (Note: Hive 0.6 and later)
SHOW PARTITIONS table_name PARTITION(ds='2010-03-03', hr='12');   -- (Note: Hive 0.6 and later)
SHOW PARTITIONS databaseFoo.tableBar PARTITION(ds='2010-03-03', hr='12');   -- (Note: Hive 0.13.0 and later)
SHOW PARTITIONS databaseFoo.tableBar LIMIT 10;                                                               -- (Note: Hive 4.0.0 and later)
SHOW PARTITIONS databaseFoo.tableBar PARTITION(ds='2010-03-03') LIMIT 10;                                    -- (Note: Hive 4.0.0 and later)
SHOW PARTITIONS databaseFoo.tableBar PARTITION(ds='2010-03-03') ORDER BY hr DESC LIMIT 10;                   -- (Note: Hive 4.0.0 and later)
SHOW PARTITIONS databaseFoo.tableBar PARTITION(ds='2010-03-03') WHERE hr >= 10 ORDER BY hr DESC LIMIT 10;    -- (Note: Hive 4.0.0 and later)
SHOW PARTITIONS databaseFoo.tableBar WHERE hr >= 10 AND ds='2010-03-03' ORDER BY hr DESC LIMIT 10;           -- (Note: Hive 4.0.0 and later)

SHOW TABLE EXTENDED [IN|FROM database_name] LIKE 'identifier_with_wildcards' [PARTITION(partition_spec)];
SHOW TABLE EXTENDED LIKE part_table;

-- SHOW COLUMNS
CREATE DATABASE test_db;
USE test_db;
CREATE TABLE foo(col1 INT, col2 INT, col3 INT, cola INT, colb INT, colc INT, a INT, b INT, c INT);
-- SHOW COLUMNS basic syntax
SHOW COLUMNS FROM foo;                            -- show all columns in foo
SHOW COLUMNS FROM foo "*";                        -- show all columns in foo
SHOW COLUMNS IN foo "col*";                       -- show columns in foo starting with "col"                 OUTPUT col1,col2,col3,cola,colb,colc
SHOW COLUMNS FROM foo '*c';                       -- show columns in foo ending with "c"                     OUTPUT c,colc
SHOW COLUMNS FROM foo LIKE "col1|cola";           -- show columns in foo either col1 or cola                 OUTPUT col1,cola
SHOW COLUMNS FROM foo FROM test_db LIKE 'col*';   -- show columns in foo starting with "col"                 OUTPUT col1,col2,col3,cola,colb,colc
SHOW COLUMNS IN foo IN test_db LIKE 'col*';       -- show columns in foo starting with "col" (FROM/IN same)  OUTPUT col1,col2,col3,cola,colb,colc
-- Non existing column pattern resulting in no match
SHOW COLUMNS IN foo "nomatch*";
SHOW COLUMNS IN foo "col+";                       -- + wildcard not supported
SHOW COLUMNS IN foo "nomatch";

Data Definition Statements

Overview

HiveQL DDL statements are documented here, including:

CREATE DATABASE/SCHEMA, TABLE, VIEW, FUNCTION, INDEX
DROP DATABASE/SCHEMA, TABLE, VIEW, INDEX
TRUNCATE TABLE
ALTER DATABASE/SCHEMA, TABLE, VIEW
MSCK REPAIR TABLE (or ALTER TABLE RECOVER PARTITIONS)
SHOW DATABASES/SCHEMAS, TABLES, TBLPROPERTIES, VIEWS, PARTITIONS, FUNCTIONS, INDEX[ES], COLUMNS, CREATE TABLE
DESCRIBE DATABASE/SCHEMA, table_name, view_name, materialized_view_name

PARTITION statements are usually options of TABLE statements, except for SHOW PARTITIONS.

Create/Drop/Alter/Use Database

Create Database

CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
  [COMMENT database_comment]
  [LOCATION hdfs_path]
  [MANAGEDLOCATION hdfs_path]
  [WITH DBPROPERTIES (property_name=property_value, ...)];

The uses of SCHEMA and DATABASE are interchangeable; they mean the same thing.

Drop Database

DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];

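The default behavior is RESTRICT, under which DROP DATABASE fails if the database is not empty. To drop the tables in the database as well, use DROP DATABASE ... CASCADE. Support for RESTRICT and CASCADE was added in Hive 0.8.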
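
The difference between RESTRICT and CASCADE can be sketched as follows. The database scratch_db and table t1 are hypothetical names used only for this illustration.

```sql
CREATE DATABASE IF NOT EXISTS scratch_db;
CREATE TABLE scratch_db.t1 (id INT);

DROP DATABASE scratch_db;             -- fails: RESTRICT is the default and scratch_db still contains t1
DROP DATABASE scratch_db CASCADE;     -- drops t1 first, then the database itself
DROP DATABASE IF EXISTS scratch_db;   -- no error even though the database is already gone
```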

Alter Database

ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...);   -- (Note: SCHEMA added in Hive 0.14.0) 
ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role;   -- (Note: Hive 0.13.0 and later; SCHEMA added in Hive 0.14.0)
ALTER (DATABASE|SCHEMA) database_name SET LOCATION hdfs_path; -- (Note: Hive 2.2.1, 2.4.0 and later)
ALTER (DATABASE|SCHEMA) database_name SET MANAGEDLOCATION hdfs_path; -- (Note: Hive 4.0.0 and later)

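The ALTER DATABASE ... SET LOCATION statement does not move the contents of the database's current directory to the newly specified location. It does not change the locations associated with any tables or partitions under the specified database. It only changes the default parent directory in which new tables will be added for this database. This is analogous to how changing a table's directory does not move existing partitions to a different location.

The ALTER DATABASE ... SET MANAGEDLOCATION statement does not move the contents of the database's managed-table directories to the newly specified location. It does not change the locations associated with any tables or partitions under the specified database. It only changes the default parent directory in which new managed tables will be added for this database. This is analogous to how changing a table's directory does not move existing partitions to a different location.

No other metadata about a database can be changed.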
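
To make the effect of SET LOCATION concrete, here is a minimal sketch. The database sales_db, its tables, the property owner_team, and the namenode address are all hypothetical. It assumes a Hive version that supports SET LOCATION and that the new location is given as a full URI. On Hive 3 and later, managed tables may be placed under the managed location instead, so treat the directory comments as illustrative.

```sql
CREATE DATABASE IF NOT EXISTS sales_db
  COMMENT 'hypothetical example database'
  WITH DBPROPERTIES ('owner_team' = 'analytics');

CREATE TABLE sales_db.orders_old (id INT);   -- created under the database's current directory

ALTER DATABASE sales_db SET DBPROPERTIES ('owner_team' = 'finance');
ALTER DATABASE sales_db SET LOCATION 'hdfs://namenode:8020/warehouse/new/sales_db.db';

CREATE TABLE sales_db.orders_new (id INT);   -- new tables default to the new parent directory

DESCRIBE DATABASE EXTENDED sales_db;         -- shows the updated location and DBPROPERTIES
DESCRIBE FORMATTED sales_db.orders_old;      -- its Location is unchanged by SET LOCATION
```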

Use Database

USE database_name;
USE DEFAULT;

Create/Drop/Truncate Table

Create Table

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  LIKE existing_table_or_view_name
  [LOCATION hdfs_path];

create table table_name (
  id                int,
  dtDontQuery       string,
  name              string
)
partitioned by (`date` string);   -- backticks because DATE is a reserved keyword in Hive 1.2.0 and later

CREATE TABLE page_view(viewTime INT, userid BIGINT,
     page_url STRING, referrer_url STRING,
     ip STRING COMMENT 'IP Address of the User')
 COMMENT 'This is the page view table'
 PARTITIONED BY(dt STRING, country STRING)
 STORED AS SEQUENCEFILE;

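CREATE TABLE creates a table with the given name. An error is thrown if a table or view with the same name already exists. You can use IF NOT EXISTS to skip the error.

Partitioned tables can be created with the PARTITIONED BY clause. A table can have one or more partition columns, and a separate data directory is created for each distinct combination of values in the partition columns. In addition, tables or partitions can be bucketed using CLUSTERED BY columns, and data can be sorted within each bucket via SORTED BY columns. This can improve performance for certain kinds of queries.

If you get the error "FAILED: Error in semantic analysis: Column repeated in partitioning columns" when creating a partitioned table, it means you are trying to include the partition column in the data of the table itself. You probably do have that column defined. However, the partition you create makes a pseudocolumn on which you can query, so you must rename the table column to something else (something users should not query on!).

For example, suppose your original unpartitioned table had three columns: id, date, and name, and you now want to partition by date. Your Hive definition could use "dtDontQuery" as the column name so that "date" can be used for partitioning (and querying).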
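
The partitioning and bucketing clauses described above can be sketched together as follows. The table names page_view_bucketed and bad_part, the bucket count, and the partition values are assumptions chosen for illustration.

```sql
-- Partitioned by (dt, country); within each partition, rows are
-- hashed on userid into 32 buckets and sorted by viewTime.
CREATE TABLE IF NOT EXISTS page_view_bucketed (
  viewTime INT,
  userid   BIGINT,
  page_url STRING
)
PARTITIONED BY (dt STRING, country STRING)
CLUSTERED BY (userid) SORTED BY (viewTime) INTO 32 BUCKETS
STORED AS SEQUENCEFILE;

-- Each distinct (dt, country) combination gets its own data directory.
ALTER TABLE page_view_bucketed ADD PARTITION (dt='2008-06-08', country='US');
SHOW PARTITIONS page_view_bucketed;

-- Partition columns are queryable pseudocolumns.
SELECT page_url FROM page_view_bucketed WHERE dt = '2008-06-08' AND country = 'US';

-- Fails with the "Column repeated in partitioning columns" error:
-- dt is declared both as a data column and as a partition column.
CREATE TABLE bad_part (id INT, dt STRING) PARTITIONED BY (dt STRING);
```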

Drop Table

DROP TABLE [IF EXISTS] table_name [PURGE];

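If PURGE is specified, the table data does not go to the .Trash/Current directory, so it cannot be retrieved after a mistaken DROP. The purge option can also be specified with the table property auto.purge.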
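
Both ways of bypassing the trash can be sketched like this. The table names tmp_events and tmp_events2 are hypothetical.

```sql
-- Option 1: request the purge explicitly on the DROP statement.
CREATE TABLE tmp_events (id INT);
DROP TABLE IF EXISTS tmp_events PURGE;    -- data is deleted immediately, not moved to .Trash

-- Option 2: mark the table so that its data always skips the trash.
CREATE TABLE tmp_events2 (id INT) TBLPROPERTIES ('auto.purge' = 'true');
DROP TABLE IF EXISTS tmp_events2;         -- auto.purge has the same effect as PURGE
```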

Truncate Table

TRUNCATE [TABLE] table_name [PARTITION partition_spec];
partition_spec:
  : (partition_column = partition_col_value, partition_column = partition_col_value, ...)

Removes all rows from a table or partition(s). If the filesystem Trash is enabled, the rows are moved to the trash; otherwise they are deleted.
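
A brief sketch of TRUNCATE at different granularities, reusing the page_view table defined earlier. The partition values are hypothetical, and page_view is assumed to be a managed (non-external) table.

```sql
TRUNCATE TABLE page_view PARTITION (dt='2008-06-08', country='US');   -- empty one partition
TRUNCATE TABLE page_view PARTITION (dt='2008-06-08');                 -- partial spec: every country for that day
TRUNCATE TABLE page_view;                                             -- empty the whole table, keeping its definition
```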

Alter Table/Partition/Column

#Rename Table
ALTER TABLE table_name RENAME TO new_table_name;
#Alter Table Properties
ALTER TABLE table_name SET TBLPROPERTIES table_properties;
table_properties:
  : (property_name = property_value, property_name = property_value, ... )
#Alter Table Comment
ALTER TABLE table_name SET TBLPROPERTIES ('comment' = new_comment);
#Add SerDe Properties
ALTER TABLE table_name [PARTITION partition_spec] SET SERDE serde_class_name [WITH SERDEPROPERTIES serde_properties];
ALTER TABLE table_name [PARTITION partition_spec] SET SERDEPROPERTIES serde_properties;
serde_properties:
  : (property_name = property_value, property_name = property_value, ... )
#Change Column Name/Type/Position/Comment
CREATE TABLE test_change (a int, b int, c int);

-- First change column a's name to a1.
ALTER TABLE test_change CHANGE a a1 INT;

-- Next change column a1's name to a2, its data type to string, and put it after column b.
ALTER TABLE test_change CHANGE a1 a2 STRING AFTER b;
-- The new table's structure is:  b int, a2 string, c int.

-- Then change column c's name to c1, and put it as the first column.
ALTER TABLE test_change CHANGE c c1 INT FIRST;
-- The new table's structure is:  c1 int, b int, a2 string.

-- Add a comment to column a2 (a1 no longer exists after the rename above).
ALTER TABLE test_change CHANGE a2 a2 STRING COMMENT 'this is column a2';

Create/Drop/Alter View

create view psn_view
as
select id, name from psn;

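A view is a table derived from one or more base tables. It saves a complex or frequently repeated query so that it can be reused.

A view has no independent storage in the database; it is a virtual table. The database stores only the view's definition, not the data behind it, which stays in the base tables the view is derived from. Accordingly, there is no table directory for the view in HDFS. In the metastore, a view is stored as its SQL statement, and its table type is shown as VIRTUAL_VIEW.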
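
One way to check this from the Hive shell is sketched below. It assumes the base table psn with columns id and name already exists.

```sql
CREATE VIEW IF NOT EXISTS psn_view AS SELECT id, name FROM psn;

SHOW CREATE TABLE psn_view;      -- prints the stored CREATE VIEW statement
DESCRIBE FORMATTED psn_view;     -- Table Type shows VIRTUAL_VIEW, along with the view's SQL text
SELECT * FROM psn_view;          -- rows are read from the base table psn at query time
```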
 

DROP VIEW [IF EXISTS] view_name;

ALTER VIEW [db_name.]view_name SET TBLPROPERTIES table_properties;
table_properties:
  : (property_name = property_value, property_name = property_value, ...)

ALTER VIEW [db_name.]view_name AS select_statement;

Create/Drop/Alter Materialized View

CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db_name.]materialized_view_name
  [DISABLE REWRITE]
  [COMMENT materialized_view_comment]
  [PARTITIONED ON (col_name, ...)]
  [CLUSTERED ON (col_name, ...) | DISTRIBUTED ON (col_name, ...) SORTED ON (col_name, ...)]
  [
    [ROW FORMAT row_format]
    [STORED AS file_format]
      | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)]
AS SELECT ...;

DROP MATERIALIZED VIEW [db_name.]materialized_view_name;

ALTER MATERIALIZED VIEW [db_name.]materialized_view_name ENABLE|DISABLE REWRITE;

Create/Drop/Alter Index

CREATE INDEX index_name
  ON TABLE base_table_name (col_name, ...)
  AS index_type
  [WITH DEFERRED REBUILD]
  [IDXPROPERTIES (property_name=property_value, ...)]
  [IN TABLE index_table_name]
  [
     [ ROW FORMAT ...] STORED AS ...
     | STORED BY ...
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (...)]
  [COMMENT "index comment"];

DROP INDEX [IF EXISTS] index_name ON table_name;

ALTER INDEX index_name ON table_name [PARTITION partition_spec] REBUILD;

Index usage guide

Create/Drop Macro

Create/Drop/Reload Function