Introduction to Hive Lateral View

1). Lateral View syntax

lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)*
fromClause: FROM baseTable (lateralView)*

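2). Lateral View is used together with UDTFs (user-defined table-generating functions) such as explode(). A UDTF produces zero or more output rows for each input row. The lateral view applies the UDTF to every row of the base table and joins the generated rows back to that input row, forming a virtual table with the given table alias. In effect, a single row that holds an array is expanded into multiple rows.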

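3). In older Hive releases (the Hive documentation lists this as fixed in 0.6.0), Lateral View does not support the predicate push-down optimization. If the query uses a WHERE clause, it may fail to compile.

As a workaround, run the following before the query:

set hive.optimize.ppd=false;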
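
As a rough illustration of the workaround, the sketch below disables predicate push-down for the current session and then filters on an exploded column. It reuses the pageAds table from the example in section 4; the filter adid > 2 is only an illustrative assumption and does not come from the original text.

```sql
-- Only needed on old Hive releases where WHERE + LATERAL VIEW fails to compile.
SET hive.optimize.ppd=false;

-- Filter on the column produced by the lateral view (illustrative predicate).
SELECT pageid, adid
FROM pageAds
LATERAL VIEW explode(adid_list) adTable AS adid
WHERE adid > 2;
```

The setting applies to the session, so on releases that need it, it affects every later query until it is set back to true.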

4). Example

Consider a base table pageAds with two columns, pageid and adid_list:

string pageid      Array<int> adid_list
"front_page"       [1, 2, 3]
"contact_page"     [3, 4, 5]

A lateral view with explode() turns each element of adid_list into its own row:

SELECT pageid, adid FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid;

This produces the following output:

string pageid      int adid
"front_page"       1
"front_page"       2
"front_page"       3
"contact_page"     3
"contact_page"     4
"contact_page"     5

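count() and GROUP BY can then be applied to the exploded rows, for example to count how many times each ad appears across all pages: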

SELECT adid, count(1) FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid GROUP BY adid;

The result is:

int adid    count(1)
1           1
2           1
3           2
4           1
5           1
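
To try the pageAds queries above, the table has to exist first. The following is a minimal setup sketch that is not part of the original article. It assumes a Hive version that accepts SELECT without a FROM clause (0.13 or later). Complex types such as arrays cannot be written with INSERT ... VALUES, so the array() constructor is used instead.

```sql
-- Assumed table definition matching the two columns described in the example.
CREATE TABLE pageAds (
  pageid    STRING,
  adid_list ARRAY<INT>
);

-- Arrays are built with array(); INSERT ... VALUES does not accept complex types.
INSERT INTO TABLE pageAds SELECT 'front_page',   array(1, 2, 3);
INSERT INTO TABLE pageAds SELECT 'contact_page', array(3, 4, 5);

-- One output row per (pageid, adid) pair.
SELECT pageid, adid
FROM pageAds
LATERAL VIEW explode(adid_list) adTable AS adid;

-- Number of pages on which each ad appears.
SELECT adid, count(1)
FROM pageAds
LATERAL VIEW explode(adid_list) adTable AS adid
GROUP BY adid;
```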

5). Multiple Lateral Views

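A FROM clause can contain several LATERAL VIEW clauses, and a later lateral view can reference columns produced by any lateral view to its left. For example, the following HQL: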

SELECT * FROM exampleTable LATERAL VIEW explode(col1) myTable1 AS myCol1 LATERAL VIEW explode(myCol1) myTable2 AS myCol2;
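
The chained form above only type-checks when the first exploded column is itself an array, because the second explode() is applied to myCol1. Here is a minimal sketch, assuming a hypothetical definition of exampleTable in which col1 has type ARRAY<ARRAY<INT>> (the original does not show this table), and a Hive version that allows SELECT without FROM:

```sql
-- Hypothetical table: col1 is an array of arrays.
CREATE TABLE exampleTable (col1 ARRAY<ARRAY<INT>>);
INSERT INTO TABLE exampleTable SELECT array(array(1, 2), array(3));

-- First lateral view: one row per inner array (myCol1 is ARRAY<INT>).
-- Second lateral view: one row per element of that inner array (myCol2 is INT).
SELECT myCol1, myCol2
FROM exampleTable
LATERAL VIEW explode(col1) myTable1 AS myCol1
LATERAL VIEW explode(myCol1) myTable2 AS myCol2;
```

With this sample row the query should return one row per innermost element, three rows in total.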

Suppose the base table baseTable looks like this:

Array<int> col1    Array<string> col2
[1, 2]             ["a", "b", "c"]
[3, 4]             ["d", "e", "f"]

HQL with a single lateral view:

SELECT myCol1, col2 FROM baseTable LATERAL VIEW explode(col1) myTable1 AS myCol1;

This produces the following result:

int myCol1    Array<string> col2
1             ["a", "b", "c"]
2             ["a", "b", "c"]
3             ["d", "e", "f"]
4             ["d", "e", "f"]

HQL with multiple lateral views:

SELECT myCol1, myCol2 FROM baseTable LATERAL VIEW explode(col1) myTable1 AS myCol1
LATERAL VIEW explode(col2) myTable2 AS myCol2;

This produces the following result:

int myCol1    string myCol2
1             "a"
1             "b"
1             "c"
2             "a"
2             "b"
2             "c"
3             "d"
3             "e"
3             "f"
4             "d"
4             "e"
4             "f"
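
The result above is a per-row cross product: every element of col1 is paired with every element of col2 from the same base row, so each row contributes size(col1) * size(col2) output rows. The sketch below is one way to check that. The CREATE/INSERT statements are an assumed setup for baseTable (the original only shows its contents), and the aliases t1, t2, and expected_rows are illustrative names.

```sql
-- Assumed setup matching the base table shown above.
CREATE TABLE baseTable (col1 ARRAY<INT>, col2 ARRAY<STRING>);
INSERT INTO TABLE baseTable SELECT array(1, 2), array('a', 'b', 'c');
INSERT INTO TABLE baseTable SELECT array(3, 4), array('d', 'e', 'f');

-- Rows each base row should contribute: 2 * 3 = 6 for both rows here.
SELECT col1, col2, size(col1) * size(col2) AS expected_rows
FROM baseTable;

-- Total rows produced by the two lateral views (6 + 6 = 12 for this data).
SELECT count(*)
FROM baseTable
LATERAL VIEW explode(col1) t1 AS myCol1
LATERAL VIEW explode(col2) t2 AS myCol2;
```

Because the output grows multiplicatively, stacking lateral views over large arrays can inflate the row count quickly.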
