Hive explode and lateral view usage

Loosely similar to a cursor in SQL Server: it expands the contents of a single field into multiple rows.

lateral view UDTF(expression) tableAliasName as colAliasName

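Here UDTF(expression) is a table-generating function, that is, a function that turns one input row into multiple output rows, such as explode. You can also write a custom UDTF that turns one row into many, or have a UDF return an Array and then expand it into rows with explode.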

tableAliasName is the alias of the generated table, and colAliasName is the alias of its column (or columns).

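How it works: lateral view UDTF(expression) applies the function to each row of the base table and turns that row into multiple rows, which are placed in a temporary table. That temporary table is then joined back to the base table much like an inner join, with each generated row paired with the base row it was produced from.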
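
The join described above can be pictured with a small Python model. This is only a sketch of the semantics, not how Hive implements it; the sample rows and the helper names `explode` and `lateral_view` are made up for illustration.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, Any]

def explode(values: Iterable[Any]) -> Iterator[Tuple[Any]]:
    """Toy stand-in for explode(array): one single-column row per element."""
    for v in values or []:
        yield (v,)

def lateral_view(base_rows: List[Row],
                 udtf: Callable[[Any], Iterator[tuple]],
                 source_col: str,
                 col_aliases: List[str]) -> Iterator[Row]:
    """Pair every generated row with the base row it came from."""
    for row in base_rows:
        for generated in udtf(row[source_col]):
            # Base columns first, generated columns appended after them.
            yield {**row, **dict(zip(col_aliases, generated))}

# Hypothetical data, not taken from the article's table.
rows = [
    {"sku_id": "1001", "id_array": ["a", "b"]},
    {"sku_id": "1002", "id_array": ["c"]},
    {"sku_id": "1003", "id_array": []},
]

result = list(lateral_view(rows, explode, "id_array", ["id"]))
```

In this model the row with an empty array (`1003`) yields no output rows at all, which is the inner-join behavior mentioned above.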

Table definition:

-- map/array element types are assumed to be string
create table sales_info(
sku_id string comment 'product id',
sku_name string comment 'product name',
state_map map<string,string> comment 'product state info',
id_array array<string> comment 'list of ids related to the product'
)
partitioned by(
dt  string comment 'year-month-day'
)
row format delimited
  fields terminated by '|'
  collection items terminated by ','
  map keys terminated by ':';
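
To make the three delimiters concrete, here is a Python sketch of how one line of the data file would map onto the columns. The sample line and the function `parse_line` are hypothetical; the partition column dt is not stored in the data file, so it is not parsed here.

```python
def parse_line(line: str) -> dict:
    """Split one text line using the delimiters declared in the table definition."""
    # fields terminated by '|'
    sku_id, sku_name, state_map_raw, id_array_raw = line.split("|")
    # collection items terminated by ',' and map keys terminated by ':'
    state_map = dict(item.split(":", 1) for item in state_map_raw.split(","))
    # collection items terminated by ','
    id_array = id_array_raw.split(",")
    return {
        "sku_id": sku_id,
        "sku_name": sku_name,
        "state_map": state_map,
        "id_array": id_array,
    }

# Hypothetical sample line, not taken from the article's data.
sample = "1001|phone|id:1,token:abc,user_name:tom|11,12,13"
parsed = parse_line(sample)
```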

After loading the data from a local file, query it:

------

SELECT explode(id_array) AS new_id FROM sales_info where dt = '2019-04-26'; -- 1 column, 10 rows


------

explode(array):

select sku_id,sku_name from sales_info lateral view explode(id_array) table_alias as id where dt = '2019-04-26'; -- 2 columns, 10 rows; the id column is not appended automatically

select sku_id,sku_name ,id from sales_info lateral view explode(id_array) table_alias as id where dt = '2019-04-26';  -- 3 columns, 10 rows; the id column is appended at the end

select  * from sales_info lateral view explode(id_array) table_alias as id where dt = '2019-04-26';    -- 6 columns, 10 rows; the id column is appended automatically at the end


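lateral view: when the SELECT list names specific columns, the columns produced by the lateral view must be listed there as well, or they will not appear in the result; with select *, they are appended automatically after the base table's columns.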

------

The where clause must come after the lateral view; otherwise the query fails to parse:

select  * from sales_info  where dt = '2019-04-26' lateral view explode(id_array) table_alias as id;    -- error: 2019-04-28 16:48:49,044 FAILED: ParseException line 1:51 missing EOF at 'lateral' near ''2019-04-26''

An array column cannot be split on commas again; split only works on a string:

select  * from sales_info lateral view explode(split(id_array,',')) table_alias as id where dt = '2019-04-26';
-- 2019-04-28 16:26:29,224 FAILED: ClassCastException org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
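
The difference between the two column types can be modeled in a few lines of Python. This is a toy illustration, not Hive code: `hive_split` is a hypothetical stand-in that, like Hive's split, only accepts a string (the real function takes a regex pattern, while this sketch uses a literal separator).

```python
def hive_split(value, sep):
    """Toy stand-in for split(string, pattern); rejects anything but a string."""
    if not isinstance(value, str):
        raise TypeError(f"split expects a string, got {type(value).__name__}")
    return value.split(sep)

# Hypothetical values: the same ids stored as a string and as an array.
id_string = "11,12,13"
id_array = ["11", "12", "13"]

# A string column has to be split before it can be exploded.
from_string = hive_split(id_string, ",")

# An array column is already a list; splitting it again is a type error.
try:
    hive_split(id_array, ",")
    array_error = None
except TypeError as exc:
    array_error = str(exc)
```

So for a column that is already an array, pass it to explode directly and skip the split.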

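explode must be combined with lateral view when other columns are selected; it cannot appear next to ordinary columns in the SELECT list: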

SELECT sku_id,sku_name,explode(id_array) AS new_id FROM sales_info where dt = '2019-04-26'; -- 2019-04-28 15:55:41,642 FAILED: SemanticException [Error 10081]: UDTF's are not supported outside the SELECT clause, nor nested in expressions

------

explode(map)

explode on a map produces only two columns, key and value, so supplying three aliases fails:

select explode(state_map) as (id,token,user_name) from sales_info where dt = '2019-04-26'; -- 2019-04-28 15:57:24,080 FAILED: SemanticException [Error 10083]: The number of aliases supplied in the AS clause does not match the number of columns output by the UDTF expected 2 aliases but got 3

------

select explode(state_map) as (key,value) from sales_info where dt = '2019-04-26'; -- 2 columns, 15 rows
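
The two-column behavior of explode on a map, and the alias-count check, can be sketched in Python as follows. The helpers `explode_map` and `name_columns` and the sample map are assumptions made for illustration, not Hive internals.

```python
def explode_map(m):
    """Toy stand-in for explode(map): one (key, value) row per map entry."""
    for key, value in m.items():
        yield (key, value)

def name_columns(generated_row, aliases):
    """Attach aliases to a generated row; the counts must match."""
    if len(aliases) != len(generated_row):
        raise ValueError(
            f"expected {len(generated_row)} aliases but got {len(aliases)}"
        )
    return dict(zip(aliases, generated_row))

# Hypothetical map value for a single row.
state_map = {"id": "1", "token": "abc", "user_name": "tom"}

# Two aliases match the two generated columns.
rows = [name_columns(r, ["key", "value"]) for r in explode_map(state_map)]
```

Calling `name_columns` with three aliases raises an error, mirroring the "expected 2 aliases but got 3" failure shown above: the number of map entries affects the number of rows, not the number of columns.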


------

select sku_name from sales_info lateral view explode(state_map)  table_alias as key,value  where dt = '2019-04-26'; -- only the sku_name column, 15 rows

select sku_name ,key,value from sales_info lateral view explode(state_map)  table_alias as key,value  where dt = '2019-04-26'; -- 3 columns, 15 rows


------

Reference: https://blog.csdn.net/lyzx_in_csdn/article/details/85628867
