Similar to a cursor in SQL Server, lateral view expands the contents of a field into multiple rows.
lateral view UDTF(expression) tableAliasName as colAliasName
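Here UDTF(expression) is a table-generating function, that is, a function that turns one input row into multiple output rows, such as explode. A custom UDTF can do the same, or a custom UDF can return an Array that explode then expands into rows.
tableAliasName is the alias of the generated table, and colAliasName is the alias of its column(s).
How it works: lateral view applies UDTF(expression) to each row of the base table, which turns that row into multiple rows held in a temporary (virtual) table. Those rows are then joined back to the base row they were generated from, in the manner of an inner join, so the join condition is simply the relationship to the originating row.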
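The mechanism described above can be modelled outside Hive. The Python sketch below only illustrates the semantics; it is not how Hive implements lateral view. The helper name `lateral_view_explode` and the sample rows are hypothetical and are not taken from the table used in this post.

```python
from typing import Any, Dict, Iterator, List


def lateral_view_explode(
    rows: List[Dict[str, Any]], array_col: str, col_alias: str
) -> Iterator[Dict[str, Any]]:
    """Model `lateral view explode(array_col) t as col_alias`.

    For every base row, explode() yields one generated row per array
    element, and each generated row is joined back to the base row it
    came from. A base row whose array is empty or missing yields
    nothing, so it drops out of the result.
    """
    for row in rows:
        for element in row.get(array_col) or []:
            joined = dict(row)           # columns of the base row
            joined[col_alias] = element  # plus the generated column
            yield joined


# Hypothetical sample data (assumption, for illustration only).
base_rows = [
    {"sku_id": "1001", "sku_name": "phone", "id_array": ["a1", "a2", "a3"]},
    {"sku_id": "1002", "sku_name": "case", "id_array": ["b1"]},
    {"sku_id": "1003", "sku_name": "cable", "id_array": []},
]

result = list(lateral_view_explode(base_rows, "id_array", "id"))
for r in result:
    print(r["sku_id"], r["sku_name"], r["id"])
```

In this model the row with sku_id 1003 disappears because its array is empty. In Hive, writing `lateral view outer` instead keeps such rows, with NULL in the generated column.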
Table DDL:
create table sales_info_new(
sku_id string comment 'product id',
sku_name string comment 'product name',
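-- the element types below are assumed to be string; the source shows only `map` and `array`
state_map map<string,string> comment 'product state information',
id_array array<string> comment 'list of ids related to the product'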
)
partitioned by(
dt string comment 'year-month-day'
)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':';
After loading the data from a local file, query it:
------
SELECT explode(id_array) AS new_id FROM sales_info where dt = '2019-04-26'; -- 1 column, 10 rows
------
explode(array):
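select sku_id,sku_name from sales_info lateral view explode(id_array) table_alias as id where dt = '2019-04-26'; -- 2 columns, 10 rows; the id column is not appended automatically
select sku_id,sku_name ,id from sales_info lateral view explode(id_array) table_alias as id where dt = '2019-04-26'; -- 3 columns, 10 rows; id appears because it is listed in the select
select * from sales_info lateral view explode(id_array) table_alias as id where dt = '2019-04-26'; -- 6 columns, 10 rows; the id column is appended automatically
lateral view: when the select list names columns explicitly, the columns produced by lateral view must also be listed there to appear in the result; with select *, they are appended after the base table's columns automatically.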
------
The where clause must come after lateral view; otherwise the query fails to parse:
select * from sales_info where dt = '2019-04-26' lateral view explode(id_array) table_alias as id; -- error: 2019-04-28 16:48:49,044 FAILED: ParseException line 1:51 missing EOF at 'lateral' near ''2019-04-26''
An array column cannot be split on commas again; split only works on a string:
select * from sales_info lateral view explode(split(id_array,',')) table_alias as id where dt = '2019-04-26';
-- 2019-04-28 16:26:29,224 FAILED: ClassCastException org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
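The type mismatch behind that error can be illustrated with a rough Python model. This is a sketch under assumptions: `hive_like_split` is a hypothetical helper that only mimics the rule that split takes a string and a pattern, and the sample values are made up.

```python
import re
from typing import List


def hive_like_split(value: str, pattern: str) -> List[str]:
    """Rough model of split(str, pattern): only a string can be split."""
    if not isinstance(value, str):
        raise TypeError("split() expects a string, got " + type(value).__name__)
    return re.split(pattern, value)


id_string = "a1,a2,a3"         # a string column: split first, then explode
id_array = ["a1", "a2", "a3"]  # an array column: explode it directly

print(hive_like_split(id_string, ","))

try:
    hive_like_split(id_array, ",")
except TypeError as err:
    print("rejected:", err)
```

So explode(split(col, ',')) fits a comma-separated string column, while a column that is already an array goes straight into explode.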
To select other columns together with explode, use lateral view; explode cannot appear in the select list alongside other columns:
SELECT sku_id,sku_name,explode(id_array) AS new_id FROM sales_info where dt = '2019-04-26'; -- 2019-04-28 15:55:41,642 FAILED: SemanticException [Error 10081]: UDTF's are not supported outside the SELECT clause, nor nested in expressions
------
explode(map):
A map explodes into exactly two columns, key and value, so supplying three aliases is an error:
select explode(state_map) as (id,token,user_name) from sales_info where dt = '2019-04-26'; -- 2019-04-28 15:57:24,080 FAILED: SemanticException [Error 10083]: The number of aliases supplied in the AS clause does not match the number of columns output by the UDTF expected 2 aliases but got 3
------
select explode(state_map) as (key,value) from sales_info where dt = '2019-04-26'; -- 2 columns, 15 rows
------
select sku_name from sales_info lateral view explode(state_map) table_alias as key,value where dt = '2019-04-26'; -- only the sku_name column, 15 rows
select sku_name ,key,value from sales_info lateral view explode(state_map) table_alias as key,value where dt = '2019-04-26'; -- 3 columns, 15 rows
------
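The map case can be modelled the same way. The Python sketch below is an illustration only: `lateral_view_explode_map` and the sample rows are hypothetical. It shows why a map always produces exactly two generated columns, and why the row count equals the total number of map entries.

```python
from typing import Any, Dict, Iterator, List


def lateral_view_explode_map(
    rows: List[Dict[str, Any]], map_col: str, key_alias: str, value_alias: str
) -> Iterator[Dict[str, Any]]:
    """Model `lateral view explode(map_col) t as key_alias, value_alias`."""
    for row in rows:
        for k, v in (row.get(map_col) or {}).items():
            joined = dict(row)       # columns of the base row
            joined[key_alias] = k    # first generated column
            joined[value_alias] = v  # second generated column
            yield joined


# Hypothetical sample data (assumption, for illustration only).
base_rows = [
    {"sku_name": "phone", "state_map": {"stock": "in", "sale": "on"}},
    {"sku_name": "case", "state_map": {"stock": "out"}},
]

exploded = list(lateral_view_explode_map(base_rows, "state_map", "key", "value"))
for r in exploded:
    print(r["sku_name"], r["key"], r["value"])
```

Two entries plus one entry gives three output rows here, in the same way that the 15 rows above are the total number of map entries in the partition.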
Reference: https://blog.csdn.net/lyzx_in_csdn/article/details/85628867