Extracting array-like data in Hive

Sample data:

user apps
X123 ["123,微信,abc","234,QQ,bcd"]

The goal is to extract "微信" and "QQ" from the apps field. The target data looks like this:

user apps
X123 QQ,微信

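Approach: split the apps field on "," (quote, comma, quote) so that each array element becomes its own row, then split each element on the comma and take the second part: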

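select
    -- user is a reserved keyword in newer Hive versions, hence the backticks
    t.`user`
    -- array indexes start at 0, so the second part is [1];
    -- collect_set removes duplicates and does not guarantee output order
    ,concat_ws(',', collect_set(split(v.app, ',')[1])) AS applist
from (
    select 'X123' as `user`, '["123,微信,abc","234,QQ,bcd"]' as apps
) t
-- the first and last elements keep their leading [" and trailing "],
-- which is harmless here because only the middle field is used
LATERAL VIEW explode(split(t.apps, '","')) v AS app
group by t.`user`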
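
The query above works because only the middle field of each element is read. If the first or last field is needed as well, the leftover [" and "] have to be removed first. Below is a minimal sketch of one way to do that, assuming the apps value always starts with [" and ends with "]. The aliases t and v and the column idlist are names chosen here for illustration only.

```sql
-- Sketch: strip the leading [" and trailing "] before splitting,
-- so that every field of every element is clean.
SELECT
    t.`user`
    ,concat_ws(',', collect_set(split(v.app, ',')[0])) AS idlist   -- first field of each element
    ,concat_ws(',', collect_set(split(v.app, ',')[1])) AS applist  -- second field of each element
FROM (
    SELECT 'X123' AS `user`, '["123,微信,abc","234,QQ,bcd"]' AS apps
) t
LATERAL VIEW explode(
    -- regexp_replace removes [" at the start and "] at the end;
    -- split then cuts the remaining string on the "," separator
    split(regexp_replace(t.apps, '^\\["|"\\]$', ''), '","')
) v AS app
GROUP BY t.`user`;
```

For the sample row, idlist should contain 123 and 234 and applist should contain 微信 and QQ. As with the original query, collect_set does not guarantee the order of the values.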

 
