hive中的concat,concat_ws,collect_set用法

需求:对用户的信息进行分析,相同用户的地址信息按照不同类型分别展示出来,每个用户一行。

table1:

user_id location location_type
123 w2ny6s home
123 w2ny6k work

目标:

user_id home_location work_location
123 w2ny6s w2ny6k

使用concat_ws()和collect_set()进行合并行,将上面列表中一个user可能会占用多行转换为每个user占一行的目标表格式,实际是“列转行”

    select user_id,
           concat_ws('',collect_set(home_location)) as home_location,
           concat_ws('',collect_set(work_location)) as work_location
      from (
            select user_id,
                   if(location_type='home', location, "") as home_location,
                   if(location_type='work', location, "") as work_location
              from table1
           ) a

collect_set的作用:collect_set(col)函数只接受基本数据类型,它的主要作用是将某字段的值进行去重汇总,产生array类型字段。

concat_ws的作用:表示concat with separator,即有分隔符的字符串连接,concat_ws(”,collect_set(home_location))表示用空的字符”来连接collect_set返回的array中的每个元素。

concat:可以连接一个或者多个字符串,select concat(‘11’,’22’,’33’);//112233

你可能感兴趣的:(Hive)