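Requirement:
Input (name, constellation, blood type; 白羊座 = Aries, 射手座 = Sagittarius):
孙悟空 白羊座 A
沙悟净 射手座 A
宋松松 白羊座 B
猪八戒 白羊座 A
小凤姐 射手座 A
Expected output:
白羊座,A 孙悟空|猪八戒
白羊座,B 宋松松
射手座,A 沙悟净|小凤姐
Approach: convert rows to columns, so that everyone who shares a constellation and blood type ends up in a single row.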
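Case 1: using Hive's beeline:
Key point: concat(string1, string2, ...) joins its arguments into one string. Each argument can be a column name or a string literal such as ",".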
[root@master hive-1.2.2]# ./bin/beeline
...
0: jdbc:hive2://master:10000> desc t_vehicle_log;
0: jdbc:hive2://master:10000> select concat(vehicle_speed,vehicle_plate) from t_vehicle_log;
--concat()
0: jdbc:hive2://master:10000> select concat(vehicle_speed,",",vehicle_plate) from t_vehicle_log;
The query above adds a "," separator between the two values.
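To try concat() without the t_vehicle_log table, a minimal sketch using literals is shown here. The values are made up for illustration and do not come from the table.

```sql
-- Hypothetical literal values, not taken from t_vehicle_log.
select concat('60', ',', 'ABC123');              -- 60,ABC123

-- concat() returns NULL as soon as any argument is NULL.
select concat('60', ',', cast(null as string));  -- NULL
```

The NULL behaviour matters when a column such as vehicle_plate can be empty, because a single NULL wipes out the whole concatenated value.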
Case 2: concat_ws()
0: jdbc:hive2://master:10000> select concat_ws(",",monitor_id,camera_id,vehicle_plate) from t_vehicle_log;
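As a quick check of how concat_ws() behaves, here is a small sketch with literal values (placeholders, not data from t_vehicle_log). The first argument is the separator; the remaining arguments must be strings or arrays of strings, so a numeric value needs an explicit cast.

```sql
-- Separator first, then the values to join.
select concat_ws(',', 'a', 'b', 'c');             -- a,b,c

-- A numeric value is cast to string before it is passed in.
select concat_ws(',', 'a', cast(1 as string));    -- a,1
```

Whether monitor_id or camera_id need such a cast depends on their column types, which the desc output would show.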
Case 3: collect_set() removes duplicates from a column and returns the values as an array
0: jdbc:hive2://master:10000> select monitor_id from t_vehicle_log;
0: jdbc:hive2://master:10000> select collect_set(monitor_id) from t_vehicle_log;
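To see the deduplication in isolation, the sketch below builds three inline rows (hypothetical values) and compares collect_set() with collect_list(), which keeps duplicates. It assumes a Hive version that allows select without a from clause.

```sql
-- Inline rows 'a', 'a', 'b' stand in for a real column; the values are made up.
select
  collect_set(t.x)  as distinct_values,   -- array holding 'a' and 'b' once each
  collect_list(t.x) as all_values         -- array holding 'a' twice and 'b' once
from (
  select 'a' as x
  union all
  select 'a' as x
  union all
  select 'b' as x
) t;
```

The order of elements inside the returned arrays should not be relied on.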
Create the table:
create table person_info(
name string,
constellation string,
blood_type string)
row format delimited fields terminated by "\t";
-- load the data
load data local inpath '/usr/local/src/test4/hive/person_info.txt' into table person_info;
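If person_info.txt is not at hand, the same rows can be inserted directly. This is a sketch that assumes the file holds exactly the five rows listed in the requirement and that the Hive version supports insert ... values (0.14 or later).

```sql
-- Assumed contents of person_info.txt, copied from the requirement.
insert into table person_info values
  ('孙悟空', '白羊座', 'A'),
  ('沙悟净', '射手座', 'A'),
  ('宋松松', '白羊座', 'B'),
  ('猪八戒', '白羊座', 'A'),
  ('小凤姐', '射手座', 'A');
```

Use either this statement or the load data command, not both, otherwise every row is stored twice.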
Query with concat_ws():
0: jdbc:hive2://master:10000> select concat_ws(",",constellation,blood_type) c_b,name from person_info;
Result:
Use the previous query as a subquery and query it again:
select
t1.c_b,
collect_set(t1.name)
from(
select concat_ws(",",constellation,blood_type) c_b,
name from person_info
) t1
group by t1.c_b;
Final result: collect_set() returns an array, and concat_ws() accepts either strings or arrays of strings.
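The two input forms of concat_ws() can be checked with literals before touching the table. This is a minimal sketch; the values are placeholders.

```sql
-- Plain string arguments.
select concat_ws('|', 'a', 'b', 'c');          -- a|b|c

-- A single array<string> argument gives the same result.
select concat_ws('|', array('a', 'b', 'c'));   -- a|b|c
```

Because collect_set() on a string column returns array<string>, its output can be passed straight to concat_ws().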
Join the array elements into a single string:
select
t1.c_b,
concat_ws("|",collect_set(t1.name))
from(
select concat_ws(",",constellation,blood_type) c_b,
name from person_info
) t1
group by t1.c_b;
Result: one row per constellation and blood type, with the names joined by "|", as in the requirement.
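The same result can also be sketched as a single query without the subquery, by grouping on the two source columns directly. The sort_array() call is an addition, not part of the original solution; it is there only to make the order of names inside each group predictable, since collect_set() does not guarantee one.

```sql
-- Variant without a subquery; assumes the person_info table defined earlier.
select
  concat_ws(',', constellation, blood_type)      as c_b,
  concat_ws('|', sort_array(collect_set(name)))  as names
from person_info
group by constellation, blood_type;
```

Grouping by the raw columns avoids computing c_b twice, and the alias names is just an illustrative column name.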