Learning Hive: Row/Column Transformations

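However beautiful a wise man's dreams may be, they are worth less than the footprints of a fool who actually does the work. Offered as a reference for anyone learning HiveQL!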

In day-to-day work and study, you often need to reshape a query result, turning rows into columns or columns into rows.

Rows to Columns (Pivot)

The table data is shown in the figure; the table is named customer_details.
[Figure 1: sample rows of customer_details]
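Because the figure is not reproduced here, the following is a minimal sketch for building a stand-in table to try the queries against. The database name pivot_demo, the customer_id column and every row are made up for illustration; only the table name and the country and gender columns come from the article, and the real table probably has more columns.

```sql
-- Work in a scratch database so an existing customer_details table is not touched.
-- pivot_demo is a hypothetical name.
CREATE DATABASE IF NOT EXISTS pivot_demo;
USE pivot_demo;

-- Assumed minimal schema: only country and gender are used by the queries in this article.
CREATE TABLE IF NOT EXISTS customer_details (
  customer_id INT,     -- assumed identifier column
  country     STRING,
  gender      STRING   -- 'Female' or 'Male'
);

-- Made-up sample rows (INSERT ... VALUES requires Hive 0.14 or later).
INSERT INTO TABLE customer_details VALUES
  (1, 'China',  'Female'),
  (2, 'China',  'Male'),
  (3, 'China',  'Male'),
  (4, 'France', 'Female'),
  (5, 'France', 'Female'),
  (6, 'France', 'Male');
```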
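Goal: for each country, return the number of women and the number of men, as shown in the figure.
[Figure 2: target result, one row per country with the female and male counts]
The following statement gets the right numbers easily, but its layout (one row per country and gender) still has to be converted into the target format.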

select country,gender,count(*) as people from customer_details group by country,gender;

[Figure 3: result of the group by query, one row per country and gender]

Method 1 (easy to understand)

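Count the women in each country as temporary table 1 and the men in each country as temporary table 2, then inner join the two on the country field to place the counts side by side.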

Temporary table 1

select country,count(*) as female from customer_details where gender='Female' group by country

[Figure 4: result of temporary table 1]
Temporary table 2

select country,count(*) as male from customer_details where gender='Male' group by country

[Figure 5: result of temporary table 2]
Join

select a.country,female,male from
(select country,count(*) as female from customer_details where gender='Female' group by country)a 
join 
(select country,count(*) as male from customer_details where gender='Male' group by country)b 
on
a.country=b.country
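One caveat with the inner join: a country that has customers of only one gender appears in only one of the two subqueries, so it is dropped from the result. A possible variant, sketched here as an alternative rather than as the author's method, uses full outer join with coalesce to keep such countries and report 0 for the missing gender.

```sql
-- Variant of Method 1 that keeps countries missing one of the two genders.
SELECT
  COALESCE(a.country, b.country) AS country,  -- take the country from whichever side has it
  COALESCE(a.female, 0)          AS female,   -- no matching row on side a means 0 women
  COALESCE(b.male, 0)            AS male      -- no matching row on side b means 0 men
FROM
  (SELECT country, count(*) AS female FROM customer_details WHERE gender = 'Female' GROUP BY country) a
FULL OUTER JOIN
  (SELECT country, count(*) AS male FROM customer_details WHERE gender = 'Male' GROUP BY country) b
ON a.country = b.country;
```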

Method 2 (more efficient)

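Use the sum and if functions to count the men and women within each group. This needs only one aggregation over the table and no join.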

select country,
sum(if(gender='Male',1,0)) as male,
sum(if(gender='Female',1,0)) as female
from customer_details
group by country

[Figure 6: result of the sum/if query]
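The same conditional aggregation can also be written with count and case when, which some readers may find more portable across SQL engines. This is an equivalent sketch, not a different technique: count skips NULL, and a case expression without an else branch yields NULL for non-matching rows.

```sql
-- Conditional aggregation with count + case instead of sum + if.
SELECT
  country,
  count(CASE WHEN gender = 'Male'   THEN 1 END) AS male,    -- non-matching rows give NULL and are not counted
  count(CASE WHEN gender = 'Female' THEN 1 END) AS female
FROM customer_details
GROUP BY country;
```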

Method 3 (less common)

Filter by gender and group by country, then apply size to the array returned by collect_list to get the count for each gender.

select a.country,female,male from
(select country,size(collect_list(gender)) as female from customer_details where gender='Female' group by country)a 
join 
(select country,size(collect_list(gender)) as male from customer_details where gender='Male' group by country)b 
on
a.country=b.country

To make the columns-to-rows demonstration easier, treat the result of the query above as a temporary view named tmpview.
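The article does not show how tmpview was created. One way to do it in Hive, sketched here using the Method 2 query as the view body, is an ordinary view. Note that this view is persistent, not temporary, so it should be dropped afterwards; in Spark SQL the counterpart would be CREATE OR REPLACE TEMPORARY VIEW.

```sql
-- Assumed definition of tmpview: one row per country with female and male counts.
CREATE VIEW IF NOT EXISTS tmpview AS
SELECT
  country,
  sum(if(gender = 'Female', 1, 0)) AS female,
  sum(if(gender = 'Male', 1, 0))   AS male
FROM customer_details
GROUP BY country;

-- Clean up when finished:
-- DROP VIEW IF EXISTS tmpview;
```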

Columns to Rows (Unpivot)

This relies mainly on union / union all to stack the result sets on top of each other.

select country,'Female' as gender, female as people from tmpview 
union all
select country,'Male' as gender,male as people from tmpview

[Figure 7: result of the union all query]
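As an alternative to union all, the same reshaping can be sketched with lateral view and explode over a map. It references tmpview once and scales to more columns by adding key/value pairs to the map. It assumes tmpview has the columns country, female and male, as in the queries above; the alias t is arbitrary.

```sql
-- Unpivot with lateral view: each map entry becomes one output row.
SELECT
  country,
  t.gender,   -- map key: 'Female' or 'Male'
  t.people    -- map value: the corresponding count
FROM tmpview
LATERAL VIEW explode(map('Female', female, 'Male', male)) t AS gender, people;
```

The row order may differ from the union all version; add an order by if the order matters.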
