Hive Case Study 01: Row/Column Transformation

This article shows how to convert between rows and columns in Hive queries.

1. Case 1: find the IDs of students whose math score is higher than their Chinese score

(1) Requirements analysis

The Hive table score contains the following rows:

hive> select * from score;
1   1   yuwen   43
2   1   shuxue  55
3   2   yuwen   77
4   2   shuxue  88
5   3   yuwen   98
6   3   shuxue  65

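Column meanings:
id (int): ID of this record
sid (int): student ID
subject (string): subject, where yuwen is Chinese and shuxue is math
score (int): score
Requirement:
Find the IDs of students whose math score is higher than their Chinese score.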
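
The article does not show how the table was created. To reproduce the example, a minimal setup sketch could look like the following. The DDL is an assumption derived from the column list above, and INSERT ... VALUES assumes Hive 0.14 or later.

```sql
-- Hypothetical setup: the original article only shows the table contents.
CREATE TABLE score (
    id      INT,     -- ID of this record
    sid     INT,     -- student ID
    subject STRING,  -- 'yuwen' (Chinese) or 'shuxue' (math)
    score   INT      -- score
);

-- Load the six sample rows shown above (assumes Hive 0.14+ for INSERT ... VALUES).
INSERT INTO TABLE score VALUES
    (1, 1, 'yuwen',  43),
    (2, 1, 'shuxue', 55),
    (3, 2, 'yuwen',  77),
    (4, 2, 'shuxue', 88),
    (5, 3, 'yuwen',  98),
    (6, 3, 'shuxue', 65);
```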

(2) Method 1 (join)

SELECT s1.sid FROM score s1 INNER JOIN score s2
ON s1.sid = s2.sid 
AND s1.score > s2.score
AND s1.subject = 'shuxue'
AND s2.subject = 'yuwen';

# Result
1
2

(3) Method 2 (row-to-column transformation)

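Approach: (1) spread each subject's score into its own column, (2) collapse the rows to one per student with max(), which ignores NULL values, and (3) compare the two columns.
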
--(1)
CREATE TABLE t1 AS
SELECT sid, 
CASE subject WHEN 'yuwen' THEN score END AS yuwen, 
CASE subject WHEN 'shuxue' THEN score END AS shuxue 
FROM score;

Data in t1:
1   43      NULL
1   NULL    55
2   77      NULL
2   NULL    88
3   98      NULL
3   NULL    65

--(2)
CREATE TABLE t2 AS
SELECT sid, max(yuwen) yuwen, max(shuxue) shuxue 
FROM t1
GROUP BY sid;

Data in t2:
1   43  55
2   77  88
3   98  65

--(3)
SELECT sid FROM t2 WHERE shuxue > yuwen;

Result:
1
2
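
The three steps above can also be folded into a single statement, which avoids creating the intermediate tables t1 and t2. This is a sketch of an alternative, not the article's method: it applies the same CASE and max() logic inside a HAVING clause.

```sql
-- Conditional aggregation in one pass over score.
-- Each max() sees only the rows for one subject, because CASE returns
-- NULL for the other subject and max() skips NULL values.
SELECT sid
FROM score
GROUP BY sid
HAVING max(CASE subject WHEN 'shuxue' THEN score END)
     > max(CASE subject WHEN 'yuwen'  THEN score END);
```

A student who is missing either subject produces a NULL comparison and is therefore filtered out, which matches the behavior of the join in Method 1.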

2. Case 2: pivoting a sales table from rows to columns

(1) Requirements

The Hive table sales contains the following rows:

hive> select * from sales;
sales.y sales.season    sales.sale
1991        1               11
1991        2               12
1991        3               13
1991        4               14
1992        1               21
1992        2               22
1992        3               23
1992        4               24

Column meanings:
y: year
season: quarter
sale: sales volume
Requirement: show the sales volume of every quarter of each year on a single row.

(2) Implementation

SELECT
    tmp.y,
    max(tmp.season1) season1,
    max(tmp.season2) season2,
    max(tmp.season3) season3,
    max(tmp.season4) season4
FROM (SELECT y,
        CASE season WHEN 1 THEN sale END AS season1,
        CASE season WHEN 2 THEN sale END AS season2,
        CASE season WHEN 3 THEN sale END AS season3,
        CASE season WHEN 4 THEN sale END AS season4
        FROM sales) tmp
GROUP BY tmp.y;

Result:
tmp.y   season1 season2 season3 season4
1991    11      12      13      14
1992    21      22      23      24
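
The subquery in the implementation above is not strictly required. As a sketch of an equivalent form, the CASE expressions can be placed directly inside the aggregates, so that the pivot happens in one SELECT.

```sql
-- Same pivot without the tmp subquery: one row per year,
-- one column per quarter.
SELECT
    y,
    max(CASE WHEN season = 1 THEN sale END) AS season1,
    max(CASE WHEN season = 2 THEN sale END) AS season2,
    max(CASE WHEN season = 3 THEN sale END) AS season3,
    max(CASE WHEN season = 4 THEN sale END) AS season4
FROM sales
GROUP BY y;
```

Because the sample data has exactly one row per year and quarter, max() only serves to discard the NULL values. If a quarter could have several rows, sum() might be the more appropriate aggregate.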

3. Case 3: converting columns to rows in a student score table

(1) Requirements

Given the following student score table, also named score (its schema differs from the table in Case 1):

id  sname   math    computer    english
1   Jed     34      58          58
2   Tony    45      87          45
3   Tom     76      34          89

Write a SQL statement that converts the table above into the following table:

id  sname   course      score
1   Jed     computer    58
1   Jed     english     58
1   Jed     math        34
2   Tony    computer    87
2   Tony    english     45
2   Tony    math        45
3   Tom     computer    34
3   Tom     english     89
3   Tom     math        76

(2) Implementation

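-- One SELECT per source column; the course literal differs in every branch,
-- so no duplicates can arise and UNION ALL avoids the deduplication step of UNION.
SELECT id, sname, 'math' AS course, math AS score FROM score
UNION ALL
SELECT id, sname, 'computer' AS course, computer AS score FROM score
UNION ALL
SELECT id, sname, 'english' AS course, english AS score FROM score
ORDER BY id, sname, course;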
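
As an alternative sketch (not the article's method), the same columns-to-rows conversion can be written with LATERAL VIEW and explode() over a map, which reads the table once instead of three times. The aliases s and t are names chosen here for illustration, and the approach assumes that math, computer, and english all have the same type, because the values of a Hive map must share one type.

```sql
-- Build a map of course name -> score for each row, then explode it
-- into one output row per map entry.
SELECT s.id, s.sname, t.course, t.score
FROM score s
LATERAL VIEW explode(
    map('math', s.math, 'computer', s.computer, 'english', s.english)
) t AS course, score
ORDER BY id, sname, course;
```

With more subject columns, this form grows by one map entry per column rather than by one full SELECT branch.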
