Problem statement:
Suppose we have the following data:
1,huangxiaoming,45,a-c-d-f
2,huangzitao,36,b-c-d-e
3,huanglei,41,c-d-e
4,liushishi,22,a-d-e
5,liudehua,39,e-f-d
6,liuyifei,35,a-d-e
Meaning of the fields:
id,name,age,favors
id, name, age, hobbies
Note that each record's hobbies field holds multiple values, separated by "-".
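The queries in this walkthrough read from a table named hobby whose hobbies column is also called hobby (rather than favors). A minimal sketch of how such a table might be created and loaded is given here. The column types and the file path are assumptions for illustration; they are not stated in the original.

```sql
-- Assumed schema: the column names follow the queries used in the solution steps.
create table hobby (
  id    int,
  name  string,
  age   int,
  hobby string   -- "-"-separated list of hobbies, e.g. a-c-d-f
)
row format delimited fields terminated by ',';

-- Hypothetical path: replace with the actual location of the data file.
load data local inpath '/path/to/hobby.txt' into table hobby;
```

Keeping the hobbies as one string column at load time leaves the splitting to query time.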
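Requirement:
For each hobby, find the two oldest people (hobby, age, name).
Solution steps:
1) Use the explode() function to turn each record into multiple rows, one row per hobby.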
select id,name,age,h from hobby lateral view explode(split(hobby,"-")) hob as h;
+-----+----------------+------+----+
| id | name | age | h |
+-----+----------------+------+----+
| 1 | huangxiaoming | 45 | a |
| 1 | huangxiaoming | 45 | c |
| 1 | huangxiaoming | 45 | d |
| 1 | huangxiaoming | 45 | f |
| 2 | huangzitao | 36 | b |
| 2 | huangzitao | 36 | c |
| 2 | huangzitao | 36 | d |
| 2 | huangzitao | 36 | e |
| 3 | huanglei | 41 | c |
| 3 | huanglei | 41 | d |
| 3 | huanglei | 41 | e |
| 4 | liushishi | 22 | a |
| 4 | liushishi | 22 | d |
| 4 | liushishi | 22 | e |
| 5 | liudehua | 39 | e |
| 5 | liudehua | 39 | f |
| 5 | liudehua | 39 | d |
| 6 | liuyifei | 35 | a |
| 6 | liuyifei | 35 | d |
| 6 | liuyifei | 35 | e |
+-----+----------------+------+----+
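To see what the two functions in step 1 do in isolation, they can be tried on a literal string. This is only an illustrative sketch; it assumes a Hive version that allows select without a from clause.

```sql
-- split() turns the delimited string into an array of strings.
select split('a-c-d-f', '-');

-- explode() emits one row per array element.
select explode(split('a-c-d-f', '-')) as h;
```

lateral view is what joins each exploded row back to the other columns (id, name, age) of the record it came from.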
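2) Use the window function row_number() over(), partitioned by h (the hobby) and ordered by age in descending order, to add a row number to every row.
To make the result easier to inspect, an intermediate table hobby_bak is created here: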
create table hobby_bak as
select tt.id id,tt.name name,tt.age age,tt.h h,
row_number() over(partition by tt.h order by tt.age desc) as rownum
from
(select id,name,age,h from hobby
lateral view explode(split(hobby,"-")) hob as h
) as tt;
The result is as follows:
+---------------+-----------------+----------------+--------------+-------------------+
| hobby_bak.id | hobby_bak.name | hobby_bak.age | hobby_bak.h | hobby_bak.rownum |
+---------------+-----------------+----------------+--------------+-------------------+
| 1 | huangxiaoming | 45 | a | 1 |
| 6 | liuyifei | 35 | a | 2 |
| 4 | liushishi | 22 | a | 3 |
| 2 | huangzitao | 36 | b | 1 |
| 1 | huangxiaoming | 45 | c | 1 |
| 3 | huanglei | 41 | c | 2 |
| 2 | huangzitao | 36 | c | 3 |
| 1 | huangxiaoming | 45 | d | 1 |
| 3 | huanglei | 41 | d | 2 |
| 5 | liudehua | 39 | d | 3 |
| 2 | huangzitao | 36 | d | 4 |
| 6 | liuyifei | 35 | d | 5 |
| 4 | liushishi | 22 | d | 6 |
| 3 | huanglei | 41 | e | 1 |
| 5 | liudehua | 39 | e | 2 |
| 2 | huangzitao | 36 | e | 3 |
| 6 | liuyifei | 35 | e | 4 |
| 4 | liushishi | 22 | e | 5 |
| 1 | huangxiaoming | 45 | f | 1 |
| 5 | liudehua | 39 | f | 2 |
+---------------+-----------------+----------------+--------------+-------------------+
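In this data set no two people sharing a hobby have the same age, so row_number() gives an unambiguous order. If ties were possible, row_number() would order the tied rows arbitrarily and might cut one of them off. The sketch below is an optional variation, not part of the original solution: it computes dense_rank() alongside row_number() so that people of the same age receive the same rank.

```sql
-- Variation (assumption): compare row_number() with dense_rank() when ages can tie.
select tt.id, tt.name, tt.age, tt.h,
       row_number() over (partition by tt.h order by tt.age desc) as rownum,
       dense_rank() over (partition by tt.h order by tt.age desc) as age_rank
from
(select id, name, age, h from hobby
 lateral view explode(split(hobby, '-')) hob as h
) tt;
```

Filtering on age_rank instead of rownum would keep everyone in the two highest age groups, which can return more than two rows per hobby.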
3) Keep the records whose row number is less than 3, i.e. the top two per hobby.
select * from hobby_bak where rownum<3;
+---------------+-----------------+----------------+--------------+-------------------+
| hobby_bak.id | hobby_bak.name | hobby_bak.age | hobby_bak.h | hobby_bak.rownum |
+---------------+-----------------+----------------+--------------+-------------------+
| 1 | huangxiaoming | 45 | a | 1 |
| 6 | liuyifei | 35 | a | 2 |
| 2 | huangzitao | 36 | b | 1 |
| 1 | huangxiaoming | 45 | c | 1 |
| 3 | huanglei | 41 | c | 2 |
| 1 | huangxiaoming | 45 | d | 1 |
| 3 | huanglei | 41 | d | 2 |
| 3 | huanglei | 41 | e | 1 |
| 5 | liudehua | 39 | e | 2 |
| 1 | huangxiaoming | 45 | f | 1 |
| 5 | liudehua | 39 | f | 2 |
+---------------+-----------------+----------------+--------------+-------------------+
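The intermediate table is only there for readability. As a sketch, the three steps can also be combined into a single query that returns just the columns named in the requirement (hobby, age, name); the aliases t and hobby_name are chosen here for illustration.

```sql
select t.h as hobby_name, t.age, t.name
from (
  -- Steps 1 and 2: explode the hobbies, then number rows within each hobby by age.
  select name, age, h,
         row_number() over (partition by h order by age desc) as rownum
  from hobby
  lateral view explode(split(hobby, '-')) hob as h
) t
-- Step 3: keep the two oldest people per hobby.
where t.rownum <= 2;
```

The filter has to sit in an outer query because a window function's result cannot be referenced in the where clause of the same select.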