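Group-wise ranking in SQL

1. Group-wise ranking in MySQL

The scores table below records the results of each class's students in one exam.
id is the student number (primary key), class is the class, name is the student's name, and chinese, math and english are the scores for Chinese, math and English:

How do we query the records of the top three students by Chinese score in each class?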
[Image: the scores table]

Here is the SQL first:

SELECT
    a.`id`,
    a.class,
    a.`name`,
    a.chinese,
    count(b.`id`) + 1 as rank_num
from scores a
left join scores b on a.class = b.class
    and a.chinese < b.chinese
group by a.`id`,
    a.class,
    a.`name`,
    a.chinese
HAVING rank_num <=3
order by a.class asc,a.chinese desc

The result is shown below:

[Image: query result]

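As you can see, rank_num ranks the Chinese scores in descending order within each class. A step-by-step explanation follows:
1.1 Left-joining the scores table with itself produces the following result: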

SELECT
    a.*,
    b.*
from scores a
left join scores b on a.class = b.class
    and a.chinese < b.chinese
where a.class = 215    -- for illustration, only one class is selected
order by a.chinese desc,a.name

[Image: result of the self-join for class 215]

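As you can see, 胡贵波, who has the highest Chinese score, is not matched to any row in b because of the a.chinese < b.chinese condition. 马雨韵, in second place, is matched only to first-place 胡贵波, while 丰雪, in third place, is matched to both students ranked above.

Counting b's id at this point therefore yields a rank number: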
SELECT
    a.`id`,
    a.class,
    a.`name`,
    a.chinese,
    count(b.`id`) as rank_num
from scores a
left join scores b on a.class = b.class
    and a.chinese < b.chinese
where a.class = 215    -- for illustration, only one class is selected
group by a.`id`,
    a.class,
    a.`name`,
    a.chinese
order by a.chinese desc,a.name

[Image: per-student count for class 215]
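
To make the counting step concrete, here is a small pure-Python sketch of what the left join plus count(b.id) computes for a single class. The names and scores are made up for illustration, and count_higher is a hypothetical helper, not something from the SQL above.

```python
# Hypothetical rows for one class: (name, chinese score).
students = [("A", 95), ("B", 90), ("C", 88), ("D", 88)]


def count_higher(rows):
    """For each student, count classmates with a strictly higher score.

    This mirrors the join condition a.chinese < b.chinese followed by
    count(b.id): a student nobody beats gets 0, the next one gets 1, etc.
    """
    return {
        name: sum(1 for _, other in rows if other > score)
        for name, score in rows
    }


counts = count_higher(students)
for name, n in counts.items():
    print(name, n)
```

Note that C and D have the same score, so they get the same count.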

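At this point the count already ranks the Chinese scores, but it starts from 0. By convention there is no rank 0, so add 1 to it, i.e. count(b.id) + 1. Because the join condition is the strict comparison a.chinese < b.chinese, students with equal scores share the same rank, so a class can return more than three rows when there are ties.
So the final SQL is: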

SELECT
    a.`id`,
    a.class,
    a.`name`,
    a.chinese,
    count(b.`id`) + 1 as rank_num
from scores a
left join scores b on a.class = b.class
    and a.chinese < b.chinese
group by a.`id`,
    a.class,
    a.`name`,
    a.chinese
HAVING rank_num <=3
order by a.class asc,a.chinese desc
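
If MySQL is not at hand, the final query can be tried out with Python's built-in sqlite3 module. This is only a sketch under a few assumptions: the rows are hypothetical, the backticks are dropped, an extra a.id sort key is added to make the output order deterministic, and SQLite stands in for MySQL (it also accepts the rank_num alias in HAVING). Class 216 deliberately contains tied scores.

```python
import sqlite3

# Hypothetical sample data: (id, class, name, chinese).
rows = [
    (1, 215, "s1", 98), (2, 215, "s2", 95), (3, 215, "s3", 90), (4, 215, "s4", 85),
    (5, 216, "s5", 92), (6, 216, "s6", 92), (7, 216, "s7", 88), (8, 216, "s8", 88),
    (9, 216, "s9", 70),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (id INTEGER PRIMARY KEY, class INTEGER, name TEXT, chinese INTEGER)"
)
conn.executemany("INSERT INTO scores VALUES (?, ?, ?, ?)", rows)

# Same self-join as in the article: rank = number of classmates with a
# strictly higher score, plus one.
query = """
SELECT a.id, a.class, a.name, a.chinese, count(b.id) + 1 AS rank_num
FROM scores a
LEFT JOIN scores b ON a.class = b.class AND a.chinese < b.chinese
GROUP BY a.id, a.class, a.name, a.chinese
HAVING rank_num <= 3
ORDER BY a.class ASC, a.chinese DESC, a.id ASC
"""
top3 = conn.execute(query).fetchall()
for row in top3:
    print(row)
```

With this data class 215 returns three rows, while class 216 returns four: s5 and s6 tie for rank 1, and s7 and s8 tie for rank 3.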

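2. Group-wise ranking in Hive

Table a is shown below: id is the person's id and date is the check-in date. The goal is to fetch each user's first two days of records.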

id date
1 2017-01-01
1 2017-01-02
1 2017-01-03
2 2017-01-01
2 2017-01-02
2 2017-01-03
3 2017-01-01
3 2017-01-02
4 2017-01-01

In Hive, group-wise ranking can be done with row_number() over(partition by column1 order by column2):

select
    id,
    `date`,    -- date is a reserved keyword in newer Hive versions, so it is quoted
    row_number() over(partition by id order by `date`) as rank
from a

The result is:

id date rank
1 2017-01-01 1
1 2017-01-02 2
1 2017-01-03 3
2 2017-01-01 1
2 2017-01-02 2
2 2017-01-03 3
3 2017-01-01 1
3 2017-01-02 2
4 2017-01-01 1

Filtering on rank on top of this gives the desired result.
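
The filtering step can be sketched as follows, again using Python's built-in sqlite3 module rather than Hive. The assumptions: the bundled SQLite is version 3.25 or later (needed for window functions), the data is the table a from above, and the alias is called rn here instead of rank. The ranked query is wrapped in a subquery because the row number cannot be filtered in the same SELECT that computes it.

```python
import sqlite3

# The check-in table a from the article: (id, date).
rows = [
    (1, "2017-01-01"), (1, "2017-01-02"), (1, "2017-01-03"),
    (2, "2017-01-01"), (2, "2017-01-02"), (2, "2017-01-03"),
    (3, "2017-01-01"), (3, "2017-01-02"),
    (4, "2017-01-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER, date TEXT)")
conn.executemany("INSERT INTO a VALUES (?, ?)", rows)

# Number each user's rows by date, then keep only the first two.
query = """
SELECT id, date
FROM (
    SELECT id, date,
           row_number() OVER (PARTITION BY id ORDER BY date) AS rn
    FROM a
) t
WHERE rn <= 2
ORDER BY id, date
"""
first_two_days = conn.execute(query).fetchall()
for row in first_two_days:
    print(row)
```

Users with fewer than two check-ins, such as id 4, simply return the rows they have.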
