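Hive Notes 3: a SQL statement for the top three in each class, with tied scores sharing a rank, plus the score gap between consecutive rows in rank order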

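The problem, restated:

Write a SQL statement that returns the top three ranks in each class, with equal scores sharing the same rank, and that also computes the score difference between consecutive rows of the top three in rank order.
Data (columns: sno, class, score):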
1,1901,90
2,1901,90
3,1901,83
4,1901,60
5,1902,66
6,1902,23
7,1902,99
8,1902,67
9,1902,87

Required result:

class   score   rank    lagscore
1901    90      1       0
1901    90      1       0
1901    83      2       -7
1901    60      3       -23
1902    99      1       0
1902    87      2       -12
1902    67      3       -20

Table structure:

create table stu (sno int, class string, score int) row format delimited fields terminated by ',';

Load the data:

load data local inpath '/home/hadoop/data/stu.dat' into table stu;

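Approach:
 1. Apply a ranking function. Equal scores must share a rank and the next rank must not be skipped, so use dense_rank.
 2. Shift the previous row's score down onto the current row with lag, then subtract to get the score difference.
 3. Handle the NULL that lag returns for the first row of each class.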

1) Ranking:

select class, score, dense_rank() over (partition by class order by score desc) as rank from stu;

class   score   rank
1901    90      1
1901    90      1
1901    83      2
1901    60      3
1902    99      1
1902    87      2
1902    67      3
1902    66      4
1902    23      5
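
To see why dense_rank is the right choice here, it can help to put Hive's three ranking functions side by side. The query below is only a sketch against the same stu table; the aliases rn, rk and drk are illustrative names that do not appear in the original notes.

```sql
-- row_number(): always unique within the partition (1, 2, 3, 4, ...)
-- rank():       ties share a rank and the following rank is skipped
-- dense_rank(): ties share a rank and the following rank is not skipped
select class, score,
       row_number() over (partition by class order by score desc) as rn,
       rank()       over (partition by class order by score desc) as rk,
       dense_rank() over (partition by class order by score desc) as drk
from stu;
```

For class 1901 (scores 90, 90, 83, 60) this gives rn = 1, 2, 3, 4, rk = 1, 1, 3, 4 and drk = 1, 1, 2, 3. Only dense_rank keeps the score 60 at rank 3, which is what the required result expects.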

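2) Shift the previous row's score down, subtract to get the score difference, and handle NULL:

-- lag(score) is NULL on the first row of each class, so the subtraction is NULL there;
-- nvl replaces that NULL with 0.
select class, score, rank,
       nvl(score - lag(score) over (partition by class order by score desc), 0)
from (
  select class, score, dense_rank() over (partition by class order by score desc) as rank
  from stu
) tmp
where rank <= 3;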

class   score   rank    _c3
1901    90      1       0
1901    90      1       0
1901    83      2       -7
1901    60      3       -23
1902    99      1       0
1902    87      2       -12
1902    67      3       -20
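
As a possible variant, the NULL handling can be moved into lag itself through its optional offset and default arguments, and the computed column can be given the name used in the required result. This is a sketch under the assumption that score is never NULL; the alias lagscore comes from the required result header, not from the original query.

```sql
-- lag(score, 1, score): the previous row's score, or the current row's own score
-- on the first row of a class, so the difference there is 0 instead of NULL.
select class, score, rank,
       score - lag(score, 1, score) over (partition by class order by score desc) as lagscore
from (
  select class, score, dense_rank() over (partition by class order by score desc) as rank
  from stu
) tmp
where rank <= 3;
```

On the sample data both forms return the same rows. They can differ when score itself is NULL: the nvl form turns that difference into 0, while this form leaves it NULL.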

Function notes

lag(score) 

LAG (scalar_expression [, offset] [, default]) OVER ([query_partition_clause] order_by_clause);
The LAG function is used to access data from a previous row. When omitted, offset defaults to 1 and default to NULL.
Example:
 select p1.p_mfgr, p1.p_name, p1.p_size,
 p1.p_size - lag(p1.p_size,1,p1.p_size) over( distribute by p1.p_mfgr sort by p1.p_name) as deltaSz
 from part p1 join part p2 on p1.p_partkey = p2.p_partkey

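In short:
lag returns the value from the row before the current row, within the partition and in window order.
lead returns the value from the row after the current row.
Example: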

select class, score, rank,
       lag(score) over (partition by class order by score desc),
       lead(score) over (partition by class order by score desc)
from (
  select class, score, dense_rank() over (partition by class order by score desc) as rank
  from stu
) tmp
where rank <= 3;

class   score   rank    lag_window_0    lead_window_1
1901    90      1       NULL            90
1901    90      1       90              83
1901    83      2       90              60
1901    60      3       83              NULL
1902    99      1       NULL            87
1902    87      2       99              67
1902    67      3       87              NULL
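
The same pattern works in the other direction. The sketch below uses lead to show how far each row is ahead of the next row in its class; the alias lead_gap is a hypothetical name chosen for illustration, and score is again assumed to be non-NULL.

```sql
-- lead(score, 1, score): the next row's score, or the current row's own score
-- on the last row of a class, so the gap there is 0 instead of NULL.
select class, score, rank,
       score - lead(score, 1, score) over (partition by class order by score desc) as lead_gap
from (
  select class, score, dense_rank() over (partition by class order by score desc) as rank
  from stu
) tmp
where rank <= 3;
```

For class 1901 (scores 90, 90, 83, 60) the lead_gap column comes out as 0, 7, 23, 0.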


 

 
