There are two ways to meet the requirement above.
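1. Using the lag() function
lag() returns a value from an earlier row relative to the current row. Its usage is lag(column, number of rows to look back, default value). For example, lag(score, 1, 0)
takes the score value from the row immediately before the current row and exposes it as a new column; if that row falls outside the window, the result is 0.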
lag(score,1,0) over(order by score desc) lag_score
select
    name,
    subject,
    score,
    lag(score,1,0) over(order by score desc) lag_score
from default.score
where subject = '英语'
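To see what lag() returns without access to default.score, here is a minimal sketch that builds three rows inline and applies the same window expression. The sample_score name and its rows are made up for illustration, and the sketch assumes an engine that accepts a select without a from clause (Hive, Spark SQL, and MySQL 8 do).

```sql
-- Hypothetical sample rows standing in for default.score
with sample_score as (
    select 'A' as name, 95 as score
    union all
    select 'B' as name, 90 as score
    union all
    select 'C' as name, 82 as score
)
select
    name,
    score,
    -- score of the previous row in descending order; 0 when there is no previous row
    lag(score, 1, 0) over (order by score desc) as lag_score
from sample_score
order by score desc
-- expected rows: A 95 0 / B 90 95 / C 82 90
```

The first row gets the default value 0 because nothing precedes it in the window.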
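The first-place student has no previous row, so lag_score is 0 for that row; the if() below turns the resulting negative difference into 0.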
if(lag_score - score > 0, lag_score - score, 0) score_diff
select
    name,
    score,
    lag_score,
    if(lag_score - score > 0, lag_score - score, 0) score_diff
from
(
    select
        name,
        subject,
        score,
        lag(score,1,0) over(order by score desc) lag_score
    from default.score
    where subject = '英语'
) t1
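As a possible alternative to the subquery plus if(), the gap can also be computed in a single select by leaving out the default value of lag(). This is only a sketch under assumptions: sample_score and its rows are hypothetical, and it relies on lag() returning NULL when there is no previous row, which is the standard behavior when no default is given.

```sql
-- Hypothetical sample rows standing in for default.score
with sample_score as (
    select 'A' as name, 95 as score
    union all
    select 'B' as name, 90 as score
    union all
    select 'C' as name, 82 as score
)
select
    name,
    score,
    -- lag() without a default is NULL on the first row, and NULL - score is NULL,
    -- so coalesce turns the first-place gap into 0
    coalesce(lag(score) over (order by score desc) - score, 0) as score_diff
from sample_score
order by score desc
-- expected rows: A 95 0 / B 90 5 / C 82 8
```

Because the window is ordered by score descending, the previous score is never smaller than the current one, so no extra clamp against negative values is needed in this form.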
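2. Using the ranking function row_number()
When scores tie, rank() gives the tied rows the same rank and skips the ranks that follow (for example 1, 2, 2, 4). The join condition a.rk = b.rk + 1 then finds no match for some rows, and those rows drop out of the result.
row_number() is used here instead: rows with equal scores still receive consecutive, distinct numbers.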
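The difference between the two functions is easiest to see side by side. The sketch below uses four made-up rows with one tie; sample_score, the column aliases, and the name tie-breaker in the row_number() ordering are assumptions added only to make the output deterministic.

```sql
-- Hypothetical sample rows with a tie at 90
with sample_score as (
    select 'A' as name, 95 as score
    union all
    select 'B' as name, 90 as score
    union all
    select 'C' as name, 90 as score
    union all
    select 'D' as name, 82 as score
)
select
    name,
    score,
    -- tied rows share a rank and the next rank is skipped
    rank() over (order by score desc) as rk_rank,
    -- every row gets its own consecutive number; name breaks the tie
    row_number() over (order by score desc, name) as rk_row_number
from sample_score
order by score desc, name
-- expected rows: A 95 1 1 / B 90 2 2 / C 90 2 3 / D 82 4 4
```

With rk_rank, no row has rank 3, so D (rank 4) would find no partner in a join on a.rk = b.rk + 1. With rk_row_number every row except the first has exactly one partner.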
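The ranking is stored in a virtual table (a with clause), so the ranking query is written only once even though the join reads it twice.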
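with RankScore as (
    select
        name,
        subject,
        score,
        -- rank() would leave gaps in the numbering after ties, so the join below would miss rows
        row_number() over(partition by subject order by score desc) rk
    from default.score
    where subject = '英语'
)
select
    a.name,
    a.subject,
    a.score,
    -- b is the student ranked one place above a, so b.score - a.score is the non-negative gap
    b.score - a.score as score_diff
from RankScore a
-- pair each student with the one ranked immediately above; the first-place student has no match and is omitted
join RankScore b on a.rk = b.rk + 1
order by a.score desc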
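The inner join above leaves out the first-place student, because no row has rk = 0 to pair with. If the result should match approach 1, where the first-place student appears with a gap of 0, one option is a left join with coalesce. This is a sketch on hypothetical data, not the original query: sample_score, rank_score, and the name tie-breaker are assumptions.

```sql
-- Hypothetical sample rows with a tie at 90
with sample_score as (
    select 'A' as name, 95 as score
    union all
    select 'B' as name, 90 as score
    union all
    select 'C' as name, 90 as score
    union all
    select 'D' as name, 82 as score
),
rank_score as (
    select
        name,
        score,
        -- name is added as a tie-breaker so the numbering is deterministic
        row_number() over (order by score desc, name) as rk
    from sample_score
)
select
    a.name,
    a.score,
    -- b is the student ranked one place above a; NULL for the first row becomes 0
    coalesce(b.score - a.score, 0) as score_diff
from rank_score a
left join rank_score b on a.rk = b.rk + 1
order by a.rk
-- expected rows: A 95 0 / B 90 5 / C 90 0 / D 82 8
```

The left join keeps every row of a, and coalesce supplies 0 where there is no higher-ranked student to compare with.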
Through this exercise, we practiced using ranking functions and virtual tables.