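A window defines a range: think of it as the set of rows that satisfy some condition. A window function is a function evaluated over the rows in that window.
Use case: computing a value for every row within a group. An ordinary aggregate with group by returns only one row per group.
Note: MySQL supports window functions only from version 8.0 onward.
What window functions do:
They resemble grouped aggregation, except that grouping collapses each group into a single row, whereas a window function keeps every row of the group in the result.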
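Window types: static windows and sliding (dynamic) windows.
Static window: the window size is fixed and does not vary from row to row.
Sliding window: different rows have different windows.
Syntax:
<function name> over (partition by <partition columns>
order by <sort columns>
rows between <start row> and <end row>)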
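Differences between partition by and group by:
1) partition by does not reduce the number of rows; group by does.
2) With group by, the select list can contain only the grouping columns and aggregate expressions.
group by changes both the row count and the column set of the original result.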
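The difference is easiest to see on a tiny data set. The following is a minimal sketch for MySQL 8.0; the temporary table demo_scores and its rows are made-up sample data, with columns assumed to mirror the sname, cid and score columns used elsewhere in these notes.

```sql
-- Hypothetical sample data (table name and rows are assumptions).
create temporary table demo_scores (sname varchar(20), cid int, score int);
insert into demo_scores values ('Ann', 1, 90), ('Bob', 1, 70), ('Eve', 2, 80);

-- group by: collapses to one row per cid (2 rows); sname is no longer available.
select cid, sum(score) as class_total
from demo_scores
group by cid;
-- cid 1 -> 160, cid 2 -> 80

-- partition by: keeps all 3 rows and attaches the class total to each one.
select sname, cid, score,
       sum(score) over (partition by cid) as class_total
from demo_scores;
-- Ann 1 90 160 / Bob 1 70 160 / Eve 2 80 80 (row order is not guaranteed)
```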
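To skip sorting, write order by null or simply omit the order by clause.
asc sorts ascending and is the default when nothing is written; desc sorts descending.
To sort by several columns, list them, e.g. order by cid, sname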
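Sliding window (frame) clause
There are two frame units: ROWS and RANGE.
ROWS counts physical rows; RANGE works on order by values, so rows that tie with the current row (its peers) enter the frame together.
When the frame clause is omitted, the unit defaults to RANGE: with order by, the frame runs from the start of the partition through the current row and its peers; without order by, it covers the whole partition.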
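select *,
-- rows: only the current physical row
sum(score) over (partition by cid order by score rows current row) as rows_frame,
-- range: the current row plus every row with the same score
sum(score) over (partition by cid order by score range current row) as range_frame,
-- range without order by: all rows in the partition are peers, so this is the partition total
sum(score) over (partition by cid range current row) as range_no_order
from SQL_5;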
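Frame clause terms
between <start> and <end>: specifies the frame range.
N preceding: the row N rows before the current row.
N following: the row N rows after the current row.
current row: the current row.
unbounded preceding: every row before the current row, i.e. the frame starts at the first row of the partition.
unbounded following: every row after the current row, i.e. the frame ends at the last row of the partition.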
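Examples:
rows between unbounded preceding and current row: from the first row of the partition to the current row
rows between current row and unbounded following: from the current row to the last row of the partition
rows between 2 preceding and current row: from two rows back to the current row
rows between current row and 1 following: from the current row to the next row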
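To make these frames concrete, here is a small sketch for MySQL 8.0 that applies four of them to the same rows. The CTE scores and its four rows are hypothetical sample data; the scores are distinct so that the results are deterministic.

```sql
-- Hypothetical sample data: four rows with distinct scores.
with scores (sname, score) as (
    select 'Ann', 60 union all
    select 'Bob', 70 union all
    select 'Eve', 80 union all
    select 'Dan', 90
)
select sname, score,
       sum(score) over (order by score rows between unbounded preceding and current row) as running_sum,
       sum(score) over (order by score rows between 2 preceding and current row)         as last_three,
       sum(score) over (order by score rows between current row and 1 following)         as this_and_next,
       sum(score) over (order by score rows between current row and unbounded following) as remaining
from scores
order by score;
-- score  running_sum  last_three  this_and_next  remaining
--   60        60          60          130           300
--   70       130         130          150           240
--   80       210         210          170           170
--   90       300         240           90            90
```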
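Notes:
With order by but no frame clause, the default is range between unbounded preceding and current row. It is range-based, not rows-based, so rows that tie with the current row are included.
With neither order by nor a frame clause, the default is the whole partition, equivalent to rows between unbounded preceding and unbounded following.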
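The effect of the implicit frame shows up as soon as two rows tie on the order by value. The sketch below, for MySQL 8.0, compares the implicit frame with its explicit spellings; the CTE scores and its rows are made-up sample data in which Bob and Eve share a score.

```sql
-- Hypothetical sample data: Bob and Eve tie on score.
with scores (sname, score) as (
    select 'Ann', 70 union all
    select 'Bob', 80 union all
    select 'Eve', 80 union all
    select 'Dan', 90
)
select sname, score,
       sum(score) over (order by score) as default_frame,
       sum(score) over (order by score range between unbounded preceding and current row) as explicit_range,
       sum(score) over (order by score rows between unbounded preceding and current row)  as explicit_rows,
       sum(score) over () as no_order_no_frame
from scores;
-- default_frame and explicit_range are identical: 70, 230, 230, 320
--   (both tied rows see each other, so both get 70 + 80 + 80 = 230).
-- explicit_rows gives 70, then 150 and 230 for the tied rows (which one gets
--   which is not defined, since their relative order is arbitrary), then 320.
-- no_order_no_frame is 320 on every row: the frame is the whole partition.
```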
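1) The partition by and order by clauses determine the large window: the partition, whose upper bound is unbounded preceding and whose lower bound is unbounded following.
2) The frame clause (rows or range) determines the small, sliding window for each individual row.
3) The function is applied to the rows in each row's small window, and the results form a new column.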
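-- [Ranking]
select *,
row_number() over w as row_num,     -- unique sequence number, ties are broken arbitrarily
rank() over w as rank_with_gaps,    -- ties share a rank, the next rank is skipped
dense_rank() over w as rank_dense   -- ties share a rank, ranks stay consecutive
from SQL_5
window w as (partition by cid order by score desc);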
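The three ranking functions differ only when rows tie. As a minimal illustration for MySQL 8.0, the sketch below runs them over hypothetical sample data (the CTE scores and its rows are assumptions) in which two students share the top score.

```sql
-- Hypothetical sample data: Ann and Bob tie at 90.
with scores (sname, cid, score) as (
    select 'Ann', 1, 90 union all
    select 'Bob', 1, 90 union all
    select 'Eve', 1, 80
)
select sname, score,
       row_number() over w as row_num,
       rank()       over w as rank_with_gaps,
       dense_rank() over w as rank_dense
from scores
window w as (partition by cid order by score desc);
-- Ann and Bob (90): row_num 1 and 2 in arbitrary order, rank_with_gaps 1, rank_dense 1
-- Eve (80):         row_num 3,                          rank_with_gaps 3, rank_dense 2
```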
-- Top three scores per subject, using row_number().
select * from (
select *, row_number() over (partition by cid order by score desc) as rn from SQL_5
) temp
where rn <= 3;
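-- [Aggregates]
-- Show every student the total score of their class.
select *, sum(score) over (partition by cid) as class_total from SQL_5;
-- Within a class, the running total of each student's score plus all lower scores.
-- The default range frame means students with equal scores get the same running total.
select *, sum(score) over (partition by cid order by score) as running_total from SQL_5;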
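-- [Cross-row access]
-- Within a class, the score ranked one place below the current row (0 if there is none).
select *, lag(score, 1, 0) over (partition by cid order by score) as one_lower_score from SQL_5;
-- Within a class, the score ranked two places above the current row (null if there is none).
select *, lead(score, 2) over (partition by cid order by score) as two_higher_score from SQL_5;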
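-- Highest and lowest score per subject, using first_value and last_value.
-- last_value needs an ordering and a frame that reaches the end of the partition;
-- without them the row it returns is not well defined.
select distinct cid,
first_value(score) over w as highest_score,
last_value(score) over w as lowest_score
from SQL_5
window w as (partition by cid order by score desc
rows between unbounded preceding and unbounded following);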
-- Third-highest score per subject, using nth_value.
select distinct cid,
nth_value(score, 3) over (partition by cid order by score desc
rows between unbounded preceding and unbounded following) as third_score
from SQL_5;
/*
 * cume_dist(): (number of rows in the partition whose value is <= the current row's value) / (total number of rows in the partition)
 */
select
*,
cume_dist() over (order by score) as cd1
from SQL_5;
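/*
 * ntile(N): splits the ordered rows into N buckets of as equal size as possible and returns each row's bucket number (1 to N)
 */
select
*,
ntile(4) over (order by score) as bucket
from
SQL_5;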
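A small worked case may help to read the output of these two functions. The sketch below targets MySQL 8.0; the CTE scores and its five rows are hypothetical sample data, and ntile(2) is used instead of ntile(4) only to keep the result short.

```sql
-- Hypothetical sample data: five rows, with one tie at 70.
with scores (sname, score) as (
    select 'Ann', 60 union all
    select 'Bob', 70 union all
    select 'Eve', 70 union all
    select 'Dan', 80 union all
    select 'Fay', 90
)
select sname, score,
       cume_dist() over (order by score) as cd,
       ntile(2)    over (order by score) as bucket
from scores
order by score;
-- score 60 -> cd 0.2 (1 of 5 rows <= 60), bucket 1
-- score 70 -> cd 0.6 (3 of 5 rows <= 70), bucket 1 (both tied rows)
-- score 80 -> cd 0.8,                     bucket 2
-- score 90 -> cd 1.0,                     bucket 2
-- Five rows cannot be split evenly in two, so the first bucket gets three rows.
```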
Finding the top N rows within each group
Pattern:
select * from
(
select *, row_number() over (partition by <group column> order by <compared column> desc) as rn from <table>
) as tmp
where rn <= N;
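As a concrete instance of this pattern, the sketch below picks the two highest scores in each class on MySQL 8.0, sorting in descending order so that rn = 1 is the best score. The CTE scores and its rows are made-up sample data.

```sql
-- Hypothetical sample data: class 1 has three students, class 2 has two.
with scores (sname, cid, score) as (
    select 'Ann', 1, 90 union all
    select 'Bob', 1, 85 union all
    select 'Eve', 1, 70 union all
    select 'Dan', 2, 95 union all
    select 'Fay', 2, 60
)
select sname, cid, score, rn
from (
    select sname, cid, score,
           row_number() over (partition by cid order by score desc) as rn
    from scores
) as tmp
where rn <= 2
order by cid, rn;
-- cid 1: Ann (rn 1), Bob (rn 2); Eve is filtered out
-- cid 2: Dan (rn 1), Fay (rn 2)
```

If tied rows should all be kept, rank() or dense_rank() can replace row_number(), at the cost of possibly returning more than N rows per group.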
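1) Window functions generate helper columns, comparable to temporary variables in a general-purpose language.
2) Break a complex problem into several sub-problems and express each one as a temporary table.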
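One way to apply both ideas in MySQL 8.0 is a chain of common table expressions, where each step adds a helper column for the next. The sketch below finds the students who score above their class average; the CTE names scores and with_avg and the sample rows are assumptions made for illustration. The intermediate step is needed because a window function cannot appear directly in a where clause.

```sql
-- Hypothetical sample data.
with scores (sname, cid, score) as (
    select 'Ann', 1, 90 union all
    select 'Bob', 1, 70 union all
    select 'Eve', 2, 80 union all
    select 'Dan', 2, 60 union all
    select 'Fay', 2, 70
),
-- Step 1: attach the class average to every row as a helper column.
with_avg as (
    select sname, cid, score,
           avg(score) over (partition by cid) as class_avg
    from scores
)
-- Step 2: filter on the helper column.
select sname, cid, score, class_avg
from with_avg
where score > class_avg;
-- Returns Ann (90 against a class average of 80) and Eve (80 against 70).
```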