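MySQL 8.0 supports window functions, also known as analytic functions. A window function is similar to a grouped aggregate function, except that it produces a result for every row instead of collapsing each group into a single row. It helps to learn them by drawing an analogy between MySQL and the pandas DataFrame.
Functions such as SUM/AVG/COUNT/MAX/MIN can act both as aggregate functions and as window functions, so they can be called aggregate window functions.
If you are already comfortable with the agg()/apply()/transform() methods of a pandas DataFrame, the window functions below will come very easily.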
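To make the "one result per row" point concrete, here is a minimal sketch that runs the same AVG once as a grouped aggregate and once as a window function. It uses Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for MySQL 8.0, since the window syntax used here is shared; the emp table and its rows are made up for illustration, and the pandas comparisons in the comments are only a loose analogy.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; the table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (name text, dept text, salary integer)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [("ann", "a", 100), ("bob", "a", 300), ("cat", "b", 200)],
)

# Grouped aggregate: one output row per group (loosely like groupby().agg()).
grouped = conn.execute(
    "select dept, avg(salary) from emp group by dept order by dept"
).fetchall()

# Window aggregate: one output row per input row (loosely like groupby().transform()).
windowed = conn.execute(
    "select name, dept, salary, avg(salary) over (partition by dept) as dept_avg "
    "from emp order by name"
).fetchall()

print(grouped)   # two rows, one per department
print(windowed)  # three rows, each carrying its department's average
```

The grouped query loses the individual employees, while the windowed query keeps every row and simply attaches the group-level value to it.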
Basic syntax:
function(args) over (
partition by …
order by … [desc]
frame
)
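To specify the extent of a sliding window, the rows between frame_start and frame_end syntax is normally used to give a range of rows. frame_start and frame_end accept the following keywords, which pick out different sets of rows relative to the current row: CURRENT ROW, UNBOUNDED PRECEDING, UNBOUNDED FOLLOWING, n PRECEDING, and n FOLLOWING.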
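The effect of different frame boundaries is easiest to see side by side. The sketch below computes three sums over the same ordering, each with a different rows between frame. It again runs on SQLite through Python's sqlite3 as a stand-in for MySQL 8.0; the table t and its values are invented for the example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; table t and its data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (d integer, v integer)")
conn.executemany("insert into t values (?, ?)", [(1, 10), (2, 20), (3, 30), (4, 40)])

rows = conn.execute(
    """
    select d, v,
        -- everything from the first row up to the current row
        sum(v) over (order by d rows between unbounded preceding and current row) as running,
        -- the previous row, the current row and the next row
        sum(v) over (order by d rows between 1 preceding and 1 following) as centered,
        -- the current row and everything after it
        sum(v) over (order by d rows between current row and unbounded following) as remaining
    from t
    order by d
    """
).fetchall()

for row in rows:
    print(row)
```

At the edges of the partition the frame is simply clipped, so the centered sum for the first row covers only two rows.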
Rank Scores, LeetCode problem 178 [ranking without grouping]
select
score,
dense_rank() over(order by score desc) as `rank`
from
Scores
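Why dense_rank() rather than rank() or row_number() here? The sketch below puts the three side by side on data with ties. It is run on SQLite via Python's sqlite3 as a stand-in for MySQL 8.0, and the rows are sample values chosen for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; the sample scores are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("create table Scores (id integer, score real)")
conn.executemany(
    "insert into Scores values (?, ?)",
    [(1, 3.5), (2, 3.65), (3, 4.0), (4, 3.85), (5, 4.0), (6, 3.65)],
)

rows = conn.execute(
    """
    select score,
        row_number() over (order by score desc) as rn,   -- 1, 2, 3, ... ties broken arbitrarily
        rank()       over (order by score desc) as rk,   -- ties share a rank, then a gap follows
        dense_rank() over (order by score desc) as drk   -- ties share a rank, no gap
    from Scores
    order by score desc, rn
    """
).fetchall()

for row in rows:
    print(row)
```

With two scores tied at the top, rank() jumps from 1 to 3 while dense_rank() continues with 2, which is the consecutive numbering the problem asks for.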
Department Highest Salary, LeetCode problem 184 [ranking within groups]
select
Department,
Employee,
Salary
from
#---------- treat the subquery below as a table ----------
(select
b.name as Department,
a.name as Employee,
Salary,
rank() over(partition by departmentID order by salary desc) as salary_rank
from
Employee a
join
Department b
on
a.departmentID = b.id) t
#---------- dense_rank() would give the same result here ----------
where
salary_rank=1
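The derived table is needed because a window function's result cannot be filtered in the where clause of the same select; the ranking has to be computed first and filtered one level up. Below is a small runnable sketch of the same pattern, using SQLite through Python's sqlite3 as a stand-in for MySQL 8.0. The rows are made-up sample data that include a tie for the top salary.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table Department (id integer, name text);
    create table Employee (id integer, name text, salary integer, departmentId integer);
    insert into Department values (1, 'IT'), (2, 'Sales');
    insert into Employee values
        (1, 'Joe', 85000, 1), (2, 'Henry', 80000, 2), (3, 'Sam', 60000, 2),
        (4, 'Max', 90000, 1), (5, 'Jim', 90000, 1);
    """
)

top = conn.execute(
    """
    select Department, Employee, Salary
    from (
        -- rank salaries inside each department, highest first
        select b.name as Department, a.name as Employee, a.salary as Salary,
               rank() over (partition by a.departmentId order by a.salary desc) as salary_rank
        from Employee a
        join Department b on a.departmentId = b.id
    ) t
    where salary_rank = 1
    order by Department, Employee
    """
).fetchall()

print(top)
```

Because rank() gives tied rows the same rank, both top earners in the IT department are returned; row_number() would keep only one of them.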
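Rolling average GMV over the most recent three months (the current row and the two rows before it):
select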
product,
`year_month`,  -- backquoted because YEAR_MONTH is a reserved word in MySQL
gmv,
avg(gmv) over (partition by department, product order by `year_month` rows 2 preceding) as avg_gmv
from
product
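A quick way to check what rows 2 preceding does is to run it on a handful of rows. The sketch below uses SQLite through Python's sqlite3 as a stand-in for MySQL 8.0; the product table contents are invented, and one row per month is assumed so that "two rows back" means "two months back".

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; data is hypothetical, one row per month.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table product (department text, product text, `year_month` text, gmv integer)"
)
conn.executemany(
    "insert into product values (?, ?, ?, ?)",
    [
        ("d1", "p1", "2023-01", 100),
        ("d1", "p1", "2023-02", 200),
        ("d1", "p1", "2023-03", 300),
        ("d1", "p1", "2023-04", 400),
    ],
)

rows = conn.execute(
    """
    select `year_month`, gmv,
        avg(gmv) over (partition by department, product
                       order by `year_month` rows 2 preceding) as avg_gmv
    from product
    order by `year_month`
    """
).fetchall()

for row in rows:
    print(row)
```

The first two rows average over fewer than three months because there is nothing earlier in the partition; from the third row on, the window always spans exactly three rows.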
Rolling average GMV from launch through the current month:
select
product,
`year_month`,
gmv,
avg(gmv) over (partition by department, product order by `year_month`) as avg_gmv
from
product
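This is equivalent to the following, provided `year_month` is unique within each partition (with order by and no explicit frame, the default frame is range-based and also includes rows that tie with the current row):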
select
product,
`year_month`,
gmv,
avg(gmv) over (partition by department, product order by `year_month` rows unbounded preceding) as avg_gmv
from
product
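The two forms only agree when the ordering column has no duplicates inside a partition. The sketch below shows where they diverge: with order by and no frame clause, the default frame runs up to and including every row that ties with the current one, while rows unbounded preceding stops at the current physical row. SQLite via Python's sqlite3 is used as a stand-in for MySQL 8.0, and the data, including the duplicated month, is made up.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table product (product text, `year_month` text, gmv integer)")
conn.executemany(
    "insert into product values (?, ?, ?)",
    [
        ("p1", "2023-01", 100),
        ("p1", "2023-02", 200),
        ("p1", "2023-02", 600),  # duplicate month: a tie in the ordering
        ("p1", "2023-03", 400),
    ],
)

rows = conn.execute(
    """
    select `year_month`, gmv,
        avg(gmv) over (partition by product order by `year_month`) as default_avg,
        avg(gmv) over (partition by product order by `year_month`
                       rows unbounded preceding) as rows_avg
    from product
    order by `year_month`, gmv
    """
).fetchall()

for row in rows:
    print(row)
```

Both February rows get the same default_avg, because each one's frame includes the other, whereas their rows_avg values differ and depend on which of the tied rows happens to be processed first.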
Average GMV over the whole partition (with no order by, the frame is every row in the partition):
select
product,
`year_month`,
gmv,
avg(gmv) over (partition by department, product) as avg_gmv
from
product
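Parameters of lag(expression, n), used in the queries below:
expression: the column the function operates on
n: the offset, i.e. how many rows back to look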
select
product,
`year_month`,
department,
gmv,
lag(gmv,1) over (partition by department, product order by `year_month`) as lag_gmv,
cast(gmv as double) / lag(gmv,1) over (partition by department, product order by `year_month`) - 1 as growth_rate
from product
Simplified form, using a named window:
select
product,
year_month,
department,
gmv,
lag(gmv,1) over w as lag_gmv,
cast(gmv as double) / lag(gmv,1) over w - 1 as growth_rate
from product
WINDOW w as (partition by department, product order by `year_month`)
Note: cast(gmv as double) converts gmv to the double type (CAST ... AS DOUBLE requires MySQL 8.0.17 or later).
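Here is a small runnable sketch of the named-window version, to show what lag and the growth rate look like on real rows. It uses SQLite through Python's sqlite3 as a stand-in for MySQL 8.0, with invented data for a single product.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table product (department text, product text, `year_month` text, gmv integer)"
)
conn.executemany(
    "insert into product values (?, ?, ?, ?)",
    [
        ("d1", "p1", "2023-01", 100),
        ("d1", "p1", "2023-02", 150),
        ("d1", "p1", "2023-03", 120),
    ],
)

rows = conn.execute(
    """
    select `year_month`, gmv,
        lag(gmv, 1) over w as lag_gmv,
        cast(gmv as double) / lag(gmv, 1) over w - 1 as growth_rate
    from product
    window w as (partition by department, product order by `year_month`)
    order by `year_month`
    """
).fetchall()

for row in rows:
    print(row)
```

The first month has no previous row, so lag returns NULL (None in Python) and the growth rate is NULL as well; any downstream logic has to be ready for that.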
Question:
What if the dates are not contiguous?
This can be handled by joining against a calendar table.
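One way to read the calendar-table idea is sketched below: left join the data onto a table that lists every month, so that a missing month shows up as a row with a NULL gmv instead of silently disappearing. The calendar table, its contents and the single-product filter are all assumptions made for the example, and SQLite via Python's sqlite3 stands in for MySQL 8.0.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; both tables are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table calendar (`year_month` text);
    insert into calendar values ('2023-01'), ('2023-02'), ('2023-03');
    create table product (product text, `year_month` text, gmv integer);
    -- February is missing for product p1
    insert into product values ('p1', '2023-01', 100), ('p1', '2023-03', 300);
    """
)

rows = conn.execute(
    """
    select c.`year_month`, p.gmv,
        lag(p.gmv, 1) over (order by c.`year_month`) as lag_gmv
    from calendar c
    left join product p
        on p.`year_month` = c.`year_month` and p.product = 'p1'
    order by c.`year_month`
    """
).fetchall()

for row in rows:
    print(row)
```

Without the calendar, March's lag would be January's 100 and the "month over month" growth would quietly span two months. With it, March's previous value is NULL, which makes the gap visible. Handling several products at once would need the calendar combined with the list of products first.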
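Restricting the computed result to result <= 0.1 gives the top 10%.
To get the top 10% with ntile(n): take n = 10, and after bucketing, the rows whose bucket number is 1 are the top 10%.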
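Both approaches can be sketched on a small table. The text above does not name the function behind result, so the sketch assumes cume_dist() for that variant; the ntile(10) variant follows the description directly. SQLite via Python's sqlite3 stands in for MySQL 8.0, and the scores table with 20 distinct values is made up.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; the scores table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table scores (id integer, score integer)")
conn.executemany(
    "insert into scores values (?, ?)", [(i, i * 5) for i in range(1, 21)]
)

# Variant 1: split into 10 buckets, keep bucket 1.
top_by_ntile = [
    r[0]
    for r in conn.execute(
        """
        select score
        from (select score, ntile(10) over (order by score desc) as bucket from scores) t
        where bucket = 1
        order by score desc
        """
    )
]

# Variant 2 (assumed function): cumulative distribution, keep result <= 0.1.
top_by_cume_dist = [
    r[0]
    for r in conn.execute(
        """
        select score
        from (select score, cume_dist() over (order by score desc) as result from scores) t
        where result <= 0.1
        order by score desc
        """
    )
]

print(top_by_ntile, top_by_cume_dist)
```

With 20 distinct rows the two variants pick the same two scores. They can differ when the row count is not a multiple of 10, because ntile hands the leftover rows to the earlier buckets, or when there are ties, which ntile splits across buckets but cume_dist keeps together.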