A window function in SQL computes a value for every row of a result set over a related set of rows called its window. Unlike an aggregate function, a window function does not collapse many rows into one: every row is kept, and the function is evaluated over the rows of the partition (optionally ordered) that the current row belongs to.
Common window functions include ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), LAG(), and LEAD(). They make it easy to produce grouped, ordered results directly in the result set.
For example, ROW_NUMBER() partitions the result set by one or more columns and assigns a sequential number within each partition, which makes rows easy to track. LAG() and LEAD() fetch a value from a preceding or following row, so each row can look at the data just before or just after it.
In short, window functions are a very useful SQL tool for grouping, ordering, and analyzing the rows of a result set without losing row-level detail.
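To see the difference, here is a minimal sketch against a hypothetical `orders(user_id, money)` table (the table name and columns are illustrative only): the aggregate collapses each user into one row, while the window function keeps every row.

```sql
-- Aggregate: one row per user.
SELECT user_id, SUM(money) AS total
FROM orders
GROUP BY user_id;

-- Window function: every row is kept, each carrying its user's total.
SELECT user_id, money,
       SUM(money) OVER (PARTITION BY user_id) AS total
FROM orders;
```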
MySQL official documentation: https://dev.mysql.com/doc/refman/8.0/en/window-functions.html
Note: window functions are only available in MySQL 8.0 and later.
The official description is rather formal and can be a little hard to follow, so here is the general form:
<window_function> OVER ([PARTITION BY <partition columns>] [ORDER BY <sort columns> [ASC|DESC]] [<frame clause>])
In MySQL 8.0, a window frame is a set of consecutive rows within the current partition that is treated as a unit when the window function is evaluated.
A frame is specified with the following keywords: ROWS or RANGE, combined with the boundary keywords UNBOUNDED PRECEDING, n PRECEDING, CURRENT ROW, n FOLLOWING, and UNBOUNDED FOLLOWING.
Commonly used frame specifications:
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW -- from the first row of the partition to the current row
ROWS BETWEEN n PRECEDING AND CURRENT ROW -- the current row and the n rows before it
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -- the entire partition
Note: frames let a window function compute per-group aggregates such as sums, averages, and counts, as well as per-row results such as ranks and running totals.
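As a sketch of a frame clause in action (it queries the `order_for_goods` sample table created below): a per-user moving sum over the current row and the two rows immediately before it.

```sql
-- Moving sum of the current row and the two preceding rows, per user.
SELECT
    order_id, user_id, join_time, money,
    SUM(money) OVER (
        PARTITION BY user_id
        ORDER BY join_time
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_sum_3
FROM order_for_goods;
```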
DROP TABLE IF EXISTS `order_for_goods`;
CREATE TABLE `order_for_goods` (
    `order_id` int(0) NOT NULL AUTO_INCREMENT,
    `user_id` int(0) NULL DEFAULT NULL,
    `money` decimal(10, 2) NULL DEFAULT NULL,
    `quantity` int(0) NULL DEFAULT NULL,
    `join_time` datetime(0) NULL DEFAULT NULL,
    PRIMARY KEY (`order_id`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 12 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
INSERT INTO order_for_goods (user_id, money, quantity, join_time )
VALUES
( 1001, 1800.90, 1, '2023-06-07'),
( 1001, 3600.89, 5, '2023-05-02'),
( 1001, 1000.10, 6, '2023-01-08'),
( 1002, 1100.90, 9, '2023-04-07'),
( 1002, 4500.99, 1, '2023-03-14'),
( 1003, 2500.10, 3, '2023-02-14'),
( 1002, 2500.90, 1, '2023-03-14'),
( 1003, 2500.90, 1, '2022-12-12'),
( 1003, 2500.90, 2, '2022-09-08'),
( 1003, 6000.90, 8, '2023-01-10');
1. Statement:
select
*,
sum(money) over(partition by user_id order by order_id) as alias_sum,
avg(money) over(partition by user_id order by order_id) as alias_avg,
max(money) over(partition by user_id order by order_id) as alias_max,
min(money) over(partition by user_id order by order_id) as alias_min,
count(money) over(partition by user_id order by order_id) as alias_count
from order_for_goods;
2. Result: the selected columns are returned together with the computed alias columns.
1. Statement:
select *
from (
select *,row_number() over(partition by user_id order by money desc) as alias_row_number
from order_for_goods) t
where alias_row_number<=3;
2. Result:
3. Statement:
select *
from (
select *,row_number() over(partition by user_id order by money desc) as alias_row_number
from order_for_goods) t
where alias_row_number<=1;
4. Result:
Summary: it is worth thinking about where else this pattern applies. For example: ranking the top three best-selling goods in each product category. Window functions solve many such problems and spare you large, hard-to-maintain, hard-to-read SQL.
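The top-N-per-group idea from the summary can be sketched as follows; the `goods_sales` table and its columns are hypothetical, introduced only for illustration.

```sql
-- Hypothetical table: goods_sales(category, goods_name, sales).
-- Top three goods by sales within each category.
SELECT category, goods_name, sales
FROM (
    SELECT
        category, goods_name, sales,
        ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS rn
    FROM goods_sales
) t
WHERE rn <= 3;
```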
1. Statement:
select
*,
rank() over(partition by user_id order by money desc) as alias_rank
from order_for_goods;
2. Result:
1. Statement:
select
*,
dense_rank() over(partition by user_id order by money desc) as alias_dense_rank
from order_for_goods;
2. Result:
1. Statement:
select
*,
row_number() over(partition by user_id order by money desc) as alias_row_number,
rank() over(partition by user_id order by money desc) as alias_rank,
dense_rank() over(partition by user_id order by money desc) as alias_dense_rank
from order_for_goods;
2. Result:
1. Statement:
select
*,
percent_rank() over(partition by user_id order by money desc) as alias_percent_rank
from order_for_goods;
2. Result:
1. Statement:
select
*,
cume_dist() over(partition by user_id order by money desc) as alias_cume_dist
from order_for_goods;
2. Result:
1. Syntax:
LAG(expression, offset, default_value)
2. Statement:
select
*,
lag(join_time, 1, 0) over(partition by user_id order by join_time desc) as alias_lag
from order_for_goods;
3. Result:
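A common practical use of LAG() is measuring the gap between consecutive events. As a sketch against the same `order_for_goods` table: the number of days since the user's previous order. It is NULL for each user's first order, since LAG() returns NULL when there is no preceding row and no default is supplied.

```sql
-- Days elapsed since the same user's previous order.
SELECT
    order_id, user_id, join_time,
    DATEDIFF(
        join_time,
        LAG(join_time, 1) OVER (PARTITION BY user_id ORDER BY join_time)
    ) AS days_since_prev_order
FROM order_for_goods;
```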
1. Syntax:
LEAD(expression, offset, default_value)
2. Statement:
select
*,
lead(join_time, 1, 0) over(partition by user_id order by join_time desc) as alias_lead
from order_for_goods;
3. Result:
1. Syntax:
FIRST_VALUE(expression)
2. Statement:
select
*,
first_value(money) over(partition by user_id order by join_time desc) as alias_first_value
from order_for_goods;
3. Result:
1. Syntax:
LAST_VALUE(expression)
2. Statement:
select
*,
last_value(money) over(partition by user_id order by join_time) as alias_last_value
from order_for_goods;
3. Result:
4. Explanation: when the OVER clause contains an ORDER BY but no explicit frame, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. LAST_VALUE() therefore only sees rows up to the current row (plus any peer rows with the same ORDER BY value), so it usually returns the current row's own value rather than the last value of the partition. To get the true last value, specify a frame ending at UNBOUNDED FOLLOWING.
5. Verification:
select
*,
last_value(money) over(partition by user_id order by join_time) as alias_last_value1,
last_value(money) over(partition by user_id order by join_time rows between unbounded preceding and current row) as alias_last_value2,
last_value(money) over(partition by user_id order by join_time rows between unbounded preceding and unbounded following) as alias_last_value3
from order_for_goods;
+----------+---------+---------+----------+---------------------+-------------------+-------------------+-------------------+
| order_id | user_id | money   | quantity | join_time           | alias_last_value1 | alias_last_value2 | alias_last_value3 |
+----------+---------+---------+----------+---------------------+-------------------+-------------------+-------------------+
|       34 |    1001 | 1000.10 |        6 | 2023-01-08 00:00:00 |           1000.10 |           1000.10 |           1800.90 |
|       33 |    1001 | 3600.89 |        5 | 2023-05-02 00:00:00 |           3600.89 |           3600.89 |           1800.90 |
|       32 |    1001 | 1800.90 |        1 | 2023-06-07 00:00:00 |           1800.90 |           1800.90 |           1800.90 |
|       36 |    1002 | 4500.99 |        1 | 2023-03-14 00:00:00 |           2500.90 |           4500.99 |           1100.90 |
|       38 |    1002 | 2500.90 |        1 | 2023-03-14 00:00:00 |           2500.90 |           2500.90 |           1100.90 |
|       35 |    1002 | 1100.90 |        9 | 2023-04-07 00:00:00 |           1100.90 |           1100.90 |           1100.90 |
|       40 |    1003 | 2500.90 |        2 | 2022-09-08 00:00:00 |           2500.90 |           2500.90 |           2500.10 |
|       39 |    1003 | 2500.90 |        1 | 2022-12-12 00:00:00 |           2500.90 |           2500.90 |           2500.10 |
|       41 |    1003 | 6000.90 |        8 | 2023-01-10 00:00:00 |           6000.90 |           6000.90 |           2500.10 |
|       37 |    1003 | 2500.10 |        3 | 2023-02-14 00:00:00 |           2500.10 |           2500.10 |           2500.10 |
+----------+---------+---------+----------+---------------------+-------------------+-------------------+-------------------+
10 rows in set (0.00 sec)
1. Syntax (the argument is the number of buckets):
NTILE(bucket_count)
2. Statement:
select
*,
ntile(1) over(partition by user_id order by join_time desc) as alias_ntile1,
ntile(2) over(partition by user_id order by join_time desc) as alias_ntile2,
ntile(3) over(partition by user_id order by join_time desc) as alias_ntile3
from order_for_goods;
3. Result:
Note: NTILE() distributes the ordered rows of each partition into the specified number of buckets as evenly as possible and assigns each row its bucket number. If the rows cannot be split evenly, the lower-numbered buckets each receive one extra row, so bucket sizes differ by at most 1.
1. Syntax:
NTH_VALUE(expression, n)
2. Statement:
select
*,
nth_value(money, 2) over(partition by user_id order by join_time ) as alias_nth_value
from order_for_goods;
3. Result:
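Note that NTH_VALUE() is subject to the same default-frame behavior as LAST_VALUE(): with only an ORDER BY in the OVER clause, rows before the second row of a partition do not yet have a second value in their frame, so the statement above returns NULL for them. A sketch with an explicit full-partition frame avoids this:

```sql
-- NTH_VALUE over the whole partition: every row sees the partition's
-- second value by join_time, instead of NULL on the first row.
SELECT
    *,
    NTH_VALUE(money, 2) OVER (
        PARTITION BY user_id
        ORDER BY join_time
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS alias_nth_value_full
FROM order_for_goods;
```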