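Key concept
Ranking within groups
Example
Ranking within groups
1. Prepare the data
Create an orders table with the fields order id (orderid), product id (Itemid), product category (category), order date (orderdate), and sales amount (sales).
The code is as follows: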
create table orders1 (
orderid int(11) not null primary key ,
Itemid varchar(30) not null ,
category varchar(10) not null ,
orderdate datetime not null ,
sales float not null ); # create the table
# insert the data
insert into orders1 (orderid, Itemid, category, orderdate, sales)
values ('1','k1','A','2020-1-2','459.5'),
('2','k2','A','2020-2-2','345.4'),
('3','k1','B','2020-1-7','47'),
('4','k3','C','2020-1-21','678'),
('5','k4','B','2020-5-2','345'),
('6','k7','A','2020-3-12','654'),
('7','k4','C','2020-4-25','464'),
('8','k2','B','2020-5-28','632'),
('9','k3','A','2020-7-20','98'),
('10','k5','C','2020-4-28','455.6'),
('11','k1','B','2020-3-27','459.7'),
('12','k2','A','2020-2-23','776.6'),
('13','k4','B','2020-2-12','759'),
('14','k5','A','2020-8-2','999'),
('15','k2','C','2020-9-21','599'),
('16','k3','A','2020-10-21','433'),
('17','k4','C','2020-10-28','232'),
('18','k5','B','2020-10-3','124'),
('19','k3','B','2020-6-6','321.4'),
('20','k3','C','2020-9-11','788.6');
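Before moving on, a quick check that all 20 rows were loaded can help. The query below is only a sanity-check sketch added for illustration; it assumes the orders1 table and the insert above ran without errors.

```sql
-- Count the rows per category in the sample data.
select category, count(*) as order_count
from orders1
group by category
order by category;
-- With the 20 rows above: A has 7 orders, B has 7, C has 6.
```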
2. Get the top 2 orders by sales amount in each product category
Method 1: use the window function row_number() over()
The code is as follows:
select category,orderid,Itemid,sales from (
select category,orderid,Itemid,sales,
row_number()over(partition by category order by category,sales desc ) as ranking
from orders1
group by category,orderid) t1 where ranking <=2;
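While running this code I noticed a problem. In
group by category,orderid
removing orderid, or replacing it with Itemid, raises the error
"[42000][1055] Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'mysql.orders1.orderid' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by"
I did not understand this at first. What sets orderid apart is that it is the primary key of this table, so its values are unique.
Addendum
After further study I understood it. Grouping by category and Itemid alone is an aggregation that collapses several rows into one, so the non-grouped column orderid has no single value per group. Because orderid is the primary key, including it in group by makes every other column functionally dependent on the grouping columns, which keeps the original rows intact for ranking. With row_number() over(), however, group by is not needed at all.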
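As a sketch of that last point, the Method 1 query can be written without group by. The order by inside over() also only needs sales desc, since every row in a partition already shares the same category. This assumes MySQL 8.0 or later and the sample data above.

```sql
select category, orderid, Itemid, sales
from (
    select category, orderid, Itemid, sales,
           row_number() over (partition by category order by sales desc) as ranking
    from orders1
) t1
where ranking <= 2;
-- With the sample data this keeps orderid 14 and 12 for A, 13 and 8 for B, 20 and 4 for C.
```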
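Method 2: ranking within groups without window functions
Window functions are supported only in MySQL 8.0 and later, so earlier versions need user-defined variables instead.
The code is as follows: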
select category,orderid,Itemid,sales,
case @ca
when category
then @rk:=@rk+1 else @rk:=1
end ranking,@ca:=orders1.category
from orders1,(select @ca:='',@rk:=0) b
order by category,sales desc;
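Explanation:
1. The @rk variable holds the rank, and the @ca variable holds the category of the previous row, so that a change of group can be detected.
2. The reason this code runs as intended is that, for a single-table query like this one, MySQL sorts for order by before it evaluates the select list, so the rows reach the variable assignments in the order we want. Note that the MySQL manual states that the order of evaluation for expressions involving user variables is undefined, so this behavior is not guaranteed.
3. One more point to watch: if the query joins two or more tables, order by is executed after select, so the join has to be completed first to produce a temporary table, and the sort is then applied to that table.
Reference: https://wenku.baidu.com/view/d60c6f37a000a6c30c22590102020740be1ecdc1.html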
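Since the user-variable approach depends on evaluation order, a variable-free alternative may be worth keeping at hand for versions before 8.0. The sketch below swaps in a different technique, a correlated subquery: a row belongs to the top 2 of its category when fewer than 2 rows in the same category have a higher sales value. It assumes sales values do not tie within a category (true for the sample data); with ties it can return more than 2 rows per category.

```sql
select o.category, o.orderid, o.Itemid, o.sales
from orders1 o
where (
    -- how many orders in the same category sold for more than this one
    select count(*)
    from orders1 x
    where x.category = o.category
      and x.sales > o.sales
) < 2
order by o.category, o.sales desc;
```

The subquery runs once per row, so on a large table this is likely to be slower than the window function version.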
A small extension
If the grouping needs two or more columns,
the code for the two methods is, respectively:
select category,orderid,Itemid,sales from (
select category,orderid,Itemid,sales,
row_number()over(partition by category,Itemid order by category,sales desc ) as ranking
from orders1
group by category,Itemid,orderid) t1 where ranking <=2;
select category,orderid,Itemid,sales from (
select orders1.category,orders1.orderid,orders1.Itemid,orders1.sales,
case when @ca=orders1.category and @it=orders1.Itemid
then @rk:=@rk+1 else @rk:=1
end as ranking,
@ca:=orders1.category as prev_category, # renamed: a derived table cannot contain two columns named category
@it:=orders1.Itemid as prev_itemid
from orders1,(select @ca:='',@it:='',@rk:=0) b
order by orders1.category,orders1.Itemid,orders1.sales desc) t1 where ranking <=2;
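One thing worth checking with the sample data: no (category, Itemid) pair has more than 2 orders, so the ranking <= 2 filter does not exclude any row here. A quick sketch to confirm this, assuming the 20 rows inserted above:

```sql
-- List the (category, Itemid) groups that the ranking <= 2 filter would trim.
select category, Itemid, count(*) as order_count
from orders1
group by category, Itemid
having count(*) > 2;
-- Returns an empty set for the 20 sample rows.
```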
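A small question
# While learning the window function row_number() over(), the code I studied was the first version below, but I found that deleting the group by clause does not change the query result.
# Version 1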
select category,orderid,Itemid,sales,
row_number()over(partition by category order by category,sales desc ) as ranking
from orders1
group by category,orderid;
# versus
# Version 2
select category,orderid,Itemid,sales,
row_number()over(partition by category order by category,sales desc ) as ranking
from orders1;
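A likely explanation, sketched below: orderid is the primary key, so grouping by category,orderid puts every row in its own group. The group by therefore neither merges nor removes rows, and the window function sees the same 20 rows either way.

```sql
-- Number of groups produced by the group by in version 1.
select count(*) as group_count
from (select category, orderid from orders1 group by category, orderid) g;

-- Number of rows in the table.
select count(*) as row_count from orders1;
-- Both return 20 for the sample data.
```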