MySQL: finding the top N rows per group (group-wise ranking)

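Key points

Group-wise ranking

  • Use a window function, for example row_number() over()
  • MySQL versions below 8.0 do not support window functions, so the ranking has to be emulated with user-defined variables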
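
To decide which of the two approaches applies, the server version can be checked first. A minimal sketch; VERSION() is a built-in MySQL function, and the exact string it returns depends on the installation:

```sql
# Returns the server version string; window functions need 8.0 or later
select version();
```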

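Example

Group-wise ranking
1. Prepare the data
Create an orders table with these columns: order id (orderid), product id (Itemid), product category (category), order date (orderdate), and sales amount (sales).

The code is as follows: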

create table orders1 (
    orderid int(11) not null primary key ,
    Itemid varchar(30) not null ,
    category varchar(10) not null ,
    orderdate datetime not null ,
    sales float not null ); # create the table
# insert the data
insert into orders1 (orderid, Itemid, category, orderdate, sales)
values ('1','k1','A','2020-1-2','459.5'),
       ('2','k2','A','2020-2-2','345.4'),
       ('3','k1','B','2020-1-7','47'),
       ('4','k3','C','2020-1-21','678'),
       ('5','k4','B','2020-5-2','345'),
       ('6','k7','A','2020-3-12','654'),
       ('7','k4','C','2020-4-25','464'),
       ('8','k2','B','2020-5-28','632'),
       ('9','k3','A','2020-7-20','98'),
       ('10','k5','C','2020-4-28','455.6'),
       ('11','k1','B','2020-3-27','459.7'),
       ('12','k2','A','2020-2-23','776.6'),
       ('13','k4','B','2020-2-12','759'),
       ('14','k5','A','2020-8-2','999'),
       ('15','k2','C','2020-9-21','599'),
       ('16','k3','A','2020-10-21','433'),
       ('17','k4','C','2020-10-28','232'),
       ('18','k5','B','2020-10-3','124'),
       ('19','k3','B','2020-6-6','321.4'),
       ('20','k3','C','2020-9-11','788.6');

2. Get the top 2 orders by sales amount in each category

Method 1: use the window function row_number() over()
The code is as follows:

select category,orderid,Itemid,sales from (
select category,orderid,Itemid,sales,
       row_number()over(partition by category order by category,sales desc ) as ranking
from orders1
group by category,orderid) t1 where ranking <=2;

While running this code I ran into a problem with this clause:

group by category,orderid

Removing orderid, or replacing it with Itemid, raises an error:
"[42000][1055] Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'mysql.orders1.orderid' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by"

At first I did not understand this. What is special about orderid is that it is the primary key of this table, so its values are unique.
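
The error comes from the ONLY_FULL_GROUP_BY SQL mode: every selected column must either appear in the GROUP BY or be functionally dependent on the grouped columns. The sketch below assumes the orders1 table above and a session with ONLY_FULL_GROUP_BY enabled. It illustrates that grouping by the primary key alone is accepted, while grouping by non-key columns is not.

```sql
# Show the active SQL modes; look for ONLY_FULL_GROUP_BY in the list
select @@sql_mode;

# Accepted: orderid is the primary key, so category, Itemid and sales
# are functionally dependent on it
select category, orderid, Itemid, sales
from orders1
group by orderid;

# Rejected with error 1055: orderid is neither in the GROUP BY
# nor determined by (category, Itemid)
select category, orderid, Itemid, sales
from orders1
group by category, Itemid;
```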

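Update
After further study I understood it. Grouping by category and Itemid aggregates several orders into one row, so orderid has to be in the GROUP BY to keep one row per order (the original table structure) for ranking; because orderid is the primary key, each group then holds exactly one row. But once row_number() over() is used, the GROUP BY is not needed at all.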
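
Following that observation, Method 1 can be written without the GROUP BY. This is a sketch for MySQL 8.0 or later using the orders1 table and sample data above. The outer ORDER BY and the ranking column in the output are additions for readability, not part of the original query.

```sql
select category, orderid, Itemid, sales, ranking
from (
    # Number the rows inside each category, highest sales first
    select category, orderid, Itemid, sales,
           row_number() over (partition by category order by sales desc) as ranking
    from orders1
) t1
where ranking <= 2
order by category, ranking;

# With the sample data this should return:
# A  14  k5  999    1
# A  12  k2  776.6  2
# B  13  k4  759    1
# B   8  k2  632    2
# C  20  k3  788.6  1
# C   4  k3  678    2
```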

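Method 2: group-wise ranking without window functions
Window functions are only supported in MySQL 8.0 and later, so on earlier versions the ranking has to be emulated with user-defined variables.
The code is as follows: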

select category,orderid,Itemid,sales,
case  @ca
    when category
        then @rk:=@rk+1 else @rk:=1
    end ranking,@ca:=orders1.category
from orders1,(select @ca:='',@rk:=0) b
order by category,sales desc;

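Explanation:
1. The @rk variable holds the rank. The @ca variable holds the category of the previous row, so a change in its value marks the switch to a new group.
2. This code works because of the order in which MySQL processes the statement: here the rows are sorted by the ORDER BY before the SELECT expressions are evaluated, so the variables see the rows in the intended order. Note that the MySQL manual describes the evaluation order of expressions involving user variables as undefined, so this behavior is not guaranteed.
3. One caveat: if the query joins two or more tables, the ORDER BY is applied after the SELECT expressions are evaluated, so the join has to be completed first to produce a temporary table, and the sorting and ranking applied to that table.
Reference: https://wenku.baidu.com/view/d60c6f37a000a6c30c22590102020740be1ecdc1.html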
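
The Method 2 query numbers every row but does not filter on the rank. One possible way to keep only the top 2 per category on MySQL versions below 8.0 is to wrap it in a derived table. This is a sketch under the same assumption as point 2 (rows are sorted before the variables are evaluated). The aliases o, init, and prev_category are introduced here for illustration only.

```sql
select category, orderid, Itemid, sales, ranking
from (
    select o.category, o.orderid, o.Itemid, o.sales,
           # Same category as the previous row: increment; otherwise restart at 1
           case when @ca = o.category then @rk := @rk + 1 else @rk := 1 end as ranking,
           # Remember this row's category for the next row; the alias must
           # differ from "category" so the derived table has unique column names
           @ca := o.category as prev_category
    from orders1 o, (select @ca := '', @rk := 0) init
    order by o.category, o.sales desc
) t1
where ranking <= 2;
```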

A small extension
If the ranking has to be partitioned by two or more columns,
the two versions of the code are:

select category,orderid,Itemid,sales from (
select category,orderid,Itemid,sales,
       row_number()over(partition by category,Itemid order by category,sales desc ) as ranking
from orders1
group by category,Itemid,orderid) t1 where ranking <=2;

select category,orderid,Itemid,sales from (
select orders1.category,orders1.orderid,orders1.Itemid,orders1.sales,
case when @ca=orders1.category and @it=orders1.Itemid
        then @rk:=@rk+1 else @rk:=1
    end as ranking,
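    # the aliases must differ from category/Itemid: a derived table cannot have duplicate column names
    @ca:=orders1.category as prev_category,
    @it:=orders1.Itemid as prev_itemid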
from orders1,(select @ca:='',@it:='',@rk:=0) b
order by orders1.category,orders1.Itemid,orders1.sales desc) t1 where ranking <=2;

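A small question

# While learning the window function row_number() over(), the code I studied was version 1 below, but I found that deleting the GROUP BY clause does not change the query result.
# Version 1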
select category,orderid,Itemid,sales,
       row_number()over(partition by category order by category,sales desc ) as ranking
from orders1
group by category,orderid;
# versus
# Version 2
select category,orderid,Itemid,sales,
       row_number()over(partition by category order by category,sales desc ) as ranking
from orders1;
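
One way to see why the two versions agree is to check whether the GROUP BY ever merges rows. The sketch below assumes the orders1 table above; the alias rows_in_group is illustrative. Since orderid is the primary key, every (category, orderid) group holds a single row, so the grouping leaves the row set unchanged.

```sql
# Lists any (category, orderid) group containing more than one row;
# with orderid as the primary key, the result is empty
select category, orderid, count(*) as rows_in_group
from orders1
group by category, orderid
having count(*) > 1;
```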
