How to select the first least max top N rows from each group in SQL

从分组中选择极值查询/前N项查询 是经常会遇到的问题 ,下面通过简单举例展示这种SQL的写法

举例表
type variety price
apple gala 2.79
apple fuji 0.24
apple limbertwig 2.87
orange valencia 3.59
orange navel 9.36
pear bradford 6.05
pear bartlett 2.14
cherry bing 2.55
cherry chelan 6.33


Selecting the one minimum row from each group

期望结果
type variety price
apple fuji 0.24
orange valencia 3.59
pear bartlett 2.14
cherry bing 2.55



方法一 通过分组子查询实现

select f.type, f.variety, f.price
from (
select type, min(price) as minprice
from fruits group by type
) as x inner join fruits as f on f.type = x.type and f.price = x.minprice;


结果:

type variety price
apple fuji 0.24
cherry bing 2.55
orange valencia 3.59
pear bartlett 2.14


方法二 通过关联子查询实现


select type, variety, price
from fruits
where price = (select min(price) from fruits as f where f.type = fruits.type);


结果:

type variety price
apple fuji 0.24
orange valencia 3.59
pear bartlett 2.14
cherry bing 2.55


以上两个查询是等价的.

Select the top N rows from each group

每组前N个查询是比较痛苦的问题,因为聚合函数只返回一个值,所以通过聚集函数分组查询前几个数据是不可能的.

比方说,我要选择每个类型最便宜的两个水果。
可以通过变换SQL写法实现

方法一:

select type, variety, price
from fruits
where price = (select min(price) from fruits as f where f.type = fruits.type)
or price = (select min(price) from fruits as f where f.type = fruits.type
and price > (select min(price) from fruits as f2 where f2.type = fruits.type));


结果:
type variety price
apple gala 2.79
apple fuji 0.24
orange valencia 3.59
orange navel 9.36
pear bradford 6.05
pear bartlett 2.14
cherry bing 2.55
cherry chelan 6.33

这种大量子查询方法性能很差,如果是前3个,前4个等等 这种查询变得不可实现


方法二 从每个品种的水果,品种不超过第二便宜的查询,通过关联子查询实现

select type, variety, price
from fruits
where (
 select count(1) from fruits as f
 where f.type = fruits.type and f.price < fruits.price
) <= 2;


第二种方法在fruits表很大时效果不佳

方法三

可以使用union all 实现 (union all 与 union 的区别是 前者不会通过排序消除重复)

(select * from fruits where type = 'apple' order by price limit 2)
union all
(select * from fruits where type = 'orange' order by price limit 2)
union all
(select * from fruits where type = 'pear' order by price limit 2)
union all
(select * from fruits where type = 'cherry' order by price limit 2)


如果分组(这里是水果种类)数量不大/分页的情况下,可以使用union把数据切成多段分开查询,在好的索引支持下效率很高.

(测试中发现如果去掉每段查询的圆括号则limit 限制整个结果集的返回行数 而不是每段.)

使用union all 联合查询是解决N+1 问题的利器(特别是1 - n 中 n方数据量特别大时),不过JPA 框架在处理union all 查询时有bug (悲催啊)

h3. 实际项目使用情况:

   CRM中商家页面 商家与分店是典型的N+1查询问题,由于有些商家分店数比较多,页面中只展示前3个分店,此时可以使用union all 查询优化 , 通过一条SQL 返回所有商家的前3家分店.

实例参考自Baron Schwartz的博客: http://www.xaprb.com/blog

你可能感兴趣的:(mysql)