MySQL必知必会07:条件语句where与having不同

阅读整理自《MySQL 必知必会》- 朱晓峰,详细内容请登录 极客时间 官网购买专栏。

文章目录

    • 实际查询需求
    • where
    • having
      • group by
      • having
    • 正确使用where和having

查询的时候,经常需要按条件对查询结果进行筛选,这就要用到条件语句 WHERE 和 HAVING 了。

  • WHERE 是直接对表中的字段进行限定,来筛选结果;

  • HAVING 则需要跟分组关键字 GROUP BY 一起使用,通过对分组字段或分组计算函数进行限定,来筛选结果。

虽然它们都是对查询进行限定,却有着各自的特点和适用场景。很多时候,会遇到 2 个都可以用的情况。一旦用错,就很容易出现执行效率低下、查询结果错误,甚至是查询无法运行的情况。

实际查询需求

超市的经营者提出,要查单笔销售金额超过 50 元的商品。我们来分析一下这个需求:需要查询出一个商品记录集,限定条件是单笔销售金额超过 50 元。

假设我们有一个这样的商品信息表(demo.goodsmaster),里面有 2 种商品:书和笔。

数据准备:

show databases;
use demo;

create table demo.goodsmaster
(
itemnumber int primary key auto_increment,
barcode text,
goodsname text,
specification text,
unit text,
saleprice decimal(5,2)
);
describe goodsmaster;
insert into demo.goodsmaster (itemnumber, barcode, goodsname, specification, unit, saleprice) values (1, '0001', '书', '', '本', 89);
insert into demo.goodsmaster (itemnumber, barcode, goodsname, specification, unit, saleprice) values (2, '0002', '笔', '', '支', 5);
select * from demo.goodsmaster;

create table demo.transactiondetails
(
transactionid int,
itemnumber int,
quantity decimal(10,3),
price decimal(5,2),
salesvalue decimal(5,2)
);
describe transactiondetails;
insert into demo.transactiondetails (transactionid, itemnumber, quantity, price, salesvalue) values (1, 1, 1, 89, 89);
insert into demo.transactiondetails (transactionid, itemnumber, quantity, price, salesvalue) values (1, 2, 2, 5, 10);
insert into demo.transactiondetails (transactionid, itemnumber, quantity, price, salesvalue) values (2, 1, 2, 89, 178);
insert into demo.transactiondetails (transactionid, itemnumber, quantity, price, salesvalue) values (3, 2, 10, 5, 50);
select * from demo.transactiondetails;

商品信息表:

mysql> select * from demo.goodsmaster;
+------------+---------+-----------+---------------+------+-----------+
| itemnumber | barcode | goodsname | specification | unit | saleprice |
+------------+---------+-----------+---------------+------+-----------+
|          1 | 0001    ||               ||     89.00 |
|          2 | 0002    ||               ||      5.00 |
+------------+---------+-----------+---------------+------+-----------+
2 rows in set (0.01 sec)

商品销售明细表:

mysql> select * from demo.transactiondetails;
+---------------+------------+----------+-------+------------+
| transactionid | itemnumber | quantity | price | salesvalue |
+---------------+------------+----------+-------+------------+
|             1 |          1 |    1.000 | 89.00 |      89.00 |
|             1 |          2 |    2.000 |  5.00 |      10.00 |
|             2 |          1 |    2.000 | 89.00 |     178.00 |
|             3 |          2 |   10.000 |  5.00 |      50.00 |
+---------------+------------+----------+-------+------------+
4 rows in set (0.00 sec)

使用 where查询:

select distinct b.goodsname
from demo.transactiondetails as a
join demo.goodsmaster as b
on (a.itemnumber = b.itemnumber)
where a.salesvalue > 50;

使用 having 查询:

select b.goodsname
from demo.transactiondetails as a
join demo.goodsmaster as b
on (a.itemnumber = b.itemnumber)
group by b.goodsname
having max(a.salesvalue) > 50;

结果都是:

+-----------+
| goodsname |
+-----------+
||
+-----------+
1 row in set (0.02 sec)

where

分析一下使用 WHERE 条件的查询语句,是如何执行这个查询的

  1. 首先,MySQL 从数据表 demo.transactiondetails 中抽取满足条件 a.salesvalue>50 的记录:

    mysql> select * from demo.transactiondetails as a  where a.salesvalue > 50;
    +---------------+------------+----------+-------+------------+
    | transactionid | itemnumber | quantity | price | salesvalue |
    +---------------+------------+----------+-------+------------+
    |             1 |          1 |    1.000 | 89.00 |      89.00 |
    |             2 |          1 |    2.000 | 89.00 |     178.00 |
    +---------------+------------+----------+-------+------------+
    2 rows in set (0.00 sec)
    
  2. 为了获取到销售信息所对应的商品名称,需要通过公共字段 itemnumbers 与数据表 demo.goodsmaster 进行关联,从 demo.goodsmaster 中获取商品名称:

    mysql> select a.*, b.goodsname
        -> from demo.transactiondetails a
        -> join demo.goodsmaster b
        -> on (a.itemnumber = b.itemnumber)
        -> where a.salesvalue > 50;
    +---------------+------------+----------+-------+------------+-----------+
    | transactionid | itemnumber | quantity | price | salesvalue | goodsname |
    +---------------+------------+----------+-------+------------+-----------+
    |             1 |          1 |    1.000 | 89.00 |      89.00 ||
    |             2 |          1 |    2.000 | 89.00 |     178.00 ||
    +---------------+------------+----------+-------+------------+-----------+
    2 rows in set (0.00 sec)
    
  3. 查询商品名称,就会出现两个重复的记录:

    mysql> select b.goodsname from  demo.transactiondetails a join demo.goodsmaster b on (a.itemnumber = b.itemnumber) where a.salesvalue > 50;
    +-----------+
    | goodsname |
    +-----------+
    ||
    ||
    +-----------+
    2 rows in set (0.00 sec)
    
  4. 为了消除重复的语句,这里我们需要用到一个关键字:DISTINCT,它的作用是返回唯一不同的值。比如,DISTINCT 字段 1,就表示返回所有字段 1 的不同的值

    mysql> select distinct( b.goodsname )from  demo.transactiondetails a join demo.goodsmaster b on (a.itemnumber = b.itemnumber) where a.salesvalue > 50;
    +-----------+
    | goodsname |
    +-----------+
    ||
    +-----------+
    1 row in set (0.00 sec)
    

WHERE 关键字的特点是,直接用表的字段对数据集进行筛选。

如果需要通过关联查询从其他的表获取需要的信息,那么执行的时候,也是先通过 WHERE 条件进行筛选,用筛选后的比较小的数据集进行连接。

这样一来,连接过程中占用的资源比较少,执行效率也比较高。


having

HAVING 不能单独使用,必须要跟 GROUP BY 一起使用。可以把 GROUP BY 理解成对数据进行分组,方便对组内的数据进行统计计算。

group by

先说明 GROUP BY 如何使用,以及如何在分组里面进行统计计算。
假设现在有一组销售数据,我们需要从里面查询每天、每个收银员的销售数量和销售金额。

数据准备:

create table demo.transactionhead
(
transactionid int primary key auto_increment,
transactionno text,
operatorid int,
transdate datetime
);

insert into demo.transactionhead (transactionid, transactionno, operatorid, transdate) values (1, '0120201201000001', 1, '2020-12-10 00:00:00');
insert into demo.transactionhead (transactionid, transactionno, operatorid, transdate) values (2, '0120201202000001', 2, '2020-12-11 00:00:00');
insert into demo.transactionhead (transactionid, transactionno, operatorid, transdate) values (3, '0120201202000002', 2, '2020-12-12 00:00:00');
select * from demo.transactionhead;

当前数据:

mysql> SELECT * FROM demo.transactionhead;
+---------------+------------------+------------+---------------------+
| transactionid | transactionno    | operatorid | transdate           |
+---------------+------------------+------------+---------------------+
|             1 | 0120201201000001 |          1 | 2020-12-10 00:00:00 |
|             2 | 0120201202000001 |          2 | 2020-12-11 00:00:00 |
|             3 | 0120201202000002 |          2 | 2020-12-12 00:00:00 |
+---------------+------------------+------------+---------------------+

mysql> SELECT * FROM demo.transactiondetails;
+---------------+------------+----------+-------+------------+
| transactionid | itemnumber | quantity | price | salesvalue |
+---------------+------------+----------+-------+------------+
|             1 |          1 |    1.000 | 89.00 |      89.00 |
|             1 |          2 |    2.000 |  5.00 |      10.00 |
|             2 |          1 |    2.000 | 89.00 |     178.00 |
|             3 |          2 |   10.000 |  5.00 |      50.00 |
+---------------+------------+----------+-------+------------+

mysql> SELECT * FROM demo.operator;
+------------+----------+--------+--------------+-------------+---------+--------------------+--------+
| operatorid | branchid | workno | operatorname | phone       | address | pid                | duty   |
+------------+----------+--------+--------------+-------------+---------+--------------------+--------+
|          1 |        1 | 001    | 张静         | 18612345678 | 北京    | 110392197501012332 | 店长   |
|          2 |        1 | 002    | 李强         | 13312345678 | 北京    | 110222199501012332 | 收银员 |
+------------+----------+--------+--------------+-------------+---------+--------------------+--------+

mysql> SELECT * FROM demo.goodsmaster;
+------------+---------+-----------+---------------+------+-----------+
| itemnumber | barcode | goodsname | specification | unit | saleprice |
+------------+---------+-----------+---------------+------+-----------+
|          1 | 0001    ||               ||     89.00 |
|          2 | 0002    ||               ||      5.00 |
+------------+---------+-----------+---------------+------+-----------+

查询:

mysql> SELECT a.transdate, c.operatorname, d.goodsname, b.quantity, b.price, b.salesvalue
    -> FROM demo.transactionhead AS a
    -> JOIN demo.transactiondetails AS b ON (a.transactionid = b.transactionid)
    -> JOIN demo.operator AS c ON (a.operatorid = c.operatorid)
    -> JOIN demo.goodsmaster AS d ON (b.itemnumber = d.itemnumber);
+---------------------+--------------+-----------+----------+-------+------------+
| transdate           | operatorname | goodsname | quantity | price | salesvalue |
+---------------------+--------------+-----------+----------+-------+------------+
| 2020-12-10 00:00:00 | 张静         ||    1.000 | 89.00 |      89.00 |
| 2020-12-10 00:00:00 | 张静         ||    2.000 |  5.00 |      10.00 |
| 2020-12-11 00:00:00 | 李强         ||    2.000 | 89.00 |     178.00 |
| 2020-12-12 00:00:00 | 李强         ||   10.000 |  5.00 |      50.00 |
+---------------------+--------------+-----------+----------+-------+------------+

如果想看看每天的销售数量和销售金额,可以按照一个字段 transdate 对数据进行分组和统计:

mysql> select a.transdate, sum(b.quantity), sum(b.salesvalue)
    -> from demo.transactionhead as a
    -> join demo.transactiondetails as b on (a.transactionid = b.transactionid)
    -> group by a.transdate;
+---------------------+-----------------+-------------------+
| transdate           | sum(b.quantity) | sum(b.salesvalue) |
+---------------------+-----------------+-------------------+
| 2020-12-10 00:00:00 |           3.000 |             99.00 |
| 2020-12-11 00:00:00 |           2.000 |            178.00 |
| 2020-12-12 00:00:00 |          10.000 |             50.00 |
+---------------------+-----------------+-------------------+

如果想看每天、每个收银员的销售数量和销售金额,就可以按 2 个字段进行分组和统计,分别是 transdate 和 operatorname:

mysql> select a.transdate, c.operatorname, sum(b.quantity), sum(b.salesvalue)
    -> from demo.transactionhead as a
    -> join demo.transactiondetails as b on (a.transactionid = b.transactionid)
    -> join demo.operator as c on (a.operatorid = c.operatorid)
    -> group by a.transdate, c.operatorid;
+---------------------+--------------+-----------------+-------------------+
| transdate           | operatorname | sum(b.quantity) | sum(b.salesvalue) |
+---------------------+--------------+-----------------+-------------------+
| 2020-12-10 00:00:00 | 张静         |           3.000 |             99.00 |
| 2020-12-11 00:00:00 | 李强         |           2.000 |            178.00 |
| 2020-12-12 00:00:00 | 李强         |          10.000 |             50.00 |
+---------------------+--------------+-----------------+-------------------+

通过对销售数据按照交易日期和收银员进行分组,再对组内数据进行求和统计,就实现了对每天、每个收银员的销售数量和销售金额的查询。

having

回到开头的超市经营者的需求:查询单笔销售金额超过 50 元的商品:

mysql> select a.goodsname
    -> from demo.goodsmaster as a
    -> join demo.transactiondetails as b on (a.itemnumber = b.itemnumber)
    -> group by a.goodsname
    -> having max(b.salesvalue) > 50;
+-----------+
| goodsname |
+-----------+
||
+-----------+
1 row in set (0.00 sec)

这种查询方式在 MySQL 里面是分四步实现的:

第一步,把流水明细表和商品信息表通过公共字段 itemnumber 连接起来,从 2 个表中获取数据:

mysql> select a.*, b.*
    -> from demo.transactiondetails as a
    -> join demo.goodsmaster as b on (a.itemnumber = b.itemnumber);
+---------------+------------+----------+-------+------------+------------+---------+-----------+---------------+------+-----------+
| transactionid | itemnumber | quantity | price | salesvalue | itemnumber | barcode | goodsname | specification | unit | saleprice |
+---------------+------------+----------+-------+------------+------------+---------+-----------+---------------+------+-----------+
|             1 |          1 |    1.000 | 89.00 |      89.00 |          1 | 0001    ||               ||     89.00 |
|             1 |          2 |    2.000 |  5.00 |      10.00 |          2 | 0002    ||               ||      5.00 |
|             2 |          1 |    2.000 | 89.00 |     178.00 |          1 | 0001    ||               ||     89.00 |
|             3 |          2 |   10.000 |  5.00 |      50.00 |          2 | 0002    ||               ||      5.00 |
+---------------+------------+----------+-------+------------+------------+---------+-----------+---------------+------+-----------+

MySQL必知必会07:条件语句where与having不同_第1张图片

第二步,把结果集按照商品名称(itemnumber)分组:

组 1: MySQL必知必会07:条件语句where与having不同_第2张图片
组 2:
MySQL必知必会07:条件语句where与having不同_第3张图片
第三步,对分组后的数据集进行筛选,把组中字段 salesvalue 的最大值 >50 的组筛选出来:
MySQL必知必会07:条件语句where与having不同_第4张图片

第四步,返回商品名称。这时就得到了需要的结果:单笔销售金额超过 50 元的商品就是“书”。

简单小结下使用 HAVING 的查询过程。

  • 首先,把所有的信息都准备好,包括从关联表中获取需要的信息,对数据集进行分组,形成一个包含所有需要的信息的数据集合
  • 接着,再通过 HAVING 条件的筛选,得到需要的数据

正确使用where和having

两者的区别:

  • 如果需要通过连接从关联表中获取需要的数据,where是先筛选后连接,而having是先连接后筛选

    这一点,就决定了在关联查询中,wherehaving更高效。因为where可以先筛选,用一个筛选后的较小数据集和关联表进行连接,这样占用的资源比较少,执行效率也就比较高。having则需要先把结果集准备好,也就是用未被筛选的数据集进行关联,然后对这个大的数据集进行筛选,这样占用的资源就比较多,执行效率也较低。

  • where可以直接使用表中的字段作为筛选条件,但不能使用分组中的计算函数作为筛选条件;having必须要与group by配合使用,可以把分组计算的函数和分组字段作为筛选条件

    这决定了,在需要对数据进行分组统计的时候,having可以完成where不能完成的任务。这是因为,在查询语法结构中,wheregroup by之前,所以无法对分组结果进行筛选。havinggroup by之后,可以使用分组字段和分组中的计算函数,对分组的结果集进行筛选,这个功能是where无法完成的。

假如超市经营者提出,要查询一下是哪个收银员、在哪天卖了 2 单商品。这种必须先分组才能筛选的查询,用 where 语句实现就比较难,我们可能要分好几步,通过把中间结果存储起来,才能搞定。但是用 having,则很轻松,代码如下:

mysql> select a.transdate, c.operatorname
    -> from demo.transactionhead as a
    -> join demo.transactiondetails as b on (a.transactionid = b.transactionid)
    -> join demo.operator as c on (a.operatorid = c.operatorid)
    -> group by a.transdate, c.operatorid
    -> having count(*) = 2;
+---------------------+--------------+
| transdate           | operatorname |
+---------------------+--------------+
| 2020-12-10 00:00:00 | 张静         |
+---------------------+--------------+

where 和 having 各自的优缺点:

优点 缺点
where 先筛选数据,再关联,执行效率高 不能使用分组中俄计算函数进行筛选
having 可以使用分组中的计算函数 在最后的结果集中进行筛选,执行效率较低

需要注意的是,where 和 having 也不是互相排斥的,我们可以在一个查询里面同时使用where 和 having。

举个例子,假设现在有一组销售数据,包括交易时间、收银员、商品名称、销售数量、价格和销售金额等信息,超市的经营者要查询“2020-12-10”和“2020-12-11”这两天收银金额超过 100 元的销售日期、收银员名称、销售数量和销售金额。

mysql> select a.transdate, c.operatorname, d.goodsname, b.quantity, b.price, b.salesvalue
    -> from demo.transactionhead as a
    -> join demo.transactiondetails as b on (b.transactionid = a.transactionid)
    -> join demo.operator as c on (c.operatorid = a.operatorid)
    -> join demo.goodsmaster as d on (d.itemnumber = b.itemnumber);
+---------------------+--------------+-----------+----------+-------+------------+
| transdate           | operatorname | goodsname | quantity | price | salesvalue |
+---------------------+--------------+-----------+----------+-------+------------+
| 2020-12-10 00:00:00 | 张静         ||    1.000 | 89.00 |      89.00 |
| 2020-12-10 00:00:00 | 张静         ||    2.000 |  5.00 |      10.00 |
| 2020-12-11 00:00:00 | 李强         ||    2.000 | 89.00 |     178.00 |
| 2020-12-12 00:00:00 | 李强         ||   10.000 |  5.00 |      50.00 |
+---------------------+--------------+-----------+----------+-------+------------+

分析需求:由于是要按照销售日期和收银员进行统计,所以,必须按照销售日期和收银员进行分组,因此,我们可以通过使用group byhaving进行查询:

mysql> select a.transdate, c.operatorname, sum(b.quantity), sum(b.salesvalue)
    -> from demo.transactionhead as a
    -> join demo.transactiondetails as b on (b.transactionid = a.transactionid)
    -> join demo.operator as c on (c.operatorid = a.operatorid)
    -> group by a.transdate, c.operatorname
    -> having a.transdate in ('2020-12-10' , '2020-12-11') and sum(b.salesvalue) > 100;
+---------------------+--------------+-----------------+-------------------+
| transdate           | operatorname | sum(b.quantity) | sum(b.salesvalue) |
+---------------------+--------------+-----------------+-------------------+
| 2020-12-11 00:00:00 | 李强         |           2.000 |            178.00 |
+---------------------+--------------+-----------------+-------------------+

其实having后面的筛选条件,就会发现,条件 a.transdate IN ('2020-12-10' , '2020-12-11'),其实可以用where来限定:

mysql> select a.transdate, c.operatorname, sum(b.quantity), sum(b.salesvalue)
    -> from demo.transactionhead as a
    -> join demo.transactiondetails as b on (b.transactionid = a.transactionid)
    -> join demo.operator as c on (c.operatorid = a.operatorid)
    -> where a.transdate in ('2020-12-10' , '2020-12-11')  -- 先按日期筛选
    -> group by a.transdate, c.operatorname
    -> having sum(b.salesvalue) > 100;  -- 后按金额筛选
+---------------------+--------------+-----------------+-------------------+
| transdate           | operatorname | sum(b.quantity) | sum(b.salesvalue) |
+---------------------+--------------+-----------------+-------------------+
| 2020-12-11 00:00:00 | 李强         |           2.000 |            178.00 |
+---------------------+--------------+-----------------+-------------------+

得到了需要的结果,这是因为把条件拆分开,包含分组统计函数的条件用having,普通条件用where。这就既利用了where条件的高效快速,又发挥了having可以使用包含分组统计函数的查询条件的优点。当数据量特别大的时候,运行效率会有很大的差别。

你可能感兴趣的:(MySQL,mysql,数据库,database)