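GROUP BY
Combining GROUP BY with aggregate functions produces per-group totals. For example, to show the total salary of each department, use the following statement.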
SELECT department_number
,SUM (salary_amount)
FROM employee
GROUP BY department_number;
Result:
department_number  Sum(salary_amount)
              401            74150.00
              403            80900.00
              301            58700.00
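Note that every column in the SELECT clause that is not aggregated must also appear in the GROUP BY clause; otherwise the following error is returned:
ERROR: 3504 Selected non-aggregate values must be part of the associated group.
In the example above, department_number is not aggregated, so it must appear in the GROUP BY clause. Keep this basic rule firmly in mind.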
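A minimal sketch of how this rule plays out, assuming the employee table also has the job_code column used later in this section. The first statement should be rejected with error 3504, because job_code is selected without being aggregated or grouped; the other two statements are two possible fixes.

```sql
-- Rejected: job_code is neither aggregated nor listed in GROUP BY.
SELECT department_number
,job_code
,SUM (salary_amount)
FROM employee
GROUP BY department_number;

-- Fix 1: add job_code to GROUP BY (one row per department and job code).
SELECT department_number
,job_code
,SUM (salary_amount)
FROM employee
GROUP BY department_number, job_code;

-- Fix 2: drop job_code from the SELECT clause (one row per department).
SELECT department_number
,SUM (salary_amount)
FROM employee
GROUP BY department_number;
```

Which fix is appropriate depends on the level of detail the report needs.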
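The WHERE Clause and the GROUP BY Clause
When WHERE and GROUP BY are used together, GROUP BY groups and aggregates only the rows that satisfy the WHERE condition. In other words, the WHERE clause removes the rows that do not qualify before the actual aggregation takes place.
Example: what is the total salary of departments 401 and 403?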
SELECT department_number
,SUM (salary_amount)
FROM employee
WHERE department_number IN (401, 403)
GROUP BY department_number
;
Result:
department_number  Sum(salary_amount)
              403            80900.00
              401            74150.00
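The filter does not have to be on the grouping column. As a sketch, the statement below restricts the rows by salary before grouping; the 30000 threshold is an arbitrary value chosen for illustration. Rows at or below the threshold are removed first, so they contribute to neither the count nor the sum of their department, and a department with no remaining rows does not appear in the result at all.

```sql
SELECT department_number
,COUNT (*)
,SUM (salary_amount)
FROM employee
WHERE salary_amount > 30000   -- hypothetical threshold, applied row by row before grouping
GROUP BY department_number;
```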
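GROUP BY and ORDER BY
Adding ORDER BY after GROUP BY makes the grouped results appear in a specified order. For example, to show the headcount, total salary, highest salary, lowest salary, and average salary of each department in department-number order, use the following SQL statement: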
SELECT department_number (TITLE 'DEPT')
,COUNT (*) (TITLE '#_EMPS')
,SUM (salary_amount) (TITLE 'TOTAL')
(FORMAT 'zz,zzz,zz9.99')
,MAX (salary_amount) (TITLE 'HIGHEST')
(FORMAT 'zz,zzz,zz9.99')
,MIN (salary_amount) (TITLE 'LOWEST')
(FORMAT 'zz,zzz,zz9.99')
,AVG (salary_amount) (TITLE 'AVERAGE')
(FORMAT 'zz,zzz,zz9.99')
FROM employee
GROUP BY department_number
ORDER BY department_number
;
The result is:
DEPT  #_EMPS       TOTAL    HIGHEST     LOWEST    AVERAGE
 301       3  116,400.00  57,700.00  29,250.00  38,800.00
 401       7  245,575.00  46,000.00  24,500.00  35,082.14
 403       6  233,000.00  49,700.00  31,000.00  38,833.33
The SQL statement above can also be written as:
SELECT department_number AS DEPT
,COUNT (*) AS #_EMPS
,CAST ( SUM (salary_amount) AS FORMAT 'zz,zzz,zz9.99')
AS TOTAL
,CAST ( MAX (salary_amount) AS FORMAT 'zz,zzz,zz9.99')
AS HIGHEST
,CAST ( MIN (salary_amount) AS FORMAT 'zz,zzz,zz9.99')
AS LOWEST
,CAST ( AVG (salary_amount) AS FORMAT 'zz,zzz,zz9.99')
AS _AVERAGE
FROM employee
GROUP BY department_number
ORDER BY department_number;
Because AVERAGE is itself a keyword, the example above prefixes the alias with an underscore to tell it apart.
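ORDER BY can also sort on an aggregated column rather than on the grouping column. The sketch below is an illustration, not one of the original examples: it lists departments from the highest to the lowest total salary. The alias total_salary is a name introduced here.

```sql
SELECT department_number
,SUM (salary_amount) AS total_salary
FROM employee
GROUP BY department_number
ORDER BY total_salary DESC;   -- sort on the aggregate, largest total first
```

Writing ORDER BY 2 DESC, which refers to the second column of the SELECT clause by position, should give the same ordering.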
When grouping on several columns, GROUP BY produces only a single level of summary. For example, total the salaries of departments 401 and 403 by job code.
SELECT department_number
,job_code
,SUM (salary_amount)
FROM employee
WHERE department_number IN (401, 403)
GROUP BY department_number, job_code
ORDER BY 1, 2;
Result:
department_number  job_code  SUM(salary_amount)
              401    411100            37850.00
              401    412101           107825.00
              401    412102            56800.00
              401    413201            43100.00
              403    431100            31200.00
              403    432101           201800.00
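As this example shows, when GROUP BY lists several columns it still produces only one level of summary: one row for each distinct combination of the columns, which here means one row per job_code within each department. No department-level subtotal is produced.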
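If the department-level totals are needed as well, one simple option is a second statement that groups by department_number alone, sketched below. Going by the figures in this section, the job-code rows add up to the department totals shown earlier in the TOTAL column: 37850 + 107825 + 56800 + 43100 = 245575 for department 401, and 31200 + 201800 = 233000 for department 403.

```sql
-- Coarser level: one row per department instead of one row per job code.
SELECT department_number
,SUM (salary_amount)
FROM employee
WHERE department_number IN (401, 403)
GROUP BY department_number
ORDER BY 1;
```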
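GROUP BY and the HAVING Condition
The HAVING clause is used together with GROUP BY to filter the grouped results: only the groups that satisfy its condition are returned.
For example, show the headcount, total salary, highest salary, lowest salary, and average salary of each department in department-number order, but only for departments whose average salary is below 36000.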
SELECT department_number (TITLE 'DEPT')
,COUNT (*) (TITLE '#_EMPS')
,SUM (salary_amount) (TITLE 'TOTAL')
(FORMAT 'zz,zzz,zz9.99')
,MAX (salary_amount) (TITLE 'HIGHEST')
(FORMAT 'zz,zzz,zz9.99')
,MIN (salary_amount) (TITLE 'LOWEST')
(FORMAT 'zz,zzz,zz9.99')
,AVG (salary_amount) (TITLE 'AVERAGE')
(FORMAT 'zz,zzz,zz9.99')
FROM employee
GROUP BY department_number
HAVING AVG (salary_amount) < 36000
ORDER BY department_number;
Result:
DEPT  #_EMPS       TOTAL    HIGHEST     LOWEST    AVERAGE
 401       7  245,575.00  46,000.00  24,500.00  35,082.14
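HAVING and WHERE can appear in the same statement, each doing its own job. In the sketch below, the WHERE clause first keeps only the rows of departments 401 and 403, and HAVING then keeps only the groups whose average salary is below 36000. A condition on an aggregate such as AVG cannot be moved into the WHERE clause, because WHERE is evaluated on individual rows before any group exists.

```sql
SELECT department_number
,COUNT (*)
,AVG (salary_amount)
FROM employee
WHERE department_number IN (401, 403)   -- row-level filter, applied before grouping
GROUP BY department_number
HAVING AVG (salary_amount) < 36000      -- group-level filter, applied after aggregation
ORDER BY department_number;
```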
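GROUP BY Summary
When performing grouped aggregation, pay particular attention to the following points:
1. WHERE: restricts which rows of the table take part in the grouping and aggregation; only rows that satisfy the condition are selected.
2. GROUP BY: groups the rows that satisfy the WHERE clause.
3. HAVING: restricts which aggregated groups are returned.
4. ORDER BY: specifies the output order of the result.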
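The four clauses can be seen working together in one statement. The sketch below is an illustration only: the aliases emp_count and total_salary and the COUNT (*) > 1 condition are assumptions introduced here, and the numbered comments follow the order of the points above.

```sql
SELECT department_number
,job_code
,COUNT (*) AS emp_count
,SUM (salary_amount) AS total_salary
FROM employee
WHERE department_number IN (401, 403)   -- 1. keep only the rows of these departments
GROUP BY department_number, job_code    -- 2. group the remaining rows
HAVING COUNT (*) > 1                    -- 3. keep only groups with more than one employee
ORDER BY total_salary DESC;             -- 4. sort the surviving groups for output
```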