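Definition:
Aggregate functions are functions that take a collection (a set or multiset) of values as input and return a single value.
In SQL the input is typically the values of one attribute across a set of tuples, and the output is one value for that set.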
– Average: avg
– Minimum: min
– Maximum: max
– Total: sum
– Count: count
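By default, duplicates are retained. Add distinct to eliminate them before aggregating: for example, count(*) counts every tuple, duplicates included, whereas count(distinct teaches.semester) counts each semester value only once.
E.g. “Find the total number of instructors who teach a course in the Spring 2010 semester.”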
select count(distinct ID)
from teaches
where semester = 'Spring' and year = 2010;
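The effect of distinct in this query can be checked with a small sketch using Python's built-in sqlite3 module. The rows below are made-up sample data, not taken from any real schema instance, and only the teaches columns needed here are modelled.

```python
import sqlite3

# Hypothetical sample data; only the columns used by the query are modelled.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table teaches (ID text, course_id text, semester text, year integer)"
)
conn.executemany(
    "insert into teaches values (?, ?, ?, ?)",
    [
        ("10101", "CS-101", "Spring", 2010),
        ("10101", "CS-315", "Spring", 2010),  # same instructor, second course
        ("45565", "CS-319", "Spring", 2010),
        ("83821", "CS-190", "Fall", 2009),
    ],
)

where = "where semester = 'Spring' and year = 2010"
# count(*) counts matching tuples; count(distinct ID) counts instructors.
rows_matched = conn.execute(f"select count(*) from teaches {where}").fetchone()[0]
instructors = conn.execute(
    f"select count(distinct ID) from teaches {where}"
).fetchone()[0]
print(rows_matched, instructors)  # 3 2
```

Instructor 10101 teaches two Spring 2010 courses, so the two counts differ by one.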
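We use the aggregate function count frequently to count the number of tuples in a relation. More generally, aggregate functions are applied after grouping to compute, for each group, the total (sum), the average (avg), the number of tuples (count), and so on. Note that sum and avg require numeric input such as int or double; they cannot be applied to strings or other non-numeric types, whereas min, max, and count can.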
The number of tuples in a relation is computed with count(*):
select count(*)
from course;
SQL does not allow the use of distinct with count(*). It is legal to use distinct with max and min on an attribute, for example max(distinct salary) or min(distinct salary), even though the result does not change.
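A minimal sketch of these two rules, again using Python's sqlite3 with invented rows. SQLite stands in for "SQL" here; other systems may word the error differently. sum is included only as a contrast, to show an aggregate where distinct does change the result.

```python
import sqlite3

# Hypothetical instructor rows; two instructors share the same salary.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor (ID text, name text, dept_name text, salary real)"
)
conn.executemany(
    "insert into instructor values (?, ?, ?, ?)",
    [
        ("1", "A", "Physics", 80000),
        ("2", "B", "Physics", 80000),
        ("3", "C", "Music", 40000),
    ],
)

plain = conn.execute(
    "select max(salary), min(salary), sum(salary) from instructor"
).fetchone()
dedup = conn.execute(
    "select max(distinct salary), min(distinct salary), sum(distinct salary) "
    "from instructor"
).fetchone()
print(plain)  # (80000.0, 40000.0, 200000.0)
print(dedup)  # (80000.0, 40000.0, 120000.0)

# count(distinct *) is rejected by the SQLite parser.
try:
    conn.execute("select count(distinct *) from instructor")
    count_distinct_star_ok = True
except sqlite3.OperationalError:
    count_distinct_star_ok = False
print(count_distinct_star_ok)  # False
```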
Grouping:
E.g. “Find the average salary in each department.”
select dept_name, avg(salary) as avg_salary
from instructor
group by dept_name;
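If the set of tuples that group by would hand to avg for some group is empty, that group is simply left out of the result; it contributes to neither the numerator nor the denominator of any average. For example, a department with no instructors produces no row in the query above.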
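The grouping query can be tried out with the following sketch. The department and instructor rows are invented for illustration, and the order by clause is an addition that only makes the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table department (dept_name text)")
conn.execute(
    "create table instructor (ID text, name text, dept_name text, salary real)"
)
conn.executemany(
    "insert into department values (?)",
    [("Comp. Sci.",), ("Music",), ("Physics",)],
)
conn.executemany(
    "insert into instructor values (?, ?, ?, ?)",
    [
        ("1", "A", "Comp. Sci.", 65000),
        ("2", "B", "Comp. Sci.", 75000),
        ("3", "C", "Physics", 95000),
    ],
)  # Music deliberately has no instructors

avg_by_dept = conn.execute(
    "select dept_name, avg(salary) as avg_salary "
    "from instructor group by dept_name order by dept_name"
).fetchall()
print(avg_by_dept)  # [('Comp. Sci.', 70000.0), ('Physics', 95000.0)]

# Departments that form no group at all, because no instructor row names them.
all_depts = [
    r[0] for r in conn.execute("select dept_name from department order by dept_name")
]
missing = [d for d in all_depts if d not in dict(avg_by_dept)]
print(missing)  # ['Music']
```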
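A rule for group by:
Any attribute that appears in the select clause without being aggregated must also appear in the group by clause.
The having clause filters the groups produced by group by: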
Try: Find the names and average salaries of all departments whose average salary is greater than 42000
select dept_name, avg(salary)
from instructor
group by dept_name
having avg(salary) > 42000;
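The sketch below runs the having query on invented data and, as a contrast that is not part of the exercise, a variant that uses where instead. The point it tries to show is that having tests each group after aggregation, while where tests each tuple before grouping.

```python
import sqlite3

# Hypothetical rows: History has one high salary but a low average.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor (ID text, name text, dept_name text, salary real)"
)
conn.executemany(
    "insert into instructor values (?, ?, ?, ?)",
    [
        ("1", "A", "Comp. Sci.", 65000),
        ("2", "B", "Comp. Sci.", 75000),
        ("3", "C", "History", 60000),
        ("4", "D", "History", 20000),
        ("5", "E", "Music", 40000),
    ],
)

# having: keep only groups whose average exceeds 42000.
with_having = conn.execute(
    "select dept_name, avg(salary) from instructor "
    "group by dept_name having avg(salary) > 42000 order by dept_name"
).fetchall()
print(with_having)  # [('Comp. Sci.', 70000.0)]

# where: drop individual tuples first, then average what is left.
with_where = conn.execute(
    "select dept_name, avg(salary) from instructor "
    "where salary > 42000 group by dept_name order by dept_name"
).fetchall()
print(with_where)  # [('Comp. Sci.', 70000.0), ('History', 60000.0)]
```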
Note: all aggregate functions except count(*) ignore null values in their input collection.
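A quick way to see the null rule in action, assuming a made-up instructor table in which one salary is unknown:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor (ID text, name text, dept_name text, salary real)"
)
conn.executemany(
    "insert into instructor values (?, ?, ?, ?)",
    [
        ("1", "A", "Physics", 60000),
        ("2", "B", "Physics", 80000),
        ("3", "C", "Physics", None),  # null salary
    ],
)

tuples, salaries, total, average = conn.execute(
    "select count(*), count(salary), sum(salary), avg(salary) from instructor"
).fetchone()
# count(*) sees all three tuples; the others skip the null salary.
print(tuples, salaries, total, average)  # 3 2 140000.0 70000.0
```

The average is 140000 / 2 rather than 140000 / 3, because the null is dropped from both the sum and the count.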