SQL-92 and earlier do not permit queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL-92 because the nonaggregated name column in the select list does not appear in the GROUP BY:
# In SQL-92 and earlier, a column that is not in the GROUP BY list can appear in the select list, HAVING condition, or ORDER BY list only inside an aggregate function.
SELECT o.custid, c.name, MAX(o.payment)
FROM orders AS o, customers AS c
WHERE o.custid = c.custid
GROUP BY o.custid;
For the query to be legal in SQL-92, the name column must be omitted from the select list or named in the GROUP BY clause.
# The statement above is rejected under SQL-92 because name appears in the select list without an aggregate function, yet it is not a grouping column.
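The SQL-92-compliant rewrite can be tried out with a small sketch. It uses Python's bundled sqlite3 module as a stand-in for MySQL. The table layouts and sample rows are hypothetical, and SQLite's own GROUP BY rules differ from MySQL's, so this only illustrates the rewritten query and not the ONLY_FULL_GROUP_BY check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, chosen only for illustration.
conn.executescript("""
    CREATE TABLE customers (custid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (orderid INTEGER PRIMARY KEY, custid INTEGER, payment INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 50), (11, 1, 75), (12, 2, 20);
""")

# SQL-92-compliant form: every nonaggregated select-list column
# (o.custid and c.name) is also named in GROUP BY.
rows = conn.execute("""
    SELECT o.custid, c.name, MAX(o.payment)
    FROM orders AS o, customers AS c
    WHERE o.custid = c.custid
    GROUP BY o.custid, c.name
    ORDER BY o.custid
""").fetchall()

print(rows)  # [(1, 'Ann', 75), (2, 'Bob', 20)]
```

Adding c.name to the GROUP BY list does not change the groups here, because each custid has exactly one name.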
SQL:1999 and later permits such nonaggregates per optional feature T301 if they are functionally dependent on GROUP BY columns: If such a relationship exists between name and custid, the query is legal. This would be the case, for example, were custid a primary key of customers.
# In SQL:1999 and later, the statement above is legal and raises no error if name in the select list is functionally dependent on custid in the GROUP BY list (for example, when custid is the primary key of customers).
For the precise definition of functional dependency, see:
Functional dependency
MySQL 5.7.5 and up implements detection of functional dependence. If the ONLY_FULL_GROUP_BY SQL mode is enabled (which it is by default), MySQL rejects queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor functionally dependent on GROUP BY columns. (Before 5.7.5, MySQL does not detect functional dependency and ONLY_FULL_GROUP_BY is not enabled by default. For a description of pre-5.7.5 behavior, see the MySQL 5.6 Reference Manual.)
# MySQL 5.7.5 and later implement functional dependence detection. With ONLY_FULL_GROUP_BY enabled (the default in MySQL 5.7), a column that is neither in the GROUP BY list nor functionally dependent on the GROUP BY columns cannot appear in the select list, HAVING condition, or ORDER BY list unless it is aggregated.
MySQL 5.7.5 and later also permits a nonaggregate column not named in a GROUP BY clause when ONLY_FULL_GROUP_BY SQL mode is enabled, provided that this column is limited to a single value, as shown in the following example:
# In MySQL 5.7.5 and later, a column that is not in the GROUP BY list and is not functionally dependent on it can still appear in the select list if it is guaranteed to hold a single value, for example:
mysql> CREATE TABLE mytable (
-> id INT UNSIGNED NOT NULL PRIMARY KEY,
-> a VARCHAR(10),
-> b INT
-> );
mysql> INSERT INTO mytable
-> VALUES (1, 'abc', 1000),
-> (2, 'abc', 2000),
-> (3, 'def', 4000);
mysql> SET SESSION sql_mode = sys.list_add(@@session.sql_mode, 'ONLY_FULL_GROUP_BY');
mysql> SELECT a, SUM(b) FROM mytable WHERE a = 'abc';
+------+--------+
| a    | SUM(b) |
+------+--------+
| abc  |   3000 |
+------+--------+
It is also possible to have more than one nonaggregate column in the SELECT list when employing ONLY_FULL_GROUP_BY. In this case, every such column must be limited to a single value, and all such limiting conditions must be joined by logical AND, as shown here:
mysql> DROP TABLE IF EXISTS mytable;
mysql> CREATE TABLE mytable (
-> id INT UNSIGNED NOT NULL PRIMARY KEY,
-> a VARCHAR(10),
-> b VARCHAR(10),
-> c INT
-> );
mysql> INSERT INTO mytable
-> VALUES (1, 'abc', 'qrs', 1000),
-> (2, 'abc', 'tuv', 2000),
-> (3, 'def', 'qrs', 4000),
-> (4, 'def', 'tuv', 8000),
-> (5, 'abc', 'qrs', 16000),
-> (6, 'def', 'tuv', 32000);
mysql> SELECT @@session.sql_mode;
+---------------------------------------------------------------+
| @@session.sql_mode |
+---------------------------------------------------------------+
| ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION |
+---------------------------------------------------------------+
mysql> SELECT a, b, SUM(c) FROM mytable
-> WHERE a = 'abc' AND b = 'qrs';
+------+------+--------+
| a    | b    | SUM(c) |
+------+------+--------+
| abc  | qrs  |  17000 |
+------+------+--------+
If ONLY_FULL_GROUP_BY is disabled, a MySQL extension to the standard SQL use of GROUP BY permits the select list, HAVING condition, or ORDER BY list to refer to nonaggregated columns even if the columns are not functionally dependent on GROUP BY columns. This causes MySQL to accept the preceding query. In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are nondeterministic, which is probably not what you want. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Result set sorting occurs after values have been chosen, and ORDER BY does not affect which value within each group the server chooses. Disabling ONLY_FULL_GROUP_BY is useful primarily when you know that, due to some property of the data, all values in each nonaggregated column not named in the GROUP BY are the same for each group.
# With ONLY_FULL_GROUP_BY disabled, columns that are neither in the GROUP BY list nor functionally dependent on it may appear in the select list, HAVING condition, and ORDER BY list. The server is free to pick any value from each group, so the chosen values are nondeterministic unless they are all the same. Adding an ORDER BY clause does not change which value is picked, because the result set is sorted after the values have been chosen.
You can achieve the same effect without disabling ONLY_FULL_GROUP_BY by using ANY_VALUE() to refer to the nonaggregated column.
# Using ANY_VALUE() gives the same effect as disabling ONLY_FULL_GROUP_BY.
The following discussion demonstrates functional dependence, the error message MySQL produces when functional dependence is absent, and ways of causing MySQL to accept a query in the absence of functional dependence.
# The examples below illustrate functional dependence.
This query might be invalid with ONLY_FULL_GROUP_BY enabled because the nonaggregated address column in the select list is not named in the GROUP BY clause:
# With ONLY_FULL_GROUP_BY enabled, the following query may fail because the address column in the select list is not in the GROUP BY list.
SELECT name, address, MAX(age) FROM t GROUP BY name;
The query is valid if name is a primary key of t or is a unique NOT NULL column. In such cases, MySQL recognizes that the selected column is functionally dependent on a grouping column. For example, if name is a primary key, its value determines the value of address because each group has only one value of the primary key and thus only one row. As a result, there is no randomness in the choice of address value in a group and no need to reject the query.
# If name is the primary key of t, or has a unique index and is NOT NULL, the query above is legal because address is functionally dependent on name: each group then contains exactly one name value, which can correspond to only one address value.
The query is invalid if name is not a primary key of t or a unique NOT NULL column. In this case, no functional dependency can be inferred and an error occurs:
# With ONLY_FULL_GROUP_BY enabled and address not functionally dependent on name, the query fails with the following error:
mysql> SELECT name, address, MAX(age) FROM t GROUP BY name;
ERROR 1055 (42000): Expression #2 of SELECT list is not in GROUP
BY clause and contains nonaggregated column 'mydb.t.address' which
is not functionally dependent on columns in GROUP BY clause; this
is incompatible with sql_mode=only_full_group_by
If you know that, for a given data set, each name value in fact uniquely determines the address value, address is effectively functionally dependent on name. To tell MySQL to accept the query, you can use the ANY_VALUE() function:
# ANY_VALUE() suppresses the ONLY_FULL_GROUP_BY check for that column, but the address chosen for each group is then nondeterministic from one query to the next. Use it only when you can guarantee that address is the same within every group.
SELECT name, ANY_VALUE(address), MAX(age) FROM t GROUP BY name;
Alternatively, disable ONLY_FULL_GROUP_BY.
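Before relying on ANY_VALUE() or disabling ONLY_FULL_GROUP_BY, it may be worth confirming that the data really has the property you are assuming. The following is a minimal sketch of such a check on rows already fetched into Python. The helper name is_functionally_dependent and the sample rows are hypothetical. It inspects only the data at hand, whereas MySQL infers functional dependence from constraints such as primary keys, so a passing check is no guarantee about future rows.

```python
def is_functionally_dependent(rows, determinant, dependent):
    """Return True if every determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key = row[determinant]
        value = row[dependent]
        # setdefault stores the first value seen for this key and returns it;
        # any later, different value breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data: each name has a single address.
consistent = [
    {"name": "Ann", "address": "1 Main St", "age": 30},
    {"name": "Ann", "address": "1 Main St", "age": 41},
    {"name": "Bob", "address": "9 Oak Ave", "age": 25},
]
# Same data plus a second, different address for Bob.
conflicting = consistent + [{"name": "Bob", "address": "2 Elm Rd", "age": 52}]

print(is_functionally_dependent(consistent, "name", "address"))   # True
print(is_functionally_dependent(conflicting, "name", "address"))  # False
```

In the second data set, ANY_VALUE(address) could return either of Bob's addresses.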
The preceding example is quite simple, however. In particular, it is unlikely you would group on a single primary key column because every group would contain only one row. For additional examples demonstrating functional dependence in more complex queries, see Section 12.20.4, “Detection of Functional Dependence”.
If a query has aggregate functions and no GROUP BY clause, it cannot have nonaggregated columns in the select list, HAVING condition, or ORDER BY list with ONLY_FULL_GROUP_BY enabled:
# With ONLY_FULL_GROUP_BY enabled, a SELECT that uses aggregate functions but has no GROUP BY cannot have nonaggregated columns in the select list, HAVING condition, or ORDER BY list.
mysql> SELECT name, MAX(age) FROM t;
ERROR 1140 (42000): In aggregated query without GROUP BY, expression
#1 of SELECT list contains nonaggregated column 'mydb.t.name'; this
is incompatible with sql_mode=only_full_group_by
Without GROUP BY, there is a single group and it is nondeterministic which name value to choose for the group. Here, too, ANY_VALUE() can be used, if it is immaterial which name value MySQL chooses:
# As described earlier, ANY_VALUE() avoids the error above.
SELECT ANY_VALUE(name), MAX(age) FROM t;
In MySQL 5.7.5 and higher, ONLY_FULL_GROUP_BY also affects handling of queries that use DISTINCT and ORDER BY. Consider the case of a table t with three columns c1, c2, and c3 that contains these rows:
# In MySQL 5.7.5 and higher, ONLY_FULL_GROUP_BY also affects queries that combine DISTINCT and ORDER BY.
c1 c2 c3
1 2 A
3 4 B
1 2 C
Suppose that we execute the following query, expecting the results to be ordered by c3:
SELECT DISTINCT c1, c2 FROM t ORDER BY c3;
To order the result, duplicates must be eliminated first. But to do so, should we keep the first row or the third? This arbitrary choice influences the retained value of c3, which in turn influences ordering and makes it arbitrary as well. To prevent this problem, a query that has DISTINCT and ORDER BY is rejected as invalid if any ORDER BY expression does not satisfy at least one of these conditions:
# Here we deduplicate on c1 and c2 and sort the result by c3. Deduplication has to happen before sorting. The first and third rows are duplicates, so only one is kept, but should its c3 value be A or C? Because the choice of sort value is arbitrary, so is the resulting order. To avoid this, a query that uses both DISTINCT and ORDER BY is rejected unless every ORDER BY expression satisfies at least one of the following two conditions.
The expression is equal to one in the select list # that is, the sort uses an expression that appears in the select list
All columns referenced by the expression and belonging to the query's selected tables are elements of the select list # that is, every column the ORDER BY expression refers to is in the select list
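The ambiguity can be made concrete with a small simulation of the three rows above. This is only a sketch in plain Python, not a model of how MySQL executes the query. The function distinct_then_order and its keep parameter are hypothetical names introduced for illustration.

```python
rows = [(1, 2, "A"), (3, 4, "B"), (1, 2, "C")]  # (c1, c2, c3)


def distinct_then_order(rows, keep):
    """Deduplicate on (c1, c2), keeping the 'first' or 'last' duplicate, then sort by c3."""
    kept = {}
    for c1, c2, c3 in rows:
        # With keep="last" a later duplicate overwrites the earlier c3;
        # with keep="first" the earlier c3 is retained.
        if keep == "last" or (c1, c2) not in kept:
            kept[(c1, c2)] = c3
    ordered = sorted(kept.items(), key=lambda item: item[1])
    return [pair for pair, _ in ordered]


keep_first = distinct_then_order(rows, "first")
keep_last = distinct_then_order(rows, "last")
print(keep_first)  # [(1, 2), (3, 4)]
print(keep_last)   # [(3, 4), (1, 2)]
```

Both outputs are valid answers to SELECT DISTINCT c1, c2, yet their order differs depending on which duplicate supplied c3.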
Another MySQL extension to standard SQL permits references in the HAVING clause to aliased expressions in the select list. For example, the following query returns name values that occur only once in table orders:
# Another MySQL extension to standard SQL allows aliases in the HAVING condition.
SELECT name, COUNT(name) FROM orders
GROUP BY name
HAVING COUNT(name) = 1;
The MySQL extension permits the use of an alias in the HAVING clause for the aggregated column:
SELECT name, COUNT(name) AS c FROM orders
GROUP BY name
HAVING c = 1;
Note
Before MySQL 5.7.5, enabling ONLY_FULL_GROUP_BY disables this extension, thus requiring the HAVING clause to be written using unaliased expressions.
# Note: before 5.7.5, enabling ONLY_FULL_GROUP_BY disables this extension, so aliases cannot be used in the HAVING clause.
Standard SQL permits only column expressions in GROUP BY clauses, so a statement such as this is invalid because FLOOR(value/100) is a noncolumn expression:
# Standard SQL allows only columns in GROUP BY, so the following statement is invalid in standard SQL:
SELECT id, FLOOR(value/100)
FROM tbl_name
GROUP BY id, FLOOR(value/100);
MySQL extends standard SQL to permit noncolumn expressions in GROUP BY clauses and considers the preceding statement valid.
# MySQL extends standard SQL here too, so GROUP BY id, FLOOR(value/100) above is a legal statement.
Standard SQL also does not permit aliases in GROUP BY clauses. MySQL extends standard SQL to permit aliases, so another way to write the query is as follows:
# Standard SQL does not allow aliases in GROUP BY; the MySQL extension does, so the example above can also be written as follows:
SELECT id, FLOOR(value/100) AS val
FROM tbl_name
GROUP BY id, val;
The alias val is considered a column expression in the GROUP BY clause.
In the presence of a noncolumn expression in the GROUP BY clause, MySQL recognizes equality between that expression and expressions in the select list. This means that with ONLY_FULL_GROUP_BY SQL mode enabled, the query containing GROUP BY id, FLOOR(value/100) is valid because that same FLOOR() expression occurs in the select list. However, MySQL does not try to recognize functional dependence on GROUP BY noncolumn expressions, so the following query is invalid with ONLY_FULL_GROUP_BY enabled, even though the third selected expression is a simple formula of the id column and the FLOOR() expression in the GROUP BY clause:
# MySQL does not check functional dependence on noncolumn expressions in GROUP BY, so the following statement is invalid:
SELECT id, FLOOR(value/100), id+FLOOR(value/100)
FROM tbl_name
GROUP BY id, FLOOR(value/100);
A workaround is to use a derived table:
# The statement above can be rewritten as follows:
SELECT id, F, id+F
FROM
(SELECT id, FLOOR(value/100) AS F
FROM tbl_name
GROUP BY id, FLOOR(value/100)) AS dt;
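To see what the derived-table form returns, here is a hedged sketch that runs the same query through Python's sqlite3 module rather than MySQL. The column types and sample rows for tbl_name are assumptions. FLOOR is registered from Python's math.floor so the example does not depend on SQLite's optional math functions, and an ORDER BY is added only to make the printed output stable.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Register FLOOR explicitly; this is a stand-in for MySQL's built-in FLOOR().
conn.create_function("FLOOR", 1, math.floor)
# Hypothetical table contents, chosen only for illustration.
conn.executescript("""
    CREATE TABLE tbl_name (id INTEGER, value REAL);
    INSERT INTO tbl_name VALUES (1, 150), (1, 180), (1, 250), (2, 90);
""")

# The inner query groups on the noncolumn expression and names it F;
# the outer query can then combine id and F freely.
result = conn.execute("""
    SELECT id, F, id+F
    FROM
      (SELECT id, FLOOR(value/100) AS F
       FROM tbl_name
       GROUP BY id, FLOOR(value/100)) AS dt
    ORDER BY id, F
""").fetchall()

print(result)  # [(1, 1, 2), (1, 2, 3), (2, 0, 2)]
```

The outer query sees F as an ordinary column of dt, so computing id+F no longer requires MySQL to reason about the FLOOR() expression.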