DISTINCT and GROUP BY

It occurred to me today that, logically, DISTINCT can be replaced by GROUP BY.

Take the following two queries as an example.

SELECT DISTINCT type FROM question;
type
choice
single_choice
uncertain_choice
determine
fill
essay
material
SELECT type FROM question GROUP BY type;
type
choice
single_choice
uncertain_choice
determine
fill
essay
material

The results are exactly the same.
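The equivalence of the two result sets can be checked mechanically. Here is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL and a hypothetical handful of rows in the question table. It only compares the values returned and says nothing about how MySQL executes either query.

```python
import sqlite3

def distinct_vs_group_by(types):
    """Return (DISTINCT result, GROUP BY result) for the type column."""
    # In-memory SQLite database: an assumption for illustration, not MySQL.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE question (id INTEGER PRIMARY KEY, type TEXT)")
    conn.executemany("INSERT INTO question (type) VALUES (?)",
                     [(t,) for t in types])
    distinct = [row[0] for row in
                conn.execute("SELECT DISTINCT type FROM question")]
    grouped = [row[0] for row in
               conn.execute("SELECT type FROM question GROUP BY type")]
    conn.close()
    return distinct, grouped

# Hypothetical sample data containing duplicates.
sample_types = ["choice", "fill", "choice", "essay", "fill", "material"]
distinct, grouped = distinct_vs_group_by(sample_types)

# Compare as sorted lists, because neither query promises a row order.
print(sorted(distinct) == sorted(grouped))
```

The comparison sorts both lists first, so it tests only that the same values come back, not the order they come back in.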

Running EXPLAIN on both shows that the execution plans are identical: DISTINCT uses the index directly to do a GROUP BY.

EXPLAIN SELECT DISTINCT type FROM question;
id  select_type  table     type   possible_keys  key   key_len  ref   rows  Extra
1   SIMPLE       question  range  type           type  1        NULL  16    Using index for group-by
EXPLAIN SELECT type FROM question GROUP BY type;
id  select_type  table     type   possible_keys  key   key_len  ref   rows  Extra
1   SIMPLE       question  range  type           type  1        NULL  16    Using index for group-by

Do all DISTINCT queries get this execution plan? Let's try a different column.

EXPLAIN SELECT DISTINCT score FROM question;
id  select_type  table     type  possible_keys  key   key_len  ref   rows   Extra
1   SIMPLE       question  ALL   NULL           NULL  NULL     NULL  37424  Using temporary
EXPLAIN SELECT score FROM question GROUP BY score;
id  select_type  table     type  possible_keys  key   key_len  ref   rows   Extra
1   SIMPLE       question  ALL   NULL           NULL  NULL     NULL  37424  Using temporary; Using filesort

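There is a subtle difference here. Both queries report Using temporary, which means MySQL scans the whole table and then deduplicates. The GROUP BY plan also reports Using filesort, so its deduplication involves a sort. How DISTINCT deduplicates cannot be told from this output alone. What difference does that make in practice? Let's run the queries and see.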

SELECT DISTINCT score FROM question;
score
1.0
2.0
0.0
3.0
8.0
12.0
6.0
4.0
10.0
14.0
18.0
2.5
15.0
2.3
5.0
1.5
4.5
SELECT score FROM question GROUP BY score;
score
0.0
1.0
1.5
2.0
2.3
2.5
3.0
4.0
4.5
5.0
6.0
8.0
10.0
12.0
14.0
15.0
18.0

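This shows that GROUP BY deduplicates with a sort and DISTINCT does not. That matches MySQL 5.7 and earlier, where GROUP BY sorts its result implicitly; MySQL 8.0 removed that implicit sort.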
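Because the row order of either form depends on the plan the optimizer picks, a query that needs sorted output is safer with an explicit ORDER BY. Here is a small sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL and hypothetical score values. With ORDER BY added, both forms return the same list in the same order.

```python
import sqlite3

def ordered_scores(scores):
    """Return (DISTINCT result, GROUP BY result), both with ORDER BY score."""
    # In-memory SQLite database: an assumption for illustration, not MySQL.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE question (id INTEGER PRIMARY KEY, score REAL)")
    conn.executemany("INSERT INTO question (score) VALUES (?)",
                     [(s,) for s in scores])
    distinct = [row[0] for row in conn.execute(
        "SELECT DISTINCT score FROM question ORDER BY score")]
    grouped = [row[0] for row in conn.execute(
        "SELECT score FROM question GROUP BY score ORDER BY score")]
    conn.close()
    return distinct, grouped

# Hypothetical sample data, unsorted and with duplicates.
sample_scores = [1.0, 2.0, 0.0, 3.0, 2.0, 2.5, 1.0]
distinct_sorted, grouped_sorted = ordered_scores(sample_scores)

# With an explicit ORDER BY the two lists match element by element.
print(distinct_sorted == grouped_sorted)
```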

MySQL's optimization strategies are quite complex.
