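Reposted from [http://www.jb51.net/article/77997.htm]
MySQL usually implements DISTINCT through GROUP BY, which is essentially a sort operation. When DISTINCT is combined with ORDER BY, a temporary table is usually required, and that hurts performance. In some cases MySQL can use an index to optimize DISTINCT, but doing so takes some care. This article includes an example in which an index cannot be used to complete a DISTINCT operation.
Example 1: Using an index to optimize DISTINCT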
create table m11 (a int, b int, c int, d int, primary key (a)) engine=INNODB;
insert into m11 values (1,1,1,1),(2,2,2,2),(3,3,3,3),(4,4,4,4),(5,5,5,5),(6,6,6,6),(7,7,7,7),(8,8,8,8);
explain select distinct (a) from m11;
mysql> explain select distinct (a) from m11;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | m11   | NULL       | index | PRIMARY       | PRIMARY | 4       | NULL |    1 |   100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
2 This is a typical example of using an index to optimize a DISTINCT operation.
Example 2: An index that cannot optimize DISTINCT
create table m31 (a int, b int, c int, d int, primary key (a)) engine=MEMORY;
insert into m31 values (1,1,1,1),(2,2,2,2),(3,3,3,3),(4,4,4,4),(5,5,5,5),(6,6,6,6),(7,7,7,7),(8,8,8,8);
explain select distinct (a) from m31;
mysql> explain select distinct (a) from m31;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
|  1 | SIMPLE      | m31   | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    8 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
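2 Compared with the CREATE TABLE statement in Example 1, only the storage engine differs.
3 Why did the primary key index have no effect? Are indexes on the MEMORY storage engine unusable?
Example 3: A MEMORY table whose index can optimize DISTINCT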
create table m33 (a int, b int, c int, d int, INDEX USING BTREE (a)) engine=MEMORY;
insert into m33 values (1,1,1,1),(2,2,2,2),(3,3,3,3),(4,4,4,4),(5,5,5,5),(6,6,6,6),(7,7,7,7),(8,8,8,8);
explain select distinct (a) from m33;
mysql> explain select distinct (a) from m33;
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------+
|  1 | SIMPLE      | m33   | NULL       | index | NULL          | a    | 5       | NULL |    8 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------+
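Notes:
1 Column 'a' has a BTREE index, and MySQL uses it to complete the DISTINCT operation (the key column shows that index a is used).
2 Compared with Example 2, both tables use the MEMORY engine, but Example 3 explicitly specifies a BTREE index.
3 Example 2 does not specify an index type, so MySQL falls back to the default. The MySQL manual says: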
As indicated by the engine name, MEMORY tables are stored in memory. They use hash indexes by default, which makes them very fast for single-value lookups, and very useful for creating temporary tables.
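Conclusions:
1 When assessing how an index affects a query, pay attention to the index type.
2 HASH indexes suit equality lookups but not scenarios that need ordering, whereas BTREE indexes do suit ordered scenarios.
3 If the execution plan shows that an index is not being used, check the type of that index.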
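The difference between the two index types can be illustrated with a toy model. The sketch below is not how MySQL implements its indexes; the bucket count, the modulo "hash function", and all function names are assumptions made up for illustration. It only shows why an ordered structure yields distinct values in a single in-order pass, while a hash structure returns keys in bucket order, where duplicates are not necessarily adjacent.

```python
from bisect import insort


def build_hash_index(keys, n_buckets=4):
    """Toy hash index: each key goes to a bucket chosen by a hash of the key.

    The modulo hash is a stand-in chosen for determinism; bucket order is
    unrelated to key order.
    """
    buckets = [[] for _ in range(n_buckets)]
    for k in keys:
        buckets[k % n_buckets].append(k)
    return buckets


def scan_hash_index(buckets):
    """Walk the buckets one after another; the result is not sorted."""
    return [k for bucket in buckets for k in bucket]


def build_ordered_index(keys):
    """Toy ordered index (standing in for a B-tree): keys are kept sorted."""
    index = []
    for k in keys:
        insort(index, k)
    return index


def distinct_from_ordered(index):
    """Single in-order pass: duplicates are adjacent, so compare with the last kept key."""
    result = []
    for k in index:
        if not result or result[-1] != k:
            result.append(k)
    return result


keys = [3, 1, 2, 3, 8, 5, 1, 7, 6, 4]
hash_order = scan_hash_index(build_hash_index(keys))
distinct_sorted = distinct_from_ordered(build_ordered_index(keys))

print(hash_order)       # bucket order, duplicates scattered
print(distinct_sorted)  # sorted, duplicates removed in one pass
```

In MySQL itself, SHOW INDEX FROM m31 reports the index type in its Index_type column, which is a quick way to confirm whether a MEMORY table's index is HASH or BTREE.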
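Workarounds when DISTINCT is needed on one field but several fields are selected
In practice we often need to remove duplicate data when selecting from a table, and we usually use the DISTINCT keyword for this.
However, DISTINCT applies to the entire select list, so it deduplicates on a single field only when that field is the only one selected, for example: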
sql = "select DISTINCT title from Table1 where id>0"
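When we also need to list another column from the data, for example:
sql = "select DISTINCT title,posttime from Table1 where id>0"
the result is no longer what we want, because rows are now deduplicated on the (title, posttime) pair rather than on title alone. We therefore need a different way to solve the problem.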
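The behaviour described above can be checked with a small script. This is a minimal sketch that uses Python's built-in sqlite3 module instead of MySQL, on the assumption that DISTINCT has the same row-level semantics in both; the table name posts and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table posts (id integer primary key, title text, posttime text)")
conn.executemany(
    "insert into posts (id, title, posttime) values (?, ?, ?)",
    [
        (1, "a", "2016-01-01"),
        (2, "a", "2016-01-02"),  # same title as id 1, different posttime
        (3, "b", "2016-01-03"),
        (4, "b", "2016-01-03"),  # exact duplicate of id 3 on both columns
    ],
)

# DISTINCT over one column: one row per title.
one_col = conn.execute(
    "select distinct title from posts where id > 0 order by title"
).fetchall()

# DISTINCT over two columns: one row per (title, posttime) pair,
# so title "a" appears twice.
two_cols = conn.execute(
    "select distinct title, posttime from posts where id > 0 order by title, posttime"
).fetchall()

print(one_col)
print(two_cols)
```

Only the row that repeats on both columns (id 4) is dropped by the second query; the two "a" rows survive because their posttime values differ.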
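Below are the SQL statements I came up with. I am not sure they are ideal, and I hope someone with a better solution will share it:
Approach 1 (note that the parentheses do not change the meaning: DISTINCT still applies to the whole (title, posttime) row):
sql = "Select DISTINCT(title),posttime From Table1 Where id>0"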
Approach 2 (grouping by both columns likewise yields one row per (title, posttime) pair):
sql = "Select title,posttime From Table1 Where id>0 group by title,posttime"
Approach 3:
sql = "select title,posttime from Table1 where id in (select min(id) from Table1
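The three approaches can be compared side by side. The sketch below again uses Python's sqlite3 module rather than MySQL, with a hypothetical posts table and sample rows. The third statement above is cut off partway through its subquery, so the version here is only one plausible completion: it assumes the subquery ends with group by title, which keeps the row with the smallest id for each title.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table posts (id integer primary key, title text, posttime text)")
conn.executemany(
    "insert into posts (id, title, posttime) values (?, ?, ?)",
    [
        (1, "a", "2016-01-01"),
        (2, "a", "2016-01-02"),
        (3, "b", "2016-01-03"),
        (4, "b", "2016-01-03"),
    ],
)


def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Approach 1: parentheses around title are just a parenthesized expression.
approach_1 = run(
    "select distinct (title), posttime from posts where id > 0 "
    "order by title, posttime"
)

# Approach 2: grouping by both columns.
approach_2 = run(
    "select title, posttime from posts where id > 0 "
    "group by title, posttime order by title, posttime"
)

# Approach 3 (ASSUMED completion: 'group by title' is not in the source).
approach_3 = run(
    "select title, posttime from posts "
    "where id in (select min(id) from posts group by title) "
    "order by title"
)

print(approach_1)
print(approach_2)
print(approach_3)
```

On this sample data the first two approaches return the same rows as a plain DISTINCT on both columns, and only the assumed form of the third returns a single row per title.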