An ordinary index is an index created on an ordinary column:
mysql> create table book(book_id int, book_name varchar(100), authors varchar(100), info varchar(100), comment varchar(100), year_publication year, index idx_bname(book_name));
View the indexes on the book table:
mysql> show index from book;
+-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book | 1 | idx_bname | 1 | book_name | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.01 sec)
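As shown, the ordinary index idx_bname is non-unique (Non_unique is 1) and its index type is BTREE, which InnoDB implements as a B+ tree: records are stored only in the leaf nodes, while non-leaf nodes hold only the keys used for navigation.
Next, look at the execution plan for a query that filters on the indexed column: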
mysql> explain select * from book where book_name = "book1";
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
| 1 | SIMPLE | book | NULL | ref | idx_bname | idx_bname | 103 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.01 sec)
The query on the indexed column uses the idx_bname index we created. Now look at the execution plan for a query on a non-indexed column:
mysql> explain select * from book where authors = "author1";
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | book | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
No index is used here (possible_keys and key are both NULL), so the query falls back to a full table scan (type is ALL).
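The key_len of 103 reported for idx_bname above can be reproduced by hand. The following is a minimal sketch, assuming a single-byte character set such as latin1 (the source does not state the charset, but 103 is consistent with it); the function name key_len and its parameters are made up for illustration.

```python
# Simplified model of how EXPLAIN's key_len is derived for one index column.
# Assumption: bytes_per_char is 1 (e.g. latin1); utf8mb4 would use 4.

def key_len(col_type, length=0, nullable=True, bytes_per_char=1):
    """Return the key_len contribution of a single indexed column."""
    if col_type == "int":
        size = 4                                # INT is stored in 4 bytes
    elif col_type == "varchar":
        size = length * bytes_per_char + 2      # 2 extra bytes hold the value length
    elif col_type == "char":
        size = length * bytes_per_char          # fixed width, no length bytes
    else:
        raise ValueError(f"unsupported type: {col_type}")
    return size + (1 if nullable else 0)        # 1 extra byte flags a nullable column


print(key_len("varchar", 100))   # 103, as shown for idx_bname on book_name varchar(100)
print(key_len("int"))            # 5, a nullable INT column
```

For a composite index, key_len is the sum over the columns actually used, so it also shows how many leading columns of the index a query uses.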
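A unique index adds a uniqueness constraint to the indexed column, so each non-NULL value may appear at most once in the table: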
mysql> create table book1(book_id int, book_name varchar(100), authors varchar(100), info varchar(100), comment varchar(100), year_publication year, unique index idx_comment(comment));
Query OK, 0 rows affected (0.02 sec)
Look at the indexes on the book1 table:
mysql> show index from book1;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book1 | 0 | idx_comment | 1 | comment | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.00 sec)
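The Non_unique column of the unique index idx_comment is 0, meaning the index is unique.
When a primary key is defined on a table, the corresponding primary key index is added automatically: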
mysql> create table book2(book_id int primary key, book_name varchar(100), authors varchar(100), info varchar(100), comment varchar(100), year_publication year);
Query OK, 0 rows affected (0.01 sec)
mysql> show index from book2;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book2 | 0 | PRIMARY | 1 | book_id | A | 0 | NULL | NULL | | BTREE | | |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.00 sec)
Accordingly, dropping the primary key also drops the primary key index:
mysql> alter table book2 drop primary key;
Query OK, 0 rows affected (0.04 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show index from book2;
Empty set (0.00 sec)
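Note how explain reports the primary key index and composite indexes that contain the primary key column. The following example runs on an empty table: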
mysql> create table test_index(c1 int primary key auto_increment, c2 int, c3 int, c4 int);
Query OK, 0 rows affected (0.01 sec)
mysql> create index idx_c1_c2 on test_index(c1, c2);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> create index idx_c2_c3 on test_index(c2, c3);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> create index idx_c2_c3_c4 on test_index(c2, c3, c4);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> create index idx_c2 on test_index(c2);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> create index idx_c3 on test_index(c3);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
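In the example above, c1 is the primary key; idx_c1_c2 covers the primary key c1 and the ordinary column c2; idx_c2_c3 covers the ordinary columns c2 and c3; idx_c2_c3_c4 covers the ordinary columns c2, c3 and c4; idx_c2 covers c2; and idx_c3 covers c3. Now see which indexes are considered for queries on the primary key c1 and on the ordinary column c2: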
mysql> explain select * from test_index where c1 = 1;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | no matching row in const table |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select * from test_index where c2 = 1;
+----+-------------+------------+------------+------+-------------------------------+-----------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+-------------------------------+-----------+---------+-------+------+----------+-------+
| 1 | SIMPLE | test_index | NULL | ref | idx_c2_c3,idx_c2_c3_c4,idx_c2 | idx_c2_c3 | 5 | const | 1 | 100.00 | NULL |
+----+-------------+------------+------------+------+-------------------------------+-----------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
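Neither result means that indexes containing the primary key are ignored. The query on c1 shows no index only because test_index is empty: an equality lookup on the primary key is treated as a const table and resolved during optimization, and since no row matched, explain reports "no matching row in const table" instead of a plan. Once the table contains a row with c1 = 1, the same statement is expected to show type const with key PRIMARY. For the query on c2, the candidate list contains idx_c2_c3, idx_c2_c3_c4 and idx_c2 but not idx_c1_c2, because c2 is not the leftmost column of idx_c1_c2. This is the leftmost prefix rule, and it is unrelated to whether the index contains the primary key.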
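An index built on a single column is called a single-column index; the indexes on book, book1 and book2 above are all single-column indexes. A composite index is built on multiple columns, for example: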
mysql> create table book3(book_id int not null, book_name varchar(100), authors varchar(100), info varchar(100), comment varchar(100), year_publication year, index idx_id_name_authors(book_id, book_name, authors));
Query OK, 0 rows affected (0.02 sec)
View this composite index:
mysql> show index from book3;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book3 | 1 | idx_id_name_authors | 1 | book_id | A | 0 | NULL | NULL | | BTREE | | |
| book3 | 1 | idx_id_name_authors | 2 | book_name | A | 0 | NULL | NULL | YES | BTREE | | |
| book3 | 1 | idx_id_name_authors | 3 | authors | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)
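The three columns of the composite index idx_id_name_authors have sequence numbers (Seq_in_index) 1, 2 and 3. This determines the order in which columns are compared during an index lookup: book_id is compared first; if it is equal, book_name is compared; if that is equal too, authors is compared. Rows for which all three match are returned, and all others are not.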
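Can several rows match on all three columns but differ in other columns? Yes: idx_id_name_authors is a non-unique index and book3 defines no primary key, so every matching row is returned. Only if one of the indexed columns were a primary key or a unique key would at most one row match. Because none of the indexed columns is the primary key, InnoDB stores this composite index as a secondary (non-clustered) B+ tree: a lookup first finds the primary key value stored in the matching leaf entry, then uses the clustered index B+ tree to fetch the row for that primary key. If the table defines no primary key, InnoDB clusters the table on the first unique index whose columns are all NOT NULL, and if there is none it generates a hidden row ID to act as the primary key.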
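The two-step lookup through a secondary index can be modelled in a few lines. This is only a toy sketch under stated assumptions, not InnoDB's storage code: the dictionary clustered stands in for the clustered index, the sorted list secondary stands in for the leaf entries of idx_id_name_authors, and the id key plays the role of the primary key (or hidden row ID). All rows are made up.

```python
# Toy model of an InnoDB secondary-index lookup (not real storage code).
from bisect import bisect_left

# Clustered index: primary key -> full row. Rows and key values are made up.
clustered = {
    1: {"id": 1, "book_id": 7, "book_name": "book1", "authors": "author1"},
    2: {"id": 2, "book_id": 7, "book_name": "book1", "authors": "author2"},
    3: {"id": 3, "book_id": 9, "book_name": "book2", "authors": "author1"},
    4: {"id": 4, "book_id": 7, "book_name": "book1", "authors": "author1"},
}

# Secondary index leaf entries: (indexed columns..., primary key), kept sorted.
secondary = sorted(
    (row["book_id"], row["book_name"], row["authors"], pk)
    for pk, row in clustered.items()
)


def lookup(book_id, book_name, authors):
    """Find rows by the three indexed columns, then fetch them by primary key."""
    key = (book_id, book_name, authors)
    pos = bisect_left(secondary, key)          # first entry >= key, for any primary key
    rows = []
    while pos < len(secondary) and secondary[pos][:3] == key:
        pk = secondary[pos][3]                 # the leaf entry carries the primary key
        rows.append(clustered[pk])             # second lookup, in the clustered index
        pos += 1
    return rows


print([row["id"] for row in lookup(7, "book1", "author1")])   # [1, 4]
```

Because the index is non-unique, two rows share the same three indexed values and both are returned, each through its own clustered-index lookup.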
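A lookup through a composite index matches columns by leftmost prefix, following the Seq_in_index order. All columns of the index are used only when the filter conditions cover every column from the left without a gap (the order in which the conditions are written in the where clause does not matter):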
mysql> explain select * from book3 where book_id = 1 and book_name = "book1" and authors = 'author1';
+----+-------------+-------+------------+------+---------------------------------+-------------+---------+-------------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------------------------+-------------+---------+-------------------+------+----------+-------+
| 1 | SIMPLE | book3 | NULL | ref | idx_id_name_authors | idx_id_name_authors | 210 | const,const,const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------------------------+-------------+---------+-------------------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
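If a column of the composite index is skipped, the columns after it cannot be used for the lookup. In the queries below the first column, book_id, is missing, so the index cannot be used at all: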
mysql> explain select * from book3 where book_name = "book1" and authors = "author1";
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | book3 | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select * from book3 where book_name = "author1";
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | book3 | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
Consider the test_index example again:
mysql> create table test_index(c1 int primary key auto_increment, c2 int, c3 int, c4 int);
Query OK, 0 rows affected (0.01 sec)
mysql> create index idx_c1_c2 on test_index(c1, c2);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> create index idx_c2_c3 on test_index(c2, c3);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> create index idx_c2_c3_c4 on test_index(c2, c3, c4);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> create index idx_c2 on test_index(c2);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> create index idx_c3 on test_index(c3);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
See what happens when the filter condition is on (c3, c4):
mysql> explain select * from test_index where c3 = 1 and c4 = 2;
+----+-------------+------------+------------+------+---------------+--------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+--------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | test_index | NULL | ref | idx_c3 | idx_c3 | 5 | const | 1 | 100.00 | Using where |
+----+-------------+------------+------------+------+---------------+--------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
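The composite index idx_c2_c3_c4 is ordered as (c2, c3, c4) while the filter is on (c3, c4), so the query cannot use idx_c2_c3_c4. Here c4 can only be treated as a non-indexed column: the optimizer uses idx_c3 to look up rows by c3 and then filters them on c4 (hence Using where).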
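The leftmost prefix rule behind these results can be written down as a small model. The sketch below is deliberately simplified: it only considers equality conditions, ignores range conditions and cost-based choices, and lists only the secondary indexes of test_index. The function names usable_prefix and candidate_indexes are hypothetical.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that equality conditions can use."""
    prefix = []
    for column in index_columns:
        if column not in equality_columns:
            break                  # a gap ends the usable prefix
        prefix.append(column)
    return tuple(prefix)


def candidate_indexes(indexes, equality_columns):
    """Names of indexes whose first column is covered by an equality condition."""
    return [name for name, cols in indexes.items()
            if usable_prefix(cols, equality_columns)]


# Secondary indexes of test_index, in creation order.
test_index = {
    "idx_c1_c2": ("c1", "c2"),
    "idx_c2_c3": ("c2", "c3"),
    "idx_c2_c3_c4": ("c2", "c3", "c4"),
    "idx_c2": ("c2",),
    "idx_c3": ("c3",),
}

print(candidate_indexes(test_index, {"c2"}))        # ['idx_c2_c3', 'idx_c2_c3_c4', 'idx_c2']
print(candidate_indexes(test_index, {"c3", "c4"}))  # ['idx_c3']
```

Both lists agree with the possible_keys columns in the explain output for the c2 = 1 and c3 = 1 and c4 = 2 queries.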
InnoDB also supports full-text indexes on varchar, char and text columns. A full-text index can cover a single column or several columns:
mysql> create table book4(book_id int, book_name varchar(100), authors varchar(100), info varchar(100), comment varchar(100), year_publication year, fulltext index full_txt_idx(info(50), comment));
Query OK, 0 rows affected (0.13 sec)
mysql> show index from book4;
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book4 | 1 | full_txt_idx | 1 | info | NULL | 0 | NULL | NULL | YES | FULLTEXT | | |
| book4 | 1 | full_txt_idx | 2 | comment | NULL | 0 | NULL | NULL | YES | FULLTEXT | | |
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)
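In fulltext index full_txt_idx(info(50), comment), the prefix length in info(50) has no effect: a full-text index always covers the entire column, and MySQL ignores any prefix length specified for it. Once the full-text index exists, it can be queried with match ... against: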
mysql> explain select * from book4 where match(info, comment) against("string to be matched");
+----+-------------+-------+------------+----------+---------------+--------------+---------+-------+------+----------+-------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+----------+---------------+--------------+---------+-------+------+----------+-------------------------------+
| 1 | SIMPLE | book4 | NULL | fulltext | full_txt_idx | full_txt_idx | 0 | const | 1 | 100.00 | Using where; Ft_hints: sorted |
+----+-------------+-------+------------+----------+---------------+--------------+---------+-------+------+----------+-------------------------------+
1 row in set, 1 warning (0.00 sec)
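That said, for serious full-text search a dedicated search engine such as Elasticsearch or Solr (or Atlas) is still the recommended choice.
First create a table without a primary key or any index: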
mysql> create table book5(book_id int, book_name varchar(100), authors varchar(100), info varchar(100), comment varchar(100), year_publication year);
Query OK, 0 rows affected (0.01 sec)
mysql> show index from book5;
Empty set (0.00 sec)
Then add an ordinary index:
mysql> alter table book5 add index idx_cmt(comment);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show index from book5;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book5 | 1 | idx_cmt | 1 | comment | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.00 sec)
Other index types are added exactly as in the section on creating indexes at table creation time: put the desired index definition after alter table tableName add:
mysql> alter table book5 add unique idx_book_id(book_id);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> alter table book5 add index idx_id_name(book_id, book_name);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show index from book5;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book5 | 0 | idx_book_id | 1 | book_id | A | 0 | NULL | NULL | YES | BTREE | | |
| book5 | 1 | idx_cmt | 1 | comment | A | 0 | NULL | NULL | YES | BTREE | | |
| book5 | 1 | idx_id_name | 1 | book_id | A | 0 | NULL | NULL | YES | BTREE | | |
| book5 | 1 | idx_id_name | 2 | book_name | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)
create index is another way to add an index to an existing table. First create a table without a primary key or any index:
mysql> create table book6(book_id int, book_name varchar(100), authors varchar(100), info varchar(100), comment varchar(100), year_publication year);
Query OK, 0 rows affected (0.01 sec)
mysql> show index from book6;
Empty set (0.00 sec)
Then add an index with create index index_name on table_name(column_name):
mysql> create index idx_cmt on book6(comment);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show index from book6;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book6 | 1 | idx_cmt | 1 | comment | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.00 sec)
Unique and composite indexes are added the same way:
mysql> create unique index idx_bname on book6(book_name);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> create index idx_id_name on book6(book_id, book_name);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show index from book6;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book6 | 0 | idx_bname | 1 | book_name | A | 0 | NULL | NULL | YES | BTREE | | |
| book6 | 1 | idx_cmt | 1 | comment | A | 0 | NULL | NULL | YES | BTREE | | |
| book6 | 1 | idx_id_name | 1 | book_id | A | 0 | NULL | NULL | YES | BTREE | | |
| book6 | 1 | idx_id_name | 2 | book_name | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)
mysql> show index from book5;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book5 | 0 | idx_book_id | 1 | book_id | A | 0 | NULL | NULL | YES | BTREE | | |
| book5 | 1 | idx_cmt | 1 | comment | A | 0 | NULL | NULL | YES | BTREE | | |
| book5 | 1 | idx_id_name | 1 | book_id | A | 0 | NULL | NULL | YES | BTREE | | |
| book5 | 1 | idx_id_name | 2 | book_name | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)
mysql> alter table book5 drop index idx_id_name;
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show index from book5;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book5 | 0 | idx_book_id | 1 | book_id | A | 0 | NULL | NULL | YES | BTREE | | |
| book5 | 1 | idx_cmt | 1 | comment | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)
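Note that the only index on a column declared auto_increment cannot be dropped, because an auto_increment column must be defined as a key.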
mysql> show index from book5;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book5 | 0 | idx_book_id | 1 | book_id | A | 0 | NULL | NULL | YES | BTREE | | |
| book5 | 1 | idx_cmt | 1 | comment | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)
mysql> drop index idx_cmt on book5;
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show index from book5;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book5 | 0 | idx_book_id | 1 | book_id | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.00 sec)
If a column that is part of a composite index is dropped, the composite index is adjusted accordingly:
mysql> show index from book6;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book6 | 0 | idx_bname | 1 | book_name | A | 0 | NULL | NULL | YES | BTREE | | |
| book6 | 1 | idx_cmt | 1 | comment | A | 0 | NULL | NULL | YES | BTREE | | |
| book6 | 1 | idx_id_name | 1 | book_id | A | 0 | NULL | NULL | YES | BTREE | | |
| book6 | 1 | idx_id_name | 2 | book_name | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)
mysql> alter table book6 drop column book_name;
Query OK, 0 rows affected (0.04 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show index from book6;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| book6 | 1 | idx_cmt | 1 | comment | A | 0 | NULL | NULL | YES | BTREE | | |
| book6 | 1 | idx_id_name | 1 | book_id | A | 0 | NULL | NULL | YES | BTREE | | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)
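The example above drops the book_name column, which was part of both the single-column index idx_bname and the composite index idx_id_name. As a result, idx_bname is dropped entirely and idx_id_name loses one column.
First, create the tables: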
mysql> CREATE TABLE `student_info` ( `id` INT(11) NOT NULL AUTO_INCREMENT, `student_id` INT NOT NULL , `name` VARCHAR(20) DEFAULT NULL, `course_id` INT NOT NULL , `class_id` INT(11) DEFAULT NULL, `create_time` DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`id`) ) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.01 sec)
mysql> CREATE TABLE `course` ( `id` INT(11) NOT NULL AUTO_INCREMENT, `course_id` INT NOT NULL , `course_name` VARCHAR(40) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.01 sec)
Then allow stored functions to be created:
set global log_bin_trust_function_creators = 1;
Create two functions, one that generates a random string and one that generates a random number:
DELIMITER //
# Generate a random string of n letters
CREATE FUNCTION rand_string(n INT)
RETURNS VARCHAR(255)
BEGIN
    DECLARE chars_str VARCHAR(100) DEFAULT 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    DECLARE return_str VARCHAR(255) DEFAULT '';
    DECLARE i INT DEFAULT 0;
    WHILE i < n DO
        # Append one character picked at a random position from 1 to 52
        SET return_str = CONCAT(return_str, SUBSTRING(chars_str, FLOOR(1 + RAND() * 52), 1));
        SET i = i + 1;
    END WHILE;
    RETURN return_str;
END //
# Generate a random integer between from_num and to_num, inclusive
CREATE FUNCTION rand_num(from_num INT, to_num INT)
RETURNS INT
BEGIN
    DECLARE i INT DEFAULT 0;
    SET i = FLOOR(from_num + RAND() * (to_num - from_num + 1));
    RETURN i;
END //
DELIMITER ;
Then create the stored procedures that insert the mock data:
DELIMITER //
# Insert random course rows
CREATE PROCEDURE insert_course(max_num INT)
BEGIN
    DECLARE i INT DEFAULT 0;
    SET autocommit = 0;  # commit manually, once at the end
    REPEAT
        SET i = i + 1;
        INSERT INTO course(course_id, course_name) VALUES (rand_num(10000, 10100), rand_string(6));
    UNTIL i = max_num
    END REPEAT;
    COMMIT;  # commit the transaction
END //
# Insert random student rows
CREATE PROCEDURE insert_stu(max_num INT)
BEGIN
    DECLARE i INT DEFAULT 0;
    SET autocommit = 0;  # commit manually, once at the end
    REPEAT
        SET i = i + 1;
        INSERT INTO student_info(course_id, class_id, student_id, NAME) VALUES (rand_num(10000, 10100), rand_num(10000, 10200), rand_num(1, 200000), rand_string(6));
    UNTIL i = max_num
    END REPEAT;
    COMMIT;  # commit the transaction
END //
DELIMITER ;
Finally, call the stored procedures to insert 100 course rows and 1,000,000 student rows:
CALL insert_course(100);
CALL insert_stu(1000000);
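Columns that are unique in business terms, including combinations of columns, must have a unique index (or unique constraint). A unique index does slow inserts down, but the cost is negligible, while the gain in query speed is obvious.
If a column is frequently used as a where filter condition in select statements, it should be indexed; on a large table even an ordinary index greatly improves query efficiency.
For the student_id column, run the query before creating an index: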
mysql> select course_id, class_id, name, create_time, student_id from student_info where student_id = 123111;
+-----------+----------+--------+---------------------+------------+
| course_id | class_id | name | create_time | student_id |
+-----------+----------+--------+---------------------+------------+
| 10064 | 10061 | iXmtgB | 2022-01-19 20:06:26 | 123111 |
| 10090 | 10173 | zFBTMg | 2022-01-19 20:07:14 | 123111 |
| 10036 | 10028 | IxmTJM | 2022-01-19 20:07:32 | 123111 |
| 10008 | 10196 | hVeMvU | 2022-01-19 20:08:07 | 123111 |
| 10098 | 10021 | NVoJDN | 2022-01-19 20:08:23 | 123111 |
| 10023 | 10133 | eDEmuq | 2022-01-19 20:09:00 | 123111 |
| 10088 | 10049 | rRhnQs | 2022-01-19 20:10:10 | 123111 |
+-----------+----------+--------+---------------------+------------+
7 rows in set (0.50 sec)
Create an index on student_id:
mysql> create index idx_stu_id on student_info(student_id);
The time this step takes grows with the amount of data. Once the index is built, run the query again:
mysql> select course_id, class_id, name, create_time, student_id from student_info where student_id = 123111;
+-----------+----------+--------+---------------------+------------+
| course_id | class_id | name | create_time | student_id |
+-----------+----------+--------+---------------------+------------+
| 10064 | 10061 | iXmtgB | 2022-01-19 20:06:26 | 123111 |
| 10090 | 10173 | zFBTMg | 2022-01-19 20:07:14 | 123111 |
| 10036 | 10028 | IxmTJM | 2022-01-19 20:07:32 | 123111 |
| 10008 | 10196 | hVeMvU | 2022-01-19 20:08:07 | 123111 |
| 10098 | 10021 | NVoJDN | 2022-01-19 20:08:23 | 123111 |
| 10023 | 10133 | eDEmuq | 2022-01-19 20:09:00 | 123111 |
| 10088 | 10049 | rRhnQs | 2022-01-19 20:10:10 | 123111 |
+-----------+----------+--------+---------------------+------------+
7 rows in set (0.00 sec)
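The execution time drops from 0.50 s to below the 0.01 s display resolution, a very clear improvement.
An index exists to store or retrieve data in a defined order, and the indexed values in a B+ tree are already kept in non-descending order. When data is grouped with group by or sorted with order by, it is therefore worth indexing the grouping or sorting column. If there is more than one such column, build a composite index on them.
Since student_id was indexed above, a grouped query on it is very fast (keep in mind that student_info holds a million rows):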
mysql> select student_id, count(*) from student_info group by student_id limit 10;
+------------+----------+
| student_id | count(*) |
+------------+----------+
| 1 | 7 |
| 2 | 4 |
| 3 | 5 |
| 4 | 4 |
| 5 | 1 |
| 6 | 3 |
| 7 | 7 |
| 8 | 4 |
| 9 | 3 |
| 10 | 4 |
+------------+----------+
10 rows in set (0.00 sec)
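One way to see why an ordered index helps group by together with limit: when keys arrive in sorted order, equal keys are adjacent, so groups can be counted in a single pass and the scan can stop as soon as enough groups have been produced. The sketch below illustrates only that idea with made-up student_id values; it does not reflect how MySQL executes the query internally, and grouped_counts is a hypothetical helper.

```python
from itertools import groupby, islice


def grouped_counts(sorted_keys, limit):
    """Yield (key, count) for the first `limit` groups of an ordered key stream."""
    groups = groupby(sorted_keys)              # equal keys are adjacent, so one pass suffices
    for key, members in islice(groups, limit):
        yield key, sum(1 for _ in members)


# Made-up student_id values, read in index (ascending) order.
index_order = iter([1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 5])
print(list(grouped_counts(index_order, 3)))    # [(1, 3), (2, 2), (3, 1)]
```

Without an ordered input, every row would have to be read (and hashed or sorted) before the first group could be returned.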
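If the group by and order by columns differ, for example group by col1 order by col2, it is best to build a composite index on col1 and col2 with the group by column on the left, because the group by clause is evaluated before order by.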
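Because an index speeds up locating rows, update and delete statements whose where clause filters on an indexed column run much faster. The gain is even larger when the updated column is itself not indexed, since no index has to be maintained for the change.
The statement below filters on name, so it runs quickly on the million-row table once name is indexed (0.636 ms as measured by the author); without an index on name it needs a full table scan. Note that it modifies student_id, which is indexed, so idx_stu_id has to be maintained for every updated row: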
update student_info set student_id = 10002 where name = "462eed7ac6e79129a79";
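If a column needs to be de-duplicated with distinct, indexing that column also improves query efficiency.
In multi-table join queries, create an index on the column used in the where condition. Before indexing student_info.name, run the following join query: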
mysql> select student_info.course_id, student_info.name, student_info.student_id, course.course_name
-> from student_info join course
-> on student_info.course_id = course.course_id
-> where name = "462eed7ac6e79129a79";
Empty set (0.32 sec)
The query takes 0.32 s. Now create an index on student_info.name:
create index idx_stu_name on student_info(name);
Run the same query again:
mysql> select student_info.course_id, student_info.name, student_info.student_id, course.course_name
-> from student_info join course on student_info.course_id = course.course_id where name = "462eed7ac6e79129a79";
Empty set (0.00 sec)
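The execution time drops below the 0.01 s display resolution, a large improvement. Creating indexes on the join columns student_info.course_id and course.course_id as well would improve efficiency further.
A small type is one whose value range is small and which therefore takes less storage. More records then fit into one data page, so more data can be cached and disk I/O is reduced.
When the column to be indexed is a string (varchar in particular) and is long, create a prefix index on only its leading characters.
For example, given this table: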
create table shop(address varchar(120) not null);
When indexing the address column, only the first 12 characters need to be indexed:
alter table shop add index(address(12));
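The selectivity of the first n characters of a column can be checked with:
count(distinct left(column_name, n)) / count(*);
When this selectivity reaches 90%, n can be used as the index prefix length; n = 20 is usually enough.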
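The search for a suitable n can be automated by trying increasing prefix lengths until the selectivity target is met. The sketch below applies the same formula to an in-memory list; the helper names, the 0.9 default and the sample addresses are assumptions for illustration, and on a real table the values would come from the column itself.

```python
def prefix_selectivity(values, n):
    """Equivalent of count(distinct left(col, n)) / count(*)."""
    return len({v[:n] for v in values}) / len(values)


def shortest_prefix(values, target=0.9, max_len=20):
    """Smallest n whose prefix selectivity reaches `target`, or None if none does."""
    for n in range(1, max_len + 1):
        if prefix_selectivity(values, n) >= target:
            return n
    return None


# Made-up addresses; real data would come from the shop.address column.
addresses = ["Beijing Road 1", "Beijing Road 2", "Beijing Lane 9",
             "Shanghai Ave 3", "Shanghai Ave 4"]
print(shortest_prefix(addresses))   # 14
```

The sample values share long common prefixes, which is exactly the case where a short prefix index discriminates poorly and a longer one is needed.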
The selectivity of the first 5 characters of student_info.student_id is computed as follows:
mysql> select count(distinct left(student_id, 5)) / count(*) from student_info;
+------------------------------------------------+
| count(distinct left(student_id, 5)) / count(*) |
+------------------------------------------------+
| 0.0992 |
+------------------------------------------------+
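The result is 9.92%. (student_id is an integer column and is implicitly converted to a string here; the query only illustrates the formula, since prefix indexes apply to string columns.)
Note that a prefix index cannot be used to satisfy order by on the column, because the index holds only the leading part of each value; MySQL has to fall back to a filesort to return correctly ordered results.
The number of distinct values in a column is called the column's cardinality. The larger the cardinality, the more spread out the values and the better suited the column is for indexing. A column's selectivity can be checked with:
count(distinct column_name) / count(*);
As a rule of thumb, a column with selectivity above 33% is worth indexing.
The selectivity of student_info.student_id is computed as follows: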
mysql> select count(distinct student_id) / count(*) from student_info;
+---------------------------------------+
| count(distinct student_id) / count(*) |
+---------------------------------------+
| 0.1982 |
+---------------------------------------+
1 row in set (0.43 sec)
The result is 19.82%.
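By the leftmost prefix rule, placing the most frequently used column leftmost in a composite index increases how often the index can be used. The order in which conditions are written in the where clause, by contrast, does not affect index use, because the optimizer reorders them.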
If a composite index contains, from left to right, the columns a, b, c and d, then lookups on a, on (a, b), on (a, b, c) and on (a, b, c, d) can all use it.
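The number of indexes on a table should be limited, because every index takes disk space and slows down insert, update and delete statements, and because the optimizer evaluates every index a query could use in order to choose the best execution plan, so too many indexes lengthen optimization.
In general, a single table should have no more than 6 indexes.
Suppose we have the following table: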
CREATE TABLE person_info(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
name VARCHAR(100) NOT NULL,
birthday DATE NOT NULL,
phone_number CHAR(11) NOT NULL,
country varchar(100) NOT NULL,
PRIMARY KEY (id),
KEY idx_name_birthday_phone_number (name(10), birthday, phone_number),
KEY idx_name (name(10)));
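In the example above, idx_name_birthday_phone_number is built on name, birthday and phone_number, so its leftmost column already serves lookups on name. The single-column index idx_name on name is therefore redundant.
A redundant index only adds maintenance cost and brings no benefit to lookups.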
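Redundant indexes of this kind can be spotted mechanically: an index is a candidate for removal when its column list is a leftmost prefix of another index's column list. The following is a minimal sketch of that check, with the index definitions typed in by hand; find_redundant is a hypothetical helper, and it ignores uniqueness, so a unique index that enforces a constraint needs a manual review before being dropped.

```python
def find_redundant(indexes):
    """Return (redundant, covering) pairs where one index is a leftmost prefix of another."""
    pairs = []
    for name, cols in indexes.items():
        for other, other_cols in indexes.items():
            if name == other:
                continue
            is_prefix = other_cols[:len(cols)] == cols
            # For identical column lists, report the pair only once.
            if is_prefix and (len(cols) < len(other_cols) or name > other):
                pairs.append((name, other))
    return pairs


# Index definitions of person_info, copied from the create table statement.
person_info = {
    "idx_name_birthday_phone_number": ("name(10)", "birthday", "phone_number"),
    "idx_name": ("name(10)",),
}
print(find_redundant(person_info))
# [('idx_name', 'idx_name_birthday_phone_number')]
```

Two indexes with exactly the same column list are reported once, as a single pair.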
Suppose we have the following table:
CREATE TABLE repeat_index_demo (
col1 INT PRIMARY KEY,
col2 INT,
UNIQUE uk_idx_c1 (col1),
INDEX idx_c1 (col1));
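In the example above, col1 is the primary key and also has a unique index and an ordinary index defined on it. The primary key already provides a unique, clustered index on col1, so the additional unique and ordinary indexes are duplicates and should be avoided.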