Each MySQL table is associated with a particular storage engine. InnoDB tables have particular physical and logical characteristics that affect performance, scalability, backup, administration, and application development.
In terms of file storage, an InnoDB table belongs to one of the following tablespace types:
The shared InnoDB system tablespace, which consists of one or more ibdata files.
A file-per-table tablespace, which consists of a single .ibd file.
A shared general tablespace, which consists of a single .ibd file. General tablespaces were introduced in MySQL 5.7.6.
.ibd data files contain both table and index data.
InnoDB tables created in file-per-table tablespaces can use DYNAMIC or COMPRESSED row format. These row formats enable InnoDB features such as compression, efficient storage of off-page columns, and large index key prefixes. General tablespaces support all row formats.
The system tablespace supports tables that use REDUNDANT, COMPACT, and DYNAMIC row formats. System tablespace support for the DYNAMIC row format was added in MySQL 5.7.6.
The rows of an InnoDB table are organized into an index structure known as the clustered index, with entries sorted based on the primary key columns of the table. Data access is optimized for queries that filter and sort on the primary key columns, and each index contains a copy of the associated primary key columns for each entry. Modifying values for any of the primary key columns is an expensive operation. Thus an important aspect of InnoDB table design is choosing a primary key with columns that are used in the most important queries, and keeping the primary key short, with rarely changing values.
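To make the "keep the primary key short" advice concrete, here is a rough back-of-the-envelope sketch. It relies only on the fact stated above, that every secondary index entry carries a copy of the primary key columns. The row count, the number of secondary indexes, and the key sizes are made-up figures, and the estimate ignores record headers, page overhead, and compression.

```python
def pk_copy_bytes(row_count, secondary_index_count, pk_bytes):
    """Bytes spent only on primary key copies inside secondary indexes.

    Simplified assumption: one index entry per row per secondary index,
    each entry holding one full copy of the primary key.
    """
    return row_count * secondary_index_count * pk_bytes


# Hypothetical table: one million rows, three secondary indexes.
ROWS = 1_000_000
SECONDARY_INDEXES = 3

int_pk = pk_copy_bytes(ROWS, SECONDARY_INDEXES, 4)    # 4-byte INT key
uuid_pk = pk_copy_bytes(ROWS, SECONDARY_INDEXES, 36)  # 36-byte textual UUID key

print(int_pk, uuid_pk, uuid_pk // int_pk)
```

Under these assumptions the longer key costs nine times as much space in the secondary indexes, before counting the clustered index itself.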
Index: a data structure that provides a fast lookup capability for rows of a table, typically by forming a tree structure (B-tree) representing all the values of a particular column or set of columns.
InnoDB tables always have a clustered index representing the primary key. They can also have one or more secondary indexes defined on one or more columns. Depending on their structure, secondary indexes can be classified as partial, column, or composite indexes.
Indexes are a crucial aspect of query performance. Database architects design tables, queries, and indexes to allow fast lookups for data needed by applications. The ideal database design uses a covering index where practical; the query results are computed entirely from the index, without reading the actual table data. Each foreign key constraint also requires an index, to efficiently check whether values exist in both the parent and child tables.
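A small sketch of the covering-index idea: a query is covered when every column it needs is already present in the index. Because an InnoDB secondary index also carries the primary key columns, those count as available too. The function name and the column names (id, k, name) are hypothetical, and this is a simplified model, not how the optimizer actually decides.

```python
def is_covering(index_columns, primary_key_columns, needed_columns):
    """Return True if the query can be answered from the index alone.

    A secondary index entry holds the indexed columns plus a copy of the
    primary key columns, so both sets are available without touching
    the table rows.
    """
    available = set(index_columns) | set(primary_key_columns)
    return set(needed_columns) <= available


PRIMARY_KEY = ("id",)
INDEX_ON_K = ("k",)

# A query that needs only id and k is covered by the index on k.
covered = is_covering(INDEX_ON_K, PRIMARY_KEY, ["id", "k"])

# A query that also needs name must read the table rows as well.
not_covered = is_covering(INDEX_ON_K, PRIMARY_KEY, ["id", "k", "name"])

print(covered, not_covered)
```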
Although a B-tree index is the most common, a different kind of data structure is used for hash indexes, as in the MEMORY storage engine and the InnoDB adaptive hash index. R-tree indexes are used for spatial indexing of multi-dimensional information.
Clustered index: the InnoDB term for a primary key index. InnoDB table storage is organized based on the values of the primary key columns, to speed up queries and sorts involving the primary key columns. For best performance, choose the primary key columns carefully based on the most performance-critical queries. Because modifying the columns of the clustered index is an expensive operation, choose primary key columns that are rarely or never updated.
In the Oracle Database product, this type of table is known as an index-organized table.
Primary key: a set of columns (and, by implication, the index based on this set of columns) that can uniquely identify every row in a table. As such, it must be a unique index that does not contain any NULL values.
InnoDB requires that every table has such an index (also called the clustered index or cluster index), and organizes the table storage based on the column values of the primary key.
When choosing primary key values, consider using arbitrary values (a synthetic key) rather than relying on values derived from some other source (a natural key).
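The two requirements in the primary key definition (unique values, no NULLs) can be checked mechanically. The sketch below is only an illustration over in-memory rows; the sample data, the column names, and the function name are invented, and Python's None stands in for SQL NULL.

```python
def can_be_primary_key(rows, key_columns):
    """Check that key_columns are never NULL and identify each row uniquely."""
    seen = set()
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if any(value is None for value in key):
            return False  # a primary key may not contain NULL
        if key in seen:
            return False  # two rows share the same key value
        seen.add(key)
    return True


# Hypothetical rows: "id" is a synthetic key, "email" a natural key candidate.
people = [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "ann@example.com"},
]

print(can_be_primary_key(people, ["id"]))     # synthetic key passes
print(can_be_primary_key(people, ["email"]))  # natural key fails here
```

In this made-up data the natural key fails both checks, which is one reason a synthetic key is often the safer choice.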
Secondary index: a type of InnoDB index that represents a subset of table columns. An InnoDB table can have zero, one, or many secondary indexes. (Contrast with the clustered index, which is required for each InnoDB table, and stores the data for all the table columns.)
A secondary index can be used to satisfy queries that only require values from the indexed columns. For more complex queries, it can be used to identify the relevant rows in the table, which are then retrieved through lookups using the clustered index.
Creating and dropping secondary indexes has traditionally involved significant overhead from copying all the data in the InnoDB table. The fast index creation feature makes both CREATE INDEX and DROP INDEX statements much faster for InnoDB secondary indexes.
mysql> CREATE TABLE T (id INT PRIMARY KEY, k INT NOT NULL, name VARCHAR(16), INDEX (k)) ENGINE=InnoDB;
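The (ID, k) values of rows R1 through R5 are (100,1), (200,2), (300,3), (500,5), and (600,6).
The leaf nodes of the primary key index store entire rows. In InnoDB, the primary key index is also called the clustered index.
The leaf nodes of a non-primary-key index store primary key values. In InnoDB, a non-primary-key index is also called a secondary index.
If the statement is select * from T where ID=500, that is, a lookup by primary key, only the B+ tree for ID needs to be searched.
If the statement is select * from T where k=5, that is, a lookup through an ordinary index, the k index tree is searched first to obtain the ID value 500, and then the ID index tree is searched once more. This second step is called a back-to-table lookup (回表).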
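The two query paths can be mimicked with a toy model in Python. This is only a sketch: a dict stands in for the primary key B+ tree, a sorted list of (k, ID) pairs stands in for the k index tree, and the name values are invented because the notes do not give them.

```python
from bisect import bisect_left

# Clustered index: ID -> entire row (ID, k, name). The names are made up.
clustered = {
    100: (100, 1, "a"),
    200: (200, 2, "b"),
    300: (300, 3, "c"),
    500: (500, 5, "e"),
    600: (600, 6, "f"),
}

# Secondary index on k: entries are (k, ID), kept sorted by k.
secondary_k = sorted((row[1], row[0]) for row in clustered.values())


def select_by_id(row_id):
    """Primary key lookup: one search, in the clustered index only."""
    row = clustered.get(row_id)
    return [row] if row is not None else []


def select_by_k(k):
    """Ordinary index lookup: search the k index, then go back to the table."""
    rows = []
    pos = bisect_left(secondary_k, (k,))  # first entry whose k is >= the target
    while pos < len(secondary_k) and secondary_k[pos][0] == k:
        row_id = secondary_k[pos][1]      # primary key value stored in the index
        rows.append(clustered[row_id])    # second search, in the clustered index
        pos += 1
    return rows


print(select_by_id(500))
print(select_by_k(5))
```

Both calls return the same row, but select_by_k touches two structures, which is the extra cost the back-to-table step refers to.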
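To keep the index ordered, a B+ tree has to do some maintenance whenever a new value is inserted. In the example above, if the new row has ID 700, a new record only needs to be added after record R5. If the new ID is 400, it is more troublesome: the records that follow have to be logically shifted to make room.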
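The difference between appending and inserting in the middle can be seen with a flat sorted list. This is a deliberate simplification: a real B+ tree works page by page rather than on one array, so the count below only illustrates the logical shifting described above.

```python
from bisect import bisect_left


def insert_sorted(keys, new_key):
    """Insert new_key into the sorted list and return how many keys moved."""
    pos = bisect_left(keys, new_key)
    moved = len(keys) - pos  # every key after the insert position shifts
    keys.insert(pos, new_key)
    return moved


ids_append = [100, 200, 300, 500, 600]
moved_for_700 = insert_sorted(ids_append, 700)  # lands after the last record

ids_middle = [100, 200, 300, 500, 600]
moved_for_400 = insert_sorted(ids_middle, 400)  # lands between 300 and 500

print(moved_for_700, ids_append)
print(moved_for_400, ids_middle)
```

Inserting 700 moves nothing, while inserting 400 moves the records for 500 and 600.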
[References]
https://dev.mysql.com/doc/refman/8.0/en/glossary.html#glos_table
https://dev.mysql.com/doc/refman/8.0/en/glossary.html#glos_index
https://dev.mysql.com/doc/refman/8.0/en/glossary.html#glos_clustered_index
https://dev.mysql.com/doc/refman/8.0/en/glossary.html#glos_primary_key
https://dev.mysql.com/doc/refman/8.0/en/glossary.html#glos_secondary_index
http://blog.itpub.net/30126024/viewspace-2221485/