InnoDB Data Dictionary
The InnoDB data dictionary is comprised of internal system tables that contain metadata used to keep track of objects such as tables, indexes, and table columns. The metadata is physically located in the InnoDB system tablespace. For historical reasons, data dictionary metadata overlaps to some degree with information stored in InnoDB table metadata files (.frm files).
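The InnoDB data dictionary consists of internal system tables that store metadata about tables, columns, and indexes. To some extent, this duplicates the information stored in the physical .frm files.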
Doublewrite Buffer
The doublewrite buffer is a storage area located in the system tablespace where InnoDB writes pages that are flushed from the InnoDB buffer pool, before the pages are written to their proper positions in the data file.
Only after the pages have been written and flushed to the doublewrite buffer does InnoDB write them to their proper positions. If there is an operating system, storage subsystem, or mysqld process crash in the middle of a page write, InnoDB can later find a good copy of the page from the doublewrite buffer during crash recovery.
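The doublewrite buffer is a storage area in the system tablespace that holds pages already flushed from the buffer pool and waiting to be written to their positions in the data files.
Data flow: buffer pool ---> doublewrite buffer ---> data file. If a crash occurs, the doublewrite buffer gives crash recovery an intact copy of each page, which preserves data integrity.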
Although data is always written twice, the doublewrite buffer does not require twice as much I/O overhead or twice as many I/O operations. Data is written to the doublewrite buffer itself as a large sequential chunk, with a single fsync() call to the operating system.
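Although data is written twice, the doublewrite buffer does not need twice the I/O load or twice the number of I/O operations. Data is written to the doublewrite buffer as one large sequential chunk, with a single fsync() call to the operating system.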
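The data flow above can be illustrated with a toy model. This is only a sketch under stated assumptions, not InnoDB's implementation: ToyStorage, make_page, is_intact, the 16-byte page size, and the CRC32 checksum are all hypothetical names and choices made for illustration. It shows why a torn page in the data file can be repaired from the copy written earlier to the doublewrite area.

```python
import zlib
from typing import Dict, Optional

PAGE_SIZE = 16  # toy payload size in bytes; real InnoDB pages are much larger


def make_page(payload: bytes) -> bytes:
    """Pad the payload to PAGE_SIZE and prefix a 4-byte CRC32 checksum."""
    body = payload.ljust(PAGE_SIZE, b"\0")[:PAGE_SIZE]
    return zlib.crc32(body).to_bytes(4, "big") + body


def is_intact(page: bytes) -> bool:
    """A page is intact if it has full length and its checksum matches."""
    if len(page) != PAGE_SIZE + 4:
        return False
    return zlib.crc32(page[4:]).to_bytes(4, "big") == page[:4]


class ToyStorage:
    """Hypothetical stand-in for the doublewrite area plus one data file."""

    def __init__(self) -> None:
        self.doublewrite: Dict[int, bytes] = {}
        self.datafile: Dict[int, bytes] = {}

    def flush(self, dirty: Dict[int, bytes], crash_on: Optional[int] = None) -> None:
        """Write pages to the doublewrite area first, then to the data file.

        If crash_on names a page, simulate a crash that leaves that page
        half-written (torn) in the data file.
        """
        pages = {no: make_page(data) for no, data in dirty.items()}
        # Step 1: the whole batch goes to the doublewrite area in one go.
        self.doublewrite = dict(pages)
        # Step 2: each page is written to its final position.
        for no, page in pages.items():
            if no == crash_on:
                self.datafile[no] = page[: len(page) // 2]  # torn write
                return
            self.datafile[no] = page

    def recover(self) -> None:
        """Replace torn data-file pages with intact doublewrite copies."""
        for no, copy in self.doublewrite.items():
            if not is_intact(self.datafile.get(no, b"")) and is_intact(copy):
                self.datafile[no] = copy
```

Calling flush({1: b"alpha", 2: b"beta"}, crash_on=2) leaves page 2 torn in the toy data file; a later recover() restores it from the doublewrite copy.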
The doublewrite buffer is enabled by default in most cases. To disable the doublewrite buffer, set innodb_doublewrite to 0.
The doublewrite buffer is enabled by default in most cases. To disable it, set innodb_doublewrite to 0.
If system tablespace files (“ibdata files”) are located on Fusion-io devices that support atomic writes, doublewrite buffering is automatically disabled and Fusion-io atomic writes are used for all data files.
Because the doublewrite buffer setting is global, doublewrite buffering is also disabled for data files residing on non-Fusion-io hardware.
This feature is only supported on Fusion-io hardware and is only enabled for Fusion-io NVMFS on Linux. To take full advantage of this feature, an innodb_flush_method setting of O_DIRECT is recommended.
Related parameters:
innodb_doublewrite:
Dynamic: No; Scope: Global; Default Value: ON; Type: Boolean
When enabled (the default), InnoDB stores all data twice, first to the doublewrite buffer, then to the actual data files. This variable can be turned off with --skip-innodb-doublewrite for benchmarks or cases when top performance is needed rather than concern for data integrity or possible failures.
innodb_flush_method:
Dynamic: No; Scope: Global; Default Value: NULL; Type: String; Valid Values (Windows): async_unbuffered, normal, unbuffered; Valid Values (Unix): fsync, O_DSYNC, littlesync, nosync, O_DIRECT, O_DIRECT_NO_FSYNC
Defines the method used to flush data to InnoDB data files and log files, which can affect I/O throughput.
If innodb_flush_method is set to NULL on a Unix-like system, the fsync option is used by default. If innodb_flush_method is set to NULL on Windows, the async_unbuffered option is used by default.
The innodb_flush_method options for Unix-like systems include:
fsync: InnoDB uses the fsync() system call to flush both the data and log files. fsync is the default setting.
O_DSYNC: InnoDB uses O_SYNC to open and flush the log files, and fsync() to flush the data files. InnoDB does not use O_DSYNC directly because there have been problems with it on many varieties of Unix.
littlesync: This option is used for internal performance testing and is currently unsupported. Use at your own risk.
nosync: This option is used for internal performance testing and is currently unsupported. Use at your own risk.
O_DIRECT: InnoDB uses O_DIRECT (or directio() on Solaris) to open the data files, and uses fsync() to flush both the data and log files. This option is available on some GNU/Linux versions, FreeBSD, and Solaris.
O_DIRECT_NO_FSYNC: InnoDB uses O_DIRECT during flushing I/O, but skips the fsync() system call afterward. This setting is suitable for some types of file systems but not others. If you are not sure whether the file system you use requires an fsync(), for example to preserve all file metadata, use O_DIRECT instead.
The innodb_flush_method options for Windows systems include:
async_unbuffered: InnoDB uses Windows asynchronous I/O and non-buffered I/O. async_unbuffered is the default setting on Windows systems. Running MySQL server on a 4K sector hard drive on Windows is not supported with async_unbuffered. The workaround is to use innodb_flush_method=normal.
normal: InnoDB uses simulated asynchronous I/O and buffered I/O.
unbuffered: InnoDB uses simulated asynchronous I/O and non-buffered I/O.
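The defaulting and valid-value rules listed above can be summarized in a small helper. This is an illustrative sketch only: effective_flush_method is a hypothetical function, not part of MySQL, and Python's None stands in for an unset (NULL) innodb_flush_method.

```python
from typing import Optional

# Valid values as listed in the notes above.
UNIX_METHODS = {"fsync", "O_DSYNC", "littlesync", "nosync",
                "O_DIRECT", "O_DIRECT_NO_FSYNC"}
WINDOWS_METHODS = {"async_unbuffered", "normal", "unbuffered"}


def effective_flush_method(configured: Optional[str], on_windows: bool) -> str:
    """Return the flush method that would apply for a given setting.

    configured=None models innodb_flush_method left at NULL.
    """
    if configured is None:
        # NULL falls back to the platform default.
        return "async_unbuffered" if on_windows else "fsync"
    valid = WINDOWS_METHODS if on_windows else UNIX_METHODS
    if configured not in valid:
        platform = "Windows" if on_windows else "Unix-like systems"
        raise ValueError(f"{configured!r} is not a valid value on {platform}")
    return configured
```

For example, effective_flush_method(None, on_windows=False) returns "fsync", while passing "O_DIRECT" with on_windows=True raises ValueError.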
Redo Log
The redo log is a disk-based data structure used during crash recovery to correct data written by incomplete transactions.
During normal operations, the redo log encodes requests to change table data that result from SQL statements or low-level API calls. Modifications that did not finish updating the data files before an unexpected shutdown are replayed automatically during initialization, and before the connections are accepted.
The redo log is a disk-based data structure. During normal operation it records changes to data, and during crash recovery it is used to correct the data.
By default, the redo log is physically represented on disk by two files named ib_logfile0 and ib_logfile1. MySQL writes to the redo log files in a circular fashion. Data in the redo log is encoded in terms of records affected; this data is collectively referred to as redo. The passage of data through the redo log is represented by an ever-increasing LSN value.
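To make the circular write pattern concrete, the following sketch maps an ever-increasing LSN onto a position in a fixed set of log files. It is a simplified model under assumptions: locate_lsn is a hypothetical helper, the LSN is treated as a plain byte offset starting at 0, and file headers and other on-disk details of the real format are ignored.

```python
from typing import Tuple


def locate_lsn(lsn: int, file_size: int, files_in_group: int = 2) -> Tuple[int, int]:
    """Map an LSN to (file index, offset within that file) in a circular log.

    Simplified model: the log is one ring of file_size * files_in_group
    bytes, and writing wraps back to file 0 after the last file is full.
    """
    if lsn < 0 or file_size <= 0 or files_in_group <= 0:
        raise ValueError("lsn must be >= 0; sizes and counts must be positive")
    capacity = file_size * files_in_group
    position = lsn % capacity          # wrap around the ring
    return position // file_size, position % file_size
```

With two 100-byte files, LSN 150 falls in the second file (index 1, i.e. ib_logfile1 in this model) at offset 50, and LSN 250 wraps back to the first file (index 0) at offset 50.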
Changing the Number or Size of InnoDB Redo Log Files:
To change the number or the size of your InnoDB redo log files, perform the following steps:
1. Stop the MySQL server and make sure that it shuts down without errors.
2. Edit my.cnf to change the log file configuration. To change the log file size, configure innodb_log_file_size. To increase the number of log files, configure innodb_log_files_in_group.
3. Start the MySQL server again.
If InnoDB detects that the innodb_log_file_size differs from the redo log file size, it writes a log checkpoint, closes and removes the old log files, creates new log files at the requested size, and opens the new log files.
Related parameters:
innodb_log_file_size:
Dynamic: No; Scope: Global; Default Value: 48M; Minimum Value (>= 5.7.11): 4M; Minimum Value (<= 5.7.10): 1M; Maximum Value: 512GB / innodb_log_files_in_group
The size in bytes of each log file in a log group. The combined size of log files (innodb_log_file_size * innodb_log_files_in_group) cannot exceed a maximum value that is slightly less than 512GB. A pair of 255 GB log files, for example, approaches the limit but does not exceed it. The default value is 48MB.
The minimum innodb_log_file_size value was increased from 1MB to 4MB in MySQL 5.7.11.
innodb_log_files_in_group:
Dynamic: No; Scope: Global; Default Value: 2; Minimum Value: 2; Maximum Value: 100
The number of log files in the log group. InnoDB writes to the files in a circular fashion. The default (and recommended) value is 2. The location of the files is specified by innodb_log_group_home_dir.
innodb_log_group_home_dir:
The directory path to the InnoDB redo log files, whose number is specified by innodb_log_files_in_group.
If you do not specify any InnoDB log variables, the default is to create two files named ib_logfile0 and ib_logfile1 in the MySQL data directory. Log file size is given by the innodb_log_file_size system variable.
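Before editing my.cnf, it can help to sanity-check a planned redo log configuration against the limits quoted above. The sketch below is hypothetical (check_redo_config is not a MySQL tool). It assumes the 5.7.11+ minimum of 4MB, and because the notes only say the combined limit is "slightly less than 512GB", it approximates that limit as strictly below 512GB.

```python
from typing import List

MB = 1024 ** 2
GB = 1024 ** 3


def check_redo_config(file_size: int, files_in_group: int = 2,
                      min_file_size: int = 4 * MB) -> List[str]:
    """Return a list of problems with a planned redo log configuration.

    file_size and min_file_size are in bytes. An empty list means the
    values pass these approximate checks.
    """
    problems = []
    if file_size < min_file_size:
        problems.append("innodb_log_file_size is below the minimum")
    if not 2 <= files_in_group <= 100:
        problems.append("innodb_log_files_in_group must be between 2 and 100")
    # Assumption: the exact limit is slightly under 512GB; use < 512GB here.
    if file_size * files_in_group >= 512 * GB:
        problems.append("combined redo log size reaches the 512GB limit")
    return problems
```

The defaults (48MB, 2 files) pass, as does the pair of 255GB files mentioned above, while a pair of 256GB files is flagged.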
Group Commit for Redo Log Flushing
InnoDB, like any other ACID-compliant database engine, flushes the redo log of a transaction before it is committed.
InnoDB uses group commit functionality to group multiple such flush requests together to avoid one flush for each commit. With group commit, InnoDB issues a single write to the log file to perform the commit action for multiple user transactions that commit at about the same time, significantly improving throughput.
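The throughput benefit can be seen in a deliberately simple model. This sketch does not describe how InnoDB actually decides which commits share a flush; count_flushes and its time-window rule are assumptions invented for illustration. It only counts how many flushes are needed when commits that arrive close together are served by one flush.

```python
from typing import Iterable


def count_flushes(commit_times: Iterable[float], window: float) -> int:
    """Count log flushes when commits within `window` seconds share a flush.

    Toy rule (an assumption): a flush started by a commit at time t also
    covers every commit that arrives up to t + window.
    """
    flushes = 0
    covered_until = None
    for t in sorted(commit_times):
        if covered_until is None or t > covered_until:
            flushes += 1               # this commit starts a new flush
            covered_until = t + window
    return flushes
```

With commits at 0.0, 0.1, 0.2, and 5.0 seconds, a window of 0.5 needs 2 flushes, whereas a window of 0 (no grouping) needs 4.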
Undo Logs
An undo log is a collection of undo log records associated with a single read-write transaction. An undo log record contains information about how to undo the latest change by a transaction to a clustered index record.
If another transaction needs to see the original data as part of a consistent read operation, the unmodified data is retrieved from undo log records. Undo logs exist within undo log segments, which are contained within rollback segments. Rollback segments reside in the system tablespace, in undo tablespaces, and in the temporary tablespace.
Undo logs that reside in the temporary tablespace are used for transactions that modify data in user-defined temporary tables. These undo logs are not redo-logged, as they are not required for crash recovery. They are used only for rollback while the server is running. This type of undo log benefits performance by avoiding redo logging I/O.
InnoDB supports a maximum of 128 rollback segments, 32 of which are allocated to the temporary tablespace. This leaves 96 rollback segments that can be assigned to transactions that modify data in regular tables. The innodb_rollback_segments variable defines the number of rollback segments used by InnoDB.
The number of transactions that a rollback segment supports depends on the number of undo slots in the rollback segment and the number of undo logs required by each transaction.
Number of Undo Slots in a Rollback Segment = InnoDB Page Size / 16
Undo logs are assigned as needed. For example, a transaction that performs INSERT, UPDATE, and DELETE operations on regular and temporary tables requires a full assignment of four undo logs. A transaction that performs only INSERT operations on regular tables requires a single undo log.
An undo log assigned to a transaction remains tied to the transaction for its duration. For example, an undo log assigned to a transaction for an INSERT operation on a regular table is used for all INSERT operations on regular tables performed by that transaction.
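The assignment rule described above can be expressed as a small counting function. This is a sketch of the rule as these notes describe it, not InnoDB code: undo_logs_needed is a hypothetical helper, and it assumes one undo log per combination of operation class (INSERT versus UPDATE/DELETE) and table kind (regular versus temporary), which is what yields the maximum of four.

```python
from typing import Iterable, Tuple


def undo_logs_needed(operations: Iterable[Tuple[str, str]]) -> int:
    """Count the undo logs a transaction needs.

    operations: (statement, table_kind) pairs such as ("INSERT", "regular")
    or ("DELETE", "temporary").
    """
    assigned = set()
    for statement, table_kind in operations:
        statement = statement.upper()
        if statement not in ("INSERT", "UPDATE", "DELETE"):
            raise ValueError(f"unsupported statement: {statement}")
        if table_kind not in ("regular", "temporary"):
            raise ValueError(f"unsupported table kind: {table_kind}")
        # UPDATE and DELETE share one undo log; INSERT gets its own.
        log_class = "insert" if statement == "INSERT" else "update"
        assigned.add((log_class, table_kind))
    return len(assigned)
```

A transaction that only inserts into regular tables needs one undo log, however many INSERT statements it runs; one that performs INSERT, UPDATE, and DELETE on both regular and temporary tables needs four.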
A transaction can encounter a concurrent transaction limit error before reaching the number of concurrent read-write transactions that InnoDB is capable of supporting. This occurs when the rollback segment assigned to a transaction runs out of undo slots. In such cases, try rerunning the transaction.
When transactions perform operations on temporary tables, the number of concurrent read-write transactions that InnoDB is capable of supporting is constrained by the number of rollback segments allocated to the temporary tablespace, which is 32.
If each transaction performs either an INSERT or an UPDATE or DELETE operation, the number of concurrent read-write transactions that InnoDB is capable of supporting is:
(innodb_page_size / 16) * (innodb_rollback_segments - 32)
If each transaction performs an INSERT and an UPDATE or DELETE operation, the number of concurrent read-write transactions that InnoDB is capable of supporting is:
(innodb_page_size / 16 / 2) * (innodb_rollback_segments - 32)
If each transaction performs an INSERT operation on a temporary table, the number of concurrent read-write transactions that InnoDB is capable of supporting is:
(innodb_page_size / 16) * 32
If each transaction performs an INSERT and an UPDATE or DELETE operation on a temporary table, the number of concurrent read-write transactions that InnoDB is capable of supporting is:
(innodb_page_size / 16 / 2) * 32
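The four formulas above can be evaluated with one function. This is a minimal sketch: max_concurrent_rw_transactions is a hypothetical helper, and its defaults assume a 16KB page size and innodb_rollback_segments = 128.

```python
def max_concurrent_rw_transactions(page_size: int = 16384,
                                   rollback_segments: int = 128,
                                   temporary: bool = False,
                                   insert_and_modify: bool = False) -> int:
    """Evaluate the concurrency formulas from the notes above.

    temporary: operations are on temporary tables (32 segments apply).
    insert_and_modify: each transaction does an INSERT and an UPDATE or
    DELETE, so it needs two undo logs and the usable slots are halved.
    """
    slots = page_size // 16            # undo slots per rollback segment
    if insert_and_modify:
        slots //= 2
    segments = 32 if temporary else rollback_segments - 32
    return slots * segments


if __name__ == "__main__":
    print(max_concurrent_rw_transactions())                        # 98304
    print(max_concurrent_rw_transactions(insert_and_modify=True))  # 49152
    print(max_concurrent_rw_transactions(temporary=True))          # 32768
    print(max_concurrent_rw_transactions(temporary=True,
                                         insert_and_modify=True))  # 16384
```

With the assumed defaults, regular tables support 1024 * 96 = 98304 concurrent read-write transactions when each needs one undo log, and half that when each needs two.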