InnoDB maintains a storage area called the buffer pool for caching data and indexes in memory. Knowing how the InnoDB buffer pool works, and taking advantage of it to keep frequently accessed data in memory, is an important aspect of MySQL tuning. >> The InnoDB storage engine maintains a region of memory called the InnoDB buffer pool to cache data and indexes. Data is read from disk into memory once, and later accesses go straight to the in-memory copy. Tuning the InnoDB buffer pool is an important part of MySQL tuning.
For additional information about the InnoDB buffer pool, see Section 14.3.3, “InnoDB Buffer Pool Configuration”.
Ideally, you set the size of the buffer pool to as large a value as practical, leaving enough memory for other processes on the server to run without excessive paging. The larger the buffer pool, the more InnoDB acts like an in-memory database, reading data from disk once and then accessing the data from memory during subsequent reads. The buffer pool even caches data changed by insert and update operations, so that disk writes can be grouped together for better performance. >> Ideally, make the InnoDB buffer pool as large as possible while leaving the system enough memory to avoid swapping. That way data is read once and accessed many times. The buffer pool also caches newly inserted and modified data, so those changes can be merged and written together, which greatly reduces the number of disk I/O operations.
Depending on the typical workload on your system, you might adjust the proportions of the parts within the buffer pool. You can tune the way the buffer pool chooses which blocks to cache once it fills up, to keep frequently accessed data in memory despite sudden spikes of activity for operations such as backups or reporting. >> Depending on your system's load, you may need to adjust the size of the parts within the InnoDB buffer pool. When generating reports or taking backups, some parameters can be tuned to keep the buffer pool from being polluted by large amounts of data that is used only once.
On 64-bit systems with large memory sizes, you can split the buffer pool into multiple parts, to minimize contention for the memory structures among concurrent operations. For details, see Section 14.3.3.4, “Using Multiple Buffer Pool Instances”. >> On a 64-bit operating system, where more memory is usable, you can divide the InnoDB buffer pool into several smaller pools to reduce resource contention under concurrency.
InnoDB manages the pool as a list, using a variation of the least recently used (LRU) algorithm. When room is needed to add a new block to the pool, InnoDB evicts the least recently used block and adds the new block to the middle of the list. This “midpoint insertion strategy” treats the list as two sublists: >> InnoDB manages the buffer pool as a linked list, using an LRU algorithm. When a new block must be read into memory and no free memory is left, InnoDB evicts the least recently used block and inserts the new block at the head of the cold end. This midpoint insertion strategy (the insertion point is actually 3/8 of the way from the tail) splits the LRU list into two parts:
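At the head, a sublist of “new” (or “young”) blocks that were accessed recently. >> The front is the hot end, which holds the blocks used most often recently.
At the tail, a sublist of “old” blocks that were accessed less recently. >> The tail is the cold end, which holds the least recently used blocks.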
This algorithm keeps blocks that are heavily used by queries in the new sublist. The old sublist contains less-used blocks; these blocks are candidates for eviction. >> The LRU algorithm keeps frequently used blocks at the hot end and rarely accessed blocks at the cold end.
The LRU algorithm operates as follows by default: >> By default, the MySQL LRU algorithm works as follows:
3/8 of the buffer pool is devoted to the old sublist. >> The cold end runs from the tail to the point 3/8 of the way from the tail.
The midpoint of the list is the boundary where the tail of the new sublist meets the head of the old sublist. >> The midpoint is the boundary between the cold end and the hot end.
When InnoDB reads a block into the buffer pool, it initially inserts it at the midpoint (the head of the old sublist). A block can be read in because it is required for a user-specified operation such as an SQL query, or as part of a read-ahead operation performed automatically by InnoDB. >> When InnoDB reads a block into the buffer pool, the block is inserted at the midpoint (the head of the cold end). Both SQL queries and read-ahead operations cause blocks to be read into the buffer pool.
Accessing a block in the old sublist makes it “young”, moving it to the head of the buffer pool (the head of the new sublist). If the block was read in because it was required, the first access occurs immediately and the block is made young. If the block was read in due to read-ahead, the first access does not occur immediately (and might not occur at all before the block is evicted). >> A block in the cold end moves to the head of the hot end once it is accessed. A block read in because an operation needs it is accessed almost immediately after it lands in the cold end, so it moves to the head of the hot end. A block read in by read-ahead may not be accessed right away, and may never be accessed before it is evicted.
As the database operates, blocks in the buffer pool that are not accessed “age” by moving toward the tail of the list. Blocks in both the new and old sublists age as other blocks are made new. Blocks in the old sublist also age as blocks are inserted at the midpoint. Eventually, a block that remains unused for long enough reaches the tail of the old sublist and is evicted. >> As the database runs, blocks that are not accessed “age”, that is, they move toward the tail of the list. Blocks at both ends age whenever a cold-end block is accessed and moved to the hot end. Cold-end blocks also age as new blocks are inserted at the midpoint. Eventually a block that has gone unused for a long time reaches the tail of the LRU list and is removed from it.
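The steps above can be sketched as a small Python model. This is only a toy illustration, not InnoDB's actual implementation: the class name MidpointLRU and its methods are made up for this example, the pool starts full, the old sublist is sized as a whole number of blocks, and an access to a block that is already in the new sublist is ignored.

```python
class MidpointLRU:
    """Toy model of an LRU list with midpoint insertion (not InnoDB's real code)."""

    def __init__(self, pages, old_pct=37):
        # Index 0 is the head of the new sublist; the last item is the tail
        # of the old sublist. The pool starts full to keep the model simple.
        self.pages = list(pages)
        self.capacity = len(self.pages)
        # The old sublist takes roughly old_pct percent of the list.
        self.old_len = max(1, self.capacity * old_pct // 100)
        self.midpoint = self.capacity - self.old_len

    def read_in(self, page):
        """Evict the tail block, then insert the new block at the midpoint."""
        evicted = self.pages.pop()
        self.pages.insert(self.midpoint, page)
        return evicted

    def access(self, page):
        """Touch a cached block; a block in the old sublist is made young."""
        idx = self.pages.index(page)
        if idx >= self.midpoint:
            self.pages.insert(0, self.pages.pop(idx))
            return True
        return False  # simplification: blocks in the new sublist stay put


pool = MidpointLRU("ABCDEFGH")   # old sublist is G, H (8 * 37 // 100 = 2)
print(pool.read_in("X"))         # demand read, evicts the tail block: H
print(pool.access("X"))          # accessed right away, so X is made young: True
print(pool.read_in("Y"))         # read-ahead, loaded but never accessed: G
print(pool.pages)                # ['X', 'A', 'B', 'C', 'D', 'E', 'Y', 'F']
```

In the final list, the read-ahead block Y sits at the head of the old sublist, and F has aged into the old sublist because X was made young.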
By default, blocks read by queries immediately move into the new sublist, meaning they will stay in the buffer pool for a long time. A table scan (such as performed for a mysqldump operation, or a SELECT statement with no WHERE clause) can bring a large amount of data into the buffer pool and evict an equivalent amount of older data, even if the new data is never used again. Similarly, blocks that are loaded by the read-ahead background thread and then accessed only once move to the head of the new list. These situations can push frequently used blocks to the old sublist, where they become subject to eviction. >> By default, a block read into the buffer pool by a query is accessed immediately and moves to the hot end of the LRU list, so it stays in the buffer pool for a relatively long time. A full table scan (for example mysqldump, or a SELECT without a WHERE condition) reads a large amount of data into the buffer pool and evicts an equal amount of older data, even if the newly read data is never used again. Likewise, some blocks loaded by read-ahead are accessed only once and still move to the hot end (and many read-ahead blocks are never accessed at all). These situations push frequently accessed data to the cold end, where it may be evicted from the LRU list, which is exactly what we want to avoid.
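Several InnoDB system variables control the size of the buffer pool and let you tune the LRU algorithm: >> The following parameters relate to the InnoDB buffer pool. They let you adjust the size of the buffer pool and tune the LRU algorithm (see the official MySQL documentation for full details on each parameter).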
innodb_buffer_pool_size
Specifies the size of the buffer pool. If your buffer pool is small and you have sufficient memory, making the pool larger can improve performance by reducing the amount of disk I/O needed as queries access InnoDB tables. >> innodb_buffer_pool_size sets the size of the buffer pool.
innodb_buffer_pool_instances
Divides the buffer pool into a user-specified number of separate regions, each with its own LRU list and related data structures, to reduce contention during concurrent memory read and write operations. This option takes effect only when you set innodb_buffer_pool_size to a size of 1 gigabyte or more. The total size you specify is divided among all the buffer pools. For best efficiency, specify a combination of innodb_buffer_pool_instances and innodb_buffer_pool_size so that each buffer pool instance is at least 1 gigabyte. >> innodb_buffer_pool_instances sets the number of buffer pool sub-pools. Each sub-pool maintains its own LRU list and related data structures, which reduces contention between concurrent memory reads and writes. It takes effect only when innodb_buffer_pool_size is at least 1 GB.
innodb_old_blocks_pct
Specifies the approximate percentage of the buffer pool that InnoDB uses for the old block sublist. The range of values is 5 to 95. The default value is 37 (that is, 3/8 of the pool). >> Sets the share of the LRU list taken by the cold end. The value ranges from 5 to 95, and the default is 37 (that is, 3/8 of the pool).
innodb_old_blocks_time
Specifies how long in milliseconds (ms) a block inserted into the old sublist must stay there after its first access before it can be moved to the new sublist. The default value is 0: A block inserted into the old sublist moves to the new sublist when InnoDB has evicted 1/4 of the inserted block's pages from the buffer pool, no matter how soon after insertion the access occurs. If the value is greater than 0, blocks remain in the old sublist until an access occurs at least that many ms after the first access. For example, a value of 1000 causes blocks to stay in the old sublist for 1 second after the first access before they become eligible to move to the new sublist. >> Sets how many milliseconds a block in the cold end must stay there after its first access before it can move to the hot end. The default is 0, which means a cold-end block moves to the hot end as soon as it is first accessed. With a value greater than 0, the block must wait in the cold end for the specified time after its first access before it can move. (I could not work out what the sentence marked in red above means.)
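The sizing guidance for innodb_buffer_pool_instances and innodb_old_blocks_pct above comes down to simple arithmetic, sketched here in Python. The helper plan_buffer_pool is hypothetical and written only for this illustration; it assumes the total is split evenly and ignores any rounding the server itself may apply.

```python
GIB = 1024 ** 3


def plan_buffer_pool(pool_size_bytes, instances, old_blocks_pct=37):
    """Return the per-instance size and the approximate old-sublist share."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    if not 5 <= old_blocks_pct <= 95:
        raise ValueError("innodb_old_blocks_pct must be between 5 and 95")
    # The total size is divided among all buffer pool instances.
    per_instance = pool_size_bytes // instances
    return {
        "per_instance_bytes": per_instance,
        "instance_at_least_1gib": per_instance >= GIB,
        "old_sublist_bytes": per_instance * old_blocks_pct // 100,
    }


plan = plan_buffer_pool(8 * GIB, instances=8)
print(plan["per_instance_bytes"] // GIB, plan["instance_at_least_1gib"])  # 1 True
```

With the same 8 GiB total and 16 instances, each instance would get only 0.5 GiB, which falls below the 1 gigabyte guideline.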
Setting innodb_old_blocks_time greater than 0 prevents one-time table scans from flooding the new sublist with blocks used only for the scan. Rows in a block read in for a scan are accessed many times in rapid succession, but the block is unused after that. If innodb_old_blocks_time is set to a value greater than the time needed to process the block, the block remains in the “old” sublist and ages to the tail of the list to be evicted quickly. This way, blocks used only for a one-time scan do not act to the detriment of heavily used blocks in the new sublist. >> Setting innodb_old_blocks_time above 0 keeps table scans from polluting the buffer pool. Blocks read in by a full table scan may be accessed many times within a very short period and then never again, so this kind of data should not be allowed to pollute the buffer pool. If the value is longer than the time the scan needs to process a block, the scanned blocks stay in the cold end and are soon evicted from the buffer pool. This way, one-off data does not disturb the frequently used blocks in the hot end.
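The effect described above can be seen in a small Python simulation. It is a simplified model under stated assumptions, not InnoDB's code: the class DelayedLRU, the helper surviving_hot_pages, the 8-block pool, and the timestamps are all invented for this example. Each scanned block is touched three times within a few milliseconds, standing in for rows read in rapid succession.

```python
class DelayedLRU:
    """Toy midpoint-insertion LRU with an innodb_old_blocks_time-style delay."""

    def __init__(self, pages, old_pct=37, old_blocks_time_ms=0):
        self.pages = list(pages)       # index 0 = head, last item = tail
        old_len = max(1, len(self.pages) * old_pct // 100)
        self.midpoint = len(self.pages) - old_len
        self.old_blocks_time_ms = old_blocks_time_ms
        self.first_access_ms = {}      # first access time while in the old sublist

    def read_in(self, page):
        evicted = self.pages.pop()     # evict the tail block
        self.first_access_ms.pop(evicted, None)
        self.pages.insert(self.midpoint, page)

    def access(self, page, now_ms):
        idx = self.pages.index(page)
        if idx < self.midpoint:
            return                     # already in the new sublist
        first = self.first_access_ms.setdefault(page, now_ms)
        if now_ms - first >= self.old_blocks_time_ms:
            del self.first_access_ms[page]
            self.pages.insert(0, self.pages.pop(idx))   # make the block young


def surviving_hot_pages(old_blocks_time_ms, scan_pages=20):
    """Count how many of the original blocks remain after a one-time scan."""
    hot = list("ABCDEFGH")
    pool = DelayedLRU(hot, old_blocks_time_ms=old_blocks_time_ms)
    for i in range(scan_pages):
        page = f"scan{i}"
        pool.read_in(page)
        for k in range(3):             # rows read in rapid succession
            pool.access(page, now_ms=i * 10 + k)
    return sum(1 for p in hot if p in pool.pages)


print(surviving_hot_pages(0))      # 0: the scan pushes out every original block
print(surviving_hot_pages(1000))   # 6: only the two old-sublist blocks are lost
```

With the delay set, scanned blocks never leave the old sublist, so they only displace each other and the blocks in the new sublist are untouched.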
innodb_old_blocks_time can be set at runtime, so you can change it temporarily while performing operations such as table scans and dumps: >> innodb_old_blocks_time can be changed dynamically, so you can set it before running an operation such as a full table scan.
SET GLOBAL innodb_old_blocks_time = 1000;
-- ... perform queries that scan tables ...
SET GLOBAL innodb_old_blocks_time = 0;
This strategy does not apply if your intent is to “warm up” the buffer pool by filling it with a table's content. For example, benchmark tests often perform a table or index scan at server startup, because that data would normally be in the buffer pool after a period of normal use. In this case, leave innodb_old_blocks_time set to 0, at least until the warmup phase is complete.
The output from the InnoDB Standard Monitor contains several fields in the BUFFER POOL AND MEMORY section that pertain to operation of the buffer pool LRU algorithm: >> The following fields come from the BUFFER POOL AND MEMORY section of the InnoDB Standard Monitor and relate to the buffer pool and the LRU list.
Old database pages: The number of pages in the old sublist of the buffer pool. >> The number of pages in the cold end of the LRU list.
Pages made young, not young: The number of old pages that were moved to the head of the buffer pool (the new sublist), and the number of pages that have remained in the old sublist without being made new. >> The pages moved from the cold end to the hot end, and the pages that stayed in the cold end without being moved.
youngs/s non-youngs/s: The number of accesses to old pages that have resulted in making them young or not. This metric differs from that of the previous item in two ways. First, it relates only to old pages. Second, it is based on the number of accesses to pages and not the number of pages. (There can be multiple accesses to a given page, all of which are counted.) >> This value is a little hard to follow. One way to read it: the number of accesses to a block before it moves to the hot end (a block may be accessed several times before it is moved).
young-making rate: Hits that cause blocks to move to the head of the buffer pool. >> Hits on a block that cause it to move to the head of the LRU list.
not: Hits that do not cause blocks to move to the head of the buffer pool (due to the delay not being met). >> Hits on a block that do not move it to the head of the LRU list (because innodb_old_blocks_time is set).
The young-making rate and not rate will not normally add up to the overall buffer pool hit rate. Hits for blocks in the old sublist cause them to move to the new sublist, but hits to blocks in the new sublist cause them to move to the head of the list only if they are a certain distance from the head.
The preceding information from the Monitor can help you make LRU tuning decisions: >> The monitoring information above can help you tune the LRU algorithm.
If you see very low youngs/s values when you do not have large scans going on, that indicates that you might need to either reduce the delay time, or increase the percentage of the buffer pool used for the old sublist. Increasing the percentage makes the old sublist larger, so blocks in that sublist take longer to move to the tail and be evicted. This increases the likelihood that they will be accessed again and be made young. >> If youngs/s is low while no large scan is running, innodb_old_blocks_time may be set too high. Lower it, or increase the share of the LRU list given to the cold end.
If you do not see a lot of non-youngs/s when you are doing large table scans (and you see lots of youngs/s), tune your delay value to be larger.
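The two rules of thumb above can be turned into a rough check, sketched here in Python. The sample text is an illustrative excerpt with made-up numbers, the regular expressions assume the monitor lines look like that sample, and the function names and the low_youngs_per_s threshold are hypothetical choices for this example rather than documented values.

```python
import re

# Illustrative excerpt only; the numbers are made up for this example.
SAMPLE = """\
Old database pages 3021
Pages made young 1203, not young 45879
1.25 youngs/s, 310.50 non-youngs/s
"""


def parse_lru_stats(text):
    """Extract the LRU-related counters from monitor output text."""
    old = re.search(r"Old database pages\s+(\d+)", text)
    made = re.search(r"Pages made young\s+(\d+), not young\s+(\d+)", text)
    rate = re.search(r"([\d.]+) youngs/s, ([\d.]+) non-youngs/s", text)
    return {
        "old_pages": int(old.group(1)),
        "made_young": int(made.group(1)),
        "not_young": int(made.group(2)),
        "youngs_per_s": float(rate.group(1)),
        "non_youngs_per_s": float(rate.group(2)),
    }


def suggest(stats, scanning, low_youngs_per_s=1.0):
    """Apply the two rules of thumb; the threshold is an arbitrary assumption."""
    if not scanning and stats["youngs_per_s"] < low_youngs_per_s:
        return "lower innodb_old_blocks_time or raise innodb_old_blocks_pct"
    if scanning and stats["non_youngs_per_s"] < stats["youngs_per_s"]:
        return "raise innodb_old_blocks_time"
    return "no change suggested"


stats = parse_lru_stats(SAMPLE)
print(stats["youngs_per_s"], stats["non_youngs_per_s"])
print(suggest(stats, scanning=True))
```

Treat the result as a prompt to look closer, since what counts as "very low" or "a lot" depends on the workload.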
Per second averages provided in InnoDB Monitor output are based on the elapsed time between the current time and the last time InnoDB Monitor output was printed.
For more information about InnoDB Monitors, see Section 14.14, “InnoDB Monitors”.
The INNODB_BUFFER_POOL_STATS table and InnoDB buffer pool server status variables provide much of the same buffer pool information that is provided by SHOW ENGINE INNODB STATUS output.