innodb_buffer_pool缓冲池有两个区域,一个是sublist of new blocks区域(经常被访问的数据---热数据),一个是sublist of old blocks区域(不经常访问的数据)。当用户访问数据时,如果缓冲区里有相应的数据则直接返回,否则会从磁盘读入缓冲区的sublist of old blocks区域,然后在移动到sublist of new blocks,并通过LRU算法来刷出旧的数据页。
但是如果有些sql语句做统计要用到全表扫描,或mysqldump这时就会进入sublist of new blocks区域,把一些真正的热数据“剔出去”,这样照成缓冲区的数据进进出出,导致磁盘IO频繁。
所以从MySQL5.5.X开始,innodb_old_blocks_pct参数可以控制进入缓冲区sublist of old blocks区域的数量,默认是73,占整个缓冲池比例的3/8。当全表扫描一个大表或是MySQLdump时,就可以将innodb_old_blocks_pct设置得小些,例如设置innodb_old_blocks_pct=5,使数据块进入少量sublist of old blocks区域,并移动到sublist of new blocks
区域,从而让更多的热数据不被踢出。当访问一个小表时,则可以保持默认的innodb_old_blocks_pct=37,或者设置的更大,比如innodb_old_blocks_pct=50。
在该过程中,还会涉及另一个参数innodb_old_blocks_time。在访问sublist of old blocks区域里的数据块时,并不是马上就移动到sublist of new blocks区域,而是会先让数据停留在sublist of old blocks区域innodb_old_blocks_time(微秒),然后再移动到sublist of new blocks区域,这样就保证了sublist of new blocks区域里的数据不会马上被踢出。
用show engine innodb status\G可以查看当前信息:
Pages made young 125652177, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 125996920, created 15168, written 2012416
0.00 reads/s, 0.00 creates/s, 0.07 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 8190, unzip_LRU len: 0
I/O sum[3]:cur[0], unzip sum[0]:cur[0]
Old database pages:在旧区域存放着多少页
Pages made young:移动到新区域的有多少页
Pages made not young:没有移动到新区域的有多少页
youngs/s:每秒移动到新区域的有多少页
non-youngs/s:每秒没有移动到新区域的有多少页
young-making rate:移动到新区域的比例
young-making not rate:没有移动到新区域的比例
如果没有全表扫描,发现youngs/s的值很小,那么久应该增大innodb_old_blocks_pct或者减小innodb_old_blocks_time。如果进行了全表扫描,发现non-youngs/s的值很小,那么久应该增大innodb_old_blocks_time。
关于innodb_old_blocks_pct和innodb_old_blocks_time参数的介绍,可参考官方手册:
Rather than using a strictly LRU algorithm, InnoDB uses a technique to minimize the amount of data that is brought into the buffer cache and never accessed again. The goal is to make sure that frequently accessed (“hot”) pages remain in the buffer cache, even as read-ahead and full table scans bring new blocks in that might or might not be accessed afterward.
Newly read blocks are inserted into the middle of the list representing the buffer cache. of the LRU list. All newly read pages are inserted at a location that by default is 3/8
from the tail of the LRU list. The pages are moved to the front of the list (the most-recently used end) when they are accessed in the buffer cache for the first time. Thus pages that are never accessed never make it to the front portion of the LRU list, and “age out” sooner than with a strict LRU approach. This arrangement divides the LRU list into two segments, where the pages downstream of the insertion point are considered “old” and are desirable victims for LRU eviction.
For an explanation of the inner workings of the InnoDB buffer pool and the specifics of its LRU replacement algorithm, see Section 7.8.1, “The InnoDB
Buffer Pool”.
Starting with InnoDB storage engine 1.0.5, you can control the insertion point in the LRU list, and choose whether InnoDB applies the same optimization to blocks brought into the buffer pool by table or index scans.
The configuration parameter innodb_old_blocks_pct
controls the percentage of “old” blocks in the LRU list. The default value of innodb_old_blocks_pct
is 37
, corresponding to the original fixed ratio of 3/8. The value range is 5
(new pages in the buffer pool age out very quickly) to 95
(only 5% of the buffer pool reserved for hot pages, making the algorithm close to the familiar LRU strategy).
The optimization that keeps the buffer cache from being churned by read-ahead can avoid similar problems due to table or index scans. In these scans, a data page is typically accessed a few times in quick succession and is never touched again. The configuration parameter innodb_old_blocks_time
specifies the time window (in milliseconds) after the first access to a page during which it can be accessed without being moved to the front (most-recently used end) of the LRU list. The default value of innodb_old_blocks_time
is 0
, corresponding to the original behavior of moving a page to the most-recently used end of the buffer pool list when it is first accessed in the buffer pool. Increasing this value makes more and more blocks likely to age out faster from the buffer pool.
Both the new parameters innodb_old_blocks_pct
and innodb_old_blocks_time
are dynamic, global and can be specified in the MySQL option file (my.cnf
or my.ini
) or changed at runtime with the SET GLOBAL
command. Changing the setting requires the SUPER
privilege.
To help you gauge the effect of setting these parameters, some additional statistics are reported by SHOW ENGINE INNODB STATUS
command. The BUFFER POOL AND MEMORY
section now looks like:
Total memory allocated 1107296256; in additional pool allocated 0 Dictionary memory allocated 80360 Buffer pool size 65535 Free buffers 0 Database pages 63920 Old database pages 23600 Modified db pages 34969 Pending reads 32 Pending writes: LRU 0, flush list 0, single page 0 Pages made young 414946, not young 2930673 1274.75 youngs/s, 16521.90 non-youngs/s Pages read 486005, created 3178, written 160585 2132.37 reads/s, 3.40 creates/s, 323.74 writes/s Buffer pool hit rate 950 / 1000, young-making rate 30 / 1000 not 392 / 1000 Pages read ahead 1510.10/s, evicted without access 0.00/s LRU len: 63920, unzip_LRU len: 0 I/O sum[43690]:cur[221], unzip sum[0]:cur[0]
Old database pages
is the number of pages in the “old” segment of the LRU list.
Pages made young
and not young
is the total number of “old” pages that have been made young or not respectively.
youngs/s
and non-young/s
is the rate at which page accesses to the “old” pages have resulted in making such pages young or otherwise respectively since the last invocation of the command.
young-making rate
and not
provides the same rate but in terms of overall buffer cache accesses instead of accesses just to the “old” pages.
Because the effects of these parameters can vary widely based on your hardware configuration, your data, and the details of your workload, always benchmark to verify the effectiveness before changing these settings in any performance-critical or production environment.
In mixed workloads where most of the activity is OLTP type with periodic batch reporting queries which result in large scans, setting the value of innodb_old_blocks_time
during the batch runs can help keep the working set of the normal workload in the buffer cache.
When scanning large tables that cannot fit entirely in the buffer pool, setting innodb_old_blocks_pct
to a small value keeps the data that is only read once from consuming a significant portion of the buffer pool. For example, setting innodb_old_blocks_pct
=5 restricts this data that is only read once to 5% of the buffer pool.
When scanning small tables that do fit into memory, there is less overhead for moving pages around within the buffer pool, so you can leave innodb_old_blocks_pct
at its default value, or even higher, such as innodb_old_blocks_pct
=50.
The effect of the innodb_old_blocks_time
parameter is harder to predict than the innodb_old_blocks_pct
parameter, is relatively small, and varies more with the workload. To arrive at an optimal value, conduct your own benchmarks if the performance improvement from adjusting innodb_old_blocks_pct
is not sufficient.