记录一下线上业务数据库每天需要监控的参数和变量情况,声明下:每家公司需要监控的状态变量均不完全相同,本文仅是针对作者自己负责的业务数据库而言,另外,其中很多的参数均是从MySQL官网摘取的英文译著,对其中的错误之处,望请指正!


InnoDB pages监控

total(显示了缓冲池总共有多少个页,即Buffer pool size,页数x页大小)
 
data(Database page,显示了分配用来存储数据库页的页数,即,表示LRU列表中页的数量,包含young sublist和old sublist)
 
dirty(Modified db pages,显示脏数据库页数)
 
flushed(Innodb_buffer_pool_pages_flushed)
 
free(Free buffers,显示了缓冲池空闲页数)


InnoDB IO监控

mysql> show status like 'innodb_buffer_pool_read%';
| Innodb_buffer_pool_read_requests      | 1085462370751 |
| Innodb_buffer_pool_reads              | 31655         |
+---------------------------------------+---------------+
Innodb_buffer_pool_read_requests
    The number of logical read requests
Innodb_buffer_pool_reads
    The number of logical reads that InnoDB could not satisfy from the buffer pool, and had to read directly from disk
 
Innodb_buffer_read_hits=
(1-innodb_buffer_pool_reads/ innodb_buffer_pool_read_requests)*100%
 
mysql> show status like 'innodb_buffer_pool_pages_flushed';
+----------------------------------+----------+
| Variable_name                    | Value    |
+----------------------------------+----------+
| Innodb_buffer_pool_pages_flushed | 16896625 |
+----------------------------------+----------+
The number of requests to flush pages from the InnoDB buffer pool.


InnoDB_rows监控

mysql> show global status like 'innodb_rows_%';
+----------------------+---------------+
| Variable_name        | Value         |
+----------------------+---------------+
| Innodb_rows_deleted  | 4377501       |
| Innodb_rows_inserted | 13328425      |
| Innodb_rows_read     | 1592292250994 |
| Innodb_rows_updated  | 8069672       |
+----------------------+---------------+
 
Innodb_rows_deleted
    The number of rows deleted from InnoDB tables.
Innodb_rows_inserted
    The number of rows inserted into InnoDB tables.
Innodb_rows_read
    The number of rows read from InnoDB tables.
Innodb_rows_updated
    The number of rows updated in InnoDB tables.


键缓存监控

Key_blocks_used命中率:
    Key_blocks_used/( Key_blocks_used+ Key_blocks_unused)x100%
 
mysql> show status like 'key%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| Key_blocks_unused      | 25210   |
| Key_blocks_used        | 7907     |
 
Key_blocks_unused
    The number of unused blocks in the MyISAM key cache. You can use this value to determine how much of the key cache is in use; see the discussion of key_buffer_size
Key_blocks_used
    The number of used blocks in the MyISAM key cache. This value is a high-water mark that indicates the maximum number of blocks that have ever been in use at one time.
 
Key_buffer_read命中率:
    1-key_reads/key_read_requestsx100%
 
mysql> show status like 'key%';
+------------------------+-----------+
| Variable_name          | Value     |
+------------------------+-----------+
| Key_read_requests      | 572800837 |
| Key_reads              | 474655    |
 
Key_read_requests
    The number of requests to read a key block from the MyISAM key cache
Key_reads
    The number of physical reads of a key block from disk into the MyISAM key cache. If Key_reads is large, then your key_buffer_size value is probably too small. The cache miss rate can be calculated as Key_reads/Key_read_requests
 
Key_buffer_write命中率:
    1-key_writes/key_write_requestsx100%
 
mysql> show status like 'key%';
+------------------------+-----------+
| Variable_name          | Value     |
+------------------------+-----------+
| Key_write_requests     | 135684306 |
| Key_writes             | 3233824   |
+------------------------+-----------+
Key_write_requests
    The number of requests to write a key block to the MyISAM key cache
Key_writes
    The number of physical writes of a key block from the MyISAM key cache to disk


查看Threads使用情况

mysql> show global status like 'thread%';
+-------------------+--------+
| Variable_name     | Value  |
+-------------------+--------+
| Threads_cached    | 96     |
| Threads_connected | 4      |
| Threads_created   | 225461 |
| Threads_running   | 1      |
+-------------------+--------+
 
Threads_cached
    The number of threads in the thread cache.
Threads_connected
    The number of currently open connections.
Threads_created
    The number of threads created to handle connections. If Threads_created is big, you may want to increase the thread_cache_size value. The cache miss rate can be calculated as Threads_created/Connections.
Threads_running
    The number of threads that are not sleeping.


Thread_cache命中率

1 - Threads_created / connections x 100%
 
mysql> show global status like 'thread%';
| Threads_created   | 225461 |
    The number of threads created to handle connections. If Threads_created is big, you may want to increase the thread_cache_size value. The cache miss rate can be calculated as Threads_created/Connections
 
mysql> show global status like 'connections';
+---------------+----------+
| Variable_name | Value    |
+---------------+----------+
| Connections   | 86073503 |
+---------------+----------+
1 row in set (0.00 sec)
    The number of connection attempts (successful or not) to the MySQL server


查看QPS-TPS情况

QPS计算方式:
    QPS= Questions/Uptime
 
mysql> show global status like 'questions';
+---------------+------------+
| Variable_name | Value      |
+---------------+------------+
| Questions     | 15674599428 |
+---------------+------------+
    The number of statements executed by the server. This includes only statements sent to the server by clients and not statements executed within stored programs, unlike the Queries variable. This variable does not count COM_PING, COM_STATISTICS, COM_STMT_PREPARE, COM_STMT_CLOSE, or COM_STMT_RESET commands.
The discussion at the beginning of this section indicates how to relate this statement-counting status variable to other such variables
 
mysql> show global status like 'Uptime';
+---------------+----------+
| Variable_name | Value    |
+---------------+----------+
| Uptime        | 13609186 |
+---------------+----------+
    The number of seconds that the server has been up
 
TPS计算方式:
    TPS=(com_commit+com_rollback) /Uptime
 
mysql> show status like 'com_commit';(不会记录隐式提交的事务)
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Com_commit    | 0     |
+---------------+-------+
    
mysql> show status like 'com_rollback' ;
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Com_rollback  | 0     |
+---------------+-------+
    The Com_xxx statement counter variables indicate the number of times each xxx statement has been executed. There is one status variable for each type of statement. For example, Com_delete and Com_update count DELETE and UPDATE statements


查看DML per second

记录每一次的语句,只记录隐式提交的数据,如 autocommit=1
 
mysql> SHOW GLOBAL STATUS WHERE variable_name IN
    ->  ('Com_insert','Com_delete','Com_select','Com_update');
+---------------+------------+
| Variable_name | Value      |
+---------------+------------+
| Com_delete    | 59246      |
| Com_insert    | 7880676    |
| Com_select    | 1334273891 |
| Com_update    | 8651697    |
+---------------+------------+
    The Com_xxx statement counter variables indicate the number of times each xxx statement has been executed. There is one status variable for each type of statement. For example, Com_delete and Com_update count DELETE and UPDATE statements


流量监控

mysql> show status like 'bytes%';
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| Bytes_received | 576   |
| Bytes_sent     | 2648  |
+----------------+-------+
 
Bytes_received
    The number of bytes received from all clients.
Bytes_sent
    The number of bytes sent to all clients.


查看索引使用情况

mysql> show status like 'handler%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_next      | 0     |
| Handler_rollback           | 0     |
 
Handler_read_rnd_next
    The number of requests to read the next row in the data file. This value is high if you are doing a lot of table scans. Generally this suggests that your tables are not properly indexed or that your queries are not written to take advantage of the indexes you have
 
Handler_read_rnd
    The number of requests to read a row based on a fixed position. This value is high if you are doing a lot of queries that require sorting of the result. You probably have a lot of queries that require MySQL to scan entire tables or you have joins that do not use keys properly
 
Handler_commit
    The number of internal COMMIT statements.
 
Handler_rollback
    The number of requests for a storage engine to perform a rollback operation


检查没有主键(key)联合(Join)的执行的情况

mysql> show status like 'select_full_join';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| Select_full_join | 0     |
+------------------+-------+
    The number of joins that perform table scans because they do not use indexes. If this value is not 0, you should carefully check the indexes of your tables.


查看执行全表扫描的数量

mysql> show status like 'select_scan';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Select_scan   | 0     |
+---------------+-------+
    The number of joins that did a full scan of the first table


查看慢查询情况

mysql> show status like 'slow_queries';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Slow_queries  | 0     |
+---------------+-------+
    The number of queries that have taken more than long_query_time seconds. This counter increments regardless of whether the slow query log is enabled.


查看表缓存情况

mysql> show global status like 'open%_tables';
+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| Open_tables   | 1360   |
| Opened_tables | 530013 |
+---------------+--------+
 
Open_tables
    The number of tables that are open
Opened_tables
    The number of tables that have been opened. If Opened_tables is big, your table_open_cache value is probably too small


查看锁使用情况

mysql> show status like '%lock%';
+------------------------------------------+------------+
| Variable_name                            | Value      |
+------------------------------------------+------------+
| Com_lock_tables                          | 0          |
| Com_unlock_tables                        | 0          |
| Handler_external_lock                    | 0          |
| Innodb_row_lock_current_waits            | 18         |
| Innodb_row_lock_time                     | 1472088440 |
| Innodb_row_lock_time_avg                 | 8968       |
| Innodb_row_lock_time_max                 | 31991      |
| Innodb_row_lock_waits                    | 164133     |
| Performance_schema_locker_lost           | 0          |
| Performance_schema_rwlock_classes_lost   | 0          |
| Performance_schema_rwlock_instances_lost | 0          |
| Table_locks_immediate                    | 1702805509 |
| Table_locks_waited                       | 373537     |
+------------------------------------------+------------+
 
Com_lock_tables
    The Com_xxx statement counter variables indicate the number of times each xxx statement has been executed
 
Com_unlock_tables
 
Handler_external_lock
    The server increments this variable for each call to its external_lock() function, which generally occurs at the beginning and end of access to a table instance. There might be differences among storage engines. This variable can be used, for example, to discover for a statement that accesses a partitioned table how many partitions were pruned before locking occurred: Check how much the counter increased for the statement, subtract 2 (2 calls for the table itself), then divide by 2 to get the number of partitions locked.
 
Innodb_row_lock_current_waits
    The number of row locks currently being waited for by operations on InnoDB tables.
 
Innodb_row_lock_time
    The total time spent in acquiring row locks for InnoDB tables, in milliseconds.
 
Innodb_row_lock_time_avg
    The average time to acquire a row lock for InnoDB tables, in milliseconds.
 
Innodb_row_lock_time_max
    The maximum time to acquire a row lock for InnoDB tables, in milliseconds.
 
Innodb_row_lock_waits
    The number of times operations on InnoDB tables had to wait for a row lock.
 
Performance_schema_locker_lost
Performance_schema_rwlock_classes_lost
Performance_schema_rwlock_instances_lost
    Performance Schema status variables are listed in Section 22.16, “Performance Schema Status Variables”. These variables provide information about instrumentation that could not be loaded or created due to memory constraints.
 
Table_locks_immediate
    The number of times that a request for a table lock could be granted immediately.
 
Table_locks_waited
    The number of times that a request for a table lock could not be granted immediately and a wait was needed. If this is high and you have performance problems, you should first optimize your queries, and then either split your table or tables or use replication.


Binlog cache使用情况

mysql> show status like 'binlog_cache%';
+-----------------------+----------+
| Variable_name         | Value    |
+-----------------------+----------+
| Binlog_cache_disk_use | 37       |
| Binlog_cache_use      | 13354593 |
+-----------------------+----------+
 
Binlog_cache_disk_use
    The number of transactions that used the temporary binary log cache but that exceeded the value of binlog_cache_size and used a temporary file to store statements from the transaction
Binlog_cache_use
    The number of transactions that used the binary log cache.


查看wait事件

mysql> show status like 'Innodb_buffer_pool_wait_free';
+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| Innodb_buffer_pool_wait_free | 0     |
+------------------------------+-------+
 
    Normally, writes to the InnoDB buffer pool happen in the background. When InnoDB needs to read or create a page and no clean pages are available, InnoDB flushes some dirty pages first and waits for that operation to finish. This counter counts instances of these waits. If innodb_buffer_pool_size has been set properly, this value should be small.
 
mysql> show status like '%Innodb_log_waits%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| Innodb_log_waits | 0     |
+------------------+-------+
 
    The number of times that the log buffer was too small and a wait was required for it to be flushed before continuing


其他监控情况

查看文件资源使用情况:
 
最大打开文件数 open_files_limit
    The number of files that the operating system permits mysqld to open. The value of this variable at runtime is the real value permitted by the system and might be different from the value you specify at server startup. The value is 0 on systems where MySQL cannot change the number of open files.
 
已打开文件数 Open_files
    The number of files that are open. This count includes regular files opened by the server. It does not include other types of files such as sockets or pipes. Also, the count does not include files that storage engines open using their own internal functions rather than asking the server level to do so.
 
查看表资源使用情况:
 
表缓存数 table_open_cache
    The number of open tables for all threads. Increasing this value increases the number of file descriptors that mysqld requires. You can check whether you need to increase the table cache by checking the Opened_tables status variable. See Section 5.1.9, “Server Status Variables”. If the value of Opened_tables is large and you do not use FLUSH TABLES often (which just forces all tables to be closed and reopened), then you should increase the value of the table_open_cache variable.
 
已打开表 Open_tables
    The number of tables that are open.
 
表空间监控
 
慢查询监控
    查询时间(Query_time_pct_95)
    锁等待时间(Lock_time_pct_95)
    扫描行数(Rows_examined_pct_95)
    出现次数
 
AWR报告
 
检查错误日志(每个数据库实例不一定相同,根据系统变量确定)
mysql> select @@global.log_error;
+----------------------------+
| @@global.log_error         |
+----------------------------+
| /home/data/mysql/error.log |
+----------------------------+
 
检查本地备份完整性
 
检查异地容灾备份完整性(可使用ftp、expect等方式)