mk-query-digest 一个功能很强大的工具,能分析慢查询日志,也可以对当前的语句进行分析等。
一、分析慢查询日志,生成报表
mysql 通过log-slow-queries和long_query_time参数来记录慢查询,默认的格式如下:
# Time: 090909 4:34:28
# User@Host: sparty[sparty] @ [localhost]
# Query_time: 18 Lock_time: 0 Rows_sent: 5 Rows_examined: 52830
select * from test;
# Time: 090909 14:04:24
# User@Host: sparty[sparty] @ [localhost]
# Query_time: 20 Lock_time: 0 Rows_sent: 10 Rows_examined: 2500284
select * from test2;
每产生一条慢查询记录一行,分析起来不直观,也无法进行统计。mk-query-digest能对slow log进行了统计和分析产生报告,可以更具自己的需要,分析出想要的结果。例如
mk-query-digest --limit 2 --select Query_time,Lock_time,Rows_sent,Rows_examined,ts slow.log /
|grep -v '1us' |grep -v '10us'|grep -v '1ms'|grep -v '10ms'|grep -v '100ms' /
|grep -v '1s'|grep -v '10s'|grep -v 'SHOW'|grep -v '100us'|grep -v 'EXPLAIN'
# Overall: 3 total, 2 unique, 0.00 QPS, 0.00x concurrency ________________
# total min max avg 95% stddev media n
# Exec time 78s 8s 44s 26s 42s 14s 25s
# Lock time 2ms 0 2ms 588us 2ms 822us 0
# Rows sent 23.87M 1 11.93M 7.96M 11.87M 5.59M 11.87M
# Rows exam 35.80M 11.93M 11.93M 11.93M 11.87M 0 11.87M
# Time range 2009-05-20 20:33:59 to 2009-05-27 14:50:13
# Query 1: 0.08 QPS, 2.68x concurrency, ID 0x67A347A2812914DF at byte 1240
# This item is included in the report because it matches --limit.
# pct total min max avg 95% stddev median
# Count 66 2
# Exec time 89 70s 26s 44s 35s 44s 13s 35s
# Lock time 0 0 0 0 0 0 0 0
# Rows sent 99 23.87M 11.93M 11.93M 11.93M 11.93M 0.71 11.93M
# Rows exam 66 23.87M 11.93M 11.93M 11.93M 11.93M 0.71 11.93M
# Time range 2009-05-27 14:49:47 to 2009-05-27 14:50:13
# Query_time distribution
# Tables
SELECT /*!40001 SQL_NO_CACHE */ * FROM `test`/G
# Query 2: 0 QPS, 0x concurrency, ID 0x92203A59492D2ECF at byte 0 ________
# This item is included in the report because it matches --limit.
# pct total min max avg 95% stddev median
# Count 33 1
# Exec time 10 8s 8s 8s 8s 8s 0 8s
# Lock time 100 2ms 2ms 2ms 2ms 2ms 0 2ms
# Rows sent 0 1 1 1 1 1 0 1
# Rows exam 33 11.93M 11.93M 11.93M 11.93M 11.93M 0 11.93M
# Time range 2009-05-20 20:33:59 to 2009-05-20 20:33:59
# Query_time distribution
# Tables
select count(*) from test where t1=20 and t2=100/G
# Rank Query ID Response time Calls R/Call Item
# ==== ================== ================ ======= ========== ====
# 1 0x67A347A2812914DF 69.7409 89.4% 2 34.870474 SELECT test
# 2 0x92203A59492D2ECF 8.2742 10.6% 1 8.274248 SELECT test
报表的输出分三部分,开头部分是整个报表的综合统计。中间部分把单独的Query进行统计,按照total Exec time从高到低排序。结尾部分是输出的每条Query的一些统计。
解释一下主要参数:
–limit 指输出耗时最多的Query的条数限制。
–select 选择需要统计的指标。
二、对当前的查询进行统计分析
mk-query-digest也能使用–processlist参数,对指定mysql的当前Query进行统计分析,例如
mk-query-digest –processlist h=localhost -u sg -p
然后在CTRL+C结束,将对这段时间h=localhost里所进行的所有Query进行统计分析并生成报表,报表的输出和上面介绍的一样。
三、分析慢查询日志,将结果导入到指定表。
mk-query-digest也能使用–review参数,将慢查询日志的分析结果插入到指定表中。
首先需要创建表,存放分析结果:
CREATE TABLE query_review (
checksum BIGINT UNSIGNED NOT NULL PRIMARY KEY,
fingerprint TEXT NOT NULL,
sample TEXT NOT NULL,
first_seen DATETIME,
last_seen DATETIME,
reviewed_by VARCHAR(20),
reviewed_on DATETIME,
comments TEXT
)
COLUMN MEANING
=========== ===============
checksum A 64-bit checksum of the query fingerprint
fingerprint The abstracted version of the query; its primary key
sample The query text of a sample of the class of queries
first_seen The smallest timestamp of this class of queries
last_seen The largest timestamp of this class of queries
reviewed_by Initially NULL; if set, query is skipped thereafter
reviewed_on Initially NULL; not assigned any special meaning
comments Initially NULL; not assigned any special meaning
例如:
mk-query-digest --review h=host1,D=test,t=query_review /path/to/slow.log
mysql> use test;
Database changed
mysql> select * from query_review/G;
*************************** 1. row ***************************
checksum: 7467891370387641567
fingerprint: mysqldump
sample: SELECT /*!40001 SQL_NO_CACHE */ * FROM `test`
first_seen: 2009-05-27 14:49:47
last_seen: 2009-05-27 14:50:13
reviewed_by: NULL
reviewed_on: NULL
comments: NULL
*************************** 2. row ***************************
checksum: 10529480083946417871
fingerprint: select count(*) from test where test1=? and test2=?
sample: select count(*) from test where test1=20 and test2=100
first_seen: 2009-05-20 20:33:59
last_seen: 2009-05-20 20:33:59
reviewed_by: NULL
reviewed_on: NULL
comments: NULL
2 rows in set (0.00 sec)
四、查找未走索引的语句
mk-query-digest还可以用来找出当前数据库 未走索引的语句,例如我使用下面命令:
tcpdump -i bond0 port 3306 -s 65535 -x -n -q -tttt | mk-query-digest –type tcpdump –filter ‘($event->{No_index_used } || $event->{No_good_index_used})’
然后在CTRL+C结束,将生成下面的报告。
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
3 packets captured
3 packets received by filter
0 packets dropped by kernel
# Caught SIGINT.
# 170ms user time, 20ms system time, 12.00M rss, 73.26M vsz
# Overall: 1 total, 1 unique, 0 QPS, 0x concurrency ______________________
# total min max avg 95% stddev median
# Exec time 223us 223us 223us 223us 223us 0 223us
# Time range 2009-09-16 09:22:50.637146 to 2009-09-16 09:22:50.637146
# bytes 26 26 26 26 26 0 26
# Rows affe 0 0 0 0 0 0 0
# Warning c 0 0 0 0 0 0 0
# 100% (1) No_index_used
# Query 1: 0 QPS, 0x concurrency, ID 0x3D8C074C26D05CD7 at byte 0 ________
# This item is included in the report because it matches --limit.
# pct total min max avg 95% stddev median
# Count 100 1
# Exec time 100 223us 223us 223us 223us 223us 0 223us
# Hosts 1 10.0.0.203
# Time range 2009-09-16 09:22:50.637146 to 2009-09-16 09:22:50.637146
# bytes 100 26 26 26 26 26 0 26
# Errors 1 none
# Rows affe 0 0 0 0 0 0 0 0
# Warning c 0 0 0 0 0 0 0 0
# 100% (1) No_index_used
select * from test/G
# Rank Query ID Response time Calls R/Call Item
# ==== ================== ================ ======= ========== ====
# 1 0x3D8C074C26D05CD7 0.0002 100.0% 1 0.000223 SELECT test
很明显,使用tcpdump只能对本地的mysql进行监听,一目了然的找到了哪条sql是没走索引的。
最后,我们在看下这个工具的帮助选项:
mk-query-digest --help
mk-query-digest parses and analyzes MySQL log files. With no FILE, or when FILE
is -, read standard input. For more details, please use the --help option, or
try 'perl doc /usr/bin/mk-query-digest' for complete documentation.
Usage: /usr/bin/mk-query-digest [OPTION...] [FILE]
Options:
--ask-pass Prompt for a password when connecting to MySQL
--attribute-aliases List of attribute|alias,etc (default db|Schema)
--attribute-value-limit A sanity limit for attribute values (default
4294967296)
--aux-dsn Auxiliary DSN used for special options
--charset -A Default character set
--check-attributes-limit Stop checking for new attributes after this
many events (default 1000)
--config Read this comma-separated list of config files;
if specified, this must be the first option on
the command line
--[no]continue-on-error Continue parsing even if there is an error
(default yes)
--create-review-history-table Create the --review-history table if it does
not exist
--create-review-table Create the --review table if it does not exist
--daemonize Fork to the background and detach from the
shell
--defaults-file -F Only read mysql options from the given file
--embedded-attributes Two Perl regex patterns to capture
pseudo-attributes embedded in queries
--execute Execute queries on this DSN
--expected-range Explain items when there are more or fewer than
expected (default 5,10)
--explain Run EXPLAIN for the sample query with this DSN
and print results
--filter Discard events for which this Perl code doesn't
return true
--fingerprints Add query fingerprints to the standard query
analysis report
--[no]for-explain Print extra information to make analysis easy
(default yes)
--group-by Which attribute of the events to group by
(default fingerprint)
--help Show help and exit
--host -h Connect to host
--ignore-attributes Do not aggregate these attributes when
auto-detecting --select (default
s,key,res,val,server _id,offset,end_log_pos,Xid)
--inherit-attributes If missing, inherit these attributes from the
last event that had them (default db,ts)
--interval How frequently to poll the processlist, in
seconds (default .1)
--iterations How many times to iterate through the
collect-and-report cycle (default 1)
--limit Limit output to the given percentage or count
(default 95%:20)
--log Print all output to this file when daemonized
--mirror How often to check whether connections should
be moved, depending on read_only
--order-by Sort events by this attribute and aggregate
function (default Query_time:sum)
--outliers Report outliers by attribute:percentile:count
(default Query_time:1:10)
--password -p Password to use when connecting
--pid Create the given PID file when daemonized
--port -P Port number to use for connection
--print Print log events to STDOUT in standard
slow-query-log format
--processlist Poll this DSN's processlist for queries, with
--interval sleep between
--[no]report Print out reports on the aggregate results from
--group-by (default yes)
--report-all Include all queries, even if they have already
been reviewed
--report-format Print these elements in the query analysis
report (default
rusage,header,query_report,profile)
--review Store a sample of each class of query in this
DSN
--review-history The table in which to store historical values
for review trend analysis
--run-time How long to run before exiting. Optional
suffix s=seconds, m=minutes, h=hours, d=days;
if no suffix, s is used.
--sample Filter out all but the first N occurrences of
each query
--select Compute aggregate statistics for these
attributes
--set-vars Set these MySQL variables (default
wait_timeout=10000)
--shorten Shorten long statements in reports (default
1024)
--since Parse only queries newer than this value (parse
queries since this date)
--socket -S Socket file to use for connection
--tcpdump-errors Write the tcpdump data to this file on error
--timeline Show a timeline of events
--type The type of input to parse (default slowlog)
--until Parse only queries older than this value (parse
queries until this date)
--user -u User for login if not current user
--version Show version and exit
--watch-server This option tells mk-query-digest which server
IP address and port (like "10.0.0.1:3306") to
watch when parsing tcpdump (for --type tcpdump
and memcached); all other servers are ignored
--[no]zero-admin Zero out the Rows_XXX properties for
administrator command events (default yes)
--zero-bool Print 0% boolean values in report
Rules:
DSN values in --review-history default to values in --review if COPY is yes.
DSN syntax is key=value[,key=value...] Allowable DSN keys:
KEY COPY MEANING
=== ==== =============================================
A yes Default character set
D yes Database that contains the query review table
F yes Only read default options from the given file
P yes Port number to use for connection
S yes Socket file to use for connection
h yes Connect to host
p yes Password to use when connecting
t no Table to use as the query review table
u yes User for login if not current user
If the DSN is a bareword, the word is treated as the 'h' key.
Options and values after processing arguments:
--ask-pass FALSE
--attribute-aliases db|Schema
--attribute-value-limit 4294967296
--aux-dsn (No value)
--charset (No value)
--check-attributes-limit 1000
--config /etc/maatkit/maatkit.conf,/etc/maatkit/mk-query-digest.conf,/root/.maatkit.conf,/root/.mk-
query-digest.conf
--continue-on-error TRUE
--create-review-history-table FALSE
--create-review-table FALSE
--daemonize FALSE
--defaults-file (No value)
--embedded-attributes (No value)
--execute (No value)
--expected-range 5,10
--explain (No value)
--filter (No value)
--fingerprints FALSE
--for-explain TRUE
--group-by fingerprint
--help TRUE
--host (No value)
--ignore-attributes
arg,cmd,insert_id,ip,port,Thread_id,timestamp,exptime,flags,key,res,val,server_id,offset,end_log_pos,Xid
--inherit-attributes db,ts
--interval .1
--iterations 1
--limit 95%:20
--log (No value)
--mirror (No value)
--order-by Query_time:sum
--outliers Query_time:1:10
--password (No value)
--pid (No value)
--port (No value)
--print FALSE
--processlist (No value)
--report TRUE
--report-all FALSE
--report-format header,profile,query_report,rusage
--review (No value)
--review-history (No value)
--run-time (No value)
--sample (No value)
--select
--set-vars wait_timeout=10000
--shorten 1024
--since (No value)
--socket (No value)
--tcpdump-errors (No value)
--timeline FALSE
--type slowlog
--until (No value)
--user (No value)
--version FALSE
--watch-server (No value)
--zero-admin TRUE
--zero-bool FALSE