缘起: 实际生产环境中MySQL出现查询慢的问题,为了分析慢查询,开启慢查询日志,并对日志进行分析。
为了避免在生成环境中产生误操作,把记录的慢查询日志copy回来,到自己的电脑中进行分析。
分析结果:
[root@dras-test local]#mysqldumpslow -a -s t -t 2 /opt/slow_query_log.txt
Reading mysql slow query log from /opt/slow_query_log.txt
Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0 (0), LibSvr[LibSvr]@[10.1.202.57]
# Schema: information_schema Last_errno: 0 Killed: 0
# Query_time: 11.257168 Lock_time: 0.000141 Rows_sent: 366777 Rows_examined: 366777 Rows_affected: 0
# Bytes_sent: 43251512
SET timestamp=1492111317;
SELECT * FROM `INNODB_BUFFER_PAGE_LRU`
Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0 (0), LibSvr[LibSvr]@[10.1.122.132]
# Schema: el Last_errno: 0 Killed: 0
# Query_time: 4.471143 Lock_time: 0.000097 Rows_sent: 1 Rows_examined: 8018065 Rows_affected: 0
# Bytes_sent: 1098
SET timestamp=1490682921;
SELECT `UserID`,`FileName`,`AdjunctGuid`,`LiteratureGuid`,`NoteGuid`,`AdjunctType`,`Md5`,`ID`,`ServerModifyTime`,`FileSize` FROM `el_user_adjunct_info` WHERE (`AdjunctGuid` = '4528cef4139846678cddf0d00170af9f.caj')
问题出现了,仔细看的同学已经看出了端倪,Time、Lock、Rows统计数据都为0。
mysqldumpslow是一个perl脚本,其中使用正则表达式对log文件进行逐行匹配,匹配完成后提取出匹配的值然后替换整行为空。
s/^# Query_time: ([0-9.]+)\s+Lock_time: ([0-9.]+)\s+Rows_sent: ([0-9.]+).*\n//;
经单独测试,此表达式没问题,可以正确匹配。但是在mysqldumpslow中却未匹配成功,经过一番查找、对比,发现从生产环境中copy下来的日志格式和mysqldumpslow中要解析的格式不一致,日志中多了一行
18 # User@Host: LibSvr[LibSvr] @ [10.1.122.131] Id: 10
19 # Schema: el Last_errno: 0 Killed: 0
20 # Query_time: 5.993656 Lock_time: 0.000078 Rows_sent: 1 Rows_examined: 8018014 Rows_affected: 0
21 # Bytes_sent: 1086
22 SET timestamp=1490682881;
23 SELECT `UserID`,`FileName`,`AdjunctGuid`,`LiteratureGuid`,`NoteGuid`,`AdjunctType`,`Md5`,`ID`,`ServerModifyTime`,`FileSize` FROM `el_user_adjunct_info` WHERE (`AdjunctGuid` = 'dbf1fc940ddd452d8d2af439438a cb07.caj');
其中,第19行是多出来的,mysqldumpslow为匹配和替换成空,因此逐行匹配时就匹配错了行。
于是修改脚本,添加对第19行的匹配和替换:
105 s/^#? Schema: \w+ Last_errno: \d+ Killed: \d+.*\n//;
106
107 s/^# Query_time: ([0-9.]+)\s+Lock_time: ([0-9.]+)\s+Rows_sent: ([0-9.]+).*\n//;
在匹配Query_time:行的表达式上边,添加第105行匹配Schema行并替换成空。
然后再进行分析,终于正常了。
问题原因:生成环境和我本地的MySQL版本不一致,其生成的日志格式有差异。
mysqldumpslow语法很简单:
[root@dras-test local]# mysqldumpslow --help
Usage: mysqldumpslow [ OPTS... ] [ LOGS... ]
Parse and summarize the MySQL slow query log. Options are
--verbose verbose
--debug debug
--help write this text to standard output
-v verbose
-d debug
-s ORDER what to sort by (al, at, ar, c, l, r, t), 'at' is default
al: average lock time
ar: average rows sent
at: average query time
c: count
l: lock time
r: rows sent
t: query time
-r reverse the sort order (largest last instead of first)
-t NUM just show the top n queries
-a don't abstract all numbers to N and strings to 'S'
-n NUM abstract numbers with at least n digits within names
-g PATTERN grep: only consider stmts that include this string
-h HOSTNAME hostname of db server for *-slow.log filename (can be wildcard),
default is '*', i.e. match all
-i NAME name of server instance (if using mysql.server startup script)
-l don't subtract lock time from total time
示例:
mysqldumpslow -s r -t 10 /database/mysql/slow-log.txt
得到返回记录集最多的10个查询。
mysqldumpslow -s t -t 10 -g “left join” /database/mysql/slow-log.txt
得到按照时间排序的前10条里面含有左连接的查询语句。
mysqldumpslow -a -s t -t 2 /opt/slow_query_log.txt
-a 参数,说明不合并类似的SQL语句,显示具体的SQL语句中的数字和字符串。
4、格式化慢查询日志
Mysqldumpslow的用法汇总
mysqldumpslow --help可显示其参数的使用
经常使用的参数:
-s,是order的顺序
al 平均锁定时间
ar 平均返回记录时间
at 平均查询时间(默认)
c 计数
l 锁定时间
r 返回记录
t 查询时间
-t,是top n的意思,即为返回前面多少条的数据
-g,后边可以写一个正则匹配模式,大小写不敏感的
例子:
mysqldumpslow -t 10 -s t -g “left join” host-slow.log
使用mysqldumpslow的分析结果不会显示具体完整的sql语句,说明:
1:假如真正的sql语句如下:
SELECT * FROM sms_send WHERE service_id=10 GROUP BY content LIMIT 0, 1000;
mysqldumpslow显示的结果会是:
Count: 1 Time=1.91s (1s) Lock=0.00s (0s) Rows=1000.0 (1000), vgos_dba[vgos_dba]@[10.130.229.196]
SELECT * FROM sms_send WHERE service_id=N GROUP BY content LIMIT N, N;
2:如果我们再执行一条
SELECT * FROM sms_send WHERE service_id=20 GROUP BY content LIMIT 10000, 1000;
mysqldumpslow显示的结果会是:
Count: 2 Time=2.79s (5s) Lock=0.00s (0s) Rows=1.0 (2), vgos_dba[vgos_dba]@[10.130.229.196]
SELECT * FROM sms_send WHERE service_id=N GROUP BY content LIMIT N, N;
虽然这两条语句条件不一样,
1:一个是server_id=10,一个是server_id=20
2:一个是LIMIT 0, 1000,一个是LIMIT 10000, 1000
但是mysqldumpslow分析会认为这是一种类型的语句,会合并显示。
3:假设我们执行
SELECT * FROM sms_send WHERE service_id<=10 GROUP BY content LIMIT 0, 1000;
执行mysqldumpslow结果是
Count: 1 Time=2.91s (2s) Lock=0.00s (0s) Rows=1000.0 (1000), vgos_dba[vgos_dba]@[10.130.229.196]
SELECT * FROM sms_send WHERE service_id<=N GROUP BY content LIMIT N, N;
可以看出它和上面我们写的sql语句是两种类型
mysqldumpslow的分析结果
Count会告诉我们这种类型的语句执行了几次
Time会告诉我们这种类型的语句执行的最大时间
Time=2.79s (5s)中(5s)是指这类型的语句执行总共花费的时间
例:
Count: 2 Time=2.79s (5s) Lock=0.00s (0s) Rows=1.0 (2), vgos_dba[vgos_dba]@[10.130.229.196]
告诉我们执行了2次
最大时间是2.79s
总共花费时间5s
lock时间0s
单次返回的结果数是1条记录
2次总共返回2条记录
mysqldumpslow -s t -t 10 slow.log
查询的结果是10条执行时间最慢的sql语句,其中-s t是指此类类型的语句的执
行总时长
Count: 1 Time=2.91s (2s) Lock=0.00s (0s) Rows=1000.0 (1000), vgos_dba[vgos_dba]@[10.130.229.196]
和
Count: 2 Time=2.79s (5s) Lock=0.00s (0s) Rows=1.0 (2), vgos_dba[vgos_dba]@[10.130.229.196]
比较的结果是
Count: 2 Time=2.79s (5s) Lock=0.00s (0s) Rows=1.0 (2), vgos_dba[vgos_dba]@[10.130.229.196]
排在前面,因为比较的时长是(5s)和(2s),而不是2.79s和2.91s
-s at比较的也是(5s)/count:2和(2s)/Count: 1
所以:-s at是
Count: 1 Time=2.91s (2s) Lock=0.00s (0s) Rows=1000.0 (1000), vgos_dba[vgos_dba]@[10.130.229.196]
排在前面。
Rows=1.0 (2) 是按照以下逻辑展示的
(2)是指在Count: 2次数总共返回了2条记录集;row=1.0显示(2)/Count: 2,
如果此时Count是3,那么row的计算方式是Rows=2/3,Rows=0.67
主要功能是, 统计不同慢sql的出现次数(Count),执行最长时间(Time),累计总耗费时间(Time),等待锁的时间(Lock),发送给客户端的行总数(Rows),扫描的行总数(Rows)
正确的结果如下:
[root@dras-test local]# mysqldumpslow -a -s t -t 2 /opt/slow_query_log.txt
Reading mysql slow query log from /opt/slow_query_log.txt
Count: 60 Time=903.45s (54206s) Lock=0.00s (0s) Rows=13271072.8 (796264367), LibSvr[LibSvr]@2hosts
SELECT * FROM `el_user_litera_info`
Count: 60 Time=335.23s (20113s) Lock=0.00s (0s) Rows=17128739.1 (1027724344), LibSvr[LibSvr]@2hosts
SELECT * FROM `el_user_category_litera`
[root@dras-test local]# mysqldumpslow -a -s t -t 5 /opt/slow_query_log.txt
Reading mysql slow query log from /opt/slow_query_log.txt
Count: 60 Time=903.45s (54206s) Lock=0.00s (0s) Rows=13271072.8 (796264367), LibSvr[LibSvr]@2hosts
SELECT * FROM `el_user_litera_info`
Count: 60 Time=335.23s (20113s) Lock=0.00s (0s) Rows=17128739.1 (1027724344), LibSvr[LibSvr]@2hosts
SELECT * FROM `el_user_category_litera`
Count: 60 Time=277.45s (16646s) Lock=0.00s (0s) Rows=13271097.1 (796265825), LibSvr[LibSvr]@2hosts
SELECT * FROM `el_user_litera_reader_info`
Count: 60 Time=153.27s (9196s) Lock=0.00s (0s) Rows=8943019.9 (536581195), LibSvr[LibSvr]@2hosts
SELECT * FROM `el_user_adjunct_info`
Count: 60 Time=91.95s (5516s) Lock=0.00s (0s) Rows=2036609.1 (122196547), LibSvr[LibSvr]@2hosts
SELECT * FROM `el_user_note_content