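When a SQL query is slow, how do you locate and fix the problem quickly? This article describes how to analyze and optimize SQL in day-to-day development. The sample data comes from the sakila sample database.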
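The show [session|global] status command displays server status information. If neither session nor global is given, session is used.
The following commands show statement counters and InnoDB row counters for the current session. In a LIKE pattern each underscore matches exactly one character, so 'Com_______' lists only the counters with seven characters after Com_: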
mysql> show status like 'Com_______';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Com_binlog | 0 |
| Com_commit | 15 |
| Com_delete | 0 |
| Com_insert | 1017 |
| Com_repair | 0 |
| Com_revoke | 0 |
| Com_select | 21 |
| Com_signal | 0 |
| Com_update | 0 |
| Com_xa_end | 0 |
+---------------+-------+
mysql> show status like 'Innodb_rows_%';
+----------------------+-------+
| Variable_name | Value |
+----------------------+-------+
| Innodb_rows_deleted | 5 |
| Innodb_rows_inserted | 47313 |
| Innodb_rows_read | 79 |
| Innodb_rows_updated | 0 |
+----------------------+-------+
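Com_xxx is the number of times each xxx statement has been executed. The counters we usually care about are:
Parameter | Meaning |
---|---|
Com_select | Number of SELECT statements executed; each query adds 1. |
Com_insert | Number of INSERT statements executed; a batch INSERT adds only 1. |
Com_update | Number of UPDATE statements executed. |
Com_delete | Number of DELETE statements executed. |
Innodb_rows_read | Number of rows read from InnoDB tables. |
Innodb_rows_inserted | Number of rows inserted by INSERT operations. |
Innodb_rows_updated | Number of rows updated by UPDATE operations. |
Innodb_rows_deleted | Number of rows deleted by DELETE operations. |
Connections | Number of connection attempts to the MySQL server. |
Uptime | Time the server has been running, in seconds. |
Slow_queries | Number of slow queries. |
Com_***: these counters accumulate for table operations on every storage engine.
Innodb_***: these counters apply only to the InnoDB storage engine and are accumulated slightly differently: they count rows rather than statements.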
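The counters in the table can also be read server-wide rather than per session. The statements below are a minimal sketch of that, assuming the account is allowed to read global status; the choice of counters simply mirrors the table.

```sql
-- Server-wide counters since startup (GLOBAL scope instead of the default SESSION)
SHOW GLOBAL STATUS
WHERE Variable_name IN ('Com_select', 'Com_insert', 'Com_update', 'Com_delete',
                        'Connections', 'Uptime', 'Slow_queries');

-- InnoDB row counters, useful for comparing rows read against rows written
SHOW GLOBAL STATUS LIKE 'Innodb_rows_%';
```

Comparing Com_select with the sum of Com_insert, Com_update and Com_delete gives a rough idea of whether the workload is read-heavy or write-heavy.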
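Inefficient SQL statements can be located in two ways: the slow query log, which records statements that ran longer than long_query_time, and SHOW PROCESSLIST, which lists the threads that are running right now.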
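The slow query log is what feeds the Slow_queries counter. The sketch below shows one way it might be inspected and switched on at runtime; the 2-second threshold is an illustrative assumption, not a recommendation.

```sql
-- Check whether the slow query log is enabled and where it is written
SHOW VARIABLES LIKE 'slow_query_log%';
-- Statements that run longer than this many seconds are logged
SHOW VARIABLES LIKE 'long_query_time';

-- Enable the log at runtime (requires administrative privileges).
-- The threshold of 2 seconds is an assumed example value.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;
```

A changed global long_query_time only applies to connections opened afterwards, so an existing session keeps its old threshold.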
mysql> SHOW processlist;
+----+-------------+-----------+--------+---------+-------+--------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-------------+-----------+--------+---------+-------+--------------------------------------------------------+------------------+
| 1 | system user | | NULL | Connect | 38184 | Slave has read all relay log; waiting for more updates | NULL |
| 2 | system user | | NULL | Connect | 38184 | Connecting to master | NULL |
| 6 | root | localhost | sakila | Query | 0 | starting | SHOW processlist |
+----+-------------+-----------+--------+---------+-------+--------------------------------------------------------+------------------+
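Column | Description |
---|---|
Id | The connection_id assigned by the system when the user logs in to MySQL; the connection_id() function returns it |
User | The user that owns the connection. A user other than root sees only the SQL statements within that user's privileges |
Host | The IP and port the statement was sent from; useful for tracing the user behind a problem statement |
db | The database the thread is currently connected to |
Command | The command the connection is executing, typically Sleep, Query, or Connect |
Time | How long the thread has been in its current state, in seconds |
State | The state of the SQL statement on this connection; an important column. It describes one stage of statement execution. A single statement, a query for example, may pass through Copying to tmp table, Sorting result, Sending data, and other states before it completes |
Info | The SQL statement itself; key evidence for identifying a problem statement |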
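On a busy server the full process list can be long. As a hedged sketch, the same columns can be filtered through information_schema.processlist (available in MySQL 5.7; newer versions also offer a performance_schema equivalent). The 10-second threshold is an assumption chosen for illustration.

```sql
-- Show only threads that are doing work and have been in their
-- current state for more than 10 seconds (assumed threshold)
SELECT id, user, host, db, command, time, state, info
FROM information_schema.processlist
WHERE command <> 'Sleep'
  AND time > 10
ORDER BY time DESC;

-- A runaway statement can then be stopped by its Id, for example:
-- KILL QUERY 123;   -- 123 is a placeholder Id, not taken from the output above
```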
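For execution plan analysis, see the introduction to EXPLAIN.
MySQL has supported the show profiles and show profile statements since version 5.0.37. During SQL tuning, show profiles helps us see where the time is being spent.
Check whether profiling is supported: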
mysql> select @@have_profiling;
+------------------+
| @@have_profiling |
+------------------+
| YES |
+------------------+
Profiling is off by default; it can be turned on at session level with a set statement:
mysql> SELECT @@profiling;
+-------------+
| @@profiling |
+-------------+
| 0 |
+-------------+
-- enable profiling for the current session
set profiling=1;
Next, run a few statements and look at the profiles:
mysql> SHOW TABLES;
mysql> SELECT COUNT(*) FROM actor;
mysql> SELECT * FROM actor;
mysql> SHOW profiles;
+----------+------------+----------------------------+
| Query_ID | Duration | Query |
+----------+------------+----------------------------+
| 1 | 0.00038200 | SELECT @@profiling |
| 2 | 0.00252900 | SHOW TABLES |
| 3 | 0.00042675 | SELECT COUNT(*) FROM actor |
| 4 | 0.00088175 | SELECT * FROM actor |
+----------+------------+----------------------------+
The show profile for query query_id statement shows each state the thread went through while executing that SQL statement and the time spent in it:
mysql> show profile for query 4;
+----------------------+----------+
| Status | Duration |
+----------------------+----------+
| starting | 0.000096 |
| checking permissions | 0.000014 |
| Opening tables | 0.000029 |
| init | 0.000037 |
| System lock | 0.000018 |
| optimizing | 0.000008 |
| statistics | 0.000025 |
| preparing | 0.000023 |
| executing | 0.000005 |
| Sending data | 0.000544 |
| end | 0.000010 |
| query end | 0.000014 |
| closing tables | 0.000013 |
| freeing items | 0.000024 |
| cleaning up | 0.000023 |
+----------------------+----------+
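The Sending data state means the MySQL thread has started to access data rows and return the results to the client; it is not only the transfer of results to the client. Because the thread often has to do a large amount of disk reading in this state, it is frequently the longest state of the whole query.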
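When the per-state durations alone do not explain where the time goes, SHOW PROFILE accepts resource types, and the same data is exposed through information_schema.profiling. The sketch below assumes profiling is still enabled in the session and reuses query id 4 from the listing above.

```sql
-- Add CPU and block I/O columns to the per-state breakdown
SHOW PROFILE CPU, BLOCK IO FOR QUERY 4;

-- Aggregate the same profile by state, slowest first
SELECT state,
       SUM(duration) AS total_seconds,
       COUNT(*)      AS occurrences
FROM information_schema.profiling
WHERE query_id = 4
GROUP BY state
ORDER BY total_seconds DESC;
```

Sorting by total time puts states such as Sending data at the top, which matches the reading of the raw output above.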
Match the full value
Every column in the index has an equality condition
-- rental_date is a composite index on (rental_date,inventory_id,customer_id)
mysql> SHOW INDEX FROM rental;
+--------+------------+---------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+---------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| rental | 0 | PRIMARY | 1 | rental_id | A | 16005 | NULL | NULL | | BTREE | | |
| rental | 0 | rental_date | 1 | rental_date | A | 15815 | NULL | NULL | | BTREE | | |
| rental | 0 | rental_date | 2 | inventory_id | A | 16005 | NULL | NULL | | BTREE | | |
| rental | 0 | rental_date | 3 | customer_id | A | 16005 | NULL | NULL | | BTREE | | |
| rental | 1 | idx_fk_inventory_id | 1 | inventory_id | A | 4580 | NULL | NULL | | BTREE | | |
| rental | 1 | idx_fk_customer_id | 1 | customer_id | A | 599 | NULL | NULL | | BTREE | | |
| rental | 1 | idx_fk_staff_id | 1 | staff_id | A | 2 | NULL | NULL | | BTREE | | |
+--------+------------+---------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
7 rows in set (0.00 sec)
mysql> EXPLAIN SELECT * FROM rental WHERE rental_date='2005-05-24 22:53:30' AND inventory_id=367 AND customer_id=130\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
partitions: NULL
type: const
possible_keys: rental_date,idx_fk_inventory_id,idx_fk_customer_id
key: rental_date
key_len: 10
ref: const,const,const
rows: 1
filtered: 100.00
Extra: NULL
1 row in set, 1 warning (0.00 sec)
The key field of the execution plan shows that the optimizer chose the composite index rental_date.
Range query on an index
mysql> EXPLAIN SELECT * FROM rental WHERE customer_id>=373 AND customer_id<400\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
partitions: NULL
type: range
possible_keys: idx_fk_customer_id
key: idx_fk_customer_id
key_len: 2
ref: NULL
rows: 718
filtered: 100.00
Extra: Using index condition
1 row in set, 1 warning (0.01 sec)
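A type of range means the optimizer chose a range scan, here on the idx_fk_customer_id index. An Extra of Using index condition means the range condition is checked against the index entries first (index condition pushdown) and the matching rows are then fetched from the table.
Match the leftmost prefix
When an index covers several columns (a composite index), the leftmost prefix rule applies: the query must start from the leftmost column of the index and must not skip any column in it.
All three index columns are used, key_len=12 (idx_payment_date is a composite index on payment_date, amount, last_update):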
mysql> EXPLAIN SELECT * FROM payment WHERE payment_date='2005-05-25 11:30:37' AND amount='2.99' AND last_update='2006-02-15 22:12:30'\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: payment
partitions: NULL
type: ref
possible_keys: idx_payment_date
key: idx_payment_date
key_len: 12
ref: const,const,const
rows: 1
filtered: 100.00
Extra: NULL
1 row in set, 1 warning (0.00 sec)
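If the query starts from the leftmost column but skips a column, only the leftmost index column is used for the lookup, key_len=5. For reference, the index used in these examples was created with:
ALTER TABLE payment ADD INDEX idx_payment_date(payment_date,amount,last_update);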
mysql> EXPLAIN SELECT * FROM payment WHERE payment_date='2005-05-25 11:30:37' AND last_update='2006-02-15 22:12:30'\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: payment
partitions: NULL
type: ref
possible_keys: idx_payment_date
key: idx_payment_date
key_len: 5
ref: const
rows: 1
filtered: 10.00
Extra: Using index condition
1 row in set, 1 warning (0.00 sec)
Violating the leftmost prefix rule, so the index is not used:
-- index not used
mysql> EXPLAIN SELECT * FROM payment WHERE amount='2.99' AND last_update='2006-02-15 22:12:30'\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: payment
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 16086
filtered: 1.00
Extra: Using where
1 row in set, 1 warning (0.00 sec)
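The three cases above can be extended with two more probes of the leftmost prefix rule. This is a sketch against the same idx_payment_date index; the key_len mentioned in the comment is an expectation derived from the values shown above (5 bytes for payment_date, 12 for all three columns), not measured output.

```sql
-- Leftmost two columns with no gap: both can be used for the lookup.
-- Assuming 5 bytes for DATETIME and 3 bytes for DECIMAL(5,2),
-- key_len would be expected to be 8.
EXPLAIN SELECT * FROM payment
WHERE payment_date = '2005-05-25 11:30:37'
  AND amount = 2.99;

-- The order of the conditions in WHERE does not matter;
-- only which index columns are constrained does.
EXPLAIN SELECT * FROM payment
WHERE last_update = '2006-02-15 22:12:30'
  AND amount = 2.99
  AND payment_date = '2005-05-25 11:30:37';
```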
Query only indexed columns
When every selected column is part of the index, the query is more efficient; avoid SELECT *
mysql> EXPLAIN SELECT payment_date FROM payment WHERE payment_date='2005-05-25 11:30:37' AND last_update='2006-02-15 22:12:30'\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: payment
partitions: NULL
type: ref
possible_keys: idx_payment_date
key: idx_payment_date
key_len: 5
ref: const
rows: 1
filtered: 10.00
Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)
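Compared with the SELECT * in the previous example, this query selects only payment_date, and the value of Extra changes:
using index: appears when a covering index is used; the result is read from the index alone
using where: the server filters the rows returned by the storage engine with the WHERE condition; if the index does not contain every selected column, the rows still have to be fetched from the table
using index condition: the lookup used an index, with conditions checked against the index entries first, but the rows must still be fetched from the table
using index ; using where: the lookup used an index and all the required data is in the index columns, so no table lookup is needed
Do not apply calculations or functions to an indexed column, or the index will not be used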
mysql> EXPLAIN SELECT * FROM customer WHERE last_name='SMITH'\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: customer
partitions: NULL
type: ref
possible_keys: idx_last_name
key: idx_last_name
key_len: 182
ref: const
rows: 1
filtered: 100.00
Extra: NULL
1 row in set, 1 warning (0.01 sec)
mysql> EXPLAIN SELECT * FROM customer WHERE substring(last_name,1,4) ='JOHN'\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: customer
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 599
filtered: 100.00
Extra: Using where
1 row in set, 1 warning (0.00 sec)
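A common fix is to rewrite the condition so the indexed column appears bare on one side of the comparison. As a sketch, the substring test above can be expressed as a prefix LIKE, which is intended to match the same rows while leaving idx_last_name usable:

```sql
-- The column is no longer wrapped in a function, so the optimizer
-- can consider a range scan on idx_last_name
EXPLAIN SELECT * FROM customer WHERE last_name LIKE 'JOHN%';
```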
OR queries: if the column before OR is indexed and the column after it is not, none of the indexes involved are used
mysql> EXPLAIN SELECT * FROM actor WHERE last_name='WAHLBERG' OR first_name='NICK'\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: actor
partitions: NULL
type: ALL
possible_keys: idx_actor_last_name
key: NULL
key_len: NULL
ref: NULL
rows: 200
filtered: 19.00
Extra: Using where
1 row in set, 1 warning (0.00 sec)
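Two possible ways out are sketched below. The index name idx_actor_first_name is hypothetical and does not exist in sakila by default; whether the optimizer actually picks an index merge or the per-branch indexes depends on the data and the MySQL version.

```sql
-- Option 1: index the second column so both sides of the OR are indexed
-- (idx_actor_first_name is a hypothetical name)
ALTER TABLE actor ADD INDEX idx_actor_first_name (first_name);
EXPLAIN SELECT * FROM actor WHERE last_name = 'WAHLBERG' OR first_name = 'NICK';

-- Option 2: split the OR into two SELECTs; UNION removes rows that match both
EXPLAIN
SELECT * FROM actor WHERE last_name = 'WAHLBERG'
UNION
SELECT * FROM actor WHERE first_name = 'NICK';
```

Option 2 only helps when each branch has its own usable index, so in practice it is combined with the index from option 1.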
A LIKE pattern that starts with % cannot use the index
-- index not used
mysql> EXPLAIN SELECT * FROM actor WHERE last_name like '%WAHLBERG'\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: actor
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 200
filtered: 11.11
Extra: Using where
1 row in set, 1 warning (0.01 sec)
-- index used
mysql> EXPLAIN SELECT * FROM actor WHERE last_name like 'WAHLBERG%'\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: actor
partitions: NULL
type: range
possible_keys: idx_actor_last_name
key: idx_actor_last_name
key_len: 182
ref: NULL
rows: 2
filtered: 100.00
Extra: Using index condition
1 row in set, 1 warning (0.00 sec)
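In this case the full table scan can be avoided by selecting only columns contained in the index, so that the covering index is scanned instead: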
mysql> EXPLAIN SELECT actor_id,last_name FROM actor WHERE last_name like '%WAHLBERG%'\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: actor
partitions: NULL
type: index
possible_keys: NULL
key: idx_actor_last_name
key_len: 182
ref: NULL
rows: 200
filtered: 11.11
Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)
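For InnoDB tables, the following approaches can make imports faster:
Insert in primary key order
InnoDB tables are stored in primary key order, so sorting the import data by primary key noticeably speeds up the import. If an InnoDB table has no primary key, the system creates an internal column to act as one, so whenever you can define a primary key for the table, you can use this property to speed up imports.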
CREATE TABLE `tb_user_1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(45) NOT NULL,
`password` varchar(96) NOT NULL,
`name` varchar(45) NOT NULL,
`birthday` datetime DEFAULT NULL,
`sex` char(1) DEFAULT NULL,
`email` varchar(45) DEFAULT NULL,
`phone` varchar(45) DEFAULT NULL,
`qq` varchar(32) DEFAULT NULL,
`status` varchar(32) NOT NULL COMMENT 'user status',
`create_time` datetime NOT NULL,
`update_time` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_user_username` (`username`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ;
Data import
-- the rows in sql1.log are ordered by primary key; the load took 29.11 sec
mysql> load data local infile '/usr/local/mysql/temp/sql1.log' into table tb_user_1 fields terminated by ',' lines terminated by '\n';
Query OK, 1000000 rows affected, 65535 warnings (29.11 sec)
Records: 1000000 Deleted: 0 Skipped: 0 Warnings: 4000000
mysql> SELECT COUNT(*) FROM tb_user_1;
+----------+
| COUNT(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.27 sec)
-- the rows in sql2.log are unordered; the load took 1 min 19.81 sec
mysql> load data local infile '/usr/local/mysql/temp/sql2.log' into table tb_user_2 fields terminated by ',' lines terminated by '\n';
Query OK, 1000000 rows affected, 65535 warnings (1 min 19.81 sec)
Records: 1000000 Deleted: 0 Skipped: 0 Warnings: 4000000
mysql> select count(id) from tb_user_2;
+-----------+
| count(id) |
+-----------+
| 1000000 |
+-----------+
1 row in set (0.24 sec)
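Disable uniqueness checks
Run SET UNIQUE_CHECKS=0 before the import to turn off uniqueness checks and SET UNIQUE_CHECKS=1 after it to turn them back on; this can speed up the import (in the author's tests it made almost no difference).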
mysql> SET UNIQUE_CHECKS=0;
Query OK, 0 rows affected (0.00 sec)
mysql> load data local infile '/usr/local/mysql/temp/sql1.log' into table tb_user_1 fields terminated by ',' lines terminated by '\n';
Query OK, 1000000 rows affected, 65535 warnings (30.42 sec)
Records: 1000000 Deleted: 0 Skipped: 0 Warnings: 4000000
mysql> SET UNIQUE_CHECKS=1;
Query OK, 0 rows affected (0.02 sec)
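Commit the transaction manually
If the application uses autocommit, run SET AUTOCOMMIT=0 before the import to turn autocommit off and SET AUTOCOMMIT=1 after it to turn autocommit back on; this can also speed up the import.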
mysql> SET AUTOCOMMIT=0;
Query OK, 0 rows affected (0.00 sec)
mysql> load data local infile '/usr/local/mysql/temp/sql1.log' into table tb_user_1 fields terminated by ',' lines terminated by '\n';
Query OK, 1000000 rows affected, 65535 warnings (28.27 sec)
Records: 1000000 Deleted: 0 Skipped: 0 Warnings: 4000000
mysql> COMMIT;
mysql> SELECT COUNT(id) FROM tb_user_1;
+-----------+
| COUNT(id) |
+-----------+
| 1000000 |
+-----------+
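The three techniques can be combined in one session. The following is a sketch that reuses the table and file path from the examples above and assumes the file is already sorted by primary key; it is not a measured benchmark.

```sql
-- Relax checks for the duration of the bulk load (session scope only)
SET unique_checks = 0;
SET autocommit = 0;

-- Load the primary-key-ordered file inside a single transaction
LOAD DATA LOCAL INFILE '/usr/local/mysql/temp/sql1.log'
INTO TABLE tb_user_1
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

COMMIT;

-- Restore the defaults so later statements behave normally
SET unique_checks = 1;
SET autocommit = 1;
```

Disabling unique_checks is only safe when the input is known to contain no duplicate keys.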
CREATE TABLE `emp` (
`id` int UNSIGNED NOT NULL AUTO_INCREMENT,
`name` varchar(100) NOT NULL,
`age` int(3) NOT NULL,
`salary` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
insert into `emp` (`id`, `name`, `age`, `salary`) values('1','Tom','25','2300');
insert into `emp` (`id`, `name`, `age`, `salary`) values('2','Jerry','30','3500');
insert into `emp` (`id`, `name`, `age`, `salary`) values('3','Luci','25','2800');
insert into `emp` (`id`, `name`, `age`, `salary`) values('4','Jay','36','3500');
insert into `emp` (`id`, `name`, `age`, `salary`) values('5','Tom2','21','2200');
insert into `emp` (`id`, `name`, `age`, `salary`) values('6','Jerry2','31','3300');
insert into `emp` (`id`, `name`, `age`, `salary`) values('7','Luci2','26','2700');
insert into `emp` (`id`, `name`, `age`, `salary`) values('8','Jay2','33','3500');
insert into `emp` (`id`, `name`, `age`, `salary`) values('9','Tom3','23','2400');
insert into `emp` (`id`, `name`, `age`, `salary`) values('10','Jerry3','32','3100');
insert into `emp` (`id`, `name`, `age`, `salary`) values('11','Luci3','26','2900');
insert into `emp` (`id`, `name`, `age`, `salary`) values('12','Jay3','37','4500');
create index idx_emp_age_salary on emp(age,salary);
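MySQL sorts in two ways: filesort, where the returned rows are sorted explicitly (Extra shows Using filesort), and an ordered index scan, which returns rows already in order (Extra shows Using index). The two queries below show one of each: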
mysql> EXPLAIN SELECT * FROM emp ORDER BY age DESC\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: emp
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 12
filtered: 100.00
Extra: Using filesort
1 row in set, 1 warning (0.00 sec)
No query specified
mysql> EXPLAIN SELECT id,age FROM emp ORDER BY age DESC\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: emp
partitions: NULL
type: index
possible_keys: NULL
key: idx_emp_age_salary
key_len: 9
ref: NULL
rows: 12
filtered: 100.00
Extra: Using index
1 row in set, 1 warning (0.00 sec)
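Knowing how MySQL sorts, the goal is to avoid the extra sort and return ordered data straight from an index: the ORDER BY columns should follow the column order of the index and be all ascending or all descending. The examples below show an ORDER BY that matches the index, one whose column order differs, and one that mixes directions: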
mysql> EXPLAIN SELECT id,age FROM emp ORDER BY age,salary\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: emp
partitions: NULL
type: index
possible_keys: NULL
key: idx_emp_age_salary
key_len: 9
ref: NULL
rows: 12
filtered: 100.00
Extra: Using index
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN SELECT id,age FROM emp ORDER BY salary,age\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: emp
partitions: NULL
type: index
possible_keys: NULL
key: idx_emp_age_salary
key_len: 9
ref: NULL
rows: 12
filtered: 100.00
Extra: Using index; Using filesort
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN SELECT id,age FROM emp ORDER BY age DESC,salary ASC\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: emp
partitions: NULL
type: index
possible_keys: NULL
key: idx_emp_age_salary
key_len: 9
ref: NULL
rows: 12
filtered: 100.00
Extra: Using index; Using filesort
1 row in set, 1 warning (0.00 sec)
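The mixed-direction case has two possible remedies, sketched below against the same emp table. The index name idx_emp_age_desc_salary is hypothetical, and descending index columns only take effect in MySQL 8.0 and later (earlier versions parse DESC in an index definition but ignore it).

```sql
-- Both columns descending: the existing index can be read backwards,
-- so no extra sort should be needed
EXPLAIN SELECT id, age FROM emp ORDER BY age DESC, salary DESC;

-- MySQL 8.0 and later only: an index whose column directions match the ORDER BY
-- (idx_emp_age_desc_salary is a hypothetical name)
CREATE INDEX idx_emp_age_desc_salary ON emp (age DESC, salary ASC);
EXPLAIN SELECT id, age FROM emp ORDER BY age DESC, salary ASC;
```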
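Optimizing filesort
Suitable indexes reduce how often filesort appears, but in some cases constraints make it impossible to eliminate, and then the filesort itself has to be made faster. MySQL has two sorting algorithms for filesort:
1) Two-pass algorithm: used before MySQL 4.1. It first fetches the sort columns and row pointers that satisfy the condition and sorts them in the sort buffer; if the sort buffer is too small, the intermediate results are stored in temporary files on disk. After sorting, it goes back to the table through the row pointers to read the records, which can cause a large amount of random I/O.
2) Single-pass algorithm: fetches all the required columns of the matching rows at once, sorts them in the sort buffer, and outputs the result set directly. It uses more memory while sorting but is more efficient than the two-pass algorithm.
MySQL decides which algorithm to use by comparing the system variable max_length_for_sort_data with the total size of the columns the query fetches: if max_length_for_sort_data is larger, it uses the second, optimized algorithm; otherwise it uses the first. Raising sort_buffer_size and max_length_for_sort_data moderately enlarges the sort area and improves sorting efficiency.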
mysql> show variables like 'max_length_for_sort_data';
+--------------------------+-------+
| Variable_name | Value |
+--------------------------+-------+
| max_length_for_sort_data | 1024 |
+--------------------------+-------+
1 row in set (0.03 sec)
mysql> show variables like 'sort_buffer_size';
+------------------+--------+
| Variable_name | Value |
+------------------+--------+
| sort_buffer_size | 262144 |
+------------------+--------+
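Both variables can be raised for a single session before an expensive sort, which avoids changing the server-wide defaults. The values below are illustrative assumptions, and Sort_merge_passes is used here as a rough indicator of whether the sort spilled out of the sort buffer.

```sql
-- Baseline: number of merge passes the sort algorithm has needed so far
SHOW SESSION STATUS LIKE 'Sort_merge_passes';

-- Enlarge the sort area for this session only (example values)
SET SESSION sort_buffer_size = 1048576;          -- 1 MB
SET SESSION max_length_for_sort_data = 4096;

-- ... run the query that sorts here, then compare the counter again
SHOW SESSION STATUS LIKE 'Sort_merge_passes';
```

If the counter keeps growing for the same query, the sort buffer is still too small for that result set.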
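By default, MySQL 5.7 and earlier sort the result of GROUP BY col1,col2.. by the grouping columns. If a query includes GROUP BY but you want to avoid the cost of sorting the result, specify ORDER BY NULL to suppress the sort.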
mysql> EXPLAIN SELECT payment_date,sum(amount) from payment GROUP BY payment_date\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: payment
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 16086
filtered: 100.00
Extra: Using temporary; Using filesort
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN SELECT payment_date,sum(amount) from payment GROUP BY payment_date ORDER BY NULL\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: payment
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 16086
filtered: 100.00
Extra: Using temporary
1 row in set, 1 warning (0.00 sec)
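ORDER BY NULL removes the sort but not the temporary table. One further option, sketched here under the assumption that an extra index is acceptable, is an index that starts with the grouping column and also contains the aggregated column, so the groups can be read in index order. The name idx_payment_date_amount is hypothetical.

```sql
-- Hypothetical covering index: grouping column first, aggregated column second
ALTER TABLE payment ADD INDEX idx_payment_date_amount (payment_date, amount);

-- Re-check the plan; with a suitable index the optimizer may scan the index
-- instead of building a temporary table
EXPLAIN SELECT payment_date, SUM(amount) FROM payment GROUP BY payment_date;
```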
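For query clauses that contain OR, indexes are used only if every condition column joined by OR has its own index, and a composite index cannot serve this purpose; if a column has no index, consider adding one.
In short: analyze the SQL to find where the problem lies, then optimize the indexes and the frequently used SQL statements.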