这两天在看关于MySQL分页查询的优化问题,看到网上的不少文章。
MySQL的分页查询通常都离不开limit帮忙,而其中的关键主要围绕以下两点:
1. 尽量减小limit的offset,即较小“limit m, n”里面的“m”。参考文章:http://www.fuchaoqun.com/2009/04/efficient-pagination-using-mysql/
2. 仅先定位offset的id,再查需要的limit数据。参考文章:http://www.sqlclub.cn/Optimi/2009-04/2626.htm
参考的这篇文章中,就是通过定位当前页的起止id,或者目标页的起止id,来先定位offset。但是也有一定的局限性,比如,当几次翻页的过程中,如果增加或者删除了数据,那么查询也许就不是最“实时”的了。不过通过第二点,我也想到一点,就是先查找需要的id,再根据id“按图索骥”,查得需要的记录。毕竟通常id都是索引或者主键,查询起来肯定效率高,而且仅查询id一个字段,应该更加快速。尤其是在查询的时候,需要重新排序的时候(当然是根据索引排序,否则慢死了)。
无意中翻看Baron Schwartz写的《高性能MySQL(第二版)》,其中4.5.5章节讲了如何优化LIMIT和OFFSET,也证实了这种方法。
我自己实验了几个例子:
例子1:
表t_attorney结构如下,具有30000行数据
mysql> desc t_applicant;
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| internalNo | varchar(45) | YES | | NULL | |
| firstName | varchar(45) | YES | | NULL | |
| lastName | varchar(45) | YES | | NULL | |
| email | varchar(100) | YES | | NULL | |
| address | varchar(500) | YES | | NULL | |
| telNo | varchar(60) | YES | | NULL | |
| certificateNo | varchar(60) | YES | | NULL | |
| nationality | varchar(60) | YES | | NULL | |
| idCardNo | varchar(45) | YES | | NULL | |
+---------------+--------------+------+-----+---------+----------------+
10 rows in set (0.00 sec)
两种查询方式:
mysql> select a.internalNo, a.firstName, a.lastName, a.email, a.address, a.telNo, a.certificateNo, a.nationality, a.idCardNo from t_ applicant a inner join (select id from t_ applicant order by id desc limit 18000,15) as a2 using(id) ;
15 rows in set (0.03 sec)
mysql> select a.internalNo, a.firstName, a.lastName, a.email, a.address, a.telNo, a.certificateNo, a.nationality, a.idCardNo from t_applicant a order by id desc limit 18000,15;
15 rows in set (0.08 sec)
两种查询方式,第一种查询平均耗时0.03秒左右;而第二种平均耗时0.08秒左右。
例子2:
表t_patentCase结构如下,具有90000行数据。
mysql> desc t_patentCase;
+-------------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| internalNo | varchar(45) | NO | MUL | NULL | |
| name | varchar(45) | YES | | NULL | |
| appliedDate | datetime | YES | | NULL | |
| createdDate | datetime | YES | | NULL | |
| patentCaseType | int(11) | YES | | NULL | |
| firstAttorney_id | int(11) | YES | MUL | NULL | |
| secondAttorney_id | int(11) | YES | | NULL | |
| firstInventor_id | int(11) | YES | MUL | NULL | |
| firstApplicant_id | int(11) | YES | MUL | NULL | |
| firstContact_id | int(11) | YES | MUL | NULL | |
| applicationNo | varchar(45) | YES | | NULL | |
+-------------------+-------------+------+-----+---------+----------------+
12 rows in set (0.03 sec)
视图v_patentCase_summary是t_patentCase左连接另外4张表的结果视图,结构如下:
mysql> desc v_patentCase_summary;
+--------------------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------------+-------------+------+-----+---------+-------+
| id | int(11) | NO | | 0 | |
| internalNo | varchar(45) | NO | | NULL | |
| name | varchar(45) | YES | | NULL | |
| patentCaseType | int(11) | YES | | NULL | |
| createdDate | datetime | YES | | NULL | |
| appliedDate | datetime | YES | | NULL | |
| applicationNo | varchar(45) | YES | | NULL | |
| firstAttorney_id | int(11) | YES | | NULL | |
| firstAttorney_firstName | varchar(45) | YES | | NULL | |
| firstAttorney_lastName | varchar(45) | YES | | NULL | |
| secondAttorney_id | int(11) | YES | | NULL | |
| secondAttorney_firstName | varchar(45) | YES | | NULL | |
| secondAttorney_lastName | varchar(45) | YES | | NULL | |
| firstApplicant_id | int(11) | YES | | NULL | |
| firstApplicant_firstName | varchar(45) | YES | | NULL | |
| firstApplicant_lastName | varchar(45) | YES | | NULL | |
| firstInventor_id | int(11) | YES | | NULL | |
| firstInventor_firstName | varchar(45) | YES | | NULL | |
| firstInventor_lastName | varchar(45) | YES | | NULL | |
| firstContact_id | int(11) | YES | | NULL | |
| firstContact_firstName | varchar(45) | YES | | NULL | |
| firstContact_lastName | varchar(45) | YES | | NULL | |
+--------------------------+-------------+------+-----+---------+-------+
22 rows in set (0.00 sec)
两种查询方式:
mysql> select v.id, v.internalNo, v.name, v.patentCaseType, v.createdDate, v.appliedDate, v.applicationNo, v.firstAttorney_id, v.firstAttorney_firstName, v.firstAttorney_lastName, v.secondAttorney_id, v.secondAttorney_firstName, v.secondAttorney_lastName, v.firstApplicant_id, v.firstApplicant_firstName, v.firstApplicant_lastName, v.firstInventor_id, v.firstInventor_firstName, v.firstInventor_lastName, v.firstContact_id, v.firstContact_firstName, v.firstContact_lastName from v_patentCase_summary v right join (select id from t_patentCase order by id desc limit 60000,15) as p on v.id=p.id ;
15 rows in set (1.45 sec)
mysql> select v.id, v.internalNo, v.name, v.patentCaseType, v.createdDate, v.appliedDate, v.applicationNo, v.firstAttorney_id, v.firstAttorney_firstName, v.firstAttorney_lastName, v.secondAttorney_id, v.secondAttorney_firstName, v.secondAttorney_lastName, v.firstApplicant_id, v.firstApplicant_firstName, v.firstApplicant_lastName, v.firstInventor_id, v.firstInventor_firstName, v.firstInventor_lastName, v.firstContact_id, v.firstContact_firstName, v.firstContact_lastName from v_patentCase_summary v order by id desc limit 60000,15;
? rows in set (??? sec)
两种查询方式,第一种查询平均耗时1.50秒左右;而第二种过了数分钟尚无反应。
小结:
通过以上两个例子可以看出这种方法的效率改善,当然优化无止境,根据以后的具体应用要求,还可以继续优化offset的定位问题。