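A system I have been developing recently keeps getting slower as its internal data grows. The query behind the main screen is the worst offender: every time I open the page, it takes several seconds before the list shows up, which is a poor user experience.
And that is with only a little over a thousand rows...
So this clearly needs fixing. Let's work through it together.
The original SQL statement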
SELECT
cbi.id AS id,
cbi.name AS name,
cbi.case_no AS caseNo,
cbi.code AS code,
cbi.cause_code AS causeCode,
cbi.status AS status,
cbi.create_time AS createTime,
cbi.is_avoid AS isAvoid,
cbi.is_paid AS isPaid,
cbi.arbitration_fee as arbitrationFee,
cbi.case_type AS caseType,
cbi.source_platform AS sourcePlatform,
cbi.payment_pattern AS paymentPattern,
cp1.name AS applicantName,
cp2.name AS respondentName,
cp2.user_id AS respondentUserId,
co1.description AS description,
ca.user_id as agentUserId,
ca.name as agentName,
eb.name AS businessName,
IF(cbi.case_type = 0,
(
SELECT
COUNT( * )
FROM
cbi countcbi
LEFT JOIN cp countcp ON countcp.case_id = countcbi.id AND countcp.identify_type = 2
WHERE
countcp.user_id = cp2.user_id AND countcbi.case_type = 0 AND countcbi.id != cbi.id AND countcbi.status != 0
), (
SELECT
COUNT( * )
FROM
cbi countcbi
LEFT JOIN cp countcp ON countcp.case_id = countcbi.id AND countcp.identify_type = 2
WHERE
countcp.user_id = cp2.user_id AND countcbi.case_type = 1 AND countcbi.id != cbi.id AND countcbi.status != 0
)) AS relatedCasesNumber
FROM
cbi
LEFT JOIN cp cp1 ON cp1.case_id = cbi.id AND cp1.identify_type = 1
LEFT JOIN cp cp2 ON cp2.case_id = cbi.id AND cp2.identify_type = 2
LEFT JOIN ca ON ca.case_id = cbi.id
LEFT JOIN ct ON ct.case_id = cbi.id
LEFT JOIN audit ON audit.case_id = cbi.id AND audit.operator_type = 40
LEFT JOIN eb ON cbi.business_id = eb.id
LEFT JOIN (SELECT * FROM (SELECT * FROM co ORDER BY id DESC) a GROUP By a.case_id ORDER BY a.id DESC ) co1 ON co1.case_id = cbi.id
WHERE
cbi.status != 0
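Now I take this SQL out of the MyBatis mapper, drop the conditional filters from the second half of the WHERE clause, and run it in Navicat. The run time is as follows
Running it inside the system and timing it with timestamps, it takes roughly 7000-8000 ms, and that is with the user-ID filter applied, so only 300-odd rows come back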
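For reference, the timestamp measurement mentioned above can be sketched roughly as follows. This is a minimal sketch rather than the project's actual code: `QueryTimer`, `Timed`, and `timeMillis` are hypothetical names, the lambda stands in for the real MyBatis mapper call, and `System.nanoTime()` is used here because it is a monotonic clock.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not the project's code: times any call in milliseconds.
class QueryTimer {

    // Holds the value returned by the timed call plus the elapsed time.
    static final class Timed<T> {
        final T value;
        final long elapsedMillis;

        Timed(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    // Runs the supplier once and measures it with the monotonic clock.
    static <T> Timed<T> timeMillis(Supplier<T> query) {
        long start = System.nanoTime();
        T value = query.get();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Timed<>(value, elapsedMillis);
    }

    public static void main(String[] args) {
        // Stand-in for the real mapper call (assumption).
        Timed<List<String>> result = timeMillis(() -> Arrays.asList("case-1", "case-2"));
        System.out.println(result.value.size() + " rows in " + result.elapsedMillis + " ms");
    }
}
```

A single run like this includes connection and mapping overhead, so it is worth repeating the call a few times before comparing numbers.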
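First optimization: pull the IF block out and query it separately. In other words, run two queries from the code: fetch the list first, then loop over it and run the COUNT(*) query for each row
Java code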
List
SQL statements
SELECT
cbi.id AS id,
cbi.name AS name,
cbi.case_no AS caseNo,
cbi.code AS code,
cbi.cause_code AS causeCode,
cbi.status AS status,
cbi.create_time AS createTime,
cbi.is_avoid AS isAvoid,
cbi.is_paid AS isPaid,
cbi.arbitration_fee as arbitrationFee,
cbi.case_type AS caseType,
cbi.source_platform AS sourcePlatform,
cbi.payment_pattern AS paymentPattern,
cp1.name AS applicantName,
cp2.name AS respondentName,
cp2.user_id AS respondentUserId,
co1.description AS description,
ca.user_id as agentUserId,
ca.name as agentName,
eb.name AS businessName
FROM
cbi
LEFT JOIN cp cp1 ON cp1.case_id = cbi.id AND cp1.identify_type = 1
LEFT JOIN cp cp2 ON cp2.case_id = cbi.id AND cp2.identify_type = 2
LEFT JOIN ca ON ca.case_id = cbi.id
LEFT JOIN ct ON ct.case_id = cbi.id
LEFT JOIN audit ON audit.case_id = cbi.id AND audit.operator_type = 40
LEFT JOIN eb ON cbi.business_id = eb.id
LEFT JOIN (SELECT * FROM (SELECT * FROM co ORDER BY id DESC) a GROUP By a.case_id ORDER BY a.id DESC ) co1 ON co1.case_id = cbi.id
WHERE
cbi.status != 0;
-- Second SQL statement
SELECT
(
SELECT
COUNT( * )
FROM
cbi countcbi
LEFT JOIN cp countcp ON countcp.case_id = countcbi.id AND countcp.identify_type = 2
WHERE
countcp.user_id = cp2.user_id AND countcbi.id != cbi.id AND countcbi.status != 0
) AS relatedCasesNumber
FROM
cbi
LEFT JOIN cp cp2 ON cp2.case_id = cbi.id AND cp2.identify_type = 2
WHERE
cbi.id = #{id};
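With the IF COUNT(*) part removed, the time for this part queried on its own is
Timing it with timestamps in the Java code, the complete lookup still took 3000+ ms, again for the same 300+ rows
That is painfully slow. So why not break everything out into single-table queries? Because the second half of this SQL carries a pile of filter conditions that touch the other tables... take the joins out and the filtering becomes awkward
I spent the whole first afternoon figuring out how to optimize this in Java. Pulling IF COUNT(*) out into its own query did cut the time roughly in half, but pulling out the other parts and querying them in a loop made little difference, and they are hard to separate anyway
That evening at home I thought it over seriously... and decided I was probably heading in the wrong direction. It is better to start from the SQL wherever possible, since some things are simply better optimized there. If a direct SQL query can finish in a few hundredths of a second, why write extra code?
Second day's attempt
The SQL after modification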
SELECT
cbi.id AS id,
cbi.name AS name,
cbi.case_no AS caseNo,
cbi.code AS code,
cbi.cause_code AS causeCode,
cbi.status AS status,
cbi.create_time AS createTime,
cbi.is_avoid AS isAvoid,
cbi.is_paid AS isPaid,
cbi.arbitration_fee as arbitrationFee,
cbi.case_type AS caseType,
cbi.source_platform AS sourcePlatform,
cbi.payment_pattern AS paymentPattern,
cp1.name AS applicantName,
cp2.name AS respondentName,
cp2.user_id AS respondentUserId,
co1.description AS description,
ct.division_user as divisionUserId,
ca.user_id as agentUserId,
ca.name as agentName,
eb.name AS businessName,
ct.start_time AS trialTime
FROM
cbi
LEFT JOIN (SELECT `name`,case_id,user_id FROM cp WHERE identify_type = 1) cp1 ON cp1.case_id = cbi.id
LEFT JOIN (SELECT `name`,case_id,user_id FROM cp WHERE identify_type = 2) cp2 ON cp2.case_id = cbi.id
LEFT JOIN (SELECT `name`,case_id,user_id FROM ca) ca ON ca.case_id = cbi.id
LEFT JOIN ct ON ct.case_id = cbi.id
LEFT JOIN (SELECT create_time,case_id FROM co WHERE operator_type = 40) audit ON audit.case_id = cbi.id
LEFT JOIN enterprise_business eb ON cbi.business_id = eb.id
LEFT JOIN (SELECT * FROM (SELECT * FROM case_operator ORDER BY id DESC) a GROUP By a.case_id ORDER BY a.id DESC ) co1 ON co1.case_id = cbi.id
WHERE
cbi.status != 0
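Putting a SELECT subquery after each LEFT JOIN, so that only the needed columns are fetched, greatly improved the query speed here
Then, with the conditions removed, querying everything directly
That speed should hold up for a while, just about...
With paging added in the Java code and 300+ rows, the query currently takes 199 ms
After this SQL optimization I considered putting the COUNT(*) part back in, but that seemed to make it too slow again
So the final optimization plan is
Join against SELECT subqueries after LEFT JOIN, and pull the counting part out to be queried separately. On the Java side, first query the basic multi-table information, then loop over the result once, run the count query for each row, and write the value back into the corresponding map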
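The Java side of that plan can be sketched roughly like this. It is a minimal sketch under assumptions, not the project's code: `CaseListAssembler` and `attachRelatedCounts` are hypothetical names, the rows are assumed to be `Map<String, Object>` with a `Long` under the key `id`, and the lambda stands in for the mapper method that runs the second SQL statement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical sketch of the two-pass lookup; the names are not the project's.
class CaseListAssembler {

    // Pass 2: for every row from the list query, run the count query and
    // write the result back into that row's map.
    static List<Map<String, Object>> attachRelatedCounts(
            List<Map<String, Object>> rows,
            ToIntFunction<Long> countRelatedCases) {
        for (Map<String, Object> row : rows) {
            Long caseId = (Long) row.get("id");
            row.put("relatedCasesNumber", countRelatedCases.applyAsInt(caseId));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Pass 1 stand-in: rows the multi-table list query would return.
        List<Map<String, Object>> rows = new ArrayList<>();
        for (long id = 1; id <= 3; id++) {
            Map<String, Object> row = new HashMap<>();
            row.put("id", id);
            rows.add(row);
        }
        // Stand-in for the second SQL statement (WHERE cbi.id = #{id}).
        attachRelatedCounts(rows, caseId -> (int) (caseId % 2));
        System.out.println(rows);
    }
}
```

Note that this issues one count query per row, so the cost grows with the page size; with paging in place the loop only runs over the rows of the current page.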
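This optimization took about six or seven hours in total, not counting the evening spent thinking
The amount of data is actually small: querying the main table alone returns only 1395 rows. But the many LEFT JOINs multiply rows, which produces a lot of redundant data during the query.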
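To make the redundancy concrete, here is a small arithmetic sketch of how rows multiply. It assumes every joined table matches on `case_id = cbi.id` alone; `JoinFanOut`, `rowsForOneCase`, and the match counts are made up for illustration and are not measurements from this system.

```java
// Hypothetical illustration: how many rows one case produces when several
// tables are LEFT JOINed to it on case_id alone.
class JoinFanOut {

    // A LEFT JOIN keeps the case even with zero matches (one row of NULLs),
    // so each joined table contributes a factor of max(1, matches).
    static long rowsForOneCase(int... matchesPerJoinedTable) {
        long rows = 1;
        for (int matches : matchesPerJoinedTable) {
            rows *= Math.max(1, matches);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up counts: 2 matches in one table, 3 in another, 0 in a third.
        System.out.println(rowsForOneCase(2, 3, 0)); // prints 6
    }
}
```

So a single case with a few rows in each child table can already turn into several result rows before any filtering happens.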
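Next, about MySQL's `IF`(expr1,expr2,expr3) expression: used purely to test a value and return a result, it should be a fine tool, but forcing it onto counting hurts performance badly. For example, if I change it to a plain check on sex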
IF(cp1.sex = 0,('男'), ('女')) AS cp_sex
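the overall query time is then
With or without this expression the difference is only 0.001 s, so I think this kind of use is fine. If I moved it into Java instead (looping over the result and deciding for each row whether to return male or female), I expect it would take more time than letting SQL handle it.
The counting SQL, on the other hand, does not really need a conditional at all; the IF was forced in. Even if I drop the IF and simply run a count for every single row, it is still faster than the version with IF. For example, querying like this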
SELECT
-- N columns omitted here
( SELECT
COUNT( * )
FROM
case_base_info countcbi
LEFT JOIN case_party countcp ON countcp.case_id = countcbi.id AND countcp.identify_type = 2
WHERE
countcp.user_id = cp2.user_id AND countcbi.case_type = 1 AND countcbi.id != cbi.id ) AS relatedCasesNumber
FROM
cbi
LEFT JOIN (SELECT `name`,case_id,user_id,sex FROM cp WHERE identify_type = 1) cp1 ON cp1.case_id = cbi.id
-- other table joins omitted here
WHERE
cbi.status != 0
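Query time
So my conclusion is: do not force LEFT JOIN or the `IF`(expr1,expr2,expr3) expression into places where they do not belong. Whatever SQL can solve, solve in SQL where possible; where code can handle it faster, handle it in code. In the future I may face tens of thousands of rows, and optimizing for that may well be another challenge.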