mysql 关联表查询优化(3) - IN、EXISTS

IN、EXISTS 使用方式以及区别

数据背景

协议表数据.png

协议签署表数据.png

协议表数据总数为277811,协议签署表总数为541621 一看数据 协议签署表是协议表的两倍 就误以为使用EXISTS 性能会好一点但结果如何呢?下面的生产环境在跑的结果查询耗时20s, 不对呀 !为啥才几十万的数据为啥能查询时间这么长!难道是因为join 的表太多了导致时间太久?

原始协议查询sql.png

进一步查看改sql 的执行计划 居然执行了全表扫描,明明ps.signer_ent_id = '979425854629085184' 使用了索引字段查询按道理会走索引,但真实却没有。所以查询时间才这么久!

原始协议查询sql查询耗时.png
原始协议查询sql查询执行计划.png

原始sql 使用EXISTS

-- 原始sql  使用EXISTS
EXPLAIN SELECT
    ep.id,
    ep.expired_date AS expireDate,
    ep.create_id AS createId,
    ep.create_time AS createTime,
    ep.finish_time AS finishTime,
    ep.dead_time AS deadTime,
    ep.page_no AS pageNo,
    ep.protocol_code AS protocolCode,
    ep.buss_type AS bussType,
    ep.protocol_type AS protocolType,
    ep.contract_name AS contractName,
    ep.contract_no AS contractNo,
    ep.biz_id AS bizId,
    ep.biz_code AS bizCode,
    ep.sign_state AS signState,
    ep.remark AS remark,
    ep.is_sign AS isSign,
    ep.system_channel AS systemChannel,
    ep.fail_reason AS failReason,
    ep.download_state AS downloadState,
    ep.can_download AS canDownload,
    ep.file_temp_name AS fileTempName,
    ep.file_related_id AS fileRelatedId,
    ep.file_save_type AS fileSaveType,

IF (ep.file_save_type = 2, 1, 0) AS isBackup,
 ps1.signer_id AS signerAId,
 ps1.signer_ent_id AS signerAEntId,
 ps1.signer_name AS signerAName,
 ps1.sign_time AS signATime,
 ps1.signer_state AS signerAState,
 ps1.remark AS aRemark,
 ps2.signer_id AS signerBId,
 ps2.signer_ent_id AS signerBEntId,
 ps2.signer_name AS signerBName,
 ps2.sign_time AS signBTime,
 ps2.signer_state AS signerBState,
 ps2.remark AS bRemark,
 ps3.signer_id AS signerCId,
 ps3.signer_ent_id AS signerCEntId,
 ps3.signer_name AS signerCName,
 ps3.sign_time AS signCTime,
 ps3.signer_state AS signerCState,
 ps3.remark AS cRemark
FROM
    pub_enterprise_protocol ep
LEFT JOIN pub_enterprise_protocol_signer ps1 ON ps1.contract_no = ep.contract_no
AND ps1.signer_type = 'A'
LEFT JOIN pub_enterprise_protocol_signer ps2 ON ps2.contract_no = ep.contract_no
AND ps2.signer_type = 'B'
LEFT JOIN pub_enterprise_protocol_signer ps3 ON ps3.contract_no = ep.contract_no
AND ps3.signer_type = 'C'
WHERE
    1 = 1
AND EXISTS (
    SELECT
        1
    FROM
        pub_enterprise_protocol_signer ps
    WHERE
        ps.contract_no = ep.contract_no
    AND ps.signer_ent_id = '979425854629085184'
)
AND ep.protocol_type = '1'
LIMIT 0,
 100;

优化分析

当子查询结果集很大,而外部表较小的时候,查询效率会优于IN。
当子查询结果集较小,而外部表很大的时候,IN的查询效率会优于Exists。
改成in 看一下结果

优化后协议查询sql查询耗时.png
优化后协议查询sql查询执行计划.png

优化后sql 使用in

-- 优化后sql 使用in 
EXPLAIN SELECT
    ep.id,
    ep.expired_date AS expireDate,
    ep.create_id AS createId,
    ep.create_time AS createTime,
    ep.finish_time AS finishTime,
    ep.dead_time AS deadTime,
    ep.page_no AS pageNo,
    ep.protocol_code AS protocolCode,
    ep.buss_type AS bussType,
    ep.protocol_type AS protocolType,
    ep.contract_name AS contractName,
    ep.contract_no AS contractNo,
    ep.biz_id AS bizId,
    ep.biz_code AS bizCode,
    ep.sign_state AS signState,
    ep.remark AS remark,
    ep.is_sign AS isSign,
    ep.system_channel AS systemChannel,
    ep.fail_reason AS failReason,
    ep.download_state AS downloadState,
    ep.can_download AS canDownload,
    ep.file_temp_name AS fileTempName,
    ep.file_related_id AS fileRelatedId,
    ep.file_save_type AS fileSaveType,

IF (ep.file_save_type = 2, 1, 0) AS isBackup,
 ps1.signer_id AS signerAId,
 ps1.signer_ent_id AS signerAEntId,
 ps1.signer_name AS signerAName,
 ps1.sign_time AS signATime,
 ps1.signer_state AS signerAState,
 ps1.remark AS aRemark,
 ps2.signer_id AS signerBId,
 ps2.signer_ent_id AS signerBEntId,
 ps2.signer_name AS signerBName,
 ps2.sign_time AS signBTime,
 ps2.signer_state AS signerBState,
 ps2.remark AS bRemark,
 ps3.signer_id AS signerCId,
 ps3.signer_ent_id AS signerCEntId,
 ps3.signer_name AS signerCName,
 ps3.sign_time AS signCTime,
 ps3.signer_state AS signerCState,
 ps3.remark AS cRemark
FROM
    pub_enterprise_protocol ep
LEFT JOIN pub_enterprise_protocol_signer ps1 ON ps1.contract_no = ep.contract_no
AND ps1.signer_type = 'A'
LEFT JOIN pub_enterprise_protocol_signer ps2 ON ps2.contract_no = ep.contract_no
AND ps2.signer_type = 'B'
LEFT JOIN pub_enterprise_protocol_signer ps3 ON ps3.contract_no = ep.contract_no
AND ps3.signer_type = 'C'
WHERE
    ep.contract_no IN (
        SELECT
            ps.contract_no
        FROM
            pub_enterprise_protocol_signer ps
        WHERE
            ps.signer_ent_id = '979425854629085184'
    )
AND ep.protocol_type = '1'
LIMIT 0,
 100;

结果证明 改成in后查询时间只需0.019s 相比优化前20.57s 提升1085倍,并且查看了优化后的执行计划 条件命中索引 扫描的行数也只有32 行 相比原来277811 行全表扫描,可谓提升巨大

问题分析 EXISTS变慢了?

由EXISTS执行计划看到 子查询并没有走索引走的是where 接下来做个实验 去掉查询条件


原始协议查询不带条件执行计划.png

发现还是会全表扫描 但是子查询走了索引 索引字段是合同编号 查询时间也到了毫秒级别 这才是一个正常的查询时间(虽然和in 还有一倍的差距 但是也提升巨大)
子查询中 两个条件 (A)ps.contract_no = ep.contract_no(B)AND ps.signer_ent_id = '979425854629085184' 都是索引列为啥使用B 条件 反而没走索引了?
有执行计划中的子查询可知:
DEPENDENT SUBQUERY:子查询中的第一个SELECT,取决于外面的查询 ,DEPENDENT SUBQUERY , 那么就会先执行外部的查询 , 然后再循环执行内部的查询
MATERIALIZED: 子查询视图
EXISTS 的子查询出现 DEPENDENT SUBQUERY 先查询了外部表277811 行 在根据查询子查询条件 当只有A时走了contract_no索引字段 当同时出现B时 虽然执行计划上显示可能用到的索引字段是contract_no 但最终走了where 条件,这就说明了EXISTS 慢的原因

你可能感兴趣的:(mysql 关联表查询优化(3) - IN、EXISTS)