The Beauty of SQL Rewriting: How a SELECT Deadlock in MySQL Prompted a Rethink of SQL Optimization

Background of the problem SQL

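The affected database was MySQL 5.1 with MyISAM tables. A business SELECT query interacted with an UPDATE statement and, unexpectedly, triggered a database deadlock.
The specific problem is shown below:
[Figure 1: the deadlock as observed]
Under normal circumstances a SELECT on a MyISAM table does not deadlock with an UPDATE. However, the database version was too old and contained unknown, hard-to-resolve bugs. We tried upgrading the database version and changing the table storage engine, but while testing the upgrade plan many business SQL statements returned wrong results and the database as a whole responded slowly. The upgrade plan could not pass, so the only option was to optimize the SQL to shorten the time the SELECT spends holding and waiting for locks, lowering the probability of deadlock, and only afterwards to upgrade the application and the database and change the table engine step by step.
The SQL took 13 s to run. Its text is as follows: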

SELECT .. ..
  FROM weituoanjian WA
  LEFT JOIN phoneinfo p
    ON wa.daoruId = p.drId
   AND wa.subId = p.subId
  LEFT JOIN employee e
    ON employeeId = beifenpeizhe
  LEFT JOIN mst_tuianleibie
    ON WA.TuiAnCode = mst_tuianleibie.TuiAnCode
  LEFT JOIN kehutel
    ON WA.keHuCode = kehutel.keHuId
  LEFT JOIN mst_anjianstatus
    ON WA.AnJianStatusCode = mst_anjianstatus.AnJianStatusCode
  LEFT OUTER JOIN mst_anjianstatus2
    ON WA.taCode = mst_anjianstatus2.TAStatusCode
  LEFT JOIN anjianpici
    ON anjianpicicode = wa.pcCode
 WHERE IFNULL(historyFlag, 0) <> 1
   AND (REPLACE(ChiKaRenXingMing, ' ', '') LIKE '%13335192949%' OR
        ckrpinyin LIKE '%13335192949%' OR ckrszm LIKE '%13335192949%' OR
        ZhengJianHaoMa LIKE '%13335192949%' OR KaHao LIKE '%13335192949%' OR
        ZhangHao LIKE '%13335192949%' OR email3 LIKE '%13335192949%' OR
        email2 LIKE '%13335192949%' OR
        REPLACE(bqContent, ' ', '') LIKE '%13335192949%' OR
        REPLACE(BeiZhu1, ' ', '') LIKE '%13335192949%' OR
        REPLACE(BeiZhu2, ' ', '') LIKE '%13335192949%' OR
        REPLACE(DanWeiMingCheng, ' ', '') LIKE '%13335192949%' OR
        pNo LIKE '%13335192949%' OR pName LIKE '%13335192949%' OR
        (wa.daoruId, wa.subId) IN
        (SELECT DISTINCT drId, subId
           FROM bgbinfo
          WHERE bgInfo LIKE '%13335192949%') OR
        ZhengJianHaoMa IN
        (SELECT DISTINCT ShenFenZhengID
           FROM inputlog
          WHERE REPLACE(inputText, ' ', '') LIKE '%13335192949%'))
 ORDER BY ZhengJianHaoMa, daoruId, subId LIMIT 0, 40;

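Looking closely at the SQL text: it filters with LIKE '%13335192949%' fuzzy matches (a leading wildcard, which a B-tree index cannot serve), wraps columns in functions as in IFNULL(historyFlag, 0) <> 1, and has no column with good selectivity to filter on. None of the predicates in the WHERE clause can be sped up by adding an index.
The execution plan is as follows:
[Figure 2: execution plan of the original SQL]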

Analyzing and verifying the performance bottleneck of a complex SQL

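According to the execution plan, the SQL runs the subqueries on bgbinfo and inputlog as DEPENDENT SUBQUERY (correlated subqueries) and then joins with the other tables.
Split the SQL and run the parts separately to verify where the bottleneck is: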

SELECT .. ..
  FROM weituoanjian WA
  LEFT JOIN phoneinfo p
    ON wa.daoruId = p.drId
   AND wa.subId = p.subId
  LEFT JOIN employee e
    ON employeeId = beifenpeizhe
  LEFT JOIN mst_tuianleibie
    ON WA.TuiAnCode = mst_tuianleibie.TuiAnCode
  LEFT JOIN kehutel
    ON WA.keHuCode = kehutel.keHuId
  LEFT JOIN mst_anjianstatus
    ON WA.AnJianStatusCode = mst_anjianstatus.AnJianStatusCode
  LEFT OUTER JOIN mst_anjianstatus2
    ON WA.taCode = mst_anjianstatus2.TAStatusCode
  LEFT JOIN anjianpici
    ON anjianpicicode = wa.pcCode
 WHERE IFNULL(historyFlag, 0) <> 1
   AND (REPLACE(ChiKaRenXingMing, ' ', '') LIKE '%13335192949%' OR
        ckrpinyin LIKE '%13335192949%' OR ckrszm LIKE '%13335192949%' OR
        ZhengJianHaoMa LIKE '%13335192949%' OR KaHao LIKE '%13335192949%' OR
        ZhangHao LIKE '%13335192949%' OR email3 LIKE '%13335192949%' OR
        email2 LIKE '%13335192949%' OR
        REPLACE(bqContent, ' ', '') LIKE '%13335192949%' OR
        REPLACE(BeiZhu1, ' ', '') LIKE '%13335192949%' OR
        REPLACE(BeiZhu2, ' ', '') LIKE '%13335192949%' OR
        REPLACE(DanWeiMingCheng, ' ', '') LIKE '%13335192949%' OR
        pNo LIKE '%13335192949%' OR pName LIKE '%13335192949%');

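This part of the SQL returns within 0.5 s. The initial diagnosis is therefore that the bottleneck is not the multi-table LEFT JOINs but the correlated subqueries on bgbinfo and inputlog.
Rewrite the IN subqueries in the SQL as equivalent INNER JOINs.
The bgbinfo part of the SQL is rewritten as follows: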

SELECT .. .
          FROM weituoanjian WA
          LEFT JOIN phoneinfo p
            ON wa.daoruId = p.drId
           AND wa.subId = p.subId
          LEFT JOIN employee e
            ON employeeId = beifenpeizhe
          LEFT JOIN mst_tuianleibie
            ON WA.TuiAnCode = mst_tuianleibie.TuiAnCode
          LEFT JOIN kehutel
            ON WA.keHuCode = kehutel.keHuId
          LEFT JOIN mst_anjianstatus
            ON WA.AnJianStatusCode = mst_anjianstatus.AnJianStatusCode
          LEFT OUTER JOIN mst_anjianstatus2
            ON WA.taCode = mst_anjianstatus2.TAStatusCode
          LEFT JOIN anjianpici
            ON anjianpicicode = wa.pcCode
         INNER JOIN (SELECT drId, subId
                      FROM bgbinfo
                     WHERE bgInfo LIKE '%13335192949%'
                     GROUP BY drId, subId) m
            ON m.drId = wa.daoruId
           AND m.subId = wa.subId
         WHERE (historyFlag > 1 or historyFlag < 1 or historyFlag is null)

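This part returns in 0.6 s. The execution plan is as follows:
[Figure 3: execution plan of the bgbinfo rewrite]
The plan shows that bgbinfo is first deduplicated on drId, subId through a full index scan; the resulting set then becomes the driving table and drives the WA table in a nested loop.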
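As a sanity check on the claim that the IN subquery and the INNER JOIN on a deduplicated derived table are equivalent, here is a minimal sketch. It makes several assumptions: SQLite (via Python's standard sqlite3 module) stands in for MySQL, the tables are cut down to the join keys plus a made-up name column, and the sample rows are invented. It only illustrates result equivalence, not MySQL's execution plan or timings.

```python
import sqlite3

# Reduced stand-ins for weituoanjian (wa) and bgbinfo: only the join keys
# and one payload column are kept. All sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wa (daoruId INTEGER, subId INTEGER, name TEXT);
    CREATE TABLE bgbinfo (drId INTEGER, subId INTEGER, bgInfo TEXT);
    INSERT INTO wa VALUES (1, 1, 'a'), (1, 2, 'b'), (2, 1, 'c');
    INSERT INTO bgbinfo VALUES
        (1, 1, 'tel 13335192949'),
        (1, 1, 'again 13335192949'),
        (2, 1, 'no match');
""")
pattern = "%13335192949%"

# Original form: row-value IN subquery.
in_rows = conn.execute("""
    SELECT name FROM wa
     WHERE (daoruId, subId) IN
           (SELECT DISTINCT drId, subId FROM bgbinfo WHERE bgInfo LIKE ?)
     ORDER BY name
""", (pattern,)).fetchall()

# Rewritten form: INNER JOIN on a derived table deduplicated by GROUP BY.
join_rows = conn.execute("""
    SELECT wa.name FROM wa
     INNER JOIN (SELECT drId, subId FROM bgbinfo
                  WHERE bgInfo LIKE ? GROUP BY drId, subId) m
        ON m.drId = wa.daoruId AND m.subId = wa.subId
     ORDER BY wa.name
""", (pattern,)).fetchall()

# Without the GROUP BY the join fans out: key (1, 1) matches two bgbinfo rows.
naive_rows = conn.execute("""
    SELECT wa.name FROM wa
     INNER JOIN bgbinfo b
        ON b.drId = wa.daoruId AND b.subId = wa.subId
     WHERE b.bgInfo LIKE ?
     ORDER BY wa.name
""", (pattern,)).fetchall()

print(in_rows, join_rows, naive_rows)
```

The GROUP BY inside the derived table is what keeps the join equivalent to IN: without it, a key that matches several bgbinfo rows would multiply the WA rows.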
The inputlog part of the SQL is rewritten as follows:

SELECT .. .
  FROM weituoanjian WA
  LEFT JOIN phoneinfo p
    ON wa.daoruId = p.drId
   AND wa.subId = p.subId
  LEFT JOIN employee e
    ON employeeId = beifenpeizhe
  LEFT JOIN mst_tuianleibie
    ON WA.TuiAnCode = mst_tuianleibie.TuiAnCode
  LEFT JOIN kehutel
    ON WA.keHuCode = kehutel.keHuId
  LEFT JOIN mst_anjianstatus
    ON WA.AnJianStatusCode = mst_anjianstatus.AnJianStatusCode
  LEFT OUTER JOIN mst_anjianstatus2
    ON WA.taCode = mst_anjianstatus2.TAStatusCode
  LEFT JOIN anjianpici
    ON anjianpicicode = wa.pcCode
 INNER JOIN (SELECT ShenFenZhengID
               FROM inputlog
              WHERE REPLACE(inputText, ' ', '') LIKE '%13335192949%'
              GROUP BY ShenFenZhengID) n
    ON wa.ZhengJianHaoMa = n.ShenFenZhengID
 WHERE (historyFlag > 1 or historyFlag < 1 or historyFlag is null)

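After rewriting the IN subquery, the execution plan is as follows:
[Figure 4: execution plan after rewriting the IN subquery]
After adding an index on the ZhengJianHaoMa column of the WA table, the result set comes back in 6 s. The execution plan with the index is as follows:
[Figure 5: execution plan after adding the index]
The plan shows that inputlog is first deduplicated on ShenFenZhengID through a full index scan; the resulting set then drives the WA table in a nested loop.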
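Both rewritten statements above also replace IFNULL(historyFlag, 0) <> 1 with (historyFlag > 1 or historyFlag < 1 or historyFlag is null). The sketch below checks that the two predicates select the same rows, including rows where historyFlag is NULL. It assumes SQLite as a stand-in for MySQL and a hypothetical one-column table t with invented values.

```python
import sqlite3

# Hypothetical table with one row per interesting historyFlag value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, historyFlag INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, None), (2, 0), (3, 1), (4, 2)],
)

def ids(where_clause):
    """Return the ids selected by the given WHERE clause, in id order."""
    sql = f"SELECT id FROM t WHERE {where_clause} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]

original = ids("IFNULL(historyFlag, 0) <> 1")
rewritten = ids("(historyFlag > 1 OR historyFlag < 1 OR historyFlag IS NULL)")
# A bare inequality is NOT equivalent: NULL <> 1 evaluates to NULL,
# so the row with a NULL flag is silently dropped.
bare = ids("historyFlag <> 1")

print(original, rewritten, bare)
```

The explicit IS NULL branch is the part that matters: dropping it would change the result set for rows with no flag.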
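The SQL rewrite and index optimization are essentially finished at this point. Execution time dropped from 13 s to 7 s, so the gain from indexing and equivalent rewriting is modest. It did, however, confirm that the bottleneck is the deduplication of the entire inputlog table (about 1.5 million rows) by ShenFenZhengID, and that the SQL cannot be improved any further through equivalent rewriting alone.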

Optimizing the business implementation

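Since the SQL itself cannot be optimized further, the remaining option is to change how the business logic is implemented. The ShenFenZhengID and inputText columns of inputlog are never updated or deleted, so a scheduled job can periodically take the inputlog rows dated before midnight of the current day, deduplicate them by ShenFenZhengID, and insert them into a new table, inputlog_day. At query time that table is merged with the rows added to inputlog during the current day. The scheduled job can be implemented in application code or as a stored procedure; the key point is to eliminate the high cost of a full index scan on inputlog for every call of the SQL.
The SQL for the scheduled job is as follows: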

INSERT INTO inputlog_day
SELECT ShenFenZhengID,  '13335192949'
  FROM inputlog
 WHERE REPLACE(inputText, ' ', '') LIKE '%13335192949%'
   AND InputDate < DATE_FORMAT(CURRENT_DATE, '%Y-%m-%d 00:00:00')
 GROUP BY ShenFenZhengID;

The business logic and SQL are implemented as follows; the SQL returns its result in about 0.2 s.

SELECT .. .
  FROM weituoanjian WA
  LEFT JOIN phoneinfo p
    ON wa.daoruId = p.drId
   AND wa.subId = p.subId
  LEFT JOIN employee e
    ON employeeId = beifenpeizhe
  LEFT JOIN mst_tuianleibie
    ON WA.TuiAnCode = mst_tuianleibie.TuiAnCode
  LEFT JOIN kehutel
    ON WA.keHuCode = kehutel.keHuId
  LEFT JOIN mst_anjianstatus
    ON WA.AnJianStatusCode = mst_anjianstatus.AnJianStatusCode
  LEFT OUTER JOIN mst_anjianstatus2
    ON WA.taCode = mst_anjianstatus2.TAStatusCode
  LEFT JOIN anjianpici
    ON anjianpicicode = wa.pcCode
 INNER JOIN (SELECT ShenFenZhengID
               from inputlog_day
              WHERE phoneNum = '13335192949'
             UNION
             SELECT ShenFenZhengID
               FROM inputlog
              WHERE REPLACE(inputText, ' ', '') LIKE '%13335192949%'
                AND InputDate >=
                    DATE_FORMAT(CURRENT_DATE, '%Y-%m-%d 00:00:00')
              GROUP BY ShenFenZhengID) n
    ON wa.ZhengJianHaoMa = n.ShenFenZhengID
 WHERE (historyFlag > 1 or historyFlag < 1 or historyFlag is null)
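The snapshot-plus-today pattern can be sketched end to end as follows. This is an illustration under stated assumptions rather than the production job: SQLite stands in for MySQL, the cutoff is a fixed string TODAY instead of CURRENT_DATE so the example is deterministic, timestamps are stored as ISO text, and the rows are invented. Only the table and column names inputlog, inputlog_day, ShenFenZhengID, inputText, InputDate, and phoneNum come from the SQL above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inputlog (ShenFenZhengID TEXT, inputText TEXT, InputDate TEXT);
    CREATE TABLE inputlog_day (ShenFenZhengID TEXT, phoneNum TEXT);
    INSERT INTO inputlog VALUES
        ('ID-A', 'call 133 3519 2949', '2024-01-01 09:00:00'),
        ('ID-A', 'call 13335192949',   '2024-01-02 10:00:00'),
        ('ID-B', 'unrelated',          '2024-01-02 11:00:00'),
        ('ID-C', '13335192949 again',  '2024-01-03 08:30:00');
""")

PHONE = "13335192949"
PATTERN = f"%{PHONE}%"
TODAY = "2024-01-03 00:00:00"  # assumed stand-in for midnight of CURRENT_DATE

# Scheduled job: materialize the deduplicated matches dated before midnight.
conn.execute("""
    INSERT INTO inputlog_day
    SELECT ShenFenZhengID, ?
      FROM inputlog
     WHERE REPLACE(inputText, ' ', '') LIKE ?
       AND InputDate < ?
     GROUP BY ShenFenZhengID
""", (PHONE, PATTERN, TODAY))

# Query time: snapshot rows UNION today's rows scanned live.
merged = sorted(row[0] for row in conn.execute("""
    SELECT ShenFenZhengID FROM inputlog_day WHERE phoneNum = ?
    UNION
    SELECT ShenFenZhengID
      FROM inputlog
     WHERE REPLACE(inputText, ' ', '') LIKE ?
       AND InputDate >= ?
     GROUP BY ShenFenZhengID
""", (PHONE, PATTERN, TODAY)))

# Reference: the expensive full deduplication over the whole table.
full = sorted(row[0] for row in conn.execute("""
    SELECT ShenFenZhengID
      FROM inputlog
     WHERE REPLACE(inputText, ' ', '') LIKE ?
     GROUP BY ShenFenZhengID
""", (PATTERN,)))

print(merged, full)
```

The two halves use complementary conditions (InputDate < cutoff and InputDate >= cutoff), so every inputlog row is covered exactly once, and UNION removes any ShenFenZhengID that appears in both the snapshot and today's rows. The equivalence relies on the article's premise that old inputlog rows are never updated or deleted.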

After optimization the complete SQL runs in 1.1 s. The final rewritten SQL is as follows:

SELECT *
  FROM (SELECT .. .
          FROM weituoanjian WA
          LEFT JOIN phoneinfo p
            ON wa.daoruId = p.drId
           AND wa.subId = p.subId
          LEFT JOIN employee e
            ON employeeId = beifenpeizhe
          LEFT JOIN mst_tuianleibie
            ON WA.TuiAnCode = mst_tuianleibie.TuiAnCode
          LEFT JOIN kehutel
            ON WA.keHuCode = kehutel.keHuId
          LEFT JOIN mst_anjianstatus
            ON WA.AnJianStatusCode = mst_anjianstatus.AnJianStatusCode
          LEFT OUTER JOIN mst_anjianstatus2
            ON WA.taCode = mst_anjianstatus2.TAStatusCode
          LEFT JOIN anjianpici
            ON anjianpicicode = wa.pcCode
         WHERE (historyFlag > 1 or historyFlag < 1 or historyFlag is null)
           AND (REPLACE(ChiKaRenXingMing, ' ', '') LIKE '%13335192949%' OR
               ckrpinyin LIKE '%13335192949%' OR
               ckrszm LIKE '%13335192949%' OR
               ZhengJianHaoMa LIKE '%13335192949%' OR
               KaHao LIKE '%13335192949%' OR ZhangHao LIKE '%13335192949%' OR
               email3 LIKE '%13335192949%' OR email2 LIKE '%13335192949%' OR
               REPLACE(bqContent, ' ', '') LIKE '%13335192949%' OR
               REPLACE(BeiZhu1, ' ', '') LIKE '%13335192949%' OR
               REPLACE(BeiZhu2, ' ', '') LIKE '%13335192949%' OR
               REPLACE(DanWeiMingCheng, ' ', '') LIKE '%13335192949%' OR
               pNo LIKE '%13335192949%' OR pName LIKE '%13335192949%')
        UNION
        SELECT .. .
          FROM weituoanjian WA
          LEFT JOIN phoneinfo p
            ON wa.daoruId = p.drId
           AND wa.subId = p.subId
          LEFT JOIN employee e
            ON employeeId = beifenpeizhe
          LEFT JOIN mst_tuianleibie
            ON WA.TuiAnCode = mst_tuianleibie.TuiAnCode
          LEFT JOIN kehutel
            ON WA.keHuCode = kehutel.keHuId
          LEFT JOIN mst_anjianstatus
            ON WA.AnJianStatusCode = mst_anjianstatus.AnJianStatusCode
          LEFT OUTER JOIN mst_anjianstatus2
            ON WA.taCode = mst_anjianstatus2.TAStatusCode
          LEFT JOIN anjianpici
            ON anjianpicicode = wa.pcCode
         INNER JOIN (SELECT drId, subId
                      FROM bgbinfo
                     WHERE bgInfo LIKE '%13335192949%'
                     GROUP BY drId, subId) m
            ON m.drId = wa.daoruId
           AND m.subId = wa.subId
         WHERE (historyFlag > 1 or historyFlag < 1 or historyFlag is null)
        UNION
        SELECT .. .
          FROM weituoanjian WA
          LEFT JOIN phoneinfo p
            ON wa.daoruId = p.drId
           AND wa.subId = p.subId
          LEFT JOIN employee e
            ON employeeId = beifenpeizhe
          LEFT JOIN mst_tuianleibie
            ON WA.TuiAnCode = mst_tuianleibie.TuiAnCode
          LEFT JOIN kehutel
            ON WA.keHuCode = kehutel.keHuId
          LEFT JOIN mst_anjianstatus
            ON WA.AnJianStatusCode = mst_anjianstatus.AnJianStatusCode
          LEFT OUTER JOIN mst_anjianstatus2
            ON WA.taCode = mst_anjianstatus2.TAStatusCode
          LEFT JOIN anjianpici
            ON anjianpicicode = wa.pcCode
         INNER JOIN (SELECT ShenFenZhengID
                      from inputlog_day
                     WHERE phoneNum = '13335192949'
                    UNION
                    SELECT ShenFenZhengID
                      FROM inputlog
                     WHERE REPLACE(inputText, ' ', '') LIKE '%13335192949%'
                       AND InputDate >= DATE_FORMAT(CURRENT_DATE, '%Y-%m-%d 00:00:00')
                     GROUP BY ShenFenZhengID) n
            ON wa.ZhengJianHaoMa = n.ShenFenZhengID
         WHERE (historyFlag > 1 or historyFlag < 1 or historyFlag is null)) t
 ORDER BY t.ZhengJianHaoMa, t.daoruId, t.subId LIMIT 0, 40;
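The final statement replaces the top-level OR between the column filters and the two IN subqueries with UNION branches. A minimal sketch of that transformation follows, assuming SQLite in place of MySQL, a WA table reduced to three columns (id is a made-up key), a single subquery branch, and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wa (id INTEGER, ZhengJianHaoMa TEXT, KaHao TEXT);
    CREATE TABLE inputlog (ShenFenZhengID TEXT, inputText TEXT);
    INSERT INTO wa VALUES
        (1, 'ID-A', '6222 13335192949'),  -- matches both conditions
        (2, 'ID-B', '0000'),              -- matches only via inputlog
        (3, 'ID-C', '1111');              -- matches nothing
    INSERT INTO inputlog VALUES
        ('ID-A', '13335192949'),
        ('ID-B', 'tel 13335192949');
""")
p = "%13335192949%"

# Original shape: one WHERE with OR between a column filter and IN.
or_ids = sorted(r[0] for r in conn.execute("""
    SELECT id FROM wa
     WHERE KaHao LIKE ?
        OR ZhengJianHaoMa IN
           (SELECT ShenFenZhengID FROM inputlog WHERE inputText LIKE ?)
""", (p, p)))

# Rewritten shape: one branch per condition, combined with UNION.
branches = """
    SELECT id FROM wa WHERE KaHao LIKE ?
    {op}
    SELECT wa.id FROM wa
     INNER JOIN (SELECT ShenFenZhengID FROM inputlog
                  WHERE inputText LIKE ? GROUP BY ShenFenZhengID) n
        ON wa.ZhengJianHaoMa = n.ShenFenZhengID
"""
union_ids = sorted(r[0] for r in conn.execute(branches.format(op="UNION"), (p, p)))
# UNION ALL would return row 1 twice because it satisfies both branches.
union_all_ids = sorted(r[0] for r in conn.execute(branches.format(op="UNION ALL"), (p, p)))

print(or_ids, union_ids, union_all_ids)
```

One caveat worth keeping in mind: UNION deduplicates on the full select list, so the rewrite matches the OR form only when that select list distinguishes the rows the original query would have returned. Because the select list is elided in the article, that cannot be checked here.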

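To summarize the analysis: the difficulty with this SQL lies in the LIKE '%13335192949%' fuzzy matches, the IN subqueries, and the OR conditions, none of which can be served by an effective index. The initial analysis pointed to the correlated IN subqueries as the bottleneck, yet rewriting them equivalently to change the driving table only cut execution time from 13 s to 7 s. The real bottleneck is deduplicating the 1.5-million-row inputlog table by the ShenFenZhengID column. With nothing more to gain at the index or SQL level, the remaining option was to change the business implementation and use a scheduled job. A deeper understanding of the business might yield still better performance, but that is outside the scope of this discussion, which stays at the SQL level and avoids changing how the business is implemented wherever possible.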
