记一次MySQL单表区间查询优化过程

表结构

CREATE TABLE `slabdatahistory` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `time` timestamp NULL DEFAULT NULL COMMENT '生产的实时时间',
  `longTime` bigint(11) DEFAULT NULL,
  `FixedSideWater` float DEFAULT NULL COMMENT '固定侧冷却水流量',
  `LiberalSideWater` float DEFAULT NULL COMMENT '自由测冷却水流量',
  `NarrowLeftWater` float DEFAULT NULL COMMENT '窄面左侧冷却水流量',
  `NarrowRightWater` float DEFAULT NULL COMMENT '窄面右侧冷却水流量',
  `FixedSideWTemp` float DEFAULT NULL COMMENT '固定侧冷却水温度',
  `LiberaSideWTemp` float DEFAULT NULL COMMENT '自由侧冷却水温度',
  `NarrowLeftWTemp` float DEFAULT NULL COMMENT '窄面左侧冷却水温度',
  `NarrowRightWTemp` float DEFAULT NULL COMMENT '窄面右侧冷却水温度',
  `UCurrent` int(11) DEFAULT NULL COMMENT '变频器U相电流',
  `VCurrent` int(11) DEFAULT NULL COMMENT '变频器V相电流',
  `WCurrent` int(11) DEFAULT NULL COMMENT '变频器W相电流',
  `ActualCurrent` float DEFAULT NULL COMMENT '电流实际值',
  `ActualFrequency` int(11) DEFAULT NULL COMMENT '频率实际值 1=0.01HZ',
  `BusVoltage` int(11) DEFAULT NULL COMMENT '变频器直流母线电压',
  `VibrationL1` int(11) DEFAULT NULL COMMENT '结晶器振动L1表号',
  `VibrationC1` float DEFAULT NULL COMMENT '结晶器振动参数C1',
  `VibrationC2` float DEFAULT NULL COMMENT '结晶器振动参数C2',
  `VibrationC3` float DEFAULT NULL COMMENT '结晶器振动参数C3',
  `VibrationC4` float DEFAULT NULL COMMENT '结晶器振动参数C4',
  `VibrationC5` float DEFAULT NULL COMMENT '结晶器振动参数C5',
  `VibrationC6` float DEFAULT NULL COMMENT '结晶器振动参数C6',
  `VibrationC7` float DEFAULT NULL COMMENT '结晶器振动参数C7',
  `VibrationC8` float DEFAULT NULL COMMENT '结晶器振动参数C8',
  `FluidSetValue` int(11) DEFAULT NULL COMMENT '结晶器液位设定值',
  `FluidActualValue` float DEFAULT NULL COMMENT '结晶器液位实际值',
  `CastingSpeed` int(11) DEFAULT NULL COMMENT '拉速实际值',
  `SlabWidth` float DEFAULT NULL COMMENT '板坯宽度',
  `TundishTemperature` float DEFAULT NULL COMMENT '中间包温度',
  `NozzleFlow` float DEFAULT NULL COMMENT '水口中包车实际流量',
  `GrooveFlow` float DEFAULT NULL COMMENT '氩封中包车实际流量',
  `StopperFlow` float DEFAULT NULL COMMENT '塞棒中包车实际流量',
  `ImersionDepthTwo` float DEFAULT NULL COMMENT '四流水口侵入深度',
  `CastingSetSpeed` int(11) DEFAULT NULL COMMENT '拉速设定值',
  `FluidFlucValue` float DEFAULT NULL COMMENT '液位波动差值,等于结晶器液位实际值减去结晶器液位实际值',
  `result` tinyint(2) DEFAULT NULL COMMENT '板坯的质量结果,0表示封锁,1表示不封锁',
  PRIMARY KEY (`id`),
  KEY `FluidFlucValue` (`longTime`,`FluidFlucValue`,`result`) USING BTREE,
  KEY `SlabWidth` (`longTime`,`SlabWidth`,`result`) USING BTREE,
  KEY `TundishTemperature` (`result`,`TundishTemperature`,`longTime`) USING BTREE
) ENGINE=MyISAM AUTO_INCREMENT=869589 DEFAULT CHARSET=utf8;

这份表结构是优化后的表结构,和最初的表结构差别不大,最初建表默认引擎是InnoDB,没有longTime字段。数据一共有八十多万条。

任务

给出时间段,查询出这个时间段中,TundishTemperature(中间包温度)在(1530,1540)、(1540,1550)、(1550,1560)、(1560,1570)、(1570,1580)区间result为0,1的个数。数据一共有

第一次MySQL查询

mysql1

select s.result, COUNT(s.longTime) as number from slabdatahistory s where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") 
and s.TundishTemperature BETWEEN 1530 and 1540 GROUP BY s.result
union all
select s.result, COUNT(s.longTime) as number from slabdatahistory s where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") 
and s.TundishTemperature BETWEEN 1540 and 1550  GROUP BY s.result
union all
select s.result, COUNT(s.longTime) as number from slabdatahistory s where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") 
and s.TundishTemperature BETWEEN 1550 and 1560 GROUP BY s.result
union all
select s.result, COUNT(s.longTime) as number from slabdatahistory s where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") 
and s.TundishTemperature BETWEEN 1560 and 1570 GROUP BY s.result
union all
select s.result, COUNT(s.longTime) as number from slabdatahistory s where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") 
and s.TundishTemperature BETWEEN 1570 and 1580 GROUP BY s.result

结果

result number
0   1
1   1374
0   1710
1   9533
0   31141
1   194377
0   63795
1   547649
0   15562
1   98605

耗时34.757s
上面的查询还是稍微优化之后的,我没有使用time比较时间大小,而是添加一个字段longTime来比较大小,因为整数的大小比较比字符串的比较要快。但是这个查询有个问题,就是如果某个区间没有数据,比如(1530, 1540)区间数据不存在,在结果上面就会少返回两行,而根据结果没办法判断到底是那个区间缺失数据,所以这种写法不行。

第二次MySQL查询

mysql2

select 0 as result, count(s.result) as number from slabdatahistory s 
where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") and s.TundishTemperature BETWEEN 1530 and 1540 and s.result = 0
UNION all
select 1 as result, count(s.result) as number from slabdatahistory s 
where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") and s.TundishTemperature BETWEEN 1530 and 1540 and s.result = 1
union ALL
select 0 as result, count(s.result) as number from slabdatahistory s 
where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") and s.TundishTemperature BETWEEN 1540 and 1550 and s.result = 0
UNION all
select 1 as result, count(s.result) as number from slabdatahistory s 
where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") and s.TundishTemperature BETWEEN 1540 and 1550 and s.result = 1
union ALL
select 0 as result, count(s.result) as number from slabdatahistory s 
where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") and s.TundishTemperature BETWEEN 1550 and 1560 and s.result = 0
UNION all
select 1 as result, count(s.result) as number from slabdatahistory s 
where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") and s.TundishTemperature BETWEEN 1550 and 1560 and s.result = 1
union ALL
select 0 as result, count(s.result) as number from slabdatahistory s 
where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") and s.TundishTemperature BETWEEN 1560 and 1570 and s.result = 0
UNION all
select 1 as result, count(s.result) as number from slabdatahistory s 
where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") and s.TundishTemperature BETWEEN 1560 and 1570 and s.result = 1
union all
select 0 as result, count(s.result) as number from slabdatahistory s 
where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") and s.TundishTemperature BETWEEN 1570 and 1580 and s.result = 0
union ALL
select 1 as result, count(s.result) as number from slabdatahistory s 
where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59") and s.TundishTemperature BETWEEN 1570 and 1580 and s.result = 1

结果

result number
0   1
1   1374
0   1710
1   9533
0   31141
1   194377
0   63795
1   547649
0   15562
1   98605

耗时71.073s
上面这个查询可以避免第一次MySQL遇到的问题,但是耗时一分钟多,接下来我会优化一下,最终实现我们想要的区间统计数据,但是耗时不到1秒,因为我不是很精通MySQL,有错误的地方请留言指正。

优化

修改区间统计方法优化

使用case when

上面的统计使用了union all ,导致代码很长,并且直观上感觉查询应该会很耗时,使用case when 查询

mysql3

select s.result,sum(case when s.TundishTemperature BETWEEN 1530 AND 1540 then 1 else 0 end) as num0,
sum(case when s.TundishTemperature BETWEEN 1540 AND 1550 then 1 else 0 end) as num2,
sum(case when s.TundishTemperature BETWEEN 1550 AND 1560 then 1 else 0 end) as num4,
sum(case when s.TundishTemperature BETWEEN 1560 AND 1570 then 1 else 0 end) as num6,
sum(case when s.TundishTemperature BETWEEN 1570 AND 1580 then 1 else 0 end) as num8
from slabdatahistory s
where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59")
GROUP BY s.result

结果

result num0    num2        num4        num6      num8
0   1          1710         31141      63795    15562
1   1374       9533        194377      547649   98605

耗时7.746s

还有另外一种方式,不使用group by

mysql4

select sum(case when s.TundishTemperature BETWEEN 1530 AND 1540 and s.result = 0 then 1 else 0 end) as num0,
sum(case when s.TundishTemperature BETWEEN 1530 AND 1540 and s.result = 1 then 1 else 0 end) as num1,
sum(case when s.TundishTemperature BETWEEN 1540 AND 1550 and s.result = 0 then 1 else 0 end) as num2,
sum(case when s.TundishTemperature BETWEEN 1540 AND 1550 and s.result = 1 then 1 else 0 end) as num3,
sum(case when s.TundishTemperature BETWEEN 1550 AND 1560 and s.result = 0 then 1 else 0 end) as num4,
sum(case when s.TundishTemperature BETWEEN 1550 AND 1560 and s.result = 1 then 1 else 0 end) as num5,
sum(case when s.TundishTemperature BETWEEN 1560 AND 1570 and s.result = 0 then 1 else 0 end) as num6,
sum(case when s.TundishTemperature BETWEEN 1560 AND 1570 and s.result = 1 then 1 else 0 end) as num7,
sum(case when s.TundishTemperature BETWEEN 1570 AND 1580 and s.result = 0 then 1 else 0 end) as num8,
sum(case when s.TundishTemperature BETWEEN 1570 AND 1580 and s.result = 1 then 1 else 0 end) as num9
from slabdatahistory s
where s.longTime BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") and UNIX_TIMESTAMP("2018-03-12 23:59:59")

结果

num0 num1 num2 num3 num4 num5 num6 num7 num8 num9
1   1374    1710    9533    31141   194377  63795   547649  15562   98605

耗时 7.402s

使用 elt interval

关于elt interval具体用法百度一下,很简单,只是不容易想到这个方法

mysql5

select elt(interval(s.TundishTemperature,0,1530,1540,1550,1560,1570,1580),
'1zero/less1530', '2zero/1530-1540','3zero/1540-1550','4zero/1550-1560','5zero/1560-1570','6zero/1570-1580','7zero/>1580') as "区间",
count(s.id) as "数量"
from slabdatahistory s
where s.longTime 
BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") 
and UNIX_TIMESTAMP("2018-03-12 23:59:59")
and s.result = 0
group by  elt(interval(s.TundishTemperature,0,1530,1540,1550,1560,1570,1580),
'1zero/less1530', '2zero/1530-1540','3zero/1540-1550','4zero/1550-1560','5zero/1560-1570','6zero/1570-1580','7zero/>1580')
union all
select elt(interval(s.TundishTemperature,0,1530,1540,1550,1560,1570,1580),
'1one/less1530', '2one/1530-1540','3one/1540-1550','4one/1550-1560','5one/1560-1570','6one/1570-1580','7one/>1580') as "区间",
count(s.id) as "数量"
from slabdatahistory s
where s.longTime 
BETWEEN UNIX_TIMESTAMP("2018-03-01 00:00:00") 
and UNIX_TIMESTAMP("2018-03-12 23:59:59")
and s.result = 1
group by elt(interval(s.TundishTemperature,0,1530,1540,1550,1560,1570,1580),
'1/less1530', '2/1530-1540','3/1540-1550','4/1550-1560','5/1560-1570','6/1570-1580','7/>1580')

结果

区间 数量
1zero/less1530  2
2zero/1530-1540 1
3zero/1540-1550 1710
4zero/1550-1560 22929
5zero/1560-1570 58871
6zero/1570-1580 15562
7zero/>1580 170
2one/1530-1540  1373
3one/1540-1550  6049
4one/1550-1560  142715
5one/1560-1570  519690
6one/1570-1580  98605
7one/>1580  1910

耗时 13.230s

mysql2耗时71.073s mysql3耗时7.746s mysql4耗时7.402s mysql5耗时13.230s

使用索引优化

添加索引可以达到优化的目的,但是要正确添加索引,不然得不偿失,可以添加覆盖索引来进行查询优化,什么是覆盖索引?就是把你查询用到的字段设为一个索引,比如

 KEY `TundishTemperature` (`result`,`TundishTemperature`,`longTime`)

其中最重要的一点是添加覆盖索引的顺序,如果顺序为

 KEY `TundishTemperature` (`longTime`,`TundishTemperature`, `result`)

获得其他顺序查询优化效果并不明显,我当时就犯了这个错误,随便添加了覆盖索引,但是没有注意到顺序,导致查询速度没有想象中快,所以如果大家添加覆盖索引后速度没有很快的提升不妨修改一下覆盖索引里面字段的顺序。再试一遍,至于覆盖索引顺序的规则,我感觉可能是和查询语句里面字段出现的顺序一直会好点,这个没有做深入研究。
OK看一下添加覆盖索引后速度如何。
mysql2耗时1.131s mysql3耗时1.804s mysql4耗时2.261s mysql5耗时1.723s
mysql2 提升很明显,但是还没有达到1秒的目标

修改引擎优化

默认的MySQL引擎是InnoDB,修改成MyISAM 。两个引擎的区别百度一下,因为我这个任务只是查询,对插入、更新没有要求,所有可以用MyISAM。

alter table slabdatahistory ENGINE=MyISAM

mysql2耗时0.520s mysql3耗时0.855s mysql4耗时1.566s mysql5耗时1.393s
完成了1s查询的任务。竟然是mysql2,感觉有点不可思议。

总结

对于MySQL查询优化

  • MySQL查询语句优化
  • 表添加字段,比如把时间大小比较,变成时间戳的比较
  • 添加覆盖索引,注意覆盖索引中列的顺序,可以多试几遍不同的顺序
  • 修改MySQL引擎,如果对插入、更新没有要求的话。
    此外可以使用explain分析一下查询,看看查询是否使用了索引。

你可能感兴趣的:(记一次MySQL单表区间查询优化过程)