Failure case: a pitfall with partitioned tables in MySQL 5.5

Symptoms

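The database restarted abnormally at regular intervals. The MySQL error log contained messages such as "Database was not shut down normally", but /var/log/messages showed nothing unusual and no OOM had occurred. Because the intervals between restarts were roughly equal, we suspected an SQL statement issued by the application. Investigation on the application side found that the following statement was executed periodically: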

select sid as sid,source as source,sum(valid) as valid,sum(error) as error from playstats where startTime>="2016-07-08 10:00:00" and endTime<="2016-07-08 12:00:00" group by sid,source;

The table structure is as follows:

mysql> show create table playstats\G
*************************** 1. row ***************************
       Table: playstats
Create Table: CREATE TABLE `playstats` (
  `id` int(255) NOT NULL AUTO_INCREMENT,
  `year` int(4) NOT NULL,
  `month` int(2) NOT NULL,
  `day` int(2) NOT NULL,
  `startTime` datetime NOT NULL,
  `endTime` datetime NOT NULL,
  `version` varchar(12) NOT NULL DEFAULT '',
  `source` varchar(12) NOT NULL DEFAULT '',
  `sid` varchar(12) NOT NULL,
  `valid` int(8) NOT NULL,
  `error` int(8) NOT NULL,
  `total` int(8) NOT NULL,
  PRIMARY KEY (`id`,`startTime`,`version`,`source`,`sid`),
  KEY `playstats_index_startTime` (`startTime`),
  KEY `playstats_index_endTime` (`endTime`),
  KEY `playstats_muti_index` (`year`,`month`,`source`),
  KEY `playstats_index_source` (`source`),
  KEY `month_index` (`month`),
  KEY `year_index` (`year`)
) ENGINE=InnoDB AUTO_INCREMENT=1267666446 DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (to_days(startTime))
(PARTITION p20160405 VALUES LESS THAN (736425) ENGINE = InnoDB,
 PARTITION p20160620 VALUES LESS THAN (736501) ENGINE = InnoDB,
 PARTITION p20160706 VALUES LESS THAN (736517) ENGINE = InnoDB) */
1 row in set (0.00 sec)

Reproducing the problem:

mysql> select * from playstats;

Empty set (0.00 sec)

mysql> select sid as sid,source as source,sum(valid) as valid,sum(error) as error from playstats where startTime>="2016-07-03 10:00:00" and endTime<="2016-07-03 12:00:00" group by sid,source;
Empty set (0.00 sec)

mysql> select sid as sid,source as source,sum(valid) as valid,sum(error) as error from playstats where startTime>="2016-07-08 10:00:00" and endTime<="2016-07-08 12:00:00" group by sid,source;
ERROR 2013 (HY000): Lost connection to MySQL server during query

 

Root cause analysis:

mysql> select to_days("2016-07-03 12:00:00");
+--------------------------------+
| to_days("2016-07-03 12:00:00") |
+--------------------------------+
|                         736513 |
+--------------------------------+
1 row in set (0.00 sec)

mysql> select to_days("2016-07-08 10:00:00");
+--------------------------------+
| to_days("2016-07-08 10:00:00") |
+--------------------------------+
|                         736518 |
+--------------------------------+
1 row in set (0.00 sec)

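A plain query without a time filter works, and a time range below 736517 works as well. The highest partition bound is 736517, so the range of the failing query (which starts at 736518) lies where no partition is defined. Instead of returning an empty result, the server crashes, which is a bug.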
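
One way to see this gap on a live table is to decode the partition bounds back into dates. The query below is a sketch rather than part of the original investigation. It assumes the current default schema is the one that holds playstats, and it relies on information_schema.PARTITIONS and FROM_DAYS(), both of which are available in MySQL 5.5.

```sql
-- List each partition of playstats with its upper bound decoded to a date.
-- For RANGE partitions, PARTITION_DESCRIPTION holds the LESS THAN value,
-- or the string 'MAXVALUE', which is special-cased below.
-- TABLE_ROWS is only an estimate for InnoDB tables.
SELECT PARTITION_NAME,
       PARTITION_DESCRIPTION AS less_than,
       CASE
         WHEN PARTITION_DESCRIPTION = 'MAXVALUE' THEN NULL
         ELSE FROM_DAYS(CAST(PARTITION_DESCRIPTION AS UNSIGNED))
       END AS less_than_date,
       TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'playstats'
ORDER BY PARTITION_ORDINAL_POSITION;
```

For the definition shown above, the last bound 736517 decodes to 2016-07-07, so a startTime on or after that date has no partition to land in.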

 

Fix

Add a MAXVALUE partition, that is, change the table definition to:

 

alter table playstats /*!50100 PARTITION BY RANGE (to_days(startTime))
(PARTITION p20160405 VALUES LESS THAN (736425) ENGINE = InnoDB,
 PARTITION p20160620 VALUES LESS THAN (736501) ENGINE = InnoDB,
 PARTITION p20160706 VALUES LESS THAN maxvalue ENGINE = InnoDB) */;

mysql> select sid as sid,source as source,sum(valid) as valid,sum(error) as error from playstats where startTime>="2016-07-08 10:00:00" and endTime<="2016-07-08 12:00:00" group by sid,source;
Empty set (0.00 sec)
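
To check which partition the query reads once a catch-all partition exists, EXPLAIN PARTITIONS can be run on the same statement. This is only a suggested check and was not part of the original troubleshooting. The partitions column of the output should name the MAXVALUE partition, though the exact plan may vary between versions.

```sql
-- Show the partitions the optimizer selects for the query that used to crash.
-- EXPLAIN PARTITIONS is the MySQL 5.1-5.6 syntax; run it on the fixed table.
EXPLAIN PARTITIONS
SELECT sid AS sid, source AS source,
       SUM(valid) AS valid, SUM(error) AS error
FROM playstats
WHERE startTime >= '2016-07-08 10:00:00'
  AND endTime   <= '2016-07-08 12:00:00'
GROUP BY sid, source;
```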

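A better approach is to add a new partition directly. This completes almost instantly and avoids copying data and locking the table, for example: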

ALTER TABLE  xxxx ADD PARTITION (PARTITION yyyyy VALUES LESS THAN maxvalue ENGINE = InnoDB);

After this, running the SQL above no longer crashes the server.
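
Applied to the original playstats definition, whose last bound is 736517, the template might look like the sketch below. The partition names pmax and p20160731 and the 2016-08-01 boundary are made up for illustration. Note that once a MAXVALUE partition exists, further ADD PARTITION statements are rejected, because MAXVALUE must be the last partition; later date partitions then have to be split out of the catch-all with REORGANIZE PARTITION, which copies whatever rows it already holds.

```sql
-- Add a catch-all partition after the existing last one (LESS THAN (736517)).
-- "pmax" is a hypothetical name.
ALTER TABLE playstats
  ADD PARTITION (PARTITION pmax VALUES LESS THAN MAXVALUE ENGINE = InnoDB);

-- Later, split a dated partition out of the catch-all.
-- "p20160731" and the 2016-08-01 boundary are illustrative only.
ALTER TABLE playstats
  REORGANIZE PARTITION pmax INTO (
    PARTITION p20160731 VALUES LESS THAN (TO_DAYS('2016-08-01')) ENGINE = InnoDB,
    PARTITION pmax      VALUES LESS THAN MAXVALUE ENGINE = InnoDB
  );
```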

 
