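Overview
Background on MySQL partitioned tables is easy to find with a web search;
RANGE COLUMNS partitioning
These notes mainly test the performance of MySQL partitioned tables:
- Loading 5 million rows: roughly 10 minutes;
- Batch insert of 19,000 rows (no secondary index) into a table that already holds 5 million rows: fast, roughly 1 s;
- Batch insert of 19,000 rows (one secondary index) into the same 5-million-row table: slower, roughly 3 s (the more indexes, the slower the insert);
- Queries: when the WHERE clause filters on the partitioning column, the partitioned table is clearly faster;
- Index performance on large row counts: a B+Tree index that requires lookups back into the table (the query is not covered by the index) performs very poorly;
- Index performance on small row counts: a B+Tree index performs reasonably well (at about 80,000 matching rows it was still slightly slower than using no index);
- Covering vs. non-covering index: more than a tenfold difference in speed;
- Column order in a multi-column index: roughly a 20-fold difference in performance;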
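Performance comparison
All figures below are averages over repeated runs (the first run of each query always takes noticeably longer).
Operation | No index | With index | Non-partitioned table | Partitioned table | Characteristics | Notes |
---|---|---|---|---|---|---|
load data | | | 8 min 26.03 sec | 11 min 11.01 sec | | The non-partitioned table loads somewhat faster |
batch insert | | | 0.29 sec | 0.56 sec | 3,000 records | Batch insert performance is about the same |
batch insert | | | 1.85 sec | 1.4 sec | 19,000 records | Batch insert performance is about the same |
batch insert | | | not tested | 3~4 sec | 19,000 records, with index | The more indexes, the slower the insert |
query1 | | | 3.38 sec | 3.36 sec | count(*), no WHERE clause | About the same |
query2 | | | 4 sec | 0.6 sec | Filter on the partitioning column | The partitioned table is several times faster |
query3 | | | 5.7 sec | 1.8 sec | Filter on the partitioning column, with GROUP BY | The partitioned table is about 3 times faster |
query4 | 1.26 sec | 26 sec | | | B+Tree index used, causing heavy random I/O | The index reduces the rows examined, yet performance drops sharply |
query5 | | | | | | |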
Table structure
Non-partitioned table
| performance_metirc_host_min10_hour | CREATE TABLE `performance_metirc_host_min10_hour` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`pool_id` char(36) NOT NULL COMMENT '资源池ID',
`host_id` char(36) NOT NULL COMMENT '主机ID',
`indicator_key` varchar(64) NOT NULL COMMENT '指标key',
`value` double DEFAULT NULL COMMENT '指标值',
`resource_type` varchar(64) NOT NULL COMMENT '资源类型',
`create_at` datetime NOT NULL COMMENT '最近一次添加或更新的时间',
`business_id` char(36) DEFAULT NULL COMMENT '业务系统ID',
`organization_id` char(36) DEFAULT NULL COMMENT '部门ID',
`vpc_id` char(36) DEFAULT NULL COMMENT 'vpc维度',
`security_id` char(36) DEFAULT NULL COMMENT '安全域ID',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5046203 DEFAULT CHARSET=utf8 COMMENT='该表用于保存裸金属指标项数据'
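Partitioned table
The table is partitioned by indicator_key;
The primary key is PRIMARY KEY (id, indicator_key) rather than PRIMARY KEY (id).
Reason: a restriction of MySQL partitioning is that every column used for partitioning must be part of every unique key on the table, including the primary key;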
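The restriction above can be sketched as a simple set check. This is only an illustration of the rule, not how MySQL implements it; the class `PartitionKeyRule` and its helper methods are hypothetical names introduced here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch of the rule: every unique key must contain all partitioning columns. */
final class PartitionKeyRule {

    /** Returns true when each unique key (primary key included) contains every partitioning column. */
    static boolean satisfiesRule(List<Set<String>> uniqueKeys, Set<String> partitionColumns) {
        for (Set<String> key : uniqueKeys) {
            if (!key.containsAll(partitionColumns)) {
                return false;
            }
        }
        return true;
    }

    /** Small helper that builds a column set from names. */
    static Set<String> columns(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        Set<String> partitionColumns = columns("indicator_key");
        // PRIMARY KEY (id) alone does not contain indicator_key.
        System.out.println(satisfiesRule(Arrays.asList(columns("id")), partitionColumns));
        // PRIMARY KEY (id, indicator_key) contains it.
        System.out.println(satisfiesRule(Arrays.asList(columns("id", "indicator_key")), partitionColumns));
    }
}
```

Under this check the first call reports a violation and the second passes, which mirrors why the partitioned table below extends its primary key with indicator_key.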
| performance_metirc_host_part_min10_hour | CREATE TABLE `performance_metirc_host_part_min10_hour` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`pool_id` char(36) COLLATE utf8_bin NOT NULL COMMENT '资源池ID',
`host_id` char(36) COLLATE utf8_bin NOT NULL COMMENT '主机ID',
`indicator_key` varchar(64) COLLATE utf8_bin NOT NULL COMMENT '指标key',
`value` double DEFAULT NULL COMMENT '指标值',
`resource_type` varchar(64) COLLATE utf8_bin NOT NULL COMMENT '资源类型',
`create_at` datetime NOT NULL COMMENT '最近一次添加或更新的时间',
`business_id` char(36) COLLATE utf8_bin DEFAULT NULL COMMENT '业务系统ID',
`organization_id` char(36) COLLATE utf8_bin DEFAULT NULL COMMENT '部门ID',
`vpc_id` char(36) COLLATE utf8_bin DEFAULT NULL COMMENT 'vpc维度',
`security_id` char(36) COLLATE utf8_bin DEFAULT NULL COMMENT '安全域ID',
PRIMARY KEY (`id`,`indicator_key`)
) ENGINE=InnoDB AUTO_INCREMENT=4999308 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
/*!50500 PARTITION BY RANGE COLUMNS(indicator_key)
(PARTITION bm_cpu VALUES LESS THAN ('bm_statistic_cpu') ENGINE = InnoDB,
PARTITION bm_disk VALUES LESS THAN ('bm_statistic_disk') ENGINE = InnoDB,
PARTITION bm_mem VALUES LESS THAN ('bm_statistic_mem') ENGINE = InnoDB,
PARTITION vm_cpu VALUES LESS THAN ('vm_statistic_cpu') ENGINE = InnoDB,
PARTITION vm_disk VALUES LESS THAN ('vm_statistic_disk') ENGINE = InnoDB,
PARTITION vm_mem VALUES LESS THAN ('vm_statistic_mem') ENGINE = InnoDB,
PARTITION pmax VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */ |
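Importing data (load)
Data preparation:
- The data is generated with Java (code in the appendix);
- Row count: 5 million;
- The data is imported with LOAD DATA;
Non-partitioned table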
MySQL [test]> load data local infile '/opt/data/tmp/hostpartSql2.data' into table performance_metirc_host_min10_hour fields terminated by ',' enclosed by '\'';
Query OK, 4999314 rows affected, 65535 warnings (8 min 26.03 sec)
Records: 5000000 Deleted: 0 Skipped: 686 Warnings: 4999692
MySQL [test]> select count(*) from performance_metirc_host_min10_hour;
+----------+
| count(*) |
+----------+
| 4999314 |
+----------+
1 row in set (3.41 sec)
Total time: 8 min 26.03 sec;
Partitioned table
MySQL [test]> load data local infile '/opt/data/tmp/hostpartSql2.data' into table performance_metirc_host_part_min10_hour fields terminated by
Query OK, 4999361 rows affected, 65535 warnings (11 min 11.01 sec)
Records: 5000000 Deleted: 0 Skipped: 639 Warnings: 4999692
MySQL [test]> select count(*) from performance_metirc_host_part_min10_hour;
+----------+
| count(*) |
+----------+
| 4999361 |
+----------+
1 row in set (3.36 sec)
Total time: 10 min 52.22 sec (the run shown above took 11 min 11.01 sec);
Row distribution across partitions
MySQL [test]> select partition_name, TABLE_SCHEMA, table_rows from information_schema.partitions where table_name='performance_metirc_host_part_min10_hour';
+----------------+--------------+------------+
| partition_name | TABLE_SCHEMA | table_rows |
+----------------+--------------+------------+
| bm_cpu | test | 154 |
| bm_disk | test | 803897 |
| bm_mem | test | 803297 |
| vm_cpu | test | 802386 |
| vm_disk | test | 802738 |
| vm_mem | test | 802532 |
| pmax | test | 804355 |
+----------------+--------------+------------+
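As shown above, the rows are spread fairly evenly: six of the seven partitions hold about 800,000 rows each, while bm_cpu holds only 154. Each VALUES LESS THAN bound is an exclusive upper limit, so a key such as 'bm_statistic_cpu_avg_util_percent', which sorts after 'bm_statistic_cpu', is stored in the next partition (bm_disk) rather than in bm_cpu. The partition names are therefore shifted by one relative to the data they hold;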
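To see which partition each indicator key lands in, the RANGE COLUMNS lookup can be imitated with plain string comparison. This is a rough sketch: it assumes that Java's `String.compareTo` orders these ASCII keys the same way as the table's utf8_bin collation, and the class `RangeColumnsLookup` is a hypothetical name used only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: imitates how RANGE COLUMNS picks a partition for an indicator_key value. */
final class RangeColumnsLookup {

    // Partition name -> exclusive upper bound, in declaration order (copied from the CREATE TABLE).
    private static final Map<String, String> UPPER_BOUNDS = new LinkedHashMap<>();

    static {
        UPPER_BOUNDS.put("bm_cpu", "bm_statistic_cpu");
        UPPER_BOUNDS.put("bm_disk", "bm_statistic_disk");
        UPPER_BOUNDS.put("bm_mem", "bm_statistic_mem");
        UPPER_BOUNDS.put("vm_cpu", "vm_statistic_cpu");
        UPPER_BOUNDS.put("vm_disk", "vm_statistic_disk");
        UPPER_BOUNDS.put("vm_mem", "vm_statistic_mem");
    }

    /** Returns the first partition whose upper bound is strictly greater than the key, else pmax. */
    static String partitionFor(String indicatorKey) {
        for (Map.Entry<String, String> entry : UPPER_BOUNDS.entrySet()) {
            if (indicatorKey.compareTo(entry.getValue()) < 0) {
                return entry.getKey();
            }
        }
        return "pmax"; // VALUES LESS THAN (MAXVALUE)
    }

    public static void main(String[] args) {
        String[] keys = {
            "bm_statistic_cpu_avg_util_percent",
            "bm_statistic_disk_avg_util_percent",
            "bm_statistic_mem_avg_util_percent",
            "vm_statistic_cpu_avg_util_percent",
            "vm_statistic_disk_avg_util_percent",
            "vm_statistic_mem_avg_util_percent"
        };
        for (String key : keys) {
            System.out.println(key + " -> " + partitionFor(key));
        }
    }
}
```

With these bounds, 'bm_statistic_mem_avg_util_percent' maps to vm_cpu and 'vm_statistic_disk_avg_util_percent' maps to vm_mem, which agrees with the partitions reported by EXPLAIN PARTITIONS in the query sections.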
Batch insert of 3,000 and 19,000 rows (no secondary index)
The SQL statement looks like this (most rows omitted):
INSERT INTO performance_metirc_host_min10_hour(pool_id,host_id,indicator_key,value,resource_type,create_at,business_id,organization_id,vpc_id,security_id) VALUES
('7b8f0f5e2fbb4d9aa2d5fd55466d638e', 'fd623404-301a-402a-a57c-b6202737d218', 'vm_statistic_cpu_avg_util_percent', '0.056361832611832606', 'vm', '2017-12-01 06:00:00', 'a02f53f285804dda82dc7d1817513c70', '1da69607a73349bb909e65294e44c3a5', null, null),
('7b8f0f5e2fbb4d9aa2d5fd55466d638e', '003c958b-2286-4933-a30f-6c050ec0ae37', 'vm_statistic_cpu_avg_util_percent', '0.05548400673400674', 'vm', '2017-12-09 06:00:00', 'a02f53f285804dda82dc7d1817513c70', '1da69607a73349bb909e65294e44c3a5', null, null),
...
...
...
;
Non-partitioned table
19,000 rows: about 1 to 2 s on average;
MySQL [test]> use test;
MySQL [test]> source /opt/data/tmp/insert3000Record.sql;
Query OK, 3033 rows affected (0.29 sec)
Records: 3033 Duplicates: 0 Warnings: 0
MySQL [test]> source /opt/data/tmp/insert19000Records.sql;
Query OK, 18654 rows affected (1.85 sec, average of 10 runs)
Records: 18654 Duplicates: 0 Warnings: 0
Partitioned table
19,000 rows: about 1 to 2 s on average;
MySQL [test]> source /opt/data/tmp/insert3000Record.sql;
Query OK, 3033 rows affected (0.56 sec)
Records: 3033 Duplicates: 0 Warnings: 0
MySQL [test]> source /opt/data/tmp/insert19000Records.sql;
Query OK, 18654 rows affected (1.40 sec, average of 10 runs)
Records: 18654 Duplicates: 0 Warnings: 0
Batch insert of 19,000 rows (one secondary index)
alter table performance_metirc_host_part_min10_hour add key indicator_create_busi_idx(indicator_key, create_at, business_id);
Batch insert test: 1.6 s to 8.65 s, average 3 to 4 s
MySQL [test]> source /opt/data/tmp/insert19000Records.sql;
Query OK, 18654 rows affected (3.51 sec)
Records: 18654 Duplicates: 0 Warnings: 0
Batch insert of 19,000 rows (several indexes)
The indexes are:
PRIMARY KEY (`id`,`indicator_key`),
KEY `indicator_create_busi_idx` (`indicator_key`,`create_at`,`business_id`),
KEY `indicator_busi_create_idx` (`indicator_key`,`business_id`,`create_at`)
Batch insert test: 1.6 s to 9.5 s, average 4 s
MySQL [test]> source /opt/data/tmp/insert19000Records.sql;
Query OK, 18654 rows affected (3.51 sec)
Records: 18654 Duplicates: 0 Warnings: 0
Querying data
Query1: no partition filtering
example1:
// count(*), no partition filtering
select count(*) from performance_metirc_host_min10_hour;
select count(*) from performance_metirc_host_part_min10_hour;
example2:
// no partition filtering
select distinct(create_at) from performance_metirc_host_min10_hour;
select distinct(create_at) from performance_metirc_host_part_min10_hour;
MySQL [test]> explain partitions select distinct(create_at) from performance_metirc_host_part_min10_hour \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: performance_metirc_host_part_min10_hour
partitions: bm_cpu,bm_disk,bm_mem,vm_cpu,vm_disk,vm_mem,pmax // all partitions are scanned
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 4822392
Extra: Using temporary
1 row in set (0.00 sec)
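Query2: filtering on the partitioning column
Non-partitioned table
Average time: 4 s
- No partition information available;
- No index;
- Full table scan: about 5 million rows examined;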
MySQL [test]> select avg(value) from performance_metirc_host_min10_hour where indicator_key = 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' ;
+-------------------+
| avg(value) |
+-------------------+
| 50.09309208798831 |
+-------------------+
1 row in set (4.09 sec)
MySQL [test]> select max(value) from performance_metirc_host_min10_hour where indicator_key = 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' ;
+-------------------+
| max(value) |
+-------------------+
| 99.99980456042323 |
+-------------------+
1 row in set (3.50 sec)
MySQL [test]> explain partitions select avg(value) from performance_metirc_host_min10_hour where indicator_key = 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: performance_metirc_host_min10_hour
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5042030 // about 5 million rows examined
Extra: Using where
1 row in set (0.00 sec)
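Partitioned table
Average time: 0.6 s
- Partition pruning is used;
- No index;
- Scan: about 800,000 to 920,000 rows examined, roughly one sixth of the non-partitioned scan (which matches the number of populated partitions);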
MySQL [test]> select avg(value) from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' ;
+-------------------+
| avg(value) |
+-------------------+
| 50.09288924799467 |
+-------------------+
1 row in set (0.60 sec)
MySQL [test]> select max(value) from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' ;
+-------------------+
| max(value) |
+-------------------+
| 99.99980456042323 |
+-------------------+
1 row in set (0.60 sec)
MySQL [test]> explain partitions select avg(value) from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: performance_metirc_host_part_min10_hour
partitions: vm_cpu // only the vm_cpu partition is used
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 802386 // only about 800,000 rows examined
Extra: Using where
1 row in set (0.00 sec)
MySQL [test]> explain partitions select avg(value) from performance_metirc_host_part_min10_hour where indicator_key= 'vm_statistic_disk_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: performance_metirc_host_part_min10_hour
partitions: vm_mem // only the vm_mem partition is used
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 929035 // only about 920,000 rows examined
Extra: Using where
1 row in set (0.00 sec)
ERROR: No query specified
Query3: filtering on the partitioning column, with GROUP BY
Non-partitioned table
Average time: 5.7 sec
MySQL [test]> select avg(value) from performance_metirc_host_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 02:50:00' group by organization_id;
+--------------------+
| avg(value) |
+--------------------+
| 50.0384388700016 |
| 49.954251371279 |
| 50.1629822975072 |
+--------------------+
9 rows in set (5.59 sec)
MySQL [test]> select max(value) from performance_metirc_host_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 02:50:00' group by organization_id;
+-------------------+
| max(value) |
+-------------------+
| 99.99964338156543 |
| 99.99855115581629 |
| 99.99941828293112 |
+-------------------+
9 rows in set (5.86 sec)
MySQL [test]> select max(value) from performance_metirc_host_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 02:50:00' group by business_id;
+-------------------+
| max(value) |
+-------------------+
| 99.99964338156543 |
| 99.99898161250623 |
| 99.99980456042323 |
+-------------------+
11 rows in set (5.50 sec)
MySQL [test]> select avg(value) from performance_metirc_host_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 02:50:00' group by business_id;
+--------------------+
| avg(value) |
+--------------------+
| 50.1993472498974 |
| 50.04430780009459 |
| 50.078605604109285 |
+--------------------+
11 rows in set (5.57 sec)
Partitioned table
Average time: 1.8 s
MySQL [test]> select max(value) from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 02:50:00' group by organization_id;
+-------------------+
| max(value) |
+-------------------+
| 99.99818300251297 |
| 99.99855115581629 |
| 99.99941828293112 |
+-------------------+
9 rows in set (1.86 sec)
MySQL [test]> select avg(value) from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 02:50:00' group by organization_id;
+--------------------+
| avg(value) |
+--------------------+
| 50.0384388700016 |
| 49.954023140979096 |
| 50.16278417450607 |
+--------------------+
9 rows in set (1.86 sec)
MySQL [test]> select max(value) from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' group by business_id;
+-------------------+
| max(value) |
+-------------------+
| 99.99536010412046 |
| 99.99898161250623 |
| 99.99980456042323 |
+-------------------+
11 rows in set (1.24 sec)
MySQL [test]> select avg(value) from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 02:50:00' group by business_id;
+--------------------+
| avg(value) |
+--------------------+
| 50.1993472498974 |
| 50.285359967063464 |
| 50.078605604109285 |
+--------------------+
11 rows in set (1.77 sec)
Query4: B+Tree index with lookups back into the table; query performance drops sharply
Query:
select indicator_key, avg(value) as value from performance_metirc_host_part_min10_hour
where indicator_key= 'bm_statistic_mem_avg_util_percent'
and
create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00'
group by business_id order by value asc;
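Without the index
- Average query time: 1.26 s (average of 10 runs)
- Rows examined: about 800,000, using a full scan;
- Only the vm_cpu partition is read, which greatly improves performance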
MySQL [test]> select indicator_key, avg(value) as value from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' group by business_id order by value asc;
+-----------------------------------+--------------------+
| indicator_key | value |
+-----------------------------------+--------------------+
| bm_statistic_mem_avg_util_percent | 49.845570215053264 |
| bm_statistic_mem_avg_util_percent | 49.90994276843408 |
| bm_statistic_mem_avg_util_percent | 50.01579830123528 |
| bm_statistic_mem_avg_util_percent | 50.036187114557514 |
| bm_statistic_mem_avg_util_percent | 50.056310301051525 |
| bm_statistic_mem_avg_util_percent | 50.1082718123528 |
| bm_statistic_mem_avg_util_percent | 50.116061996684614 |
| bm_statistic_mem_avg_util_percent | 50.15219690174755 |
| bm_statistic_mem_avg_util_percent | 50.1848819477595 |
| bm_statistic_mem_avg_util_percent | 50.2105660859758 |
| bm_statistic_mem_avg_util_percent | 50.384555005273285 |
+-----------------------------------+--------------------+
11 rows in set (1.45 sec)
MySQL [test]> explain select indicator_key, avg(value) as value from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' group by business_id order by value asc;
+----+-------------+-----------------------------------------+------+---------------+------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------------------+------+---------------+------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | performance_metirc_host_part_min10_hour | ALL | NULL | NULL | NULL | NULL | 802386 | Using where; Using temporary; Using filesort |
+----+-------------+-----------------------------------------+------+---------------+------+---------+------+--------+----------------------------------------------+
MySQL [test]> explain partitions select indicator_key, avg(value) as value from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' group by business_id order by value asc;
+----+-------------+-----------------------------------------+------------+------+---------------+------+---------+------+--------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------------------+------------+------+---------------+------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | performance_metirc_host_part_min10_hour | vm_cpu | ALL | NULL | NULL | NULL | NULL | 802386 | Using where; Using temporary; Using filesort |
+----+-------------+-----------------------------------------+------------+------+---------------+------+---------+------+--------+----------------------------------------------+
1 row in set (0.00 sec)
Index indicator_create_busi_idx
This index cannot use business_id for lookups, because create_at is normally queried as a range and precedes it in the index;
KEY `indicator_create_busi_idx` (`indicator_key`,`create_at`,`business_id`),
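Query time after adding the index: 26 s (average of 10 runs), a sharp drop in performance
- Rows examined (estimated): about 400,000;
- A filesort is used;
- The index indicator_create_busi_idx(indicator_key, create_at, business_id) is used, yet the query becomes about 20 times slower;
- Only the vm_cpu partition is read, which by itself greatly improves performance;
Details below: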
alter table performance_metirc_host_part_min10_hour add key indicator_create_busi_idx(indicator_key, create_at, business_id);
// only the vm_cpu partition is used, which improves performance
MySQL [test]> explain partitions select indicator_key, avg(value) as value from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' group by business_id order by value asc;
+----+-------------+-----------------------------------------+------------+------+---------------------------+---------------------------+---------+-------+--------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------------------+------------+------+---------------------------+---------------------------+---------+-------+--------+----------------------------------------------+
| 1 | SIMPLE | performance_metirc_host_part_min10_hour | vm_cpu | ref | indicator_create_busi_idx | indicator_create_busi_idx | 194 | const | 401193 | Using where; Using temporary; Using filesort |
+----+-------------+-----------------------------------------+------------+------+---------------------------+---------------------------+---------+-------+--------+----------------------------------------------+
MySQL [test]> explain select indicator_key, avg(value) as value from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' group by business_id order by value asc;
+----+-------------+-----------------------------------------+------+---------------------------+---------------------------+---------+-------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------------------+------+---------------------------+---------------------------+---------+-------+--------+----------------------------------------------+
| 1 | SIMPLE | performance_metirc_host_part_min10_hour | ref | indicator_create_busi_idx | indicator_create_busi_idx | 194 | const | 401193 | Using where; Using temporary; Using filesort |
+----+-------------+-----------------------------------------+------+---------------------------+---------------------------+---------+-------+--------+----------------------------------------------+
1 row in set (0.00 sec)
MySQL [test]> show create table performance_metirc_host_part_min10_hour;
| performance_metirc_host_part_min10_hour | CREATE TABLE `performance_metirc_host_part_min10_hour` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`pool_id` char(36) COLLATE utf8_bin NOT NULL COMMENT '资源池ID',
`host_id` char(36) COLLATE utf8_bin NOT NULL COMMENT '主机ID',
`indicator_key` varchar(64) COLLATE utf8_bin NOT NULL COMMENT '指标key',
`value` double DEFAULT NULL COMMENT '指标值',
`resource_type` varchar(64) COLLATE utf8_bin NOT NULL COMMENT '资源类型',
`create_at` datetime NOT NULL COMMENT '最近一次添加或更新的时间',
`business_id` char(36) COLLATE utf8_bin DEFAULT NULL COMMENT '业务系统ID',
`organization_id` char(36) COLLATE utf8_bin DEFAULT NULL COMMENT '部门ID',
`vpc_id` char(36) COLLATE utf8_bin DEFAULT NULL COMMENT 'vpc维度',
`security_id` char(36) COLLATE utf8_bin DEFAULT NULL COMMENT '安全域ID',
PRIMARY KEY (`id`,`indicator_key`),
KEY `indicator_create_busi_idx` (`indicator_key`,`create_at`,`business_id`)
) ENGINE=InnoDB AUTO_INCREMENT=10287524 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
/*!50500 PARTITION BY RANGE COLUMNS(indicator_key)
(PARTITION bm_cpu VALUES LESS THAN ('bm_statistic_cpu') ENGINE = InnoDB,
PARTITION bm_disk VALUES LESS THAN ('bm_statistic_disk') ENGINE = InnoDB,
PARTITION bm_mem VALUES LESS THAN ('bm_statistic_mem') ENGINE = InnoDB,
PARTITION vm_cpu VALUES LESS THAN ('vm_statistic_cpu') ENGINE = InnoDB,
PARTITION vm_disk VALUES LESS THAN ('vm_statistic_disk') ENGINE = InnoDB,
PARTITION vm_mem VALUES LESS THAN ('vm_statistic_mem') ENGINE = InnoDB,
PARTITION pmax VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */ |
Index indicator_busi_create_idx
KEY `indicator_busi_create_idx` (`indicator_key`,`business_id`,`create_at`)
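// Note: this is not the following index (the column order differs): KEY `indicator_create_busi_idx` (`indicator_key`,`create_at`,`business_id`),
Query time after adding the index: 26 s (average of 10 runs), a sharp drop in performance
- Rows examined (estimated): about 400,000;
- A filesort is used;
- The index indicator_busi_create_idx(indicator_key, business_id, create_at) is used, yet the query becomes about 20 times slower;
- Only the vm_cpu partition is read, which by itself greatly improves performance;
Details below: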
MySQL [test]> explain partitions select indicator_key, avg(value) as value from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' group by business_id order by value asc;
+----+-------------+-----------------------------------------+------------+------+---------------------------+---------------------------+---------+-------+--------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------------------+------------+------+---------------------------+---------------------------+---------+-------+--------+----------------------------------------------+
| 1 | SIMPLE | performance_metirc_host_part_min10_hour | vm_cpu | ref | indicator_busi_create_idx | indicator_busi_create_idx | 194 | const | 401193 | Using where; Using temporary; Using filesort |
+----+-------------+-----------------------------------------+------------+------+---------------------------+---------------------------+---------+-------+--------+----------------------------------------------+
1 row in set (0.00 sec)
MySQL [test]> select indicator_key, avg(value) as value from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' group by business_id order by value asc;
+-----------------------------------+--------------------+
| indicator_key | value |
+-----------------------------------+--------------------+
| bm_statistic_mem_avg_util_percent | 49.84557021505294 |
| bm_statistic_mem_avg_util_percent | 49.909942768434064 |
| bm_statistic_mem_avg_util_percent | 50.01579830123537 |
| bm_statistic_mem_avg_util_percent | 50.036187114557656 |
| bm_statistic_mem_avg_util_percent | 50.05631030105144 |
| bm_statistic_mem_avg_util_percent | 50.10827181235255 |
| bm_statistic_mem_avg_util_percent | 50.11606199668445 |
| bm_statistic_mem_avg_util_percent | 50.15219690174774 |
| bm_statistic_mem_avg_util_percent | 50.184881947759685 |
| bm_statistic_mem_avg_util_percent | 50.21056608597557 |
| bm_statistic_mem_avg_util_percent | 50.384555005273334 |
+-----------------------------------+--------------------+
11 rows in set (32.06 sec)
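Summary
indicator_busi_create_idx and indicator_create_busi_idx perform about the same for this query;
Reason:
With either index, this query can only use the indicator_key prefix and cannot use business_id, so both cause a large number of lookups back into the table and heavy random I/O;
Experimental result:
When the row count is large, a BTREE index that requires lookups back into the table (the query is not covered by the index) generates heavy random I/O and performs very poorly;
- Without the index, average time: 1.26 s;
- With the index, average time: 26 s;
- Adding the index made the query about 20 times slower;
Likely cause
- With the index, each matching B+Tree entry requires a second lookup by primary key into the table. Fewer rows are examined (about 400,000 instead of 800,000), but the lookups produce heavy random I/O, which severely hurts query performance (secondary-index lookups of this kind scale poorly at this data volume);
- Without the index, the partition is scanned sequentially. More rows are examined (about 800,000), but the disk access is not random, so it ends up faster than the random I/O caused by the index;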
Covering index vs. non-covering index
KEY `indicator_busi_create_idx` (`indicator_key`,`business_id`,`create_at`)
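// Important: this is not the index KEY `indicator_create_busi_idx` (`indicator_key`,`create_at`,`business_id`)
// In indicator_create_busi_idx, create_at is matched by a range condition; under the leftmost-prefix rule this makes business_id unusable for narrowing the index range;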
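The leftmost-prefix rule mentioned above can be sketched as follows. This is a simplified model, not the optimizer's actual logic: it assumes that leading equality columns can all be used and that the first range column ends the usable prefix. The class `LeftmostPrefix` and its method are hypothetical names, and the real optimizer may choose to use fewer key parts than this upper bound.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch of the leftmost-prefix rule for a multi-column B+Tree index. */
final class LeftmostPrefix {

    /**
     * Returns the index columns that can narrow the index range:
     * all leading equality columns, plus at most one range column that ends the prefix.
     */
    static List<String> usableKeyParts(List<String> indexColumns,
                                       Set<String> equalityColumns,
                                       Set<String> rangeColumns) {
        List<String> usable = new ArrayList<>();
        for (String column : indexColumns) {
            if (equalityColumns.contains(column)) {
                usable.add(column);          // equality: keep walking the index columns
            } else if (rangeColumns.contains(column)) {
                usable.add(column);          // range: usable, but nothing after it is
                break;
            } else {
                break;                       // no predicate on this column: prefix ends
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        Set<String> equality = new HashSet<>(Arrays.asList("indicator_key", "business_id"));
        Set<String> range = new HashSet<>(Arrays.asList("create_at"));

        // indicator_busi_create_idx: all three columns are usable.
        System.out.println(usableKeyParts(
                Arrays.asList("indicator_key", "business_id", "create_at"), equality, range));
        // indicator_create_busi_idx: the range on create_at cuts off business_id.
        System.out.println(usableKeyParts(
                Arrays.asList("indicator_key", "create_at", "business_id"), equality, range));
    }
}
```

The design point is that placing the range column last lets the equality predicate on business_id shrink the scanned index range before the range condition is applied.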
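Performance comparison
Non-covering index
The result set includes value, which is not in the index, so the query cannot be answered from the index alone;
Average time: 1.19 sec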
MySQL [test]> explain select indicator_key, business_id, value from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' and business_id = '93d79263806742f190c6e6b9e7a1c08d';
+----+-------------+-----------------------------------------+-------+---------------------------+---------------------------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------------------+-------+---------------------------+---------------------------+---------+------+-------+-------------+
| 1 | SIMPLE | performance_metirc_host_part_min10_hour | range | indicator_busi_create_idx | indicator_busi_create_idx | 308 | NULL | 78028 | Using where |
+----+-------------+-----------------------------------------+-------+---------------------------+---------------------------+---------+------+-------+-------------+
1 row in set (0.00 sec)
MySQL [test]> select indicator_key, business_id, value from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' and business_id = '93d79263806742f190c6e6b9e7a1c08d';
37846 rows in set (1.19 sec)
rows is about 78,000 here, compared with about 400,000 for the indicator_create_busi_idx index: placing create_at last in the index filters out far more rows;
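The key_len column in the EXPLAIN outputs (194 in Query4, 308 here) gives a hint about how many index columns are actually used. The arithmetic below is a sketch under stated assumptions: utf8 columns count 3 bytes per character, VARCHAR adds 2 length bytes, a nullable column adds 1 byte, and DATETIME takes 5 bytes (the storage size in MySQL 5.6.4 and later without fractional seconds; the server version is not stated in these notes). `KeyLenEstimate` is a hypothetical helper name.

```java
/** Sketch: estimates EXPLAIN key_len for the columns of the indexes in this test. */
final class KeyLenEstimate {

    /** VARCHAR(n) in utf8: 3 bytes per character, 2 length bytes, plus 1 byte if nullable. */
    static int varcharUtf8(int chars, boolean nullable) {
        return chars * 3 + 2 + (nullable ? 1 : 0);
    }

    /** CHAR(n) in utf8: 3 bytes per character, plus 1 byte if nullable. */
    static int charUtf8(int chars, boolean nullable) {
        return chars * 3 + (nullable ? 1 : 0);
    }

    /** DATETIME: assumed 5 bytes (MySQL 5.6.4+ without fractional seconds), plus 1 byte if nullable. */
    static int datetime(boolean nullable) {
        return 5 + (nullable ? 1 : 0);
    }

    public static void main(String[] args) {
        int indicatorKey = varcharUtf8(64, false); // indicator_key varchar(64) NOT NULL
        int businessId = charUtf8(36, true);       // business_id char(36) DEFAULT NULL
        int createAt = datetime(false);            // create_at datetime NOT NULL

        System.out.println(indicatorKey);                         // prints 194
        System.out.println(indicatorKey + businessId + createAt); // prints 308
    }
}
```

Under these assumptions, key_len 194 corresponds to indicator_key alone, and key_len 308 corresponds to all three columns of indicator_busi_create_idx being used.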
Covering index
The result set contains only indicator_key and business_id, so the query can be answered from the index alone;
Average time: 0.09 sec
MySQL [test]> explain select indicator_key, business_id from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' and business_id = '93d79263806742f190c6e6b9e7a1c08d';
+----+-------------+-----------------------------------------+-------+---------------------------+---------------------------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------------------+-------+---------------------------+---------------------------+---------+------+-------+--------------------------+
| 1 | SIMPLE | performance_metirc_host_part_min10_hour | range | indicator_busi_create_idx | indicator_busi_create_idx | 308 | NULL | 78028 | Using where; Using index |
+----+-------------+-----------------------------------------+-------+---------------------------+---------------------------+---------+------+-------+--------------------------+
MySQL [test]> select indicator_key, business_id from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' and business_id = '93d79263806742f190c6e6b9e7a1c08d';
37846 rows in set (0.09 sec)
Summary
- Non-covering index: 1.19 s;
- Covering index: 0.09 s;
The covering index is more than ten times faster;
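Whether a query can be served by a covering index comes down to a containment check. The sketch below assumes that an InnoDB secondary index entry also carries the primary key columns; `CoveringIndexCheck` and its method are hypothetical names for illustration, not part of any MySQL API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch: decides whether every column a query references is available in a secondary index. */
final class CoveringIndexCheck {

    /**
     * A query is covered when all referenced columns (SELECT list and WHERE clause)
     * are index columns or primary key columns, so no lookup into the table is needed.
     */
    static boolean isCovering(Set<String> referencedColumns,
                              List<String> indexColumns,
                              List<String> primaryKeyColumns) {
        Set<String> available = new HashSet<>(indexColumns);
        available.addAll(primaryKeyColumns);
        return available.containsAll(referencedColumns);
    }

    public static void main(String[] args) {
        List<String> index = Arrays.asList("indicator_key", "business_id", "create_at");
        List<String> primaryKey = Arrays.asList("id", "indicator_key");

        // SELECT indicator_key, business_id ... WHERE indicator_key, create_at, business_id
        Set<String> covered = new HashSet<>(
                Arrays.asList("indicator_key", "business_id", "create_at"));
        // Same query, but the SELECT list also contains value.
        Set<String> notCovered = new HashSet<>(
                Arrays.asList("indicator_key", "business_id", "create_at", "value"));

        System.out.println(isCovering(covered, index, primaryKey));    // prints true
        System.out.println(isCovering(notCovered, index, primaryKey)); // prints false
    }
}
```

Adding value to the SELECT list is what forces the lookups back into the table in the 1.19 s case.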
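Multi-column indexes: effect of column order
- indicator_busi_create_idx: average 1.15 s (about 80,000 rows)
- indicator_create_busi_idx: average about 26 s (about 400,000 to 800,000 rows)
- No index: average 0.8 s (about 800,000 rows)
indicator_busi_create_idx: average 1.15 s
The range-queried create_at column is placed last in the index (about 80,000 rows);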
KEY `indicator_busi_create_idx` (`indicator_key`,`business_id`,`create_at`)
MySQL [test]> explain select indicator_key, business_id , avg(value) from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' and business_id = '93d79263806742f190c6e6b9e7a1c08d';
+----+-------------+-----------------------------------------+-------+---------------------------+---------------------------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------------------+-------+---------------------------+---------------------------+---------+------+-------+--------------------------+
| 1 | SIMPLE | performance_metirc_host_part_min10_hour | range | indicator_busi_create_idx | indicator_busi_create_idx | 308 | NULL | 78028 | Using where; Using index |
+----+-------------+-----------------------------------------+-------+---------------------------+---------------------------+---------+------+-------+--------------------------+
MySQL [test]> select indicator_key, business_id, avg(value) from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' and business_id = '93d79263806742f190c6e6b9e7a1c08d';
+-----------------------------------+----------------------------------+--------------------+
| indicator_key | business_id | avg(value) |
+-----------------------------------+----------------------------------+--------------------+
| bm_statistic_mem_avg_util_percent | 93d79263806742f190c6e6b9e7a1c08d | 50.036187114557656 |
+-----------------------------------+----------------------------------+--------------------+
1 row in set (1.15 sec)
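indicator_create_busi_idx: average about 26 s
The range-queried create_at column comes before business_id in the index, so business_id cannot be used for the index lookup (about 800,000 rows);
Compared with indicator_busi_create_idx, about 10 times as many rows are fetched from the table, all of them through random I/O;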
KEY `indicator_create_busi_idx` (`indicator_key`,`create_at`,`business_id`)
MySQL [test]> select indicator_key, business_id, avg(value) from performance_metirc_host_part_min10_hour where indicator_key='bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' and business_id = '93d79263806742f190c6e6b9e7a1c08d';
+-----------------------------------+----------------------------------+--------------------+
| indicator_key | business_id | avg(value) |
+-----------------------------------+----------------------------------+--------------------+
| bm_statistic_mem_avg_util_percent | 93d79263806742f190c6e6b9e7a1c08d | 50.036187114557656 |
+-----------------------------------+----------------------------------+--------------------+
1 row in set (25.34 sec)
No index: about 0.8 s on average
A full table scan is used (about 800,000 rows)
MySQL [test]> select indicator_key, business_id, avg(value) from performance_metirc_host_part_min10_hour where indicator_key= 'bm_statistic_mem_avg_util_percent' and create_at >='2017-12-10 01:00:00' and create_at <='2017-12-10 01:50:00' and business_id = '93d79263806742f190c6e6b9e7a1c08d';
+-----------------------------------+----------------------------------+--------------------+
| indicator_key | business_id | avg(value) |
+-----------------------------------+----------------------------------+--------------------+
| bm_statistic_mem_avg_util_percent | 93d79263806742f190c6e6b9e7a1c08d | 50.036187114557514 |
+-----------------------------------+----------------------------------+--------------------+
1 row in set (0.8 sec)
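The row counts behind these three measurements can be estimated from the data preparation code in the appendix. The sketch below is plain arithmetic under two assumptions: the table holds only the 5,000,000 generated rows, and every random pick is uniform. The class and method names are hypothetical.

```java
public class RowEstimateSketch {
    static final long TOTAL_ROWS = 5_000_000L;  // rows written by the generator
    static final int INDICATOR_KEYS = 6;        // entries in indicatorKeyList
    static final int TIMESTAMPS = 12;           // entries in timeList
    static final int TIMESTAMPS_IN_RANGE = 6;   // 01:00 through 01:50 inclusive
    static final int BUSINESS_IDS = 11;         // entries in busiIdList

    /** Rows sharing one indicator_key: what a scan without an index reads. */
    static long rowsPerIndicator() {
        return TOTAL_ROWS / INDICATOR_KEYS;
    }

    /** Rows that also fall inside the queried create_at range. */
    static long rowsInTimeRange() {
        return rowsPerIndicator() * TIMESTAMPS_IN_RANGE / TIMESTAMPS;
    }

    /** Rows that additionally match the single business_id. */
    static long rowsMatchingAll() {
        return rowsInTimeRange() / BUSINESS_IDS;
    }

    public static void main(String[] args) {
        System.out.println("per indicator_key: " + rowsPerIndicator());
        System.out.println("in time range:     " + rowsInTimeRange());
        System.out.println("matching all:      " + rowsMatchingAll());
    }
}
```

Under these assumptions one indicator_key covers roughly 833,000 rows, which agrees with the 800,000-row figure quoted above; about 417,000 of them fall inside the time range and about 38,000 also match the business_id.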
Appendix
Data preparation code
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.UUID;
public class WriteHostPartdata {
private static final List<String> poolIDList = new ArrayList<>();
private static final List<String> indicatorKeyList = new ArrayList<>();
private static final List<String> timeList = new ArrayList<>();
private static final List<String> busiIdList = new ArrayList<>();
private static final List<String> orgaIdList = new ArrayList<>();
static {
poolIDList.add("7b8f0f5e2fbb4d9aa2d5fd55466d638e");
poolIDList.add("7b8f0f5e2fbb4d9aa2d5fd55466d63df");
poolIDList.add("7b8f0f5e2fbb4d9aa2d5fd55466d63er");
poolIDList.add("7b8f0f5e2fbb4d9aa2d5fd55466d6398");
indicatorKeyList.add("bm_statistic_cpu_avg_util_percent");
indicatorKeyList.add("bm_statistic_disk_avg_util_percent");
indicatorKeyList.add("bm_statistic_mem_avg_util_percent");
indicatorKeyList.add("vm_statistic_cpu_avg_util_percent");
indicatorKeyList.add("vm_statistic_disk_avg_util_percent");
indicatorKeyList.add("vm_statistic_mem_avg_util_percent");
timeList.add("2017-12-10 01:00:00");
timeList.add("2017-12-10 01:10:00");
timeList.add("2017-12-10 01:20:00");
timeList.add("2017-12-10 01:30:00");
timeList.add("2017-12-10 01:40:00");
timeList.add("2017-12-10 01:50:00");
timeList.add("2017-12-10 02:00:00");
timeList.add("2017-12-10 02:10:00");
timeList.add("2017-12-10 02:20:00");
timeList.add("2017-12-10 02:30:00");
timeList.add("2017-12-10 02:40:00");
timeList.add("2017-12-10 02:50:00");
busiIdList.add("8fe3e7bcebf540d1ae47ef5b53f62524");
busiIdList.add("93d79263806742f190c6e6b9e7a1c08d");
busiIdList.add("6e1141b4328843f09176fcc6928fab74");
busiIdList.add("59562271f4e6483cb784cea5cdb8bc8f");
busiIdList.add("c29ef5146d2641a2b6d7b731866e73b0");
busiIdList.add("10a86c53d54e46c2bedab6899075f41e");
busiIdList.add("ef818a8080db48568dd9f34cec21999a");
busiIdList.add("1384eb7cde9a497891a7ed743a66cc70");
busiIdList.add("3085f77c8fc8451683864a578ec94fdf");
busiIdList.add("aa6183cb7704431f857e8e63c63a7b84");
busiIdList.add("dbf5233183fd40679768552b16d73491");
orgaIdList.add("1da69607a73349bb909e65294e44c3a5");
orgaIdList.add("e1b72aa209654aa9a21acd59e6c9b7d6");
orgaIdList.add("3feb63ee93a046adada742f18b278f6d");
orgaIdList.add("defe080c3802423aa3e84a59f269b7a0");
orgaIdList.add("b62eff24281a4935a853cca65c7608da");
orgaIdList.add("d3701686cc0b4f0da4eead39fa807bd7");
orgaIdList.add("f90b3f78a9d641ba8aa942d912d1adc7");
orgaIdList.add("43e03831ef8c4e52a8541ad465efcb67");
orgaIdList.add("65458cc498e8481e8bf915a6947916b3");
}
public static void main(String[] args) {
String file = "D:\\tempTempTemp\\hostpartSql2.data";
writeFile(file);
}
    public static void writeFile(String fileName) {
        // Reuse one generator instead of creating a new Random for every field.
        Random random = new Random();
        // try-with-resources closes the file even when a write fails.
        try (FileWriter fw = new FileWriter(new File(fileName))) {
            for (int i = 1; i <= 5_000_000; i++) {
                String[] fields = {
                    // id: FileWriter.write(int) would emit a single character, not the number
                    String.valueOf(i),
                    poolIDList.get(random.nextInt(poolIDList.size())),             // pool_id
                    UUID.randomUUID().toString(),                                  // host_id
                    indicatorKeyList.get(random.nextInt(indicatorKeyList.size())), // indicator_key
                    String.valueOf(random.nextDouble() * 100),                     // value
                    "",                                                            // resource_type
                    timeList.get(random.nextInt(timeList.size())),                 // create_at
                    busiIdList.get(random.nextInt(busiIdList.size())),             // business_id
                    orgaIdList.get(random.nextInt(orgaIdList.size())),             // organization_id
                    "",                                                            // vpc_id
                    ""                                                             // security_id
                };
                // Each field is wrapped in single quotes; fields are separated by commas.
                fw.write("'" + String.join("','", fields) + "'\n");
                if (i % 50000 == 0) {
                    System.out.println("Finish:" + i / 50000);
                }
            }
        } catch (IOException e) {
            // Report the failure instead of silently swallowing it.
            System.err.println("Failed to write " + fileName + ": " + e.getMessage());
        }
    }
}