[20180509] Using Virtual Columns, a New Feature in MySQL 5.7

Abstract

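MySQL 5.7 supports two kinds of Generated Column: Virtual Generated Column and Stored Generated Column. For the former, only the column definition is kept in the data dictionary (the table metadata); the column values are not persisted to disk and are computed when rows are read. For the latter, the values are persisted to disk instead of being computed on every read. A stored column therefore holds data that can already be derived from existing columns and needs extra disk space, so in most cases it has no advantage over a virtual column. For that reason, when the type of a Generated Column is not specified, MySQL 5.7 defaults to Virtual Generated Column.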

  • If you think you need a Stored Generated Column, creating an index on a Virtual Generated Column may be a better fit.
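
As a minimal sketch of the difference described above, the following defines one virtual and one stored generated column over the same expression and indexes the virtual one. The table `demo_rect` and all of its column names are hypothetical and do not come from the original post.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE demo_rect (
  id     INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  width  INT NOT NULL,
  height INT NOT NULL,
  -- Virtual: the value is computed when the row is read, not stored with the row.
  area_v INT GENERATED ALWAYS AS (width * height) VIRTUAL,
  -- Stored: the value is computed on INSERT/UPDATE and stored with the row.
  area_s INT GENERATED ALWAYS AS (width * height) STORED,
  -- A secondary index on a virtual column keeps the computed values in the index.
  KEY idx_area_v (area_v)
) ENGINE=InnoDB;

-- Generated columns are never assigned directly; only the base columns are.
INSERT INTO demo_rect (width, height) VALUES (2, 3), (4, 5);

-- Both columns return the same values; only where they live differs.
SELECT id, area_v, area_s FROM demo_rect WHERE area_v >= 6 ORDER BY area_v;
```

The indexed virtual column gives lookups and ordering on the computed value without storing that value in every row of the table.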

Syntax

 <type> [ GENERATED ALWAYS ] AS ( <expression> ) [ VIRTUAL|STORED ] [ UNIQUE [ KEY ] ] [ NOT NULL ] [ COMMENT <text> ]
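
To make the optional clauses concrete, here is a small sketch that adds a generated column to an existing table with `ALTER TABLE`. The table `demo_user` and its columns are hypothetical, chosen only to show where each clause goes.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE demo_user (
  id         INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  first_name VARCHAR(50) NOT NULL,
  last_name  VARCHAR(50) NOT NULL
) ENGINE=InnoDB;

-- Type, then the generation expression, then the optional attributes.
ALTER TABLE demo_user
  ADD COLUMN full_name VARCHAR(101)
    GENERATED ALWAYS AS (CONCAT(first_name, ' ', last_name)) VIRTUAL
    NOT NULL
    COMMENT 'first and last name joined by a space';

-- GENERATED ALWAYS is optional; VIRTUAL is the default when no type is given.
ALTER TABLE demo_user
  ADD COLUMN name_len INT AS (CHAR_LENGTH(first_name) + CHAR_LENGTH(last_name));
```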

Practical application

  1. Table structure
mysql> show create table fen_simpic \G
*************************** 1. row ***************************
       Table: fen_simpic
Create Table: CREATE TABLE `fen_simpic` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `group` int(11) NOT NULL COMMENT 'video post ID of the screenshot',
  `item` int(2) NOT NULL COMMENT 'sequence number of the screenshot',
  `mh` char(144) DEFAULT NULL COMMENT 'Hamming hash of the screenshot',
  `dct` bigint(20) unsigned DEFAULT NULL COMMENT 'DCT hash of the screenshot',
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'record creation time',
  PRIMARY KEY (`id`),
  KEY `created_at` (`created_at`),
  KEY `group` (`group`,`item`)
) ENGINE=InnoDB AUTO_INCREMENT=2599837 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql>

  2. Slow SQL and its execution plan

mysql> explain select `group`, `item` , dct , mh, bit_count(dct^17228540329887592107) as dist from fen_simpic force index(created_at) where created_at<"2018-05-08 21:44:09" and created_at>"2018-04-09 10:15:50.463238" and `group` not in (120381696,120381705,120381709,120381714,120381718,120381736,120381747,120381753,120381763,120381776,120381787,120381808,120381820,120381837,120381857,120381861,120382022,120381776) and (`item`>=3 and `item`<=5) having dist<=26 order by dist limit 5000;
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+----------------------------------------------------+
| id | select_type | table      | partitions | type  | possible_keys | key        | key_len | ref  | rows    | filtered | Extra                                              |
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+----------------------------------------------------+
|  1 | SIMPLE      | fen_simpic | NULL       | range | created_at    | created_at | 4       | NULL | 1071840 |     5.55 | Using index condition; Using where; Using filesort |
+----+-------------+------------+------------+-------+---------------+------------+---------+------+---------+----------+----------------------------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql>
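
The `dist` expression in the query above, `bit_count(dct ^ <constant>)`, counts the bits in which two 64-bit hashes differ, that is, their Hamming distance. The small literals below are made up purely to show the arithmetic:

```sql
-- 5 = 0b101 and 3 = 0b011; XOR gives 0b110, which has two set bits.
SELECT BIT_COUNT(5 ^ 3) AS dist;    -- 2

-- Identical values differ in no bits.
SELECT BIT_COUNT(42 ^ 42) AS dist;  -- 0
```

Because the value depends on every row's `dct`, the server has to evaluate it for each candidate row before it can filter or sort on it.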

  3. Query timing

mysql> show profile for query 52;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.008504 |
| checking permissions | 0.000009 |
| Opening tables       | 0.000028 |
| init                 | 0.000049 |
| System lock          | 0.000012 |
| optimizing           | 0.000017 |
| statistics           | 0.000107 |
| preparing            | 0.000025 |
| Sorting result       | 0.000006 |
| executing            | 0.000003 |
| Sending data         | 0.000010 |
| Creating sort index  | 1.088568 |
| end                  | 0.000011 |
| query end            | 0.000013 |
| closing tables       | 0.000010 |
| freeing items        | 0.000270 |
| logging slow query   | 0.000060 |
| cleaning up          | 0.000018 |
+----------------------+----------+
18 rows in set, 1 warning (0.00 sec)
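
The post does not show how the timings were captured. A likely way, assuming the session-level profiler that `SHOW PROFILE` reads from, is sketched below; `SELECT 1` merely stands in for the statement being measured.

```sql
-- Enable profiling for the current session only.
SET profiling = 1;

-- Run the statement to be measured (placeholder here).
SELECT 1;

-- List the profiled statements with their Query_ID values.
SHOW PROFILES;

-- Per-stage timings of the most recently profiled statement;
-- use SHOW PROFILE FOR QUERY <Query_ID> to pick a specific one.
SHOW PROFILE;
```

The number in `show profile for query 52` is the `Query_ID` reported by `SHOW PROFILES` in the author's session, so it will differ elsewhere.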

  4. Create the virtual column

mysql> alter table fen_simpic add column dist tinyint(1) generated always as (bit_count(dct^17228540329887592107)) virtual;
mysql> alter table fen_simpic add index idx_dist(dist);
mysql> show create table fen_simpic \G
*************************** 1. row ***************************
       Table: fen_simpic
Create Table: CREATE TABLE `fen_simpic` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `group` int(11) NOT NULL COMMENT 'video post ID of the screenshot',
  `item` int(2) NOT NULL COMMENT 'sequence number of the screenshot',
  `mh` char(144) DEFAULT NULL COMMENT 'Hamming hash of the screenshot',
  `dct` bigint(20) unsigned DEFAULT NULL COMMENT 'DCT hash of the screenshot',
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'record creation time',
  `dist` tinyint(1) GENERATED ALWAYS AS (bit_count((`dct` ^ 17228540329887592107))) VIRTUAL,
  PRIMARY KEY (`id`),
  KEY `created_at` (`created_at`),
  KEY `group` (`group`,`item`),
  KEY `idx_dist` (`dist`)
) ENGINE=InnoDB AUTO_INCREMENT=2599837 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql>

  5. Run the SQL

mysql> explain  select `group`, `item` , dct , mh,  dist from fen_simpic force index(idx_dist) where created_at<"2018-05-08 21:44:09" and created_at>"2018-04-09 10:15:50.463238" and `group` not in (120381696,120381705,120381709,120381714,120381718,120381736,120381747,120381753,120381763,120381776,120381787,120381808,120381820,120381837,120381857,120381861,120382022,120381776) and (`item`>=3 and `item`<=5) having dist<=26 order by dist limit 5000;
+----+-------------+------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key      | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | fen_simpic | NULL       | index | NULL          | idx_dist | 2       | NULL | 2502423 |     0.62 | Using where |
+----+-------------+------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql>

  6. Query timing

mysql> show profile for query 57;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000133 |
| checking permissions | 0.000009 |
| Opening tables       | 0.000029 |
| init                 | 0.000049 |
| System lock          | 0.000012 |
| optimizing           | 0.000016 |
| statistics           | 0.000029 |
| preparing            | 0.000023 |
| Sorting result       | 0.000006 |
| executing            | 0.000003 |
| Sending data         | 0.212587 |
| end                  | 0.000013 |
| query end            | 0.000012 |
| closing tables       | 0.000012 |
| freeing items        | 0.000279 |
| cleaning up          | 0.000018 |
+----------------------+----------+
16 rows in set, 1 warning (0.00 sec)

  7. Further improved SQL

mysql> explain select t1.`group`, t1.`item` , t1.dct , t1.dist from fen_simpic t1 inner join (select id,dist from fen_simpic force index(idx_dist) where created_at<"2018-05-08 21:44:09" and created_at>"2018-04-09 10:15:50.463238" and `group` not in (120381696,120381705,120381709,120381714,120381718,120381736,120381747,120381753,120381763,120381776,120381787,120381808,120381820,120381837,120381857,120381861,120382022,
+----+-------------+------------+------------+--------+---------------+----------+---------+-------+---------+----------+-------------+
| id | select_type | table      | partitions | type   | possible_keys | key      | key_len | ref   | rows    | filtered | Extra       |
+----+-------------+------------+------------+--------+---------------+----------+---------+-------+---------+----------+-------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL    | NULL          | NULL     | NULL    | NULL  |    5000 |   100.00 | NULL        |
|  1 | PRIMARY     | t1         | NULL       | eq_ref | PRIMARY       | PRIMARY  | 4       | t2.id |       1 |   100.00 | NULL        |
|  2 | DERIVED     | fen_simpic | NULL       | index  | NULL          | idx_dist | 2       | NULL  | 2502423 |     0.62 | Using where |
+----+-------------+------------+------------+--------+---------------+----------+---------+-------+---------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)

mysql>
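
The statement in step 7 is cut off in the source, so its exact tail is not known. The general shape of the rewrite, a derived table that selects only `id` and `dist` and is then joined back to the table by primary key, can be sketched as follows. The date range, `item` range, threshold, and limit are copied from step 5; the `group` exclusion list is left out for brevity, and the alias `t2` is taken from the `ref` column of the plan above. Treat this as an illustration of the pattern, not as the author's exact statement.

```sql
-- Sketch only: assumes the fen_simpic table with the dist column and
-- idx_dist index created in step 4.
SELECT t1.`group`, t1.`item`, t1.dct, t1.dist
FROM fen_simpic t1
INNER JOIN (
    -- Narrow inner query: only the primary key and the sort value.
    SELECT id, dist
    FROM fen_simpic FORCE INDEX (idx_dist)
    WHERE created_at < '2018-05-08 21:44:09'
      AND created_at > '2018-04-09 10:15:50.463238'
      AND `item` >= 3 AND `item` <= 5
    HAVING dist <= 26
    ORDER BY dist
    LIMIT 5000
) t2 ON t1.id = t2.id;
```

The wide columns are fetched only for the rows that survive the `LIMIT`, through a primary-key lookup (`eq_ref` in the plan).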

  8. Timing of the further improved SQL

mysql> show profile for query 58;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.005367 |
| checking permissions | 0.000007 |
| checking permissions | 0.000005 |
| Opening tables       | 0.000032 |
| init                 | 0.000081 |
| System lock          | 0.000013 |
| optimizing           | 0.000015 |
| optimizing           | 0.000015 |
| statistics           | 0.000031 |
| preparing            | 0.000023 |
| Sorting result       | 0.000010 |
| statistics           | 0.000026 |
| preparing            | 0.000011 |
| executing            | 0.000009 |
| Sending data         | 0.000009 |
| executing            | 0.000002 |
| Sending data         | 0.201685 |
| end                  | 0.000012 |
| query end            | 0.000013 |
| closing tables       | 0.000005 |
| removing tmp table   | 0.000008 |
| closing tables       | 0.000009 |
| freeing items        | 0.000340 |
| cleaning up          | 0.000028 |
+----------------------+----------+
24 rows in set, 1 warning (0.00 sec)

Summary

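  1. The original SQL used force index(created_at) because, once all the filters are applied, the optimizer generally switches to a full table scan instead of using the index when the matching rows exceed roughly 30% of the table; that is why the index is forced. In addition, when choosing among indexes the optimizer prefers the one with higher selectivity.
  2. In the timing analysis, most of the time is clearly spent in Creating sort index. The sort is on dist, which does not actually exist as a column in the original table, so the value has to be computed for every row first and the rows sorted afterwards (the Using filesort in the execution plan).
  3. Virtual columns help a lot when sorting or filtering on a computed value like this one.
  4. The further rewrite of the SQL after the optimization is really meant to reduce the amount of data returned.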

References

http://www.cnblogs.com/raichen/p/5227449.html

Reposted from: https://blog.51cto.com/11819159/2114394
