在这篇文章中,我将重新探究ProxySQL中的Query Rewrite
功能,因为query rewriting是创建ProxySQL的最根本初衷。
为什么我们需要重写查询?
- 你已经确定了一个触发性能瓶颈或导致系统缓慢的查询
- 你无法(快速的)修改应用代码
- 某些特殊的操作需要“重定向查询”
这儿举例你作为DBA发现了一个“坏查询”,你确认是它导致了服务缓慢,并且可能会导致服务不可用。那这个查询必须被优化,你和开发沟通要修正这个SQL,但是开发反馈回来的信息是能改,但是由于技术的非技术的种种原因吧,没有那么快。这时你怎么办,等着?显然不能,你可以在开发完成修正之前通过ProxySQL的Query Rewrite
功能重写某些查询来完成优化同时对应用保持透明。
如何重写查询?通过ProxySQL有两种方式来完成(译者注:其实应该理解为两种匹配查询的方式)。
Query rewrite其实就是通过 mysql_query_rules
表中一个 match_pattern + replace_pattern
的过程,而match_digest
(注意区分 match_pattern 和 match_digest )仅用来匹配一个查询,而非重写它。逻辑上讲,match_digest
和 username
,schemaname
,proxy_addr
等字段的作用是一样的,仅用来匹配查询。
这两种不同的机制为不同的查询类型(例如DML操作,SELECT等)提供了灵活高效匹配方式。注意如果你希望重写查询,那么规则中的match_pattern
必须能匹配到原始的查询。查询规则按照rule_id字段的升序顺序处理,并且只有在active字段为1的前提下才会处理。
下面是我们如何在我们的测试环境演示 match_digest
mysql> SELECT hostgroup hg, sum_time, count_star, digest_text FROM stats_mysql_query_digest ORDER BY sum_time DESC limit 10;
+----+-----------+------------+-----------------------------------+
| hg | sum_time | count_star | digest_text |
+----+-----------+------------+-----------------------------------+
| 0 | 243549572 | 85710 | SELECT c FROM sbtest10 WHERE id=? |
| 0 | 146324255 | 42856 | COMMIT |
| 0 | 126643488 | 44310 | SELECT c FROM sbtest7 WHERE id=? |
| 0 | 126517140 | 42927 | BEGIN |
| 0 | 123797307 | 43820 | SELECT c FROM sbtest1 WHERE id=? |
| 0 | 123345775 | 43460 | SELECT c FROM sbtest6 WHERE id=? |
| 0 | 122121030 | 43010 | SELECT c FROM sbtest9 WHERE id=? |
| 0 | 121245265 | 42400 | SELECT c FROM sbtest8 WHERE id=? |
| 0 | 120554811 | 42520 | SELECT c FROM sbtest3 WHERE id=? |
| 0 | 119244143 | 42070 | SELECT c FROM sbtest5 WHERE id=? |
+----+-----------+------------+-----------------------------------+
10 rows in set (0.00 sec)
mysql> INSERT INTO mysql_query_rules (rule_id,active,username,match_digest, match_pattern,replace_pattern,apply) VALUES (10,1,'root','SELECT.*WHERE id=?','sbtest2','sbtest10',1);
Query OK, 1 row affected (0.00 sec)
mysql> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT hits, mysql_query_rules.rule_id,digest,active,username, match_digest, match_pattern, replace_pattern, cache_ttl, apply FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+------+---------+--------+--------+----------+--------------------+---------------+-----------------+-----------+-------+
| hits | rule_id | digest | active | username | match_digest | match_pattern | replace_pattern | cache_ttl | apply |
+------+---------+--------+--------+----------+--------------------+---------------+-----------------+-----------+-------+
| 0 | 10 | NULL | 1 | root | SELECT.*WHERE id=? | sbtest2 | sbtest10 | NULL | 1 |
+------+---------+--------+--------+----------+--------------------+---------------+-----------------+-----------+-------+
1 row in set (0.00 sec)
mysql> SELECT hits, mysql_query_rules.rule_id,digest,active,username, match_digest, match_pattern, replace_pattern, cache_ttl, apply FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+------+---------+--------+--------+----------+--------------------+---------------+-----------------+-----------+-------+
| hits | rule_id | digest | active | username | match_digest | match_pattern | replace_pattern | cache_ttl | apply |
+------+---------+--------+--------+----------+--------------------+---------------+-----------------+-----------+-------+
| 593 | 10 | NULL | 1 | root | SELECT.*WHERE id=? | sbtest2 | sbtest10 | NULL | 1 |
+------+---------+--------+--------+----------+--------------------+---------------+-----------------+-----------+-------+
1 row in set (0.00 sec)
如果想清空 query rules 的统计信息,使用下列方法
mysql> SELECT 1 FROM stats_mysql_query_digest_reset LIMIT 1;
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.01 sec)
mysql> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)
接下来是 match_pattern 示例:
mysql> SELECT hostgroup hg, sum_time, count_star, digest_text FROM stats_mysql_query_digest ORDER BY sum_time DESC limit 5;
+----+----------+------------+----------------------------------+
| hg | sum_time | count_star | digest_text |
+----+----------+------------+----------------------------------+
| 0 | 98753983 | 16292 | BEGIN |
| 0 | 84613532 | 16232 | COMMIT |
| 1 | 49327292 | 16556 | SELECT c FROM sbtest3 WHERE id=? |
| 1 | 49027118 | 16706 | SELECT c FROM sbtest2 WHERE id=? |
| 1 | 48095847 | 16396 | SELECT c FROM sbtest4 WHERE id=? |
+----+----------+------------+----------------------------------+
5 rows in set (0.01 sec)
mysql> INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply) VALUES (20,1,'root','DISTINCT(.*)ORDER BY c','DISTINCT1',1);
Query OK, 1 row affected (0.00 sec)
mysql> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT hits, mysql_query_rules.rule_id,digest,active,username, match_digest, match_pattern, replace_pattern, cache_ttl, apply FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+------+---------+--------+--------+----------+--------------------+------------------------+-----------------+-----------+-------+
| hits | rule_id | digest | active | username | match_digest | match_pattern | replace_pattern | cache_ttl | apply |
+------+---------+--------+--------+----------+--------------------+------------------------+-----------------+-----------+-------+
| 0 | 10 | NULL | 1 | root | SELECT.*WHERE id=? | sbtest2 | sbtest10 | NULL | 1 |
| 0 | 20 | NULL | 1 | root | NULL | DISTINCT(.*)ORDER BY c | DISTINCT1 | NULL | 1 |
+------+---------+--------+--------+----------+--------------------+------------------------+-----------------+-----------+-------+
2 rows in set (0.01 sec)
mysql> SELECT hits, mysql_query_rules.rule_id,digest,active,username, match_digest, match_pattern, replace_pattern, cache_ttl, apply FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+------+---------+--------+--------+----------+--------------------+------------------------+-----------------+-----------+-------+
| hits | rule_id | digest | active | username | match_digest | match_pattern | replace_pattern | cache_ttl | apply |
+------+---------+--------+--------+----------+--------------------+------------------------+-----------------+-----------+-------+
| 9994 | 10 | NULL | 1 | root | SELECT.*WHERE id=? | sbtest2 | sbtest10 | NULL | 1 |
| 6487 | 20 | NULL | 1 | root | NULL | DISTINCT(.*)ORDER BY c | DISTINCT1 | NULL | 1 |
+------+---------+--------+--------+----------+--------------------+------------------------+-----------------+-----------+-------+
2 rows in set (0.00 sec)
mysql> SELECT 1 FROM stats_mysql_query_digest_reset LIMIT 1;
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.00 sec)
mysql> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)
路由规则中一个关键点是 mysql_query_rules 的 apply 字段
- apply = 1(默认)表示查询一旦匹配到一条规则就不再匹配剩余的规则
- apply = 0 表示继续尝试匹配后续的规则
(译者注:类似于nginx rewrite 指令中的 break 参数)
如下面测试中所展示的,所有匹配rule_id = 10 或 rule_id = 20 的查询都准确的匹配上了。实际上,现在所有的规则在 runtime_mysql_query_rules 表中都是激活的。如果我们想禁用 mysql_query_rules 表中某条规则,设置 active = 0
mysql> update mysql_query_rules set apply = 1 where rule_id in (10);
Query OK, 1 row affected (0.00 sec)
mysql> update mysql_query_rules set apply = 0 where rule_id in (20);
Query OK, 1 row affected (0.00 sec)
mysql> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT hits, mysql_query_rules.rule_id,digest,active,username, match_digest, match_pattern, replace_pattern, cache_ttl, apply FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+------+---------+--------+--------+----------+--------------------+------------------------+-----------------+-----------+-------+
| hits | rule_id | digest | active | username | match_digest | match_pattern | replace_pattern | cache_ttl | apply |
+------+---------+--------+--------+----------+--------------------+------------------------+-----------------+-----------+-------+
| 0 | 10 | NULL | 1 | root | SELECT.*WHERE id=? | sbtest2 | sbtest10 | NULL | 1 |
| 0 | 20 | NULL | 1 | root | NULL | DISTINCT(.*)ORDER BY c | DISTINCT1 | NULL | 0 |
+------+---------+--------+--------+----------+--------------------+------------------------+-----------------+-----------+-------+
2 rows in set (0.00 sec)
mysql> SELECT hits, mysql_query_rules.rule_id,digest,active,username, match_digest, match_pattern, replace_pattern, flagIN, apply FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+-------+---------+--------+--------+----------+--------------------+------------------------+-----------------+--------+-------+
| hits | rule_id | digest | active | username | match_digest | match_pattern | replace_pattern | flagIN | apply |
+-------+---------+--------+--------+----------+--------------------+------------------------+-----------------+--------+-------+
| 10195 | 10 | NULL | 1 | root | SELECT.*WHERE id=? | sbtest2 | sbtest10 | 0 | 1 |
| 6599 | 20 | NULL | 1 | root | NULL | DISTINCT(.*)ORDER BY c | DISTINCT1 | 0 | 0 |
+-------+---------+--------+--------+----------+--------------------+------------------------+-----------------+--------+-------+
2 rows in set (0.00 sec)
mysql> SELECT hits, mysql_query_rules.rule_id,digest,active,username, match_digest, match_pattern, replace_pattern, flagIN, apply FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+-------+---------+--------+--------+----------+--------------------+------------------------+-----------------+--------+-------+
| hits | rule_id | digest | active | username | match_digest | match_pattern | replace_pattern | flagIN | apply |
+-------+---------+--------+--------+----------+--------------------+------------------------+-----------------+--------+-------+
| 20217 | 5 | NULL | 1 | root | NULL | DISTINCT(.*)ORDER BY c | DISTINCT1 | 0 | 1 |
| 27020 | 10 | NULL | 1 | root | SELECT.*WHERE id=? | sbtest2 | sbtest10 | 0 | 0 |
+-------+---------+--------+--------+----------+--------------------+------------------------+-----------------+--------+-------+
2 rows in set (0.00 sec)
mysql> update mysql_query_rules set active = 0 where rule_id = 5;
Query OK, 1 row affected (0.00 sec)
mysql> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 0 rows affected (0.02 sec)
mysql> SELECT hits, mysql_query_rules.rule_id,digest,active,username, match_digest, match_pattern, replace_pattern, cache_ttl, apply FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+------+---------+--------+--------+----------+--------------------+---------------+-----------------+-----------+-------+
| hits | rule_id | digest | active | username | match_digest | match_pattern | replace_pattern | cache_ttl | apply |
+------+---------+--------+--------+----------+--------------------+---------------+-----------------+-----------+-------+
| 0 | 10 | NULL | 1 | root | SELECT.*WHERE id=? | sbtest2 | sbtest10 | NULL | 0 |
+------+---------+--------+--------+----------+--------------------+---------------+-----------------+-----------+-------+
1 row in set (0.00 sec)
mysql> SELECT hits, mysql_query_rules.rule_id,digest,active,username, match_digest, match_pattern, replace_pattern, cache_ttl, apply FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules ORDER BY mysql_query_rules.rule_id;
+------+---------+--------+--------+----------+--------------------+---------------+-----------------+-----------+-------+
| hits | rule_id | digest | active | username | match_digest | match_pattern | replace_pattern | cache_ttl | apply |
+------+---------+--------+--------+----------+--------------------+---------------+-----------------+-----------+-------+
| 4224 | 10 | NULL | 1 | root | SELECT.*WHERE id=? | sbtest2 | sbtest10 | NULL | 0 |
+------+---------+--------+--------+----------+--------------------+---------------+-----------------+-----------+-------+
1 row in set (0.01 sec)
另外,ProxySQL还能帮忙识别出“低效的查询”,登录管理界面按如下操作
找出总耗时最多的查询
mysql> SELECT SUM(sum_time), SUM(count_star), digest_text FROM stats_mysql_query_digest GROUP BY digest ORDER BY SUM(sum_time) DESC LIMIT 3G
*************************** 1. row ***************************
SUM(sum_time): 95053795
SUM(count_star): 13164
digest_text: BEGIN
*************************** 2. row ***************************
SUM(sum_time): 85094367
SUM(count_star): 13130
digest_text: COMMIT
*************************** 3. row ***************************
SUM(sum_time): 52110099
SUM(count_star): 13806
digest_text: SELECT c FROM sbtest3 WHERE id=?
3 rows in set (0.00 sec)
找出平均耗时最高的查询
mysql> SELECT SUM(sum_time), SUM(count_star), SUM(sum_time)/SUM(count_star) avg, digest_text FROM stats_mysql_query_digest GROUP BY digest ORDER BY SUM(sum_time)/SUM(count_star) DESC limit 1;
+---------------+-----------------+--------+--------------------------------+
| SUM(sum_time) | SUM(count_star) | avg | digest_text |
+---------------+-----------------+--------+--------------------------------+
| 972162 | 1 | 972162 | CREATE INDEX k_5 ON sbtest5(k) |
+---------------+-----------------+--------+--------------------------------+
1 row in set (0.00 sec)
我发现关于ProxySQL query rewrite 的“最好”的文档在IBM,这里介绍了查询重写的原理和示例,值得一读。
还有一些别的场景你可能需要重写查询,试想有一张表的自增ID列已经达到了int类型的最大值,你可以将新插入的数据重定向到另一张表同时你通过alter命令来修正原表的问题,在这期间所有的查询还将访问原表,等alter原表完成后,将新表的数据导入的原表,即可达到不停机修DDL的效果。
从MySQL 5.7.6 起,MySQL以插件形式提供了 query rewrite 功能,你可以在这里找到相关文档。MySQL内建的查询重写功能的一个最大的劣势在于重写规则仅作用于当前MySQL实例,这也是相比之下ProxySQL 的优势所在:它处在应用和数据库之间,所以它的重写规则是全局的。
原文链接