10.2 Optimizing SQL
When a SQL query is designed or modified to take advantage of subquery factoring, the execution plan that the optimizer creates for it can change in ways that are far from subtle. The following quote comes from the Oracle 11gR2 documentation, in the Oracle Database SQL Language Reference entry for SELECT, under the subquery_factoring_clause heading:
The WITH query_name clause lets you assign a name to a subquery block. You can then reference the subquery block multiple places in the query by specifying query_name. Oracle Database optimizes the query by treating the query name as either an inline view or as a temporary table.
Notice that Oracle may treat the factored subquery as a temporary table. In queries where a table is referenced more than once, this could be a distinct performance advantage, because Oracle can materialize the result set of the subquery and so avoid performing some expensive database operations more than once. The caveat lies in the words "could be." Keep in mind that materializing the result set requires creating a temporary table and inserting the rows into it. That cost may be worth paying if the same result set is referred to many times, or it may turn out to be a significant performance penalty.
Testing Execution Plans
When examining the execution plans for queries that use subquery factoring, it may not be readily apparent whether Oracle is choosing the best execution plan. It may seem that using the INLINE or MATERIALIZE hint [1] would result in better-performing SQL. In some cases it may, but the use of these hints needs to be tested and considered in the context of overall application performance.
-----------------------------------
[1] Though well known in the Oracle community for some time now, the INLINE and MATERIALIZE hints remain undocumented by Oracle.
-----------------------------------
The need to test for optimum query performance can be illustrated by a report that management has requested. The report must show the distribution of customers by country and income level, showing only those countries and income levels that make up 1% or more of the entire customer base. A country and income level should also be reported if the number of customers in an income level bracket is greater than or equal to 25% of all customers in that income bracket [2].
The query in Listing 10-3 is the end result [3]. The cust factored subquery has been retained from previous queries. What is new are the subqueries in the HAVING clause, which enforce the rules stipulated for the report.
Listing 10-3. WITH and MATERIALIZE
with cust as (
  select /*+ materialize gather_plan_statistics */
    b.cust_income_level,
    a.country_name
  from sh.customers b
  join sh.countries a on a.country_id = b.country_id
)
select country_name, cust_income_level, count(country_name) country_cust_count
from cust c
having count(country_name) >
(
  -- 1% of the entire customer base
  select count(*) * .01
  from cust c2
)
or count(cust_income_level) >=
(
  -- median, across income levels, of 25% of each level's customer count
  select median(income_level_count)
  from (
    select cust_income_level, count(*) *.25 income_level_count
    from cust
    group by cust_income_level
  )
)
group by country_name, cust_income_level
order by 1,2;
                                                    CUSTOMER
COUNTRY                        INCOME LEVEL            COUNT
------------------------------ -------------------- --------
France                         E: 90,000 - 109,999       585
France                         F: 110,000 - 129,999      651
...
United States of America       H: 150,000 - 169,999     1857
United States of America       I: 170,000 - 189,999     1395
...
35 rows selected.
----------------------------
[2] If you run these examples on a version of Oracle other than 11gR2, the output may differ, as the test data sometimes changes between versions of Oracle.
[3] The MATERIALIZE hint was used to ensure that the example would work as expected, given that you may be testing on a different version or patch level of Oracle. On the test system used by the author, this was the default action by Oracle.
-----------------------------
Elapsed: 00:00:01.37
Statistics
----------------------------------------------------------
1854 recursive calls
307 db block gets
2791 consistent gets
1804 physical reads
672 redo size
4609 bytes sent via SQL*Net to client
700 bytes received via SQL*Net from client
18 SQL*Net roundtrips to/from client
38 sorts (memory)
0 sorts (disk)
35 rows processed
-------------------------------------------------------------------------------------------
| Id  | Operation                  | Name         | Starts | E-Rows | A-Rows |   A-Time   |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |              |      1 |        |     35 |00:00:11.74 |
|   1 |  TEMP TABLE TRANSFORMATION |              |      1 |        |     35 |00:00:11.74 |
|   2 |   LOAD AS SELECT           |              |      1 |        |      0 |00:00:09.87 |
|*  3 |    HASH JOIN               |              |      1 |  55500 |  55500 |00:03:30.11 |
|   4 |     TABLE ACCESS FULL      | COUNTRIES    |      1 |     23 |     23 |00:00:00.04 |
|   5 |     TABLE ACCESS FULL      | CUSTOMERS    |      1 |  55500 |  55500 |00:03:29.77 |
|*  6 |   FILTER                   |              |      1 |        |     35 |00:00:01.88 |
|   7 |    SORT GROUP BY           |              |      1 |     18 |    209 |00:00:01.84 |
|   8 |     VIEW                   |              |      1 |  55500 |  55500 |00:00:30.87 |
|   9 |      TABLE ACCESS FULL     | SYS_TEMP_0F  |      1 |  55500 |  55500 |00:00:30.73 |
|  10 |    SORT AGGREGATE          |              |      1 |      1 |      1 |00:00:00.01 |
|  11 |     VIEW                   |              |      1 |  55500 |  55500 |00:00:00.21 |
|  12 |      TABLE ACCESS FULL     | SYS_TEMP_0F  |      1 |  55500 |  55500 |00:00:00.07 |
|  13 |    SORT GROUP BY           |              |      1 |      1 |      1 |00:00:00.03 |
|  14 |     VIEW                   |              |      1 |     11 |     13 |00:00:00.03 |
|  15 |      SORT GROUP BY         |              |      1 |     11 |     13 |00:00:00.03 |
|  16 |       VIEW                 |              |      1 |  55500 |  55500 |00:00:00.21 |
|  17 |        TABLE ACCESS FULL   | SYS_TEMP_0F  |      1 |  55500 |  55500 |00:00:00.07 |
-------------------------------------------------------------------------------------------
When the SQL is executed [4], all appears as you would expect. Then you check the execution plan and find that the join of the CUSTOMERS and COUNTRIES tables underwent a TEMP TABLE TRANSFORMATION, and that the rest of the query was satisfied by using the temporary table SYS_TEMP_0F [5]. At this point, you might rightly wonder whether the execution plan chosen was a reasonable one. That can easily be tested, thanks to the MATERIALIZE and INLINE hints.
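The Starts, E-Rows, A-Rows, and A-Time columns in the plan above are the kind of output that DBMS_XPLAN.DISPLAY_CURSOR produces when row source statistics have been collected, for example through the gather_plan_statistics hint used in the listing. The source does not show the exact command the author ran, so the following is only a minimal sketch. It assumes a SQL*Plus session and a user that is allowed to query the V$ views that DBMS_XPLAN reads.

```sql
-- Assumption: run in the same session, immediately after the query of interest,
-- so that passing NULL for the sql_id resolves to the last statement executed.
-- SERVEROUTPUT is switched off because otherwise the "last statement" seen by
-- DISPLAY_CURSOR can be SQL*Plus's own DBMS_OUTPUT call rather than your query.
set serveroutput off

select *
from   table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

The 'ALLSTATS LAST' format asks for the actual row counts and timings of the most recent execution only, which is what makes the estimated (E-Rows) and actual (A-Rows) figures directly comparable.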
-----------------------------
[4] Initial executions are run after first flushing the shared_pool and buffer_cache.
[5] The actual table name was SYS_TEMP_0FD9D66A2_453290, but it was shortened in the listing for formatting purposes.
------------------------------------------------
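The cold-cache test setup mentioned in the footnotes can be sketched as follows. This is an assumption about how such a test might be prepared, not the author's actual script. It presumes a private test instance, a user with the ALTER SYSTEM privilege, and a SQL*Plus session in which AUTOTRACE is available.

```sql
-- Assumption: private test instance only. Flushing affects every session on the
-- instance, so this should not be run on a shared or production system.
alter system flush shared_pool;   -- discard cached cursors so the statement is parsed again
alter system flush buffer_cache;  -- discard cached data blocks in the Oracle buffer cache

-- SQL*Plus settings that report the elapsed time and the session statistics
-- (recursive calls, consistent gets, physical reads, ...) after each statement.
set timing on
set autotrace on statistics

-- ... run the query under test here, once per hint variant ...
```

Flushing before each variant keeps the comparison fair: neither the MATERIALIZE nor the INLINE version benefits from blocks or cursors left behind by the other.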
By using the INLINE hint, Oracle can be instructed to satisfy all portions of the query without using a TEMP TABLE TRANSFORMATION. The results of doing so are shown in Listing 10-4. Only the portion of the SQL that has changed is shown here; the rest is identical to Listing 10-3.
Listing 10-4. WITH and INLINE Hint
with cust as (
  select /*+ inline gather_plan_statistics */
    b.cust_income_level,
    a.country_name
  from sh.customers b
  join sh.countries a on a.country_id = b.country_id
)
...
COUNTRY                        INCOME LEVEL            COUNT
------------------------------ -------------------- --------
France                         E: 90,000 - 109,999       585
France                         F: 110,000 - 129,999      651
...
United States of America       I: 170,000 - 189,999     1395
United States of America       J: 190,000 - 249,999     1390
...
35 rows selected.
Elapsed: 00:00:00.62
Statistics
----------------------------------------------------------
1501 recursive calls
0 db block gets
4758 consistent gets
1486 physical reads
0 redo size
4609 bytes sent via SQL*Net to client
700 bytes received via SQL*Net from client
18 SQL*Net roundtrips to/from client
34 sorts (memory)
0 sorts (disk)
35 rows processed
-------------------------------------------------------------------------------------------
| Id  | Operation                  | Name         | Starts | E-Rows | A-Rows |   A-Time   |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |              |      1 |        |     35 |00:00:09.65 |
|*  1 |  FILTER                    |              |      1 |        |     35 |00:00:09.65 |
|   2 |   SORT GROUP BY            |              |      1 |     20 |    236 |00:00:09.53 |
|*  3 |    HASH JOIN               |              |      1 |  55500 |  55500 |00:03:09.16 |
|   4 |     TABLE ACCESS FULL      | COUNTRIES    |      1 |     23 |     23 |00:00:00.03 |
|   5 |     TABLE ACCESS FULL      | CUSTOMERS    |      1 |  55500 |  55500 |00:03:08.83 |
|   6 |   SORT AGGREGATE           |              |      1 |      1 |      1 |00:00:00.07 |
|*  7 |    HASH JOIN               |              |      1 |  55500 |  55500 |00:00:00.41 |
|   8 |     INDEX FULL SCAN        | COUNTRIES_PK |      1 |     23 |     23 |00:00:00.03 |
|   9 |     TABLE ACCESS FULL      | CUSTOMERS    |      1 |  55500 |  55500 |00:00:00.09 |
|  10 |   SORT GROUP BY            |              |      1 |      1 |      1 |00:00:00.06 |
|  11 |    VIEW                    |              |      1 |     12 |     13 |00:00:00.06 |
|  12 |     SORT GROUP BY          |              |      1 |     12 |     13 |00:00:00.06 |
|* 13 |      HASH JOIN             |              |      1 |  55500 |  55500 |00:00:00.38 |
|  14 |       INDEX FULL SCAN      | COUNTRIES_PK |      1 |     23 |     23 |00:00:00.01 |
|  15 |       TABLE ACCESS FULL    | CUSTOMERS    |      1 |  55500 |  55500 |00:00:00.08 |
-------------------------------------------------------------------------------------------
From the execution plan in Listing 10-4, you can see that three full scans were performed on the CUSTOMERS table and one full scan on the COUNTRIES table. Two of the executions against the cust subquery required only the information in the COUNTRIES_PK index, so a full scan of the index was performed rather than a full scan of the table, saving a small amount of time and resources.
What may be surprising is that the execution using full table scans was .75 seconds faster than the one that used a temporary table (0.62 seconds elapsed versus 1.37 seconds), which makes it more than twice as fast. Of course, the cache was cold for both queries, as both the buffer cache and the shared pool were flushed prior to running each query.
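The comparison rests on simple arithmetic over the two elapsed times reported in the listings: 1.37 seconds with MATERIALIZE and 0.62 seconds with INLINE. As a small illustrative check, the figures can be worked out against Oracle's DUAL table. The column aliases here are made up for the example.

```sql
-- Elapsed times taken from the two listings:
--   1.37 s with the MATERIALIZE hint, 0.62 s with the INLINE hint
select round(1.37 - 0.62, 2)                as seconds_saved,     -- 0.75
       round(1.37 / 0.62, 2)                as speedup_factor,    -- 2.21
       round((1.37 - 0.62) / 1.37 * 100, 1) as pct_less_elapsed   -- 54.7
from   dual;
```

In other words, the INLINE version ran in a little under half the elapsed time of the MATERIALIZE version on this particular test system. A single cold-cache run of each variant is a small sample, so the result is best read as a prompt to test rather than as a general rule.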