10.2.4 Seizing Other Optimization Opportunities 掌握其他的优化机会
There are other opportunities where subquery factoring may be used to your advantage. If you are
working on applications that were originally written several years ago, you may find that some of SQL
could use a bit of improvement based on the features offered by Oracle versions 9i and later. The query
in Listing 10-8, for example, does exactly what it was asked to do, which is to find the average, minimum,
and maximum costs for each product that was produced in the year 2000, with the costs calculated for
each of the sale channels the product was sold in. This SQL is not only difficult to read and hard to
modify, but is also somewhat inefficient.
有些其他的机会子查询分解可用于你的优化。如果你对几年前写的应用进行优化,可能发现某些SQL能用Oracle 9i或之后版本提供的特性提升性能。在列表10-8中的查询,例如,准确的完成了需求,对每种生产于2000年的产品找出平均数,最小和最大成本,以及计 算出每种产品销售的销售渠道的成本。 这段SQL不但难读,难修改,还有些低效率。
Listing 10-8. Old SQL to Calculate Costs
1 select /*+ gather_plan_statistics */
2 substr(prod_name,1,30) prod_name
3 , channel_desc
4 , (
5 select avg(c2.unit_cost)
6 from sh.costs c2
7 where c2.prod_id = c.prod_id and c2.channel_id = c.channel_id
8 and c2.time_id between to_date('01/01/2000','mm/dd/yyyy')
9 and to_date('12/31/2000')
10 ) avg_cost
11 , (
12 select min(c2.unit_cost)
13 from sh.costs c2
14 where c2.prod_id = c.prod_id and c2.channel_id = c.channel_id
15 and c2.time_id between to_date('01/01/2000','mm/dd/yyyy')
16 and to_date('12/31/2000')
17 ) min_cost
18 , (
19 select max(c2.unit_cost)
20 from sh.costs c2
21 where c2.prod_id = c.prod_id and c2.channel_id = c.channel_id
22 and c2.time_id between to_date('01/01/2000','mm/dd/yyyy')
23 and to_date('12/31/2000')
24 ) max_cost
25 from (
26 select distinct pr.prod_id, pr.prod_name, ch.channel_id, ch.channel_desc
27 from sh.channels ch
28 , sh.products pr
29 , sh.costs co
30 where ch.channel_id = co.channel_id
31 and co.prod_id = pr.prod_id
32 and co.time_id between to_date('01/01/2000','mm/dd/yyyy')
33 and to_date('12/31/2000')
34 ) c
35 order by prod_name, channel_desc;
...(运行结果参考附件)
COLD CACHE Elapsed: 00:00:02.30
WARM CACHE Elapsed: 00:00:01.09
...
Examining the output of Listing 10-8, you see that the elapsed execution time on a cold cache is
2.30 seconds and 1.09 seconds on a warm cache. These times don’t seem all that bad at first. But when
you examine the execution plan, you find that this query can be improved upon from a performance
perspective as well as a readability perspective.
考察列表10-8的输出结果,你可看到在“冷缓存 ”的执行时间是2.30s,“热缓存 ”的执行时间是1.09s。这些时间初看上去并不坏。但是当你考察执行计划,你会发现这个查询从性能和可读性方面均有改善的空间。
The Starts column is telling. Each execution against the COSTS table is executed 864 times. This is
due to there being 216 rows produced by a join between CHANNELS, PRODUCTS, and COSTS. Also, the COSTS table is queried in four separate places for the same information. By using subquery factoring, not only can this SQL be cleaned up and made easier to read, it can also be made more efficient.
从开始的列可以看出,每次执行对COST表共864次。这是由于有216行乘以在CHANNELS,PRODUCTS和COSTS之间的连接。还有在4个 分散的地方查询COSTS表,获取同样的信息。通过使用子查询因子,不仅能使得SQL整洁而且易读,还能更有效率。
As seen in Listing 10-9, you can start by putting the begin_date and end_date columns in a separate
query bookends, leaving only one place that the values need to be set. The data for products is placed in
the prodmaster subquery. While this bit of the SQL worked fine as subquery in the FROM clause, the
readability of the SQL statement as a whole is greatly improved by moving it to a factored subquery.
如在列表10-9中所见,你能提取begin_date 和end_date列到独立的查询bookends,只留下一处地方设置值。products的数据放入prodmaster子查询。虽然这段SQL在 FROM子句中作为子查询工作的很好,通过移动它成为因子化的子查询,SQL语句的整体可读性大幅提高。
The calculations for the average, minimum, and maximum costs are replaced with a single subquery
called cost_compare. Finally, the SQL that joins the prodmaster and cost_compare subqueries is added.
The structure of the SQL is now much easier on the eyes and the overworked Developer’s brain. It’s also
simpler for the DBA to understand. The DBA will be especially happy with the execution statistics.
对平均,最小,最大成本的计算用单独的子查询cost_compare替换。最后,加入SQL连接prodmaster和cost_compare子查 询。现在SQL的结构使得劳累过度的的开发人员的大脑和眼睛舒服多了。 它也利于DBA的理解。DBA将对执行统计非常高兴。
Where the old SQL queried the COSTS table and COSTS_TIME_BIX index several hundred times, the
new SQL queries each only eight times. That is quite an improvement, and it shows in the elapsed times.
The query time on a cold cache is 1.48 seconds, about 25% better than the old SQL. On a warm cache,
however, the re-factored SQL really shines, running at 0.17 seconds whereas the old SQL managed only
1.09 seconds.
老的SQL查询COSTS表和COST_TIME_BIX索引几百次,新的SQL查询每次只要8次。这是相当大的提升,且它在消耗时间上也显现出来了。查 询时间在“冷缓存”上是1.48s,大约25%优于旧的SQL。在“热缓存”上,然而,重构的SQL真正闪光之处,运行时间0.17s而老SQL需要 1.09s
Listing 10-9. Old SQL Refactored Using WITH Clause
1 with bookends as (
2 select
3 to_date('01/01/2000','mm/dd/yyyy') begin_date
4 ,to_date('12/31/2000','mm/dd/yyyy') end_date
5 from dual
6 ),
7 prodmaster as (
8 select distinct pr.prod_id, pr.prod_name, ch.channel_id, ch.channel_desc
9 from sh.channels ch
10 , sh.products pr
11 , sh.costs co
12 where ch.channel_id = co.channel_id
13 and co.prod_id = pr.prod_id
14 and co.time_id between (select begin_date from bookends)
15 and (select end_date from bookends)
16 ),
17 cost_compare as (
18 select
19 prod_id
20 , channel_id
21 , avg(c2.unit_cost) avg_cost
22 , min(c2.unit_cost) min_cost
23 , max(c2.unit_cost) max_cost
24 from sh.costs c2
25 where c2.time_id between (select begin_date from bookends)
26 and (select end_date from bookends)
27 group by c2.prod_id, c2.channel_id
28 )
29 select /*+ gather_plan_statistics */
30 substr(pm.prod_name,1,30) prod_name
31 , pm.channel_desc
32 , cc.avg_cost
33 , cc.min_cost
34 , cc.max_cost
35 from prodmaster pm
36 join cost_compare cc on cc.prod_id = pm.prod_id
37 and cc.channel_id = pm.channel_id
38 order by pm.prod_name, pm.channel_desc;
PRODUCT CHANNEL_DESC AVG COST MIN COST MAX COST
------------------------------ -------------------- ---------- ---------- ----------
1.44MB External 3.5" Diskette Direct Sales 8.36 7.43 9.17
1.44MB External 3.5" Diskette Internet 8.59 7.42 9.55
Y Box Internet 266.73 245.00 282.30
Y Box Partners 272.62 242.79 293.68
---------- ---------- ----------
sum 27,961.39 24,407.85 34,478.10
216 rows selected.
COLD CACHE Elapsed: 00:00:01.48
WARM CACHE Elapsed: 00:00:00.17
(执行计划参见附件)