今天讲一个小众Case,Oracle数据库的“Star Transformation”。
根据关系数据库的范式理论,表在设计过程中会尽量降低耦合和冗余。
于是,大多数的数据库都是这样设计的。
例:
在一个销售系统中,一般都会涉及到这样几种数据:产品,分类,厂家,卖家,买家等等。
下面是表设计:
产品表:产品ID,产品名,价格。
分类表:分类ID,分类名。
厂家表:厂家ID,厂家名。
卖家表:卖家ID,卖家名。
买家表:买家ID,买家名。
销售记录表:时间,产品ID,分类ID,厂家ID,卖家ID,买家ID,其他。
使用上面的数据库设计,基本可以满足销售数据的快速记录,更新等日常处理。
但是,随着销售记录的增加,做下面的销售汇总报表的时候就会非常慢,因为报表是需要把销售记录的各种ID转换成实际名字给人看的。
Select 产品名, sum(价格)
from 销售记录表 t1, 产品表 t2, 分类表 t3, 厂家表 t4, 卖家表 t5, 买家表 t6
where t1.产品ID = t2.产品ID
and t1.分类ID = t3.分类ID
and t1.厂家ID = t4.厂家ID
and t1.卖家ID = t5.卖家ID
and t1.买家ID = t6.买家ID
and t2.产品名 =
and t3.分类名 =
and t4.厂家名 =
and t5.卖家名 =
and t6.买家名 = ;
如何解决这个问题呢?
聪明的ORACLE工程师设计了“Star Schema”和“Star Transformation”。
首先,什么是“Star Schema”呢?
上面的例子里,我们可以发现在所有的表中,只有“销售记录表”一个“事实表”(Fact Table)。其他的表都只是"参照表"(Dimension Table)。结构可以用下面的图表示。
这就是一个典型的“Star Schema”。
那什么又是“Star Transformation”呢?
“Star Transformation”是ORACLE为了解决上面问题,结合ORACLE数据库特有的“Bitmap”索引而作出的优化。
“Star Transformation”是如何实现的呢?
- 在“Fact Table”的检索列上建立Bitmap索引。
- 通过检索列上的Bitmap索引操作( BITMAP AND/OR)拿到“Fact Table”的RowID,从而避免”FULL SCAN“。
现在我用”Example Schemas“的几个表来演示一下。
取得”Star Transformation“没有启用时的执行计划。
SQL> show parameter star_transformation_enabled NAME TYPE VALUE ------------------------------------ --------------------------------- ------------------------------ star_transformation_enabled string FALSE SQL> set autot on SQL> set autot traceonly exp SQL> set lin 180 pages 9999 SQL> r 1 SELECT c.cust_city, 2 t.calendar_quarter_desc, 3 SUM(s.amount_sold) sales_amount 4 FROM sh.sales s, 5 sh.times t, 6 sh.customers c, 7 sh.channels ch 8 WHERE s.time_id = t.time_id 9 AND s.cust_id = c.cust_id 10 AND s.channel_id = ch.channel_id 11 AND c.cust_state_province = 'CA' 12 AND ch.channel_desc = 'Internet' 13 AND t.calendar_quarter_desc IN ('1999-01','1999-02') 14* GROUP BY c.cust_city, t.calendar_quarter_desc 実行計画 ---------------------------------------------------------- Plan hash value: 1865285285 ------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 405 | 30780 | 955 (2)| 00:00:12 | | | | 1 | HASH GROUP BY | | 405 | 30780 | 955 (2)| 00:00:12 | | | |* 2 | HASH JOIN | | 1558 | 115K| 954 (2)| 00:00:12 | | | | 3 | PART JOIN FILTER CREATE | :BF0000 | 183 | 2928 | 18 (0)| 00:00:01 | | | |* 4 | TABLE ACCESS FULL | TIMES | 183 | 2928 | 18 (0)| 00:00:01 | | | |* 5 | HASH JOIN | | 12456 | 729K| 935 (2)| 00:00:12 | | | | 6 | MERGE JOIN CARTESIAN | | 383 | 14937 | 408 (1)| 00:00:05 | | | |* 7 | TABLE ACCESS FULL | CHANNELS | 1 | 13 | 3 (0)| 00:00:01 | | | | 8 | BUFFER SORT | | 383 | 9958 | 405 (1)| 00:00:05 | | | |* 9 | TABLE ACCESS FULL | CUSTOMERS | 383 | 9958 | 405 (1)| 00:00:05 | | | | 10 | PARTITION RANGE JOIN-FILTER| | 918K| 18M| 525 (2)| 00:00:07 |:BF0000|:BF0000| | 11 | TABLE ACCESS FULL | SALES | 918K| 18M| 525 (2)| 00:00:07 |:BF0000|:BF0000| ------------------------------------------------------------------------------------------------------------
取得”Star Transformation“启用时的执行计划。
SQL> alter session set star_transformation_enabled='true'; セッションが変更されました。 SQL> SELECT c.cust_city, t.calendar_quarter_desc, SUM(s.amount_sold) sales_amount FROM sh.sales s, sh.times t, sh.customers c, sh.channels ch WHERE s.time_id = t.time_id AND s.cust_id = c.cust_id AND s.channel_id = ch.channel_id AND c.cust_state_province = 'CA' AND ch.channel_desc = 'Internet' AND t.calendar_quarter_desc IN ('1999-01','1999-02') GROUP BY c.cust_city, t.calendar_quarter_desc; 実行計画 ---------------------------------------------------------- Plan hash value: 2164696140 -------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | -------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 542 | 30894 | 533 (1)| 00:00:07 | | | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | | | 2 | LOAD AS SELECT | SYS_TEMP_0FD9D6609_340EC8 | | | | | | | |* 3 | TABLE ACCESS FULL | CUSTOMERS | 383 | 9958 | 405 (1)| 00:00:05 | | | | 4 | HASH GROUP BY | | 542 | 30894 | 128 (1)| 00:00:02 | | | |* 5 | HASH JOIN | | 1949 | 108K| 127 (0)| 00:00:02 | | | | 6 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6609_340EC8 | 383 | 5745 | 2 (0)| 00:00:01 | | | |* 7 | HASH JOIN | | 1949 | 81858 | 125 (0)| 00:00:02 | | | |* 8 | TABLE ACCESS FULL | TIMES | 183 | 2928 | 18 (0)| 00:00:01 | | | | 9 | VIEW | VW_ST_A3F94988 | 1953 | 50778 | 107 (0)| 00:00:02 | | | | 10 | NESTED LOOPS | | 1953 | 108K| 84 (0)| 00:00:02 | | | | 11 | PARTITION RANGE SUBQUERY | | 1952 | 54676 | 54 (0)| 00:00:01 |KEY(SQ)|KEY(SQ)| | 12 | BITMAP CONVERSION TO ROWIDS| | 1952 | 54676 | 54 (0)| 00:00:01 | | | | 13 | BITMAP AND | | | | | | | | | 14 | BITMAP MERGE | | | | | | | | | 15 | BITMAP KEY ITERATION | | | | | | | | | 16 | BUFFER SORT | | | | | | | | |* 17 | TABLE ACCESS FULL | CHANNELS | 1 | 13 | 3 (0)| 00:00:01 | | | |* 18 | BITMAP INDEX RANGE SCAN| SALES_CHANNEL_BIX | | | | |KEY(SQ)|KEY(SQ)| | 19 | BITMAP MERGE | | | | | | | | | 20 | BITMAP KEY ITERATION | | | | | | | | | 21 | BUFFER SORT | | | | | | | | |* 22 | TABLE ACCESS FULL | TIMES | 183 | 2928 | 18 (0)| 00:00:01 | | | |* 23 | BITMAP INDEX RANGE SCAN| SALES_TIME_BIX | | | | |KEY(SQ)|KEY(SQ)| | 24 | BITMAP MERGE | | | | | | | | | 25 | BITMAP KEY ITERATION | | | | | | | | | 26 | BUFFER SORT | | | | | | | | | 27 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6609_340EC8 | 383 | 1915 | 2 (0)| 00:00:01 | | | |* 28 | BITMAP INDEX RANGE SCAN| SALES_CUST_BIX | | | | |KEY(SQ)|KEY(SQ)| | 29 | TABLE ACCESS BY USER ROWID | SALES | 1 | 29 | 53 (0)| 00:00:01 | ROWID | ROWID |
简单的比较一下上面两个执行计划就可以发现”Star Transformation“启用后,Access的记录数和COST都大幅降低了。
在看一下CBO的动作,我们可以看到启用”Star Transformation“后,上面的SQL文被转换成了下面的SQL文:
SELECT
"T1" . "C1" "CUST_CITY"
,"T" . "CALENDAR_QUARTER_DESC" "CALENDAR_QUARTER_DESC"
,SUM (
"VW_ST_A3F94988" . "ITEM_3"
) "SALES_AMOUNT"
FROM
(
SELECT
/*+ ORDERED USE_NL ("SYS_CP_S") NOPARALLEL ("S") */ "SYS_CP_S" . "CUST_ID" "ITEM_1"
,"SYS_CP_S" . "TIME_ID" "ITEM_2"
,"SYS_CP_S" . "AMOUNT_SOLD" "ITEM_3"
FROM
"SH" . "SALES" "S"
,"SH" . "SALES" "SYS_CP_S"
WHERE
"S" . ROWID = "SYS_CP_S" . ROWID
AND "S" . "CUST_ID" = ANY (
SELECT
/*+ SEMIJOIN_DRIVER CACHE_TEMP_TABLE ("T1") */ "T1" . "C0" "C0"
FROM
"SYS" . "SYS_TEMP_0FD9D660B_340EC8" "T1"
)
AND "S" . "CHANNEL_ID" = ANY (
SELECT
/*+ SEMIJOIN_DRIVER */ "CH" . "CHANNEL_ID" "ITEM_1"
FROM
"SH" . "CHANNELS" "CH"
WHERE
"CH" . "CHANNEL_DESC" = 'Internet'
)
AND "S" . "TIME_ID" = ANY (
SELECT
/*+ SEMIJOIN_DRIVER */ "T" . "TIME_ID" "ITEM_1"
FROM
"SH" . "TIMES" "T"
WHERE
"T" . "CALENDAR_QUARTER_DESC" = '1999-01'
OR "T" . "CALENDAR_QUARTER_DESC" = '1999-02'
)
) "VW_ST_A3F94988"
,"SH" . "TIMES" "T"
,"SYS" . "SYS_TEMP_0FD9D660B_340EC8" "T1"
WHERE
"VW_ST_A3F94988" . "ITEM_2" = "T" . "TIME_ID"
AND "VW_ST_A3F94988" . "ITEM_1" = "T1" . "C0"
AND (
"T" . "CALENDAR_QUARTER_DESC" = '1999-01'
OR "T" . "CALENDAR_QUARTER_DESC" = '1999-02'
)
GROUP BY
"T1" . "C1"
,"T" . "CALENDAR_QUARTER_DESC"
大家是否注意到了有一个临时表“SYS_TEMP_0FD9D660B_340EC8”被使用了呢?
这个转换我们上面没有提到,实际这是一个”Star Transformation“的附加功能,他把属于“c.cust_state_province = 'CA'”的“customers” 缓存到一个临时表中,避免了在和“sales”表数据结合是再次对”TABLE ACCESS FULL CUSTOMERS“
我们看看如果不使用临时表的情况。
SQL> alter session set star_transformation_enabled='temp_disable';
セッションが変更されました。
SQL> SELECT c.cust_city,
t.calendar_quarter_desc,
SUM(s.amount_sold) sales_amount
2 3 4 FROM sh.sales s,
5 sh.times t,
6 sh.customers c,
7 sh.channels ch
WHERE s.time_id = t.time_id
AND s.cust_id = c.cust_id
AND s.channel_id = ch.channel_id
AND c.cust_state_province = 'CA'
AND ch.channel_desc = 'Internet'
AND t.calendar_quarter_desc IN ('1999-01','1999-02')
GROUP BY c.cust_city, t.calendar_quarter_desc;
8 9 10 11 12 13 14
実行計画
----------------------------------------------------------
Plan hash value: 2346071880
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 405 | 27540 | 1337 (1)| 00:00:17 | | |
| 1 | HASH GROUP BY | | 405 | 27540 | 1337 (1)| 00:00:17 | | |
|* 2 | HASH JOIN | | 1948 | 129K| 1336 (1)| 00:00:17 | | |
|* 3 | TABLE ACCESS FULL | CUSTOMERS | 383 | 9958 | 405 (1)| 00:00:05 | | |
|* 4 | HASH JOIN | | 1949 | 81858 | 931 (1)| 00:00:12 | | |
|* 5 | TABLE ACCESS FULL | TIMES | 183 | 2928 | 18 (0)| 00:00:01 | | |
| 6 | VIEW | VW_ST_B1772830 | 1953 | 50778 | 913 (1)| 00:00:11 | | |
| 7 | NESTED LOOPS | | 1953 | 108K| 487 (1)| 00:00:06 | | |
| 8 | PARTITION RANGE SUBQUERY | | 1952 | 54676 | 457 (1)| 00:00:06 |KEY(SQ)|KEY(SQ)|
| 9 | BITMAP CONVERSION TO ROWIDS| | 1952 | 54676 | 457 (1)| 00:00:06 | | |
| 10 | BITMAP AND | | | | | | | |
| 11 | BITMAP MERGE | | | | | | | |
| 12 | BITMAP KEY ITERATION | | | | | | | |
| 13 | BUFFER SORT | | | | | | | |
|* 14 | TABLE ACCESS FULL | CHANNELS | 1 | 13 | 3 (0)| 00:00:01 | | |
|* 15 | BITMAP INDEX RANGE SCAN| SALES_CHANNEL_BIX | | | | |KEY(SQ)|KEY(SQ)|
| 16 | BITMAP MERGE | | | | | | | |
| 17 | BITMAP KEY ITERATION | | | | | | | |
| 18 | BUFFER SORT | | | | | | | |
|* 19 | TABLE ACCESS FULL | TIMES | 183 | 2928 | 18 (0)| 00:00:01 | | |
|* 20 | BITMAP INDEX RANGE SCAN| SALES_TIME_BIX | | | | |KEY(SQ)|KEY(SQ)|
| 21 | BITMAP MERGE | | | | | | | |
| 22 | BITMAP KEY ITERATION | | | | | | | |
| 23 | BUFFER SORT | | | | | | | |
|* 24 | TABLE ACCESS FULL | CUSTOMERS | 383 | 6128 | 405 (1)| 00:00:05 | | |
|* 25 | BITMAP INDEX RANGE SCAN| SALES_CUST_BIX | | | | |KEY(SQ)|KEY(SQ)|
| 26 | TABLE ACCESS BY USER ROWID | SALES | 1 | 29 | 456 (1)| 00:00:06 | ROWID | ROWID |
-----------------------------------------------------------------------------------------------------------------------
是不是在24行,和“sales”表数据结合是再次对”TABLE ACCESS FULL CUSTOMERS“了呢。
最后,我们来总结一下如何启用”Star Transformation“。
1. 参数“star_transformation_enabled”设成“True”或“temp_disable”。
2. "Fact Table"的检索列上有Bitmap索引。
以上。
2021/04/01 @ Dalian