11g引入了大量compress相关的特性,其中之一便是dbms_compression包;GET_COMPRESSION_RATIO函数可以帮助我们了解压缩某个表后各种可能的影响。换而言之,这个函数可以让我们在具体实施表压缩技术或者测试前,对于压缩后的效果能有一个基本的印象。该包在11gr2中被首次引入,故而使用之前版本的包括11gr1都无缘得用。其次除OLTP压缩模式之外的柱形混合压缩只能在基于Exdata存储的表空间上实现。使用DBMS_COMPRESSION包获取的相关压缩信息是十分准确的,因为在评估过程中Oracle通过实际采样并建立模型表以尽可能还原逼真的数据。 我们可以通过trace来分析其评估过程中的具体操作,可以分成2步: 1. 建立原表的样本表,其采样值基于原表的大小:
SQL> create table samp_dss_nation tablespace SCRATCH as select * from dss_nation sample block (50);
Table created.
2. 基于采用表建立对应压缩类型的模型表:
SQL> create table model_dss_nation tablespace SCRATCH compress for query high as select * from samp_dss_nation;
create table model_dss_nation tablespace SCRATCH compress for query high as select * from samp_dss_nation
*
ERROR at line 1:
ORA-64307: hybrid columnar compression is only supported in tablespaces
residing on Exadata storage
可以看到在实际建立过程中Oracle将拒绝在非Exdata存储的表空间上建立该类柱形混合压缩(包括:COMP_FOR_QUERY_HIGH,COMP_FOR_QUERY_LOW,COMP_FOR_ARCHIVE_HIGH,CO MP_FOR_ARCHIVE_LOW)。但DBMS_COMPRESSION在进行评估时可以绕过Oracle对于该类操作的LOCK. 要在没有Exdata存储设备的情况下使用dbms_compression包评测OLTP压缩模式外的柱状混合压缩模式时 (hybrid columnar compression is only supported in tablespaces residing on Exadata storage),首先需要打上patch 8896202:
[oracle@rh2 admin]$ /s01/dbhome_1/OPatch/opatch lsinventory
Invoking OPatch 11.1.0.6.6
Oracle Interim Patch Installer version 11.1.0.6.6
Copyright (c) 2009, Oracle Corporation. All rights reserved.
Oracle Home : /s01/dbhome_1
Central Inventory : /s01/oraInventory
from : /etc/oraInst.loc
OPatch version : 11.1.0.6.6
OUI version : 11.2.0.1.0
OUI location : /s01/dbhome_1/oui
Log file location : /s01/dbhome_1/cfgtoollogs/opatch/opatch2010-06-02_23-08-33PM.log
Patch history file: /s01/dbhome_1/cfgtoollogs/opatch/opatch_history.txt
Lsinventory Output file location : /s01/dbhome_1/cfgtoollogs/opatch/lsinv/lsinventory2010-06-02_23-08-33PM.txt
--------------------------------------------------------------------------------
Installed Top-level Products (1):
Oracle Database 11g 11.2.0.1.0
There are 1 products installed in this Oracle Home.
Interim patches (1) :
Patch 8896202 : applied on Wed Jun 02 21:55:44 CST 2010
Unique Patch ID: 11909460
Created on 29 Oct 2009, 15:21:45 hrs US/Pacific
Bugs fixed:
8896202
该patch用以:ENABLE COMPRESSION ADVISOR TO ESTIMATE EXADATA HCC COMPRESSION RATIOS
接着我们还需要运行被修改后的DBMSCOMP包创建SQL,具体操作为:
SQL> @?/rdbms/admin/prvtcmpr.plb
Package created.
Grant succeeded.
Package body created.
No errors.
Package body created.
No errors.
Type body created.
No errors.
SQL> @?/rdbms/admin/dbmscomp.sql
Package created.
Synonym created.
Grant succeeded.
No errors.
DBMS_COMPRESSION包在对表压缩进行评估时,默认表最少数据为1000000行,可能在你的测试库中没有这么多数据,我们可以修改这个下限;
通过将COMP_RATIO_MINROWS常数修改为1后,就可以分析最小为1行的表了:
SQL>create or replace package sys.dbms_compression authid current_user is
COMP_NOCOMPRESS CONSTANT NUMBER := 1;
COMP_FOR_OLTP CONSTANT NUMBER := 2;
COMP_FOR_QUERY_HIGH CONSTANT NUMBER := 4;
COMP_FOR_QUERY_LOW CONSTANT NUMBER := 8;
COMP_FOR_ARCHIVE_HIGH CONSTANT NUMBER := 16;
COMP_FOR_ARCHIVE_LOW CONSTANT NUMBER := 32;
COMP_RATIO_MINROWS CONSTANT NUMBER := 10;
COMP_RATIO_ALLROWS CONSTANT NUMBER := -1;
PROCEDURE get_compression_ratio(scratchtbsname IN varchar2,
ownname IN varchar2,
tabname IN varchar2,
partname IN varchar2,
comptype IN number,
blkcnt_cmp OUT PLS_INTEGER,
blkcnt_uncmp OUT PLS_INTEGER,
row_cmp OUT PLS_INTEGER,
row_uncmp OUT PLS_INTEGER,
cmp_ratio OUT NUMBER,
comptype_str OUT varchar2,
subset_numrows IN number DEFAULT COMP_RATIO_MINROWS);
function get_compression_type(ownname IN varchar2,
tabname IN varchar2,
row_id IN rowid) return number;
PROCEDURE incremental_compress(ownname IN dba_objects.owner%type,
tabname IN dba_objects.object_name%type,
partname IN dba_objects.subobject_name%type,
colname IN varchar2,
dump_on IN number default 0,
autocompress_on IN number default 0,
where_clause IN varchar2 default '');
end dbms_compression;
Package created.
SQL> alter package dbms_compression compile body;
Package body altered.
接下来我们通过建立一个基于TPC-D的测试的Schema,保证各表上有较多的数据,并且数据有一定的拟真度:
SQL> select table_name,num_rows,blocks from user_tables ;
TABLE_NAME NUM_ROWS BLOCKS
------------------------------ ---------- ----------
DSS_SUPPLIER 20000 496
DSS_PART 400000 7552
DSS_REGION 5 5
DSS_PARTSUPP 1600000 29349
DSS_LINEITEM 12000000 221376
DSS_ORDER 3000000 48601
DSS_CUSTOMER 300000 6922
DSS_NATION 25 5
现在可以进行压缩评估了,我们针对测试模型Schema编辑以下匿名块并运行
SQL> set serveroutput on;
SQL> declare
cmp_blk_cnt binary_integer;
uncmp_blk_cnt binary_integer;
cmp_rows binary_integer;
uncmp_rows binary_integer;
cmp_ratio number;
cmp_typ varchar2(100);
BEGIN
for i in (SELECT TABLE_NAME
from dba_tables
where compression = 'DISABLED'
and owner = 'MACLEAN' and num_rows>1000000) loop
for j in 1 .. 5 loop
dbms_compression.get_compression_ratio(scratchtbsname => 'SCRATCH',
ownname => 'MACLEAN',
tabname => i.table_name,
partname => NULL,
comptype => power(2, j),
blkcnt_cmp => cmp_blk_cnt,
blkcnt_uncmp => uncmp_blk_cnt,
row_cmp => cmp_rows,
row_uncmp => uncmp_rows,
cmp_ratio => cmp_ratio,
comptype_str => cmp_typ);
dbms_output.put_line(i.table_name || '--' || 'compress_type is ' ||
cmp_typ || ' ratio :' ||
to_char(cmp_ratio, '99.9') || '%');
end loop;
end loop;
end;
/
DSS_ORDER--compress_type is "Compress For OLTP" ratio : 1.1%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_ORDER--compress_type is "Compress For Query High" ratio : 2.7%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_ORDER--compress_type is "Compress For Query Low" ratio : 1.7%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_ORDER--compress_type is "Compress For Archive High" ratio : 2.9%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_ORDER--compress_type is "Compress For Archive Low" ratio : 2.7%
DSS_PARTSUPP--compress_type is "Compress For OLTP" ratio : .9%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_PARTSUPP--compress_type is "Compress For Query High" ratio : 1.8%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_PARTSUPP--compress_type is "Compress For Query Low" ratio : 1.2%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_PARTSUPP--compress_type is "Compress For Archive High" ratio : 1.9%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_PARTSUPP--compress_type is "Compress For Archive Low" ratio : 1.8%
DSS_LINEITEM--compress_type is "Compress For OLTP" ratio : 1.4%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_LINEITEM--compress_type is "Compress For Query High" ratio : 3.5%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_LINEITEM--compress_type is "Compress For Query Low" ratio : 2.3%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_LINEITEM--compress_type is "Compress For Archive High" ratio : 4.3%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
DSS_LINEITEM--compress_type is "Compress For Archive Low" ratio : 3.7%
PL/SQL procedure successfully completed.
可以从上述测试看到,"Compress For Archive High"压缩率最高,该类型最适合于数据归档存储,但其算法复杂度高于"Compress For Archive Low",压缩耗时亦随之上升。 总体压缩率都较低,这同TPC-D测试的数据建模有一定关联,我们再使用一组TPC-H的测试数据来模拟压缩:
SQL> conn liu/liu;
Connected.
SQL> select num_rows,blocks,table_name from user_tables;
NUM_ROWS BLOCKS TABLE_NAME
---------- ---------- ------------------------------
3000000 46817 H_ORDER
300000 6040 H_CUSTOMER
12000000 221376 H_LINEITEM
25 5 H_NATION
400000 7552 H_PART
5 5 H_REGION
1600000 17491 H_PARTSUPP
20000 496 H_SUPPLIER
8 rows selected.
SQL> set serveroutput on;
SQL> declare
cmp_blk_cnt binary_integer;
uncmp_blk_cnt binary_integer;
cmp_rows binary_integer;
uncmp_rows binary_integer;
cmp_ratio number;
cmp_typ varchar2(100);
BEGIN
for i in (SELECT TABLE_NAME
from dba_tables
where compression = 'DISABLED'
and owner = 'LIU' and num_rows>1000000) loop
for j in 1 .. 5 loop
dbms_compression.get_compression_ratio(scratchtbsname => 'SCRATCH',
ownname => 'LIU',
tabname => i.table_name,
partname => NULL,
comptype => power(2, j),
blkcnt_cmp => cmp_blk_cnt,
blkcnt_uncmp => uncmp_blk_cnt,
row_cmp => cmp_rows,
row_uncmp => uncmp_rows,
cmp_ratio => cmp_ratio,
comptype_str => cmp_typ);
dbms_output.put_line(i.table_name || '--' || 'compress_type is ' ||
cmp_typ || ' ratio :' ||
to_char(cmp_ratio, '99.9') || '%');
end loop;
end loop;
end;
/
H_ORDER--compress_type is "Compress For OLTP" ratio : 1.1%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_ORDER--compress_type is "Compress For Query High" ratio : 5.2%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_ORDER--compress_type is "Compress For Query Low" ratio : 2.9%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_ORDER--compress_type is "Compress For Archive High" ratio : 7.2%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_ORDER--compress_type is "Compress For Archive Low" ratio : 5.5%
H_PARTSUPP--compress_type is "Compress For OLTP" ratio : .9%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_PARTSUPP--compress_type is "Compress For Query High" ratio : 5.1%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_PARTSUPP--compress_type is "Compress For Query Low" ratio : 2.7%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_PARTSUPP--compress_type is "Compress For Archive High" ratio : 7.2%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_PARTSUPP--compress_type is "Compress For Archive Low" ratio : 5.3%
H_LINEITEM--compress_type is "Compress For OLTP" ratio : 1.4%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_LINEITEM--compress_type is "Compress For Query High" ratio : 5.2%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_LINEITEM--compress_type is "Compress For Query Low" ratio : 3.0%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_LINEITEM--compress_type is "Compress For Archive High" ratio : 7.4%
Compression Advisor self-check validation successful. select count(*) on both
Uncompressed and EHCC Compressed format = 1000001 rows
H_LINEITEM--compress_type is "Compress For Archive Low" ratio : 5.6%
PL/SQL procedure successfully completed.
可以看到相比TPC-D的测试用数据,TPC-H建立的数据更具可压缩性。 PS: TPC-D represents a broad range of decision support (DS) applications that require complex, long running queries against large complex data structures. Real-world business questions were written against this model, resulting in 17 complex queries. The TPC Benchmark