Full Scans and the Highwater Mark 全扫描和高水标识
A final point of note regarding full table scans is that as the multiblock read calls for the scan are
made, Oracle will read blocks up to the highwater mark in the table. The highwater mark marks the
last block in the table that has ever had data written to it. To be technically correct, this is actually
called the low highwater mark. For your purposes, the low highwater mark is what I’ll be discussing
and I’ll refer to it generically as the highwater mark. For a more detailed discussion, please see the
Oracle documentation.
关于全表扫描最后的注意点是因为扫描使用多块读,Oracle将读取块达到表中的“高水标识”处。高水标识标记表中曾经被写入数据的最后一个块。 技术上正确的说,实际上应该称为“低高水标识”。为了你(容易理解)目的, “低高水标识”是我要讨论的,我就统称它是“高水标识”。更为详细的讨论,请参阅 Oracle文档。 (注:有的书把Highwater Mark翻译成“高水位线”)
When rows are inserted into a table, blocks are allocated and the rows are placed in the blocks.
Figure 3-2 shows how a table might look after a large insert to populate the table.
当行集插入表中,块随之分配而行存于块中。图3-2展示在一个大的insert操作填充表后,表可能的样子。
Figure 3-2. Blocks allocated to a table with rows indicated with a + 分配给表的存着行的块,用一个+表示
Over the course of normal operations, rows are deleted from the blocks. Figure 3-3 shows how the
table might look after a large number of rows have been deleted from the table.
在正常操作之后,有些行从块中删除了。图3-3展示在大量的行从表中删除后表可能的样子。
Figure 3-3. The blocks after rows have been deleted. The HWM remains unchanged. 行删除之后的块,HWM保持不变。
Even though almost all the rows have been deleted and some blocks have actually become totally
unused, the highwater mark remains the same. When a full scan operation occurs, all blocks up to the
highwater mark will be read in and scanned, even if they are empty. This means that many blocks that
don’t need to be read because they are empty will still be read. Listing 3-6 shows an example of how
highwater mark doesn’t change, even if all the rows in the table are deleted.
即使几乎所有的行被删而且某些块实际上完全没用上,高水标示却保持一样。当全扫描操作发生,所有包括在高水标识内的块将被读入和扫描,即使他们是空的。这意味着许多不需要读入的块,因为他们是空的,还是读入了。 列表3-6展示了即使所有表中的所有行被删除,高水标示不变的例子。
Listing 3-6. Highwater Mark
SQL> -- List number of allocated blocks (table has 800,000 rows)
SQL> -- The highwater mark is the last block containing data.
SQL> -- While this query doesn’t specifically show the HWM, it gives you an idea.
SQL>
SQL> select blocks from user_segments where segment_name = 'T2';
BLOCKS
---------------
12288
1 row selected.
SQL> -- List how many blocks contain data
SQL>
SQL> select count(distinct (dbms_rowid.rowid_block_number(rowid))) block_ct from t2 ;
BLOCK_CT
---------------
12122
1 row selected.
SQL> -- List the lowest and highest block numbers for this table
SQL>
SQL> select min(dbms_rowid.rowid_block_number(rowid)) min_blk,
max(dbms_rowid.rowid_block_number(rowid)) max_blk from t2 ;
MIN_BLK MAX_BLK
--------------- ---------------
1302492 1386248
1 row selected.
SQL> -- Check the space usage in the table
SQL> get space_usage.sql
1 declare
2 l_tabname varchar2(30) := '&1';
3 l_fs1_bytes number;
4 l_fs2_bytes number;
5 l_fs3_bytes number;
6 l_fs4_bytes number;
7 l_fs1_blocks number;
8 l_fs2_blocks number;
9 l_fs3_blocks number;
10 l_fs4_blocks number;
11 l_full_bytes number;
12 l_full_blocks number;
13 l_unformatted_bytes number;
14 l_unformatted_blocks number;
15 begin
16 dbms_space.space_usage(
17 segment_owner => user,
18 segment_name => l_tabname,
19 segment_type => 'TABLE',
20 fs1_bytes => l_fs1_bytes,
21 fs1_blocks => l_fs1_blocks,
22 fs2_bytes => l_fs2_bytes,
23 fs2_blocks => l_fs2_blocks,
24 fs3_bytes => l_fs3_bytes,
25 fs3_blocks => l_fs3_blocks,
26 fs4_bytes => l_fs4_bytes,
27 fs4_blocks => l_fs4_blocks,
28 full_bytes => l_full_bytes,
29 full_blocks => l_full_blocks,
30 unformatted_blocks => l_unformatted_blocks,
31 unformatted_bytes => l_unformatted_bytes
32 );
33 dbms_output.put_line('0-25% Free = '||l_fs1_blocks||' Bytes = '||l_fs1_bytes);
34 dbms_output.put_line('25-50% Free = '||l_fs2_blocks||' Bytes = '||l_fs2_bytes);
35 dbms_output.put_line('50-75% Free = '||l_fs3_blocks||' Bytes = '||l_fs3_bytes);
36 dbms_output.put_line('75-100% Free = '||l_fs4_blocks||' Bytes = '||l_fs4_bytes);
37 dbms_output.put_line('Full Blocks = '||l_full_blocks||' Bytes = '||l_full_bytes);
38* end;
SQL>
SQL> @space_usage T2
0-25% Free = 0 Bytes = 0
25-50% Free = 0 Bytes = 0
50-75% Free = 0 Bytes = 0
75-100% Free = 16 Bytes = 131072
Full Blocks = 12121 Bytes = 99295232
PL/SQL procedure successfully completed.
SQL> -- Note that most blocks are full
SQL> -- A full table scan would have to read all the blocks (12137 total)
SQL>
SQL> -- Delete all the rows from the table
SQL> delete from t2 ;
800000 rows deleted.
SQL>
SQL> commit;
Commit complete.
SQL>-- Check the space usage after all rows are deleted
SQL>@space_usege T2
0-25% Free = 0 Bytes = 0
25-50% Free = 0 Bytes = 0
50-75% Free = 0 Bytes = 0
75-100% Free = 12137 Bytes = 99426304
Full Blocks = 0 Bytes = 0
PL/SQL procedure successfully completed.
SQL> -- Note that blocks are now free but the same space is still consumed
SQL> -- A full table scan would still read 12137 blocks
SQL> -- List number of blocks (table has 0 rows)
SQL> select blocks from user_segments where segment_name='T2';
BLOCKS
---------------
12288
1 row selected.
SQL> -- List how many blocks contain data
SQL>
SQL> select count(distinct (dbms_rowid.rowid_block_number(rowid))) block_ct from t2 ;
BLOCK_CT
---------------
0
1 row selected.
SQL> -- Execute a full table scan and note the consistent gets(logical block reads)
SQL>
SQL> set autotrace traceonly
SQL> select * from t2;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | 2674 (1)| 00:00:33 |
| 1 | TABLE ACCESS FULL| T2 | 1 | 65 | 2674 (1)| 00:00:33 |
--------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
12148 consistent gets
11310 physical reads
0 redo size
332 bytes sent via SQL*Net to client
370 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
SQL> set autotrace off
SQL>
SQL> -- Truncate the table to deallocate the space and reset the HWM
SQL> truncate table t2 ;
Table truncated.
SQL> -- Check the space usage after table is truncated
SQL> @space_usage T2
0-25% Free = 0 Bytes = 0
25-50% Free = 0 Bytes = 0
50-75% Free = 0 Bytes = 0
75-100% Free = 0 Bytes = 0
Full Blocks = 0 Bytes = 0
PL/SQL procedure successfully completed.
SQL> -- Note that the space has been deallocated
SQL>
SQL> -- List number of blocks (table has 0 rows and all space recovered)
SQL> select blocks from user_segments where segment_name = 'T2';
BLOCKS
---------------
8
1 row selected.
SQL> set autotrace traceonly
SQL> select * from t2 ;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | 2674 (1)| 00:00:33 |
| 1 | TABLE ACCESS FULL| T2 | 1 | 65 | 2674 (1)| 00:00:33 |
--------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
332 bytes sent via SQL*Net to client
370 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
SQL> set autotrace off
I hope this example illustrates that even when a full table scan is the “right” plan operation choice,
the overhead of reading additional empty blocks can mean performance takes a significant hit. For
tables that are frequently loaded and unloaded (using DELETE instead of TRUNCATE), you may discover
that response time suffers. This occurs often with tables that are used for ETL or any form of
load/process/unload activity. Now that you know how full scan behavior can be affected, you will be
able to diagnose and correct related performance problems more easily.
我希望通过这个例子演示:即使全表扫描是“正确的”计划操作选择,读附加的空块的开销也意味着性能遭受巨大打击。对于那些经常加载和卸载数据的表而言(使用DELETE而不是TRUNCATE),你可能发现响应时间极慢。这种情况尤其在表被用做ETL(数据抽取Extraction Transformation And Loading) 或者任意形式的加载、处理、卸载动作。 既然你知道了全扫描动作会被如何影响了,你就能更加容易的诊断和纠正相关的性能问题。
译者注: 在本机实验11g,T2表插入1000000条数据。然后delete整个表这种情况,再select count(*) 并没有耗费很多时间,有可能11g有所优化,见实验结果。