全表扫描是Oracle访问数据库表是较为常见的访问方式之一。很多朋友一看到SQL语句执行计划中的全表扫描,就要考虑对其进行修理一番。全表扫描的存在,的确存在可能优化的余地。但事实上很多时候全表扫描也并非是最低效的,完全要看不同的情形与场合,任一方式都是有利有弊的,也就是具体情况要具体分析。本文描述了什么是全表扫描以及何时发生全表扫描,何时全表扫描才低效。
本文涉及到的相关链接:
高水位线和全表扫描
启用 AUTOTRACE 功能
Oracle 测试常用表BIG_TABLE
Oracle db_file_mulitblock_read_count参数
1、什么是全表扫描?
全表扫描就是扫表表中所有的行,实际上是扫描表中所有的数据块,因为Oracle中最小的存储单位是Oracle block。
扫描所有的数据块就包括高水位线以内的数据块,即使是空数据块在没有被释放的情形下也会被扫描而导致I/O增加。
在全表扫描期间,通常情况下,表上这些相邻的数据块被按顺序(sequentially)的方式访问以使得一次I/O可以读取多个数据块。
一次读取更多的数据块有助于全表扫描使用更少的I/O,对于可读取的数据块被限制于参数DB_FILE_MULTIBLOCK_READ_COUNT。
2、何时发生全表扫描?
a、表上的索引失效或无法被使用的情形(如对谓词使用函数、计算、NULL值、不等运算符、类型转换)
b、查询条件返回了整个表的大部分数据
c、使用了并行方式访问表
d、使用full 提示
e、统计信息缺失时使得Oracle认为全表扫描比索引扫描更高效
f、表上的数据块小于DB_FILE_MULTIBLOCK_READ_COUNT值的情形可能产生全表扫描
3、演示全表扫描的情形
a、准备演示环境
scott@ORA11G> select * from v$version where rownum<2;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
--创建表t
scott@ORA11G> CREATE TABLE t
2 AS
3 SELECT rownum AS n, rpad('*',100,'*') AS pad
4 FROM dual
5 CONNECT BY level <= 1000;
Table created.
--添加索引
scott@ORA11G> create unique index t_pk on t(n);
Index created.
scott@ORA11G> alter table t add constraint t_pk primary key(n) using index t_pk;
Table altered.
--收集统计信息
scott@ORA11G> execute dbms_stats.gather_table_stats('SCOTT','T',cascade=>true);
PL/SQL procedure successfully completed.
scott@ORA11G> set autot trace exp;
scott@ORA11G> select count(*) from t; --->count(*)的时候使用了索引快速扫描
Execution Plan
----------------------------------------------------------
Plan hash value: 454320086
----------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| T_PK | 1000 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------
scott@ORA11G> set autot off;
scott@ORA11G> alter table t move; --->进行move table
Table altered.
-->move 之后索引失效,如下所示
scott@ORA11G> @idx_info
Enter value for owner: scott
Enter value for table_name: t
Table Name INDEX_NAME CL_NAM CL_POS STATUS IDX_TYP DSCD
------------- -------------- -------------------- ------ -------- --------------- ----
T T_PK N 1 UNUSABLE NORMAL ASC
b、索引失效导致全表扫描
scott@ORA11G> set autot trace exp;
scott@ORA11G> select count(*) from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| T | 1000 | 7 (0)| 00:00:01 |
-------------------------------------------------------------------
scott@ORA11G> set autot off;
scott@ORA11G> alter index t_pk rebuild; -->重建索引
Index altered.
scott@ORA11G> @idx_info
Enter value for owner: scott
Enter value for table_name: t
Table Name INDEX_NAME CL_NAM CL_POS STATUS IDX_TYP DSCD
-------------- ---------------- -------------------- ------ -------- --------------- ----
T T_PK N 1 VALID NORMAL ASC
c、返回了整个表的大部分数据使用了全表扫描
scott@ORA11G> select count(pad) from t where n<=990;
Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 105 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 105 | | |
|* 2 | TABLE ACCESS FULL| T | 991 | 101K| 7 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N"<=990)
--返回小部分数据时,使用的是索引扫描
scott@ORA11G> select count(pad) from t where n<=10;
Execution Plan
----------------------------------------------------------
Plan hash value: 4270555908
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 105 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 105 | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T | 10 | 1050 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | T_PK | 10 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("N"<=10)
d、使用并行方式访问表时使用了全表扫描
scott@ORA11G> select /*+ parallel(3) */ count(pad) from t where n<=10;
Execution Plan
----------------------------------------------------------
Plan hash value: 3126468333
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 105 | 3 (0)| 00:00:01 | | | |
| 1 | SORT AGGREGATE | | 1 | 105 | | | | | |
| 2 | PX COORDINATOR | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 105 | | | Q1,00 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 105 | | | Q1,00 | PCWP | |
| 5 | PX BLOCK ITERATOR | | 10 | 1050 | 3 (0)| 00:00:01 | Q1,00 | PCWC | |
|* 6 | TABLE ACCESS FULL| T | 10 | 1050 | 3 (0)| 00:00:01 | Q1,00 | PCWP | |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - filter("N"<=10)
Note
-----
- Degree of Parallelism is 3 because of hint
--Author : Robinson
--Blog : http://blog.csdn.net/robinson_0612
e、使用full提示时使用了全表扫描
scott@ORA11G> select /*+ full(t) */ count(pad) from t where n<=10;
Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 105 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 105 | | |
|* 2 | TABLE ACCESS FULL| T | 10 | 1050 | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N"<=10)
f、统计信息缺失导致全表扫描的情形
scott@ORA11G> exec dbms_stats.delete_table_stats('SCOTT','T');
PL/SQL procedure successfully completed.
scott@ORA11G> select count(pad) from t where n<=10;
Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 65 | | |
|* 2 | TABLE ACCESS FULL| T | 10 | 650 | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("N"<=10)
Note
-----
- dynamic sampling used for this statement (level=2)
--上面的执行计划使用了全表扫描,而且提示使用了动态采样,也就是缺乏统计信息
--表上的数据块小于DB_FILE_MULTIBLOCK_READ_COUNT值的情形可能产生全表扫描的情形不演示
4、全表扫描何时低效?
--先来做几个实验
a、演示表上的相关信息
scott@ORA11G> @idx_info
Enter value for owner: scott
Enter value for table_name: big_table
Table Name Index Name CL_NAM CL_POS Status IDX_TYP DSCD
------------------------- ------------------------- --------- ------ -------- --------------- ----
BIG_TABLE BIG_TABLE_PK ID 1 VALID NORMAL ASC
scott@ORA11G> @idx_stat
Enter value for input_table_name: big_table
Enter value for owner: scott
AVG LEAF BLKS AVG DATA BLKS
BLEV IDX_NAME LEAF_BLKS DST_KEYS PER KEY PER KEY CLUST_FACT LAST_ANALYZED TB_BLKS TB_ROWS
---- -------------- ---------- ---------- ------------- ------------- ---------- ------------------ ---------- ----------
1 BIG_TABLE_PK 208 100000 1 1 1483 20130524 10:45:51 1515 100000
--数据库参数设置
scott@ORA11G> show parameter optimizer_index_
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_index_caching integer 0
optimizer_index_cost_adj integer 100
scott@ORA11G> show parameter optimizer_mode
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_mode string ALL_ROWS
b、查询返回20%数据行的情形
scott@ORA11G> alter system flush buffer_cache;
scott@ORA11G> select sum(object_id),avg(object_id) from big_table where id between 20000 and 40000;
Execution Plan
----------------------------------------------------------
Plan hash value: 3098837282 -- 执行计划中,使用了索引范围扫描
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 18 | 341 (0)| 00:00:05 |
| 1 | SORT AGGREGATE | | 1 | 18 | | |
| 2 | TABLE ACCESS BY INDEX ROWID| BIG_TABLE | 20046 | 352K| 341 (0)| 00:00:05 |
|* 3 | INDEX RANGE SCAN | BIG_TABLE_PK | 20046 | | 43 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("ID">=20000 AND "ID"<=40000)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
351 consistent gets
351 physical reads
0 redo size
427 bytes sent via SQL*Net to client
349 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
scott@ORA11G> alter system flush buffer_cache;
scott@ORA11G> select /*+ full(big_table) */ sum(object_id),avg(object_id) from big_table where id between 20000 and 40000;
Execution Plan
----------------------------------------------------------
Plan hash value: 599409829 ---- 使用了提示执行为全表扫描
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 18 | 413 (1)| 00:00:05 |
| 1 | SORT AGGREGATE | | 1 | 18 | | |
|* 2 | TABLE ACCESS FULL| BIG_TABLE | 20046 | 352K| 413 (1)| 00:00:05 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("ID"<=40000 AND "ID">=20000)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1486 consistent gets
1484 physical reads
0 redo size
427 bytes sent via SQL*Net to client
349 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
--注意对比上面两次操作中的consistent gets与physical reads
c、查询返回30%数据行的情形
scott@ORA11G> alter system flush buffer_cache;
scott@ORA11G> select sum(object_id),avg(object_id) from big_table where id between 20000 and 50000;
Execution Plan
----------------------------------------------------------
Plan hash value: 599409829 --->尽管返回数据的总行数为30%,而此时优化器使用了全表扫描
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 18 | 413 (1)| 00:00:05 |
| 1 | SORT AGGREGATE | | 1 | 18 | | |
|* 2 | TABLE ACCESS FULL| BIG_TABLE | 30012 | 527K| 413 (1)| 00:00:05 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("ID"<=50000 AND "ID">=20000)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1486 consistent gets
1484 physical reads
0 redo size
427 bytes sent via SQL*Net to client
349 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
--下面使用提示来强制优化器走索引扫描
scott@ORA11G> alter system flush buffer_cache;
scott@ORA11G> select /*+ index(big_table big_table_pk) */ sum(object_id),avg(object_id)
2 from big_table where id between 20000 and 50000;
Execution Plan
----------------------------------------------------------
Plan hash value: 3098837282
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 18 | 511 (1)| 00:00:07 |
| 1 | SORT AGGREGATE | | 1 | 18 | | |
| 2 | TABLE ACCESS BY INDEX ROWID| BIG_TABLE | 30012 | 527K| 511 (1)| 00:00:07 |
|* 3 | INDEX RANGE SCAN | BIG_TABLE_PK | 30012 | | 64 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("ID">=20000 AND "ID"<=50000)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
526 consistent gets
526 physical reads
0 redo size
427 bytes sent via SQL*Net to client
349 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
--注意观察每一次测试时所耗用的物理读与逻辑读
--从上面的测试可以看出,当表上所返回的数据行数接近于表上的30%时,Oracle 倾向于使用全表扫描
--而对于表上所返回的数据行数接近于表上的30%的情形,我们给与索引提示,此时比全表扫描更高效,即全表扫描是低效的
--笔者同时测试了数据返回总行数接近80%的情形以及创建了一个百万记录的进行对比测试
--大致结论,如果查询所返回的数据的总行数仅仅是表上数据的百分之八十以下,而使用了全表扫描,即可认为该全表扫描是低效的
--注:
--具体情况需要具体分析,如果你的表是千万级的,返回总数据的百分之零点几都会导致很大的差异
--其次,表上的索引应具有良好的聚簇因子,如不然,测试的结果可能有天壤之别
--最后,上面所描述的返回总行数应与执行结果返回的行数有差异,是指多少行参与了sum(object_id)
5、小表的全表扫描是否高效?
--使用scott下dept表,仅有4行数据
scott@ORA11G> select * from dept where deptno>10;
3 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2985873453 --->执行计划选择了索引扫描
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 60 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| DEPT | 3 | 60 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | PK_DEPT | 3 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("DEPTNO">10)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets -->使用了4次逻辑读
0 physical reads
0 redo size
515 bytes sent via SQL*Net to client
349 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
-->下面强制使用全表扫描
scott@ORA11G> select /*+ full(dept) */ * from dept where deptno>10;
3 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3383998547
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 60 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| DEPT | 3 | 60 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("DEPTNO">10)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
4 consistent gets -->此时的逻辑读同样为4次
0 physical reads
0 redo size
515 bytes sent via SQL*Net to client
349 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
--下面来看看count(*)的情形
scott@ORA11G> select count(*) from dept;
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3051237957 --->执行计划选择了索引全扫描
--------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 1 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FULL SCAN| PK_DEPT | 4 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1 consistent gets -->逻辑读仅为1次
0 physical reads
0 redo size
335 bytes sent via SQL*Net to client
349 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
-->下面强制使用全表扫描
scott@ORA11G> select /*+ full(dept) */ count(*) from dept;
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 315352865
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| DEPT | 4 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets -->使用了3次逻辑读
0 physical reads
0 redo size
335 bytes sent via SQL*Net to client
349 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
--对于小表,从上面的情形可以看出,使用索引扫描也是比全表扫描高效
--因此,建议始终为小表建立索引
更多参考
DML Error Logging 特性
PL/SQL --> 游标
PL/SQL --> 隐式游标(SQL%FOUND)
批量SQL之 FORALL 语句
批量SQL之 BULK COLLECT 子句
PL/SQL 集合的初始化与赋值
PL/SQL 联合数组与嵌套表
PL/SQL 变长数组
PL/SQL --> PL/SQL记录
SQL tuning 步骤
高效SQL语句必杀技
父游标、子游标及共享游标
绑定变量及其优缺点
dbms_xplan之display_cursor函数的使用
dbms_xplan之display函数的使用
执行计划中各字段各模块描述
使用 EXPLAIN PLAN 获取SQL语句执行计划