delete from t where create_time < to_date('2017-01-01','yyyy-mm-dd');
当我们看到这条语句时,会想到什么呢? 一条再简单不过的按照条件删除数据库的操作。 如果大量存在,会不会引起系统性能问题呢?
当这个表的数据足够大时,按照这样的方式来清除数据,代价无疑是非常高昂的。
我们提出使用分区的方式来解决这个问题。比如 保留历史数据的骨折是,删除最早一个月的数据,按照这个需求,将需要删除数据的表,改为分区表。 按照每个月一个分区的方式创建分区表。
这样做的结果是,以前的delete操作就变成了下面的操作:
alter table t drop partition p_name
或者
alter table t truncate partiton p_name ;
这个操作对已一个几GB数据的分区表来讲,执行时间不超过10S。
比起使用分区来操作数据,DELETE方式操作数据的问题是显而易见的,至少存在如下两点:
在DELETE操作中,SQL语句首先要扫描表或者索引找到符合条件的记录,然后把他们删除,这个过程中将消耗大量的CPU资源和产生大量的I/O,同时还会产生大量的undo数据。
下面是一个简单的栗子,我们可以看出delete相对于drop或者truncate对资源的消耗情况:
SQL> drop table t purge ; ---> 不放入回收站,直接删除
Table dropped
SQL> create table t as select a.OBJECT_ID,a.OBJECT_NAME from dba_objects a;
Table created
SQL> exec dbms_stats.gather_table_stats(user,'t');---->表分析
PL/SQL procedure successfully completed
SQL>
注释:执行drop table xx 语句
drop后的表被放在回收站(user_recyclebin)里,而不是直接删除掉。这样,回收站里的表信息就可以被恢复,或彻底清除。
通过查询回收站user_recyclebin获取被删除的表信息,然后使用语句
flashback tableto before drop [rename to
];
将回收站里的表恢复为原名称或指定新名称,表中数据不会丢失。
若要彻底删除表,则使用语句:drop tablepurge;
清除回收站里的信息
清除指定表:purge table;
清除当前用户的回收站:purge recyclebin;
SQL> purge recyclebin;
Done
清除所有用户的回收站:purge dba_recyclebin;
不放入回收站,直接删除则是:drop table xx purge;
oracle@entel1:[/oracle]$sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Tue Jan 17 22:17:53 2017
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> conn xgj/smart
Connected.
SQL>
SQL> set autotrace trace exp stat; -->显示执行计划和统计结果
SQL> delete from t where t.object_id < 10000;
9811 rows deleted.
Execution Plan
----------------------------------------------------------
Plan hash value: 3335594643
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 4420 | 22100 | 43 (0)| 00:00:01 |
| 1 | DELETE | T | | | | |
|* 2 | TABLE ACCESS FULL| T | 4420 | 22100 | 43 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("T"."OBJECT_ID"<10000)
Statistics
----------------------------------------------------------
18 recursive calls
10465 db block gets
162 consistent gets
0 physical reads
2640572 redo size
831 bytes sent via SQL*Net to client
797 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
9811 rows processed
SQL>
SET AUTOT 用法
用法:
SET AUTOT[RACE] {OFF | ON | TRACE[ONLY]} [EXP[LAIN]]
[STAT[ISTICS]]SET AUTOT[RACE] OFF 停止AutoTrace SET AUTOT[RACE] ON
开启AutoTrace,显示AUTOTRACE信息和SQL执行结果 SET AUTOT[RACE] TRACEONLY
开启AutoTrace,仅显示AUTOTRACE信息 SET AUTOT[RACE] ON EXPLAIN
开启AutoTrace,仅显示AUTOTRACE的EXPLAIN信息 SET AUTOT[RACE] ON
STATISTICS开启AutoTrace,仅显示AUTOTRACE的STATISTICS信息
在这种情况下,DELETE操作产生了 10465 db block gets + 162 consistent gets 数据块读,同时产生了 2640572 redo size 的 redo日志 (2640572 /1024/1024 = 2.5M 大小)
即使我们给这个表创建了索引,依然无法避免这种资源消耗
SQL> rollback;
Rollback complete.
SQL> create index ind_t on t(object_id);
Index created.
SQL> exec dbms_stats.gather_index_stats(user,'ind_t');
PL/SQL procedure successfully completed.
SQL> delete from t where t.object_id < 10000;
9811 rows deleted.
Execution Plan
----------------------------------------------------------
Plan hash value: 3974964266
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 4420 | 22100 | 11 (0)| 00:00:01 |
| 1 | DELETE | T | | | | |
|* 2 | INDEX RANGE SCAN| IND_T | 4420 | 22100 | 11 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T"."OBJECT_ID"<10000)
Statistics
----------------------------------------------------------
15 recursive calls
10596 db block gets
36 consistent gets
0 physical reads
2801592 redo size
837 bytes sent via SQL*Net to client
796 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
9811 rows processed
SQL> rollback;
Rollback complete.
可以看到即使加了索引,资源的消耗情况差不多,都是 10596 db block gets + 36 consistent gets 。
相反的 truncate 和 drop 操作消耗的资源就小很多。
案例
Connected to Oracle Database 11g Enterprise Edition Release 11.2.0.4.0
Connected as xxx@xgj
SQL> --->创建分区表
SQL> create table t_part (object_id int ,object_name varchar2(2000) )
2 partition by range(object_id)
3 (
4 partition p1 values less than (6000),
5 partition p2 values less than (12000),
6 partition p3 values less than (18000),
7 partition p4 values less than (24000),
8 partition p5 values less than (30000),
9 partition pm values less than (maxvalue)
10 );
Table created
SQL> insert into t_part select object_id ,object_name from dba_objects;
35238 rows inserted
SQL> commit;
Commit complete
SQL> select count(1) from t_part partition(p1);
COUNT(1)
----------
5995
SQL>
下面我们使用sql_trace来跟踪下 drop和truncate 之类的DDL操作的相关信息。
SQL> alter session set tracefile_identifier='xgj_20170118';
Session altered
SQL> alter session set sql_trace=true;
Session altered
SQL> alter table t_part truncate partition p1;
Table truncated
SQL> truncate table t_part;
Table truncated
SQL> insert into t_part select object_id ,object_name from dba_objects;
35238 rows inserted
SQL> commit;
Commit complete
SQL> alter table t_part drop partition p1;
Table altered
SQL> alter session set sql_trace=false;
Session altered
SQL>
开启SQL跟踪后,会生成一个trace文件,通过初始化参数user_dump_dest配置其所在目录,该参数的值可以通过下面方法获取到:
SQL> select name, value from v$parameter where name = 'user_dump_dest' ;
NAME VALUE
--------------------------- -----------------------------
user_dump_dest /oracle/diag/rdbms/cc/cc/trace
以上是产生的trace原文件,我们通过tkprof来查看一下:
oracle@entel1:[/oracle/diag/rdbms/cc/cc/trace]$tkprof cc_ora_14104_xgj_20170118.trc cc_ora_14104_xgj_20170118.log explain=xxx/xgj sys=no ;
TKPROF: Release 11.2.0.4.0 - Development on Wed Jan 18 16:35:18 2017
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
oracle@entel1:[/oracle/diag/rdbms/cc/cc/trace]$
以下是截取cc_ora_14104_xgj_20170118.log中的关键信息:
truncate partition p1
********************************************************************************
SQL ID: 4gj9z8z8707sx Plan Hash: 491940592
alter table t_part truncate partition p1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 5 0 0
Execute 1 0.01 0.38 20 1 69 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.01 0.38 20 6 69 0
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 38 (xxx)
********************************************************************************
truncate table
********************************************************************************
SQL ID: cv84y6hnt1dy2 Plan Hash: 491940592
truncate table t_part
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.02 1.54 80 4 266 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.02 1.54 80 4 266 0
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 38 (xxx)
********************************************************************************
drop partition p1
********************************************************************************
SQL ID: 7wwjnfwgcxgza Plan Hash: 0
alter table t_part drop
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.69 1 1 10 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.00 0.69 1 1 10 0
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 38 (xxx)
从结果可以看到,DDL对分区操作消耗的资源要远小于DML的
操作 | 数据块的读取次数 |
---|---|
delete | 1000+ |
truncate | 6 |
drop | 1 |
可以看出DDL的操作要远小于DML消耗的资源。 DDL的操作主要是对数据字典的修改,这个值基本上是确定的,而DML会随着数据量的增加,消耗的资源也会相应的增加
oracle@entel1:[/oracle/diag/rdbms/cc/cc/trace]$sqlplus /nolog
SQL*Plus: Release 11.2.0.4.0 Production on Wed Jan 18 18:04:57 2017
Copyright (c) 1982, 2013, Oracle. All rights reserved.
SQL> conn xxx/xgj
Connected.
SQL> select count(1) from t;
COUNT(1)
----------
35249
SQL> set autotrace trace exp stat ;
SQL> delete from t where object_id < 10000;
9811 rows deleted.
Execution Plan
----------------------------------------------------------
Plan hash value: 3974964266
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 4420 | 22100 | 11 (0)| 00:00:01 |
| 1 | DELETE | T | | | | |
|* 2 | INDEX RANGE SCAN| IND_T | 4420 | 22100 | 11 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"<10000)
Statistics
----------------------------------------------------------
45 recursive calls
10940 db block gets
52 consistent gets
0 physical reads
2806724 redo size
831 bytes sent via SQL*Net to client
794 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
9811 rows processed
SQL> rollback;
Rollback complete.
SQL> delete from t where object_id < 20000;
13938 rows deleted.
Execution Plan
----------------------------------------------------------
Plan hash value: 3974964266
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 8842 | 44210 | 21 (0)| 00:00:01 |
| 1 | DELETE | T | | | | |
|* 2 | INDEX RANGE SCAN| IND_T | 8842 | 44210 | 21 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"<20000)
Statistics
----------------------------------------------------------
17 recursive calls
15016 db block gets
47 consistent gets
0 physical reads
3988132 redo size
836 bytes sent via SQL*Net to client
794 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
13938 rows processed
SQL> rollback;
Rollback complete.
SQL> delete from t where object_id <30000;
14403 rows deleted.
Execution Plan
----------------------------------------------------------
Plan hash value: 3974964266
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 13263 | 66315 | 31 (0)| 00:00:01 |
| 1 | DELETE | T | | | | |
|* 2 | INDEX RANGE SCAN| IND_T | 13263 | 66315 | 31 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"<30000)
Statistics
----------------------------------------------------------
17 recursive calls
15508 db block gets
48 consistent gets
0 physical reads
4121320 redo size
837 bytes sent via SQL*Net to client
793 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
14403 rows processed
SQL>
从结果可以看到,消耗的资源和删除的数据量是同步增长的
删除的数据量 | 数据块的读取次数 |
---|---|
object_id<10000 | 10940 + 52 |
object_id<20000 | 15016 +47 |
object_id<30000 | 15508+48 |
truncate的消耗
t_part_2表
SQL> create table t_part_2 (object_id int ,object_name varchar2(2000) )
2 partition by range(object_id)
3 (
4 partition p1 values less than (20000),
5 partition pm values less than (maxvalue)
6 );
Table created
SQL> insert into t_part_2 select object_id ,object_name from dba_objects;
35239 rows inserted
SQL> commit;
Commit complete
SQL> alter session set tracefile_identifier='xgj_truncate';
Session altered
SQL> alter session set sql_trace=true;
Session altered
SQL> alter table t_part_2 truncate partition p1;
Table truncated
SQL> alter session set sql_trace=false;
Session altered
SQL>
我们可以看到,truncate操作时,数据块读的次数10次左右,所以DDL操作并没有随着操作的数据增加而明显的增加,它只取决于需要操作的字典表的数据量。
DELETE操作并不能释放出空间,也就是说,删除了哪个表的数据,疼出来的空间还是只能是那个表使用,并不能让给其他的对象使用,因为delete操作并不能使高水位线下降。 如果需要通过清除一个表中的部分数据来腾出空间给其他的对象,这个操作时不行的。
而truncate或者drop操作,就可以释放出空间给其他的对象使用。
下面举例说明:
我们还是用我们上面创建的t_part分区表为例
SQL> select * from user_extents a where a.segment_name='T_PART';
SEGMENT_NAME PARTITION_NAME SEGMENT_TYPE TABLESPACE_NAME EXTENT_ID BYTES BLOCKS
-------------------------------------------------------------------------------- ------------------------------ ------------------ ------------------------------ ---------- ---------- ----------
T_PART P1 TABLE PARTITION TAB_ZMC 0 8388608 1024
T_PART P2 TABLE PARTITION TAB_ZMC 0 8388608 1024
T_PART P3 TABLE PARTITION TAB_ZMC 0 8388608 1024
T_PART PM TABLE PARTITION TAB_ZMC 0 8388608 1024
T_PART P5 TABLE PARTITION TAB_ZMC 0 8388608 1024
SQL> select count(1) from user_extents a where a.segment_name='T_PART';
COUNT(1)
----------
5
SQL>
最初的时候,T_PART表的数据占据了5个EXTENTS(区)
我们通过delete删除部分数据
SQL> select count(1) from t_part;
COUNT(1)
----------
35245
SQL> delete from t_part a where a.object_id<30000;
14403 rows deleted
SQL> commit;
Commit complete
SQL> select count(1) from user_extents a where a.segment_name='T_PART';
COUNT(1)
----------
5
SQL>
我们删除了14403 条数据后,发现T_PART仍然占据着5个EXTENDS,也就是说虽然表中的数据删除了,但是空间只能是T_PART
自己使用,并没有将控件返回给数据库。
我们使用truncate 来试下 ,用T表的数据吧,比较多
SQL> select count(1) from user_extents a where a.segment_name='T';
COUNT(1)
----------
17
SQL> select count(1) from t;
COUNT(1)
----------
35245
SQL> truncate table t;
Table truncated
SQL> select count(1) from user_extents a where a.segment_name='T';
COUNT(1)
----------
1
SQL> select count(1) from
2 t;
COUNT(1)
----------
0
SQL>
可以看出 通过truncate 之后 T表占用的数据区从17个降为1个,大部分空间都返回给了数据库,这部分空间是可以分配给其他数据使用的。
还有一种方式,就是如果这个表上有许多碎片,可以通过整理表的空间来达到释放空间的目的,命令如下
alter table t move ;
我们用T表来做下试验,还原T表的数据
SQL> insert into t select object_id ,object_name from dba_objects;
35245 rows inserted
SQL> commit;
Commit complete
SQL> select count(1) from user_extents a where a.segment_name='T';
COUNT(1)
----------
17
SQL> delete from t ; --->通过delete方式删除数据
35245 rows deleted
SQL> commit;
Commit complete
SQL> select count(1) from user_extents a where a.segment_name='T';
COUNT(1)
----------
17
SQL> alter table t move;--->重新整理数据
Table altered
SQL> select count(1) from user_extents a where a.segment_name='T';
COUNT(1)
----------
1
SQL>
我们看到,使用alter table move 操作,表的空间基本上已经全部释放出来了,只剩下一个EXTENT(因为当前已经是一个空表了)。
通过alter table move 方式让表中的数据重新存储,这样可以释放出来一些空间,但是没这个操作基本上等同于将表中的数据重新分布一次,如果表中的数据比较庞大,这将是一个非常耗时的操作。
实际上delete的操作只适用删除非常少量的数据,并且需要在索引存在的情况下适用。
案例说明:
SQL> select count(1) from t;
COUNT(1)
----------
0
SQL> insert into t select object_id ,object_name from dba_objects;
35245 rows inserted
SQL> commit;
Commit complete
SQL> select a.index_name ,a.index_type ,a.table_name from user_indexes a where a.table_name = 'T'; --->查看表上的索引
INDEX_NAME INDEX_TYPE TABLE_NAME
------------------------------ --------------------------- ------------------------------
IND_T NORMAL T
SQL> drop index ind_t ;-->先删掉索引
Index dropped
SQL> select a.index_name ,a.index_type ,a.table_name from user_indexes a where a.table_name = 'T';
INDEX_NAME INDEX_TYPE TABLE_NAME
------------------------------ --------------------------- ------------------------------
SQL>
看看执行计划吧
oracle@entel1:[/oracle]$sqlplus /nolog
SQL*Plus: Release 11.2.0.4.0 Production on Wed Jan 18 20:40:16 2017
Copyright (c) 1982, 2013, Oracle. All rights reserved.
SQL> conn xxx/xgj
Connected.
SQL> set autotrace trace exp stat;
SQL> delete from t where object_id=888;
1 row deleted.
Execution Plan
----------------------------------------------------------
Plan hash value: 3335594643
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 1 | 5 | 43 (0)| 00:00:01 |
| 1 | DELETE | T | | | | |
|* 2 | TABLE ACCESS FULL| T | 1 | 5 | 43 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("OBJECT_ID"=888)
Statistics
----------------------------------------------------------
4 recursive calls
4 db block gets
209 consistent gets
0 physical reads
0 redo size
830 bytes sent via SQL*Net to client
790 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
在没有索引的情况下,即使删除很少的数据,Oracle也会将全表扫描一遍(分区表除外)。
此时,Oracle读取了209 个数据块,如果我们肩上索引,效果会好很多。
SQL> create index ind_t on t(object_id);
Index created.
SQL> delete from t where object_id=888;
1 row deleted.
Execution Plan
----------------------------------------------------------
Plan hash value: 3974964266
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | DELETE STATEMENT | | 1 | 5 | 1 (0)| 00:00:01 |
| 1 | DELETE | T | | | | |
|* 2 | INDEX RANGE SCAN| IND_T | 1 | 5 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"=888)
Statistics
----------------------------------------------------------
1 recursive calls
5 db block gets
2 consistent gets
4 physical reads
0 redo size
834 bytes sent via SQL*Net to client
790 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
我们可以看到仅仅读取了2个数据块,2个一致性读,非常高效。