1.比较sql*loader使用conventional传统 和direct直接 方式加载的性能差异。
一、概念
1.直接加载优势
(1)直接加载比传统加载效率要高
(2)不扫描原来的空数据块
(3)不需要sql解析,减少系统的负载
(4)不经过SGA
(5)不走DBWR进程,走自己的专属进程,所以速度快
2.直接加载限制
(1)不能加载簇表
(2)锁定整个表,在表上有活动事务的时候不能加载
3.直接加载特点
(1)直接加载是在所有数据块后面加载新数据块,修改高水位线,不扫描原来的空数据块。
(2)直接加载只产生一点点的管理redo,因为要修改数据字典(也可以讲不产生redo)。
(3)回滚,如果加载失败把新分配数据块抹掉就行了。
(4)无需SGA,无需SQL解析,无需DBWR进程
二、实验
1.现在我们已经定义了leo_test_sqlload;
LS@LEO> select count(*) from leo_test_sqlload; 现在表里没有记录数
COUNT(*)
----------
0
LS@LEO> select segment_name,extent_id,bytes from user_extents where segment_name='LEO_TEST_SQLLOAD'; 现在分配了1个区
SEGMENT_NAME EXTENT_ID BYTES
--------------------------------------------------------------------------------- ---------- ----------
LEO_TEST_SQLLOAD 0 65536
[oracle@secdb1 ~]$ sqlldr userid=ls/ls control=leo_test.ctl data=leo_test.data log=leo_test.log 传统方式加载数据
SQL*Loader: Release 10.2.0.1.0 - Production on Sun Sep 9 10:29:42 2012
Copyright (c) 1982, 2005, Oracle. All rights reserved.
LS@LEO> select count(*) from leo_test_sqlload; 已经成功加载了10万条数据
COUNT(*)
----------
100000
LS@LEO> select segment_name,extent_id,bytes from user_extents where segment_name='LEO_TEST_SQLLOAD'; 10万条数据占用23个数据块
SEGMENT_NAME EXTENT_ID BYTES
--------------------------------------------------------------------------------- ---------- ----------
LEO_TEST_SQLLOAD 0 65536
LEO_TEST_SQLLOAD 1 65536
LEO_TEST_SQLLOAD 2 65536
LEO_TEST_SQLLOAD 3 65536
LEO_TEST_SQLLOAD 4 65536
LEO_TEST_SQLLOAD 5 65536
LEO_TEST_SQLLOAD 6 65536
LEO_TEST_SQLLOAD 7 65536
LEO_TEST_SQLLOAD 8 65536
LEO_TEST_SQLLOAD 9 65536
LEO_TEST_SQLLOAD 10 65536
LEO_TEST_SQLLOAD 11 65536
LEO_TEST_SQLLOAD 12 65536
LEO_TEST_SQLLOAD 13 65536
LEO_TEST_SQLLOAD 14 65536
LEO_TEST_SQLLOAD 15 65536
LEO_TEST_SQLLOAD 16 1048576
LEO_TEST_SQLLOAD 17 1048576
LEO_TEST_SQLLOAD 18 1048576
LEO_TEST_SQLLOAD 19 1048576
LEO_TEST_SQLLOAD 20 1048576
LEO_TEST_SQLLOAD 21 1048576
LEO_TEST_SQLLOAD 22 1048576
23 rows selected.
LS@LEO> delete from leo_test_sqlload; 删除10万条数据
100000 rows deleted.
LS@LEO> commit; 提交
Commit complete.
LS@LEO> select segment_name,extent_id,bytes from user_extents where segment_name='LEO_TEST_SQLLOAD';
SEGMENT_NAME EXTENT_ID BYTES
--------------------------------------------------------------------------------- ---------- ----------
LEO_TEST_SQLLOAD 0 65536
LEO_TEST_SQLLOAD 1 65536
LEO_TEST_SQLLOAD 2 65536
LEO_TEST_SQLLOAD 3 65536
LEO_TEST_SQLLOAD 4 65536
LEO_TEST_SQLLOAD 5 65536
LEO_TEST_SQLLOAD 6 65536
LEO_TEST_SQLLOAD 7 65536
LEO_TEST_SQLLOAD 8 65536
LEO_TEST_SQLLOAD 9 65536
LEO_TEST_SQLLOAD 10 65536
LEO_TEST_SQLLOAD 11 65536
LEO_TEST_SQLLOAD 12 65536
LEO_TEST_SQLLOAD 13 65536
LEO_TEST_SQLLOAD 14 65536
LEO_TEST_SQLLOAD 15 65536
LEO_TEST_SQLLOAD 16 1048576
LEO_TEST_SQLLOAD 17 1048576
LEO_TEST_SQLLOAD 18 1048576
LEO_TEST_SQLLOAD 19 1048576
LEO_TEST_SQLLOAD 20 1048576
LEO_TEST_SQLLOAD 21 1048576
LEO_TEST_SQLLOAD 22 1048576
23 rows selected.
有人会问我们把数据都删除了为什么还占用空间呢,呵呵,oracle的delete操作不回收空间,只是把自己的记录标记为删除,实际呢还占用的空间不释放
[oracle@secdb1 ~]$ sqlldr userid=ls/ls control=leo_test.ctl data=leo_test.data log=leo_test.log 第二次传统方式加载数据
SQL*Loader: Release 10.2.0.1.0 - Production on Sun Sep 9 10:29:42 2012
Copyright (c) 1982, 2005, Oracle. All rights reserved.
LS@LEO> select count(*) from leo_test_sqlload; 已经成功加载了10万条数据
COUNT(*)
----------
100000
LS@LEO> select segment_name,extent_id,bytes from user_extents where segment_name='LEO_TEST_SQLLOAD';
SEGMENT_NAME EXTENT_ID BYTES
--------------------------------------------------------------------------------- ---------- ----------
LEO_TEST_SQLLOAD 0 65536
LEO_TEST_SQLLOAD 1 65536
LEO_TEST_SQLLOAD 2 65536
LEO_TEST_SQLLOAD 3 65536
LEO_TEST_SQLLOAD 4 65536
LEO_TEST_SQLLOAD 5 65536
LEO_TEST_SQLLOAD 6 65536
LEO_TEST_SQLLOAD 7 65536
LEO_TEST_SQLLOAD 8 65536
LEO_TEST_SQLLOAD 9 65536
LEO_TEST_SQLLOAD 10 65536
LEO_TEST_SQLLOAD 11 65536
LEO_TEST_SQLLOAD 12 65536
LEO_TEST_SQLLOAD 13 65536
LEO_TEST_SQLLOAD 14 65536
LEO_TEST_SQLLOAD 15 65536
LEO_TEST_SQLLOAD 16 1048576
LEO_TEST_SQLLOAD 17 1048576
LEO_TEST_SQLLOAD 18 1048576
LEO_TEST_SQLLOAD 19 1048576
LEO_TEST_SQLLOAD 20 1048576
LEO_TEST_SQLLOAD 21 1048576
LEO_TEST_SQLLOAD 22 1048576
23 rows selected.
使用传统方式加载数据,会扫描原来的空数据块,会把新加载的数据插入到空数据块内,看我们还是使用原来的23个数据块
LS@LEO> delete from leo_test_sqlload; 这是第二次删除10万条数据
100000 rows deleted.
LS@LEO> commit; 提交
Commit complete.
LS@LEO> select segment_name,extent_id,bytes from user_extents where segment_name='LEO_TEST_SQLLOAD';
SEGMENT_NAME EXTENT_ID BYTES
--------------------------------------------------------------------------------- ---------- ----------
LEO_TEST_SQLLOAD 0 65536
LEO_TEST_SQLLOAD 1 65536
LEO_TEST_SQLLOAD 2 65536
LEO_TEST_SQLLOAD 3 65536
LEO_TEST_SQLLOAD 4 65536
LEO_TEST_SQLLOAD 5 65536
LEO_TEST_SQLLOAD 6 65536
LEO_TEST_SQLLOAD 7 65536
LEO_TEST_SQLLOAD 8 65536
LEO_TEST_SQLLOAD 9 65536
LEO_TEST_SQLLOAD 10 65536
LEO_TEST_SQLLOAD 11 65536
LEO_TEST_SQLLOAD 12 65536
LEO_TEST_SQLLOAD 13 65536
LEO_TEST_SQLLOAD 14 65536
LEO_TEST_SQLLOAD 15 65536
LEO_TEST_SQLLOAD , 16 1048576
LEO_TEST_SQLLOAD 17 1048576
LEO_TEST_SQLLOAD 18 1048576
LEO_TEST_SQLLOAD 19 1048576
LEO_TEST_SQLLOAD 20 1048576
LEO_TEST_SQLLOAD 21 1048576
LEO_TEST_SQLLOAD 22 1048576
23 rows selected.
delete还是不回收空间,我们依然占用着23个数据块
Sun Sep 9 20:25:04 CST 2012
[oracle@secdb1 ~]$ sqlldr userid=ls/ls control=leo_test.ctl data=leo_test.data log=leo_test.log direct=true 直接方式加载数据
SQL*Loader: Release 10.2.0.1.0 - Production on Sun Sep 9 20:25:05 2012
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Load completed - logical record count 100000.
[oracle@secdb1 ~]$ date
Sun Sep 9 20:25:14 CST 2012 看我们才用了10秒时间,比传统加载效率高很多
LS@LEO> select segment_name,extent_id,bytes from user_extents where segment_name='LEO_TEST_SQLLOAD';
SEGMENT_NAME EXTENT_ID BYTES
--------------------------------------------------------------------------------- ---------- ----------
LEO_TEST_SQLLOAD 0 65536
LEO_TEST_SQLLOAD 1 65536
LEO_TEST_SQLLOAD 2 65536
LEO_TEST_SQLLOAD 3 65536
LEO_TEST_SQLLOAD 4 65536
LEO_TEST_SQLLOAD 5 65536
LEO_TEST_SQLLOAD 6 65536
LEO_TEST_SQLLOAD 7 65536
LEO_TEST_SQLLOAD 8 65536
LEO_TEST_SQLLOAD 9 65536
LEO_TEST_SQLLOAD 10 65536
LEO_TEST_SQLLOAD 11 65536
LEO_TEST_SQLLOAD 12 65536
LEO_TEST_SQLLOAD 13 65536
LEO_TEST_SQLLOAD 14 65536
LEO_TEST_SQLLOAD 15 65536
LEO_TEST_SQLLOAD 16 1048576
LEO_TEST_SQLLOAD 17 1048576
LEO_TEST_SQLLOAD 18 1048576
LEO_TEST_SQLLOAD 19 1048576
LEO_TEST_SQLLOAD 20 1048576
LEO_TEST_SQLLOAD 21 1048576
LEO_TEST_SQLLOAD 22 1048576
LEO_TEST_SQLLOAD 23 1048576
LEO_TEST_SQLLOAD 24 1048576
LEO_TEST_SQLLOAD 25 1048576
LEO_TEST_SQLLOAD 26 1048576
LEO_TEST_SQLLOAD 27 1048576
LEO_TEST_SQLLOAD 28 1048576
LEO_TEST_SQLLOAD 29 1048576
LEO_TEST_SQLLOAD 30 1048576
31 rows selected.
哈哈我们现在发现同样的10万条记录,竟然占用了31个数据块,传统加载只用了23个,而我们使用直接加载到多了8个数据块,有个上面的概念大家是不是有所启发呢,对了直接加载不扫描原来的空数据块,会在所有数据块之后加载新的数据块插入数据修改高水位线HWM,当提交事务之后,把高水位线移到新数据之后,其他的用户就可以看见了。
2.比较直接加载使用conventional 和direct方式产生的redo大小(可以通过/*+ append */模拟直接加载)。
明确:直接加载与logging配合下并不能显著的减少redo日志量
直接加载与nologging配合下可以大幅度的减少redo日志量
LS@LEO> create table leo_t1 as select * from leo_test_sqlload where 1=2; 创建leo_t1表
Table created.
LS@LEO> alter table leo_t1 logging; 设置leo_t1表logging模式
Table altered.
LS@LEO> set autotrace trace stat;
LS@LEO> insert into leo_t1 select * from leo_test_sqlload where rownum <= 20000; 采用传统方式加载2万条记录
20000 rows created.
Statistics 统计信息
----------------------------------------------------------
1071 recursive calls
2668 db block gets
1860 consistent gets
386 physical reads
1680404 redo size 这是产生的日志量1680404
680 bytes sent via SQL*Net to client
603 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
20000 rows processed
LS@LEO> rollback; 回滚操作,使用undo表空间
Rollback complete.
LS@LEO> insert /*+ append */ into leo_t1 select * from leo_test_sqlload where rownum <= 20000; 使用直接加载方式插入2万条记录
20000 rows created.
Statistics
----------------------------------------------------------
94 recursive calls
268 db block gets
1294 consistent gets
202 physical reads
1627260 redo size 当leo_t1为logging属性时,直接加载和传统加载产生redo日志查不多
664 bytes sent via SQL*Net to client
617 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
20000 rows processed
小结:这是因为在logging模式下,所有的数据块的改变都会产生redo日志,为以后恢复做准备,这时候直接加载没有多大的优势。
直接加载与nologging配合下可以大幅度的减少redo日志量
重大前提
如果你的数据库开启了force_logging=yes模式,那么不管你是传统加载还是直接加载都不会减少redo产生量
所以要想大幅度减少redo日志就必须满足3个条件
(1)关闭force_logging选项 alter database no force logging; 启动 alter database force logging;
(2)数据对象级别nologging模式 alter table leo_t1 nologging;
(3)直接加载 insert /*+ append */ into
数据库归档与redo日志量关系
数据库处于归档模式
当表模式为logging状态时,无论是否使用append模式,都会生成redo.当表模式为nologging状态时,只有append模式,不会生成redo。
数据库处于非归档模式
无论是在logging还是nologing的模式下,append的模式都不会生成redo,而no append模式下都会生成redo。
force_logging logging nologging 如果都设置了 有一个优先级的 排名
force_logging 可以在数据库级别 表空间级别设置
logging nologging 只能在数据对象级别设置
LS@LEO> alter database no force logging;
LS@LEO> select force_logging from v$database; 已经关闭force_logging选项
FOR
---
NO
LS@LEO> alter table leo_t1 nologging; 设置leo_t1表nologging模式
Table altered.
LS@LEO> select logging from user_tables where table_name='LEO_T1';
LOG
---
NO
LS@LEO> select count(*) from leo_t1; 0条记录
COUNT(*)
----------
0
LS@LEO> select index_name from user_indexes where table_name='LEO_T1'; 表上没有索引
no rows selected
LS@LEO> insert /*+ append */ into leo_t1 select * from leo_test_sqlload where rownum <= 20000; 直接加载
20000 rows created.
Statistics
----------------------------------------------------------
1443 recursive calls
649 db block gets
1702 consistent gets
1519 physical reads
44900 redo size 直接加载产生的redo日志非常少
658 bytes sent via SQL*Net to client
617 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
9 sorts (memory)
0 sorts (disk)
20000 rows processed
LS@LEO> rollback;
Rollback complete.
LS@LEO> insert into leo_t1 select * from leo_test_sqlload where rownum <= 20000; 传统加载
20000 rows created.
Statistics
----------------------------------------------------------
4 recursive calls
2207 db block gets
1534 consistent gets
441 physical reads
1634064 redo size 传统加载产生的redo日志非常非常的多
673 bytes sent via SQL*Net to client
603 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
20000 rows processed
小结:直接加载与nologging配合下可以大幅度的减少redo日志量,因为插入的数据不产生redo日志,所以在插入后要做备份操作,一旦数据损坏,就要使用备份来恢复,不能使用redo来恢复。注意要关闭force_logging选项哦!
3.比较direct方式使用并行和非并行选项的性能差异。
数据文件:leo_test.data 100000记录
控制文件:leo_test.ctl
日志文件:leo_test.log
串行直接加载sqlload : sqlldr userid=ls/ls control=leo_test.ctl data=leo_test.data log=leo_test.log direct=true
并行直接加载sqlload : sqlldr userid=ls/ls control=leo_test.ctl data=leo_test.data log=leo_test.log direct=true parallel=true
1.现在演示“串行直接加载”
LS@LEO> select count(*) from leo_test_sqlload; 加载之前
COUNT(*)
----------
0
Sun Sep 9 10:58:24 CST 2012 58:24
[oracle@secdb1 ~]$ sqlldr userid=ls/ls control=leo_test.ctl data=leo_test.data log=leo_test.log direct=true
SQL*Loader: Release 10.2.0.1.0 - Production on Sun Sep 9 10:58:26 2012
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Load completed - logical record count 100000.
[oracle@secdb1 ~]$ date
Sun Sep 9 10:58:45 CST 2012 58:45-58:24=21秒 即用时21秒 我这是手算的,一会看看日志里记录的是多少
LS@LEO> select count(*) from leo_test_sqlload; 加载之后
COUNT(*)
----------
100000
[oracle@secdb1 ~]$ more leo_test.log
SQL*Loader: Release 10.2.0.1.0 - Production on Sun Sep 9 10:58:26 2012
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Control File: leo_test.ctl
Data File: leo_test.data
Bad File: leo_test.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Continuation: none specified
Path used: Direct 【sql*loader采用串行直接加载方式加载数据】
Table LEO_TEST_SQLLOAD, loaded from every logical record. 加载的表名
Insert option in effect for this table: APPEND 追加方式加载数据
TRAILING NULLCOLS option in effect
Column Name Position Len Term Encl Datatype 列名
------------------------------ ---------- ----- ---- ---- ---------------------
START_TIME FIRST * | DATE YYYY-MM-DD HH24:MI:SS
END_TIME NEXT * | DATE YYYY-MM-DD HH24:MI:SS
PROTOCOL NEXT * | CHARACTER
PRIVATE_IP NEXT * | CHARACTER
PRIVATE_PORT NEXT * | CHARACTER
SRC_IP NEXT * | CHARACTER
SRC_PORT NEXT * | CHARACTER
DEST_IP NEXT * | CHARACTER
DEST_PORT NEXT * | CHARACTER
The following index(es) on table LEO_TEST_SQLLOAD were processed: 并把索引也加载了10万个索引键值
index LS.LEO_INDEX1 loaded successfully with 100000 keys
Table LEO_TEST_SQLLOAD:
100000 Rows successfully loaded. 成功加载10万行记录
0 Rows not loaded due to data errors. 由于数据错误,0行没有加载
0 Rows not loaded because all WHEN clauses were failed. 因为所有条款都失败的时候,0行没有加载
0 Rows not loaded because all fields were null. 因为所有字段都是空的,0行没有加载
Date cache:
Max Size: 1000
Entries : 65
Hits : 199935
Misses : 0
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes: 256000
Read buffer bytes: 1048576
Total logical records skipped: 0 跳过逻辑记录数总和0
Total logical records read: 100000 读取逻辑记录数总和100000
Total logical records rejected: 0 拒绝逻辑记录数总和0
Total logical records discarded: 0 丢弃逻辑记录数总和0
Total stream buffers loaded by SQL*Loader main thread: 26
Total stream buffers loaded by SQL*Loader load thread: 17
Run began on Sun Sep 09 10:58:26 2012 开始的时间
Run ended on Sun Sep 09 10:58:41 2012 结束的时间
Elapsed time was: 00:00:14.70 即用时14.7秒 这是机器算的
CPU time was: 00:00:00.38 CPU占用0.38秒
2.现在演示“并行直接加载”
并行加载
(1)并行加载和并行insert机制差不多
(2)并行加载可以启动多个并行进程,同时加载多个文件
(3)并行加载可以启动多个并行进程,分拆一个文件加载
LS@LEO> select count(*) from leo_test_sqlload; 加载之前
COUNT(*)
----------
0
Sun Sep 9 11:28:13 CST 2012 28:13
[oracle@secdb1 ~]$ sqlldr userid=ls/ls control=leo_test.ctl data=leo_test.data log=leo_test.log direct=true parallel=true
SQL*Loader: Release 10.2.0.1.0 - Production on Sun Sep 9 11:28:14 2012
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Load completed - logical record count 100000.
[oracle@secdb1 ~]$ date
Sun Sep 9 11:28:28 CST 2012 28:28-28:13=15秒 即用时15秒 我这是手算的,一会看看日志里记录的是多少
LS@LEO> select count(*) from leo_test_sqlload; 加载之后
COUNT(*)
----------
100000
[oracle@secdb1 ~]$ more leo_test.log
SQL*Loader: Release 10.2.0.1.0 - Production on Sun Sep 9 11:28:14 2012
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Control File: leo_test.ctl
Data File: leo_test.data
Bad File: leo_test.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Continuation: none specified
Path used: Direct - with parallel option. 【sql*loader采用并行直接加载方式加载数据】
Table LEO_TEST_SQLLOAD, loaded from every logical record. 加载的表名
Insert option in effect for this table: APPEND 追加的方式加载数据
TRAILING NULLCOLS option in effect
Column Name Position Len Term Encl Datatype 列名
------------------------------ ---------- ----- ---- ---- ---------------------
START_TIME FIRST * | DATE YYYY-MM-DD HH24:MI:SS
END_TIME NEXT * | DATE YYYY-MM-DD HH24:MI:SS
PROTOCOL NEXT * | CHARACTER
PRIVATE_IP NEXT * | CHARACTER
PRIVATE_PORT NEXT * | CHARACTER
SRC_IP NEXT * | CHARACTER
SRC_PORT NEXT * | CHARACTER
DEST_IP NEXT * | CHARACTER
DEST_PORT NEXT * | CHARACTER
Table LEO_TEST_SQLLOAD:
100000 Rows successfully loaded. 成功加载10万行记录
0 Rows not loaded due to data errors. 由于数据错误,0行没有加载
0 Rows not loaded because all WHEN clauses were failed. 因为所有条款都失败的时候,0行没有加载
0 Rows not loaded because all fields were null. 因为所有字段都是空的,0行没有加载
Date cache:
Max Size: 1000
Entries : 65
Hits : 199935
Misses : 0
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes: 256000
Read buffer bytes: 1048576
Total logical records skipped: 0 跳过逻辑记录数总和0
Total logical records read: 100000 读取逻辑记录数总和100000
Total logical records rejected: 0 拒绝逻辑记录数总和0
Total logical records discarded: 0 丢弃逻辑记录数总和0
Total stream buffers loaded by SQL*Loader main thread: 26
Total stream buffers loaded by SQL*Loader load thread: 17
Run began on Sun Sep 09 11:28:14 2012 开始的时间
Run ended on Sun Sep 09 11:28:24 2012 结束的时间
Elapsed time was: 00:00:09.18 即用时9.18秒 这是机器算的
CPU time was: 00:00:00.42 CPU占用0.38秒
<,P>小结:从时间效率上看“并行直接加载”比“串行直接加载”的效率要高,但我们都知道“并行直接加载”如果表中有索引会导致加载失败,因为oracle不能一边并行一边维护索引的完整性。
4.直接加载对约束性索引和非约束型索引的影响。
一、条件
(1)现在我们使用sql*load来加载100000条数据
(2)加载的leo_test_sqlload表上有索引
二、实验
1.非约束性索引,直接加载在加载完成后会维护索引的完整性
$ cat leo_test.data | wc -l 检查数据文件内的记录数10万条,这就是我们要加载的数据
100000
$ cat leo_test.ctl 控制文件内容
LOAD DATA
INFILE '/home/oracle/leo_test.data' 数据源
APPEND INTO TABLE leo_test_sqlload 加载数据的表
FIELDS TERMINATED BY '|' 文本数据的分隔符
TRAILING NULLCOLS
(START_TIME DATE 'YYYY-MM-DD HH24:MI:SS',END_TIME DATE 'YYYY-MM-DD HH24:MI:SS',PROTOCOL,PRIVATE_IP,PRIVATE_PORT
,SRC_IP,SRC_PORT,DEST_IP,DEST_PORT) 表中字段的名
LS@LEO> desc leo_test_sqlload; 表的结构
Name Null? Type
----------------------------------------- -------- ----------------------------
START_TIME DATE
END_TIME DATE
PROTOCOL VARCHAR2(20)
PRIVATE_IP VARCHAR2(20)
PRIVATE_PORT VARCHAR2(20)
SRC_IP VARCHAR2(20)
SRC_PORT VARCHAR2(20)
DEST_IP VARCHAR2(20)
DEST_PORT VARCHAR2(20)
LS@LEO> select index_name,status from user_indexes where table_name='LEO_TEST_SQLLOAD'; 现在表上没有索引,我们定义一个索引
no rows selected
LS@LEO> create index leo_index1 on leo_test_sqlload(start_time); 我们在start_time字段添加了索引
Index created.
LS@LEO> select index_name,status from user_indexes where table_name='LEO_TEST_SQLLOAD'; 索引已经生效
INDEX_NAME STATUS
------------------------------ --------
LEO_INDEX1 VALID
[oracle@secdb1 ~]$ sqlldr userid=ls/ls control=leo_test.ctl data=leo_test.data log=leo_test.log direct=true 直接加载
SQL*Loader: Release 10.2.0.1.0 - Production on Sun Sep 9 09:18:13 2012
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Load completed - logical record count 100000.
LS@LEO> select count(*) from leo_test_sqlload; 已经加载了10万条记录
COUNT(*)
----------
100000
LS@LEO> select index_name,status from user_indexes where table_name='LEO_TEST_SQLLOAD'; 索引状态还是有效的
INDEX_NAME STATUS
------------------------------ --------
LEO_INDEX1 VALID
小结:非约束索引下,直接加载会维护索引的完整性,在数据加载入库后索引还是有效的。
2.约束性索引【主键】,直接加载依然会把数据加载入库,但索引会失效unusable,并且在日志中没有提示,必须手工rebuild重新建立
数据文件:leo_test1.data
控制文件:leo_test1.ctl
日志文件:leo_test1.log
sqlload : sqlldr userid=ls/ls control=leo_test1.ctl data=leo_test1.data log=leo_test1.log direct=true
LS@LEO> desc leo_test_sqlload1 表结构
Name Null? Type
----------------------------------------- -------- ----------------------------
START_TIME DATE
END_TIME DATE
PROTOCOL VARCHAR2(20)
PRIVATE_IP VARCHAR2(20)
PRIVATE_PORT VARCHAR2(20)
SRC_IP VARCHAR2(20)
SRC_PORT VARCHAR2(20)
DEST_IP VARCHAR2(20)
DEST_PORT VARCHAR2(20)
LS@LEO> select index_name,status from user_indexes where table_name='LEO_TEST_SQLLOAD1'; 没有主键,我们现在建一个
no rows selected
LS@LEO> alter table leo_test_sqlload1 add constraint pk_leo_test1 primary key(dest_port); 创建主键
Table altered.
LS@LEO> select index_name,status from user_indexes where table_name='LEO_TEST_SQLLOAD1'; 主键已经生效
INDEX_NAME STATUS
------------------------------ --------
PK_LEO_TEST1 VALID
LS@LEO> select count(*) from leo_test_sqlload1; 表中没有数据我们开始加载
COUNT(*)
----------
0
[oracle@secdb1 ~]$ sqlldr userid=ls/ls control=leo_test1.ctl data=leo_test1.data log=leo_test1.log direct=true
SQL*Loader: Release 10.2.0.1.0 - Production on Sun Sep 9 10:07:49 2012
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Load completed - logical record count 100.
LS@LEO> select * from leo_test_sqlload1 where rownum <= 6;
START_TIME END_TIME PROTOCOL PRIVATE_IP PRIVATE_PORT SRC_IP SRC_PORT DEST_IP DEST_PORT
------------------- ------------------- -------------------- ------------ ------------ -------------------- ----------
2012-08-08 20:59:54 2012-08-08 21:00:28 6 2886756061 1111 3395517721 45031 3419418065 80
2012-08-08 21:00:26 2012-08-08 21:00:28 6 2886900807 1111 395507143 51733 3658060738 80
2012-08-08 20:59:55 2012-08-08 21:00:28 6 2886917742 1111 3395507159 43516 2071873572 80
2012-08-08 20:59:55 2012-08-08 21:00:28 6 2886917742 1111 3395507159 43534 2071873572 80
2012-08-08 20:59:55 2012-08-08 21:00:28 6 2886917742 1111 3395507159 43523 2071873572 80
2012-08-08 21:00:14 2012-08-08 21:00:28 6 2886832065 1111 3395507109 51442 2099718013 80
6 rows selected.
我们已经加载了100条记录,dest_port列值都是80,违背了主键约束,说明索引失效
LS@LEO> select index_name,status from user_indexes where table_name='LEO_TEST_SQLLOAD1'; 索引已经失效
INDEX_NAME STATUS
------------------------------ --------
PK_LEO_TEST1 UNUSABLE
结论:在OLAP系统中实时表不应该使用约束【因为是海量数据重复是正常的】,在维度表中可以使用约束。
3.如果使用并行+直接加载数据的话,如果表中有索引,会导致加载失败,可以使用skip_index_maintenance选项“跳过索引维护”,来到达数据加载的目的,但是此时索引会无效unusable,必须手工rebuild重新建立
重新搭建环境
LS@LEO> truncate table leo_test_sqlload1;
Table truncated.
LS@LEO> alter table leo_test_sqlload1 drop constraint pk_leo_test1;
Table altered.
LS@LEO> alter table leo_test_sqlload1 add constraint pk_leo_test1 primary key(dest_port);
Table altered.
[oracle@secdb1 ~]$ sqlldr userid=ls/ls control=leo_test1.ctl data=leo_test1.data log=leo_test1.log direct=true parallel=true
SQL*Loader: Release 10.2.0.1.0 - Production on Sun Sep 9 10:29:42 2012
Copyright (c) 1982, 2005, Oracle. All rights reserved.
SQL*Loader-951: Error calling once/load initialization
ORA-26002: Table LS.LEO_TEST_SQLLOAD1 has index defined upon it. 表上有索引定义导致加载失败
[oracle@secdb1 ~]$ sqlldr userid=ls/ls control=leo_test1.ctl data=leo_test1.data log=leo_test1.log direct=true parallel=true
skip_index_maintenance=true
SQL*Loader: Release 10.2.0.1.0 - Production on Sun Sep 9 10:32:51 2012 我们跳过索引维护成功加载100条记录
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Load completed - logical record count 100.
LS@LEO> select count(*) from leo_test_sqlload1;
COUNT(*)
----------
100
LS@LEO> select index_name,status from user_indexes where table_name='LEO_TEST_SQLLOAD1'; 索引此时是无效状态
INDEX_NAME STATUS
------------------------------ --------
PK_LEO_TEST1 UNUSABLE
小结:这就告诉我们了并行就是切片,一边切片一边维护索引完整性是做不到的,我们只能在加载数据后重新建立索引。
oracle视频教程请关注:http://u.youku.com/user_video/id_UMzAzMjkxMjE2.html