Reposted from: http://www.oraclefans.cn/forum/showtopic.jsp?rootid=23654&CPages=1
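More than ten years ago, Lao Bai was developing billing and accounting systems for telecom operators. Back then a telecom billing system counted as a "massive" data processing system: one local network might have 500,000 telephone subscribers, of whom 200,000 to 300,000 had long-distance service, and those subscribers could generate more than 5 million long-distance call records every month. Processing these records at month end was a real challenge. In 1999 the national telecom administration (电总) published a business specification for telecom accounting systems and required every vendor to submit its accounting system for centralized testing; only systems that passed would receive a network access license. Among the 60-plus companies that took part, the system Lao Bai designed had a rather ugly user interface (the company was only a handful of people at the time and the whole development team was 7 people, so the UI could not compete with companies fielding teams of dozens), but its accounting performance was second to none. Processing 500,000 long-distance subscribers and 5 million call records took a little over 4 hours in total, which earned first place; second place took 6.5 hours, and third place was already beyond 10 hours. Lao Bai had two winning weapons: first, all 500,000 subscriber records were loaded into memory at once and kept in a B-tree; second, BULK operations. Lao Bai later talked with the second-place company: like us, they preloaded the 500,000 subscriber records, but they did not use BULK operations.

BULK INSERT is much faster than ordinary INSERT, as anyone who has used BULK operations knows. But why? Oracle's official explanation is that a BULK operation greatly reduces the number of interactions between the user process and the SQL engine (in PL/SQL, the context switches between the PL/SQL engine and the SQL engine), and therefore performs better. Lao Bai had always accepted this view, with one lingering doubt: is the number of interactions with the SQL engine really the only difference? Does a BULK operation merely submit one SQL statement, after which the SQL engine unpacks the array and inserts the rows one by one? Or does Oracle handle BULK INSERT internally in some special way?

While studying redo opcodes, Lao Bai found some clues. Layer 11 covers row access, that is, operations on row data. It contains the following operations:

Layer 11 : Row Access - KCOCODRW [kdocts.h]
Opcode 1 : Interpret Undo Record (Undo)
Opcode 2 : Insert Row Piece
Opcode 3 : Drop Row Piece
Opcode 4 : Lock Row Piece
Opcode 5 : Update Row Piece
Opcode 6 : Overwrite Row Piece
Opcode 7 : Manipulate First Column (add or delete the first column)
Opcode 8 : Change Forwarding address
Opcode 9 : Change the Cluster Key Index
Opcode 10 : Set Key Links (change the forward & backward key links on a cluster key)
Opcode 11 : Quick Multi-Insert
Opcode 12 : Quick Multi-Delete
Opcode 13 : Toggle Block Header flags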
Note that 11.11 is defined as Quick Multi-Insert and 11.12 as Quick Multi-Delete. Could these two opcodes be related to BULK operations? Let's run an experiment. First, create a test table:
drop table sm_histable0101;
CREATE TABLE SM_HISTABLE0101 (
  SM_ID             NUMBER(10)    NOT NULL,
  SM_SUBID          NUMBER(3)     NOT NULL,
  SERVICE_TYPE      VARCHAR2(6),
  ORGTON            NUMBER(3),
  ORGNPI            NUMBER(3),
  ORGADDR           VARCHAR2(21)  NOT NULL,
  DESTTON           NUMBER(3),
  DESTNPI           NUMBER(3),
  DESTADDR          VARCHAR2(21)  NOT NULL,
  PRI               NUMBER(3),
  PID               NUMBER(3),
  SRR               NUMBER(3),
  DCS               NUMBER(3),
  SCHEDULE          VARCHAR2(21),
  EXPIRE            VARCHAR2(21),
  FINAL             VARCHAR2(21),
  SM_STATUS         NUMBER(3),
  ERROR_CODE        NUMBER(3),
  UDL               NUMBER(3),
  SM_TYPE           NUMBER(10),
  SCADDRTYPE        NUMBER(3),
  SCADDR            VARCHAR2(21),
  MOMSCADDRTYPE     NUMBER(3),
  MOMSCADDR         VARCHAR2(21),
  MTMSCADDRTYPE     NUMBER(3),
  MTMSCADDR         VARCHAR2(21),
  SCHEDULEMODE      NUMBER(3),
  UD                VARCHAR2(255),
  ID_HINT           NUMBER(10)    NOT NULL,
  DELIVERCOUNT      NUMBER(10),
  L2CACHE           NUMBER(10),
  L2CACHEWRITECOUNT NUMBER(10),
  SERVICE           NUMBER(10),
  NEWORGADDRESS     VARCHAR2(21),
  NEWDESTADDRESS    VARCHAR2(21)
);
Then create two stored procedures, redo1 and redo2, for ordinary row-by-row inserts and for bulk inserts respectively:
create or replace procedure redo1 is
  -- One index-by table per inserted column
  TYPE T_SM_ID    IS TABLE OF NUMBER(10)   INDEX BY BINARY_INTEGER;
  TYPE T_SM_SUBID IS TABLE OF NUMBER(3)    INDEX BY BINARY_INTEGER;
  TYPE T_ORGADDR  IS TABLE OF VARCHAR2(21) INDEX BY BINARY_INTEGER;
  TYPE T_DESTADDR IS TABLE OF VARCHAR2(21) INDEX BY BINARY_INTEGER;
  TYPE T_ID_HINT  IS TABLE OF NUMBER(10)   INDEX BY BINARY_INTEGER;
  V_SM_ID    T_SM_ID;
  V_SM_SUBID T_SM_SUBID;
  V_ORGADDR  T_ORGADDR;
  V_DESTADDR T_DESTADDR;
  V_ID_HINT  T_ID_HINT;
  vredo1 INTEGER;
  vredo2 INTEGER;
BEGIN
  -- Build 2000 rows of test data in memory
  FOR I IN 1..2000 LOOP
    V_SM_ID(I)    := I;
    V_SM_SUBID(I) := 12;
    V_ORGADDR(I)  := '444555565';
    V_DESTADDR(I) := '555555';
    V_ID_HINT(I)  := I;
  END LOOP;
  -- Redo generated so far; v$sysstat is instance-wide, so run this on an otherwise idle instance
  select value into vredo1 from v$sysstat where name = 'redo size';
  -- Ordinary insert: one INSERT execution per row
  FOR I IN 1..2000 LOOP
    INSERT INTO SM_HISTABLE0101 (SM_ID, SM_SUBID, ORGADDR, DESTADDR, ID_HINT)
    VALUES (V_SM_ID(I), V_SM_SUBID(I), V_ORGADDR(I), V_DESTADDR(I), V_ID_HINT(I));
  END LOOP;
  COMMIT;
  select value into vredo2 from v$sysstat where name = 'redo size';
  dbms_output.put_line('redo size:' || to_char(vredo2 - vredo1));
END;
/
create or replace procedure redo2 is
  -- One index-by table per inserted column
  TYPE T_SM_ID    IS TABLE OF NUMBER(10)   INDEX BY BINARY_INTEGER;
  TYPE T_SM_SUBID IS TABLE OF NUMBER(3)    INDEX BY BINARY_INTEGER;
  TYPE T_ORGADDR  IS TABLE OF VARCHAR2(21) INDEX BY BINARY_INTEGER;
  TYPE T_DESTADDR IS TABLE OF VARCHAR2(21) INDEX BY BINARY_INTEGER;
  TYPE T_ID_HINT  IS TABLE OF NUMBER(10)   INDEX BY BINARY_INTEGER;
  V_SM_ID    T_SM_ID;
  V_SM_SUBID T_SM_SUBID;
  V_ORGADDR  T_ORGADDR;
  V_DESTADDR T_DESTADDR;
  V_ID_HINT  T_ID_HINT;
  vredo1 INTEGER;
  vredo2 INTEGER;
  n      INTEGER;
BEGIN
  n := 2000;
  -- Build n rows of test data in memory
  FOR I IN 1..n LOOP
    V_SM_ID(I)    := I;
    V_SM_SUBID(I) := 12;
    V_ORGADDR(I)  := '444555565';
    V_DESTADDR(I) := '555555';
    V_ID_HINT(I)  := I;
  END LOOP;
  -- Redo generated so far; v$sysstat is instance-wide, so run this on an otherwise idle instance
  select value into vredo1 from v$sysstat where name = 'redo size';
  -- Bulk insert: the whole array is bound to a single INSERT execution
  FORALL I IN 1..n
    INSERT INTO SM_HISTABLE0101 (SM_ID, SM_SUBID, ORGADDR, DESTADDR, ID_HINT)
    VALUES (V_SM_ID(I), V_SM_SUBID(I), V_ORGADDR(I), V_DESTADDR(I), V_ID_HINT(I));
  COMMIT;
  select value into vredo2 from v$sysstat where name = 'redo size';
  dbms_output.put_line('redo size:' || to_char(vredo2 - vredo1));
END;
/
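Then run the following script to test. The SCN is queried before and after each run so that the matching range of the redo log can be dumped later: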
set serveroutput on
truncate table sm_histable0101;
select max(ktuxescnw * power(2, 32) + ktuxescnb) from x$ktuxe;
exec redo1;
select max(ktuxescnw * power(2, 32) + ktuxescnb) from x$ktuxe;

truncate table sm_histable0101;
select max(ktuxescnw * power(2, 32) + ktuxescnb) from x$ktuxe;
exec redo2;
select max(ktuxescnw * power(2, 32) + ktuxescnb) from x$ktuxe;

The results are as follows:
Table truncated.

SQL>
MAX(KTUXESCNW*POWER(2,32)+KTUXESCNB)
------------------------------------
                            87596111

SQL> redo size:707356

PL/SQL procedure successfully completed.

SQL>
MAX(KTUXESCNW*POWER(2,32)+KTUXESCNB)
------------------------------------
                            87596151

SQL> truncate table sm_histable0101;
select max(ktuxescnw * power(2, 32) + ktuxescnb) from x$ktuxe;
exec redo2;
select max(ktuxescnw * power(2, 32) + ktuxescnb) from x$ktuxe;

Table truncated.

SQL>
MAX(KTUXESCNW*POWER(2,32)+KTUXESCNB)
------------------------------------
                            87596178

SQL> redo size:138728

PL/SQL procedure successfully completed.

SQL>
MAX(KTUXESCNW*POWER(2,32)+KTUXESCNB)
------------------------------------
                            87596195
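Judging from the results, the ordinary INSERT loop generated 707356 bytes of redo, while BULK INSERT generated only 138728 bytes, less than one fifth as much. It appears that BULK INSERT works quite differently inside the Oracle RDBMS and presumably uses the Quick Multi-Insert operation we guessed at earlier. Let's verify this by dumping the redo log for the SCN range of the redo2 run: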
SQL> alter system dump logfile '/opt/oracle/oradata/orcl/redo01.log' scn min 87596178 scn max 87596195;
System altered.
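The SCN values in this experiment appear in two notations: the x$ktuxe query prints a decimal number (wrap * 2^32 + base), while redo dump headers print `wrap.base` in hexadecimal, for example `SCN:0x0000.05389c94`. The following is a minimal Python sketch for converting between the two, handy when checking that a dumped change vector falls inside the requested SCN window. The function names are made up for this illustration and are not part of any Oracle tool.

```python
def scn_from_wrap_base(wrap: int, base: int) -> int:
    """Combine SCN wrap and base the same way the x$ktuxe query does."""
    return wrap * 2**32 + base


def scn_from_dump(text: str) -> int:
    """Convert a dump-style SCN such as '0x0000.05389c94' to decimal."""
    text = text.strip().lower()
    if text.startswith("0x"):
        text = text[2:]
    wrap_hex, base_hex = text.split(".")
    return scn_from_wrap_base(int(wrap_hex, 16), int(base_hex, 16))


def scn_to_dump(scn: int) -> str:
    """Convert a decimal SCN to the wrap.base hex notation used in dumps."""
    wrap, base = divmod(scn, 2**32)
    return f"0x{wrap:04x}.{base:08x}"


if __name__ == "__main__":
    # SCN window passed to ALTER SYSTEM DUMP LOGFILE for the redo2 run
    scn_min, scn_max = 87596178, 87596195
    header_scn = scn_from_dump("0x0000.05389c94")
    print(header_scn, scn_min <= header_scn <= scn_max)
    print(scn_to_dump(scn_min), scn_to_dump(scn_max))
```

With the numbers from this post, `0x0000.05389c94` converts to 87596180, which lies between the two SCNs recorded around `exec redo2`.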
Let's look at the dumped redo records:
CHANGE #2 TYP:0 CLS:18 AFN:7 DBA:0x01c007f2 OBJ:4294967295 SCN:0x0000.05389bf0 SEQ: 2 OP:5.1
ktudb redo: siz: 396 spc: 5858 flg: 0x0012 seq: 0x2aba rec: 0x11
            xid: 0x0001.021.0000551d
ktubl redo: slt: 33 rci: 0 opc: 11.1 objn: 122951 objd: 122956 tsn: 0
Undo type: Regular undo  Begin trans  Last buffer split: No
Temp Object: No
Tablespace Undo: No
             0x00000000 prev ctl uba: 0x01c007f2.2aba.0f
prev ctl max cmt scn: 0x0000.05388a1f prev tx cmt scn: 0x0000.05388a26
txn start scn: 0xffff.ffffffff logon user: 0 prev brb: 29362160 prev bcl: 0
KDO undo record:
KTB Redo
op: 0x03 ver: 0x01
op: Z
KDO Op code: QMD row dependencies Disabled
  xtype: XA flags: 0x00000000 bdba: 0x00406482 hdba: 0x00406481
itli: 1 ispac: 0 maxfr: 4863
tabn: 0 lock: 0 nrow: 131
slot[0]: 0
slot[1]: 1
slot[2]: 2
slot[3]: 3
slot[4]: 4
slot[5]: 5
slot[6]: 6
slot[7]: 7
slot[8]: 8
slot[9]: 9
slot[10]: 10
slot[11]: 11
slot[12]: 12
slot[13]: 13
slot[14]: 14
slot[15]: 15
slot[16]: 16
slot[17]: 17
slot[18]: 18
slot[19]: 19
slot[20]: 20
slot[21]: 21
slot[22]: 22
slot[23]: 23
slot[24]: 24
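As shown above, the undo also differs from that of an ordinary INSERT. It is written in batch form: all rows inserted into the same data block are recorded in a single undo change vector (here a QMD, Quick Multi-Delete, record with nrow: 131). Now let's look at the data change itself: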
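To get a feel for why batching per block shrinks the redo stream, here is a rough Python estimate of how many change vectors each approach needs for the 2000-row test. It assumes, purely for illustration, that every block receives the same 131 rows as the block in the dump (the `nrow: 131` value) and that the row-by-row path writes one undo vector and one data vector per row; neither assumption is verified by the dump excerpt itself.

```python
import math

ROWS = 2000            # rows inserted by redo1 / redo2
ROWS_PER_BLOCK = 131   # nrow from the QMD record; assumed equal for every block


def vectors_row_by_row(rows: int) -> int:
    """One undo vector plus one data vector per row (assumption)."""
    return 2 * rows


def vectors_bulk(rows: int, rows_per_block: int) -> int:
    """One undo vector plus one data vector per touched block (assumption)."""
    blocks = math.ceil(rows / rows_per_block)
    return 2 * blocks


if __name__ == "__main__":
    print("blocks touched:", math.ceil(ROWS / ROWS_PER_BLOCK))
    print("row-by-row vectors:", vectors_row_by_row(ROWS))
    print("bulk vectors:", vectors_bulk(ROWS, ROWS_PER_BLOCK))
```

Under these assumptions the bulk path needs a few dozen change vectors instead of several thousand, so the fixed per-vector header cost is paid far less often. The row data itself still has to be logged, which fits the observation that redo drops to roughly a fifth rather than to almost nothing.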
CHANGE #3 TYP:0 CLS: 1 AFN:1 DBA:0x00406482 OBJ:122956 SCN:0x0000.05389c94 SEQ: 3 OP:11.11
KTB Redo
op: 0x01 ver: 0x01
op: F xid: 0x0001.021.0000551d uba: 0x01c007f2.2aba.11
KDO Op code: QMI row dependencies Disabled
  xtype: XA flags: 0x00000000 bdba: 0x00406482 hdba: 0x00406481
itli: 1 ispac: 0 maxfr: 4863
tabn: 0 lock: 1 nrow: 131
slot[0]: 0
tl: 53 fb: --H-FL-- lb: 0x0 cc: 29
col 0: [ 2] c1 02
col 1: [ 2] c1 0d
col 2: *NULL*
col 3: *NULL*
col 4: *NULL*
col 5: [ 9] 34 34 34 35 35 35 35 36 35
col 6: *NULL*
col 7: *NULL*
col 8: [ 6] 35 35 35 35 35 35
col 9: *NULL*
col 10: *NULL*
col 11: *NULL*
col 12: *NULL*
col 13: *NULL*
col 14: *NULL*
col 15: *NULL*
col 16: *NULL*
col 17: *NULL*
col 18: *NULL*
col 19: *NULL*
col 20: *NULL*
col 21: *NULL*
col 22: *NULL*
col 23: *NULL*
col 24: *NULL*
col 25: *NULL*
col 26: *NULL*
col 27: *NULL*
col 28: [ 2] c1 02
slot[1]: 1
tl: 53 fb: --H-FL-- lb: 0x0 cc: 29
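The column bytes in this record can be decoded by hand to confirm that slot 0 really is the first row built by the test procedure (SM_ID 1, SM_SUBID 12, ORGADDR '444555565', DESTADDR '555555', ID_HINT 1). The Python sketch below is a deliberately limited decoder: it handles only positive integers in Oracle's base-100 NUMBER encoding and ASCII VARCHAR2 bytes, which is all this dump needs. The function names are invented for this example.

```python
def decode_positive_integer(hex_bytes: str) -> int:
    """Decode a positive integer stored in Oracle's internal NUMBER format.

    The first byte is the base-100 exponent offset by 0xC1; each following
    byte is one base-100 digit stored as digit + 1.
    """
    raw = bytes.fromhex(hex_bytes)
    if len(raw) < 2 or raw[0] <= 0x80:
        raise ValueError("only positive, non-zero numbers are handled here")
    exponent = raw[0] - 0xC1
    value = 0
    for position, byte in enumerate(raw[1:]):
        power = exponent - position
        if power < 0:
            raise ValueError("fractional digits are not handled here")
        value += (byte - 1) * 100 ** power
    return value


def decode_text(hex_bytes: str) -> str:
    """Decode VARCHAR2 bytes from the dump, assuming plain ASCII content."""
    return bytes.fromhex(hex_bytes).decode("ascii")


if __name__ == "__main__":
    print("col 0  SM_ID    =", decode_positive_integer("c1 02"))
    print("col 1  SM_SUBID =", decode_positive_integer("c1 0d"))
    print("col 5  ORGADDR  =", decode_text("34 34 34 35 35 35 35 36 35"))
    print("col 8  DESTADDR =", decode_text("35 35 35 35 35 35"))
    print("col 28 ID_HINT  =", decode_positive_integer("c1 02"))
```

The column positions follow the CREATE TABLE order, so col 5 is ORGADDR, col 8 is DESTADDR and col 28 is ID_HINT.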
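Opcode 11.11 is indeed used. BULK INSERT goes through a batch row-insert mechanism, which is why it performs so much better than single-row operations. In our earlier tests, BULK operations were generally more than twice as fast as single-row operations, and in some cases more than three times as fast.

Looking at redo layer 11, you may notice something odd: 11.11 is multi-insert and 11.12 is multi-delete, but there is no multi-update. Is BULK UPDATE implemented differently from BULK INSERT? Let's run an experiment to see whether BULK UPDATE reduces the amount of redo generated. With simple changes to redo1 and redo2 we get two new stored procedures, redou1 and redou2: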
create or replace procedure redou1 is
  TYPE T_SM_ID    IS TABLE OF NUMBER(10)   INDEX BY BINARY_INTEGER;
  TYPE T_SM_SUBID IS TABLE OF NUMBER(3)    INDEX BY BINARY_INTEGER;
  TYPE T_ORGADDR  IS TABLE OF VARCHAR2(21) INDEX BY BINARY_INTEGER;
  TYPE T_DESTADDR IS TABLE OF VARCHAR2(21) INDEX BY BINARY_INTEGER;
  TYPE T_ID_HINT  IS TABLE OF NUMBER(10)   INDEX BY BINARY_INTEGER;
  V_SM_ID    T_SM_ID;
  V_SM_SUBID T_SM_SUBID;
  V_ORGADDR  T_ORGADDR;
  V_DESTADDR T_DESTADDR;
  V_ID_HINT  T_ID_HINT;
  vredo1 INTEGER;
  vredo2 INTEGER;
BEGIN
  -- Build 2000 new values in memory
  FOR I IN 1..2000 LOOP
    V_SM_ID(I)    := I;
    V_SM_SUBID(I) := 12;
    V_ORGADDR(I)  := '111111';
    V_DESTADDR(I) := '2222';
    V_ID_HINT(I)  := I;
  END LOOP;
  -- Redo generated so far; v$sysstat is instance-wide, so run this on an otherwise idle instance
  select value into vredo1 from v$sysstat where name = 'redo size';
  -- Ordinary update: one UPDATE execution per row
  FOR I IN 1..2000 LOOP
    update SM_HISTABLE0101 SET ORGADDR = V_ORGADDR(I) WHERE ID_HINT = V_ID_HINT(I);
  END LOOP;
  COMMIT;
  select value into vredo2 from v$sysstat where name = 'redo size';
  dbms_output.put_line('redo size:' || to_char(vredo2 - vredo1));
END;
/
create or replace procedure redoU2 is
  TYPE T_SM_ID    IS TABLE OF NUMBER(10)   INDEX BY BINARY_INTEGER;
  TYPE T_SM_SUBID IS TABLE OF NUMBER(3)    INDEX BY BINARY_INTEGER;
  TYPE T_ORGADDR  IS TABLE OF VARCHAR2(21) INDEX BY BINARY_INTEGER;
  TYPE T_DESTADDR IS TABLE OF VARCHAR2(21) INDEX BY BINARY_INTEGER;
  TYPE T_ID_HINT  IS TABLE OF NUMBER(10)   INDEX BY BINARY_INTEGER;
  V_SM_ID    T_SM_ID;
  V_SM_SUBID T_SM_SUBID;
  V_ORGADDR  T_ORGADDR;
  V_DESTADDR T_DESTADDR;
  V_ID_HINT  T_ID_HINT;
  vredo1 INTEGER;
  vredo2 INTEGER;
  n      INTEGER;
BEGIN
  n := 2000;
  -- Build n new values in memory
  FOR I IN 1..n LOOP
    V_SM_ID(I)    := I;
    V_SM_SUBID(I) := 12;
    V_ORGADDR(I)  := '111111';
    V_DESTADDR(I) := '2222';
    V_ID_HINT(I)  := I;
  END LOOP;
  -- Redo generated so far; v$sysstat is instance-wide, so run this on an otherwise idle instance
  select value into vredo1 from v$sysstat where name = 'redo size';
  -- Bulk update: the whole array is bound to a single UPDATE execution
  FORALL I IN 1..n
    update SM_HISTABLE0101 SET ORGADDR = V_ORGADDR(I) WHERE ID_HINT = V_ID_HINT(I);
  COMMIT;
  select value into vredo2 from v$sysstat where name = 'redo size';
  dbms_output.put_line('redo size:' || to_char(vredo2 - vredo1));
END;
/
Then run the following:
select max(ktuxescnw * power(2, 32) + ktuxescnb) from x$ktuxe;
exec redou1;
select max(ktuxescnw * power(2, 32) + ktuxescnb) from x$ktuxe;

select max(ktuxescnw * power(2, 32) + ktuxescnb) from x$ktuxe;
exec redou2;
select max(ktuxescnw * power(2, 32) + ktuxescnb) from x$ktuxe;
redo size:578904

PL/SQL procedure successfully completed.

SQL>
MAX(KTUXESCNW*POWER(2,32)+KTUXESCNB)
------------------------------------
                            87608317

SQL> SQL> SQL>
MAX(KTUXESCNW*POWER(2,32)+KTUXESCNB)
------------------------------------
                            87608317

SQL> redo size:571168

PL/SQL procedure successfully completed.

SQL>
MAX(KTUXESCNW*POWER(2,32)+KTUXESCNB)
------------------------------------
                            87610350
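From these results, the two approaches generate almost the same amount of redo (578904 bytes for the row-by-row loop versus 571168 bytes for FORALL), so BULK UPDATE does not appear to improve much on the redo side. Let's confirm this result by dumping the redo log: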
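To put the four measurements side by side, here is a small Python calculation using only the redo sizes printed by the test runs in this post. The dictionary keys are labels chosen for this example.

```python
ROWS = 2000  # rows touched by each procedure

# 'redo size' deltas printed by the four procedures in this post (bytes)
REDO_BYTES = {
    "insert_loop": 707356,    # redo1
    "insert_forall": 138728,  # redo2
    "update_loop": 578904,    # redou1
    "update_forall": 571168,  # redou2
}


def ratio(bulk: str, loop: str) -> float:
    """Redo of the bulk variant as a fraction of the row-by-row variant."""
    return REDO_BYTES[bulk] / REDO_BYTES[loop]


def bytes_per_row(name: str) -> float:
    """Average redo bytes generated per row for one test run."""
    return REDO_BYTES[name] / ROWS


if __name__ == "__main__":
    print(f"insert: FORALL / loop = {ratio('insert_forall', 'insert_loop'):.3f}")
    print(f"update: FORALL / loop = {ratio('update_forall', 'update_loop'):.3f}")
    for name in REDO_BYTES:
        print(f"{name:14s} {bytes_per_row(name):8.1f} bytes/row")
```

The insert ratio comes out just under 0.2, while the update ratio stays close to 1, which is the contrast the two experiments are meant to show. Because the numbers come from the instance-wide v$sysstat counter, small differences such as the one between the two update runs should not be over-interpreted.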
CHANGE #3 TYP:2 CLS: 1 AFN:1 DBA:0x00406482 OBJ:122959 SCN:0x0000.0538cbfd SEQ: 1 OP:11.5
KTB Redo
op: 0x11 ver: 0x01
op: F xid: 0x0003.011.0000724a uba: 0x00800ac1.32bc.1e
Block cleanout record, scn: 0x0000.0538cbff ver: 0x01 opt: 0x02, entries follow...
  itli: 2 flg: 2 scn: 0x0000.0538cbfd
KDO Op code: URP row dependencies Disabled
  xtype: XA flags: 0x00000000 bdba: 0x00406482 hdba: 0x00406481
itli: 1 ispac: 0 maxfr: 4863
tabn: 0 slot: 0(0x0) flag: 0x2c lock: 1 ckix: 191
ncol: 29 nnew: 1 size: 0
col 5: [ 6] 31 31 31 31 31 31
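Here we see opcode 11.5 (URP, Update Row Piece), which is an ordinary single-row UPDATE operation. In other words, the FORALL update is still logged row by row.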
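Reading a large dump by eye is tedious, so a quick tally of the `OP:layer.code` values in the CHANGE headers is one way to see at a glance whether a run was logged with multi-row or single-row opcodes. Below is a minimal Python sketch; the regular expression and the helper name are assumptions made for this illustration, and the opcode names are taken from the layer 11 list earlier in this post.

```python
import re
from collections import Counter

# Layer 11 opcode names, copied from the list earlier in this post
LAYER11_NAMES = {
    2: "Insert Row Piece",
    5: "Update Row Piece",
    11: "Quick Multi-Insert",
    12: "Quick Multi-Delete",
}

# Matches the OP:<layer>.<code> field at the end of a CHANGE header line
OP_PATTERN = re.compile(r"\bOP:(\d+)\.(\d+)")


def count_opcodes(dump_text: str) -> Counter:
    """Count (layer, code) pairs found in CHANGE headers of a redo dump."""
    return Counter(
        (int(layer), int(code)) for layer, code in OP_PATTERN.findall(dump_text)
    )


if __name__ == "__main__":
    # The three CHANGE headers quoted in this post
    sample = """
CHANGE #2 TYP:0 CLS:18 AFN:7 DBA:0x01c007f2 OBJ:4294967295 SCN:0x0000.05389bf0 SEQ: 2 OP:5.1
CHANGE #3 TYP:0 CLS: 1 AFN:1 DBA:0x00406482 OBJ:122956 SCN:0x0000.05389c94 SEQ: 3 OP:11.11
CHANGE #3 TYP:2 CLS: 1 AFN:1 DBA:0x00406482 OBJ:122959 SCN:0x0000.0538cbfd SEQ: 1 OP:11.5
"""
    for (layer, code), hits in sorted(count_opcodes(sample).items()):
        name = LAYER11_NAMES.get(code, "") if layer == 11 else ""
        print(f"OP:{layer}.{code:<3} x{hits} {name}")
```

Run against a full trace file, a bulk insert should show mostly 11.11 entries, while both update variants should show 11.5 entries, matching what the excerpts above suggest.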