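When migrating or synchronizing data, we often have to extract from tables that the application modifies frequently. Oracle then has to read undo (rollback segment) data to build consistent-read copies of the changed blocks, which makes the extract slow and sometimes ends in an ORA-01555 (snapshot too old) error.
For example, suppose we have a fairly large table of about 40 GB that holds one month of data in a monthly partition. To copy that data to another database as quickly as possible, there are several options: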
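Method 1:
The approach everyone knows: a direct-path insert with the APPEND hint, no redo logging (NOLOGGING), and parallel extraction.
The code looks roughly like this: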
alter session enable parallel DML;
ALTER SESSION SET db_file_multiblock_read_count=128;

INSERT /*+ append parallel(b 2) */ INTO OS_USER_SERVICE_HIS_1 b
SELECT /*+ FULL(a) PARALLEL(a, 2) */ *
  FROM OS_USER_SERVICE_HIS a
 WHERE CREATETIME >= TO_DATE('20110906', 'yyyymmdd');
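Method 2:
Open 4 sessions:
Split the data logically into 4 time ranges on createtime and process them in parallel. For one month of data, session 1 handles the first week, and so on up to session 4, which handles the fourth week. You can also split further, for example by having each session loop over its range one hour at a time.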
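As a rough sketch of method 2, each session could run a statement like the one below with its own time window. The table names are reused from method 1, and the dates are assumptions chosen only for illustration. Half-open windows (>= start, < end) keep neighbouring sessions from overlapping or leaving gaps. The sketch uses a conventional insert rather than APPEND, on the assumption that concurrent direct-path inserts into the same table would block each other on the table lock.

```sql
-- Session 1 of 4: first week only (window dates are hypothetical).
-- Sessions 2 to 4 run the same statement with their own windows.
INSERT INTO OS_USER_SERVICE_HIS_1
SELECT *
  FROM OS_USER_SERVICE_HIS a
 WHERE a.CREATETIME >= TO_DATE('20110901', 'yyyymmdd')
   AND a.CREATETIME <  TO_DATE('20110908', 'yyyymmdd');

COMMIT;
```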
Method 3:
Parallelize by rowid:
This is the method I want to focus on here.
A real-world case:
First, create a table to hold the rowid ranges:
create table ROWID_OS_USER_BEHAVIOR_201212 (
  ID        NUMBER,
  ROWID_MIN VARCHAR2(32),
  ROWID_MAX VARCHAR2(32),
  FLAG      NUMBER
);
Get the data_object_id of the partition on the remote database:
SQL> select data_object_id
       from [email protected]
      where object_name = 'OS_USER_BEHAVIOR_MONTH'
        and subobject_name = 'OS_USER_BEHAVIOR_MONTH2012M12';

DATA_OBJECT_ID
--------------
        218043
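Next, build the minimum and maximum rowid of every extent of that partition on the remote database and store the ranges in the table. Each range runs from row 0 of the extent's first block to row 10000 of its last block, a row number chosen to be larger than any block is expected to hold:
SQL> insert into rowid_os_user_behavior_201212 (id, rowid_min, rowid_max, flag)
     select rownum,
            [email protected](1, 218043, e.RELATIVE_FNO, e.BLOCK_ID, 0),
            [email protected](1, 218043, e.RELATIVE_FNO, e.BLOCK_ID + e.BLOCKS - 1, 10000),
            0
       from [email protected] e
      where e.segment_name = 'OS_USER_BEHAVIOR_MONTH'
        and e.owner = 'OSS01'
        and partition_name = 'OS_USER_BEHAVIOR_MONTH2012M12';

659 rows inserted

SQL> commit;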
After the insert, a sample row looks like this:
SQL> select * from rowid_os_user_behavior_201212 where flag =0 and rownum =1;
ID ROWID_MIN ROWID_MAX FLAG
---------- -------------------------------- -------------------------------- ----------
422 AAA1O7AAxAADDgJAAA AAA1O7AAxAADFgICcQ 0
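To sanity-check a stored range, the rowid strings can be decoded again with the standard dbms_rowid functions. The query below is only a sketch and was not part of the original procedure. It decodes the sample row (id 422) locally. The data_object_id it returns should match the 218043 found earlier, and the two block numbers give the first and last block of that extent.

```sql
-- Decode the stored rowid range of one extent back into its components.
SELECT id,
       dbms_rowid.rowid_object(chartorowid(rowid_min))       AS data_object_id,
       dbms_rowid.rowid_relative_fno(chartorowid(rowid_min)) AS relative_fno,
       dbms_rowid.rowid_block_number(chartorowid(rowid_min)) AS first_block,
       dbms_rowid.rowid_block_number(chartorowid(rowid_max)) AS last_block,
       dbms_rowid.rowid_row_number(chartorowid(rowid_max))   AS max_row_number
  FROM rowid_os_user_behavior_201212
 WHERE id = 422;
```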
Write the stored procedure that pulls the data:
create or replace procedure p_ods_os_user_beha_month(i integer) is
  -- One collection per column of the remote table.
  vSTATEDATE         dbms_sql.NUMBER_Table;
  vUSERNUMBER        dbms_sql.VARCHAR2_Table;
  vSERVICEID         dbms_sql.NUMBER_Table;
  vOPERTYPE          dbms_sql.NUMBER_Table;
  vRECVCOUNT         dbms_sql.NUMBER_Table;
  vSENDCOUNT         dbms_sql.NUMBER_Table;
  vTOTALCOUNT        dbms_sql.NUMBER_Table;
  vPRESENDCOUNT      dbms_sql.NUMBER_Table;
  vENTERPRISEFLAG    dbms_sql.NUMBER_Table;
  vENTERPRISESHEETNO dbms_sql.VARCHAR2_Table;
  vCREATETIME        dbms_sql.DATE_Table;
  vMODIFYTIME        dbms_sql.DATE_Table;
  vPROVCODE          dbms_sql.NUMBER_Table;
  vSERVICEITEM       dbms_sql.VARCHAR2_Table;
  vCARDTYPE          dbms_sql.NUMBER_Table;
  vAREACODE          dbms_sql.NUMBER_Table;
  vBINDTYPEID        dbms_sql.NUMBER_Table;
  vORDERTYPE         dbms_sql.NUMBER_Table;
  vMAILSERVICEITEM   dbms_sql.VARCHAR2_Table;
  /* vCounter number := 1; */
  vCounter_out number := 0;
  cur_syncdata sys_refcursor;
begin
  -- Each worker takes the unprocessed rowid ranges with mod(id, 4) = i,
  -- where i is the worker number passed in (0 to 3).
  for x in (select *
              from rowid_OS_USER_BEHAVIOR_201212
             where mod(id, 4) = i
               and flag = 0) loop
    begin
      -- Read one extent's worth of rows from the remote table by rowid range.
      open cur_syncdata for
        select /*+ rowid(t) */
               STATEDATE, USERNUMBER, SERVICEID, OPERTYPE, RECVCOUNT,
               SENDCOUNT, TOTALCOUNT, PRESENDCOUNT, ENTERPRISEFLAG,
               ENTERPRISESHEETNO, CREATETIME, MODIFYTIME, PROVCODE,
               SERVICEITEM, CARDTYPE, AREACODE, BINDTYPEID, ORDERTYPE,
               MAILSERVICEITEM
          from [email protected] t
         where rowid >= chartorowid(x.rowid_min)
           and rowid <= chartorowid(x.rowid_max);
      loop
        begin
          -- Fetch and insert in batches of 5000 rows.
          fetch cur_syncdata bulk collect
            into vSTATEDATE, vUSERNUMBER, vSERVICEID, vOPERTYPE, vRECVCOUNT,
                 vSENDCOUNT, vTOTALCOUNT, vPRESENDCOUNT, vENTERPRISEFLAG,
                 vENTERPRISESHEETNO, vCREATETIME, vMODIFYTIME, vPROVCODE,
                 vSERVICEITEM, vCARDTYPE, vAREACODE, vBINDTYPEID, vORDERTYPE,
                 vMAILSERVICEITEM
            limit 5000;
          forall idx in 1 .. vUSERNUMBER.count
            insert into OS_USER_BEHAVIOR_MONTH_201212
              (STATEDATE, USERNUMBER, SERVICEID, OPERTYPE, RECVCOUNT,
               SENDCOUNT, TOTALCOUNT, PRESENDCOUNT, ENTERPRISEFLAG,
               ENTERPRISESHEETNO, CREATETIME, MODIFYTIME, PROVCODE,
               SERVICEITEM, CARDTYPE, AREACODE, BINDTYPEID, ORDERTYPE,
               MAILSERVICEITEM)
            values
              (vSTATEDATE(idx), vUSERNUMBER(idx), vSERVICEID(idx),
               vOPERTYPE(idx), vRECVCOUNT(idx), vSENDCOUNT(idx),
               vTOTALCOUNT(idx), vPRESENDCOUNT(idx), vENTERPRISEFLAG(idx),
               vENTERPRISESHEETNO(idx), vCREATETIME(idx), vMODIFYTIME(idx),
               vPROVCODE(idx), vSERVICEITEM(idx), vCARDTYPE(idx),
               vAREACODE(idx), vBINDTYPEID(idx), vORDERTYPE(idx),
               vMAILSERVICEITEM(idx));
          vCounter_out := vCounter_out + sql%rowcount;
          commit;
          /* if vCounter = 1000 then
               begin
                 dbms_lock.sleep(3);
                 vCounter := 0;
               end;
             end if; */
          exit when cur_syncdata%notfound;
        exception
          when others then
            -- Report the error, undo the current batch and stop this worker.
            -- The range keeps flag = 0, so it stays visible as unprocessed.
            dbms_output.put_line(sqlerrm);
            rollback;
            if cur_syncdata%isopen then
              close cur_syncdata;
            end if;
            return;
        end;
      end loop;
      close cur_syncdata;
    end;
    -- Mark this rowid range as processed.
    update rowid_OS_USER_BEHAVIOR_201212 set flag = 1 where id = x.id;
    commit;
  end loop;
  dbms_output.put_line('Processed ' || vCounter_out || ' rows in total');
end;
/
Then open 4 sessions and pass in 0, 1, 2 and 3 respectively:
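In SQL*Plus, starting one worker might look like the sketch below. It assumes the procedure has been compiled in the current schema. SET SERVEROUTPUT ON is only needed to see the row count printed at the end.

```sql
-- Run in session 1; use 1, 2 and 3 in the other three sessions.
SET SERVEROUTPUT ON
EXEC p_ods_os_user_beha_month(0);
```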
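In testing, 30 GB of data was pulled in about 40 minutes with 4 parallel processes. This method is aimed at data that is under frequent DML, where its advantage is most obvious: each query covers the rowid range of a single extent and finishes quickly, so it depends far less on undo than one long-running scan.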
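While the workers run, progress can be followed from a fifth session through the flag column that the procedure sets after each range. The query below is a sketch, not part of the original write-up. It groups the ranges by the same mod(id, 4) expression the workers use.

```sql
-- Ranges finished versus total, per worker number.
SELECT MOD(id, 4)                                AS worker,
       SUM(CASE WHEN flag = 1 THEN 1 ELSE 0 END) AS ranges_done,
       COUNT(*)                                  AS ranges_total
  FROM rowid_os_user_behavior_201212
 GROUP BY MOD(id, 4)
 ORDER BY worker;
```

Ranges still at flag = 0 after a worker has exited are the ones to look at first, since the procedure returns early when a batch fails.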