OGG: duplicate records cause a Replicat process to abend

Today I handled a case where a replication process abended unexpectedly. The error log was:

2012-08-20 10:33:02  WARNING OGG-00869  Oracle GoldenGate Delivery for Oracle, r_inv1.prm:  No unique key is defined for table 'WL_PSINFO'. All viable columns will be used to represent the key, but may not guarantee uniqueness.  KEYCOLS may be used to define the key.
2012-08-20 10:34:12  WARNING OGG-01431  Oracle GoldenGate Delivery for Oracle, r_inv1.prm:  Aborted grouped transaction on 'MBS7_INV.WL_PSINFO', Mapping error.
2012-08-20 10:34:12  WARNING OGG-01003  Oracle GoldenGate Delivery for Oracle, r_inv1.prm:  Repositioning to rba 124252822 in seqno 77.
2012-08-20 10:34:12  WARNING OGG-01151  Oracle GoldenGate Delivery for Oracle, r_inv1.prm:  Error mapping from MBS7_INV.WL_PSINFO to MBS7_INV.WL_PSINFO.
2012-08-20 10:34:12  WARNING OGG-01003  Oracle GoldenGate Delivery for Oracle, r_inv1.prm:  Repositioning to rba 124252822 in seqno 77.
2012-08-20 10:34:12  ERROR   OGG-01296  Oracle GoldenGate Delivery for Oracle, r_inv1.prm:  Error mapping from MBS7_INV.WL_PSINFO to MBS7_INV.WL_PSINFO.
2012-08-20 10:34:12  ERROR   OGG-01668  Oracle GoldenGate Delivery for Oracle, r_inv1.prm:  PROCESS ABENDING.

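Judging from the log alone, the problem looks like a missing primary key on the table, but OGG can replicate tables that have no primary key. Looking further at the target table and at the time of the abend showed what had happened: the table originally had no primary key, a primary key was created on the source shortly before the failure, and at that moment the target table contained duplicate records: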

On the target:

select id, ordercode, consigncode
  from mbs7_inv.WL_PSINFO
 group by id, ordercode, consigncode
having count(*) > 1;

..... This returned more than 1,000 duplicate records on the target, while the source had no duplicates at all.
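To put numbers on that comparison, the same grouping can be counted on both sides in a single statement. This is only a sketch: it assumes the statement is run on the target and that link_100 (the database link that appears later in this post) points from the target to the source.

-- Run on the target database.
-- Assumption: link_100 is a database link from the target to the source.
SELECT 'target' AS side, COUNT(*) AS dup_groups
  FROM (SELECT id, ordercode, consigncode
          FROM mbs7_inv.WL_PSINFO
         GROUP BY id, ordercode, consigncode
        HAVING COUNT(*) > 1)
UNION ALL
SELECT 'source' AS side, COUNT(*) AS dup_groups
  FROM (SELECT id, ordercode, consigncode
          FROM mbs7_inv.WL_PSINFO@link_100
         GROUP BY id, ordercode, consigncode
        HAVING COUNT(*) > 1);

In the situation described here, the source line would be expected to show 0 groups and the target line a non-zero count.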

Querying further showed that the duplicate records match only in their first few columns; the remaining columns differ:

select *
  from mbs7_inv.WL_PSINFO
 where id in
       (select id
          from (select id, ordercode, consigncode
                  from mbs7_inv.WL_PSINFO
                 group by id, ordercode, consigncode
                having count(*) > 1))
 order by id;

So the usual way of deleting duplicate records cannot be used here.
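One way to see the differing columns next to each other is an analytic count over the three leading columns. This is a sketch rather than the query actually used in this case; the alias dup_count is made up for illustration.

-- Run on the target. Lists every row that belongs to a duplicate group,
-- together with the size of its group, so the rows of a group sort side by side.
-- dup_count is an illustrative alias, not a column of the table.
SELECT *
  FROM (SELECT t.*,
               COUNT(*) OVER (PARTITION BY id, ordercode, consigncode) AS dup_count
          FROM mbs7_inv.WL_PSINFO t)
 WHERE dup_count > 1
 ORDER BY id, ordercode, consigncode;

Because the rows in a group differ in their trailing columns, a ROWID-based cleanup that keeps one row per group would keep an arbitrary variant, which may not be the one that exists on the source.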

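To get the OGG process running normally again, I decided to first back up, on the source, the true copies of the records that were about to be deleted, then delete those records on the target and load them back from the backup. The steps were: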

1 Back up these records on the source first:

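-- Run on the source. The duplicate ids are found on the target through
-- the database link link_102; the rows themselves are copied from the source table.
create table system.WL_PSINFO_bak as
select *
  from mbs7_inv.WL_PSINFO a
 where a.id in
       (select id
          from (select id, ordercode, consigncode
                  from mbs7_inv.WL_PSINFO@link_102
                 group by id, ordercode, consigncode
                having count(*) > 1));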

 

2 Delete these records on the target:

delete from mbs7_inv.WL_PSINFO a
 where a.id in
       (select id
          from (select id, ordercode, consigncode
                  from mbs7_inv.WL_PSINFO
                 group by id, ordercode, consigncode
                having count(*) > 1));

commit;

3 Load the data backed up on the source into the target:

insert into mbs7_inv.WL_PSINFO select * from system.WL_PSINFO_bak@link_100;

commit;
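Before restarting the process, it may be worth confirming that the reload did what was intended. The two checks below are a sketch, not part of the original procedure; the second one assumes the table has no LOB or LONG columns, since MINUS does not support them.

-- Run on the target after step 3.
-- Check 1: expect no rows, meaning no duplicate groups remain.
SELECT id, ordercode, consigncode, COUNT(*) AS cnt
  FROM mbs7_inv.WL_PSINFO
 GROUP BY id, ordercode, consigncode
HAVING COUNT(*) > 1;

-- Check 2: expect no rows, meaning every backed-up source row now exists on the target.
SELECT * FROM system.WL_PSINFO_bak@link_100
MINUS
SELECT * FROM mbs7_inv.WL_PSINFO;

If either query returns rows, the data still differs from the source and the Replicat is likely to abend again at the same position.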

 

4 Start the replication process:

ggsci>start r_inv1

 

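Summary: the main thing is to understand how OGG replication works. It replicates by reading the logs or the DDL synchronization table. If an operation performed on the source cannot be executed on the target, OGG abends the Replicat process in order to protect data consistency, so the analysis has to combine the warning log with the table structures on both sides.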
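For the table-structure side of that analysis, the data dictionary can show which key constraints exist on each side and when they last changed. This is a sketch under assumptions: it is run on the target, link_100 points to the source, and the connected user can see the MBS7_INV objects in ALL_CONSTRAINTS on both databases.

-- Run on the target. Compares primary key (P) and unique (U) constraints
-- of MBS7_INV.WL_PSINFO on both sides.
-- Assumption: link_100 is a database link from the target to the source.
SELECT 'target' AS side, c.constraint_name, c.constraint_type, c.status,
       c.last_change, cc.column_name, cc.position
  FROM all_constraints c
  JOIN all_cons_columns cc
    ON cc.owner = c.owner
   AND cc.constraint_name = c.constraint_name
 WHERE c.owner = 'MBS7_INV'
   AND c.table_name = 'WL_PSINFO'
   AND c.constraint_type IN ('P', 'U')
UNION ALL
SELECT 'source' AS side, c.constraint_name, c.constraint_type, c.status,
       c.last_change, cc.column_name, cc.position
  FROM all_constraints@link_100 c
  JOIN all_cons_columns@link_100 cc
    ON cc.owner = c.owner
   AND cc.constraint_name = c.constraint_name
 WHERE c.owner = 'MBS7_INV'
   AND c.table_name = 'WL_PSINFO'
   AND c.constraint_type IN ('P', 'U')
 ORDER BY 1, 2, 7;

A key that appears only on the source, with a LAST_CHANGE close to the time of the abend, would match the sequence of events described in this post.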
