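A customer called today about a problem with their GoldenGate setup. Based on the symptoms they described and an analysis of the logs, the initial diagnosis was that the failure was caused by an unusable database index. The analysis and the fix are described below.
Problem description: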
2010-02-01 17:19:28 GGS ERROR 103 Discard file (./dirrpt/repsz.dsc) exceeded max bytes (10000000).
2010-02-01 17:19:28 GGS ERROR 190 PROCESS ABENDING.
Excerpt from the discard file:
ORA-01502: index a.IDX_SB_SBXX_SSSQ_QZ' or partition of such index is in unusable state, SQL <UPDATE "a"."DJ_YZCWSBQC_CWBB" SET "NSRDZDAH" = :a21,"ND" = :a22,"YF" = :a23,"SSSQ_Q" = :a24,"SSSQ_Z" = :a25,"CWBBZL_DM" = :a26,"SBQX" = :a27,"YQSBQX" = :a28,"SBRQ" = :a29,"SBFS_DM" = :a30,"HY_DM">
Operation failed at seqno 1816 rba 153098124
Discarding record on action DISCARD on error 1502
Problem replicating CTAIS2.SB_SBXX to CTAIS2.SB_SBXX
Error (1502) occurred with insert record (target format)...
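Problem analysis:
Judging by the error message, this is yet another case of a full discard file causing the process to abend. So what triggered such a large number of writes to the discard file this time? Going through the discard file turned up a large number of entries with the following error: "or partition of such index is in unusable state".
So an index in the database is in a failed state, and we start by fixing the database problem.
Resolution:
1. Refer to Oracle's official guidance for this ORA error: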
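To confirm which error dominates a large discard file, the entries can be tallied by ORA code. The following is a minimal sketch, not part of the original troubleshooting session: the function name `count_ora_errors` is hypothetical, and the sample text is an abbreviated copy of the excerpt above (the SQL text is shortened to `<UPDATE ...>`). In practice the text would be read from the discard file itself, for example `./dirrpt/repsz.dsc`.

```python
import re
from collections import Counter

# Matches an Oracle error code such as ORA-01502 and captures the digits.
ORA_CODE = re.compile(r"\bORA-(\d{5})\b")


def count_ora_errors(discard_text):
    """Count how often each ORA-nnnnn code appears in discard file text.

    Hypothetical helper for illustration only.
    """
    return Counter(ORA_CODE.findall(discard_text))


# Abbreviated sample based on the discard file excerpt above.
sample = (
    "ORA-01502: index a.IDX_SB_SBXX_SSSQ_QZ' or partition of such index "
    "is in unusable state, SQL <UPDATE ...>\n"
    "Operation failed at seqno 1816 rba 153098124\n"
    "Discarding record on action DISCARD on error 1502\n"
    "ORA-01502: index a.IDX_SB_SBXX_SSSQ_QZ' or partition of such index "
    "is in unusable state, SQL <UPDATE ...>\n"
)

counts = count_ora_errors(sample)
for code, n in counts.most_common():
    print(f"ORA-{code}: {n}")
```

If a single code such as ORA-01502 accounts for nearly all entries, that points to one root cause rather than many unrelated data errors.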
ORA-01502: index 'string.string' or partition of such index is in unusable state
Cause: An attempt has been made to access an index or index partition that has been marked unusable by a direct load or by a DDL operation
Action: DROP the specified index, or REBUILD the specified index, or REBUILD the unusable index partition
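2. We rebuilt the affected index.
3. We emptied the discard file, which had filled up.
4. We restarted the rep (Replicat) process, and the problem was resolved.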
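As a sketch of how step 2 could be prepared, the index names reported by ORA-01502 can be extracted from the discard file text and turned into candidate `ALTER INDEX ... REBUILD` statements for a DBA to review. This is an illustration under assumptions, not the procedure actually used here: `rebuild_statements` is a hypothetical helper, it only builds strings and runs nothing against the database, and it assumes a non-partitioned index (an unusable index partition would need `ALTER INDEX ... REBUILD PARTITION ...` instead). Identifier case and quoting should be checked against the data dictionary before running anything.

```python
import re
from typing import List

# Captures owner and index name from an ORA-01502 message.
# The surrounding quotes are optional because the excerpt in this
# post shows the name without the opening quote.
UNUSABLE_INDEX = re.compile(r"ORA-01502: index '?([\w$#]+)\.([\w$#]+)'?")


def rebuild_statements(discard_text: str) -> List[str]:
    """Return one candidate rebuild statement per distinct index found.

    Hypothetical helper: output is meant for manual review only.
    """
    statements: List[str] = []
    for owner, name in UNUSABLE_INDEX.findall(discard_text):
        stmt = f"ALTER INDEX {owner}.{name} REBUILD;"
        if stmt not in statements:  # keep first-seen order, skip repeats
            statements.append(stmt)
    return statements


# Abbreviated sample based on the discard file excerpt in this post.
sample = (
    "ORA-01502: index a.IDX_SB_SBXX_SSSQ_QZ' or partition of such index "
    "is in unusable state, SQL <UPDATE ...>\n"
    "ORA-01502: index a.IDX_SB_SBXX_SSSQ_QZ' or partition of such index "
    "is in unusable state, SQL <UPDATE ...>\n"
)

for stmt in rebuild_statements(sample):
    print(stmt)
```

Generating the statements rather than executing them keeps the DBA in control of when the rebuild runs, which matters because rebuilding a large index can take a long time and hold locks.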