A customer reported an OGG error this morning:
2018-03-28 10:26:20 ERROR OGG-01416 Oracle GoldenGate Capture for Oracle, P_HYDEE.prm: File d:\ogg\dirdat\et007149, with format RELEASE 9.0/9.5, does not match current format specification of RELEASE 11.2. Modify the parameter file to specify format RELEASE 9.0/9.5 or issue ETROLLOVER prior to restart.
2018-03-28 10:26:20 ERROR OGG-01668 Oracle GoldenGate Capture for Oracle, P_HYDEE.prm: PROCESS ABENDING.
This error is fairly easy to fix: alter extract xxx etrollover
Reference: http://www.askmaclean.com/archives/goldengate-ogg-ogg-01416.html
After running the command, a new problem appeared:
2018-03-28 10:31:20 ERROR OGG-01033 Oracle GoldenGate Capture for Oracle, P_HYDEE.prm: There is a problem in network communication, a remote file problem, encryption keys for target and source do not match (if using ENCRYPT) or an unknown error. (Remote file used is d:\ogg\dirdat\et007150, reply received is Input rba past EOF for d:\ogg\dirdat\et007150; input rba: 99998104; EOF rba: 98832384).
2018-03-28 10:31:20 ERROR OGG-01668 Oracle GoldenGate Capture for Oracle, P_HYDEE.prm: PROCESS ABENDING.
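This one took me quite a while to track down.
Eventually, at the operating system level on the target, I saw that the D: drive was completely full, with no free disk space left.
At the same time, ggserr.log on the target was full of messages like these: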
2018-03-28 10:31:10 INFO OGG-01228 Oracle GoldenGate Collector for Oracle: Timeout in 300 seconds.
2018-03-28 10:31:15 INFO OGG-01229 Oracle GoldenGate Collector for Oracle: Connected to WIN-O97S9H5BGRD:49780.
2018-03-28 10:31:15 INFO OGG-01669 Oracle GoldenGate Collector for Oracle: Opening d:\ogg\dirdat\et007150 (byte -1, current EOF 0).
2018-03-28 10:31:21 WARNING OGG-01223 Oracle GoldenGate Collector for Oracle: fwrite() error 112 (磁盘空间不足。) writing to d:\ogg\dirdat\et007150.
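That explained it: fwrite() error 112 (磁盘空间不足, Windows "not enough disk space") means the Collector could not finish writing et007150, which is consistent with the pump's input rba (99998104) lying past the file's EOF (98832384). The fix is relatively simple: clean up the target disk and free some space.
The whole failure came about roughly like this:
On February 28, the Replicat abended because of a table structure problem, and nobody on the customer side checked or noticed.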
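Since the root cause was a full drive on the target, a small free-space check might have caught it earlier. The following is a minimal sketch in Python, not anything shipped with OGG: the function name `free_space_report` and the 10% threshold are assumptions made for illustration.

```python
import shutil


def free_space_report(path, min_free_ratio=0.10):
    """Return (free_bytes, free_ratio, is_low) for the volume holding `path`.

    `min_free_ratio` is a hypothetical alert threshold, not an OGG setting.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    free_ratio = usage.free / usage.total
    return usage.free, free_ratio, free_ratio < min_free_ratio


if __name__ == "__main__":
    # "." stands in for the trail directory, e.g. d:\ogg\dirdat on the target.
    free, ratio, low = free_space_report(".")
    status = "LOW" if low else "ok"
    print(f"free: {free / 2**30:.1f} GiB ({ratio:.1%}) -> {status}")
```

Run on a schedule against the trail directory, a check like this would report low space before the Collector starts failing with fwrite() errors.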
2018-02-28 17:15:43 ERROR OGG-01161 Oracle GoldenGate Delivery for Oracle, R_*****.prm: Bad column index (124) specified for table *****, max columns = 124.
2018-02-28 17:15:43 ERROR OGG-01668 Oracle GoldenGate Delivery for Oracle, R_HYDEE.prm: PROCESS ABENDING.
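The abend above sat unnoticed for a month, so some kind of automated scan of ggserr.log seems worthwhile. Below is a rough sketch in Python that picks ERROR events out of log lines. It assumes the line layout seen in the excerpts in this post (timestamp, level, OGG code, message); the names `find_errors` and `has_abend` are hypothetical helpers, not OGG tooling.

```python
import re

# Assumed layout, based on the excerpts in this post:
# "YYYY-MM-DD HH:MM:SS LEVEL OGG-NNNNN message"
EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>INFO|WARNING|ERROR)\s+"
    r"(?P<code>OGG-\d+)\s+(?P<msg>.*)$"
)


def find_errors(lines):
    """Return (timestamp, code, message) for every ERROR event in `lines`."""
    hits = []
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if m and m.group("level") == "ERROR":
            hits.append((m.group("ts"), m.group("code"), m.group("msg")))
    return hits


def has_abend(lines):
    """True if any ERROR event reports that a process is abending."""
    return any("PROCESS ABENDING" in msg for _, _, msg in find_errors(lines))


# Sample lines copied from the log excerpts in this post.
SAMPLE = [
    "2018-02-28 17:15:43 ERROR OGG-01161 Oracle GoldenGate Delivery for Oracle, "
    "R_*****.prm: Bad column index (124) specified for table *****, max columns = 124.",
    "2018-02-28 17:15:43 ERROR OGG-01668 Oracle GoldenGate Delivery for Oracle, "
    "R_HYDEE.prm: PROCESS ABENDING.",
    "2018-03-28 10:31:10 INFO OGG-01228 Oracle GoldenGate Collector for Oracle: "
    "Timeout in 300 seconds.",
]

if __name__ == "__main__":
    for ts, code, msg in find_errors(SAMPLE):
        print(ts, code, msg)
```

In practice the lines would come from reading ggserr.log itself, and a positive `has_abend` result could trigger a mail or some other alert.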
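With the Replicat down, the trail files on the target could never be purged: with usecheckpoints, Manager keeps any file that a process has not finished reading. The mgr process was configured as follows: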
port 7500
AUTOSTART ER *
AUTORESTART EXTRACT *,RETRIES 3,WAITMINUTES 5,RESETMINUTES 60
dynamicportlist 7501-7505
autorestart extract *,waitminutes 2,retries 5
purgeoldextracts .\dirdat\et*,usecheckpoints,minkeepdays 2
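Because the trail files could not be purged, the disk ran out of space. The source-side pump then hit errors writing trail data to the target, and the pump process abended.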
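One symptom of this chain is trail files piling up well past the minkeepdays window. The sketch below, in Python, lists files matching a trail pattern that are older than a given number of days. It uses file modification time as a stand-in for the age Manager evaluates, which is an assumption, and `stale_trail_files` is a hypothetical helper name.

```python
import glob
import os
import time


def stale_trail_files(pattern, min_keep_days=2, now=None):
    """Return (path, age_days, size_bytes) for files older than `min_keep_days`.

    Age is measured from the file's modification time (an assumption);
    results are sorted oldest first.
    """
    now = time.time() if now is None else now
    cutoff = now - min_keep_days * 86400
    stale = []
    for path in glob.glob(pattern):
        st = os.stat(path)
        if st.st_mtime < cutoff:
            stale.append((path, (now - st.st_mtime) / 86400, st.st_size))
    return sorted(stale, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Pattern mirrors the purgeoldextracts rule shown earlier (relative to the OGG home).
    for path, age, size in stale_trail_files(os.path.join("dirdat", "et*")):
        print(f"{path}: {age:.1f} days old, {size} bytes")
```

A long list here does not prove anything by itself, but it is a hint that some reader of the trail (the Replicat in this case) has stopped advancing its checkpoint.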
Source: ITPUB Blog, link: http://blog.itpub.net/8520577/viewspace-2152297/. Please credit the source when reposting; otherwise legal action may be taken.