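A colleague stopped an OGG process as part of routine work. The database is 12c, and OGG is of course 12c as well. It was an entirely ordinary operation, yet it backfired badly: about four of us spent nearly three hours on it. Here is the error in detail: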
GGSCI (dwdb1) 1> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:03 00:00:05
EXTRACT STOPPED EXT_KAF1 00:00:04 00:20:06
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:03
EXTRACT RUNNING PM_ZW1 00:00:00 00:00:08
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:02
REPLICAT RUNNING REP_HX4 00:00:04 00:00:00

Viewing the report for this process shows the following error:
2019-08-13 21:24:07 ERROR OGG-00868 Error code 1291, error message: ORA-01291: missing logfile
(Missing Log File WAITING FOR REDO: FILE NA, THREAD 2, SEQUENCE 758882, SCN 0x00000034fb602604. Read Position SCN: 52.4243351251
(227581650643)).

2019-08-13 21:24:07 ERROR OGG-01668 PROCESS ABENDING.
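The error prints SCNs in three notations: hex (0x00000034fb602604), a dotted pair (52.4243351251), and plain decimal (227581650643). A small sketch of how they relate, assuming the dotted pair is the high part and the low 32 bits of the decimal value (the names `wrap` and `base` are our own labels, not something the OGG output uses):

```python
from typing import Tuple


def scn_from_wrap_base(wrap: int, base: int) -> int:
    """Combine the dotted 'wrap.base' form into a single decimal SCN."""
    if not 0 <= base < 2**32:
        raise ValueError("base must fit in 32 bits")
    return (wrap << 32) | base


def wrap_base_from_scn(scn: int) -> Tuple[int, int]:
    """Split a decimal SCN back into its (wrap, base) pair."""
    return scn >> 32, scn & 0xFFFFFFFF


# Read position from the report: 52.4243351251 (227581650643)
read_position = scn_from_wrap_base(52, 4243351251)

# SCN the extract is waiting for, as printed in hex in the same message
missing_scn = int("0x00000034fb602604", 16)
```

With the values from the report, `missing_scn` comes out lower than `read_position`, which fits the picture: the extract has to go back to redo that is older than its current read position, and that redo is exactly what was deleted.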

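At first glance an archived log is missing, and that is indeed the case. This database is a statistics and reporting database, an OLAP system that generates a huge volume of archived logs every hour, so a script deletes them on a schedule, once every 5 minutes.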
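Our purge script is not shown here, so the following is only a sketch of the check it could make before deleting anything. It assumes the script can obtain two inputs: the lowest SCN the capture still needs (for example from `DBA_CAPTURE.REQUIRED_CHECKPOINT_SCN`) and the list of archived logs with their `NEXT_CHANGE#` from `V$ARCHIVED_LOG`. The `ArchivedLog` class and `safe_to_delete` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ArchivedLog:
    thread: int
    sequence: int
    next_change: int  # first SCN that is NOT in this log (NEXT_CHANGE#)
    name: str


def safe_to_delete(logs: Iterable[ArchivedLog], required_scn: int) -> List[ArchivedLog]:
    """Keep only logs that end at or before the SCN the capture still needs.

    A log covers SCNs up to, but not including, next_change. If
    next_change <= required_scn, nothing in it is still required.
    """
    return [log for log in logs if log.next_change <= required_scn]
```

Anything the function does not return would be left on disk until the capture has moved past it, at the cost of more archive space on a busy system.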

The usual fix is to restore the missing archived log:
[oracle@dwdb1 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Tue Aug 13 22:27:09 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: DWDB (DBID=2693284169)

RMAN> Restore archivelog from logseq 758882 until logseq 759082;

Starting restore at 13-AUG-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=3753 instance=dwdb1 device type=DISK
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 08/13/2019 22:27:43
RMAN-20242: specification does not match any archived log in the repository

RMAN> list backup of archivelog sequence between 758882 and 758883;

specification does not match any backup in the repository

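Sorry, but it was all in vain. Our database is too large and we have no dedicated backup software, so it has been running with no backups at all. It is the kind of setup where, if it goes down, everything goes down together. Draw your own conclusions.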

Some searching turned up a solution:
Oracle GoldenGate is reliant on the Oracle Redo Logs and Archive Logs when capturing transactions. Since I do a lot of testing on VMs (limited space) and in the cloud (limited space … don’t want to burn too much $), I often delete my archive logs. Normally this is not a problem; however, every once in a while I delete more archivelogs than I should. This throws the IE into a state where it will not start because of OGG-00868/ORA-01291 – Missing Log Files.

On some level this is to be expected, but when you are using IE you have to remember that the extract is registered with the database. Since I’m using Integrated Extract, we have to reset how the extract is registered with the database. The below steps will show you how this should be done:

Note: Registering/Unregistering process have to be done at the container database (CDB) level.

adminclient> dblogin useridalias domain
adminclient> stop extract
adminclient> unregister extract database
adminclient> register extract database container
adminclient> start extract
adminclient> info extract

Following a documented solution always looks so effortless, so let's give it a go:
GGSCI (dwdb1) 3> dblogin userid c##ggadmin,PASSWORD ggadmin
Successfully logged into database CDB$ROOT.

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 5> unregister extract EXT_KAF1 database

2019-08-13 21:54:17 INFO OGG-01750 Successfully unregistered EXTRACT EXT_KAF1 from database.

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 6> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:02 00:00:09
EXTRACT STOPPED EXT_KAF1 00:00:04 00:45:17
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:01
EXTRACT RUNNING PM_ZW1 00:00:04 00:00:06
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:00
REPLICAT RUNNING REP_HX4 00:00:04 00:00:01

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 7> register extract EXT_KAF1 database container(dcdb)

2019-08-13 21:54:39 ERROR OGG-01891 EXTRACT EXT_KAF1 must first be deleted before it can be registered.

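Stuck. The process has to be deleted before it can be registered again, which is not at all how the script was supposed to play out. What a trap! Thinking it over and looking at other things: the last checkpoint shown for PMP_KAF1 is dated April 4. So this process may well have had a problem long ago, and our colleague just happened to be the unlucky one who triggered it today. It is probably an unknown 12c bug.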

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 21> info PMP_KAF1

EXTRACT PMP_KAF1 Last Started 2019-08-13 21:09 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:00 ago)
Process ID 128917
Log Read Checkpoint File /home/oracle/ogg/ggs12/dirdat/k1000000012
2019-04-04 06:45:21.930903 RBA 181893354
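
A read checkpoint that is months old is easy to miss when `info all` reports zero lag. Below is a minimal sketch of a check that pulls the timestamp and RBA out of pasted `info` output and computes its age. The function names and the regular expression are our own, written against the output format shown above. An old timestamp only says that the newest trail record the pump has read is that old, so it is a hint to investigate, not proof of a fault.

```python
import re
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Matches a checkpoint line such as "2019-04-04 06:45:21.930903 RBA 181893354".
_CHECKPOINT = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?:\.\d{1,6})?)\s+RBA\s+(\d+)"
)


def parse_read_checkpoint(info_output: str) -> Optional[Tuple[datetime, int]]:
    """Return (timestamp, rba) from pasted `info <group>` text, or None."""
    match = _CHECKPOINT.search(info_output)
    if match is None:
        return None
    stamp, rba = match.group(1), int(match.group(2))
    fmt = "%Y-%m-%d %H:%M:%S.%f" if "." in stamp else "%Y-%m-%d %H:%M:%S"
    return datetime.strptime(stamp, fmt), rba


def checkpoint_age(info_output: str, now: datetime) -> Optional[timedelta]:
    """How far behind `now` the read checkpoint timestamp is."""
    parsed = parse_read_checkpoint(info_output)
    return None if parsed is None else now - parsed[0]


SAMPLE = """EXTRACT PMP_KAF1 Last Started 2019-08-13 21:09 Status RUNNING
Log Read Checkpoint File /home/oracle/ogg/ggs12/dirdat/k1000000012
2019-04-04 06:45:21.930903 RBA 181893354"""
```

Calling `checkpoint_age(SAMPLE, datetime(2019, 8, 13, 21, 54, 39))` gives a gap of a little over 131 days, which a monitoring job could compare against whatever threshold suits the table's expected change rate.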

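We talked to the developers: the tables this process covers are important and feed reconciliation. So we bit the bullet and deleted the process: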

GGSCI (dwdb1) 11> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:04 00:00:04
EXTRACT STOPPED EXT_KAF1 00:00:04 01:46:58
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:06
EXTRACT RUNNING PM_ZW1 00:00:00 00:00:08
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:06
REPLICAT RUNNING REP_HX4 00:00:05 00:00:04

GGSCI (dwdb1) 12> dblogin userid c##ggadmin,PASSWORD ggadmin
Successfully logged into database CDB$ROOT.

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 13> delete EXT_KAF1
Deleted EXTRACT EXT_KAF1.

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 14> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:02 00:00:07
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:05
EXTRACT RUNNING PM_ZW1 00:00:00 00:00:05
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:04
REPLICAT RUNNING REP_HX4 00:00:03 00:00:01

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 15> register extract EXT_KAF1 database container(dcdb)

2019-08-13 23:06:54 INFO OGG-02003 Extract EXT_KAF1 successfully registered with database at SCN 227587403736.

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 16> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:03 00:00:07
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:06
EXTRACT RUNNING PM_ZW1 00:00:00 00:00:05
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:05
REPLICAT RUNNING REP_HX4 00:00:00 00:00:09

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 17> start extract EXT_KAF1
ERROR: EXTRACT EXT_KAF1 does not exist.

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 18> info PMP_KAF1

EXTRACT PMP_KAF1 Last Started 2019-08-13 21:09 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:04 ago)
Process ID 128917
Log Read Checkpoint File /home/oracle/ogg/ggs12/dirdat/k1000000012
2019-04-04 06:45:21.930903 RBA 181893354

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 19> add exttrail ./dirdat/k1, extract EXT_KAF1
EXTRACT group does not exist.

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 20> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:02 00:00:09
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:08
EXTRACT RUNNING PM_ZW1 00:00:00 00:00:07
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:07
REPLICAT RUNNING REP_HX4 00:00:00 00:00:03

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 21> add extract EXT_KAF1, integrated tranlog, begin now
EXTRACT (Integrated) added.

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 22> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:05 00:00:00
EXTRACT STOPPED EXT_KAF1 00:00:00 00:00:08
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:09
EXTRACT RUNNING PM_ZW1 00:00:00 00:00:09
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:08
REPLICAT RUNNING REP_HX4 00:00:03 00:00:06

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 23> add exttrail ./dirdat/k1, extract EXT_KAF1
EXTTRAIL added.

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 24> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:05 00:00:05
EXTRACT STOPPED EXT_KAF1 00:00:00 00:00:13
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:04
EXTRACT RUNNING PM_ZW1 00:00:00 00:00:04
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:03
REPLICAT RUNNING REP_HX4 00:00:00 00:00:00

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 25> start EXTRACT EXT_KAF1

Sending START request to MANAGER ...
EXTRACT EXT_KAF1 starting

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 26> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:03 00:00:04
EXTRACT RUNNING EXT_KAF1 00:00:00 00:00:22
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:04
EXTRACT RUNNING PM_ZW1 00:00:00 00:00:03
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:03
REPLICAT RUNNING REP_HX4 00:00:00 00:00:09

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 27> info EXT_KAF1

EXTRACT EXT_KAF1 Initialized 2019-08-13 23:10 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:26 ago)
Process ID 48863
Log Read Checkpoint Oracle Integrated Redo Logs
2019-08-13 23:10:44
SCN 0.0 (0)

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 28> info EXT_KAF1

EXTRACT EXT_KAF1 Initialized 2019-08-13 23:10 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:29 ago)
Process ID 48863
Log Read Checkpoint Oracle Integrated Redo Logs
2019-08-13 23:10:44
SCN 0.0 (0)

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 29> info EXT_KAF1

EXTRACT EXT_KAF1 Initialized 2019-08-13 23:10 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:32 ago)
Process ID 48863
Log Read Checkpoint Oracle Integrated Redo Logs
2019-08-13 23:10:44
SCN 0.0 (0)

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 30> info extract EXT_KAF1

EXTRACT EXT_KAF1 Last Started 2019-08-13 23:11 Status RUNNING
Checkpoint Lag 00:00:48 (updated 00:00:05 ago)
Process ID 48863
Log Read Checkpoint Oracle Integrated Redo Logs
2019-08-13 23:10:44
SCN 0.0 (0)

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 31> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:04 00:00:07
EXTRACT RUNNING EXT_KAF1 00:00:48 00:00:08
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:07
EXTRACT RUNNING PM_ZW1 00:00:00 00:00:06
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:06
REPLICAT RUNNING REP_HX4 00:00:00 00:00:06

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 32> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:02 00:00:08
EXTRACT RUNNING EXT_KAF1 00:00:04 00:00:09
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:09
EXTRACT RUNNING PM_ZW1 00:00:00 00:00:08
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:08
REPLICAT RUNNING REP_HX4 00:00:00 00:00:04

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 33> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:03 00:00:00
EXTRACT RUNNING EXT_KAF1 00:00:05 00:00:01
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:00
EXTRACT RUNNING PM_ZW1 00:00:00 00:00:09
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:09
REPLICAT RUNNING REP_HX4 00:00:00 00:00:06

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 34> view params EXT_KAF1

extract EXT_KAF1
userid c##ggadmin,PASSWORD ggadmin
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
exttrail ./dirdat/k1,FORMAT RELEASE 12.3
SOURCECATALOG dcdb
--traceId=defgen_kaf1
table dcm_owner.pts_inmno_amt;

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 35> info EXT_KAF1

EXTRACT EXT_KAF1 Last Started 2019-08-13 23:11 Status RUNNING
Checkpoint Lag 00:00:05 (updated 00:00:05 ago)
Process ID 48863
Log Read Checkpoint Oracle Integrated Redo Logs
2019-08-13 23:12:27
SCN 52.4249372382 (227587671774)

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 36> info EXT_KAF1

EXTRACT EXT_KAF1 Last Started 2019-08-13 23:11 Status RUNNING
Checkpoint Lag 00:00:06 (updated 00:00:00 ago)
Process ID 48863
Log Read Checkpoint Oracle Integrated Redo Logs
2019-08-13 23:12:36
SCN 52.4249376790 (227587676182)

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 37> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EK_ZW1 00:00:03 00:00:06
EXTRACT RUNNING EXT_KAF1 00:00:07 00:00:07
EXTRACT RUNNING PMP_KAF1 00:00:00 00:00:07
EXTRACT RUNNING PM_ZW1 00:00:05 00:00:06
REPLICAT RUNNING REP_EWM1 00:00:00 00:00:06
REPLICAT RUNNING REP_HX4 00:00:00 00:00:02

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 38> info EXT_KAF1

EXTRACT EXT_KAF1 Last Started 2019-08-13 23:11 Status RUNNING
Checkpoint Lag 00:00:03 (updated 00:00:01 ago)
Process ID 48863
Log Read Checkpoint Oracle Integrated Redo Logs
2019-08-13 23:12:59
SCN 52.4249388952 (227587688344)

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 39> info PMP_KAF1

EXTRACT PMP_KAF1 Last Started 2019-08-13 21:09 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:01 ago)
Process ID 128917
Log Read Checkpoint File /home/oracle/ogg/ggs12/dirdat/k1000000013
2019-08-13 23:11:22.220552 RBA 1803

GGSCI (dwdb1 as c##ggadmin@dwdb1/CDB$ROOT) 45> info EXT_KAF1

EXTRACT EXT_KAF1 Last Started 2019-08-13 23:11 Status RUNNING
Checkpoint Lag 00:00:05 (updated 00:00:07 ago)
Process ID 48863
Log Read Checkpoint Oracle Integrated Redo Logs
2019-08-13 23:39:01
SCN 52.4250213999 (227588513391)

As of publication, the process's checkpoints (SCN and RBA) are advancing normally, and the other logs all look fine.
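
To recap the order that finally worked for us, here is a small sketch that prints the GGSCI commands in sequence. It only assembles strings and runs nothing. The commands are the ones from the session above plus `stop extract` from the quoted steps; the function name and the `ggadmin_alias` credential alias are hypothetical. Treat it as a record of what we did on this system, not as a general procedure.

```python
from typing import List


def rebuild_integrated_extract_commands(
    group: str, container: str, trail: str, login: str
) -> List[str]:
    """GGSCI commands, in the order that worked in this incident."""
    return [
        login,
        f"stop extract {group}",
        f"unregister extract {group} database",
        # Registering again failed with OGG-01891 until the group was deleted.
        f"delete {group}",
        f"register extract {group} database container({container})",
        f"add extract {group}, integrated tranlog, begin now",
        f"add exttrail {trail}, extract {group}",
        f"start extract {group}",
    ]


if __name__ == "__main__":
    for line in rebuild_integrated_extract_commands(
        group="EXT_KAF1",
        container="dcdb",
        trail="./dirdat/k1",
        login="dblogin useridalias ggadmin_alias",  # hypothetical alias
    ):
        print(line)
```

Because the extract is re-added with `begin now`, changes made between the old checkpoint and the re-add are presumably not in the trail, so the affected table would need to be compared and reconciled separately.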