OGG和数据库版本如下:
软件 版本
GoldenGate Version 12.2.0.1.160823
Oracle database 11.2.0.4.0
数据库同时出现TMCH和LAG,如下:
GGSCI (ORADB) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING 2401 00:00:00 00:00:03
REPLICAT RUNNING 1912P4R1 00:00:12 00:00:09
REPLICAT RUNNING 2401R1 00:01:28 00:17:51
2401R1进程是整合模式。
GGSCI (ORADB) 2> info 2401R1
REPLICAT 2401R1 Last Started 2019-08-27 14:30 Status RUNNING
INTEGRATED
Checkpoint Lag 00:01:28 (updated 00:18:00 ago)
Process ID 31325
Log Read Checkpoint File /u01/app/oracle/products/ogg/dirdat/2401/E0000020883
2019-08-27 15:13:14.726857 RBA 21287703
查询复制进程的等待事件。
SQL> set pagesize 3000
SQL> set line 300
SQL> select s.SID,s.PROCESS,s.PROGRAM,s.ACTION,s.EVENT,s.WAIT_CLASS,s.SECONDS_IN_WAIT,s.STATE from v$session s where s.ACTION like '%OGG$%2401R1%';
SID PROCESS PROGRAM
---------- ------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------------------------------------------------------
ACTION
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
EVENT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
WAIT_CLASS SECONDS_IN_WAIT STATE
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ --------------- ---------------------------------------------------------
258 698 [email protected]
OGG$2401R1 - Apply Server
library cache pin
Concurrency 841 WAITING
452 31325 [email protected] (T
OGG$2401R1 - Propagation Receive
library cache lock
Concurrency 841 WAITING
1091 31349 [email protected]
OGG$2401R1 - Apply Coordinator
rdbms ipc message
Idle 1 WAITING
1155 31351 [email protected]
OGG$2401R1 - Apply Reader
library cache pin
Concurrency 842 WAITING
1217 31353 [email protected]
OGG$2401R1 - Apply Server
Streams apply: waiting for dependency
Concurrency 0 WAITING
1281 31355 [email protected]
OGG$2401R1 - Apply Server
Streams apply: waiting for dependency
Concurrency 0 WAITING
1475 31552 [email protected]
OGG$2401R1 - Apply Server
Streams apply: waiting for dependency
Concurrency 0 WAITING
7 rows selected.
可以看到2401R1进程有4个并发进程,四个实际工作的进程在等待Streams apply: waiting for dependency,library cache lock和library cache pin等待事件。
查询数据库等待事件。
SQL> col event for a35
SQL> select inst_id, event#, event,count(*) from gv$session where wait_class# <> 6 group by inst_id, event#,event order by 1,4 desc;
INST_ID EVENT# EVENT COUNT(*)
---------- ---------- ------------------------------------- ----------
1 342 Streams apply: waiting for dependency 3
1 286 library cache pin 2
1 289 library cache: mutex X 1
1 350 SQL*Net message to client 1
1 287 library cache lock 1
从library cache lock和library cache pin入手分析。
SQL> set linesize 200
SQL> set arraysize 1000
SQL> set sqlblankline on
SQL> set echo on
SQL> set feedback on
SQL> set heading on
SQL> set timing on time on
15:39:44 SQL>
15:39:44 SQL> SELECT distinct decode(kglpnreq,0,'holding_session: '||s.sid,'waiting_session: '||s.sid) sid,
15:39:44 2 s.SERIAL#, kglpnmod "Pin Mode", kglpnreq "Req Pin",a.sql_text,kglnaown "Owner", kglnaobj "Object"
15:39:44 3 FROM x$kglpn p, v$session s,v$sqlarea a,v$session_wait sw,x$kglob x
15:39:44 4 WHERE p.kglpnuse=s.saddr
15:39:44 5 AND kglpnhdl=sw.p1raw
15:39:45 6 and kglhdadr=sw.p1raw
15:39:45 7 and s.event like 'library cache%'
15:39:45 8 and (a.hash_value, a.address) IN (
15:39:45 9 select
15:39:45 10 DECODE (sql_hash_value,
15:39:45 11 0,
15:39:45 12 prev_hash_value,
15:39:45 13 sql_hash_value
15:39:45 14 ),
15:39:45 15 DECODE (sql_hash_value, 0, prev_sql_addr, sql_address)
15:39:45 16 from v$session s2
15:39:45 17 where s2.sid=s.sid
15:39:45 18 )
15:39:45 19 ;
SID SERIAL# Pin Mode
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------- ----------
Req Pin
----------
SQL_TEXT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Owner
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Object
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
holding_session: 1155 22119 2
0
/* SQL Analyze(1155,0) */ insert into "NADMIN"."NEWLOG2" select * from "NADMIN"."NEWLOG2"
NADMIN
NEWLOG2
holding_session: 258 46181 2
0
INSERT /*+ restrict_all_ref_cons */ INTO "NADMIN"."NEWLOG2" ("TABLENAME","RECORDID","DOFLAG","COMMITTIMESTAMP","CSN","BEFOREAFTERINDICATOR") VALUES (:1 ,:2 ,:3 ,:4 ,:5 ,:6 )
NADMIN
NEWLOG2
waiting_session: 1155 22119 0
3
/* SQL Analyze(1155,0) */ insert into "NADMIN"."NEWLOG2" select * from "NADMIN"."NEWLOG2"
NADMIN
JG_TRG_NEWLOG2_SYST
waiting_session: 258 46181 0
2
INSERT /*+ restrict_all_ref_cons */ INTO "NADMIN"."NEWLOG2" ("TABLENAME","RECORDID","DOFLAG","COMMITTIMESTAMP","CSN","BEFOREAFTERINDICATOR") VALUES (:1 ,:2 ,:3 ,:4 ,:5 ,:6 )
NADMIN
JG_TRG_NEWLOG2_SYST
4 rows selected.
查看这几个会话的阻塞情况
15:41:14 SQL> select sid,serial#,blocking_session from v$session where sid in (1155,258);
SID SERIAL# BLOCKING_SESSION
---------- ---------- ----------------
258 46181 1155
1155 22119 1475
2 rows selected.
Elapsed: 00:00:00.02
15:41:37 SQL> select sid,serial#,blocking_session from v$session where sid in 1475;
SID SERIAL# BLOCKING_SESSION
---------- ---------- ----------------
1475 44855
1 row selected.
Elapsed: 00:00:00.09
会话1155阻塞了会话258,会话1475又阻塞了会话1155,所以阻塞的源头是1475会话。该会话的等待事件是Streams apply: waiting for dependency。也就是说开了并发的复制进程,它的子进程之间形成了一条阻塞链。搜索MOS,可以参考相关文档ID(1566466.1,2359146.1)。
2401R1进程无法成功stop,直接kill
kill 2401R1
会话仍然存在,然后杀掉阻塞源头。
Alter system kill session '1475,35578' immediate;
最后重启进程恢复正常。
start 2401R1
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/31480688/viewspace-2655187/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/31480688/viewspace-2655187/