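OGG is a transaction-level real-time replication tool: it replicates only committed transactions. Until it sees a transaction's commit or rollback,
it keeps that transaction's operations in a managed virtual memory pool called the cache. Memory is finite, so when transaction data exceeds a certain
threshold, or the free memory cannot satisfy an allocation request, the OGG process swaps the least recently used old buffers out to dirtmp on disk.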
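Note: OGG 11 and later can cache long transactions automatically. By default, uncommitted transactions are cached to local disk every 4 hours, so at most 8 hours of archive logs are needed.
This caching only works while extract is running. Once extract is stopped nothing more is cached, so the minimum archive log window becomes 8 hours plus the downtime.
To be safe, it is generally recommended to keep 12 hours plus the downtime of archive logs available at restart.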
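The retention arithmetic above can be written down as a small sketch. The function names are illustrative, and the "two cache intervals" rule is an assumption inferred from the 4-hour interval and 8-hour window quoted in the note, not an official formula.

```python
# Sketch of the archive log retention arithmetic from the note above.
# Assumption: the required window is two cache intervals (4h -> 8h) plus downtime.
CACHE_INTERVAL_H = 4  # default caching interval quoted in the text


def min_archive_hours(downtime_h, interval_h=CACHE_INTERVAL_H):
    """Minimum archive log window in hours: two cache intervals plus extract downtime."""
    return 2 * interval_h + downtime_h


def recommended_archive_hours(downtime_h):
    """Safety margin suggested in the text: 12 hours plus extract downtime."""
    return 12 + downtime_h


if __name__ == "__main__":
    # Example: extract was stopped for 3 hours.
    print(min_archive_hours(3))
    print(recommended_archive_hours(3))
```

With 3 hours of downtime this gives a minimum of 11 hours and a recommended 15 hours of archive logs.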
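When OGG is handling a long transaction and the extract process terminates for some reason, all of its temporary data on the file system is deleted.
After restart, extract uses the recovery checkpoint to locate the archive log that contains the start of the oldest open long transaction and recovers from there. If that archive log has been deleted,
the extract process cannot restart, and the only fixes are to restore the archive log or to re-initialize. Even when the archive log exists, the recovery after restart significantly degrades OGG performance.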
Handling long transactions
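1. Monitor the long transaction warnings in ggserr.log. The warning frequency can be adjusted with the extract parameter warnlongtrans,
by adding it to the extract parameter file:
edit params exta
warnlongtrans 5h, checkinterval 1h
This means: check for long transactions every 1h, and record any transaction that has been open for more than 5h in ggserr.log in the OGG home directory.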
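As a rough sketch of the monitoring step, the log can be filtered for long transaction warnings with a few lines of Python. The search phrase is an assumption about how the warning is worded in ggserr.log; check it against a real warning in your own log and adjust the pattern. The function names and default path are illustrative.

```python
import re

# Assumption: long transaction warnings contain this phrase; adjust to your log.
PATTERN = re.compile(r"long running transaction", re.IGNORECASE)


def find_long_trans_warnings(lines):
    """Return the log lines that look like long transaction warnings."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


def scan_file(path="ggserr.log"):
    """Scan a ggserr.log file (path is a placeholder) for matching lines."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return find_long_trans_warnings(fh)
```

Calling scan_file() from the OGG home directory, for example from a scheduled job, returns the matching lines so they can be forwarded to whatever alerting is already in place.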
2. Monitor long transactions in the database
--- Check whether there are large transactions:
select a.sid,
a.serial#,
a.user#,
a.username,
b.addr,
b.USED_UBLK,
b.USED_UREC,
b.START_TIME,
b.xidusn,
b.XIDSLOT,
b.xidsqn
from v$transaction b, v$session a
where /*b.addr in (select a.taddr from v$session a where a.sid = '') and*/ b.addr=a.taddr order by start_time
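3. ggsci provides the following commands for handling uncommitted transactions
send extract, showtrans: list the uncommitted transactions currently being processed
send extract <process name>, showtrans [thread n] [count n]
Here <process name> is the process to inspect, e.g. extsz/extxm/extjx; thread n is optional and limits the output to uncommitted transactions on one node; count n is also optional and limits the output to n records.
send extract, skiptrans: skip a transaction (not recommended)
SEND EXTRACT <process name>, SKIPTRANS <5.17.27634> THREAD <2> // skip the transaction
send extract, forcetrans: force a transaction to be treated as committed (not recommended)
GGSCI> SEND EXTRACT <process name>, FORCETRANS <5.17.27634> THREAD <1> // force the transaction to be treated as committed
It is recommended to perform all commits and rollbacks in the database.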
Checking for long transactions
GGSCI (bosdb1) 428> info exti ,showch
EXTRACT EXTI Last Started 2015-03-16 08:49 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:00 ago)
Log Read Checkpoint Oracle Redo Logs
2015-03-30 17:51:32 Thread 1, Seqno 227332, RBA 832172448
Log Read Checkpoint Oracle Redo Logs
2015-03-30 17:51:32 Thread 2, Seqno 237971, RBA 1087576160
Current Checkpoint Detail:
Read Checkpoint #1
Oracle RAC Redo Log
Startup Checkpoint (starting position in the data source):
Thread #: 1
Sequence #: 225535
RBA: 339018768
Timestamp: 2015-03-15 17:44:29.000000
SCN: 36.583726867 (155202549523)
Redo File: Not Avaliable
Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
Thread #: 1
Sequence #: 227332
RBA: 832024080
Timestamp: 2015-03-30 17:51:32.000000
SCN: 37.441459345 (159355249297)
Redo File: Not Avaliable
Current Checkpoint (position of last record read in the data source):
Thread #: 1
Sequence #: 227332
RBA: 832172448
Timestamp: 2015-03-30 17:51:32.000000
SCN: 37.441460372 (159355250324)
Redo File: +ASMREDOLOG/zjsbos/onlinelog/group_9
Read Checkpoint #2
Oracle RAC Redo Log
Startup Checkpoint (starting position in the data source):
Thread #: 2
Sequence #: 235385
RBA: 340122128
Timestamp: 2015-03-15 16:57:08.000000
SCN: 36.575439230 (155194261886)
Redo File: +ASMREDOLOG/zjsbos/onlinelog/group_17
Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
Thread #: 2
Sequence #: 237944
RBA: 1542672
Timestamp: 2015-03-30 14:12:05.000000
SCN: 37.390664240 (159304454192)
Redo File: Not Avaliable
Current Checkpoint (position of last record read in the data source):
Thread #: 2
Sequence #: 237971
RBA: 1087576160
Timestamp: 2015-03-30 17:51:32.000000
SCN: 37.441460346 (159355250298)
Redo File: +ASMREDOLOG/zjsbos/onlinelog/group_22
Write Checkpoint #1
GGS Log Trail
Current Checkpoint (current write position):
Sequence #: 91029
RBA: 9222693
Timestamp: 2015-03-30 17:51:32.728005
Extract Trail: /ggs/dirdat/ai
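If the Current Checkpoint is well ahead of the Recovery Checkpoint, a long transaction exists. In the output above, thread 2 has a Recovery Checkpoint at 14:12:05 and a Current Checkpoint at 17:51:32.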
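To make the comparison concrete, here is a minimal sketch that computes the time between the Recovery Checkpoint and the Current Checkpoint for each thread. The timestamps are copied from the showch output above; the function name and the dictionary layout are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # timestamp format used in the showch output


def checkpoint_gap(recovery_ts, current_ts):
    """Time between the recovery checkpoint and the current checkpoint."""
    return datetime.strptime(current_ts, FMT) - datetime.strptime(recovery_ts, FMT)


# thread -> (Recovery Checkpoint timestamp, Current Checkpoint timestamp)
threads = {
    1: ("2015-03-30 17:51:32.000000", "2015-03-30 17:51:32.000000"),
    2: ("2015-03-30 14:12:05.000000", "2015-03-30 17:51:32.000000"),
}

for thread, (recovery, current) in threads.items():
    print(thread, checkpoint_gap(recovery, current))
# prints:
# 1 0:00:00
# 2 3:39:27
```

Thread 2 is holding a transaction that has been open for more than three and a half hours, which matches the oldest XID reported by showtrans.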
GGSCI (bosdb1) 426> send extract exta ,showtrans thread 2 count 10
2015-03-30 17:45:01 INFO OGG-00987 GGSCI command (oracle): send extract exta showtrans thread 2 count 10.
Sending showtrans request to EXTRACT EXTA ...
Oldest redo log files necessary to restart Extract are:
Redo Thread 1, Redo Log Sequence Number 227332, SCN 37.439599992 (159353389944), RBA 182787088
Redo Thread 2, Redo Log Sequence Number 237944, SCN 37.390664240 (159304454192), RBA 1542672
------------------------------------------------------------
XID: 66.18.50273680 --- this XID corresponds to the three values XIDUSN, XIDSLOT and XIDSQN in v$transaction
Items: 0
Extract: EXTA
Redo Thread: 2
Start Time: 2015-03-30:14:12:05
SCN: 37.390664240 (159304454192)
Redo Seq: 237944
Redo RBA: 1542672
Status: Running
------------------------------------------------------------
XID: 166.3.18169807
Items: 0
Extract: EXTA
Redo Thread: 2
Start Time: 2015-03-30:17:40:13
SCN: 37.438521190 (159352311142)
Redo Seq: 237969
Redo RBA: 1170547216
Status: Running
------------------------------------------------------------
XID: 70.24.37070746
Items: 0
Extract: EXTA
Redo Thread: 2
Start Time: 2015-03-30:17:44:57
SCN: 37.439749060 (159353539012)
Redo Seq: 237970
Redo RBA: 825525776
Status: Running
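When there are many open transactions it can help to turn the showtrans output into records. The parser below is only a sketch: it assumes every field sits on a single line in the form "Name: value", and the field list is taken from the output shown above rather than from any official format description.

```python
import re

# Field names as they appear in the showtrans output above.
FIELD = re.compile(
    r"^\s*(XID|Items|Extract|Redo Thread|Start Time|SCN|Redo Seq|Redo RBA|Status):\s*(.*)$"
)


def parse_showtrans(text):
    """Split showtrans output into one dict per transaction."""
    records, current = [], None
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue  # separators, headers and the "Oldest redo log" summary
        key, value = match.group(1), match.group(2).strip()
        if key == "XID":  # every transaction block starts with its XID
            current = {}
            records.append(current)
        if current is not None:
            current[key] = value
    return records


sample = """\
XID: 166.3.18169807
Items: 0
Redo Thread: 2
Start Time: 2015-03-30:17:40:13
------------------------------------------------------------
XID: 70.24.37070746
Items: 0
Redo Thread: 2
Start Time: 2015-03-30:17:44:57
"""

for record in parse_showtrans(sample):
    print(record["XID"], record["Start Time"])
```

Sorting the records by Start Time puts the oldest transaction first, which is the one that pins the Recovery Checkpoint.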
SQL> select a.sid,
2 a.serial#,
3 a.user#,
4 a.username,
5 b.addr,
6 b.USED_UBLK,
7 b.USED_UREC,
8 b.START_TIME,
9 b.xidusn,
10 b.XIDSLOT,
11 b.xidsqn
12 from v$transaction b, v$session a
13 where /*b.addr in (select a.taddr from v$session a where a.sid = '') and*/ b.addr=a.taddr order by start_time
14 ;
SID SERIAL# USER# USERNAME ADDR USED_UBLK USED_UREC START_TIME XIDUSN XIDSLOT XIDSQN
---------- ---------- ---------- ------------------------------ ---------------- ---------- ---------- -------------------- ---------- ---------- ----------
2515 60746 87 B 07000026FE5C7B78 240 11521 03/30/15 14:12:03 66 18 50273680
339 20908 87 B 070000276EF65600 107 5942 03/30/15 17:49:34 142 4 26914958
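Matching the XID reported by showtrans against the v$transaction rows can be sketched as follows. The rows are copied from the query output above (only the columns needed for the match); the helper names are illustrative.

```python
def split_xid(xid):
    """Split an OGG XID such as '66.18.50273680' into (XIDUSN, XIDSLOT, XIDSQN)."""
    usn, slot, sqn = (int(part) for part in xid.split("."))
    return usn, slot, sqn


# (SID, SERIAL#, XIDUSN, XIDSLOT, XIDSQN) from the query output above
rows = [
    (2515, 60746, 66, 18, 50273680),
    (339, 20908, 142, 4, 26914958),
]


def find_session(xid, rows):
    """Return (SID, SERIAL#) of the session that owns the XID, or None."""
    target = split_xid(xid)
    for sid, serial, usn, slot, sqn in rows:
        if (usn, slot, sqn) == target:
            return sid, serial
    return None


print(find_session("66.18.50273680", rows))  # (2515, 60746)
```

Here the oldest transaction from showtrans, 66.18.50273680, belongs to session 2515 with serial# 60746.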
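Use the sid to find the corresponding sql_id. That tells you which long transaction this is and whether it needs to be committed or rolled back.
Alternatively, use this SQL to view the transactions: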
select t.start_time, t.xidusn||'.'||t.xidslot||'.'||t.xidsqn xid,
s.sid, s.serial#, s.username, s.status, s.schemaname,
nvl(s.sql_id, s.prev_sql_id) sqlid, nvl(s.sql_child_number, s.prev_child_number) child
from v$transaction t, v$session s
where s.saddr = t.ses_addr
order by t.start_time