前言:
本文档主要描述通过Oracle Grid Infrastructure Agents (XAG)基于Oracle RAC实现GoldenGate(OGG)软件高可用的实施操作
环境信息:
源端 |
目标端 |
||
节点一IP 节点二IP |
192.168.1.84 192.168.1.86 |
节点一IP 节点二IP |
192.168.1.200 192.168.1.210 |
VIP |
192.168.1.88 |
VIP |
192.168.1.204 |
数据库版本 |
11.2.0.4 |
数据库版本 |
12.2.0.1 |
OGG版本 |
19.1.0.0.4 |
OGG版本 |
19.1.0.0.4 |
XAG版本 |
10.2 |
XAG版本 |
7.1.0 |
EXTRACT进程 |
E_BSS_1 |
REPLICAT进程 |
R_ING_1 |
PUMP进程 |
P_BSR_1 |
源端OGG高可用配置:
1 配置vip资源:
--添加vip资源,这里使用集群新建的network 2网络,可以使用默认的network 1,同网段即可
#network is the network number that you want to use.
#ip is the IP address provided by your system administrator for the new Application VIP. This IP address must be in the same subnet as determined above.
#gg_vip_source is the name of the application VIP that you will create.
/u01/app/11.2.0/grid/bin/appvipcfg create -network=2 \
-ip=192.168.1.88 \
-vipname=gg_vip_source \
-user=root
--root授权Oracle用户有启动vip的权限
/u01/app/11.2.0/grid/bin/crsctl setperm resource gg_vip_source -u user:oracle:r-x
--Oracle用户启动资源
/u01/app/11.2.0/grid/bin/crsctl start resource gg_vip_source
--验证资源
[oracle@rac1 ~]$ /u01/app/11.2.0/grid/bin/crsctl status resource gg_vip_source
NAME=gg_vip_source
TYPE=app.appvip_net2.type
TARGET=ONLINE
STATE=ONLINE on rac2
2 配置ACFS资源:
--配置ACFS,在节点一操作即可
--添加vol卷
su -grid
ASMCMD> volcreate -G DATA -s 5G acfsvol
--查看acfs生成的卷组
su - grid
ASMCMD> volinfo -G DATA acfsvol
Diskgroup Name: DATA
Volume Name: ACFSVOL
Volume Device: /dev/asm/acfsvol-119
State: ENABLED
Size (MB): 5120
Resize Unit (MB): 32
Redundancy: UNPROT
Stripe Columns: 4
Stripe Width (K): 128
Usage:
Mountpath:
--挂载/ogg 文件系统
--确认asm下面生成acfs卷组
# ls /dev/asm
--创建挂载目录
su - root
mkdir /ogg
--格式划acfs
su - root
# /sbin/mkfs -t acfs /dev/asm/acfsvol-119
--加入集群管理
su - root
/u01/app/11.2.0/grid/bin/srvctl add filesystem -d /dev/asm/acfsvol-119 -g 'DATA' -v ACFSVOL -m /ogg -u oracle
--挂载
/u01/app/11.2.0/grid/bin/srvctl start filesystem -d /dev/asm/acfsvol-119
--查看是否online以及mount
su - grid
crsctl stat res -t |grep -i acfs
ora.data.acfsvol.acfs
ora.registry.acfs
df -h
--如果没有online,再手动mount
su - root
# mount.acfs -o all
--授予目录权限
su - root
# chown oracle.oinstall /ogg
3 安装ogg软件:
--安装ogg软件,要安装在acfs所在的目录/ogg里面
cd fbo_ggs_Linux_x64_shiphome/
cd Disk1/
./runInstaller
4 ogg开启ddl支持:
sqlplus / as sysdba
@marker_setup.sql
@ddl_setup.sql
@role_setup.sql
GRANT GGS_GGSUSER_ROLE TO ogg_owner;
@ddl_enable.sql
@?/rdbms/admin/dbmspool.sql
@ddl_pin ogg_owner
5 配置MGR进程:
--需要注意配置的参数是autostart,因为高可用是依靠mgr去自动启动进程的
GGSCI (rac1) 3> edit params mgr
port 7809
DYNAMICPORTLIST 7940-8100
AUTOSTART ER *
AUTORESTART ER *,RETRIES 10, WAITMINUTES 1, RESETMINUTES 60
PURGEOLDEXTRACTS /ogg/dirdat/*,USECHECKPOINTS,MINKEEPDAYS 5
LAGREPORTHOURS 1
LAGINFOMINUTES 30
LAGCRITICALMINUTES 45
--启动mgr进程
GGSCI (rac1) 5> start mgr
6 源端添加表级附加日志:
--连接数据库
GGSCI>dblogin userid ogg password ogg
--添加表级附加日志
GGSCI>add trandata test.*
--显示日志添加情况
GGSCI>info trandata test.*
7 配置EXTRACT进程:
--添加extract进程
add extract E_BSS_1 tranlog,begin now,threads 2
add exttrail ./dirdat/es,extract E_BSS_1,megabytes 1000
--需要注意的是要通过tns去连接数据库,而不是ORACLE_SID方式,并且连接的tns需要可以访问到全部节点,最好连scan_ip
--配置ogg连接数据库的tns
ORCL =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orcl)
)
)
--配置extract进程
GGSCI (rac1) 3> edit params e_bss_1
extract E_BSS_1
SETENV (ORACLE_HOME = "/u01/app/oracle/product/11.2.0/db_1")
SETENV (NLS_LANG = "AMERICAN_AMERICA.ZHS16GBK")
userid ogg@orcl, password ogg
exttrail ./dirdat/es
gettruncates
TRANLOGOPTIONS INCLUDEREGIONID,EXCLUDEUSER ogg
TRANLOGOPTIONS BUFSIZE 2048000
TRANLOGOPTIONS DBLOGREADER,DBLOGREADERBUFSIZE 2048000
DISCARDFILE ./dirrpt/e_bss_1.dsc,APPEND,MEGABYTES 1000
DISCARDROLLOVER AT 6:00
REPORTROLLOVER AT 6:00
REPORTCOUNT EVERY 1 HOURS,RATE
DDL INCLUDE MAPPED , OBJTYPE 'TABLE' &
INCLUDE MAPPED OBJTYPE 'INDEX'
DDLOPTIONS ADDTRANDATA RETRYOP RETRYDELAY 10 MAXRETRIES 10
DDLOPTIONS REPORT
FETCHOPTIONS MISSINGROW ABEND
STATOPTIONS REPORTFETCH
WARNLONGTRANS 1H,CHECKINTERVAL 10m
DYNAMICRESOLUTION
TABLE test.test;
8 配置PUMP进程:
--添加投递进程
add extract P_BSR_1,exttrailsource ./dirdat/es
add RMTTRAIL ./dirdat/rs,ext P_BSR_1,megabytes 1000
--需要注意的是这里远程的IP是目标端的VIP,这样才能在failover之后,依然可以正常投递
--配置投递进程参数
GGSCI (rac1) 3> edit params P_BSR_1
extract P_BSR_1
rmthost 192.168.1.204, mgrport 7809
rmttrail ./dirdat/rs
passthru
DISCARDFILE ./dirrpt/p_bsr_1.dsc,APPEND,MEGABYTES 1000
DISCARDROLLOVER AT 6:00
REPORTROLLOVER AT 6:00
REPORTCOUNT EVERY 1 HOURS,RATE
TABLE TEST.TEST;
9 添加源端OGG高可用服务实例:
++++++++++++++++++++++++++++++++++++++++++++++++参数说明
agctl add goldengate --help
Adds Goldengate instance to Oracle Clusterware.
Instance name
--gg_home GoldenGate home directory
--instance_type {source | target | dual}
--server_pool Name of the Server pool where instance can be run
--nodes [,...] List of nodes where instance can be run
--vip_name Virtual IP name
--network Network number
--ip IP address
--user Operating System user name that owns the instance
--group Name of the group to which the Operating System user belongs
--oracle_home ORACLE_HOME location
--databases [,...] List of database instance dependencies
--db_services [,...] List of database service dependencies
--filesystems [,...] List of file system resource dependencies
--attribute =[,...] Attributes that can be applied
--environment_vars [,...] Additional environment variables to set
--monitor_extracts [,...] List of EXTRACT groups to monitor
--monitor_replicats [,...] List of REPLICAT groups to monitor
--critical_extracts [,...] List of critical EXTRACT groups to monitor and failover
--critical_replicats [,...] List of critical REPLICAT groups to monitor and failover
--dataguard_autostart Start GoldenGate on DataGuard role transition to PRIMARY
--jagent_autostart Start JAgent on GoldenGate startup
--agctl添加高可用source_ogg实例,oracle用户执行以及管理
--注意这里源端VIP不是必需的,因为源端不需要用到VIP
--注意这里的database也可以不用设置,这里的设置主要是强关联database,这样当节点的database关闭会触发ogg failover
agctl add goldengate source_ogg --gg_home /ogg \
--instance_type source \
--nodes rac1,rac2 \
--vip_name gg_vip_source \
--filesystems ora.data.acfsvol.acfs\
--databases ora.orcl.db \
--oracle_home /u01/app/oracle/product/11.2.0/db_1 \
--monitor_extracts E_BSS_1,P_BSR_1
--查看实例配置
[oracle@rac1 ogg]$ agctl config goldengate source_ogg
Instance name: source_ogg
Application GoldenGate location is: /ogg
Goldengate MicroServices Architecture environment: no
GoldenGate instance type is: source
EXTRACT groups to monitor: E_BSS_1,P_BSR_1
REPLICAT groups to monitor:
Critical EXTRACT groups:
Critical REPLICAT groups:
Autostart on DataGuard role transition to PRIMARY: no
Autostart JAgent: no
Configured to run on Nodes: rac1 rac2
ORACLE_HOME location is: /u01/app/oracle/product/11.2.0/db_1
Databases needed: ora.orcl.db
File System resources needed: ora.data.acfsvol.acfs
VIP name: gg_vip_source
10 启动源端OGG高可用服务实例:
--开启source_ogg实例,oracle用户执行
[oracle@rac1 ogg]$ agctl start goldengate source_ogg --node rac1
--会把mgr进程启动,mgr会把extract进程,pump进程启动
[oracle@rac1 ogg]$ ./ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Oct 17 2019 23:13:12
Operating system character set identified as UTF-8.
Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved.
GGSCI (rac1) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING E_BSS_1 00:00:00 00:00:03
EXTRACT RUNNING P_BSR_1 00:00:00 00:00:00
--查看集群服务状态,当前服务在节点一
crsctl stat res -t
Cluster Resources
--------------------------------------------------------------------------------
gg_vip_source
1 ONLINE ONLINE rac1
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac1
ora.bkrac1-vip.vip
1 ONLINE ONLINE rac1
ora.bkrac2-vip.vip
1 ONLINE ONLINE rac2
ora.cvu
1 ONLINE ONLINE rac1
ora.oc4j
1 ONLINE ONLINE rac2
ora.orcl.db
1 ONLINE ONLINE rac1 Open
2 ONLINE ONLINE rac2 Open
ora.rac1.vip
1 ONLINE ONLINE rac1
ora.rac2.vip
1 ONLINE ONLINE rac2
ora.scan1.vip
1 ONLINE ONLINE rac1
xag.source_ogg.goldengate
1 ONLINE ONLINE rac1
GGSCI (rac1) 2>
--关闭source_ogg实例,用oracle用户执行
agctl stop goldengate source_ogg
--手动切换source_ogg实例到去其他节点,用oracle用户执行
agctl relocate goldengate source_ogg --node rac2
目标端OGG高可用配置:
1 配置VIP,ACFS,OGG软件安装参照源端配置即可
2 配置MGR进程:
--需要注意配置的参数是autostart,因为高可用是依靠mgr去自动启动进程的
GGSCI (rac1) 3> edit params mgr
port 7809
DYNAMICPORTLIST 7940-8100
AUTOSTART ER *
AUTORESTART ER *,RETRIES 10, WAITMINUTES 1, RESETMINUTES 60
PURGEOLDEXTRACTS /ogg/dirdat/*,USECHECKPOINTS,MINKEEPDAYS 5
LAGREPORTHOURS 1
LAGINFOMINUTES 30
LAGCRITICALMINUTES 45
--启动mgr进程
GGSCI (rac1) 5> start mgr
3 配置replicat应用进程:
--创建检查表
GGSCI> dblogin userid ogg password ogg
GGSCI > add checkpointtable ogg.rep_ogg_ckpt
--添加应用进程
GGSCI> add replicat R_ING_1,exttrail ./dirdat/rs,checkpointtable ogg.rep_ogg_ckpt
--需要注意的是要通过tns去连接数据库,而不是ORACLE_SID方式,并且连接的tns需要可以访问到全部节点,最好连scan_ip
--配置ogg连接数据库的tns
ORCL =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orcl)
)
)
--配置应用进程
GGSCI (rac1) 3> edit params r_ing_1
REPLICAT R_ING_1
SETENV (ORACLE_HOME = "/oracle/app/oracle/product/12.2/db_1")
SETENV (NLS_LANG = "AMERICAN_AMERICA.AL32UTF8")
userid ogg@orcl, password ogg
ASSUMETARGETDEFS
DBOPTIONS DEFERREFCONST
DBOPTIONS SUPPRESSTRIGGERS
DISCARDFILE ./dirrpt/r_ing_1.dsc, APPEND, MEGABYTES 1000
DISCARDROLLOVER AT 6:00
REPERROR (DEFAULT, ABEND)
REPERROR (24344, DISCARD)
gettruncates
DDL INCLUDE MAPPED , OBJTYPE 'TABLE' &
INCLUDE MAPPED OBJTYPE 'INDEX'
DDLOPTIONS REPORT
4 添加目标端OGG高可用服务实例:
--agctl添加高可用target_ogg实例,用oracle用户执行以及管理
--注意这里的database也可以不用设置,这里的设置主要是强关联database,这样当节点的database关闭会触发ogg failover
agctl add goldengate target_ogg --gg_home /ogg \
--instance_type target \
--nodes rac1,rac2 \
--vip_name gg_vip_source \
--filesystems ora.mgmt.acfsvol.acfs \
--databases ora.orcl.db \
--oracle_home /oracle/app/oracle/product/12.2/db_1 \
--monitor_replicats R_ING_1
--查看实例配置
[oracle@rac1 ogg]$ agctl config goldengate target_ogg
GoldenGate location is: /ogg
GoldenGate instance type is: target
Configured to run on Nodes: rac1 rac2
ORACLE_HOME location is: /oracle/app/oracle/product/12.2/db_1
Databases needed: ora.orcl.db
File System resources needed: ora.mgmt.acfsvol.acfs
VIP name: gg_vip_source
EXTRACT groups to monitor: E_BSS_1,P_BSR_1
REPLICAT groups to monitor: R_ING_1
Critical EXTRACT groups:
Critical REPLICAT groups:
Autostart on DataGuard role transition to PRIMARY: no
Autostart JAgent: no
5 启动目标端OGG高可用服务实例:
--开启source_ogg实例,用oracle用户执行
[oracle@rac1 ogg]$ agctl start goldengate target_ogg --node rac1
--会把mgr进程启动,mgr会把extract进程,pump进程启动
[oracle@rac1 ogg]$ ./ggsci
[oracle@rac1 ogg]$ ./ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO
Linux, x64, 64bit (optimized), Oracle 12c on Oct 18 2019 01:38:51
Operating system character set identified as UTF-8.
Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved.
GGSCI (rac1) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING R_ING_1 00:00:00 00:00:00
--查看集群服务状态,当前服务在节点一
crsctl stat res -t
Cluster Resources
--------------------------------------------------------------------------------
gg_vip_source
1 ONLINE ONLINE rac1 STABLE
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac2 STABLE
ora.MGMTLSNR
1 ONLINE ONLINE rac1 169.254.179.17 192.1
68.2.10,STABLE
ora.asm
1 ONLINE ONLINE rac1 Started,STABLE
2 ONLINE ONLINE rac2 Started,STABLE
3 OFFLINE OFFLINE STABLE
ora.cvu
1 ONLINE ONLINE rac2 STABLE
ora.mgmtdb
1 ONLINE ONLINE rac1 Open,STABLE
ora.orcl.db
1 ONLINE ONLINE rac1 Open,HOME=/oracle/ap
p/oracle/product/12.
2/db_1,STABLE
2 ONLINE ONLINE rac2 Open,HOME=/oracle/ap
p/oracle/product/12.
2/db_1,STABLE
ora.qosmserver
1 ONLINE ONLINE rac2 STABLE
ora.rac1.vip
1 ONLINE ONLINE rac1 STABLE
ora.rac2.vip
1 ONLINE ONLINE rac2 STABLE
ora.scan1.vip
1 ONLINE ONLINE rac2 STABLE
xag.target_ogg.goldengate
1 ONLINE ONLINE rac1 STABLE
--------------------------------------------------------------------------------
GGSCI (rac1) 2>
--关闭source_ogg实例,用oracle用户执行
agctl stop goldengate target_ogg
--手动切换source_ogg实例到去其他节点,用oracle用户执行
agctl relocate goldengate target_ogg --node rac2
测试OGG高可用配置:
1 源端OGG节点主机重启:
--在源端开启会话,模拟表操作
declare
v_count number(10):=0;
begin
for i in 1..1000000 loop
insert into test.test select * from dba_objects where rownum<4;
delete from test.test where rownum<2;
update test.test
set owner='aaa'
where rownum<2;
v_count:=v_count+1;
if v_count>=1 THEN
commit;
v_count:=0;
end if;
end loop;
commit;
end;
/
--当前源端ogg所在节点主机rac1
[oracle@rac1 ~]$ agctl status goldengate source_ogg
Goldengate instance 'source_ogg' is running on rac1
--重启主机rac1
--ogg实例服务会自动转移到节点二rac2
[oracle@rac2 ogg]$ agctl status goldengate source_ogg
Goldengate instance 'source_ogg' is running on rac2
[oracle@rac2 ogg]$ ./ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Oct 17 2019 23:13:12
Operating system character set identified as UTF-8.
Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved.
GGSCI (rac2) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING E_BSS_1 00:00:02 00:00:10
EXTRACT RUNNING P_BSR_1 00:00:00 00:00:02
GGSCI (rac2) 2>
--目标端进程无感知
--源端和目标端两边的数据同步一致
2 目标端OGG节点主机重启:
--当前目标端ogg所在节点主机rac1
[oracle@rac1 ogg]$ agctl status goldengate target_ogg
Goldengate instance 'target_ogg' is running on rac1
--重启主机rac1
--ogg实例服务会自动转移到节点二rac2
[oracle@rac2 ~]$ agctl status goldengate target_ogg
Goldengate instance 'target_ogg' is running on rac2
GGSCI (rac2) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING R_ING_1 00:00:00 00:00:08
GGSCI (rac2) 2>
--源端的投递进程会短暂abend,因为vip要从目标节点一漂移到节点二,所以会短暂的连接不上,但由于配置了autorestart参数,所以mgr进程会尝试重新start投递进程
GGSCI (rac2) 11> info all
--出现abend
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING E_BSS_1 00:00:02 00:00:06
EXTRACT ABENDED P_BSR_1 00:00:00 00:01:39
--自己恢复正常
GGSCI (rac2) 12> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING E_BSS_1 00:00:02 00:00:00
EXTRACT RUNNING P_BSR_1 00:00:00 00:00:04
--源端和目标端两边的数据同步一致
source:
SQL> select count(*) from test.test;
COUNT(*)
----------
294962
SQL>
target:
select count(*) from test1.test
COUNT(*)
----------
294962