Integrating Oracle GoldenGate with a Kafka Cluster

This article describes how to configure Oracle GoldenGate to replicate data into Kafka: installing and configuring OGG on the source side, installing and configuring OGG for Big Data on the replication side, and integrating the result with a Kafka cluster.





Software Preparation


1. Oracle Database 11.2.0.2

2. Oracle GoldenGate V12.2.0.1.1 for Oracle on Linux x86-64

Deployed on the Oracle server.

3. Oracle GoldenGate for Big Data V12.2.0.1.1 on Linux x86-64

Deployed on the replication server.

4. ZooKeeper 3.4.6

Deployed on the Kafka servers, used for failover.

5. Kafka 0.9.0.1 (kafka_2.11-0.9.0.1, the Scala 2.11 build)


Kafka Cluster Installation and Configuration


The three nodes are:

192.168.2.105 kafka01

192.168.2.107 kafka02

192.168.2.108 kafka03


Passwordless SSH Between Nodes


1. Create the kafka user on all three machines:

useradd kafka


2. Set up mutual SSH trust among the three machines

On kafka01, switch to the kafka user.

Run ssh-keygen -t rsa and press Enter three times; a private key and a public key are generated in the .ssh directory.

Generate an authorized_keys file from the public key:

cat id_rsa.pub > authorized_keys

Run ssh-keygen -t rsa on kafka02 and kafka03 as well, so that all three machines have a key pair.

Copy the authorized_keys generated on kafka01 to kafka02:

scp authorized_keys kafka02:~/.ssh

Append kafka02's public key to authorized_keys:

cat id_rsa.pub >> authorized_keys

From kafka02, scp the authorized_keys file to kafka03.

In the .ssh directory on kafka03, likewise append that node's public key to authorized_keys:

cat id_rsa.pub >> authorized_keys

Copy this file back to kafka02 and kafka01, so that all three machines hold identical authorized_keys files.

At this point, passwordless login is established among the three nodes. A consolidated sketch follows.
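The same steps, collected into a minimal sketch to run as the kafka user on each node (assumes default key paths; the chmod lines are an addition of mine, since sshd refuses keys with loose permissions):

ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa              # non-interactive key generation
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys       # append this node's public key
chmod 700 ~/.ssh; chmod 600 ~/.ssh/authorized_keys    # permissions sshd will accept
scp ~/.ssh/authorized_keys kafka02:~/.ssh/            # then repeat append-and-copy on kafka02, kafka03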


ZooKeeper Installation and Configuration


In this example, ZooKeeper is installed across the three nodes:

192.168.2.105 kafka01

192.168.2.107 kafka02

192.168.2.108 kafka03

  • Download ZooKeeper 3.4.6

http://zookeeper.apache.org/releases.html

  • On kafka01:

mkdir /home/kafka/zookeeper

Extract zookeeper-3.4.6.tar.gz into /home/kafka/zookeeper

  • Configure zoo.cfg

cd /home/kafka/zookeeper

cp zoo_sample.cfg zoo.cfg

vi zoo.cfg


Set the following parameters (a full example follows the list):

dataDir=/home/kafka/zookeeper/data

server.1=kafka01:2888:3888

server.2=kafka02:2888:3888

server.3=kafka03:2888:3888
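For reference, the resulting zoo.cfg might look like this in full (the values not mentioned above are simply the zoo_sample.cfg defaults):

tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/home/kafka/zookeeper/data
server.1=kafka01:2888:3888
server.2=kafka02:2888:3888
server.3=kafka03:2888:3888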

  • Create the node identifier

cd /home/kafka/zookeeper

mkdir data

cd data

vi myid

Enter 1 (this must match the server.1 entry in zoo.cfg).


Copy the installation directory to kafka02 and kafka03, then change the myid contents to 2 and 3 respectively (a remote one-liner for this follows the scp commands below):


scp -r zookeeper-3.4.6 kafka02:~


scp -r zookeeper-3.4.6 kafka03:~
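With passwordless SSH in place, the myid edits can be done remotely; a small sketch (the path assumes the dataDir layout configured above):

ssh kafka02 'echo 2 > ~/zookeeper/data/myid'    # id must match server.2 in zoo.cfg
ssh kafka03 'echo 3 > ~/zookeeper/data/myid'    # id must match server.3 in zoo.cfg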


  • Start ZooKeeper


Start ZooKeeper on each of the three nodes: zkServer.sh start

Check the node state: zkServer.sh status (one node should report Mode: leader, the other two Mode: follower)
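As an extra liveness check, ZooKeeper 3.4.x answers four-letter commands on the client port (this assumes nc/netcat is installed):

echo ruok | nc kafka01 2181    # a healthy server replies "imok"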


Kafka Installation and Configuration


Perform the following on all three nodes.


Edit the Kafka server configuration file


cd /home/kafka/kafka/config

vi server.properties


Set broker.id (1 on kafka01, 2 on kafka02, 3 on kafka03).

Set zookeeper.connect=kafka01:2181,kafka02:2181,kafka03:2181

Note: the following must also be set:

host.name=<the node's IP address>
advertised.host.name=<the node's IP address>
advertised.port=<the Kafka port> (the default 9092 is used here)

Without these parameters, clients may be unable to send messages to the Kafka cluster. A consolidated example follows.
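Putting the pieces together, the relevant part of server.properties on kafka01 might look like this (a sketch; the remaining settings stay at the distribution defaults):

broker.id=1
host.name=192.168.2.105
advertised.host.name=192.168.2.105
advertised.port=9092
zookeeper.connect=kafka01:2181,kafka02:2181,kafka03:2181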


Start Kafka in the background


Start Kafka in the background on each of the three nodes:

cd /home/kafka/kafka/bin

./kafka-server-start.sh /home/kafka/kafka/config/server.properties &
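To confirm each broker came up and registered in ZooKeeper, one option (a hedged check; zookeeper-shell.sh ships in Kafka's bin directory):

jps                                                 # a "Kafka" process should be listed
./zookeeper-shell.sh kafka01:2181 ls /brokers/ids   # should print the registered broker ids, e.g. [1, 2, 3]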


Testing


Create a topic named oggtest with 5 partitions and a replication factor of 3:

./kafka-topics.sh --create --zookeeper kafka01:2181,kafka02:2181,kafka03:2181 --replication-factor 3 --partitions 5 --topic oggtest


Describe the topic just created (this can be run from any node):

./kafka-topics.sh --describe --zookeeper kafka01:2181,kafka02:2181,kafka03:2181 --topic oggtest

To describe all topics:

./kafka-topics.sh --describe --zookeeper kafka01:2181,kafka02:2181,kafka03:2181


Kafka's bundled bin/kafka-console-producer.sh and bin/kafka-console-consumer.sh scripts can be used to verify publishing and consuming messages.


In one terminal, start a producer that writes to the oggtest topic created above:


./kafka-console-producer.sh --broker-list kafka01:9092,kafka02:9092,kafka03:9092 --topic oggtest


In another terminal, start a consumer subscribed to the same oggtest topic:

./kafka-console-consumer.sh --zookeeper kafka01:2181,kafka02:2181,kafka03:2181 --from-beginning --topic oggtest


Oracle Source-Side Deployment and Configuration


Enable archive logging on the Oracle source


1. Check whether archive logging is enabled:


SQL> archive log list;


Database log mode Archive Mode


Automatic archival Enabled


Archive destination /oracle/data/test/arch


Oldest online log sequence 49


Next log sequence to archive 51


Current log sequence 51


If archive logging is not enabled, run the following to turn it on; this requires restarting the database (shut it down and bring it back to the mount state first):


SQL> shutdown immediate;

SQL> startup mount;

SQL> alter database archivelog;


Database altered.


SQL> alter database open;


Enable force logging and minimal supplemental logging


1. Check whether force logging and minimal supplemental logging are enabled:


SQL> SELECT supplemental_log_data_min, force_logging FROM v$database;


SUPPLEME FORCE_LOGGING


________________________________________


YES YES


If supplemental logging is not enabled, run the following to turn it on; no Oracle restart is needed:


SQL> alter database add supplemental log data (primary key) columns;


Database altered.


SQL> alter database add supplemental log data (unique) columns;


Database altered.

Note: without the (primary key) and (unique) clauses, OGG will not ship primary-key or unique-index column information, leaving downstream consumers with nothing to correlate UPDATE records against.

SQL> alter database force logging;


SQL> alter system switch logfile;


Create the OGG account


SQL> create user ogg identified by ogg;


SQL> grant connect, resource to ogg;


SQL> grant select any dictionary to ogg;


SQL> grant select any table to ogg;
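Before moving on, it may be worth confirming the account works; a quick check using the test TNS alias and the ogg/ogg credentials used in the test section at the end of this article:

sqlplus ogg/ogg@test
SQL> select count(*) from dba_tables;    -- succeeds thanks to the SELECT ANY DICTIONARY grant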


Install and configure OGG on the source


  • Download Oracle GoldenGate V12.2.0.1.1 for Oracle on Linux x86-64

http://www.oracle.com/technetwork/middleware/goldengate/downloads/index.html


  • Create the OGG home directory

su - oracle (all of the following steps run as the oracle account)

mkdir -p /opt/ogg

Copy fbo_ggs_Linux_x64_shiphome.zip to /opt/ogg


cd /opt/ogg

unzip fbo_ggs_Linux_x64_shiphome.zip


  • Set the response file parameters


vi /opt/ogg/fbo_ggs_Linux_x64_shiphome/Disk1/response/oggcore.rsp


INSTALL_OPTION=ORA11g


SOFTWARE_LOCATION=/opt/ogg/


START_MANAGER=false


MANAGER_PORT=9001


DATABASE_LOCATION=$ORACLE_HOME


Note: since the Oracle version used here is 11.2.0.2.0, INSTALL_OPTION is set to ORA11g; with 12c it should be set to ORA12c.


  • Run the installer


cd fbo_ggs_Linux_x64_shiphome/Disk1/


./runInstaller -silent -responseFile /opt/ogg/fbo_ggs_Linux_x64_shiphome/Disk1/response/oggcore.rsp


  • Set environment variables


vi ~/.bash_profile


export ORACLE_HOME=<your Oracle home path>

export OGG_HOME=/opt/ogg

export PATH=$PATH:$ORACLE_HOME:$OGG_HOME

export LD_LIBRARY_PATH=$OGG_HOME:$LD_LIBRARY_PATH:$ORACLE_HOME/lib

Then run source ~/.bash_profile so the settings take effect in the current shell.


  • Configure OGG

Create the OGG working subdirectories:


cd $OGG_HOME

./ggsci    (starts the OGG command-line interface, GGSCI)

GGSCI> create subdirs


  • Encrypt the ogg account's login password

    The argument after "encrypt password" is the database password itself. A plaintext password can also be used for testing.


GGSCI> encrypt password ogg BLOWFISH, ENCRYPTKEY DEFAULT


Using Blowfish encryption with DEFAULT key.


Encrypted password: AACAAAAAAAAAAADAHBLDCCIIOIRFNEPB


Algorithm used: BLOWFISH


  • Configure the OGG manager


GGSCI> edit params mgr


The following configures the manager to run the OGG service on port 9001:


PORT 9001


dynamicportlist 9901-9920,9930


autostart er *


autorestart er *,retries 4,waitminutes 4


startupvalidationdelay 5


purgeoldextracts /opt/ogg/dirdat/tt, usecheckpoints, minkeephours 2


Start the OGG manager


GGSCI> start mgr


  • Log in to the database as the OGG user


GGSCI> dblogin userid ogg, password AACAAAAAAAAAAADAHBLDCCIIOIRFNEPB, BLOWFISH, ENCRYPTKEY DEFAULT


  • Create the extract


GGSCI> edit params ext_test


The following defines the ext_test extract:


extract ext_test

userid ogg@test, password AACAAAAAAAAAAADAHBLDCCIIOIRFNEPB, BLOWFISH, ENCRYPTKEY DEFAULT

rmthost 192.168.2.106, mgrport 9007

rmtfile /opt/ogg/dirdat/rt

table ogg.*;    -- any schema works here, since the ogg account has open access to tables under any other account


  • Register the extract with OGG


GGSCI> add extract ext_test, INTEGRATED TRANLOG, BEGIN NOW


Point the extract at its remote trail file, specifying the path and prefix; note the prefix must be exactly two characters, here "et":


GGSCI> add rmttrail /opt/ogg/dirdat/et, extract ext_test, megabytes 100
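A quick sanity check at this point (output details vary by OGG version):

GGSCI> info all

The manager should be RUNNING and ext_test should be listed; the extract stays STOPPED until it is started in the test section at the end.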


Replication Server Installation and Configuration


Install OGG for Big Data


  1. Download Oracle GoldenGate for Big Data V12.2.0.1.1:


122011_ggs_Adapters_Linux_x64.zip


  2. Create the OGG home directory on all replication servers, e.g. /opt/ogg

  3. Edit /etc/hosts and add the Kafka servers:


vi /etc/hosts


192.168.2.105 kafka01


192.168.2.107 kafka02


192.168.2.108 kafka03


  4. Extract the archive into the OGG home directory /opt/ogg

  5. Copy the configuration template files:


export OGG_HOME=/opt/ogg
cp $OGG_HOME/AdapterExamples/big-data/kafka/* $OGG_HOME/dirprm/


Configure OGG for Kafka


  1. Start ggsci and create the working subdirectories:


cd /opt/ogg/


./ggsci


GGSCI> create subdirs


Creating subdirectories under current directory /opt/ogg

Parameter files /opt/ogg/dirprm: created
Report files /opt/ogg/dirrpt: created
Checkpoint files /opt/ogg/dirchk: created
Process status files /opt/ogg/dirpcs: created
SQL script files /opt/ogg/dirsql: created
Database definitions files /opt/ogg/dirdef: created
Extract data files /opt/ogg/dirdat: created
Temporary files /opt/ogg/dirtmp: created
Credential store files /opt/ogg/dircrd: created
Masterkey wallet files /opt/ogg/dirwlt: created
Dump files /opt/ogg/dirdmp: created


  2. Configure the manager:


GGSCI> edit params mgr


The following assigns port 9007 to the mgr service:


port 9007


ACCESSRULE, PROG REPLICAT, IPADDR 192.168.2.*, ALLOW


dynamicportlist 9901-9920,9930


autostart er *


autorestart er *,retries 4,waitminutes 4


startupvalidationdelay 5


purgeoldextracts /opt/ogg/dirdat/*, usecheckpoints, minkeephours 2


LAGREPORTHOURS 1


LAGINFOMINUTES 30


LAGCRITICALMINUTES 45


Exit:


GGSCI> exit


  3. Configure kafka.props:


cd $OGG_HOME


vi ./dirprm/kafka.props


Enter the following:


gg.handlerlist=kafkahandler
gg.handler.kafkahandler.type=kafka
gg.handler.kafkahandler.KafkaProducerConfigFile=custom_kafka_producer.properties
gg.handler.kafkahandler.TopicName=oggtest
#gg.handler.kafkahandler.format=avro_op
gg.handler.kafkahandler.format=delimitedtext
gg.handler.kafkahandler.format.fieldDelimiter=|
gg.handler.kafkahandler.SchemaTopicName=myoggtest
gg.handler.kafkahandler.BlockingSend=false
gg.handler.kafkahandler.includeTokens=false
gg.handler.kafkahandler.mode=op
gg.handler.kafkahandler.maxGroupSize=100,1Mb
gg.handler.kafkahandler.minGroupSize=50,500Kb

goldengate.userexit.timestamp=utc
goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE

gg.log=log4j
gg.log.level=INFO
gg.report.time=30sec

gg.classpath=/opt/ogg/dirprm/:/kafka/kfk/libs/*:/usr/avro/*

javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=/kafka/oggforbigdata/ggjava/ggjava.jar


Note:

gg.handler.kafkahandler.TopicName must name the topic defined on the Kafka side.

Only one gg.handler.kafkahandler.format should be active; here the avro_op line is commented out and delimited text with "|" as the field separator is used, so Kafka ultimately receives messages in this format:

I|ORATEST.OGG_TEST|2016-10-17 12:17:24.996945|2016-10-17T20:41:32.169000|00000000000000001556|11|test11

(Roughly: the operation type, I for insert; the source table; the operation and write timestamps; the trail position; then the column values.)

gg.classpath must point to the required library paths.


  4. Configure custom_kafka_producer.properties:


vi dirprm/custom_kafka_producer.properties


Enter the following; note that bootstrap.servers must point at the Kafka broker port:


bootstrap.servers=kafka01:9092,kafka02:9092,kafka03:9092
acks=1
compression.type=gzip
reconnect.backoff.ms=1000
value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
# 100KB per partition
batch.size=102400
linger.ms=10000


  5. Define the replicat:


cd $OGG_HOME


./ggsci


GGSCI> edit params rep_test


The following receives the test-table data stream sent by the source-side OGG:


REPLICAT rep_test

TARGETDB LIBFILE libggjava.so SET property=dirprm/kafka.props

SOURCEDEFS /opt/ogg/dirdef/test.p

DDL INCLUDE ALL

REPORTCOUNT EVERY 1 MINUTES, RATE

GROUPTRANSOPS 10000

MAP ogg.*, TARGET ogg.*;

(The command that registers this replicat, add replicat rep_test, exttrail /opt/ogg/dirdat/et, is run in step 6 below.)


Note: the table structure must be defined in a definitions file; a single file can hold the definitions of multiple tables. (A sketch of how to generate one with defgen follows the example.)


Below is an example .def file:


Database type: ORACLE


Character set ID: UTF-8


National character set ID: UTF-16


Locale: neutral


Case sensitivity: 14 14 14 14 14 14 14 14 14 14 14 14 14 11 14 14 14


TimeZone: +08:00


*


Definition for table OGG.TEST


Record length: 142


Syskey: 0


Columns: 2


ID 64 50 0 0 0 1 0 50 50 50 0 0 0 0 1 0 1 22 -1 0 0 0


NAME 64 80 56 0 0 1 0 80 80 0 0 0 0 0 1 0 10 1 -1 20 0 0


End of definition
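This definitions file is normally generated on the source side with OGG's defgen utility and then copied to the replication server. A minimal sketch, assuming a parameter file named dirprm/defgen.prm (my naming) whose DEFSFILE matches the SOURCEDEFS path above:

defsfile /opt/ogg/dirdef/test.p
userid ogg@test, password ogg
table ogg.test;

Then, on the source:

cd $OGG_HOME
./defgen paramfile dirprm/defgen.prm
scp dirdef/test.p 192.168.2.106:/opt/ogg/dirdef/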


  6. Register the replicat against the trail file:


GGSCI> add replicat rep_test, exttrail /opt/ogg/dirdat/et


Note: the trail file location and two-character prefix here must match the source-side extract definition.


Testing


1. Start the Kafka console consumer:


kafka-console-consumer.sh --zookeeper kafka01:2181 --topic oggtest --from-beginning


2. Start the OGG extract on the Oracle side:


cd $OGG_HOME

./ggsci

GGSCI> start mgr

GGSCI> start extract ext_test

Check the running state:

GGSCI> info all


3. Start the replicat on the replication server:


cd $OGG_HOME


./ggsci


GGSCI> start mgr


GGSCI> start replicat rep_test


4. Insert a row into the test table on the source:


sqlplus ogg/ogg@test


SQL> insert into ogg.test (id, name) values (1, 'test01');


SQL> commit;


5. Check that the Kafka consumer console received the row.
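If everything is wired up, the consumer started in step 1 should print one delimited line for the insert, along these lines (timestamps and trail position will differ):

I|OGG.TEST|2016-10-17 12:17:24.996945|2016-10-17T20:41:32.169000|00000000000000001556|1|test01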


 





