Multi-instance canal deployment: 1 server + 2 instances + 2 clients + 2 MySQL


1 canal application architecture design

[Figure 1: architecture diagram of the multi-instance canal deployment (1 server, 2 instances, 2 clients, 2 MySQL)]
Components:

  • 1. Linux kernel version (CentOS Linux 7), from the command uname -a:
    Linux slave1 3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • 2. MySQL version, from the SQL command select version(); or status:
    Server version: 5.6.43-log MySQL Community Server (GPL)
  • 3. canal version: canal-1.1.3
  • 4. JDK version: 1.8

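How canal works:

  • 1. canal emulates the MySQL slave interaction protocol: it poses as a MySQL slave and sends a dump request to the MySQL master.
  • 2. The MySQL master receives the dump request and starts pushing the binary log to the slave (that is, to canal).
  • 3. canal parses the binary log objects (originally a byte stream).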
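
Step 3 starts from raw bytes. As a rough illustration only (this is not canal's parser), the sketch below decodes the fixed 19-byte header that precedes each event in MySQL's v4 binlog format. The class and field names are invented for this example, and the sample values in main are arbitrary.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BinlogHeaderSketch {

    /** Fields of the 19-byte common header of a v4 binlog event. */
    static final class EventHeader {
        final long timestamp;    // seconds since the epoch
        final int typeCode;      // numeric event type
        final long serverId;     // server_id of the MySQL instance that wrote the event
        final long eventLength;  // header plus body, in bytes
        final long nextPosition; // offset of the next event in the binlog file
        final int flags;

        EventHeader(long timestamp, int typeCode, long serverId,
                    long eventLength, long nextPosition, int flags) {
            this.timestamp = timestamp;
            this.typeCode = typeCode;
            this.serverId = serverId;
            this.eventLength = eventLength;
            this.nextPosition = nextPosition;
            this.flags = flags;
        }
    }

    /** Decodes the header; all multi-byte fields are little-endian and unsigned. */
    static EventHeader parse(byte[] raw) {
        if (raw.length < 19) {
            throw new IllegalArgumentException("need at least 19 bytes, got " + raw.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        long timestamp = buf.getInt() & 0xFFFFFFFFL;
        int typeCode = buf.get() & 0xFF;
        long serverId = buf.getInt() & 0xFFFFFFFFL;
        long eventLength = buf.getInt() & 0xFFFFFFFFL;
        long nextPosition = buf.getInt() & 0xFFFFFFFFL;
        int flags = buf.getShort() & 0xFFFF;
        return new EventHeader(timestamp, typeCode, serverId, eventLength, nextPosition, flags);
    }

    public static void main(String[] args) {
        // Arbitrary sample header, built by hand for the demonstration.
        byte[] raw = ByteBuffer.allocate(19).order(ByteOrder.LITTLE_ENDIAN)
                .putInt(1559913303).put((byte) 30).putInt(1)
                .putInt(44).putInt(6764).putShort((short) 0)
                .array();
        EventHeader h = parse(raw);
        System.out.println("serverId=" + h.serverId + " type=" + h.typeCode
                + " length=" + h.eventLength + " nextPosition=" + h.nextPosition);
    }
}
```

The event body that follows the header depends on the event type, which is where a real parser such as canal's does most of its work.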

For more details, see the article "Understanding canal: this one is all you need".

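2 Implementing the architecture

2.1 MySQL installation and configuration

1. Download and install

Install MySQL on both 192.168.175.21 and 192.168.175.22. For the installation steps, see the article "Linux: Installing MySQL".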

2. Create the canal account

After creating the root account and enabling remote access for it, create the canal account, then grant it remote access and the privileges it needs:

    mysql> CREATE USER 'canal'@'%' IDENTIFIED BY 'canal';
    mysql> GRANT ALL ON canal.* TO 'canal'@'%';
    mysql> GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'canal'@'%';
    mysql> FLUSH PRIVILEGES;

3. Verify the login

    # Remote login
    mysql -h 192.168.175.22 -P 3306 -u canal -pcanal

    # Local login
    mysql -ucanal -pcanal

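4. Edit my.cnf

Edit my.cnf on both servers, 192.168.175.21 and 192.168.175.22. To locate my.cnf, run: whereis my

Add the following to my.cnf on 192.168.175.21 (192.168.175.22 needs the equivalent binlog settings), then restart MySQL so the change takes effect:

    log_bin=mysql-bin    # binlog file name; ideally one that identifies the business
    binlog_format=row    # row mode is required
    server_id=1          # MySQL server id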
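
A quick way to catch a missing setting before starting canal is to scan the file for the three options above. The following is a minimal sketch, not part of canal: the class and method names are invented here, and the parser is deliberately simplified (it only understands key=value lines and ignores options written without a value).

```java
import java.util.HashMap;
import java.util.Map;

public class BinlogConfigCheck {

    /** Collects key=value pairs from my.cnf-style text, skipping sections, blank lines and # comments. */
    static Map<String, String> parse(String cnfText) {
        Map<String, String> settings = new HashMap<>();
        for (String line : cnfText.split("\\R")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // drop the trailing comment
            }
            line = line.trim();
            if (line.isEmpty() || line.startsWith("[")) {
                continue; // blank line or section header such as [mysqld]
            }
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // simplification: value-less options are ignored
            }
            // MySQL accepts '-' and '_' interchangeably in option names.
            String key = line.substring(0, eq).trim().replace('-', '_').toLowerCase();
            settings.put(key, line.substring(eq + 1).trim());
        }
        return settings;
    }

    /** True when the three settings this walkthrough adds are all present, with row format. */
    static boolean readyForCanal(String cnfText) {
        Map<String, String> s = parse(cnfText);
        return s.containsKey("log_bin")
                && "row".equalsIgnoreCase(s.get("binlog_format"))
                && s.containsKey("server_id");
    }

    public static void main(String[] args) {
        String cnf = String.join("\n",
                "[mysqld]",
                "log_bin=mysql-bin #binlog name",
                "binlog_format=row",
                "server_id=1");
        System.out.println("ready for canal: " + readyForCanal(cnf));
    }
}
```

In practice the text would come from the real file, for example via Files.readAllBytes on the path reported by whereis.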

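2.2 canal server configuration and startup

1. Download canal

Download URL: https://github.com/alibaba/canal/releases/download/canal-1.1.3/canal.deployer-1.1.3.tar.gz

2. Upload and extract

Log in to the 192.168.175.20 server, upload the archive with the rz command, and extract it into /usr/local/hadoop/app/canal with the command below (run it from /usr/local/hadoop/app; the canal directory must already exist):

    tar xzvf canal.deployer-1.1.3.tar.gz -C canal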

3. Edit the configuration

The freshly extracted folder /usr/local/hadoop/app/canal/conf/ contains an example folder. Each such folder represents one instance, and each instance acts as one message queue. Rename example to example1 and make a copy named example2. (The name of the monitored database also works well as a folder name.)

Edit /usr/local/hadoop/app/canal/conf/example1/instance.properties:

    canal.instance.master.address=192.168.175.21:3306
    canal.instance.dbUsername=canal
    canal.instance.dbPassword=canal
    canal.instance.connectionCharset = UTF-8
    canal.mq.topic=example1

Edit /usr/local/hadoop/app/canal/conf/example2/instance.properties:

    canal.instance.master.address=192.168.175.22:3306
    canal.instance.dbUsername=canal
    canal.instance.dbPassword=canal
    canal.instance.connectionCharset = UTF-8
    canal.mq.topic=example2

For a description of the configuration parameters, see: https://github.com/alibaba/canal/wiki/AdminGuide
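
Because the two files differ only in the master address and the topic, it is easy to copy one and forget to change a value. The sketch below is a sanity check under stated assumptions, not the way canal itself loads its configuration: it renders the five properties shown above for each instance and reads them back with java.util.Properties. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

public class InstanceConfigSketch {

    /** Renders the five properties used in this walkthrough for one instance. */
    static String render(String masterAddress, String topic) {
        return String.join("\n",
                "canal.instance.master.address=" + masterAddress,
                "canal.instance.dbUsername=canal",
                "canal.instance.dbPassword=canal",
                "canal.instance.connectionCharset = UTF-8",
                "canal.mq.topic=" + topic) + "\n";
    }

    /** Parses properties text; "key = value" and "key=value" are read the same way. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Instance folder name -> MySQL master address, as configured above.
        Map<String, String> instances = new LinkedHashMap<>();
        instances.put("example1", "192.168.175.21:3306");
        instances.put("example2", "192.168.175.22:3306");

        for (Map.Entry<String, String> e : instances.entrySet()) {
            Properties props = load(render(e.getValue(), e.getKey()));
            System.out.println(e.getKey() + " -> "
                    + props.getProperty("canal.instance.master.address")
                    + " (topic " + props.getProperty("canal.mq.topic") + ")");
        }
    }
}
```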

4. Start the canal server

Change to /usr/local/hadoop/app/canal/bin and run:

    ./startup.sh

Check the log /usr/local/hadoop/app/canal/logs/canal/canal.log. Output like the following means the server started successfully:

    2019-06-07 21:15:03.372 [main] INFO com.alibaba.otter.canal.deployer.CanalLauncher - ## load canal configurations
    2019-06-07 21:15:03.427 [main] INFO c.a.o.c.d.monitor.remote.RemoteConfigLoaderFactory - ## load local canal configurations
    2019-06-07 21:15:03.529 [main] INFO com.alibaba.otter.canal.deployer.CanalStater - ## start the canal server.
    2019-06-07 21:15:06.251 [main] INFO com.alibaba.otter.canal.deployer.CanalController - ## start the canal server[192.168.175.22:11111]
    2019-06-07 21:15:22.245 [main] INFO com.alibaba.otter.canal.deployer.CanalStater - ## the canal server is running now ......
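
The startup check can also be scripted. This is a minimal sketch, assuming that the last line of the log excerpt above is a reliable success marker; the class and method names are made up for the example, and the sample lines in main are copied from that excerpt.

```java
import java.util.Arrays;
import java.util.List;

public class StartupLogCheck {

    // Text of the final log line shown in the walkthrough.
    static final String RUNNING_MARKER = "the canal server is running now";

    /** True when any log line contains the "running" marker. */
    static boolean started(List<String> logLines) {
        for (String line : logLines) {
            if (line.contains(RUNNING_MARKER)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "2019-06-07 21:15:03.529 [main] INFO com.alibaba.otter.canal.deployer.CanalStater - ## start the canal server.",
                "2019-06-07 21:15:22.245 [main] INFO com.alibaba.otter.canal.deployer.CanalStater - ## the canal server is running now ......");
        System.out.println("canal server started: " + started(sample));
    }
}
```

Against a live installation, the list would come from Files.readAllLines on canal.log rather than from hard-coded strings.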

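5. Start the canal client

Note: the canal server must be running before you run the canal client code.

(1) Add the pom dependency

    <dependency>
        <groupId>com.alibaba.otter</groupId>
        <artifactId>canal.client</artifactId>
        <version>1.1.3</version>
    </dependency>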

(2) canal client code:

    package com.xgh.canal;

    import java.net.InetSocketAddress;
    import java.util.List;

    import com.alibaba.otter.canal.client.CanalConnector;
    import com.alibaba.otter.canal.client.CanalConnectors;
    import com.alibaba.otter.canal.protocol.CanalEntry.Column;
    import com.alibaba.otter.canal.protocol.CanalEntry.Entry;
    import com.alibaba.otter.canal.protocol.CanalEntry.EntryType;
    import com.alibaba.otter.canal.protocol.CanalEntry.EventType;
    import com.alibaba.otter.canal.protocol.CanalEntry.RowChange;
    import com.alibaba.otter.canal.protocol.CanalEntry.RowData;
    import com.alibaba.otter.canal.protocol.Message;

    public class CanalClientTest {

        public static void main(String args[]) {
            // Create the connection; use "example2" for the second instance
            CanalConnector connector = CanalConnectors.newSingleConnector(
                    new InetSocketAddress("192.168.175.20", 11111), "example1", "", "");
            int batchSize = 1000;
            int emptyCount = 0;
            try {
                connector.connect();
                connector.subscribe(".*\\..*"); // subscribe to every table in every schema
                // connector.subscribe("canal.t_canal"); // subscribe only to table t_canal in schema canal
                connector.rollback();
                int totalEmptyCount = 1200;
                // In production, use while (true) so the client loops forever
                while (emptyCount < totalEmptyCount) {
                    Message message = connector.getWithoutAck(batchSize); // fetch up to batchSize entries
                    long batchId = message.getId();
                    int size = message.getEntries().size();
                    if (batchId == -1 || size == 0) {
                        emptyCount++;
                        // The database has no changed data at the moment
                        System.out.println("empty count : " + emptyCount);
                        try {
                            Thread.sleep(1000);
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                    } else {
                        emptyCount = 0;
                        System.out.printf("message[batchId=%s,size=%s] \n", batchId, size);
                        printEntry(message.getEntries());
                    }

                    connector.ack(batchId); // confirm the batch
                    // connector.rollback(batchId); // on processing failure, roll the batch back
                }

                System.out.println("empty too many times, exit");
            } finally {
                connector.disconnect();
            }
        }

        private static void printEntry(List<Entry> entries) {
            for (Entry entry : entries) {
                // Skip transaction boundary markers; only row changes are of interest
                if (entry.getEntryType() == EntryType.TRANSACTIONBEGIN
                        || entry.getEntryType() == EntryType.TRANSACTIONEND) {
                    continue;
                }

                RowChange rowChange = null;
                try {
                    rowChange = RowChange.parseFrom(entry.getStoreValue());
                } catch (Exception e) {
                    throw new RuntimeException(
                            "ERROR ## parser of eromanga-event has an error , data:" + entry.toString(), e);
                }
                System.out.println("rowChange ======>" + rowChange.toString());

                EventType eventType = rowChange.getEventType(); // event type, e.g. insert, update, delete
                System.out.println(String.format("================> binlog[%s:%s] , name[%s,%s] , eventType : %s",
                        entry.getHeader().getLogfileName(),   // log-bin name from MySQL's my.cnf
                        entry.getHeader().getLogfileOffset(), // offset
                        entry.getHeader().getSchemaName(),    // schema name
                        entry.getHeader().getTableName(),     // table name
                        eventType));                          // event name

                for (RowData rowData : rowChange.getRowDatasList()) {
                    if (eventType == EventType.DELETE) {
                        printColumn(rowData.getBeforeColumnsList());
                    } else if (eventType == EventType.INSERT) {
                        printColumn(rowData.getAfterColumnsList());
                    } else {
                        System.out.println("-------> before");
                        printColumn(rowData.getBeforeColumnsList());
                        System.out.println("-------> after");
                        printColumn(rowData.getAfterColumnsList());
                    }
                }
            }
        }

        private static void printColumn(List<Column> columns) {
            for (Column column : columns) {
                System.out.println(column.getName() + " : " + column.getValue() + " update=" + column.getUpdated());
            }
        }

    }
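
The client above acknowledges every batch and leaves rollback commented out. The sketch below shows one way the two calls might be paired: acknowledge only after the handler succeeds, roll back otherwise so the same entries are delivered again. To keep it runnable without a canal server, it uses a hypothetical BatchSource interface that mirrors only getWithoutAck, ack and rollback, plus an in-memory fake; none of these types belong to the canal client library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class AckRollbackSketch {

    /** One batch handed out by the source; id -1 means "no data". */
    static final class Batch {
        final long id;
        final List<String> entries;
        Batch(long id, List<String> entries) { this.id = id; this.entries = entries; }
    }

    /** Hypothetical stand-in for the three connector calls the pattern relies on. */
    interface BatchSource {
        Batch getWithoutAck(int batchSize);
        void ack(long batchId);
        void rollback(long batchId);
    }

    /** In-memory fake: entries before 'acked' are confirmed, the rest are redelivered. */
    static final class InMemorySource implements BatchSource {
        private final List<String> log;
        private int acked = 0;
        private int pendingEnd = 0;
        private long nextId = 1;

        InMemorySource(List<String> log) { this.log = log; }

        public Batch getWithoutAck(int batchSize) {
            if (acked >= log.size()) {
                return new Batch(-1, new ArrayList<>());
            }
            pendingEnd = Math.min(log.size(), acked + batchSize);
            return new Batch(nextId++, new ArrayList<>(log.subList(acked, pendingEnd)));
        }

        public void ack(long batchId) { acked = pendingEnd; }

        public void rollback(long batchId) { pendingEnd = acked; }

        int ackedCount() { return acked; }
    }

    /** Fetches one batch; acks it if the handler succeeds, rolls it back if the handler throws. */
    static boolean processOnce(BatchSource source, int batchSize, Consumer<List<String>> handler) {
        Batch batch = source.getWithoutAck(batchSize);
        if (batch.id == -1 || batch.entries.isEmpty()) {
            return false; // nothing to process
        }
        try {
            handler.accept(batch.entries);
            source.ack(batch.id);
            return true;
        } catch (RuntimeException e) {
            source.rollback(batch.id);
            return false;
        }
    }

    public static void main(String[] args) {
        InMemorySource source = new InMemorySource(Arrays.asList("insert#1", "insert#2", "update#3"));

        // The first attempt fails, so the batch is rolled back and nothing is confirmed.
        boolean first = processOnce(source, 2, entries -> {
            throw new IllegalStateException("downstream unavailable");
        });
        System.out.println("first=" + first + " acked=" + source.ackedCount());

        // The retry receives the same entries again and confirms them.
        boolean second = processOnce(source, 2, entries -> entries.forEach(System.out::println));
        System.out.println("second=" + second + " acked=" + source.ackedCount());
    }
}
```

With the real client, processOnce would call the CanalConnector methods of the same names and the handler would be printEntry or its production replacement.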

Sample canal client output while no changes are arriving:

    empty count : 1
    empty count : 2
    empty count : 3
    empty count : 4

6. Trigger database changes

Create the database: create database canal;
Create the table (after selecting the database with use canal;): create table t_canal (id int,name varchar(20),status int);
Insert a row: insert into t_canal values(10,'hello',1);

canal client log output:

    ================> binlog[mysql-bin.000001:6764] , name[canal,t_canal] , eventType : INSERT
    id : 10 update=true
    name : hello update=true
    status : 1 update=true

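3 Q&A: why does the table filter I configured not seem to take effect?

A: First, read the AdminGuide to learn the format of canal.instance.filter.regex. It names the tables whose MySQL data canal parses, written as Perl regular expressions. Multiple expressions are separated by commas (,), and the escape character must be written as a double backslash (\\).
Common examples:

1. All tables: .* or .*\\..*
2. All tables in the canal schema: canal\\..*
3. Tables in the canal schema whose names start with canal: canal\\.canal.*
4. A single table in the canal schema: canal.test1
5. Several rules combined: canal\\..*,mysql.test1,mysql.test2 (comma-separated)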
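
To get a feel for which tables a filter accepts, the examples can be tried against java.util.regex. This is only an approximation: canal evaluates the filter with its own Perl-style matcher, so treat the sketch as a sanity check, assuming whole-name matching against "schema.table". The class and method names are invented here. Note that the doubled backslash in a Java string literal, like the doubled backslash in the properties file, yields a single backslash in the actual expression.

```java
import java.util.regex.Pattern;

public class FilterRegexSketch {

    /** True when "schema.table" fully matches at least one comma-separated pattern. */
    static boolean accepts(String filter, String schema, String table) {
        String name = schema + "." + table;
        for (String pattern : filter.split(",")) {
            if (Pattern.matches(pattern.trim(), name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String filter = "canal\\..*,mysql.test1,mysql.test2";
        System.out.println(accepts(filter, "canal", "t_canal")); // schema canal, any table
        System.out.println(accepts(filter, "mysql", "test1"));   // explicitly listed table
        System.out.println(accepts(filter, "mysql", "user"));    // not listed
    }
}
```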

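Next, check the binlog format. Filters only take effect on row-mode data. (With mixed or statement mode the SQL is not parsed, so the table name cannot be extracted reliably for filtering.)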

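Finally, check whether the CanalConnector calls subscribe(filter). If it does, the filter must be the same as canal.instance.filter.regex in instance.properties; otherwise the subscribe filter overrides the instance configuration. If the subscribe filter is .*\\..*, you are effectively consuming every change, so pay special attention to this.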

References:

  1. https://www.cnblogs.com/jayinnn/p/9606466.html
  2. https://github.com/alibaba/canal
