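Building a real-time index with canal (two approaches: adapter and Spring Boot)

1. canal overview and installation

1.1 Overview

canal emulates the MySQL slave interaction protocol: it poses as a MySQL replica and sends a dump request to the MySQL master. When the master receives the dump request, it starts pushing its binary log to the slave, which here is canal. canal then parses the binary log events, which arrive as a raw byte stream.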
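To make "parsing a byte stream" a little more concrete, here is a minimal sketch in plain Java that decodes the fixed 19-byte header found at the start of every event in a MySQL v4 binlog. It is not canal's parser: the class name BinlogHeaderSketch and its parse method are invented for illustration, and the sample bytes are built by hand rather than captured from a real server.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative only: decodes the 19-byte common header of a MySQL v4 binlog event.
class BinlogHeaderSketch {
    static final int HEADER_LENGTH = 19;

    final long timestamp;     // seconds since the epoch
    final int eventType;      // numeric event type code
    final long serverId;      // server_id of the MySQL instance that wrote the event
    final long eventSize;     // header plus body, in bytes
    final long nextPosition;  // offset of the next event in the binlog file
    final int flags;

    BinlogHeaderSketch(long timestamp, int eventType, long serverId,
                       long eventSize, long nextPosition, int flags) {
        this.timestamp = timestamp;
        this.eventType = eventType;
        this.serverId = serverId;
        this.eventSize = eventSize;
        this.nextPosition = nextPosition;
        this.flags = flags;
    }

    // All multi-byte fields in the header are little-endian and unsigned.
    static BinlogHeaderSketch parse(byte[] bytes) {
        if (bytes.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("need at least " + HEADER_LENGTH + " bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        long timestamp = buf.getInt() & 0xFFFFFFFFL;
        int eventType = buf.get() & 0xFF;
        long serverId = buf.getInt() & 0xFFFFFFFFL;
        long eventSize = buf.getInt() & 0xFFFFFFFFL;
        long nextPosition = buf.getInt() & 0xFFFFFFFFL;
        int flags = buf.getShort() & 0xFFFF;
        return new BinlogHeaderSketch(timestamp, eventType, serverId, eventSize, nextPosition, flags);
    }

    public static void main(String[] args) {
        // Hand-built sample header; 30 is the code MySQL 5.6+ uses for a row insert event.
        byte[] sample = ByteBuffer.allocate(HEADER_LENGTH).order(ByteOrder.LITTLE_ENDIAN)
                .putInt(1600000000).put((byte) 30).putInt(1).putInt(52).putInt(1234)
                .putShort((short) 0).array();
        BinlogHeaderSketch header = parse(sample);
        System.out.println("type=" + header.eventType + " serverId=" + header.serverId
                + " size=" + header.eventSize + " nextPos=" + header.nextPosition);
    }
}
```

canal does this kind of decoding for the header and for every event body, then exposes the result as CanalEntry objects, so client code never has to touch the raw bytes.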

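1.2 Installation

1.2.1 Installing canal.deployer

(1) Enable the MySQL binlog
SHOW VARIABLES LIKE '%log_bin%';
If log_bin is OFF the binlog is disabled; if it is ON the binlog is already enabled.
If it is OFF, enable it first: set log-bin, binlog-format=ROW and a unique server_id under [mysqld] in my.cnf, then restart MySQL.
Then log in and create an account for canal with replication privileges: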
mysql -h localhost -u root -p
create user canal@'%' IDENTIFIED by 'canal';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT,SUPER ON *.* TO 'canal'@'%';
FLUSH PRIVILEGES;

(2) Download and extract
wget https://github.com/alibaba/canal/releases/download/canal-1.1.4/canal.adapter-1.1.4.tar.gz
wget https://github.com/alibaba/canal/releases/download/canal-1.1.4/canal.deployer-1.1.4.tar.gz

mkdir -p /data/canal/canal.adapter-1.1.4 /data/canal/canal.deployer-1.1.4
tar -zxvf canal.adapter-1.1.4.tar.gz -C /data/canal/canal.adapter-1.1.4
tar -zxvf canal.deployer-1.1.4.tar.gz -C /data/canal/canal.deployer-1.1.4

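(3) Edit the configuration file conf/example/instance.properties

# slaveId must differ from the server_id of MySQL and of any other replica
canal.instance.mysql.slaveId=2
canal.instance.master.address=<your-internal-ip>:3306
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal

To see the current binlog file and position, run this in the mysql client:
show master status;

Start canal from the deployer directory:
cd /data/canal/canal.deployer-1.1.4
./bin/startup.sh
Check the logs:
tail -n 100 /data/canal/canal.deployer-1.1.4/logs/canal/canal.log
tail -n 100 /data/canal/canal.deployer-1.1.4/logs/example/example.log
Check that canal is listening on port 11111:
netstat -an | grep 11111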
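
The netstat check above only works on the canal host itself. When the consuming application runs elsewhere, the same question ("is something listening on 11111?") can be asked from Java by attempting a TCP connection with a timeout. This is a minimal sketch; the class name PortProbe is made up, and the host and port are simply the defaults used in this article.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Illustrative helper: reports whether a TCP port accepts connections.
class PortProbe {
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // connect() throws if nothing accepts the connection within the timeout
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumes canal.deployer runs locally on its default port.
        System.out.println("canal reachable: " + isListening("127.0.0.1", 11111, 1000));
    }
}
```

A successful connection only shows that the port is open; it does not prove that the example instance has connected to MySQL, which is what example.log is for.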

 

 

1.2.2 Installing canal.adapter

The adapter forwards the MySQL binlog events captured by canal to another destination, for example an MQ or HBase.

(1) Edit conf/application.yml
server:
  port: 8081
spring:
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8
    default-property-inclusion: non_null

canal.conf:
  mode: tcp
  canalServerHost: 127.0.0.1:11111
  batchSize: 500
  syncBatchSize: 1000
  retries: 0
  timeout:
  accessKey:
  secretKey:
  srcDataSources:
    defaultDS:
      url: jdbc:mysql://127.0.0.1:3306/school-edu?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&useSSL=false&serverTimezone=GMT%2B8&zeroDateTimeBehavior=convertToNull
      username: canal
      password: canal
  canalAdapters:
  - instance: example 
    groups:
    - groupId: g1
      outerAdapters:
       - name: logger
       - name: es
         hosts: 127.0.0.1:9300
         properties:
           cluster.name: elasticsearch

(2) Define a custom yml (under conf/es/) that maps data changes to ES
vim edu.yml
dataSourceKey: defaultDS
destination: example
groupId: g1
esMapping:
  _index: teacher
  _type: _doc
  _id: id
  upsert: true
  sql: "select t.id, t.name, t.career, t.level from edu_teacher t"
  etlCondition: "where t.c_time>={}"
  commitBatch: 3000
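
The etlCondition entry only comes into play for a full import (ETL): as far as the adapter documentation describes it, the condition is appended to sql and each {} is filled with a parameter supplied by the caller. The sketch below imitates that substitution in plain Java to show which query ends up being run. It is not the adapter's code; the class EtlSqlSketch is hypothetical, and turning each {} into a JDBC-style ? with a bound value is an assumption about how one would do this safely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: combines the mapping's sql and etlCondition into one statement.
class EtlSqlSketch {
    final String sql;               // statement with ? placeholders
    final List<String> bindValues;  // values to bind, in placeholder order

    EtlSqlSketch(String sql, List<String> bindValues) {
        this.sql = sql;
        this.bindValues = bindValues;
    }

    static EtlSqlSketch build(String baseSql, String etlCondition, List<String> params) {
        StringBuilder condition = new StringBuilder();
        List<String> values = new ArrayList<>();
        int from = 0;
        int hole;
        // Replace each {} with ? and remember the matching parameter.
        while ((hole = etlCondition.indexOf("{}", from)) >= 0) {
            if (values.size() >= params.size()) {
                throw new IllegalArgumentException("not enough params for etlCondition");
            }
            condition.append(etlCondition, from, hole).append('?');
            values.add(params.get(values.size()));
            from = hole + 2;
        }
        condition.append(etlCondition.substring(from));
        return new EtlSqlSketch(baseSql + " " + condition, values);
    }

    public static void main(String[] args) {
        EtlSqlSketch etl = build(
                "select t.id, t.name, t.career, t.level from edu_teacher t",
                "where t.c_time>={}",
                Arrays.asList("2020-01-01 00:00:00"));
        System.out.println(etl.sql);
        System.out.println(etl.bindValues);
    }
}
```

With the mapping above, a full import limited to rows changed since a given time therefore needs exactly one parameter, because etlCondition contains a single {}.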
1.2.3 Result

[Image 1: screenshot of the successful adapter sync]

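Problems encountered:

(1) With MySQL 8, or a recent MySQL 5.7 release, a jar has to be replaced (most likely the MySQL JDBC driver bundled with the adapter).

(2) The es adapter must be given cluster.name under properties (here cluster.name: elasticsearch); without it a NullPointerException is thrown.

(3) A row has to be indexed first and only then modified. Otherwise the document in ES contains just the modified fields instead of the whole record, as with id=12 in the screenshot.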
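
Problem (3) is easier to see with a toy model. The sketch below uses an in-memory map as a stand-in for the ES index and applies an update that carries only the changed column, with upsert semantics. If the document was indexed beforehand, the change is merged into it; if not, the upsert creates a document holding just that one field. This is an illustration of the behaviour described above under those assumptions, not Elasticsearch or adapter code, and every name in it is invented.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of an index: document id -> document source.
class UpsertSketch {
    private final Map<String, Map<String, Object>> index = new HashMap<>();

    // Full import: stores the whole row as the document.
    void indexFull(String id, Map<String, Object> fullRow) {
        index.put(id, new LinkedHashMap<>(fullRow));
    }

    // Incremental update with upsert: merges into an existing document,
    // or creates a new one that contains only the changed fields.
    void upsertPartial(String id, Map<String, Object> changedFields) {
        index.computeIfAbsent(id, key -> new LinkedHashMap<>()).putAll(changedFields);
    }

    Map<String, Object> get(String id) {
        return index.get(id);
    }

    public static void main(String[] args) {
        UpsertSketch es = new UpsertSketch();

        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 11);
        row.put("name", "Zhang");
        row.put("level", 1);
        es.indexFull("11", row);

        Map<String, Object> change = new LinkedHashMap<>();
        change.put("level", 2);
        es.upsertPartial("11", change); // indexed before: the full document is kept
        es.upsertPartial("12", change); // never indexed: only the changed field exists

        System.out.println(es.get("11"));
        System.out.println(es.get("12"));
    }
}
```

Running a full import before relying on incremental sync avoids ending up with documents like the second one.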

 

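2. Building a real-time index with canal and Spring Boot

2.1 Code and approach

In short: (1) listen on canal's port 11111 to receive the real-time stream of changes; (2) parse each change to find out which database, which table and which columns it touches; (3) rebuild the index from the parsed result.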

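import com.alibaba.otter.canal.client.CanalConnector;
import com.alibaba.otter.canal.protocol.CanalEntry;
import com.alibaba.otter.canal.protocol.Message;
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.BeansException;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationContext;
import org.springframework.context.ApplicationContextAware;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

import javax.annotation.Resource;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// CourseEsService is the project's own service and is assumed to live in the same package.
@Component
@Slf4j
public class CanalScheduling implements Runnable, ApplicationContextAware {

    private ApplicationContext applicationContext;

    @Autowired
    private CourseEsService courseEsService;

    // Assumed to be connected and subscribed already where the bean is created.
    @Resource
    private CanalConnector canalConnector;

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    @Override
    @Scheduled(fixedDelay = 100)
    public void run() {
        long batchId = -1;
        try {
            int batchSize = 1000;
            // Fetch up to 1000 entries from canal without acknowledging them yet
            Message message = canalConnector.getWithoutAck(batchSize);
            batchId = message.getId();
            List<CanalEntry.Entry> entries = message.getEntries();
            if (batchId != -1 && !entries.isEmpty()) {
                for (CanalEntry.Entry entry : entries) {
                    // Only row changes are of interest
                    if (entry.getEntryType() == CanalEntry.EntryType.ROWDATA) {
                        // Parse and handle the change
                        publishCanalEvent(entry);
                    }
                }
            }
            canalConnector.ack(batchId);
        } catch (Exception e) {
            log.error("Failed to process canal batch {}", batchId, e);
            // Roll back only if a batch was actually fetched, so canal redelivers it
            if (batchId != -1) {
                canalConnector.rollback(batchId);
            }
        }
    }

    private void publishCanalEvent(CanalEntry.Entry entry) throws Exception {
        // The binlog event type (insert, update, delete, ...) is available here
        // if the business logic needs to branch on it:
        // CanalEntry.EventType eventType = entry.getHeader().getEventType();

        // Database and table the change belongs to
        String database = entry.getHeader().getSchemaName();
        String table = entry.getHeader().getTableName();
        CanalEntry.RowChange change = CanalEntry.RowChange.parseFrom(entry.getStoreValue());
        for (CanalEntry.RowData rowData : change.getRowDatasList()) {
            // Column values after the change; this list is empty for DELETE events
            List<CanalEntry.Column> columns = rowData.getAfterColumnsList();
            // Find the primary key column of the changed row
            String primaryKey = "id";
            CanalEntry.Column idColumn = columns.stream()
                    .filter(column -> column.getIsKey() && primaryKey.equals(column.getName()))
                    .findFirst().orElse(null);
            if (idColumn == null) {
                // Without an id there is nothing to look up or index
                continue;
            }
            Map<String, Object> dataMap = parseColumnsToMap(columns);
            // Use database, table and id to reload the data and rebuild the index
            reBuildEs(dataMap, database, table);
        }
    }

    Map<String, Object> parseColumnsToMap(List<CanalEntry.Column> columns) {
        Map<String, Object> jsonMap = new HashMap<>();
        columns.forEach(column -> {
            if (column == null) {
                return;
            }
            jsonMap.put(column.getName(), column.getValue());
        });
        return jsonMap;
    }

    /**
     * Rebuilds the index according to the business logic.
     *
     * @param dataMap  column name to value of the changed row
     * @param database database the row belongs to
     * @param table    table the row belongs to
     * @throws Exception if the index cannot be rebuilt
     */
    private void reBuildEs(Map<String, Object> dataMap, String database, String table) throws Exception {
        // Business-specific indexing goes here
    }

    @Override
    public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
        this.applicationContext = applicationContext;
    }
}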
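
reBuildEs is left empty above because its body depends on the business. One possible shape, sketched here in plain Java without the canal or Elasticsearch dependencies, is a small router keyed by "database.table" that turns the column map into an (index, id, source) triple; the actual write would then go through whatever ES client the project uses. The class names, the routing key format and the mapping of school-edu.edu_teacher to the teacher index are assumptions borrowed from the adapter example in section 1, not something the original code defines.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative only: decides which index a changed row belongs to.
class IndexRouterSketch {

    // What an ES client would need in order to index one document.
    static final class IndexRequestData {
        final String index;
        final String id;
        final Map<String, Object> source;

        IndexRequestData(String index, String id, Map<String, Object> source) {
            this.index = index;
            this.id = id;
            this.source = source;
        }
    }

    private final Map<String, String> tableToIndex = new HashMap<>();

    void register(String database, String table, String index) {
        tableToIndex.put(database + "." + table, index);
    }

    // Returns empty when the table is not indexed or the row carries no id.
    Optional<IndexRequestData> route(Map<String, Object> dataMap, String database, String table) {
        String index = tableToIndex.get(database + "." + table);
        Object id = dataMap.get("id");
        if (index == null || id == null) {
            return Optional.empty();
        }
        return Optional.of(new IndexRequestData(index, String.valueOf(id), new LinkedHashMap<>(dataMap)));
    }

    public static void main(String[] args) {
        IndexRouterSketch router = new IndexRouterSketch();
        router.register("school-edu", "edu_teacher", "teacher");

        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", "12");
        row.put("name", "Li");

        router.route(row, "school-edu", "edu_teacher").ifPresent(request ->
                System.out.println(request.index + "/" + request.id + " " + request.source));
    }
}
```

Keeping the routing separate from the ES call makes it easy to ignore tables that are not indexed and to unit-test the mapping without a running cluster.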

2.2 Result

[Image 2: screenshot of the result]
