Building a Snapshot of SQL Server Tables with Debezium + Kafka

This post does not cover how ZooKeeper, Kafka, or Debezium work internally, and it does not use Docker.

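Requirements

  • A SQL Server instance that is already deployed or running
  • A Windows 10 host that can run WSL2, or a Linux host (a virtual machine or Docker container also works); it must be able to connect to SQL Server

Install Java. I am using WSL2 with Debian buster (10).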

 # On Debian buster the default package is openjdk-11; check with: java --version
 sudo apt install default-jdk

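Install Kafka

# Creating a directory under / needs root; hand it to the current user afterwards
sudo mkdir /kafka
sudo chown "$USER" /kafka
cd /kafka
wget https://archive.apache.org/dist/kafka/2.8.0/kafka_2.13-2.8.0.tgz
tar -xzf ./kafka_2.13-2.8.0.tgz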

Install Debezium (the 1.6 final release is used here, specifically 1.6.1.Final)

 # The directory name must match the extracted tarball (kafka_2.13-2.8.0)
 mkdir /kafka/kafka_2.13-2.8.0/connect
 cd /kafka/kafka_2.13-2.8.0/connect
 wget https://repo1.maven.org/maven2/io/debezium/debezium-connector-sqlserver/1.6.1.Final/debezium-connector-sqlserver-1.6.1.Final-plugin.tar.gz
 tar -xzf debezium-connector-sqlserver-1.6.1.Final-plugin.tar.gz

Edit the Kafka configuration

code /kafka/kafka_2.13-2.8.0/config/connect-distributed.properties
# In connect-distributed.properties, point plugin.path at the directory that holds Debezium
plugin.path=/kafka/kafka_2.13-2.8.0/connect

# Adjust the listeners
code /kafka/kafka_2.13-2.8.0/config/server.properties
listeners=PLAINTEXT://:9092
# Advertise the WSL2 LAN IP (replace with your own address)
advertised.listeners=PLAINTEXT://192.168.200.199:9092
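
The WSL2 address tends to change between restarts, so it can help to generate the advertised.listeners line rather than type it by hand. The following is a minimal sketch; make_advertised_line and current_ip are hypothetical helper names, not part of Kafka, and it assumes the first address reported by hostname -I is the one other machines should use.

```shell
# Hypothetical helper: build the advertised.listeners line for a given IP.
# The port defaults to 9092, matching the listeners setting above.
make_advertised_line() {
  ip="$1"
  port="${2:-9092}"
  printf 'advertised.listeners=PLAINTEXT://%s:%s\n' "$ip" "$port"
}

# Hypothetical helper: first address reported by hostname -I
# (assumption: on WSL2 this is the eth0 address you want to advertise).
current_ip() {
  hostname -I | awk '{print $1}'
}
```

Running make_advertised_line "$(current_ip)" prints a line that can be pasted into server.properties.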

Start ZooKeeper, Kafka, and Kafka Connect, in that order

# Each command runs in the foreground, so open one terminal per command; this also lets you watch the logs
cd /kafka/kafka_2.13-2.8.0/bin
./zookeeper-server-start.sh ../config/zookeeper.properties
./kafka-server-start.sh ../config/server.properties
./connect-distributed.sh ../config/connect-distributed.properties
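
Once Kafka Connect is up, it is worth confirming that it actually picked up the Debezium plugin from plugin.path before going further. A small sketch: has_plugin is a hypothetical helper that reads the JSON returned by the Kafka Connect REST endpoint GET /connector-plugins on stdin and looks for a connector class name. It assumes Connect listens on the default port 8083.

```shell
# Hypothetical helper: succeed if the given connector class appears
# in the /connector-plugins JSON supplied on stdin.
has_plugin() {
  grep -qF "$1"
}

# Usage (requires a running Kafka Connect worker on localhost:8083):
#   curl -s http://localhost:8083/connector-plugins \
#     | has_plugin io.debezium.connector.sqlserver.SqlServerConnector \
#     && echo "Debezium SQL Server plugin loaded"
```

If the class is missing, the usual cause is a plugin.path that does not point at the directory where the Debezium archive was extracted.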

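Enable CDC on SQL Server

-- Enable CDC on the database (use your own database; it must match database.dbname in the connector config)
-- Note: the CDC capture job needs SQL Server Agent to be running
USE AdventureWorks2012;
GO
EXECUTE sys.sp_cdc_enable_db;
GO
-- Enable change tracking for the table
EXECUTE sys.sp_cdc_enable_table
    @source_schema = N'dbo'
  , @source_name = N'Customer'
  , @role_name = N'public';
GO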
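
To confirm that CDC is really on before wiring up the connector, you can query the catalog views sys.databases (column is_cdc_enabled) and sys.tables (column is_tracked_by_cdc). The sketch below only builds the query text; cdc_check_query is a hypothetical helper, and the usage line assumes the sqlcmd client is installed and that the credentials and database name match your environment.

```shell
# Hypothetical helper: print T-SQL that reports whether CDC is enabled
# for the current database and for the given table name.
cdc_check_query() {
  table="$1"
  cat <<EOF
SELECT name, is_cdc_enabled FROM sys.databases WHERE name = DB_NAME();
SELECT name, is_tracked_by_cdc FROM sys.tables WHERE name = N'${table}';
EOF
}

# Usage (assumes sqlcmd is available; credentials are placeholders):
#   sqlcmd -S localhost -U sa -P 'password' -d AdventureWorks2012 \
#     -Q "$(cdc_check_query Customer)"
```

Both columns should report 1; if is_tracked_by_cdc is 0, the sp_cdc_enable_table call did not take effect for that table.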

Add a connector to Kafka Connect

# database.server.name must not be empty: it is the logical server name that prefixes every topic.
# "yourserver" below is a placeholder; pick your own name.
curl --location --request POST 'http://localhost:8083/connectors/' \
--header 'Content-Type: application/json' \
--data-raw '{
  "name": "sqlserver-yourdb-connector",
  "config": {
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "database.hostname": "localhost",
    "database.port": "1433",
    "database.user": "sa",
    "database.password": "password",
    "database.dbname": "yourdb",
    "table.include.list": "dbo.Customer",
    "database.server.name": "yourserver",
    "database.history.kafka.bootstrap.servers": "localhost:9092",
    "database.history.kafka.topic": "dbhistory.yourdb"
  }
}'
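
After the POST, the Kafka Connect REST endpoint GET /connectors/{name}/status reports the state of the connector and its tasks. The following sketch is one way to turn that response into a pass/fail check; connector_ok is a hypothetical helper that reads the status JSON on stdin and treats any FAILED state as a failure.

```shell
# Hypothetical helper: read the status JSON on stdin.
# Fails if anything is FAILED, succeeds only if something is RUNNING.
connector_ok() {
  status_json="$(cat)"
  case "$status_json" in
    *FAILED*)  return 1 ;;
    *RUNNING*) return 0 ;;
    *)         return 1 ;;
  esac
}

# Usage (requires a running Kafka Connect worker on localhost:8083):
#   curl -s http://localhost:8083/connectors/sqlserver-yourdb-connector/status \
#     | connector_ok && echo "connector is running"
```

When a task has failed, the same status response carries a trace field with the stack trace, which is usually the quickest way to see what went wrong.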

Done

At this point, if you connect to ZooKeeper with Kafka Tool or a similar tool, you should be able to see the data.
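
If you prefer the command line, the console consumer that ships with Kafka can read the change topic directly. The Debezium SQL Server connector names each table's topic serverName.schemaName.tableName, where serverName is the value of database.server.name. In this sketch topic_for is a hypothetical helper and yourserver is a placeholder for whatever logical name you configured.

```shell
# Hypothetical helper: build the Debezium topic name for a table,
# following the serverName.schemaName.tableName convention.
topic_for() {
  printf '%s.%s.%s\n' "$1" "$2" "$3"
}

# Usage (run from Kafka's bin directory; "yourserver" is a placeholder):
#   ./kafka-console-consumer.sh --bootstrap-server localhost:9092 \
#     --topic "$(topic_for yourserver dbo Customer)" --from-beginning
```

With --from-beginning the consumer replays the topic from its first record, so the snapshot rows show up before any later changes.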

Keep in mind that if the connector is created on a very large table, you may have to wait several hours for Debezium to finish the initial snapshot.

References

  • https://docs.microsoft.com/en-us/sql/relational-databases/track-changes/about-change-data-capture-sql-server?view=sql-server-ver15
  • https://debezium.io/documentation/reference/1.6/connectors/sqlserver.html
  • https://hevodata.com/learn/kafka-cdc-sql-server/
