Enabling binlog in TiDB 2.1 and sending the data to Kafka

 After binlog is enabled, TiDB can feed the data downstream to MySQL, protocol buffer files (pb), another TiDB, TiFlash, or Kafka for further use.

TiDB's binlog is disabled by default and has to be enabled manually; the slow query log is also disabled by default.

$ vim /home/tidb/tidb-ansible/inventory.ini
tidb_version = v2.1.4

## binlog trigger
enable_binlog = True

# kafka cluster address for monitoring, example:
# kafka_addrs = "192.168.0.11:9092,192.168.0.12:9092,192.168.0.13:9092"
kafka_addrs = ""

# zookeeper address of kafka cluster for monitoring, example:
# zookeeper_addrs = "192.168.0.11:2181,192.168.0.12:2181,192.168.0.13:2181"
zookeeper_addrs = ""

# store slow query log into separate file
enable_slow_query_log = True

$ vim /home/tidb/tidb-ansible/conf/tidb.yml

log:
  # Log level: debug, info, warn, error, fatal.
  # level: "info"

  # Log format, one of json, text, console.
  # format: "text"

  # Disable automatic timestamps in output
  # disable-timestamp: false

  # Queries with execution time greater than this value will be logged. (Milliseconds)
  slow-threshold: 300

  # Queries with internal result greater than this value will be logged.
  expensive-threshold: 10000

  # Maximum query length recorded in log.
  query-log-max-len: 2048

  # File logging.
  file:
    # Max log file size in MB. (upper limit to 4096MB).
    max-size: 300

    # Max log file keep days. No clean up by default.
    max-days: 10

    # Maximum number of old log files to retain. No clean up by default.
    max-backups: 7

    # Rotate log by day
    log-rotate: true
....
binlog:
  # WriteTimeout specifies how long it will wait for writing binlog to pump.
  # write-timeout: "15s"

  # If IgnoreError is true, when writing binlog meets an error, TiDB would stop writing binlog,
  # but still provide service.
  # ignore-error: false
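
With the binlog switches set in inventory.ini and tidb.yml, Pump still has to be deployed and started, and the tidb-server instances rolling-updated, before any binlog is actually written. A minimal sketch of the usual tidb-ansible flow, assuming the Pump hosts are already listed in the [pump_servers] group (verify the playbook tags against your tidb-ansible version):

$ cd /home/tidb/tidb-ansible
$ ansible-playbook deploy.yml --tags=pump          # deploy the Pump component
$ ansible-playbook start.yml --tags=pump           # start Pump
$ ansible-playbook rolling_update.yml --tags=tidb  # restart tidb-server so it begins writing binlog to Pump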

--Drainer configuration:
TiDB Binlog was redesigned starting with the TiDB 2.1 GA release:
$ vim /home/tidb/tidb-ansible/conf/drainer.toml 
# drainer Configuration.

# the interval time (in seconds) of detect pumps' status
detect-interval = 10

# syncer Configuration.
[syncer]

# disable sync these schema
ignore-schemas = "INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql"

# number of binlog events in a transaction batch
txn-batch = 1

# worker count to execute binlogs
worker-count = 1

disable-dispatch = false

# safe mode will split update to delete and insert
safe-mode = false

# downstream storage, equal to --dest-db-type
# valid values are "mysql", "pb", "tidb", "flash", "kafka"
#db-type = "mysql"
db-type = "kafka"

## replicate-do-db has priority over replicate-do-table if they have the same db name.
## Regular expressions are supported; a name starting with '~' is treated as a regular expression.
#
#replicate-do-db = ["~^b.*","s1"]
#[[syncer.replicate-do-table]]
#db-name ="test"
#tbl-name = "log"

#[[syncer.replicate-do-table]]
#db-name ="test"
#tbl-name = "~^a.*"

# the downstream mysql protocol database
#[syncer.to]
#host = "127.0.0.1"
#user = "root"
#password = ""
#port = 3306
# Time and size limits for flash batch write
# time-limit = "30s"
# size-limit = "100000"


# Uncomment this if you want to use pb or sql as db-type.
# Compress compresses output file, like pb and sql file. Now it supports "gzip" algorithm only. 
# Values can be "gzip". Leave it empty to disable compression. 
#[syncer.to]
#compression = ""


# When db-type is kafka, you can uncomment this to configure the downstream Kafka; otherwise the global Kafka config is used by default.
[syncer.to] 
# Note: zookeeper-addrs does not have to be configured.
# Only one of zookeeper-addrs and kafka-addrs needs to be set; the Kafka address is looked up from ZooKeeper if zookeeper-addrs is configured.
zookeeper-addrs = "127.0.0.1:2181"
kafka-addrs = "127.0.0.1:9092"
kafka-version = "0.8.2.0"
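
Before deploying Drainer it is worth checking that the Kafka and ZooKeeper addresses above are reachable from the Drainer host. A quick sanity check, assuming Kafka's bundled command-line tools are installed under /opt/kafka on that host (adjust the path to your installation):

$ /opt/kafka/bin/kafka-topics.sh --zookeeper 127.0.0.1:2181 --list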

-- Check the Pump status:

$ /home/tidb/tidb-ansible/resources/bin/binlogctl -pd-urls=http://172.16.4.173:2379,172.16.4.174:2379,172.16.4.175:2379 -cmd pumps
INFO[0000] pump: {NodeID: cdh-tikv1-4-173:8250, Addr: 172.16.4.173:8250, State: online, MaxCommitTS: 406698390188720129, UpdateTime: 2019-03-01 17:07:17 +0800 CST} 
INFO[0000] pump: {NodeID: cdh-tikv2-4-174:8250, Addr: 172.16.4.174:8250, State: online, MaxCommitTS: 406698389480931329, UpdateTime: 2019-03-01 17:07:15 +0800 CST} 
INFO[0000] pump: {NodeID: cdh-tikv3-4-175:8250, Addr: 172.16.4.175:8250, State: online, MaxCommitTS: 406698389664432129, UpdateTime: 2019-03-01 17:07:15 +0800 CST} 

--Generate the initial commit timestamp (initial_commit_ts) for Drainer:

/home/tidb/tidb-ansible/resources/bin/binlogctl -pd-urls=http://172.16.4.173:2379,172.16.4.174:2379,172.16.4.175:2379 -cmd generate_meta
INFO[0000] [pd] create pd client with endpoints [http://172.16.4.173:2379 172.16.4.174:2379 172.16.4.175:2379] 
INFO[0000] [pd] leader switches to: http://172.16.4.174:2379, previous:  
INFO[0000] [pd] init cluster id 6663269911818749478     
INFO[0000] meta: &{CommitTS:406698423363043329}  

--Write the commit timestamp into the inventory file:

[drainer_servers]
drainer_kafka ansible_host=172.16.4.173 initial_commit_ts="406698423363043329"

--Deploy and start the Drainer service from the control machine:

 $ cd /home/tidb/tidb-ansible/conf
 $ cp drainer.toml  drainer_kafka_drainer.toml
 $ vim drainer_kafka_drainer.toml
 
 Deploy Drainer:
$ ansible-playbook deploy_drainer.yml
 Start Drainer:
$ ansible-playbook start_drainer.yml
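
Once started, the Drainer registers itself in PD just like the Pumps, so its state can be checked with binlogctl in the same way (using the drainers subcommand instead of pumps):

$ /home/tidb/tidb-ansible/resources/bin/binlogctl -pd-urls=http://172.16.4.173:2379,172.16.4.174:2379,172.16.4.175:2379 -cmd drainers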

--Perform DDL and DML operations on the source TiDB cluster:
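
For example, a small test table can be created and modified through the MySQL protocol port of tidb-server (4000 by default); the database, table and values below are made up purely for illustration:

$ mysql -h 172.16.4.173 -P 4000 -u root
mysql> CREATE DATABASE IF NOT EXISTS binlog_test;
mysql> CREATE TABLE binlog_test.t1 (id INT PRIMARY KEY, name VARCHAR(32), age INT, city VARCHAR(32));
mysql> INSERT INTO binlog_test.t1 VALUES (1, 'a', 20, 'bj'), (2, 'b', 21, 'sh'), (3, 'c', 22, 'gz');
mysql> UPDATE binlog_test.t1 SET city = 'sz' WHERE id <= 3;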

--Compile the official driver:
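
The "official driver" is the TiDB Binlog slave client that PingCAP publishes in the tidb-tools repository (tidb-binlog/driver), which ships with a sample Kafka consumer. A rough sketch of fetching and building it; the example path is an assumption and may differ between tidb-tools versions, so check the repository layout first:

$ git clone https://github.com/pingcap/tidb-tools.git
$ cd tidb-tools/tidb-binlog/driver/example/kafka    # path assumed; locate the driver example in your checkout
$ go build -o binlog-kafka-consumer .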

--Point the consumer at the Kafka topic and offset:
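
Drainer publishes the binlog to a Kafka topic; when no topic name is configured explicitly, the name is derived from the cluster ID (by default <cluster_id>_obinlog, which for the cluster above would be 6663269911818749478_obinlog). The topic can be confirmed and then handed to the consumer; the consumer flags below are hypothetical, so check the sample program's --help for the real ones:

$ /opt/kafka/bin/kafka-topics.sh --zookeeper 127.0.0.1:2181 --list | grep obinlog
$ ./binlog-kafka-consumer -topic 6663269911818749478_obinlog -kafkaAddr 127.0.0.1:9092    # flag names are hypothetical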

--View the contents of the protocol buffer output:

recv: commit_ts:406855570293522434 dml_data:<... column_info:<...> column_info:<...> column_info:<...> column_info:<...>
 mutations:<columns:<...> columns:<...> columns:<...> columns:<...> change_row:<columns:<...> columns:<...> columns:<...> columns:<...>>>
 mutations:<columns:<...> columns:<...> columns:<...> columns:<...> change_row:<columns:<...> columns:<...> columns:<...> columns:<...>>>
 mutations:<columns:<...> columns:<...> columns:<...> columns:<...> change_row:<columns:<...> columns:<...> columns:<...> columns:<...>>> >
(the table/column names, MySQL types and row values inside the angle brackets were lost when the output was pasted; each mutations entry corresponds to one modified row, and updates carry a change_row in addition to columns)

The data that TiDB Binlog writes out is encoded in the protocol buffer format, which is what the output above shows in text form.
