Start Kafka Connect

Start the Kafka broker

  1. Download the Kafka package from the official archive:
wget "https://archive.apache.org/dist/kafka/0.11.0.1/kafka_2.12-0.11.0.1.tgz" -O kafka_2.12-0.11.0.1.tgz
tar -xvzf kafka_2.12-0.11.0.1.tgz
cd kafka_2.12-0.11.0.1
  2. Start ZooKeeper:
nohup ./bin/zookeeper-server-start.sh config/zookeeper.properties > /tmp/zk.log 2>&1 &
  3. Start the Kafka broker:
nohup ./bin/kafka-server-start.sh config/server.properties > /tmp/kafka.log 2>&1 &
  4. Create the Kafka topic and list topics to confirm it exists (a programmatic check follows this list):
./bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic connect-test --partitions 1 --replication-factor 1
./bin/kafka-topics.sh --list --zookeeper localhost:2181
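
The same check can be done from code instead of the CLI. Below is a minimal sketch using the AdminClient from kafka-clients (declared in the POM below); localhost:9092 is the broker's default listener address assumed throughout this post.

import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

public class BrokerCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // Should print a set of topic names containing "connect-test"
            System.out.println(admin.listTopics().names().get());
        }
    }
}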

Manage the POM dependencies

The connect-api, connect-runtime, connect-file, and connect-json dependencies are required. Because the version of jetty-util pulled in by connect-runtime is too old, a jetty-util dependency is declared explicitly.

The pom.xml is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.ict.add</groupId>
    <artifactId>connect-demo</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.7.5</version>
        </dependency>

        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-clients</artifactId>
            <version>1.0.1</version>
            <scope>compile</scope>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>log4j</groupId>
                    <artifactId>log4j</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-streams</artifactId>
            <version>1.0.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>connect-api</artifactId>
            <version>1.0.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>connect-runtime</artifactId>
            <version>1.0.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>connect-file</artifactId>
            <version>1.0.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>connect-json</artifactId>
            <version>1.0.1</version>
        </dependency>

        <dependency>
            <groupId>org.rocksdb</groupId>
            <artifactId>rocksdbjni</artifactId>
            <version>5.8.6</version>
        </dependency>

        <dependency>
            <groupId>org.eclipse.jetty</groupId>
            <artifactId>jetty-util</artifactId>
            <version>9.2.15.v20160210</version>
        </dependency>

        <dependency>
            <groupId>org.testng</groupId>
            <artifactId>testng</artifactId>
            <version>6.14.2</version>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.6.1</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

Start the Kafka connector

package com.ict.add;

import org.apache.kafka.connect.cli.ConnectStandalone;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.testng.annotations.Test;

public class ConnectTest {
    private static final Logger LOGGER = LoggerFactory.getLogger(ConnectTest.class);

    @Test
    public void testConnectStandalone() {
        String workerConfigFile = "connect-standalone.properties";
        String sourceConnectFile = "connect-file-source.properties";
        String sinkConnectFile = "connect-file-sink.properties";
        String[] args = {workerConfigFile, sourceConnectFile, sinkConnectFile};
        try {
            // ConnectStandalone expects the worker config first, followed by
            // one or more connector configs; this call blocks while the
            // embedded worker runs.
            ConnectStandalone.main(args);
        } catch (Exception e) {
            LOGGER.error("error ", e);
        }
    }

}

The testConnectStandalone unit test passes three properties files:

  1. connect-standalone.properties is the configuration for the Kafka Connect worker (the JSON envelope its converters produce is illustrated in the sketch after this list):
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
rest.port=18083
plugin.path=
  2. connect-file-source.properties is the source connector configuration. The source connector watches source.file.log for changes and writes each new line to the Kafka topic:
name=local-file-source
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=source.file.log
topic=connect-test
  3. connect-file-sink.properties is the sink connector configuration. The sink connector consumes the Kafka topic and appends the records to sink.file.log:
name=local-file-sink
connector.class=org.apache.kafka.connect.file.FileStreamSinkConnector
tasks.max=1
file=sink.file.log
topics=connect-test
  4. log4j.properties is the log4j configuration file, placed under test/resources/:
# Set root logger level to DEBUG and its only appender to A1.
log4j.rootLogger=INFO, stdout
# stdout appender is set to be a ConsoleAppender.
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Threshold=INFO
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d [%t] %-5p %c{1} %x - %m%n
log4j.logger.org.reflections=ERROR
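
With key.converter.schemas.enable=true and value.converter.schemas.enable=true, the worker's JsonConverter wraps each record in a {schema, payload} JSON envelope before it is written to the topic. Below is a minimal sketch of that behavior, using the connect-json dependency from the POM above (the topic name and string value are just placeholders):

import java.nio.charset.StandardCharsets;
import java.util.Collections;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.json.JsonConverter;

public class JsonConverterDemo {
    public static void main(String[] args) {
        JsonConverter converter = new JsonConverter();
        // false = configure as a value converter; schemas.enable mirrors
        // value.converter.schemas.enable in connect-standalone.properties
        converter.configure(Collections.singletonMap("schemas.enable", "true"), false);
        byte[] bytes = converter.fromConnectData("connect-test", Schema.STRING_SCHEMA, "hello");
        // Prints: {"schema":{"type":"string","optional":false},"payload":"hello"}
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}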

Running the testConnectStandalone unit test starts two connectors: a FileStreamSourceConnector and a FileStreamSinkConnector.
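
Because the worker config above sets rest.port=18083, the standalone worker's REST API gives a quick way to confirm that both connectors are registered. A minimal sketch with plain HttpURLConnection (no extra dependencies assumed):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class ConnectorListCheck {
    public static void main(String[] args) throws Exception {
        // rest.port=18083 comes from connect-standalone.properties above
        URL url = new URL("http://localhost:18083/connectors");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            // Expected: ["local-file-source","local-file-sink"]
            System.out.println(in.readLine());
        }
    }
}

Now append a few lines to the source file and check the sink file: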

touch source.file.log sink.file.log
echo 'hello' >>source.file.log
echo 'world' >>source.file.log
date >>source.file.log
cat sink.file.log

The contents of sink.file.log are identical to what was just written to source.file.log.

./bin/kafka-console-consumer.sh  --zookeeper localhost:2181 --from-beginning --topic connect-test

Starting a Kafka console consumer shows that the consumed records are exactly the lines just written to source.file.log.
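
The same check can be done programmatically with kafka-clients. Note that, because of the JsonConverter settings above, each consumed value is a {schema, payload} JSON envelope rather than the raw line. A minimal sketch (the group id is arbitrary):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConnectTestConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "connect-test-checker");
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("connect-test"));
            // poll(long) is the pre-2.0 API matching the 1.0.1 client in the POM
            ConsumerRecords<String, String> records = consumer.poll(5000L);
            for (ConsumerRecord<String, String> record : records) {
                // e.g. {"schema":{"type":"string","optional":false},"payload":"hello"}
                System.out.println(record.value());
            }
        }
    }
}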
