Collecting data with Flume + Kafka, the super simple way

A note on the title: what follows is only one small piece of a real-time architecture.

Download Flume (the latest release at the time of writing): apache-flume-1.6.0-bin.tar.gz

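Extract the archive and edit conf/flume-conf.properties. The file name is arbitrary, as long as it matches the one passed to -f when starting the agent.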

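What I have working so far reads data from a directory and writes it to Kafka. There is plenty of material online about the underlying principles, so let's go straight to the configuration: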

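# Name the components of agent a1: one source, one sink, one channel.
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Source: watch a spooling directory and ingest the files dropped into it.
a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
a1.sources.r1.spoolDir = /data/pv/20150812/
# Add a header to each event holding the absolute path of the file it came from.
a1.sources.r1.fileHeader = true

# Channel: buffer events in memory between the source and the sink.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
a1.channels.c1.byteCapacityBufferPercentage = 20
a1.channels.c1.byteCapacity = 800000

# Sink: publish events from the channel to the Kafka topic testflume.
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = testflume
a1.sinks.k1.brokerList = xxxx:9092,xxxx:9092,xxxx:9092
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize = 20
a1.sinks.k1.channel = c1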

Start Flume:

./bin/flume-ng agent -n a1 -c conf -f conf/flume-conf.properties


Then check the data from the Kafka side:


./bin/kafka-console-consumer.sh  --zookeeper xxxx:2181/kafka --topic testflume


You should see the data flowing into Kafka continuously.
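
One practical detail about feeding the spooldir source used above: Flume expects a file to be complete and unchanged once it appears in the spooling directory, and it does not accept a file name that was used before. A common way to respect both rules is to write the file somewhere else and then move it in under a unique name. The script below is only a rough sketch of that idea. SPOOL_DIR, STAGE_DIR, the pv-*.log naming scheme, and the sample lines are all assumptions made for illustration; by default it uses temporary directories so that it can be tried without touching a real agent.

```shell
#!/bin/sh
set -eu

# Assumed locations, for illustration only. For real use, point SPOOL_DIR at
# the directory configured as a1.sources.r1.spoolDir, and keep STAGE_DIR on
# the same filesystem so that mv is a rename rather than a copy.
SPOOL_DIR="${SPOOL_DIR:-$(mktemp -d)}"
STAGE_DIR="${STAGE_DIR:-$(mktemp -d)}"

# Hypothetical naming scheme: timestamp plus shell PID, so names are not reused.
FILE_NAME="pv-$(date +%Y%m%d%H%M%S)-$$.log"

# Write the complete file outside the spooling directory first.
printf 'line one\nline two\n' > "$STAGE_DIR/$FILE_NAME"

# Move it into the spooling directory only once it is finished.
mv "$STAGE_DIR/$FILE_NAME" "$SPOOL_DIR/$FILE_NAME"

echo "queued $SPOOL_DIR/$FILE_NAME"
```

With a running agent, each line of the moved file should show up as one message in the console consumer, and with default settings Flume renames the file with a .COMPLETED suffix once it has been fully read.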
