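Some scenarios require collecting text content, such as application logs, and sending it to Kafka. Flume provides exactly this kind of solution.
The Flume agent startup scripts and configuration are as follows.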
macOS
Install:
brew install flume
Startup script (the value of --name must match the agent name used as the property prefix in the config file):
nohup /usr/local/Cellar/flume/1.9.0/bin/flume-ng agent --conf /data/flume/run/flume/conf/conf.d/ --conf-file /data/flume/run/flume/conf/conf.d/mg-res_test.conf --name mg-res_test -Dflume.root.logger=INFO,console &
Linux
Install (download the binary package from):
http://flume.apache.org/download.html
Startup script:
nohup /data/flume/run/flume/bin/flume-ng agent --conf /data/flume/run/flume/conf/conf.d/ --conf-file /data/flume/run/flume/conf/conf.d/mg-res_test.conf --name mg-res_test -Dflume.root.logger=INFO,console &
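The two startup commands differ only in the location of flume-ng. A minimal sketch of a helper that assembles the command from an agent name is shown below. The function name flume_agent_cmd and the variables FLUME_HOME and CONF_DIR are hypothetical and not part of Flume; the sketch assumes the config file is named after the agent, as in this article.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flume): prints the flume-ng command for one agent.
# FLUME_HOME defaults to the Linux install path used in this article.
FLUME_HOME="${FLUME_HOME:-/data/flume/run/flume}"
CONF_DIR="${CONF_DIR:-$FLUME_HOME/conf/conf.d}"

flume_agent_cmd() {
    agent="$1"
    # Assumption: the agent name doubles as the config file name and the --name value.
    echo "$FLUME_HOME/bin/flume-ng agent --conf $CONF_DIR --conf-file $CONF_DIR/$agent.conf --name $agent"
}

# Print the command for the agent from this article.
flume_agent_cmd mg-res_test
```

The printed command can be reviewed first and then run under nohup with any extra JVM options appended. Deriving both --conf-file and --name from one argument avoids starting an agent whose name does not match the config.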
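Configuration
/data/flume/run/flume/conf/conf.d/mg-res_test.conf
This is the main Flume configuration. Only the environment-specific entries need attention: the position file, the log file to collect, the Kafka topic, and the Kafka brokers.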
# Name the components on this agent
mg-res_test.sources = mg-res_test
mg-res_test.sinks = mg-res_test
mg-res_test.channels = mg-res_test
# Describe/configure the source
mg-res_test.sources.mg-res_test.type = TAILDIR
# File in which Flume records the read offset of each tailed file
mg-res_test.sources.mg-res_test.positionFile=/data/logs/test/res-test/mg-res_test.json
mg-res_test.sources.mg-res_test.filegroups=f1
# Log file to collect
mg-res_test.sources.mg-res_test.filegroups.f1=/data/logs/test/res-test/app.log
mg-res_test.sources.mg-res_test.fileHeader=true
# Describe the sink
mg-res_test.sinks.mg-res_test.channel = mg-res_test
mg-res_test.sinks.mg-res_test.type = org.apache.flume.sink.kafka.KafkaSink
# Kafka topic; adjust to your environment
mg-res_test.sinks.mg-res_test.kafka.topic = res-test
# Kafka connection settings; adjust to your environment
mg-res_test.sinks.mg-res_test.kafka.bootstrap.servers = localhost:9092
mg-res_test.sinks.mg-res_test.kafka.flumeBatchSize = 2
mg-res_test.sinks.mg-res_test.kafka.producer.acks = 1
mg-res_test.sinks.mg-res_test.kafka.producer.linger.ms = 1
mg-res_test.sinks.mg-res_test.kafka.producer.compression.type = snappy
# Use a channel which buffers events in memory
mg-res_test.channels.mg-res_test.type = memory
mg-res_test.channels.mg-res_test.capacity = 100000
mg-res_test.channels.mg-res_test.transactionCapacity = 10000
# Bind the source and sink to the channel
mg-res_test.sources.mg-res_test.channels = mg-res_test
mg-res_test.sinks.mg-res_test.channel = mg-res_test
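Every key in this file is prefixed with the agent name, so an agent started under a different --name finds no components to run. The following is a rough sketch of a pre-start check. The function check_agent_conf is hypothetical and not part of Flume; it only verifies that the sources, sinks, and channels lists are declared for the given agent, and the demo file is a throwaway copy of the first three config lines.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flume): succeeds only if the config file
# declares sources, sinks and channels for the given agent name.
check_agent_conf() {
    agent="$1"
    conf="$2"
    for kind in sources sinks channels; do
        # Match lines such as "mg-res_test.sources = mg-res_test"
        if ! grep -Eq "^[[:space:]]*$agent\.$kind[[:space:]]*=" "$conf"; then
            echo "missing: $agent.$kind" >&2
            return 1
        fi
    done
    echo "ok: $agent"
}

# Demo against a temporary file holding the three declaration lines.
demo_conf="$(mktemp)"
cat > "$demo_conf" <<'EOF'
mg-res_test.sources = mg-res_test
mg-res_test.sinks = mg-res_test
mg-res_test.channels = mg-res_test
EOF
check_agent_conf mg-res_test "$demo_conf"
check_agent_conf other_agent "$demo_conf" || echo "name mismatch detected"
rm -f "$demo_conf"
```

In practice the second argument would be the real config path, for example /data/flume/run/flume/conf/conf.d/mg-res_test.conf.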
/data/flume/run/flume/conf/conf.d/flume-env.sh
Setting JAVA_HOME is enough. The path below is a macOS example, so point it at the local JDK.
export JAVA_OPTS="-Xms128m -Xmx128m -Dcom.sun.management.jmxremote"
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home
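Verification
In a local environment, start a Kafka console consumer and check whether messages arrive. In a test environment, check the growth in the topic's message count or what the business system actually consumes. The Apache Kafka tarball ships the consumer as bin/kafka-console-consumer.sh.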
kafka-console-consumer --bootstrap-server localhost:9092 --topic res-test --from-beginning
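If nothing shows up in the consumer, the TAILDIR position file can help tell whether the source has read the log at all. The sketch below compares the recorded offset with the current log size. The function taildir_lag is hypothetical, and it assumes the position file holds a single JSON entry with a "pos" field, which matches the one-file filegroup configured above. The position file is written periodically, so the value may trail by a few seconds, and it reflects only what the source has read, not delivery to Kafka.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flume): prints how many bytes of the log
# have not yet been recorded as read in the TAILDIR position file.
# Assumption: the position file contains exactly one entry.
taildir_lag() {
    posfile="$1"
    logfile="$2"
    pos="$(sed -n 's/.*"pos":[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$posfile")"
    [ -n "$pos" ] || { echo "no pos entry in $posfile" >&2; return 1; }
    size="$(wc -c < "$logfile" | tr -d ' ')"
    echo $((size - pos))
}

# Demo with temporary files: a 12-byte log of which 6 bytes are recorded as read.
demo_log="$(mktemp)"
demo_pos="$(mktemp)"
printf 'hello\nworld\n' > "$demo_log"
printf '[{"inode":1,"pos":6,"file":"%s"}]\n' "$demo_log" > "$demo_pos"
taildir_lag "$demo_pos" "$demo_log"   # prints 6
rm -f "$demo_log" "$demo_pos"
```

With the paths from this article the call would be taildir_lag /data/logs/test/res-test/mg-res_test.json /data/logs/test/res-test/app.log. A result that stays above 0 while the log is idle suggests the source is not keeping up or the agent is not running.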