flume + kafka log collection


Logs produced by the system are collected by flume and pushed to kafka for consumption and processing.

Architecture

(Figure 1: flume + kafka log collection architecture diagram)

Service            IP              Port         Notes
flume collectors   10.200.132.181  6333         flume collectors
flume agent        10.200.132.168  -            flume agent (currently a single agent)
kafka              10.200.132.181  9092, 2181   kafka and zookeeper

Each machine runs one flume agent. If you need to collect logs from multiple services, you can configure multiple sources in the same agent configuration file.
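
As a rough sketch (the service names and log paths here are hypothetical), an agent that tails the logs of two services into the same channel and sink could be configured like this:

# hypothetical example: two exec sources feeding one channel
agent.sources = svcASource svcBSource
agent.channels = memoryChannel
agent.sinks = collecter1

agent.sources.svcASource.type = exec
agent.sources.svcASource.command = tail -F /opt/serviceA/logs/serviceA.log
agent.sources.svcASource.channels = memoryChannel

agent.sources.svcBSource.type = exec
agent.sources.svcBSource.command = tail -F /opt/serviceB/logs/serviceB.log
agent.sources.svcBSource.channels = memoryChannel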

For a detailed introduction to flume's internal architecture, see: https://blog.csdn.net/lsjseu/article/details/51670933

Kafka installation was covered in an earlier post: https://blog.csdn.net/liurui_wuhan/article/details/82984615

This post covers installing flume and setting up the log collection pipeline.

I. Installation and deployment

1. Download and install flume

Download page: http://flume.apache.org/download.html

The latest release at the time of writing is 1.8.0: http://www.apache.org/dyn/closer.lua/flume/1.8.0/apache-flume-1.8.0-bin.tar.gz

[root@log-system opt]# tar -zxvf apache-flume-1.8.0-bin.tar.gz

2. Configuration on 10.200.132.181

Create flume-collecters.properties, which receives the events sent by the flume agent and pushes them to kafka:

[root@log-system apache-flume-1.8.0-bin]#  cd conf

[root@log-system conf]# vim flume-collecters.properties 
#flume collecters
agent.sources = s1Flume
agent.channels = c1
agent.sinks =sinkKafka
 
# For each one of the sources, the type is defined
agent.sources.s1Flume.channels = c1
agent.sources.s1Flume.type = avro

# flume collector bind IP
agent.sources.s1Flume.bind = 10.200.132.181

# flume collector port
agent.sources.s1Flume.port = 6333
 
# The channel can be defined as follows.
agent.sources.s1Flume.channels = c1
 
# Each sink's type must be defined
agent.sinks.sinkKafka.type = org.apache.flume.sink.kafka.KafkaSink

# kafka topic name
agent.sinks.sinkKafka.topic = topic-pear

# kafka broker list (ip:port)
agent.sinks.sinkKafka.brokerList = 10.200.132.181:9092
agent.sinks.sinkKafka.requiredAcks = 1
agent.sinks.sinkKafka.batchSize = 20
agent.sinks.sinkKafka.channel = c1
#Specify the channel the sink should use
#agent.sinks.loggerSink.channel = memoryChannel
# Each channel's type is defined.
agent.channels.c1.type = memory
 
# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.c1.capacity = 100
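
The sink publishes to the kafka topic topic-pear. With kafka's default settings the topic is auto-created on first use; if auto-creation is disabled on your broker, create the topic first (kafka installation path taken from the earlier kafka post):

[root@log-system ~]# /opt/kafka_2.12-2.0.0/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic topic-pear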

Start flume:

[root@log-system apache-flume-1.8.0-bin]#  bin/flume-ng agent -c conf -f conf/flume-collecters.properties -n agent -Dflume.root.logger=INFO,console,LOGFILE &

Check whether port 6333 is now listening.
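
For example (assuming net-tools is installed; ss -lntp works the same way):

[root@log-system apache-flume-1.8.0-bin]# netstat -lntp | grep 6333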

3. flume agent configuration on 10.200.132.168

[root@cm-elk-02 conf]# vim flume-test-collect.properties
agent.sources = fileSource
agent.channels = memoryChannel
agent.sinks = collecter1
 
agent.sinkgroups = gCollecters
agent.sinkgroups.gCollecters.sinks = collecter1
 
# sink processor type: failover or load_balance
agent.sinkgroups.gCollecters.processor.type = failover
# selector: round_robin or random (only used by the load_balance processor)
agent.sinkgroups.gCollecters.processor.selector=round_robin
# temporarily black-list failed sinks
agent.sinkgroups.gCollecters.processor.backoff=true
# maximum back-off time: 30 seconds
agent.sinkgroups.gCollecters.processor.maxTimeOut=30000
 
 
agent.sources.fileSource.type = exec
# log file to monitor
agent.sources.fileSource.command = tail -F /opt/test/logs/test.log
#agent.sources.fileSource.charset=utf-8
agent.sources.fileSource.channels = memoryChannel
 
agent.sources.fileSource.restartThrottle = 10000
agent.sources.fileSource.restart = true
agent.sources.fileSource.logStdErr = true
 
# Each sink's type must be defined
agent.sinks.collecter1.channel = memoryChannel
agent.sinks.collecter1.type = avro
# flume collector ip
agent.sinks.collecter1.hostname = 10.200.132.181
# flume collector port
agent.sinks.collecter1.port = 6333
agent.sinks.collecter1.batch-size = 10
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
 
# Other config values specific to each type of channel(sink or source)
#The max number of events stored in the channel
agent.channels.memoryChannel.capacity = 100
#The max number of events stored in the channel per transaction
agent.channels.memoryChannel.transactionCapacity = 100
#Timeout in seconds for adding or removing an event
agent.channels.memoryChannel.keep-alive=30

 

Create the log directory (if it does not already exist):

[root@cm-elk-02 conf]# mkdir -p /opt/test/logs/

Start the service:

[root@cm-elk-02 apache-flume-1.8.0-bin]# bin/flume-ng agent -c conf -f conf/flume-test-collect.properties -n agent -Dflume.root.logger=INFO,console,LOGFILE

 

Deployment and configuration are now basically complete.

 

II. Verification

Log in to the 10.200.132.181 server and start a kafka console consumer:

[root@log-system ~]# /opt/kafka_2.12-2.0.0/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic topic-pear --from-beginning

Log in to the 10.200.132.168 server and append a line to the log file:

[root@cm-elk-02 logs]# echo "hello world" >>test.log

A few seconds after the line is written, the message should appear in the kafka consumer's output.

 

You can also write a Spring Boot application that produces logs, point the flume agent at its log file, and the logs will be pushed to kafka through flume in near real time.
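
As a rough sketch (hypothetical class name; a real Spring Boot app would normally log through slf4j/logback instead), a minimal Java program that appends one line per second to the file the agent tails could look like this:

import java.io.FileWriter;
import java.io.IOException;
import java.time.LocalDateTime;

// Hypothetical log producer: appends one line per second to the file
// that the flume agent tails (/opt/test/logs/test.log).
public class TestLogProducer {
    public static void main(String[] args) throws IOException, InterruptedException {
        while (true) {
            // open in append mode so tail -F picks up each new line
            try (FileWriter writer = new FileWriter("/opt/test/logs/test.log", true)) {
                writer.write(LocalDateTime.now() + " hello from TestLogProducer\n");
            }
            Thread.sleep(1000);
        }
    }
}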