Collect logs from server A and ship them to server B
Technology selection:
(1) exec source + memory channel + avro sink
(2) avro source + memory channel + logger sink
Two configuration files are needed.
Writing the configuration files
Server A: exec-memory-avro.conf
# exec-memory-avro is the agent name; exec-source, avro-sink and memory-channel name its source, sink and channel
exec-memory-avro.sources = exec-source
exec-memory-avro.sinks = avro-sink
exec-memory-avro.channels = memory-channel
# exec source: keep tailing the log file
exec-memory-avro.sources.exec-source.type = exec
exec-memory-avro.sources.exec-source.command = tail -F /home/hadoop/data/data.log
exec-memory-avro.sources.exec-source.shell = /bin/sh -c
# avro sink: forward events to the avro source on server B
exec-memory-avro.sinks.avro-sink.type = avro
exec-memory-avro.sinks.avro-sink.hostname = 192.168.199.150
exec-memory-avro.sinks.avro-sink.port = 44444
# memory channel plus the source/sink bindings
exec-memory-avro.channels.memory-channel.type = memory
exec-memory-avro.sources.exec-source.channels = memory-channel
exec-memory-avro.sinks.avro-sink.channel = memory-channel
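The memory channel above relies on Flume's default sizing, which is enough for this demo. If the exec source produces events faster than the avro sink can drain them, the channel can be sized explicitly; a minimal sketch using Flume's standard memory-channel properties (the values below are illustrative, not tuned for any workload):
exec-memory-avro.channels.memory-channel.capacity = 1000
exec-memory-avro.channels.memory-channel.transactionCapacity = 100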
Server B: avro-memory-logger.conf
# avro-memory-logger is the agent name; avro-source, logger-sink and memory-channel name its source, sink and channel
avro-memory-logger.sources = avro-source
avro-memory-logger.sinks = logger-sink
avro-memory-logger.channels = memory-channel
# avro source: listen on the host/port that server A's avro sink points to
avro-memory-logger.sources.avro-source.type = avro
avro-memory-logger.sources.avro-source.bind = 192.168.199.150
avro-memory-logger.sources.avro-source.port = 44444
# logger sink: print events to the console
avro-memory-logger.sinks.logger-sink.type = logger
# memory channel plus the source/sink bindings
avro-memory-logger.channels.memory-channel.type = memory
avro-memory-logger.sources.avro-source.channels = memory-channel
avro-memory-logger.sinks.logger-sink.channel = memory-channel
Starting the agents
1. Start order matters: first start the agent on server B
flume-ng agent --name avro-memory-logger --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/avro-memory-logger.conf \
-Dflume.root.logger=INFO,console
2. Then start the agent on server A
flume-ng agent --name exec-memory-avro --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/exec-memory-avro.conf \
-Dflume.root.logger=INFO,console
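Once both agents are up, the pipeline can be verified by appending a line to the file being tailed on server A; the logger sink on server B should then print the corresponding event to its console. A quick check, assuming the data.log path used in the config above:
echo "hello flume" >> /home/hadoop/data/data.log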
Log collection workflow:
1. Machine A monitors a file: when users visit the main site, user-behavior logs are appended to access.log.
2. The avro sink sends each newly produced log event to the hostname and port that the corresponding avro source listens on.
3. The agent hosting the avro source writes the logs to the console; later the same pipeline can deliver them to Kafka or HDFS instead.
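As an example of that last point, swapping the logger sink for an HDFS sink on server B only requires changing the sink definition in avro-memory-logger.conf. A minimal sketch, where the namenode address and target path are placeholders to adjust for your cluster:
avro-memory-logger.sinks = hdfs-sink
avro-memory-logger.sinks.hdfs-sink.type = hdfs
avro-memory-logger.sinks.hdfs-sink.hdfs.path = hdfs://namenode:8020/flume/events/%Y%m%d
avro-memory-logger.sinks.hdfs-sink.hdfs.fileType = DataStream
avro-memory-logger.sinks.hdfs-sink.hdfs.writeFormat = Text
avro-memory-logger.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
avro-memory-logger.sinks.hdfs-sink.channel = memory-channel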