[Flume] Setting up a fault-tolerant Flume environment with failover

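There are plenty of failover examples online, but they take several different approaches. Personally, I think the setup should follow the single-responsibility principle:

1. Run one Flume agent per machine.

2. Point each downstream sink of an agent at its own Flume agent; do not make a single Flume agent listen on multiple ports [this hurts performance].

3. Spread the agents across machines, so that if one machine goes down the other is still usable. If everything is configured on one machine and told apart only by port, a single crash takes the whole setup down.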


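Here is a concrete example.

First, the configuration of the Flume agent client.

The larger the priority value, the higher the priority; the sink with the highest priority is used first.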

# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1
 
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.channels=c1
a1.sources.r1.command=tail -F /root/dev/biz/logs/bizlogic.log

#define sinkgroups
a1.sinkgroups=g1
a1.sinkgroups.g1.sinks=k1 k2
a1.sinkgroups.g1.processor.type=failover
a1.sinkgroups.g1.processor.priority.k1=10
a1.sinkgroups.g1.processor.priority.k2=5
# maximum back-off period for a failed sink, in milliseconds
a1.sinkgroups.g1.processor.maxpenalty=10000

#define the sink 1
a1.sinks.k1.type=avro
a1.sinks.k1.hostname=192.168.11.179
a1.sinks.k1.port=9876

#define the sink 2
a1.sinks.k2.type=avro
a1.sinks.k2.hostname=192.168.11.178
a1.sinks.k2.port=9876


# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel=c1

As you can see, a sink group is used here. It contains two sinks, and each sink points to a different Flume agent.
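
The selection rule described above (use the highest-priority sink that is still alive) can be sketched as a toy shell function. This is only an illustration of the idea, not Flume's actual code: the function name pick_sink and its argument format are hypothetical, and the sketch ignores the back-off timing controlled by maxpenalty.

```shell
#!/usr/bin/env bash
# Toy model of failover sink selection (illustration only, not Flume code).
# Usage: pick_sink "<name:priority ...>" "<failed sink names ...>"
# Prints the highest-priority sink that is not in the failed list.
pick_sink() {
  local sinks="$1"
  local failed=" $2 "
  local best=""
  local best_prio=-1
  local entry name prio
  for entry in $sinks; do
    name="${entry%%:*}"   # text before the colon
    prio="${entry##*:}"   # text after the colon
    # Skip sinks that are currently marked as failed.
    case "$failed" in *" $name "*) continue ;; esac
    if [ "$prio" -gt "$best_prio" ]; then
      best="$name"
      best_prio="$prio"
    fi
  done
  echo "$best"
}

pick_sink "k1:10 k2:5" ""     # both sinks healthy: k1 is chosen
pick_sink "k1:10 k2:5" "k1"   # k1 marked as failed: k2 takes over
```

With the priorities from the configuration above (k1=10, k2=5), events go to k1 on 192.168.11.179 for as long as it is reachable, and k2 on 192.168.11.178 only takes over when k1 fails.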

Next, the configuration of the Flume agent server, i.e. the agents on 179 and 178. Both are the same, so looking at one is enough.


# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
 
# Describe/configure the source
a1.sources.r1.type=avro
# listen on all network interfaces
a1.sources.r1.bind=0.0.0.0
a1.sources.r1.port=9876
a1.sources.r1.channels=c1

# Describe the sink
a1.sinks.k1.type = file_roll
a1.sinks.k1.sink.directory=/root/dev/flumeout/file
a1.sinks.k1.sink.rollInterval=3600


# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

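As you can see, the Flume agent client and server exchange data over Avro. Avro support is built into Flume (as the avro sink and avro source), which makes it very convenient to chain Flume agents together.

Start the Flume agent servers first, then start the Flume agent client.

Test as follows: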
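
The start order can be sketched with the standard flume-ng launcher. The file names server.conf and client.conf are hypothetical (the post does not name its config files), and the sketch assumes flume-ng is on the PATH and is run from a directory that contains Flume's conf directory. The helper only prints the command so it can be reviewed first.

```shell
#!/usr/bin/env bash
# flume_cmd CONF_FILE prints the launch command for an agent named a1,
# matching the agent name used in both configurations above.
# server.conf and client.conf are assumed file names, not from the post.
flume_cmd() {
  echo "flume-ng agent --conf conf --conf-file $1 --name a1 -Dflume.root.logger=INFO,console"
}

flume_cmd server.conf   # run the printed command on 179 and 178 first
flume_cmd client.conf   # then run this one on the client machine
```

Starting the servers first means the Avro sources on port 9876 are already listening when the client's Avro sinks try to connect.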

for i in {1..100};
do echo "exec test tail -f $i  on terminator 176" >> bizlogic.log;
echo $i;
sleep 0.1;
done
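
Writing to the file triggers the tail -F command of the Flume agent client, so the new lines pass through the client into its memory channel, and the failover mechanism then sends them out through the higher-priority sink. Where the data finally lands is determined by the sink type (a1.sinks.k1.type) in the last Flume agent of the chain. Here it is file_roll, which writes events to files on disk and rolls to a new file at a fixed interval (rollInterval=3600, in seconds).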


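Right after startup, both 178 and 179 create an output file, but once you start producing content, only 179 (the higher-priority sink k1) actually has content written to its file.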
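
One way to confirm this on each server is to count the output files that actually contain data. The following is a minimal sketch; the helper name count_nonempty and the OUT_DIR variable are hypothetical, while the default directory is the sink.directory from the server configuration above.

```shell
#!/usr/bin/env bash
# count_nonempty DIR prints how many regular files directly under DIR
# contain at least one byte.
count_nonempty() {
  find "$1" -maxdepth 1 -type f -size +0c | wc -l | tr -d ' '
}

# Defaults to the file_roll output directory from the server config;
# set OUT_DIR to point somewhere else if needed.
OUT_DIR="${OUT_DIR:-/root/dev/flumeout/file}"
if [ -d "$OUT_DIR" ]; then
  echo "non-empty files in $OUT_DIR: $(count_nonempty "$OUT_DIR")"
fi
```

Run on both servers after the write loop, this should report a non-zero count on 179 and zero on 178. To exercise the failover path itself, one could stop the agent on 179, rerun the write loop, and check 178 again; under the configuration above, k2 would be expected to take over.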


At this point the whole Flume failover setup works end to end. Hope it helps!


