Flume Cluster Failover Configuration: Pitfalls Along the Way

1. Requirement

Get hands-on with the SinkProcessor. The scenario: multiple sinks pulling data from a single channel.

Agent1: NetcatSource ---- MemoryChannel ---- 2 AvroSinks (hadoop102)

Agent2: AvroSource ---- MemoryChannel ---- LoggerSink (hadoop103)

Agent3: AvroSource ---- MemoryChannel ---- LoggerSink (hadoop104)

2. Components

2.1 Default Sink Processor

When an agent has only one sink, the Default Sink Processor is used; you are not required to configure a sink processor or a sink group.

2.2 Failover Sink Processor

The Failover Sink Processor maintains a group of sinks with priorities. By default it picks the sink with the highest priority (the larger the number, the higher the priority) to process events. A failed sink is put into a penalty pool to cool down for a while; once it recovers, it rejoins the live pool, and the highest-priority sink in the live pool takes over the work.

Configuration:

sinks                           space-separated list of the sinks in the group
processor.type                  default / failover
processor.priority.<sinkName>   priority of that sink

Example:
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
a1.sinkgroups.g1.processor.maxpenalty = 10000
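The selection rule described above (always use the highest-priority live sink, and keep a failed sink in a penalty box for a cooldown period) can be sketched in a few lines. This is an illustrative simulation, not Flume's actual implementation; the class and method names are made up for this sketch:

```python
import time

class FailoverSelector:
    """Illustrative sketch (NOT Flume's real code) of failover sink selection:
    pick the highest-priority live sink; failed sinks cool down for a while."""

    def __init__(self, priorities, max_penalty=10.0):
        # priorities: dict of sink name -> priority (larger number = preferred)
        self.priorities = priorities
        self.max_penalty = max_penalty   # analogous to processor.maxpenalty
        self.penalized_until = {}        # sink -> time when it may rejoin the live pool

    def pick(self, now=None):
        now = time.monotonic() if now is None else now
        # A sink is "live" if it is not currently serving a penalty.
        live = [s for s in self.priorities
                if self.penalized_until.get(s, 0) <= now]
        if not live:
            raise RuntimeError("no live sinks")
        return max(live, key=lambda s: self.priorities[s])

    def mark_failed(self, sink, now=None):
        now = time.monotonic() if now is None else now
        self.penalized_until[sink] = now + self.max_penalty

sel = FailoverSelector({"k1": 5, "k2": 10}, max_penalty=10.0)
print(sel.pick(now=0))    # k2: highest priority wins
sel.mark_failed("k2", now=0)
print(sel.pick(now=1))    # k1: k2 is cooling down in the penalty pool
print(sel.pick(now=11))   # k2: penalty expired, k2 rejoins and takes over
```

This mirrors the behavior observed in the test below: k2 serves while alive, k1 takes over when k2 dies, and k2 reclaims the work once it comes back.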

3. Failover Case

3.1 Agent1

NetcatSource ---- MemoryChannel ---- 2 AvroSinks (hadoop102)

On hadoop103, create example2-agent1.conf under Flume's flumeagents directory:

a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1

# Configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop102
a1.sources.r1.port = 44444

# Configure the sink group
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
a1.sinkgroups.g1.processor.maxpenalty = 10000

# Configure the sinks
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = hadoop103
a1.sinks.k1.port = 12345

a1.sinks.k2.type = avro
a1.sinks.k2.hostname = hadoop104
a1.sinks.k2.port = 1234

# Configure the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# Bind and connect the components
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1

3.2 Agent2

AvroSource ---- MemoryChannel ---- LoggerSink (hadoop103)

On hadoop103, create example2-agent2.conf under flumeagents:

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = hadoop103
a1.sources.r1.port = 12345

# Configure the sink
a1.sinks.k1.type = logger

# Configure the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# Bind and connect the components
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

3.3 Agent3

AvroSource ---- MemoryChannel ---- LoggerSink (hadoop104)

On hadoop103, create example2-agent3.conf under flumeagents:

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = hadoop104
a1.sources.r1.port = 1234

# Configure the sink
a1.sinks.k1.type = logger

# Configure the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# Bind and connect the components
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

1. After all three files are created on hadoop103, remember to distribute them to hadoop102 and hadoop104 (e.g. with xsync).
2. Start the agents downstream-first: hadoop104, then hadoop103, then hadoop102. (Note: write the command as a single line in a text editor first, then copy it over and execute it — see the pitfalls below.)

bin/flume-ng agent -c conf/ -n a1 -f flumeagents/example2-agent2.conf -Dflume.root.logger=INFO,console

(Adjust the -f path per host: example2-agent3.conf on hadoop104, example2-agent1.conf on hadoop102.)

3. On hadoop102, connect to the source with: nc hadoop102 44444
4. hadoop102 hosts the source. Open several xshell windows side by side, send data from hadoop102, and watch which of hadoop103 and hadoop104 does the work (receives the data). Kill hadoop104's agent with Ctrl+C: the work fails over to hadoop103. Restart hadoop104: having the higher priority, it takes over again.
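The nc test above simply pushes newline-terminated text over a TCP connection, which is all the netcat source listens for. A self-contained sketch of that interaction (using a throwaway local server in place of the real netcat source, so nothing here is Flume's actual code) looks like:

```python
import socket
import threading

# Minimal stand-in for `nc hadoop102 44444`: the netcat source is essentially a
# TCP listener that reads newline-terminated lines. A throwaway local server
# plays that role here so the snippet is self-contained.
received = []

def fake_netcat_source(server_sock):
    conn, _ = server_sock.accept()
    with conn:
        line = conn.makefile("r").readline()   # read one newline-terminated line
        received.append(line.strip())

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]
t = threading.Thread(target=fake_netcat_source, args=(server,))
t.start()

# What nc does when you type a line and press Enter: send it over the socket.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello flume\n")

t.join()
server.close()
print(received[0])   # hello flume
```

In the real test, each line you type into nc becomes one Flume event, which the failover processor routes to the highest-priority live Avro sink.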

Results: (screenshot of the logger sink console output omitted)

Bugs encountered:
1. When pasting the configuration into a vim editing session, the beginning of the pasted text was lost. (Running :set paste in vim before pasting prevents it from mangling pasted input.)
2. Copying two command lines and pasting them at once did not execute both — only the first line ran. So merge the command into a single line before copying it over.
