5. Flume Enterprise Development Case: Load Balancing and Failover

Contents

    • Flume Load Balancing and Failover
      • 1. Failover
        • 1.1 Requirements Analysis
        • 1.2 Configuration Files
          • flume1.conf
          • flume2.conf
          • flume3.conf
        • 1.3 Testing
      • 2. Load Balancing

Flume Load Balancing and Failover

Flume 1.7.0 User Guide

1. Failover

1.1 Requirements Analysis

Flume1 monitors a port, and the sinks in its sink group connect to Flume2 and Flume3
respectively. A Failover Sink Processor provides the failover behavior.

# Create the group3 folder under /opt/module/flume/job
mkdir group3
  • hadoop–flume1.conf
    • source -- netcat
    • channel -- memory
    • sinks -- avro
    • a1.sinkgroups.g1.processor.type = failover
  • hadoop–flume2.conf
    • source -- avro
    • channel -- memory
    • sink -- logger
  • hadoop–flume3.conf
    • source -- avro
    • channel -- memory
    • sink -- logger

1.2 Configuration Files

flume1.conf
# Name the components on this agent
a1.sources = r1
a1.channels = c1

a1.sinkgroups = g1
a1.sinks = k1 k2

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop
a1.sources.r1.port = 44444

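# Configure failover
a1.sinkgroups.g1.processor.type = failover
# Priority value. A larger absolute value means a higher priority; the sink with the higher priority is activated first.
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10

# Maximum backoff period for a failed sink (in milliseconds)
a1.sinkgroups.g1.processor.maxpenalty = 10000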

# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = hadoop
a1.sinks.k1.port = 4141

a1.sinks.k2.type = avro
a1.sinks.k2.hostname = hadoop
a1.sinks.k2.port = 4142

# Describe the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinkgroups.g1.sinks = k1 k2

a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1
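
The effect of the priority and maxpenalty settings above can be illustrated with a small model. This is a simplified sketch, not Flume's actual implementation: the class `FailoverModel`, its method names, and the 1000 ms starting penalty that doubles on each consecutive failure are assumptions made for illustration. Only the priorities (k1 = 5, k2 = 10) and the 10000 ms cap come from flume1.conf.

```python
from dataclasses import dataclass, field


@dataclass
class FailoverModel:
    """Toy model of a failover sink group (hypothetical, not Flume code)."""
    priorities: dict                 # sink name -> priority, e.g. {"k1": 5, "k2": 10}
    max_penalty_ms: int = 10000      # mirrors processor.maxpenalty
    base_penalty_ms: int = 1000      # assumed starting penalty for this sketch
    failures: dict = field(default_factory=dict)       # sink -> consecutive failures
    blocked_until: dict = field(default_factory=dict)  # sink -> time (ms) of next retry

    def active_sink(self, now_ms):
        """Return the highest-priority sink that is not currently penalized."""
        live = [s for s in self.priorities if self.blocked_until.get(s, 0) <= now_ms]
        if not live:
            return None
        return max(live, key=lambda s: self.priorities[s])

    def report_failure(self, sink, now_ms):
        """Penalize a sink; the penalty doubles per failure, capped at max_penalty_ms."""
        n = self.failures.get(sink, 0) + 1
        self.failures[sink] = n
        penalty = min(self.max_penalty_ms, self.base_penalty_ms * 2 ** (n - 1))
        self.blocked_until[sink] = now_ms + penalty

    def report_success(self, sink):
        """Clear the failure history once a sink delivers again."""
        self.failures.pop(sink, None)
        self.blocked_until.pop(sink, None)


model = FailoverModel({"k1": 5, "k2": 10})
print(model.active_sink(0))      # k2, the sink that points at Flume3 (port 4142)
model.report_failure("k2", 0)    # simulate Flume3 going down
print(model.active_sink(500))    # k1, the sink that points at Flume2 (port 4141)
```

In this model k2 is always chosen while it is healthy, and k1 only receives events while k2 is serving a penalty. The cap keeps a sink that has failed many times from being sidelined for longer than 10 seconds at a time.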

flume2.conf
#Name
a2.sources = r1
a2.sinks = k1
a2.channels = c1

#Sources
a2.sources.r1.type = avro
a2.sources.r1.bind = hadoop
a2.sources.r1.port = 4141

#Sink
a2.sinks.k1.type = logger

#Channel
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

#Bind
a2.sinks.k1.channel = c1
a2.sources.r1.channels = c1

flume3.conf
#Name
a3.sources = r1
a3.sinks = k1
a3.channels = c1

#Sources
a3.sources.r1.type = avro
a3.sources.r1.bind = hadoop
a3.sources.r1.port = 4142

#Sink
a3.sinks.k1.type = logger

#Channel
a3.channels.c1.type = memory
a3.channels.c1.capacity = 1000
a3.channels.c1.transactionCapacity = 100

#Bind
a3.sinks.k1.channel = c1
a3.sources.r1.channels = c1
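
The three files only work together if each avro sink in flume1.conf points at the bind address and port of an avro source downstream. A minimal sketch of that consistency check follows, assuming the configs are available as text. The helper names `parse_props`, `avro_sink_targets` and `avro_source_endpoints` are hypothetical and are not part of Flume.

```python
def parse_props(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def avro_sink_targets(props, agent):
    """Map each avro sink of `agent` to the (hostname, port) it sends to."""
    targets = {}
    for sink in props.get(f"{agent}.sinks", "").split():
        prefix = f"{agent}.sinks.{sink}"
        if props.get(f"{prefix}.type") == "avro":
            targets[sink] = (props[f"{prefix}.hostname"], int(props[f"{prefix}.port"]))
    return targets


def avro_source_endpoints(props, agent):
    """Collect the (bind, port) pairs that the avro sources of `agent` listen on."""
    endpoints = set()
    for source in props.get(f"{agent}.sources", "").split():
        prefix = f"{agent}.sources.{source}"
        if props.get(f"{prefix}.type") == "avro":
            endpoints.add((props[f"{prefix}.bind"], int(props[f"{prefix}.port"])))
    return endpoints


# Excerpts of flume1.conf and flume2.conf from this section.
flume1 = parse_props("""
a1.sinks = k1 k2
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = hadoop
a1.sinks.k1.port = 4141
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = hadoop
a1.sinks.k2.port = 4142
""")
flume2 = parse_props("""
a2.sources = r1
a2.sources.r1.type = avro
a2.sources.r1.bind = hadoop
a2.sources.r1.port = 4141
""")

listening = avro_source_endpoints(flume2, "a2")
for sink, target in avro_sink_targets(flume1, "a1").items():
    status = "matches an a2 source" if target in listening else "no a2 source"
    print(sink, target, status)
```

Here k1 matches the source in flume2.conf, while k2 is reported as having no a2 source because its counterpart on port 4142 is defined in flume3.conf.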

1.3 Testing

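# Run each agent in its own terminal from /opt/module/flume, starting the downstream agents (a3, a2) before a1
bin/flume-ng agent -c conf/ -n a3 -f job/group3/flume3.conf -Dflume.root.logger=INFO,console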

bin/flume-ng agent -c conf/ -n a2 -f job/group3/flume2.conf -Dflume.root.logger=INFO,console

bin/flume-ng agent -c conf/ -n a1 -f job/group3/flume1.conf -Dflume.root.logger=INFO,console

# On host hadoop, type the following
nc hadoop 44444
hello
OK
world
OK
lala
OK
# All of the data goes to Flume3
# Kill Flume3 with Ctrl+C
shazi
OK
haha
OK
hahha
OK
hahah
OK
# The data is now printed on the Flume2 console
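
The interactive nc session above can also be scripted. The following is a rough sketch, assuming the netcat source replies with one line (the OK shown above) per event it accepts. The function name `send_events` and its parameters are hypothetical.

```python
import socket


def send_events(host, port, lines, timeout=5.0):
    """Send each line to a netcat source and collect the one-line reply to each."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as conn, \
            conn.makefile("r", encoding="utf-8") as reader:
        for line in lines:
            # The netcat source treats every newline-terminated line as one event.
            conn.sendall((line + "\n").encode("utf-8"))
            replies.append(reader.readline().strip())
    return replies
```

With the three agents running, `send_events("hadoop", 44444, ["hello", "world", "lala"])` plays the same role as typing those lines into nc; host and port are the a1.sources.r1.bind and a1.sources.r1.port values from flume1.conf.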

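2. Load Balancing

Flume1 monitors a port, and the sinks in its sink group connect to Flume2 and Flume3
respectively. A Load balancing Sink Processor distributes the events across them.

Modify flume1.conf from the failover setup above as follows: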

# Name the components on this agent
a1.sources = r1
a1.channels = c1
a1.sinkgroups = g1
a1.sinks = k1 k2

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop
a1.sources.r1.port = 44444

# Configure load balancing
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = random

# Upper limit on the backoff period for a failed sink (in milliseconds)
a1.sinkgroups.g1.processor.selector.maxTimeOut = 10000

# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = hadoop
a1.sinks.k1.port = 4141

a1.sinks.k2.type = avro
a1.sinks.k2.hostname = hadoop
a1.sinks.k2.port = 4142

# Describe the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinkgroups.g1.sinks = k1 k2

a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1
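
How the selector and backoff settings interact can be sketched with a toy model. This is an illustration under stated assumptions, not Flume's implementation: the class `LoadBalanceModel` and its methods are hypothetical, and the penalty passed to `back_off` is chosen by the caller rather than computed the way Flume computes it. The selector names `random` and `round_robin` are the ones the load_balance processor accepts.

```python
import random


class LoadBalanceModel:
    """Toy model of a load-balancing sink group (hypothetical, not Flume code)."""

    def __init__(self, sinks, selector="random", seed=None):
        self.sinks = list(sinks)
        self.selector = selector
        self.rng = random.Random(seed)
        self.next_index = 0
        self.blocked_until = {}  # sink -> time (ms) until which it is skipped

    def pick(self, now_ms=0):
        """Choose a sink among those that are not currently backed off."""
        live = [s for s in self.sinks if self.blocked_until.get(s, 0) <= now_ms]
        if not live:
            return None
        if self.selector == "round_robin":
            sink = live[self.next_index % len(live)]
            self.next_index += 1
            return sink
        return self.rng.choice(live)

    def back_off(self, sink, now_ms, penalty_ms):
        """With backoff enabled, a failed sink is skipped for penalty_ms."""
        self.blocked_until[sink] = now_ms + penalty_ms


rr = LoadBalanceModel(["k1", "k2"], selector="round_robin")
print([rr.pick() for _ in range(4)])             # ['k1', 'k2', 'k1', 'k2']
rr.back_off("k2", now_ms=0, penalty_ms=10000)    # simulate Flume3 failing
print([rr.pick(now_ms=100) for _ in range(2)])   # ['k1', 'k1']
```

Unlike the failover model, neither sink is preferred here: both receive traffic while healthy, and a failed sink is only excluded temporarily.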

Testing is the same as above…
