1. Upload the Flume package to the server.
2. Extract it: tar -zxvf apache-flume-1.6.0-bin.tar.gz -C /home/hadoop/soft
3. Create a symlink: ln -s apache-flume-1.6.0-bin flume
4. Configure the FLUME_HOME environment variable in ~/.bashrc:
$ vim ~/.bashrc
export JAVA_HOME=/home/hadoop/soft/jdk
export HADOOP_HOME=/home/hadoop/soft/hadoop
export ZOOKEEPER_HOME=/home/hadoop/soft/zoo
export HBASE_HOME=/home/hadoop/soft/hbase
export HIVE_HOME=/home/hadoop/soft/hive
export SPARK_HOME=/home/hadoop/soft/spark
export FLUME_HOME=/home/hadoop/soft/flume
export KAFKA_HOME=/home/hadoop/soft/kafka
PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$PATH:$HBASE_HOME/bin:$HIVE_HOME/bin:$SPARK_HOME/bin:$FLUME_HOME/bin:$KAFKA_HOME/bin:$HOME/bin
export PATH
Reload the environment variables: $: source ~/.bashrc
Check that it took effect: $: echo $FLUME_HOME
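As an extra check (optional; the exact version output depends on your installation), confirm that the flume-ng command is now on the PATH:
[hadoop@master soft]$ flume-ng version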
After that, test with a configuration file:
[hadoop@master soft]$ cd flume/conf/
[hadoop@master conf]$ cp flume-conf.properties.template flume-conf.properties
[hadoop@master conf]$ vim flume-conf.properties
Case 1.
Single-node Flume deployment. This configuration lets the user generate events and have them logged to the console. It defines a single agent named a1: a1 has a source that listens for data on port 44444, a channel that buffers event data in memory, and a sink that logs event data to the console. The configuration file names the components and then describes their types and configuration parameters.
1-1) Configuration file
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Save and exit with :wq
Start Flume:
1. First open a terminal session on the host, e.g. 192.168.11.10 (client 1), then open a second session on the same host (client 2).
2. In client (1), start the Flume agent (the -n flag selects which agent defined in flume-conf.properties to run, here a1):
flume-ng agent -n a1 -c /home/hadoop/soft/flume/conf -f flume-conf.properties -Dflume.root.logger=INFO,console
3. In client (2), connect to the netcat source: $: nc localhost 44444
This opens the data-collection connection; when data is produced in client (2), Flume collects it and prints it on the console of client (1).
# Client (2)
# The data collected is printed on the console of client (1)
# Watch the spacing in the console-output command; incorrect spacing will cause an error
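For example (an illustrative run; timestamps and thread names will differ on your machine), type a line in client (2); the netcat source normally answers each line with OK:
$: nc localhost 44444
hello
OK
Client (1) should then print a log line in the same format as the ones shown below, with the event body rendered as hex plus text:
Event: { headers:{} body: 68 65 6C 6C 6F hello }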
Case 2.
Exec source: monitor three specified files, so that whenever an application writes data to one of these files, the source component can collect it.
2-1) Configuration file
#Name the components on this agent
a2.sources = r1 r2 r3
a2.sinks = k1
a2.channels = c1
#Describe/configure the source
a2.sources.r1.type = exec
a2.sources.r1.command = tail -F /home/hadoop/Desktop/a1.log
a2.sources.r2.type = exec
a2.sources.r2.command = tail -F /home/hadoop/Desktop/a2.log
a2.sources.r3.type = exec
a2.sources.r3.command = tail -F /home/hadoop/Desktop/a3.log
#Describe the sink
a2.sinks.k1.type = logger
#Use a channel which buffers events in memory
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 1000
# Bind the source and sink to the channel
a2.sources.r1.channels = c1
a2.sources.r2.channels = c1
a2.sources.r3.channels = c1
a2.sinks.k1.channel = c1
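Before starting the agent, it can help to make sure the three monitored files exist (tail -F keeps retrying on a missing file, but pre-creating them keeps the startup log clean). A minimal sketch, assuming the same paths as in the config above:
[hadoop@master Desktop]$ touch /home/hadoop/Desktop/a1.log /home/hadoop/Desktop/a2.log /home/hadoop/Desktop/a3.log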
2-2) Start the Flume agent service
[hadoop@master conf]$ flume-ng agent -n a2 -c /home/hadoop/soft/flume/conf -f flume-conf.properties -Dflume.root.logger=INFO,console
2-3) View the logs printed on the console
2018-10-12 10:33:30,232 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 69 20 6C 6F 76 65 20 79 6F 75 i love you }
2018-10-12 10:33:30,236 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 69 20 6C 6F 76 65 20 79 6F 75 i love you }
2018-10-12 10:33:30,238 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 69 20 6C 6F 76 65 20 79 6F 75 i love you }
2018-10-12 10:33:30,244 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 69 20 6C 6F 76 65 20 79 6F 75 20 74 6F i love you to }
2018-10-12 10:33:30,244 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 69 20 6C 6F 76 65 20 79 6F 75 20 74 6F i love you to }
2018-10-12 10:33:30,246 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 69 20 6C 6F 76 65 20 79 6F 75 20 74 6F i love you to }
2018-10-12 10:33:30,247 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 69 20 6C 6F 76 65 20 79 6F 75 20 74 6F i love you to }
2018-10-12 10:33:30,247 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 69 20 6C 6F 76 65 20 79 6F 75 20 74 6F i love you to }
2018-10-12 10:33:30,248 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 69 20 6C 6F 76 65 20 79 6F 75 20 74 6F i love you to }
2018-10-12 10:33:30,248 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 69 20 6C 6F 76 65 20 79 6F 75 20 74 6F i love you to }
2018-10-12 10:33:30,249 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 69 20 6C 6F 76 65 20 79 6F 75 20 74 6F i love you to }
2018-10-12 10:33:30,252 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF B7 AB 3F 31 30 3A 35 36 3A 34 35 ......?10:56:45 }
2018-10-12 10:33:30,253 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD DC B3 EF BF BD EF BF BD EF BF BD .............. }
2018-10-12 10:33:30,254 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF BF BD EF BF BD D3 A2 3F 31 30 3A 35 ...........?10:5 }
2-4) Append data to the files a1.log, a2.log, and a3.log
[hadoop@master Desktop]$ echo wo shi a1 >> a1.log
[hadoop@master Desktop]$ echo wo shi a1 >> a1.log
[hadoop@master Desktop]$ echo wo shi a1 >> a1.log
[hadoop@master Desktop]$ echo wo shi a2 >> a2.log
[hadoop@master Desktop]$ echo wo shi a2 >> a2.log
[hadoop@master Desktop]$ echo wo shi a2 >> a2.log
[hadoop@master Desktop]$ echo wo shi a2 >> a2.log
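The same pattern works for a3.log (not shown in the capture above), for example:
[hadoop@master Desktop]$ echo wo shi a3 >> a3.log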
2-5) Check the console again
2018-10-12 15:07:17,842 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF B7 AB 3F 31 30 3A 35 37 3A 34 38 ......?10:57:48 }
2018-10-12 15:07:17,847 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF BF BD C3 B4 EF BF BD EF BF BD EF BF ................ }
2018-10-12 15:07:17,849 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF B7 AB 3F 31 30 3A 35 39 3A 31 36 ......?10:59:16 }
2018-10-12 15:07:17,850 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF BF BD EF BF BD D5 B5 EF BF BD EF BF ................ }
2018-10-12 15:07:17,850 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 77 6F 20 73 68 69 20 61 31 wo shi a1 }
2018-10-12 15:07:17,856 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 77 6F 20 73 68 69 20 61 31 wo shi a1 }
2018-10-12 15:07:17,856 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 77 6F 20 73 68 69 20 61 31 wo shi a1 }
2018-10-12 15:07:17,857 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF BF BD EF BF BD EF BF BD ............ }
2018-10-12 15:07:17,858 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF B7 AB 3F 31 30 3A 35 37 3A 30 35 ......?10:57:05 }
2018-10-12 15:07:17,858 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF BF BD EF BF BD EF BF BD EF BF BD EF ................ }
2018-10-12 15:07:17,858 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF B7 AB 3F 31 30 3A 35 37 3A 34 38 ......?10:57:48 }
2018-10-12 15:07:17,861 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF BF BD C3 B4 EF BF BD EF BF BD EF BF ................ }
2018-10-12 15:07:17,861 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF B7 AB 3F 31 30 3A 35 39 3A 31 36 ......?10:59:16 }
2018-10-12 15:07:17,862 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: EF BF BD EF BF BD EF BF BD D5 B5 EF BF BD EF BF ................ }
2018-10-12 15:07:17,864 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 77 6F 20 73 68 69 20 61 32 wo shi a2 }
2018-10-12 15:07:17,865 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 77 6F 20 73 68 69 20 61 32 wo shi a2 }
2018-10-12 15:07:17,865 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 77 6F 20 73 68 69 20 61 32 wo shi a2 }