This article covers the following parts: downloading Flume, configuring it, starting an agent, and testing it.
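Download the files from the official Flume website. I downloaded the latest stable release at the time of writing, 1.7.0. (PS: download the prebuilt binary package, not the source package. Our goal is to use Flume, not to study its source code.)
The download page is shown below: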
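Why does Flume need a configuration file?
Because every machine is different, its configuration differs too. Flume can also write the logs it collects to the console, HDFS, Kafka, local files, MongoDB, and so on, and each of these destinations has to be configured. Flume parses the configuration at run time, so a short config file is all you need to get going. This greatly lowers the effort of using, deploying, and maintaining the software.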
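First, unpack the Flume package downloaded earlier. (On Windows/Mac you can double-click the archive to extract it; on Linux you can use the tar -zxvf command.)
After extraction, my directory looks like this: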
localhost:Flume Sean$ ls
apache-flume-1.7.0-bin apache-flume-1.7.0-bin.tar.gz
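Next, go into the Flume folder and edit the configuration file. As usual, keep the original file as a backup before editing: we copy the template and edit the copy.
The commands are: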
cd apache-flume-1.7.0-bin/conf
cp flume-conf.properties.template flume-conf.properties
The conf directory now contains flume-conf.properties. This is the file we configure next, and Flume reads it at startup.
Next, write the official example configuration into it:
# example.conf: A single-node Flume configuration
# Name the components on this agent
# The name of the agent is defined as a1.
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
# Maximum number of events held in the channel
a1.channels.c1.capacity = 1000
# Maximum number of events per transaction
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
# Events flow source -> channel -> sink; every channel referenced below must be declared in a1.channels.
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
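If you are new to Linux, you can edit the file with vim and save it with Esc -> :x or Esc -> :wq. (If this is unfamiliar, look up the basic vim commands. All we need is a file named "flume-conf.properties" that contains the configuration above.)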
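The wiring rule in the configuration comments (every source and sink must point at a channel that the agent declares) can be checked before starting Flume. The following is a minimal sketch, not part of Flume: parse_properties and check_wiring are hypothetical helper names, and the parser only understands simple "key = value" lines and # comments, not the full Java properties format. Flume does its own validation at startup; this is just a quick sanity check.

```python
# Hypothetical helpers for illustration; they are not part of Flume.

def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of wiring problems for the given agent name."""
    problems = []
    declared = set(props.get(f"{agent}.channels", "").split())
    # A source may feed several channels (property name: channels).
    for source in props.get(f"{agent}.sources", "").split():
        used = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not used:
            problems.append(f"source {source}: no channels configured")
        for channel in used:
            if channel not in declared:
                problems.append(f"source {source}: channel {channel} is not declared")
    # A sink drains exactly one channel (property name: channel).
    for sink in props.get(f"{agent}.sinks", "").split():
        channel = props.get(f"{agent}.sinks.{sink}.channel")
        if not channel:
            problems.append(f"sink {sink}: no channel configured")
        elif channel not in declared:
            problems.append(f"sink {sink}: channel {channel} is not declared")
    return problems


EXAMPLE = """
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
"""

print(check_wiring(parse_properties(EXAMPLE), "a1"))  # prints []
```

An empty list means the source r1 and the sink k1 both refer to the declared channel c1.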
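Once everything in step 2 is configured, you can start Flume.
First go back to the Flume root directory, which in my case is "/Users/xxx/Software/Flume/apache-flume-1.7.0-bin/". The actual path depends on where you put the Flume files.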
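Then enter the start command. The --conf-file argument must point to the file that holds the configuration above. The run shown here uses conf/example.conf, so pass conf/flume-conf.properties instead if that is where you saved it:
./bin/flume-ng agent --conf conf --conf-file conf/example.conf --name a1 -Dflume.root.logger=INFO,console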
If you see output like the following, the agent has started successfully.
localhost:apache-flume-1.7.0-bin Sean$ ./bin/flume-ng agent --conf conf --conf-file conf/example.conf --name a1 -Dflume.root.logger=INFO,console
Info: Including Hadoop libraries found via (/Users/Sean/Software/hadoop/hadoop-2.2.0/bin/hadoop) for HDFS access
+ exec /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp '/Users/Sean/Software/Flume/apache-flume-1.7.0-bin/conf:/Users/Sean/Software/Flume/apache-flume-1.7.0-bin/lib/*:/Users/Sean/Software/hadoop/hadoop-2.2.0/etc/hadoop:/Users/Sean/Software/hadoop/hadoop-2.2.0/share/hadoop/common/lib/*:/Users/Sean/Software/hadoop/hadoop-2.2.0/share/hadoop/common/*:/Users/Sean/Software/hadoop/hadoop-2.2.0/share/hadoop/hdfs:/Users/Sean/Software/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/*:/Users/Sean/Software/hadoop/hadoop-2.2.0/share/hadoop/hdfs/*:/Users/Sean/Software/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/*:/Users/Sean/Software/hadoop/hadoop-2.2.0/share/hadoop/yarn/*:/Users/Sean/Software/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/*:/Users/Sean/Software/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/*:/Users/Sean/Software/hadoop/hadoop-2.2.0/contrib/capacity-scheduler/*.jar' -Djava.library.path=:/Users/Sean/Software/hadoop/hadoop-2.2.0/lib/native org.apache.flume.node.Application --conf-file conf/example.conf --name a1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/Sean/Software/Flume/apache-flume-1.7.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/Sean/Software/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2017-01-09 00:46:14,796 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:62)] Configuration provider starting
2017-01-09 00:46:14,802 (conf-file-poller-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:134)] Reloading configuration file:conf/example.conf
2017-01-09 00:46:14,810 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:930)] Added sinks: k1 Agent: a1
2017-01-09 00:46:14,811 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:k1
2017-01-09 00:46:14,811 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:k1
2017-01-09 00:46:14,824 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:140)] Post-validation flume configuration contains configuration for agents: [a1]
2017-01-09 00:46:14,824 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:147)] Creating channels
2017-01-09 00:46:14,829 (conf-file-poller-0) [INFO - org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:42)] Creating instance of channel c1 type memory
2017-01-09 00:46:14,835 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:201)] Created channel c1
2017-01-09 00:46:14,835 (conf-file-poller-0) [INFO - org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:41)] Creating instance of source r1, type netcat
2017-01-09 00:46:14,845 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:42)] Creating instance of sink: k1, type: logger
2017-01-09 00:46:14,850 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:116)] Channel c1 connected to [r1, k1]
2017-01-09 00:46:14,856 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:137)] Starting new configuration:{ sourceRunners:{r1=EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:r1,state:IDLE} }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@720b314f counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }
2017-01-09 00:46:14,865 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:144)] Starting Channel c1
2017-01-09 00:46:14,930 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:119)] Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
2017-01-09 00:46:14,930 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:95)] Component type: CHANNEL, name: c1 started
2017-01-09 00:46:14,932 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:171)] Starting Sink k1
2017-01-09 00:46:14,933 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:182)] Starting Source r1
2017-01-09 00:46:14,934 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.source.NetcatSource.start(NetcatSource.java:155)] Source starting
2017-01-09 00:46:14,961 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.source.NetcatSource.start(NetcatSource.java:169)] Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:44444]
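Let's break down the start command:
./bin/flume-ng agent --conf conf --conf-file conf/example.conf --name a1 -Dflume.root.logger=INFO,console
agent: run a Flume agent.
--conf conf: the configuration directory.
--conf-file conf/example.conf: the agent configuration file to load.
--name a1: the agent name, which must match the name used as the prefix in the configuration file (a1 here).
-Dflume.root.logger=INFO,console: log at INFO level and send the log to the console.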
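If you want to launch the agent from a script rather than typing the command, the same arguments can be assembled programmatically. This is a small sketch under assumptions: flume_agent_command is a hypothetical helper, and the paths are the ones used in this article.

```python
import shlex


def flume_agent_command(conf_dir, conf_file, agent_name, logger="INFO,console"):
    """Build the argument list for the flume-ng start command used in this article."""
    return [
        "./bin/flume-ng", "agent",
        "--conf", conf_dir,                # configuration directory
        "--conf-file", conf_file,          # agent configuration file
        "--name", agent_name,              # must match the agent name in the file
        f"-Dflume.root.logger={logger}",   # log level and destination
    ]


cmd = flume_agent_command("conf", "conf/example.conf", "a1")
print(shlex.join(cmd))

# To actually start the agent, run the list from the Flume root directory, e.g.:
#   import subprocess
#   subprocess.run(cmd, cwd="apache-flume-1.7.0-bin", check=True)
```

Passing the arguments as a list avoids shell quoting issues, and shlex.join is used only to display the command in a copy-pasteable form.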
To test it, open a new terminal window and run "telnet localhost 44444". You should see something like the following.
Telnet window:
localhost:~ xxx$ telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
hello
OK
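If telnet is not installed, the same test can be scripted. The sketch below assumes the agent from this article is listening on localhost:44444 and replies with one line per event, as the telnet session above shows. send_line is a hypothetical helper name, not part of Flume.

```python
import socket


def send_line(text, host="localhost", port=44444, timeout=5.0):
    """Send one newline-terminated line and return the first line of the reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(text.encode("utf-8") + b"\n")
        with conn.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


# With the agent running, call:
#   send_line("hello")
# The telnet session above received "OK" for the same input.
```

Unlike telnet, this client ends the line with a bare newline, so the body logged by Flume may not include the trailing carriage-return byte.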
Flume window:
2017-01-09 00:56:54,946 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 68 65 6C 6C 6F 0D hello. }
Here Flume has picked up the hello message that was typed into port 44444 from the other window.
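The logger sink prints the event body as hex bytes followed by a printable rendering. A short sketch to decode the hex part of the line above (decode_body is a hypothetical helper name):

```python
def decode_body(hex_dump):
    """Turn a space-separated hex dump such as '68 65 6C' into bytes."""
    return bytes.fromhex(hex_dump)


body = decode_body("68 65 6C 6C 6F 0D")
print(body)  # b'hello\r'
```

The first five bytes are "hello". The last byte, 0D, is a carriage return, which is most likely the CR of the CR LF line ending that telnet sends; the logger shows it as "." because it is not printable.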
That's it for this first installment of the Flume introduction.