Each component (source, sink, or channel) in the flow has a name, a type, and a set of properties specific to that type and instantiation. For example, an Avro source needs a hostname (or IP address) and a port number on which to receive data. A memory channel can have a maximum queue size (“capacity”), and an HDFS sink needs to know the file system URI, the path in which to create files, the frequency of file rotation (“hdfs.rollInterval”), and so on. All such attributes of a component need to be set in the properties file of the hosting Flume agent.
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = node1
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
bin/flume-ng agent -n a1 -c conf -f conf/example.conf -Dflume.root.logger=INFO,console
Once the agent is listening, connect to the port to test it:
telnet node1 44444
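The telnet test can also be scripted. The following is a minimal sketch, not part of Flume: send_events is a hypothetical helper that opens a TCP connection, writes one line per event, and reads one reply line per event. It assumes the netcat source acknowledges every line with a reply such as "OK", which is its default behavior as far as I know; if acknowledgements are disabled, the read would block until the timeout.

```python
import socket


def send_events(host, port, lines, timeout=5.0):
    """Send each line to a newline-delimited TCP listener.

    Returns the reply line received for each event. Assumes the listener
    answers every line with exactly one line (e.g. "OK").
    """
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        reader = conn.makefile("r", encoding="utf-8", newline="\n")
        for line in lines:
            # One event per line, terminated by a newline.
            conn.sendall((line + "\n").encode("utf-8"))
            replies.append(reader.readline().strip())
    return replies
```

With the agent from example.conf running, send_events("node1", 44444, ["hello flume"]) should send one event, which the logger sink then prints to the agent console.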
# Name the components on this agent
# Define the agent, named a1.
a1.sources = r1
# Declare r1 as a source of agent a1.
a1.sinks = k1
# Declare k1 as a sink of agent a1.
a1.channels = c1
# Declare c1 as a channel of agent a1.
# An agent can have multiple sources, sinks, and channels.
# Describe/configure the source
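# Configure the sources of agent a1 in detail. The a1.sources.r1 form again shows that a1.sources can hold several sources.
a1.sources.r1.type = netcat
# Set the type of r1 to netcat, which listens on a network port.
a1.sources.r1.bind = node1
# Set the host that r1 binds to.
a1.sources.r1.port = 44444
# Set the port that r1 listens on.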
# Describe the sink
a1.sinks.k1.type = logger
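# Set the sink type to logger.
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
# Set the channel type to memory.
a1.channels.c1.capacity = 1000
# The channel stores at most 1000 events; an event is one record.
a1.channels.c1.transactionCapacity = 100
# Maximum number of events the channel takes from a source, or gives to a sink, per transaction.
# In other words, the upper limit on the number of events moved in a single transaction.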
# Bind the source and sink to the channel
# Bind the source and the sink each to the channel.
a1.sources.r1.channels = c1
# Bind the source to the channel. The property is plural because a source can be bound to multiple channels.
a1.sinks.k1.channel = c1
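The naming scheme annotated above (agent.kind.name.property) makes the wiring easy to check before starting the agent. The following is a minimal sketch under stated assumptions: parse_properties and check_wiring are hypothetical helpers, not part of Flume, and the parser handles only simple "key = value" lines and "#" comments rather than the full Java properties syntax.

```python
EXAMPLE = """
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = node1
a1.sources.r1.port = 44444
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
"""


def parse_properties(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of problems in the source/sink-to-channel bindings."""
    declared = set(props.get(f"{agent}.channels", "").split())
    problems = []
    # A source lists one or more channels, separated by whitespace.
    for source in props.get(f"{agent}.sources", "").split():
        bound = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not bound:
            problems.append(f"source {source} has no channels")
        for channel in bound:
            if channel not in declared:
                problems.append(f"source {source} uses undeclared channel {channel!r}")
    # A sink names exactly one channel.
    for sink in props.get(f"{agent}.sinks", "").split():
        channel = props.get(f"{agent}.sinks.{sink}.channel", "")
        if channel not in declared:
            problems.append(f"sink {sink} uses undeclared channel {channel!r}")
    return problems
```

Calling check_wiring(parse_properties(EXAMPLE), "a1") returns an empty list for the configuration above; a misspelled channel name in either binding shows up as one entry in the list.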
bin/flume-ng agent -n a1 \
-c conf \
-f conf/example.conf \
-Dflume.root.logger=INFO,console
[root@node1 flume-1.7.0]# flume-ng
Error: Unknown or unspecified command ''
Usage: /home/hadoop/flume-1.7.0/bin/flume-ng <command> [options]...
commands:
help display this help text
agent run a Flume agent
# Runs an agent.
avro-client run an avro Flume client
version show Flume version info
global options:
--conf,-c <conf> use configs in <conf> directory
# Specifies the conf directory.
--classpath,-C <cp> append to the classpath
--dryrun,-d do not actually start Flume, just print the command
--plugins-path <dirs> colon-separated list of plugins.d directories. See the
plugins.d section in the user guide for more details.
Default: $FLUME_HOME/plugins.d
-Dproperty=value sets a Java system property value
-Xproperty=value sets a Java -X option
agent options:
--name,-n <name> the name of this agent (required)
# Specifies the name of the agent to start.
--conf-file,-f <file> specify a config file (required if -z missing)
--zkConnString,-z <str> specify the ZooKeeper connection to use (required if -f missing)
--zkBasePath,-p <path> specify the base path in ZooKeeper for agent configs
--no-reload-conf do not reload config file if changed
--help,-h display help text
avro-client options:
--rpcProps,-P <file> RPC client properties file with server connection params
--host,-H <host> hostname to which events will be sent
--port,-p <port> port of the avro source
--dirname <dir> directory to stream to avro source
--filename,-F <file> text file to stream to avro source (default: std input)
--headerFile,-R <file> File containing event headers as key/value pairs on each new line
--help,-h display help text
Either --rpcProps or both --host and --port must be specified.
Note that if <conf> directory is specified, then it is always included first
in the classpath.
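The rules in the usage text for the agent command (--name is required, and at least one of --conf-file or --zkConnString must be given) can be captured in a small launcher helper. This is only a sketch: build_agent_command is a hypothetical function, it uses only flags listed in the help output above, and the default path bin/flume-ng assumes the script is run from the Flume installation directory.

```python
def build_agent_command(name, conf_dir, conf_file=None, zk_conn=None,
                        flume_ng="bin/flume-ng", java_props=None):
    """Build the argument list for 'flume-ng agent' from the documented flags."""
    if not name:
        raise ValueError("--name is required")
    if not conf_file and not zk_conn:
        raise ValueError("either --conf-file or --zkConnString is required")
    cmd = [flume_ng, "agent", "--name", name, "--conf", conf_dir]
    if conf_file:
        cmd += ["--conf-file", conf_file]
    if zk_conn:
        cmd += ["--zkConnString", zk_conn]
    # Java system properties are passed through as -Dproperty=value.
    for key, value in (java_props or {}).items():
        cmd.append(f"-D{key}={value}")
    return cmd
```

The returned list can be handed to subprocess.run. For the example in this post, the call would be build_agent_command("a1", "conf", conf_file="conf/example.conf", java_props={"flume.root.logger": "INFO,console"}).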
User guide: http://flume.apache.org/releases/content/1.7.0/FlumeUserGuide.html
Official website: http://flume.apache.org/index.html