Getting Started with Flume

1. Download apache-flume-1.4.0-bin.tar.gz and unpack it

tar -xzvf apache-flume-1.4.0-bin.tar.gz
cd apache-flume-1.4.0-bin

 


2. Create a simple Flume configuration file with the following contents:

   vi a1.conf

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
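
Before starting the agent, it can help to confirm that every sink named in the file is bound to a channel. The following is a minimal shell sketch, not part of Flume: it writes the configuration above to a temporary file (the `conf` and `missing` variables are illustrative names) and checks each sink listed on the `a1.sinks` line for a matching `.channel` entry.

```shell
# Write the step 2 configuration to a temporary file (stand-in for a1.conf)
conf=$(mktemp)
cat > "$conf" <<'EOF'
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

missing=0
# Sink names are the space-separated words after "a1.sinks ="
for s in $(sed -n 's/^a1\.sinks *= *//p' "$conf"); do
  if grep -q "^a1\.sinks\.$s\.channel *=" "$conf"; then
    echo "sink $s: channel set"
  else
    echo "sink $s: no channel"
    missing=1
  fi
done
rm -f "$conf"
```

To check a real file, point `conf` at a1.conf and drop the `mktemp`, `cat`, and `rm` lines.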

 

3. Start flume-ng

./bin/flume-ng agent --conf conf --conf-file a1.conf --name a1 -Dflume.root.logger=INFO,console

 

 

4. In another console, connect with telnet

 telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
hello world
OK


The message received by Flume appears in the console from step 3.

Press Ctrl+C in the step 3 console to stop the test.

 

5. Change the source of a1 to read data from a log file

The modified contents are:

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/secure

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

 

6. Start flume-ng again

./bin/flume-ng agent --conf conf --conf-file a1.conf --name a1 -Dflume.root.logger=INFO,console

 


Open a new SSH client and log in to this server with a username and password; Flume will then print log events.

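However, the events shown are incomplete: the logger sink prints only the first 16 bytes of each event body, so only the timestamp prefix of each line is visible.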

Contents of /var/log/secure:

Mar 12 14:50:21 web5 sshd[9856]: Accepted password for root from 10.0.2.11 port 1135 ssh2
Mar 12 14:50:21 web5 sshd[9856]: pam_unix(sshd:session): session opened for user root by (uid=0)

Flume console output:
2014-03-12 06:50:21,571 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 4D 61 72 20 31 32 20 31 34 3A 35 30 3A 32 31 20 Mar 12 14:50:21  }
2014-03-12 06:50:25,575 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 4D 61 72 20 31 32 20 31 34 3A 35 30 3A 32 31 20 Mar 12 14:50:21  }
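
The 16 hex values in each `body:` field are the first 16 bytes of the log line. This can be reproduced outside Flume: the sketch below takes the first line of /var/log/secure shown above, keeps its first 16 bytes, and prints them as text and as hex for comparison with the console output. The variable names `line` and `preview` are illustrative only.

```shell
line='Mar 12 14:50:21 web5 sshd[9856]: Accepted password for root from 10.0.2.11 port 1135 ssh2'

# Keep only the first 16 bytes of the line
preview=$(printf '%s' "$line" | head -c 16)

# Brackets make the trailing space visible
printf 'text: [%s]\n' "$preview"

# Hex dump of the same 16 bytes (od prints lowercase hex)
printf '%s' "$preview" | od -An -tx1
```

The text line should read `Mar 12 14:50:21 ` including the trailing space, which is the readable part of the event body in the Flume console.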

 

7. Add a file sink to Flume

The modified a1.conf is:

# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/secure

# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k2.type = file_roll
a1.sinks.k2.sink.directory = /tmp/flume

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1

 

After restarting Flume, data is written under /tmp/flume.

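Comparing the files under /tmp/flume with the Flume console output, the two sinks appear to have split the log data between them: some events show up on the console and the rest are saved to files under /tmp/flume. This happens because k1 and k2 both take from the same channel c1, and each event in a channel is delivered to only one sink.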
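
A toy model of two sinks draining one channel, written with plain shell variables, shows the same pattern. The event names and the strict alternation between sinks are assumptions made for illustration; in a real agent the distribution depends on timing. Each event is handed to exactly one sink, so neither sink sees the full log.

```shell
# Six placeholder events stand in for the six lines of /var/log/secure
events="e1 e2 e3 e4 e5 e6"
k1=""   # events taken by the logger sink
k2=""   # events taken by the file_roll sink
i=0
for e in $events; do
  # Each event leaves the channel once; alternation is a stand-in for real timing
  if [ $((i % 2)) -eq 0 ]; then
    k1="$k1 $e"
  else
    k2="$k2 $e"
  fi
  i=$((i + 1))
done
echo "k1 (logger):$k1"
echo "k2 (file_roll):$k2"
```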


Contents of /var/log/secure:

Mar 12 14:50:21 web5 sshd[9856]: Accepted password for root from 10.0.2.11 port 1135 ssh2
Mar 12 14:50:21 web5 sshd[9856]: pam_unix(sshd:session): session opened for user root by (uid=0)
Mar 12 14:59:36 web5 sshd[10350]: Accepted password for root from 10.0.2.11 port 1193 ssh2
Mar 12 14:59:36 web5 sshd[10350]: pam_unix(sshd:session): session opened for user root by (uid=0)
Mar 12 15:00:56 web5 sshd[10350]: Received disconnect from 10.0.2.11: 11: Disconnect requested by Windows SSH Client.
Mar 12 15:00:56 web5 sshd[10350]: pam_unix(sshd:session): session closed for user root


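A new file is created under /tmp/flume roughly every 30 seconds, which matches the file_roll sink's default sink.rollInterval of 30 seconds.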


8. Add a second channel to Flume for the file sink

The modified a1.conf is:

# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/secure

# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k2.type = file_roll
a1.sinks.k2.sink.directory = /tmp/flume

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = replicating
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2

 
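After this change, both the console output and the files under /tmp/flume match the log, because the replicating selector copies every event from r1 into both c1 and c2.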
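
A toy model with plain shell variables illustrates the effect of `selector.type = replicating`: each channel of r1 receives its own copy of every event, so each sink drains a complete stream. The event names are placeholders, not Flume output.

```shell
events="e1 e2 e3 e4 e5 e6"
c1=""   # channel read by k1 (logger)
c2=""   # channel read by k2 (file_roll)
for e in $events; do
  # Replicating: every event is copied to every channel of the source
  c1="$c1 $e"
  c2="$c2 $e"
done
echo "c1 -> k1:$c1"
echo "c2 -> k2:$c2"
```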
