Flume in Practice: Watching a Directory for File Changes


The Flume website documents source, sink, and channel configurations for many scenarios.



1. Create a folder named example under the Flume installation directory.

2. In example, create a new file:

spooldir-logger.conf

with the following content:

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /home/hadoop/flume_test
a1.sources.r1.fileHeader = true

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1


3. Create the directory /home/hadoop/flume_test


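4. Start the agent. The relative paths below assume the command is run from the example directory; -c points to the Flume conf directory, -f to the agent configuration file, and -n gives the agent name (a1).

Command: flume-ng agent -c ../conf -f spooldir-logger.conf -n a1 -Dflume.root.logger=INFO,console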
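Steps 1 to 4 can be wrapped in a small launcher that checks the prerequisites first. This is only a sketch: the function names check_prereqs and start_agent are hypothetical helpers, not part of Flume, and the paths are the ones used in this post.

```shell
#!/usr/bin/env bash
# Sketch only: check_prereqs and start_agent are illustrative helpers,
# not Flume commands.

# Return 0 only if the agent config file and the spool directory both exist.
check_prereqs() {
  local conf="$1" spool="$2"
  [ -f "$conf" ] || { echo "missing config file: $conf" >&2; return 1; }
  [ -d "$spool" ] || { echo "missing spool directory: $spool" >&2; return 1; }
}

# Run the same flume-ng command as in step 4, but only after the checks pass.
# Arguments: <flume conf dir> <agent config file> <spool directory>
start_agent() {
  local conf_dir="$1" conf="$2" spool="$3"
  check_prereqs "$conf" "$spool" || return 1
  flume-ng agent -c "$conf_dir" -f "$conf" -n a1 \
    -Dflume.root.logger=INFO,console
}
```

Usage, from the example directory: start_agent ../conf spooldir-logger.conf /home/hadoop/flume_test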


5. Create a new file inside the flume_test directory:

echo "11111"  >>  hello.txt

A new file appears in flume_test: hello.txt.COMPLETED


But Flume reported an error.

2017-03-22 21:55:50,005 (pool-3-thread-1) [ERROR - org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:280)] FATAL: Spool Directory source r1: { spoolDir: /home/hadoop/flume_test }: Uncaught exception in SpoolDirectorySource thread. Restart or reconfigure Flume to continue processing.
java.lang.IllegalStateException: File name has been re-used with different files. Spooling assumptions violated for /home/hadoop/flume_test/hello.txt.COMPLETED

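Tracing the source code that throws the exception: SpoolDirectorySource starts a thread that polls the monitored directory for files. After it finishes reading a file (readEvents), it renames that file (rollCurrentFile). If the rename fails, an IllegalStateException is thrown; judging by the message here, the target name hello.txt.COMPLETED was already in use. SpoolDirectoryRunnable catches the exception and rethrows it as a RuntimeException, which terminates the thread. According to the source, SpoolDirectoryRunnable runs on a single thread, so once that thread exits, no other files in the monitored directory are processed. As a result, when another file, word.txt, is created, Flume no longer reacts: word.txt is never renamed to word.txt.COMPLETED.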
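One way to notice this stalled state is to list the files in the spool directory that still lack the .COMPLETED suffix. A minimal sketch follows; pending_files is a hypothetical helper name, and it assumes the default .COMPLETED suffix used in this post.

```shell
#!/usr/bin/env bash
# Sketch only: pending_files is an illustrative helper, not a Flume command.

# Print regular files directly under the given directory whose names do not
# end in .COMPLETED, i.e. files the spooldir source has not finished with.
pending_files() {
  local dir="$1"
  find "$dir" -maxdepth 1 -type f ! -name '*.COMPLETED' | sort
}
```

Usage: pending_files /home/hadoop/flume_test. If a file such as word.txt keeps showing up long after it was created, the source thread has probably exited and the agent needs a restart, as the FATAL log line says.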



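The correct approach:


Do not create and write files directly inside flume_test. Create the file in another directory, finish writing it there, and then mv it into flume_test. On the same filesystem, mv is a rename, so Flume never sees a half-written file. (The session below uses cp on small files that are already complete.)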

[hadoop@nbdo3 ~]$ cd testdata/
[hadoop@nbdo3 testdata]$ ll
total 4
-rw-rw-r--. 1 hadoop hadoop 71 Mar 10 20:19 hello.txt
[hadoop@nbdo3 testdata]$ cp hello.txt ../flume_test/
[hadoop@nbdo3 testdata]$ echo "123456778" >> world.txt
[hadoop@nbdo3 testdata]$ cp world.txt ../flume_test/
[hadoop@nbdo3 testdata]$
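The staging-then-move pattern can be sketched as a small function. This is an illustration under assumptions: deliver is a hypothetical helper name, the numeric suffix scheme is one arbitrary way to avoid reusing a file name, and the staging and spool directories are assumed to be on the same filesystem so that mv is a plain rename.

```shell
#!/usr/bin/env bash
# Sketch only: deliver is an illustrative helper, not a Flume command.

# Move a finished file into the spool directory under a name that is not
# already taken, either as NAME or as NAME.COMPLETED.
# Arguments: <finished file> <spool directory>
# Prints the final path of the delivered file.
deliver() {
  local src="$1" spool="$2"
  local base name n=0
  base="$(basename "$src")"
  name="$base"
  # If the name was used before, append .1, .2, ... until it is free.
  while [ -e "$spool/$name" ] || [ -e "$spool/$name.COMPLETED" ]; do
    n=$((n + 1))
    name="$base.$n"
  done
  mv "$src" "$spool/$name" && echo "$spool/$name"
}
```

Usage: write the file under ~/testdata first, then run deliver ~/testdata/hello.txt /home/hadoop/flume_test. The check is not safe against several writers delivering the same name at the same moment; it only guards against the simple name reuse shown in the error above.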



