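Note: this is the source archive:
flume-ng-1.6.0-cdh5.7.0-src.tar.gz
Extract it to a directory on Windows.
Open the ReliableTaildirEventReader class and find the getMatchFiles method.
Comment out the original method, then add the following code below it: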
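/**
* Modified Flume source so that file matching is recursive.
* @param parentDir directory to search, including all of its subdirectories
* @param fileNamePattern regular expression that a file name must match
* @return the matching files, sorted by last-modified time
*/
private List<File> getMatchFiles(File parentDir, final Pattern fileNamePattern) {
// Collect every file under the given directory, then keep only those whose name matches the regex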
List<File> result = Lists.newArrayList();
for(File f: getAllFiles(parentDir)){
String fileName = f.getName();
if (fileNamePattern.matcher(fileName).matches()) {
result.add(f);
}
}
Collections.sort(result, new TailFile.CompareByLastModifiedTime());
return result;
}
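/**
* New method.
* Recursively collects all files under the given directory.
* @param parentDir directory to search
* @return every non-directory file found under parentDir
*/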
private List<File> getAllFiles(File parentDir){
List<File> fileList = Lists.newArrayList();
getAllFiles(parentDir,fileList);
return fileList;
}
/**
* New method.
* Recursive helper: appends every non-directory file under parentDir to fileList.
*/
private void getAllFiles(File parentDir,List<File> fileList){
File[] files = parentDir.listFiles();
if(null != files){
// Iterate the array fetched above; a second listFiles() call could return null
for(File file: files){
if(file.isDirectory()){
getAllFiles(file,fileList);
}else{
fileList.add(file);
}
}
}
}
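To try the recursive matching logic outside of Flume, here is a minimal standalone sketch. It uses only the JDK (no Guava). The class name RecursiveMatchSketch, the helper names, and the temporary directory layout are made up for illustration. Sorting by File.lastModified() is a stand-in for TailFile.CompareByLastModifiedTime, on the assumption that the comparator orders files by modification time.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;

public class RecursiveMatchSketch {

    // Collect every non-directory file below parentDir, descending into subdirectories.
    public static void collectFiles(File parentDir, List<File> out) {
        File[] children = parentDir.listFiles(); // null if not a directory or on I/O error
        if (children == null) {
            return;
        }
        for (File child : children) {
            if (child.isDirectory()) {
                collectFiles(child, out);
            } else {
                out.add(child);
            }
        }
    }

    // Keep only files whose bare name matches the pattern, ordered by modification time.
    public static List<File> matchFiles(File parentDir, Pattern fileNamePattern) {
        List<File> all = new ArrayList<>();
        collectFiles(parentDir, all);
        List<File> result = new ArrayList<>();
        for (File f : all) {
            if (fileNamePattern.matcher(f.getName()).matches()) {
                result.add(f);
            }
        }
        result.sort(Comparator.comparingLong(File::lastModified));
        return result;
    }

    // Hypothetical sample tree: <root>/f/test.log, <root>/f/f1/f2/test.log, <root>/f/notes.txt
    public static File buildSampleTree() {
        try {
            Path root = Files.createTempDirectory("taildir-sketch");
            Path deep = Files.createDirectories(root.resolve("f/f1/f2"));
            Files.write(root.resolve("f/test.log"), "hello hadoop\n".getBytes());
            Files.write(deep.resolve("test.log"), "666\n".getBytes());
            Files.write(root.resolve("f/notes.txt"), "ignored\n".getBytes());
            return root.toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File root = buildSampleTree();
        for (File f : matchFiles(root, Pattern.compile(".*.log"))) {
            System.out.println(f.getAbsolutePath());
        }
    }
}
```

Running main should list both test.log files and skip notes.txt, because matching is done on the bare file name regardless of how deep the file sits.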
This is the extracted source tree on the server.
Upload the modified ReliableTaildirEventReader class
to the following path:
[hadoop@vm01 taildir]$ ll
total 40
-rw-r--r--. 1 hadoop hadoop 12707 Aug 22 06:22 ReliableTaildirEventReader.java
-rw-rw-r--. 1 hadoop hadoop 2418 Mar 23 2016 TaildirSourceConfigurationConstants.java
-rw-rw-r--. 1 hadoop hadoop 12027 Mar 23 2016 TaildirSource.java
-rw-rw-r--. 1 hadoop hadoop 5129 Mar 23 2016 TailFile.java
[hadoop@vm01 taildir]$ pwd
/home/hadoop/source/flume-ng-1.6.0-cdh5.7.0/flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir
[hadoop@vm01 taildir]$
Compile:
[hadoop@vm01 flume-taildir-source]$ pwd
/home/hadoop/source/flume-ng-1.6.0-cdh5.7.0/flume-ng-sources/flume-taildir-source
[hadoop@vm01 flume-taildir-source]$ mvn clean package
[hadoop@vm01 flume-taildir-source]$ cd target/
[hadoop@vm01 target]$ ll
total 36
drwxrwxr-x. 4 hadoop hadoop 31 Aug 22 06:43 classes
-rw-rw-r--. 1 hadoop hadoop 31331 Aug 22 06:44 flume-taildir-source-1.6.0-cdh5.7.0.jar # this is the jar we need
drwxrwxr-x. 4 hadoop hadoop 47 Aug 22 06:43 generated-sources
drwxrwxr-x. 2 hadoop hadoop 27 Aug 22 06:44 maven-archiver
drwxrwxr-x. 3 hadoop hadoop 21 Aug 22 06:42 maven-shared-archive-resources
drwxrwxr-x. 2 hadoop hadoop 4096 Aug 22 06:44 surefire-reports
drwxrwxr-x. 4 hadoop hadoop 31 Aug 22 06:43 test-classes
Copy flume-taildir-source-1.6.0-cdh5.7.0.jar from this directory
into the lib directory of the Flume installation:
[hadoop@vm01 target]$ cp flume-taildir-source-1.6.0-cdh5.7.0.jar ~/app/apache-flume-1.6.0-cdh5.7.0-bin/lib/
[hadoop@vm01 conf]$ pwd
/home/hadoop/app/apache-flume-1.6.0-cdh5.7.0-bin/conf
[hadoop@vm01 conf]$ cat taildir-channels-logger.conf
a1.sources=r1
a1.sinks=k1
a1.channels=c1
a1.sources.r1.type=TAILDIR
a1.sources.r1.filegroups=f1
a1.sources.r1.filegroups.f1=/tmp/flume/.*.log
a1.sources.r1.positionFile=/tmp/flume/position/taildir_position.json
a1.sources.r1.headers.f1.headerKey1=value1
a1.sinks.k1.type=logger
a1.channels.c1.type=memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sources.r1.channels=c1
a1.sinks.k1.channel=c1
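The filegroups value above combines a directory and a file-name regex in one string. The sketch below shows one plausible way such a value is taken apart, assuming the reader treats everything before the last path separator as the parent directory and compiles the last segment as a regular expression. That split is an assumption about the Taildir reader, and FileGroupPatternSketch and its method names are hypothetical. It also shows that the unescaped dot in `.*.log` matches any character, so names without a literal ".log" suffix can match too.

```java
import java.io.File;
import java.util.regex.Pattern;

public class FileGroupPatternSketch {

    // Directory part of a filegroup value, e.g. "/tmp/flume/.*.log" -> /tmp/flume
    public static File parentDirOf(String fileGroup) {
        return new File(fileGroup).getParentFile();
    }

    // Last path segment compiled as a regex, e.g. ".*.log"
    public static Pattern namePatternOf(String fileGroup) {
        return Pattern.compile(new File(fileGroup).getName());
    }

    public static void main(String[] args) {
        String fileGroup = "/tmp/flume/.*.log";
        Pattern p = namePatternOf(fileGroup);
        System.out.println("parent dir : " + parentDirOf(fileGroup));
        System.out.println("name regex : " + p.pattern());
        // "catalog" matches because the middle dot stands for any single character
        for (String name : new String[] {"test.log", "test.log.1", "catalog"}) {
            System.out.println(name + " -> " + p.matcher(name).matches());
        }
    }
}
```

With the recursive getMatchFiles in place, the parent directory is only the starting point of the search; the regex is still applied to each file's bare name.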
Start the Flume agent and test:
[hadoop@vm01 apache-flume-1.6.0-cdh5.7.0-bin]$ bin/flume-ng agent \
--conf conf \
--conf-file conf/taildir-channels-logger.conf \
--name a1 \
-Dflume.root.logger=INFO,console
Open a second terminal session and write to log files in nested subdirectories:
[hadoop@vm01 ~]$ mkdir -p /tmp/flume/f/f1/f2
[hadoop@vm01 ~]$ echo "hello hadoop" >> /tmp/flume/f/test.log
[hadoop@vm01 ~]$ echo "666" >> /tmp/flume/f/f1/f2/test.log