Multi-channel routing in Flume with a custom interceptor

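Contents

        • 1. Introduction to interceptors
        • 2. Building the interceptor in IDEA
        • 3. Writing the Flume conf file
        • 4. Running the command and checking the result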

1. Introduction to interceptors

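Interceptors are mainly used to classify logs and to modify or drop unwanted log entries. Flume ships with built-in interceptors and also supports custom ones.
This article uses a custom interceptor to classify events so that they can be routed to different channels.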

2. Building the interceptor in IDEA

First create a Maven project and add the following dependency to pom.xml:

    <dependency>
      <groupId>org.apache.flume</groupId>
      <artifactId>flume-ng-core</artifactId>
      <version>1.6.0</version>
    </dependency>

The implementation is as follows:

package kb07;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Implements the methods declared by the Interceptor interface
public class InterceptorDemo implements Interceptor {

    @Override
    public void initialize() {
        // Nothing to initialize
    }

    @Override
    public Event intercept(Event event) {
        Map<String, String> headers = event.getHeaders();
        // Decode with an explicit charset rather than the platform default
        String body = new String(event.getBody(), StandardCharsets.UTF_8);
        if (body.startsWith("gree")) {
            // Bodies that start with "gree" get the header type=gree
            headers.put("type", "gree");
        } else {
            // Everything else gets the header type=znn
            headers.put("type", "znn");
        }
        return event;
    }

    @Override
    public List<Event> intercept(List<Event> list) {
        // Build a fresh list for each batch so that no state is shared between calls
        List<Event> addHeaderEvents = new ArrayList<>(list.size());
        for (Event event : list) {
            // Tag each event and collect the result
            addHeaderEvents.add(intercept(event));
        }
        return addHeaderEvents;
    }

    @Override
    public void close() {
        // Nothing to release
    }

    // Static nested class that Flume uses to build the interceptor
    public static class Builder implements Interceptor.Builder {

        @Override
        public Interceptor build() {
            return new InterceptorDemo();
        }

        @Override
        public void configure(Context context) {
            // No configuration parameters are needed
        }
    }
}
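
The header rule itself can be tried without a Flume runtime. The following is only a minimal sketch using the JDK alone: the class name HeaderRuleSketch, the helper methods typeFor and tag, and the sample strings are all hypothetical, and a plain Map stands in for the headers of a Flume Event.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone model of the rule in InterceptorDemo.intercept(Event).
// It does not use any Flume classes.
class HeaderRuleSketch {

    // Bodies that start with "gree" map to "gree", everything else to "znn".
    static String typeFor(byte[] body) {
        String text = new String(body, StandardCharsets.UTF_8);
        return text.startsWith("gree") ? "gree" : "znn";
    }

    // Stand-in for event.getHeaders().put("type", ...): returns a header map
    // containing only the "type" entry.
    static Map<String, String> tag(byte[] body) {
        Map<String, String> headers = new HashMap<>();
        headers.put("type", typeFor(body));
        return headers;
    }

    public static void main(String[] args) {
        // Illustrative sample bodies, not taken from a real log
        String[] samples = {"gree log line", "other log line", " gree with a leading space"};
        for (String sample : samples) {
            byte[] body = sample.getBytes(StandardCharsets.UTF_8);
            System.out.println("[" + sample + "] -> " + tag(body));
        }
    }
}
```

Note that startsWith is an exact prefix match, so a body with leading whitespace before "gree" is tagged znn under this rule.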

Once the code is written, package it as a jar and copy it to the flume/lib directory on the virtual machine.

3. Writing the Flume conf file

In the flume/conf/ directory, create a new .conf file: vi interceptor.conf

a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

a1.sources.r1.interceptors = i1
# Interceptor type: fully qualified class name, then $ and the nested Builder class name
a1.sources.r1.interceptors.i1.type = kb07.InterceptorDemo$Builder

a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = type
a1.sources.r1.selector.mapping.gree = c1
a1.sources.r1.selector.mapping.znn = c2

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.filePrefix = gree
a1.sinks.k1.hdfs.fileSuffix = .csv
# Events whose body starts with gree are written under the greedemo directory
a1.sinks.k1.hdfs.path = hdfs://192.168.234.150:9000/user/greedemo/%Y-%m-%d
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.batchSize = 640
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollSize = 100
a1.sinks.k1.hdfs.rollInterval = 3

a1.sinks.k2.type = hdfs
a1.sinks.k2.hdfs.fileType = DataStream
a1.sinks.k2.hdfs.filePrefix = znn
a1.sinks.k2.hdfs.fileSuffix = .csv
a1.sinks.k2.hdfs.path = hdfs://192.168.234.150:9000/user/znndemo/%Y-%m-%d
a1.sinks.k2.hdfs.useLocalTimeStamp = true
a1.sinks.k2.hdfs.batchSize = 640
a1.sinks.k2.hdfs.rollCount = 0
a1.sinks.k2.hdfs.rollSize = 100
a1.sinks.k2.hdfs.rollInterval = 3

a1.sources.r1.channels = c1 c2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2

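This configuration uses a single source. Based on the type header of each event, the multiplexing selector sends the event to one of two channels, and each channel feeds its own sink.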
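
To make the routing step concrete, here is a rough model of what the selector.mapping entries express, written with the JDK only. It is a sketch under stated assumptions, not Flume's implementation: MultiplexingSketch, TaggedEvent and route are hypothetical names, lists stand in for the memory channels, and events whose header value has no mapping are simply skipped here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of header-based routing; it does not use any Flume classes.
class MultiplexingSketch {

    // Minimal stand-in for an event that already carries its "type" header.
    static final class TaggedEvent {
        final String type;
        final String body;

        TaggedEvent(String type, String body) {
            this.type = type;
            this.body = body;
        }
    }

    // Header value -> channel name, mirroring
    // selector.mapping.gree = c1 and selector.mapping.znn = c2.
    static final Map<String, String> MAPPING = Map.of("gree", "c1", "znn", "c2");

    // Groups event bodies by the channel that the mapping assigns to their type.
    static Map<String, List<String>> route(List<TaggedEvent> events) {
        Map<String, List<String>> channels = new LinkedHashMap<>();
        channels.put("c1", new ArrayList<>());
        channels.put("c2", new ArrayList<>());
        for (TaggedEvent event : events) {
            String channel = MAPPING.get(event.type);
            if (channel == null) {
                continue; // assumption of this sketch: unmapped types are skipped
            }
            channels.get(channel).add(event.body);
        }
        return channels;
    }

    public static void main(String[] args) {
        // Illustrative events, not taken from a real run
        List<TaggedEvent> events = List.of(
                new TaggedEvent("gree", "gree order 1"),
                new TaggedEvent("znn", "user login"),
                new TaggedEvent("gree", "greeting"));
        System.out.println(route(events));
    }
}
```

Because the interceptor in this article always sets type to either gree or znn, every event matches one of the two mappings.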

4. Running the command and checking the result

First, run the following command from the flume directory:

[root@znn flume]# flume-ng agent -c conf -f conf/interceptor.conf -n a1 -Dflume.root.logger=INFO,console

Then connect with telnet localhost 44444 and type some messages:

Connected to localhost.
Escape character is '^]'.
hello
OK
gree adg as
OK
flume
OK
greeflume
OK

Check in HDFS whether new files have appeared under /user.
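Two new directories, greedemo and znndemo, have been created. Inside them, the four messages we typed have been split into separate csv files according to their type.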
