Custom UDFs in Flink

Preparing the POJO class

package com.wanshun.bigdata.chapter05;

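/**
 * Author: panghu
 * Date: 2022-04-19
 * Description: Defines a class that Flink can treat as a POJO type, which simplifies parsing and serialization.
 * A class must satisfy the following to be recognized as a POJO:
 * 1. The class is public and standalone (it must not be a non-static inner class).
 * 2. It has a public no-argument constructor.
 * 3. Every field is either public and non-final, or has public getter and setter methods.
 * 4. Every field's type is serializable (supported by one of Flink's serializers).
 */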
public class Event {
    public String user;
    public String url;
    public Long timeStamp;

    public Event() {
    }

    public Event(String user, String url, Long timeStamp) {
        this.user = user;
        this.url = url;
        this.timeStamp = timeStamp;
    }

    @Override
    public String toString() {
        return "Event{" +
                "user='" + user + '\'' +
                ", url='" + url + '\'' +
                ", timeStamp=" + timeStamp +
                '}';
    }
}
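
The rules listed in the comment above can be approximated with plain reflection. The sketch below is not Flink's own check (Flink does this analysis itself when it extracts type information); `PojoRuleCheck`, `looksLikePojo`, `hasAccessors`, and the nested sample classes are hypothetical names introduced only for illustration. It covers rules 1 to 3, does not check rule 4, and does not recognize `is`-prefixed boolean getters.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical helper, NOT part of Flink: approximates POJO rules 1-3 with reflection.
class PojoRuleCheck {

    static boolean looksLikePojo(Class<?> clazz) {
        int classMods = clazz.getModifiers();
        // Rule 1: the class is public and is not a non-static inner class.
        if (!Modifier.isPublic(classMods)) {
            return false;
        }
        if (clazz.isMemberClass() && !Modifier.isStatic(classMods)) {
            return false;
        }
        // Rule 2: a public no-argument constructor exists.
        try {
            clazz.getConstructor();
        } catch (NoSuchMethodException e) {
            return false;
        }
        // Rule 3: each instance field is public and non-final, or has a getter/setter pair.
        for (Field field : clazz.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods) || field.isSynthetic()) {
                continue; // not part of the record's data
            }
            boolean publicNonFinal = Modifier.isPublic(mods) && !Modifier.isFinal(mods);
            if (!publicNonFinal && !hasAccessors(clazz, field)) {
                return false;
            }
        }
        return true;
    }

    // Looks for public getXxx() and setXxx(type) methods matching the field name.
    static boolean hasAccessors(Class<?> clazz, Field field) {
        String name = field.getName();
        String suffix = Character.toUpperCase(name.charAt(0)) + name.substring(1);
        try {
            clazz.getMethod("get" + suffix);
            clazz.getMethod("set" + suffix, field.getType());
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Copy of the Event fields from the listing above, so the sketch is self-contained.
    public static class Event {
        public String user;
        public String url;
        public Long timeStamp;
        public Event() {}
    }

    // Violates rule 3: the field is public but final and has no accessors.
    public static class FinalField {
        public final String user = "x";
        public FinalField() {}
    }

    // Violates rule 2: there is no no-argument constructor.
    public static class NoDefaultCtor {
        public String user;
        public NoDefaultCtor(String user) { this.user = user; }
    }
}
```

Calling `PojoRuleCheck.looksLikePojo(PojoRuleCheck.Event.class)` returns `true`, while the `FinalField` and `NoDefaultCtor` samples return `false`. When a class fails Flink's real POJO analysis, Flink handles it as a generic type instead, which is usually slower to serialize.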

Writing the custom UDF

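// Assumed to live in the same package as Event, so Event needs no import.
package com.wanshun.bigdata.chapter05;

import org.apache.flink.api.common.functions.RichFilterFunction;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class _13UDFTest {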
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStreamSource<Event> streamSource = env.fromElements(
                new Event("Tom", "/info", 2000L),
                new Event("Bob", "/home", 1000L)
        );

        SingleOutputStreamOperator<Event> data = streamSource.filter(new MyFilter("home"));

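        // Alternative: the same filter written as an anonymous inner class
        // (needs import org.apache.flink.api.common.functions.FilterFunction)
        // SingleOutputStreamOperator<Event> data = streamSource.filter(new FilterFunction<Event>() {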
        //     @Override
        //     public boolean filter(Event event) throws Exception {
        //         return event.url.contains("home");
        //     }
        // });
        data.print();

        env.execute();
    }

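    // Keeps only the events whose url contains the given keyword.
    // RichFilterFunction also offers open()/close() and the runtime context; none are overridden here.
    public static class MyFilter extends RichFilterFunction<Event> {
        private final String keyWord;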

        public MyFilter(String keyWord) {
            this.keyWord = keyWord;
        }

        @Override
        public boolean filter(Event event) throws Exception {
            return event.url.contains(this.keyWord);
        }
    }
}
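
`MyFilter` receives its keyword through the constructor. Flink's function interfaces extend `java.io.Serializable`, and function instances are serialized when the job is distributed, so fields supplied this way need to be serializable themselves. The following is a minimal plain-Java sketch of that idea with no Flink dependency; `KeywordFilterSketch` and `roundTrip` are hypothetical names, and the round trip only imitates what happens to a function object, it is not Flink's actual shipping code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for MyFilter: same keyword logic, applied to a plain url string.
class KeywordFilterSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // String is serializable, so the field survives the round trip.
    private final String keyWord;

    KeywordFilterSketch(String keyWord) {
        this.keyWord = keyWord;
    }

    boolean filter(String url) {
        return url.contains(keyWord);
    }

    // Serializes the filter to bytes and reads it back, imitating a copy sent to a worker.
    static KeywordFilterSketch roundTrip(KeywordFilterSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (KeywordFilterSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("filter could not be serialized", e);
        }
    }
}
```

With the keyword `"home"`, the deserialized copy still accepts `/home` and rejects `/info`, which matches the two sample events in the job above: only Bob's record passes the filter. A field of a non-serializable type would make `writeObject` fail, so such resources are better created at runtime than passed through the constructor.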
