Flink Custom Data Sources

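Flink already supports most common data sources out of the box. In some cases, however, the source you need is not supported yet, or the records are of a custom data type. That is when a custom Flink data source is needed. Without further ado, here is the code: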

import lombok.*;
import org.apache.commons.lang3.RandomUtils;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

import java.util.Random;

/**
 * @Author: J
 * @Version: 1.0
 * @CreateTime: 2023/6/13
 * @Description: Custom data source test
 **/
public class FlinkCustomizeSource {
    public static void main(String[] args) throws Exception {
        // Create the streaming environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Add the custom data source (typed, so downstream operators see CustomizeBean rather than a raw type)
        DataStreamSource<CustomizeBean> dataStreamSource = env.addSource(new customizeSource());
        // Print every record
        dataStreamSource.print();
        env.execute();
    }
}

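// A custom data source implements the SourceFunction interface. Note that a SourceFunction is a non-parallel source
// (it runs with a parallelism of 1); for a parallel, distributed source, extend the RichParallelSourceFunction class instead.
class customizeSource implements SourceFunction<CustomizeBean> {
    // Set to false by cancel(); volatile because cancel() is called from a different thread than run()
    private volatile boolean running = true;

    // Runs on the source task's thread and emits records until the job is cancelled
    @Override
    public void run(SourceContext<CustomizeBean> ctx) throws Exception {
        /* The actual data logic lives in this method and depends on the business requirements; random beans are generated here purely for demonstration. */
        Random random = new Random();
        String[] genders = {"M", "W"};
        while (running) {
            // Build a new bean for every record so that already-emitted records are never mutated
            CustomizeBean customizeBean = new CustomizeBean();
            customizeBean.setAge(RandomUtils.nextInt(18, 80)); // age, 18 inclusive to 80 exclusive
            customizeBean.setName("A-" + random.nextInt()); // name
            customizeBean.setGender(genders[RandomUtils.nextInt(0, genders.length)]); // gender
            // Emit the record
            ctx.collect(customizeBean);
            // Sleep to slow down the console output
            Thread.sleep(1000);
        }
    }

    // Called when the job is cancelled
    @Override
    public void cancel() {
        // Makes the loop in run() exit
        running = false;
    }
}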

@Getter
@Setter
@ToString
@NoArgsConstructor
@AllArgsConstructor
class CustomizeBean{
    private String name; // name
    private int age; // age
    private String gender; // gender
}
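
The source above stops because cancel changes a field that the loop in run keeps checking. Flink typically calls cancel from a different thread than the one executing run, so that field should be visible across threads. The following is a minimal sketch of that pattern using only the JDK, with no Flink dependency; CountingSource and runAndCancel are hypothetical names made up for this illustration and are not Flink APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CancellableLoopDemo {

    /** Hypothetical stand-in for a source: emits integers until cancel() is called. */
    static class CountingSource {
        // volatile so that the write made by the cancelling thread is seen by the emitting thread
        private volatile boolean running = true;
        private final List<Integer> emitted = Collections.synchronizedList(new ArrayList<>());

        // Plays the role of run(): loops until the flag is cleared
        void run() throws InterruptedException {
            int next = 0;
            while (running) {
                emitted.add(next++);
                Thread.sleep(10);
            }
        }

        // Plays the role of cancel(): only flips the flag, the loop exits on its own
        void cancel() {
            running = false;
        }

        List<Integer> emitted() {
            return emitted;
        }
    }

    /** Runs the source on a worker thread, cancels it after runMillis, and reports whether the worker stopped. */
    static boolean runAndCancel(CountingSource source, long runMillis) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                source.run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        Thread.sleep(runMillis);
        source.cancel();
        worker.join(2000); // wait up to two seconds for the loop to notice the flag
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        CountingSource source = new CountingSource();
        boolean stopped = runAndCancel(source, 100);
        System.out.println("stopped=" + stopped + ", emitted=" + source.emitted().size());
    }
}
```

Without volatile, the emitting thread is not guaranteed to ever observe the change, so the loop could in principle keep running after cancel returns.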

Running FlinkCustomizeSource prints the following to the console:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/xxx/data/maven-repository/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.17.2/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/xxx/data/maven-repository/repository/org/slf4j/slf4j-log4j12/1.7.25/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2> CustomizeBean(name=A-936337035, age=59, gender=M)
3> CustomizeBean(name=A--1083661159, age=46, gender=W)
4> CustomizeBean(name=A-1844226683, age=67, gender=M)
5> CustomizeBean(name=A--75267383, age=41, gender=M)
6> CustomizeBean(name=A-1372451924, age=67, gender=M)
7> CustomizeBean(name=A--1997466557, age=57, gender=W)
8> CustomizeBean(name=A-1024015616, age=55, gender=W)
9> CustomizeBean(name=A-1955095194, age=77, gender=M)
10> CustomizeBean(name=A--1661025692, age=51, gender=W)
1> CustomizeBean(name=A-961725897, age=52, gender=M)
2> CustomizeBean(name=A--1117628763, age=73, gender=M)
3> CustomizeBean(name=A-1171187615, age=29, gender=M)
4> CustomizeBean(name=A--838540662, age=60, gender=M)
5> CustomizeBean(name=A-355071472, age=45, gender=W)
6> CustomizeBean(name=A-1724444454, age=26, gender=W)
7> CustomizeBean(name=A-1889957428, age=69, gender=W)
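
The number before each ">" in the output above is the 1-based index of the print subtask that wrote the line. The prefixes cycle 2, 3, ..., 10, 1, 2, ..., which is what one would expect if the single source subtask hands its records round-robin to ten print subtasks. The sketch below reproduces that sequence in plain Java. It rests on two assumptions that the post does not state: that print ran with a parallelism of 10 (inferred from the prefixes), and that records were redistributed round-robin. The prefixes method is a hypothetical helper for illustration, not a Flink API.

```java
import java.util.ArrayList;
import java.util.List;

public class RoundRobinPrefixDemo {

    /**
     * Returns the 1-based print prefixes for recordCount records handed round-robin
     * to sinkParallelism subtasks, starting at the 0-based subtask firstSubtask.
     */
    static List<Integer> prefixes(int recordCount, int sinkParallelism, int firstSubtask) {
        List<Integer> result = new ArrayList<>();
        int channel = firstSubtask;
        for (int i = 0; i < recordCount; i++) {
            result.add(channel + 1);                   // printed prefixes start at 1, not 0
            channel = (channel + 1) % sinkParallelism; // move on to the next subtask, wrapping around
        }
        return result;
    }

    public static void main(String[] args) {
        // 16 records, 10 print subtasks (assumed), first record on subtask index 1, i.e. prefix "2>"
        System.out.println(prefixes(16, 10, 1));
        // prints [2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7]
    }
}
```

The starting subtask is passed in as a parameter because it is not something the output lets us predict; here it is simply read off the first printed line.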
