Flink Broadcast Stream Best Practices

Contents

      • Broadcast stream vs. regular stream JOIN, illustrated
      • Code walkthrough

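Broadcast stream vs. regular stream JOIN, illustrated

user actions can be viewed as the event stream.

patterns is the broadcast stream: the full set of patterns is loaded onto every compute node.
[Figure 1: the broadcast stream joined with the event stream]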
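
To make the broadcast side concrete, here is a minimal plain-Java sketch of the idea that every parallel instance ends up with its own full copy of the patterns. It does not use Flink at all; `BroadcastSketch`, `Subtask`, `newSubtasks`, and `broadcast` are hypothetical names chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java illustration only; none of these classes are part of Flink.
final class BroadcastSketch {

    // Hypothetical stand-in for one parallel subtask and its local copy of the patterns.
    static final class Subtask {
        final Set<String> patterns = new HashSet<>();
    }

    // Creates the given number of subtasks, each starting with an empty pattern set.
    static List<Subtask> newSubtasks(int parallelism) {
        List<Subtask> subtasks = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            subtasks.add(new Subtask());
        }
        return subtasks;
    }

    // Broadcasting: every subtask receives every pattern.
    static void broadcast(List<Subtask> subtasks, String pattern) {
        for (Subtask subtask : subtasks) {
            subtask.patterns.add(pattern);
        }
    }

    public static void main(String[] args) {
        List<Subtask> subtasks = newSubtasks(3);
        broadcast(subtasks, "pattern-a");
        broadcast(subtasks, "pattern-b");
        for (int i = 0; i < subtasks.size(); i++) {
            System.out.println("subtask " + i + " holds " + subtasks.get(i).patterns);
        }
    }
}
```

Because each subtask holds the complete pattern set, an event can be checked against the patterns locally, whichever subtask it lands on.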

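Regular two-stream join
Records are routed according to the join condition: records with the same key are sent to the same compute node, roughly as in the figure below.
[Figure 2: a regular two-stream join]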
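
For contrast with broadcasting, the sketch below shows key-based routing in plain Java: a record's key alone decides which subtask receives it, so equal keys always meet on the same node. This is a deliberate simplification and an assumption for illustration. Flink's real assignment goes through key groups and a different hash, and `KeyRoutingSketch` and `subtaskFor` are hypothetical names.

```java
// Plain-Java illustration only; this is NOT Flink's actual key-group assignment.
final class KeyRoutingSketch {

    // Maps a key to a subtask index in [0, parallelism).
    // floorMod keeps the result non-negative even when hashCode() is negative.
    static int subtaskFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    public static void main(String[] args) {
        int parallelism = 3;
        String[] keys = {"user-1", "user-2", "user-1", "user-3", "user-2"};
        for (String key : keys) {
            // Repeated keys always print the same subtask index.
            System.out.println(key + " -> subtask " + subtaskFor(key, parallelism));
        }
    }
}
```

Each node therefore sees only the slice of data whose keys map to it, whereas a broadcast stream delivers everything to every node.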

Code walkthrough

Goal: add watch terms dynamically. Records that match a watch term continue downstream; records that do not match are dropped.
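
Before the Flink version, the matching rule on its own can be sketched in plain Java. This is only an illustration under assumptions: `WatchlistSketch`, `Person`, `addWatchTerm`, and `matches` are hypothetical names, with `Person` standing in for the article's `People` class and its `idCard` and `phone` fields.

```java
import java.util.HashSet;
import java.util.Set;

// Plain-Java illustration of the filter rule; no Flink classes are involved.
final class WatchlistSketch {

    // Hypothetical stand-in for the article's People record.
    static final class Person {
        final String idCard;
        final String phone;

        Person(String idCard, String phone) {
            this.idCard = idCard;
            this.phone = phone;
        }
    }

    // Watch terms added so far; plays the role of the broadcast state.
    private final Set<String> watched = new HashSet<>();

    // Called whenever a new watch term arrives.
    void addWatchTerm(String term) {
        watched.add(term);
    }

    // A record matches if its ID card number or its phone number is being watched.
    boolean matches(Person p) {
        return (p.idCard != null && watched.contains(p.idCard))
                || (p.phone != null && watched.contains(p.phone));
    }

    public static void main(String[] args) {
        WatchlistSketch watchlist = new WatchlistSketch();
        Person person = new Person("id-1", "555-0100");

        System.out.println(watchlist.matches(person)); // false: nothing is watched yet
        watchlist.addWatchTerm("555-0100");
        System.out.println(watchlist.matches(person)); // true: the phone number is now watched
    }
}
```

Note that a record arriving before its watch term was added is dropped and is not re-evaluated later; only records that arrive after the term was added can match it.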

Main class

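import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

        StreamExecutionEnvironment environment = StreamExecutionEnvironment.getExecutionEnvironment();
        environment.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime);
        // Checkpoint every 180 seconds.
        environment.enableCheckpointing(1000 * 180);
        // KafkaUtil.getConsumer and parse(...) are the author's own helpers; their definitions are not shown.
        FlinkKafkaConsumer010<String> location = KafkaUtil.getConsumer("event_stream", "test_1", "test");
        FlinkKafkaConsumer010<String> object = KafkaUtil.getConsumer("bro_stream", "test_2", "test");
        // Key the event stream so that records with the same key are sent to the same node.
        // The String key type is an assumption; the type of People.id is not shown in the original.
        KeyedStream<People, String> driverDatastream = environment.addSource(location).map(new MapFunction<String, People>() {

            @Override
            public People map(String s) throws Exception {
                return parse(s);
            }
        }).keyBy((KeySelector<People, String>) people -> people.id);

        // Descriptor for the broadcast map state; both key and value are Strings.
        MapStateDescriptor<String, String> mapStateDescriptor = new MapStateDescriptor<>("register", Types.STRING, Types.STRING);
        BroadcastStream<String> broadcast = environment.addSource(object).broadcast(mapStateDescriptor);
        driverDatastream.connect(broadcast).process(new PatternEvaluator()).print();
        try {
            environment.execute("register collect");
        } catch (Exception e) {
            e.printStackTrace();
        }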

Processing the elements

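import org.apache.flink.api.common.state.BroadcastState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.state.ReadOnlyBroadcastState;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction;
import org.apache.flink.util.Collector;

// Type parameters: key type, event type, broadcast element type, output type.
// The String key type is an assumption; the type of People.id is not shown in the original.
public class PatternEvaluator extends KeyedBroadcastProcessFunction<String, People, String, People> {

    MapStateDescriptor<String, String> mapStateDescriptor;

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        // The descriptor must use the same name and types as the one passed to broadcast() in the main class.
        mapStateDescriptor = new MapStateDescriptor<>("register", Types.STRING, Types.STRING);
    }

    // Check each event against the broadcast state; forward it downstream only if it matches.
    @Override
    public void processElement(People value, ReadOnlyContext ctx, Collector<People> out) throws Exception {
        ReadOnlyBroadcastState<String, String> broadcastState = ctx.getBroadcastState(mapStateDescriptor);
        boolean idCardWatched = value.getIdCard() != null && broadcastState.get(value.getIdCard()) != null;
        boolean phoneWatched = value.getPhone() != null && broadcastState.get(value.getPhone()) != null;
        if (idCardWatched || phoneWatched) {
            System.out.println("Matched: " + value);
            out.collect(value);
        }
    }

    // Store each newly broadcast watch term in the broadcast state.
    @Override
    public void processBroadcastElement(String value, Context ctx, Collector<People> out) throws Exception {
        System.out.println("New watch term added: " + value);
        BroadcastState<String, String> broadcastState = ctx.getBroadcastState(mapStateDescriptor);
        broadcastState.put(value, value);
    }
}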
