Flink State Management

Flink's state management mechanism

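  • A Flink job can fail for reasons nobody can predict, and recovering it means finding the point where the previous run stopped. Flink's state mechanism exists to make that recovery possible.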
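To make the idea of "finding the point where the previous run stopped" concrete, here is a minimal plain-Java sketch. It is only a toy model, not Flink code: the class `CheckpointModel` and all of its methods are hypothetical names invented for illustration, and real Flink checkpoints are distributed and far more involved. The sketch just shows that saving the read position together with the state lets a job resume without losing or double-counting records.

```java
/** Toy model of checkpoint-based recovery; illustrative only, not Flink code. */
public class CheckpointModel {
    private final int[] input;      // the records to process
    private int position = 0;       // index of the next record to read
    private int sum = 0;            // running state built from the records
    private int savedPosition = 0;  // read position stored by the last checkpoint
    private int savedSum = 0;       // state stored by the last checkpoint

    public CheckpointModel(int[] input) {
        this.input = input;
    }

    /** Reads the next record and folds it into the state. */
    public void processNext() {
        sum += input[position];
        position++;
    }

    public boolean hasNext() {
        return position < input.length;
    }

    /** Saves the read position and the state together. */
    public void checkpoint() {
        savedPosition = position;
        savedSum = sum;
    }

    /** Simulates a restart after a failure: roll back to the last checkpoint. */
    public void restore() {
        position = savedPosition;
        sum = savedSum;
    }

    public int getSum() {
        return sum;
    }

    public static void main(String[] args) {
        CheckpointModel job = new CheckpointModel(new int[] {1, 2, 3, 4, 5});
        for (int i = 0; i < 3; i++) {
            job.processNext();
        }
        job.checkpoint();   // snapshot taken at position 3, sum 6
        job.processNext();  // sum is now 10, but this progress is not checkpointed
        job.restore();      // simulated failure: back to position 3, sum 6
        while (job.hasNext()) {
            job.processNext();
        }
        System.out.println(job.getSum()); // 15: every record counted exactly once
    }
}
```

Because the position and the sum are saved and restored as a pair, the record processed after the checkpoint is simply replayed after the restart instead of being counted twice.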

State in Flink comes in two kinds, operator state and keyed state. This post focuses on keyed state.

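Characteristics of keyed state:

  • It can only be used in operators on a KeyedStream (that is, processing operators that come after keyBy);
  • The operator binds a separate, independent copy of the state data to each key.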
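The "one independent state per key" point can be pictured with a plain-Java sketch. This is an assumption-laden model, not how Flink stores state internally: `KeyedStateModel` and `process` are hypothetical names, and a `HashMap` stands in for the state backend. Each key owns its own list, so records for one key never show up in another key's result.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of keyed list state; illustrative only, not Flink code. */
public class KeyedStateModel {
    // One independent list per key, loosely analogous to a ListState scoped to the current key.
    private final Map<String, List<String>> statePerKey = new HashMap<>();

    /** Appends the record to its key's state and returns that key's concatenated history. */
    public String process(String key, String record) {
        List<String> state = statePerKey.computeIfAbsent(key, k -> new ArrayList<>());
        state.add(record);
        return String.join("", state);
    }

    public static void main(String[] args) {
        KeyedStateModel model = new KeyedStateModel();
        // Each record is used as its own key here.
        System.out.println(model.process("a", "a")); // a
        System.out.println(model.process("b", "b")); // b  (key "b" does not see key "a")
        System.out.println(model.process("a", "a")); // aa (key "a" remembers its earlier record)
    }
}
```

In a real Flink job you never pass the key yourself: the runtime selects the state of the current record's key before calling your function.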

Code example

/**
 * @author syx
 * @date 2023/5/18 9:36
 */

package com.syxStudy;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.common.functions.RuntimeContext;
import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KeyedState {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);
        // Enable checkpointing every 2000 ms; EXACTLY_ONCE is also the default mode
        env.enableCheckpointing(2000, CheckpointingMode.EXACTLY_ONCE);
        // Set the path where checkpoints are persisted
        env.getCheckpointConfig().setCheckpointStorage("file:///D:\\idea\\checkpoint");
        // Configure restart on failure: at most 3 restart attempts, 5000 ms apart
        env.setRestartStrategy(RestartStrategies.fixedDelayRestart(3,5000));
        // Data source: read a text stream from a socket
        DataStreamSource<String> source = env.socketTextStream("192.168.xxx.xxx", 9999);

        source.print();

        // Key the stream by the record itself, then implement the logic in a map operator
        SingleOutputStreamOperator<String> stateCheckpoint = source.keyBy(s -> s).map(new RichMapFunction<String, String>() {
            ListState<String> listState;

            // open() is called once before any record is processed; obtain the state handle here
            @Override
            public void open(Configuration parameters) throws Exception {
                RuntimeContext runtimeContext = getRuntimeContext();
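                // Get a list-structured state
                listState = runtimeContext.getListState(new ListStateDescriptor<String>("lst", String.class));
                // Get a single-value state (takes a ValueStateDescriptor)
                //runtimeContext.getState()
                // Get a map-structured state (takes a MapStateDescriptor)
                //runtimeContext.getMapState()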
            }

            @Override
            public String map(String s) throws Exception {
                // Add the current record to the state of its key
                listState.add(s);
                // Read back everything stored for this key and concatenate it
                StringBuilder stringBuilder = new StringBuilder();
                for (String s1 : listState.get()) {
                    stringBuilder.append(s1);
                }
                return stringBuilder.toString();
            }
        });
        // Print the results
        stateCheckpoint.print();
        // Submit the job and start execution
        env.execute();
    }
}
