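All data that a task maintains and uses to compute a result constitutes that task's state.
You can think of task state as a local variable that the task's business logic can access.
Flink manages state for you, including state consistency, failure recovery, and efficient storage and access, so that developers can focus on application logic.
In Flink, state is always associated with a specific operator.
For the Flink runtime to know about an operator's state, the operator must register that state up front.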
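There are two types of state:
Operator State
Operator state is scoped to an operator task: all records processed by the same parallel task see the same state, and the state cannot be accessed from another task.
Keyed State
Keyed state is maintained and accessed per key, where the key is the one defined on the input stream.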
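Operator state comes in three forms:
List state
Represents state as a list of entries.
Union list state
Also represents state as a list of entries. It differs from regular list state in how the entries are restored after a failure or when the application is started from a savepoint.
Broadcast state
Intended for the special case where an operator has multiple tasks and every task holds identical state.
In practice, keyed state is used most of the time; operator state is used comparatively rarely.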
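The difference between list state and union list state only becomes visible when the operator is restored with a different parallelism. The following is a plain-Java sketch with no Flink dependency that models the two restore behaviors conceptually. The class `RedistributionSketch` and its methods are hypothetical, and the round-robin split is an assumption used for illustration, not necessarily how the Flink runtime assigns entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration: models how checkpointed list entries are handed back to tasks. */
class RedistributionSketch {

    /** Pools the entries of all old tasks into one logical list. */
    private static List<Integer> pool(List<List<Integer>> perTaskState) {
        List<Integer> pooled = new ArrayList<>();
        for (List<Integer> entries : perTaskState) {
            pooled.addAll(entries);
        }
        return pooled;
    }

    /** List state: the pooled entries are split so that each new task gets a disjoint share. */
    static List<List<Integer>> evenSplit(List<List<Integer>> perTaskState, int newParallelism) {
        List<Integer> pooled = pool(perTaskState);
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>());
        }
        // Assumption: round-robin assignment, chosen only to keep the sketch simple.
        for (int i = 0; i < pooled.size(); i++) {
            result.get(i % newParallelism).add(pooled.get(i));
        }
        return result;
    }

    /** Union list state: every new task receives the complete pooled list. */
    static List<List<Integer>> union(List<List<Integer>> perTaskState, int newParallelism) {
        List<Integer> pooled = pool(perTaskState);
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>(pooled));
        }
        return result;
    }

    public static void main(String[] args) {
        // Three old tasks, restored with parallelism 2.
        List<List<Integer>> before =
                Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3), Arrays.asList(4));
        System.out.println(evenSplit(before, 2)); // [[1, 3], [2, 4]]
        System.out.println(union(before, 2));     // [[1, 2, 3, 4], [1, 2, 3, 4]]
    }
}
```

With union list state, each task is expected to pick out the entries it actually needs from the full list, which is why it suits cases where the task-to-entry mapping is decided by the application rather than by the runtime.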
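To use operator state, the function needs to implement the ListCheckpointed interface (newer Flink releases deprecate it in favor of CheckpointedFunction).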
package com.root.state;

import com.root.SensorReading;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.checkpoint.ListCheckpointed;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import java.util.Collections;
import java.util.List;

/**
 * @author Kewei
 * @Date 2022/3/6 20:23
 */
public class StateTest3_OperatorState {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        // Each input line is expected to look like "id,timestamp,temperature"
        DataStreamSource<String> inputStream = env.socketTextStream("localhost", 7777);
        SingleOutputStreamOperator<SensorReading> dataStream = inputStream.map(line -> {
            String[] fields = line.split(",");
            return new SensorReading(fields[0], Long.valueOf(fields[1]), Double.valueOf(fields[2]));
        });

        // Count the records seen by this operator task
        SingleOutputStreamOperator<Integer> resultStream = dataStream.map(new MyCountMapper());
        resultStream.print();
        env.execute();
    }

    public static class MyCountMapper implements MapFunction<SensorReading, Integer>, ListCheckpointed<Integer> {
        // Operator state: one counter per parallel task, shared by all keys
        private Integer count = 0;

        @Override
        public Integer map(SensorReading sensorReading) throws Exception {
            count++;
            return count;
        }

        // Called on every checkpoint: hand the counter to Flink as a one-element list
        @Override
        public List<Integer> snapshotState(long checkpointId, long timestamp) throws Exception {
            return Collections.singletonList(count);
        }

        // Called on recovery: a task may receive several entries, so add them up
        @Override
        public void restoreState(List<Integer> state) throws Exception {
            for (Integer partialCount : state) {
                count += partialCount;
            }
        }
    }
}
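The reason restoreState adds up every entry instead of reading a single value is that, after rescaling, one task may be handed the snapshots of several former tasks. Below is a minimal plain-Java sketch of that idea. `CounterSnapshotSketch` and `CountingTask` are hypothetical stand-ins that mimic MyCountMapper without any Flink types, and the "pooling" step is a simplification of what the runtime does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of snapshot and restore for a per-task counter. */
class CounterSnapshotSketch {

    /** Stand-in for one parallel instance of MyCountMapper. */
    static class CountingTask {
        private int count = 0;

        int process() {
            return ++count;
        }

        /** Mirrors snapshotState: the counter is returned as a one-element list. */
        List<Integer> snapshot() {
            return Collections.singletonList(count);
        }

        /** Mirrors restoreState: every received entry is added to the counter. */
        void restore(List<Integer> entries) {
            for (int entry : entries) {
                count += entry;
            }
        }

        int count() {
            return count;
        }
    }

    /** Runs two tasks, snapshots both, and restores all entries into a single new task. */
    static int scaleDownToOne(int recordsForTaskA, int recordsForTaskB) {
        CountingTask a = new CountingTask();
        CountingTask b = new CountingTask();
        for (int i = 0; i < recordsForTaskA; i++) {
            a.process();
        }
        for (int i = 0; i < recordsForTaskB; i++) {
            b.process();
        }

        // Simplification: the "checkpoint" is just the concatenation of both snapshots.
        List<Integer> pooled = new ArrayList<>();
        pooled.addAll(a.snapshot());
        pooled.addAll(b.snapshot());

        CountingTask merged = new CountingTask();
        merged.restore(pooled);
        return merged.count();
    }

    public static void main(String[] args) {
        System.out.println(scaleDownToOne(3, 5)); // 8
    }
}
```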
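Note: keyed state is normally obtained in the operator's open() method, because the runtime context only becomes available once the function is running.
The function therefore has to extend a rich function class (RichFunction).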
package com.root.state;

import com.root.SensorReading;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.common.state.*;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

/**
 * @author Kewei
 * @Date 2022/3/6 20:38
 */
public class StateTest4_KeyedState {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        DataStreamSource<String> inputStream = env.socketTextStream("localhost", 7777);
        SingleOutputStreamOperator<SensorReading> dataStream = inputStream.map(line -> {
            String[] fields = line.split(",");
            return new SensorReading(fields[0], Long.valueOf(fields[1]), Double.valueOf(fields[2]));
        });

        SingleOutputStreamOperator<Integer> resultStream = dataStream
                .keyBy(SensorReading::getId) // keyed state requires a keyed stream
                .map(new MyMapper());
        resultStream.print();
        env.execute();
    }

    public static class MyMapper extends RichMapFunction<SensorReading, Integer> {
        // Obtaining state in a field initializer fails because the runtime context is not set yet:
        // Exception in thread "main" java.lang.IllegalStateException: The runtime context has not been initialized.
        // ValueState valueState = getRuntimeContext().getState(new ValueStateDescriptor("my-int", Integer.class));

        // Value state
        private ValueState<Integer> valueState;
        // List state
        private ListState<String> listState;
        // Map (key-value) state
        private MapState<String, Double> mapState;
        // Reducing (aggregating) state
        private ReducingState<SensorReading> readingReducingState;

        @Override
        public void open(Configuration par) throws Exception {
            // Obtain the state handles in open(), once the runtime context is available
            valueState = getRuntimeContext().getState(new ValueStateDescriptor<Integer>("value", Integer.class));
            listState = getRuntimeContext().getListState(new ListStateDescriptor<String>("list", String.class));
            mapState = getRuntimeContext().getMapState(new MapStateDescriptor<String, Double>("map", String.class, Double.class));
            // reduceFunction below is a placeholder for a ReduceFunction<SensorReading> you would supply
            // readingReducingState = getRuntimeContext().getReducingState(new ReducingStateDescriptor<>("reduce", reduceFunction, SensorReading.class));
        }

        @Override
        public Integer map(SensorReading value) throws Exception {
            // List state: iterate over the entries, then append one
            for (String s : listState.get()) {
                System.out.println(s);
            }
            listState.add("hello");

            // Map state: read, write, and remove entries
            mapState.get("1");
            mapState.put("2", 2.1);
            mapState.remove("2");

            // Reducing state: add a value
            // readingReducingState.add(value);

            // Value state: read the count for the current key (null on first access), then update it
            Integer count = valueState.value();
            count = count == null ? 0 : count;
            ++count;
            valueState.update(count);
            return count;
        }
    }
}
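A useful mental model for the value state above is a map from key to value, where the runtime selects the entry for the key of the record currently being processed. The plain-Java sketch below illustrates only that model. `KeyedValueStateSketch` and `PerKeyCounter` are hypothetical names, and a real state backend does considerably more (checkpointing, serialization, key-group partitioning).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration: value state behaves like one slot per key. */
class KeyedValueStateSketch {

    /** Stand-in for a ValueState<Integer> that is scoped by the current key. */
    static class PerKeyCounter {
        private final Map<String, Integer> slots = new HashMap<>();

        /** Same logic as MyMapper.map for the value state, with the key passed explicitly. */
        int increment(String currentKey) {
            Integer count = slots.get(currentKey); // like valueState.value(): null on first access
            count = (count == null) ? 0 : count;
            ++count;
            slots.put(currentKey, count);          // like valueState.update(count)
            return count;
        }
    }

    public static void main(String[] args) {
        PerKeyCounter counter = new PerKeyCounter();
        System.out.println(counter.increment("sensor_1")); // 1
        System.out.println(counter.increment("sensor_1")); // 2
        System.out.println(counter.increment("sensor_2")); // 1, a different key has its own slot
    }
}
```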
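Suppose we want a temperature alert: raise an alarm whenever two consecutive readings from the same sensor differ by more than 10 degrees Celsius. This example implements it with keyed state plus a FlatMap function.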
package com.root.state;

import com.root.SensorReading;
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

/**
 * @author Kewei
 * @Date 2022/3/6 21:02
 */
public class StateTest5_KeyedSateApplicationCase {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        DataStreamSource<String> inputStream = env.socketTextStream("localhost", 7777);
        SingleOutputStreamOperator<SensorReading> dataStream = inputStream.map(line -> {
            String[] fields = line.split(",");
            return new SensorReading(fields[0], Long.valueOf(fields[1]), Double.valueOf(fields[2]));
        });

        // Emits (sensor id, current temperature, previous temperature) for every jump above the threshold
        SingleOutputStreamOperator<Tuple3<String, Double, Double>> resultStream = dataStream
                .keyBy(SensorReading::getId)
                .flatMap(new MyFlatMapper(10.0));
        resultStream.print();
        env.execute();
    }

    public static class MyFlatMapper extends RichFlatMapFunction<SensorReading, Tuple3<String, Double, Double>> {
        // Temperature difference that triggers an alert
        private final Double th;

        public MyFlatMapper(Double th) {
            this.th = th;
        }

        // Value state that holds the previous temperature of the current key
        private ValueState<Double> lastTemp;

        @Override
        public void open(Configuration parameters) throws Exception {
            // Obtain the handle to the previous-temperature state
            lastTemp = getRuntimeContext().getState(new ValueStateDescriptor<Double>("last", Double.class));
        }

        @Override
        public void close() throws Exception {
            // Nothing to release here. Keyed state is scoped to the key of the record being processed,
            // so calling lastTemp.clear() in close() would at most affect the key that was processed last.
        }

        @Override
        public void flatMap(SensorReading value, Collector<Tuple3<String, Double, Double>> out) throws Exception {
            Double lastTemper = lastTemp.value();
            Double corTemp = value.getTemperature();

            // Skip the first reading of a key (no previous value), then compare the difference with the threshold
            if (lastTemper != null) {
                if (Math.abs(corTemp - lastTemper) > th) {
                    out.collect(new Tuple3<>(value.getId(), corTemp, lastTemper));
                }
            }

            // Remember the current temperature for the next reading of this key
            lastTemp.update(corTemp);
        }
    }
}
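To check the alert logic without starting a Flink job or a socket source, the same per-key comparison can be run over an in-memory list. The following is a minimal sketch under that assumption. `TemperatureJumpSketch`, `Reading`, and `detectJumps` are hypothetical names, a HashMap plays the role of the keyed value state, and the sample readings are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the per-sensor temperature jump check. */
class TemperatureJumpSketch {

    /** Minimal stand-in for SensorReading with only the fields the check needs. */
    static class Reading {
        final String id;
        final double temperature;

        Reading(String id, double temperature) {
            this.id = id;
            this.temperature = temperature;
        }
    }

    /** Returns "id,current,previous" for every reading that jumps by more than the threshold. */
    static List<String> detectJumps(List<Reading> readings, double threshold) {
        Map<String, Double> lastTempPerKey = new HashMap<>(); // plays the role of the keyed value state
        List<String> alerts = new ArrayList<>();
        for (Reading r : readings) {
            Double last = lastTempPerKey.get(r.id);
            // The first reading of a sensor has no previous value and never raises an alert
            if (last != null && Math.abs(r.temperature - last) > threshold) {
                alerts.add(r.id + "," + r.temperature + "," + last);
            }
            lastTempPerKey.put(r.id, r.temperature);
        }
        return alerts;
    }

    public static void main(String[] args) {
        List<Reading> readings = Arrays.asList(
                new Reading("sensor_1", 35.8),
                new Reading("sensor_1", 15.4),  // jump of 20.4 for sensor_1
                new Reading("sensor_6", 15.4),  // first reading for sensor_6
                new Reading("sensor_1", 20.0),  // change of 4.6, below the threshold
                new Reading("sensor_6", 30.0)); // jump of 14.6 for sensor_6
        System.out.println(detectJumps(readings, 10.0)); // [sensor_1,15.4,35.8, sensor_6,30.0,15.4]
    }
}
```

Note that the comparison is strict, as in the Flink version: a difference of exactly the threshold does not raise an alert.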