A Look at Flink's Managed Keyed State

This article walks through the interfaces behind Flink's Managed Keyed State.

State

flink-core-1.7.0-sources.jar!/org/apache/flink/api/common/state/State.java

/**
 * Interface that different types of partitioned state must implement.
 *
 * <p>The state is only accessible by functions applied on a {@code KeyedStream}. The key is
 * automatically supplied by the system, so the function always sees the value mapped to the
 * key of the current element. That way, the system can handle stream and state partitioning
 * consistently together.
 */
@PublicEvolving
public interface State {

    /**
     * Removes the value mapped under the current key.
     */
    void clear();
}

  • State is the interface that every type of keyed state must implement; it defines a single method, clear

ValueState

flink-core-1.7.0-sources.jar!/org/apache/flink/api/common/state/ValueState.java

@PublicEvolving
public interface ValueState<T> extends State {

    /**
     * Returns the current value for the state. When the state is not
     * partitioned the returned value is the same for all inputs in a given
     * operator instance. If state partitioning is applied, the value returned
     * depends on the current operator input, as the operator maintains an
     * independent state for each partition.
     *
     * <p>If you didn't specify a default value when creating the {@link ValueStateDescriptor}
     * this will return {@code null} when to value was previously set using {@link #update(Object)}.
     *
     * @return The state value corresponding to the current input.
     *
     * @throws IOException Thrown if the system cannot access the state.
     */
    T value() throws IOException;

    /**
     * Updates the operator state accessible by {@link #value()} to the given
     * value. The next time {@link #value()} is called (for the same state
     * partition) the returned state will represent the updated value. When a
     * partitioned state is updated with null, the state for the current key
     * will be removed and the default value is returned on the next access.
     *
     * @param value The new value for the state.
     *
     * @throws IOException Thrown if the system cannot access the state.
     */
    void update(T value) throws IOException;
}

  • ValueState extends the State interface and defines two methods: value reads the current value and update writes a new one
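
To make the value/update contract quoted above concrete, here is a minimal in-memory sketch in plain Java. It is a toy model and not Flink's implementation: ToyValueState, setCurrentKey and the backing HashMap are hypothetical names introduced only for illustration. In Flink the runtime supplies the current key automatically. The sketch shows that each key sees its own value, and that updating with null removes the entry so the next read falls back to the default.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy, in-memory model of the ValueState contract (hypothetical; not Flink code). */
class ToyValueState<T> {
    private final Map<String, T> valuesByKey = new HashMap<>();
    private final T defaultValue; // may be null, as when no default is configured
    private String currentKey;    // assumption: in Flink the runtime sets this for you

    ToyValueState(T defaultValue) {
        this.defaultValue = defaultValue;
    }

    void setCurrentKey(String key) {
        this.currentKey = key;
    }

    /** Returns the value stored for the current key, or the default if none is stored. */
    T value() {
        return valuesByKey.getOrDefault(currentKey, defaultValue);
    }

    /** Stores the value for the current key; a null argument removes the entry. */
    void update(T value) {
        if (value == null) {
            valuesByKey.remove(currentKey);
        } else {
            valuesByKey.put(currentKey, value);
        }
    }

    /** Mirrors State.clear(): removes the value mapped under the current key. */
    void clear() {
        valuesByKey.remove(currentKey);
    }

    public static void main(String[] args) {
        ToyValueState<Long> count = new ToyValueState<>(null);
        count.setCurrentKey("a");
        count.update(1L);
        count.setCurrentKey("b");
        System.out.println(count.value()); // key "b" has nothing stored yet
        count.setCurrentKey("a");
        System.out.println(count.value()); // key "a" still holds its own value
    }
}
```

The point of the sketch is the scoping: the same state object returns different values depending on which key is current, which is what the State Javadoc means by "the value mapped to the key of the current element".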

AppendingState

flink-core-1.7.0-sources.jar!/org/apache/flink/api/common/state/AppendingState.java

@PublicEvolving
public interface AppendingState<IN, OUT> extends State {

    /**
     * Returns the current value for the state. When the state is not
     * partitioned the returned value is the same for all inputs in a given
     * operator instance. If state partitioning is applied, the value returned
     * depends on the current operator input, as the operator maintains an
     * independent state for each partition.
     *
     * <p>NOTE TO IMPLEMENTERS: if the state is empty, then this method
     * should return {@code null}.
     *
     * @return The operator state value corresponding to the current input or {@code null}
     * if the state is empty.
     *
     * @throws Exception Thrown if the system cannot access the state.
     */
    OUT get() throws Exception;

    /**
     * Updates the operator state accessible by {@link #get()} by adding the given value
     * to the list of values. The next time {@link #get()} is called (for the same state
     * partition) the returned state will represent the updated list.
     *
     * <p>If null is passed in, the state value will remain unchanged.
     *
     * @param value The new value for the state.
     *
     * @throws Exception Thrown if the system cannot access the state.
     */
    void add(IN value) throws Exception;
}

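  • AppendingState extends the State interface and defines the get and add methods; it takes two type parameters, IN (the type added) and OUT (the type returned)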

FoldingState

flink-core-1.7.0-sources.jar!/org/apache/flink/api/common/state/FoldingState.java

@PublicEvolving
@Deprecated
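public interface FoldingState<T, ACC> extends AppendingState<T, ACC> {}

  • FoldingState extends AppendingState, with the OUT type parameter bound to ACC, the accumulated value; FoldingState has been deprecated since Flink 1.4 and will be removed in a future release, and AggregatingState can be used instead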

MergingState

flink-core-1.7.0-sources.jar!/org/apache/flink/api/common/state/MergingState.java

/**
 * Extension of {@link AppendingState} that allows merging of state. That is, two instances
 * of {@link MergingState} can be combined into a single instance that contains all the
 * information of the two merged states.
 *
 * @param <IN> Type of the value that can be added to the state.
 * @param <OUT> Type of the value that can be retrieved from the state.
 */
@PublicEvolving
public interface MergingState<IN, OUT> extends AppendingState<IN, OUT> { }

  • MergingState extends AppendingState and declares no methods of its own; the name simply marks state whose instances can be merged. Its sub-interfaces are ListState, ReducingState and AggregatingState

ListState

flink-core-1.7.0-sources.jar!/org/apache/flink/api/common/state/ListState.java

@PublicEvolving
public interface ListState<T> extends MergingState<T, Iterable<T>> {

    /**
     * Updates the operator state accessible by {@link #get()} by updating existing values to
     * to the given list of values. The next time {@link #get()} is called (for the same state
     * partition) the returned state will represent the updated list.
     *
     * <p>If null or an empty list is passed in, the state value will be null.
     *
     * @param values The new values for the state.
     *
     * @throws Exception The method may forward exception thrown internally (by I/O or functions).
     */
    void update(List<T> values) throws Exception;

    /**
     * Updates the operator state accessible by {@link #get()} by adding the given values
     * to existing list of values. The next time {@link #get()} is called (for the same state
     * partition) the returned state will represent the updated list.
     *
     * <p>If null or an empty list is passed in, the state value remains unchanged.
     *
     * @param values The new values to be added to the state.
     *
     * @throws Exception The method may forward exception thrown internally (by I/O or functions).
     */
    void addAll(List<T> values) throws Exception;
}

  • ListState extends MergingState with OUT fixed to Iterable<T>; operators use it to store partitioned list state. It declares two methods of its own: update replaces the whole state, and clears it if the argument is null or empty; addAll appends the given values, and leaves the state unchanged if the argument is null or empty
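
The difference between update (full replacement) and addAll (incremental append) can be sketched with a small plain-Java model. This is a toy for a single key and not Flink's implementation: ToyListState is a hypothetical class, and returning null for empty state follows the Javadoc quoted above rather than any particular state backend.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the ListState contract for one key (hypothetical; not Flink code). */
class ToyListState<T> {
    private List<T> values; // null means "no state", following the quoted Javadoc

    /** Returns the stored elements, or null if the state is empty. */
    Iterable<T> get() {
        return values;
    }

    /** Appends one element; a null argument leaves the state unchanged. */
    void add(T value) {
        if (value == null) {
            return;
        }
        if (values == null) {
            values = new ArrayList<>();
        }
        values.add(value);
    }

    /** Full replacement: null or an empty list clears the state. */
    void update(List<T> newValues) {
        if (newValues == null || newValues.isEmpty()) {
            values = null;
        } else {
            values = new ArrayList<>(newValues);
        }
    }

    /** Incremental append: null or an empty list leaves the state unchanged. */
    void addAll(List<T> moreValues) {
        if (moreValues == null || moreValues.isEmpty()) {
            return;
        }
        if (values == null) {
            values = new ArrayList<>();
        }
        values.addAll(moreValues);
    }

    public static void main(String[] args) {
        ToyListState<String> state = new ToyListState<>();
        state.update(Arrays.asList("a", "b")); // replaces whatever was there
        state.addAll(Arrays.asList("c"));      // appends to the existing list
        System.out.println(state.get());
        state.update(new ArrayList<>());       // an empty list clears the state
        System.out.println(state.get());
    }
}
```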

ReducingState

flink-core/1.7.0/flink-core-1.7.0-sources.jar!/org/apache/flink/api/common/state/ReducingState.java

@PublicEvolving
public interface ReducingState<T> extends MergingState<T, T> {}

  • ReducingState extends MergingState with IN and OUT bound to the same type T
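
Because IN and OUT are the same type, a reducing state can be modelled as a single running value that each add folds into. The sketch below is a plain-Java toy for one key, not Flink's implementation: ToyReducingState is hypothetical, and java.util.function.BinaryOperator is used as a stand-in for Flink's ReduceFunction.

```java
import java.util.function.BinaryOperator;

/** Toy model of the ReducingState contract for one key (hypothetical; not Flink code). */
class ToyReducingState<T> {
    private final BinaryOperator<T> reducer; // stand-in for Flink's ReduceFunction
    private T current;                       // null while the state is empty

    ToyReducingState(BinaryOperator<T> reducer) {
        this.reducer = reducer;
    }

    /** Returns the reduced value, or null if nothing has been added. */
    T get() {
        return current;
    }

    /** Folds the new value into the running result; null leaves the state unchanged. */
    void add(T value) {
        if (value == null) {
            return;
        }
        current = (current == null) ? value : reducer.apply(current, value);
    }

    void clear() {
        current = null;
    }

    public static void main(String[] args) {
        ToyReducingState<Integer> sum = new ToyReducingState<>(Integer::sum);
        sum.add(1);
        sum.add(2);
        sum.add(3);
        System.out.println(sum.get()); // running sum of everything added so far
    }
}
```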

AggregatingState

flink-core/1.7.0/flink-core-1.7.0-sources.jar!/org/apache/flink/api/common/state/AggregatingState.java

@PublicEvolving
public interface AggregatingState<IN, OUT> extends MergingState<IN, OUT> {}

  • AggregatingState extends MergingState; unlike ReducingState, its IN and OUT types may differ
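
A typical case where IN and OUT differ is an average: Long values go in and a Double comes out, with an intermediate accumulator in between. The following plain-Java sketch models that for a single key. It is not Flink's implementation: ToyAggregateFunction is a simplified, hypothetical stand-in for Flink's AggregateFunction (its merge method is left out), and ToyAggregatingState and AverageFunction are likewise illustrative names.

```java
/** Simplified stand-in for Flink's AggregateFunction (hypothetical; merge omitted). */
interface ToyAggregateFunction<IN, ACC, OUT> {
    ACC createAccumulator();

    ACC add(IN value, ACC accumulator);

    OUT getResult(ACC accumulator);
}

/** Toy model of the AggregatingState contract for one key (not Flink code). */
class ToyAggregatingState<IN, ACC, OUT> {
    private final ToyAggregateFunction<IN, ACC, OUT> function;
    private ACC accumulator; // null while the state is empty

    ToyAggregatingState(ToyAggregateFunction<IN, ACC, OUT> function) {
        this.function = function;
    }

    /** Returns the aggregated result, or null if nothing has been added. */
    OUT get() {
        return accumulator == null ? null : function.getResult(accumulator);
    }

    /** Folds the input into the accumulator; null leaves the state unchanged. */
    void add(IN value) {
        if (value == null) {
            return;
        }
        if (accumulator == null) {
            accumulator = function.createAccumulator();
        }
        accumulator = function.add(value, accumulator);
    }
}

/** Averages Long inputs into a Double; the accumulator holds {sum, count}. */
class AverageFunction implements ToyAggregateFunction<Long, long[], Double> {
    @Override
    public long[] createAccumulator() {
        return new long[] {0L, 0L};
    }

    @Override
    public long[] add(Long value, long[] acc) {
        acc[0] += value; // running sum
        acc[1]++;        // element count
        return acc;
    }

    @Override
    public Double getResult(long[] acc) {
        return (double) acc[0] / acc[1];
    }
}
```

Here add accepts Long while get returns Double, which a ReducingState cannot express because its input and output share one type.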

MapState

flink-core-1.7.0-sources.jar!/org/apache/flink/api/common/state/MapState.java

@PublicEvolving
public interface MapState<UK, UV> extends State {

    /**
     * Returns the current value associated with the given key.
     *
     * @param key The key of the mapping
     * @return The value of the mapping with the given key
     *
     * @throws Exception Thrown if the system cannot access the state.
     */
    UV get(UK key) throws Exception;

    /**
     * Associates a new value with the given key.
     *
     * @param key The key of the mapping
     * @param value The new value of the mapping
     *
     * @throws Exception Thrown if the system cannot access the state.
     */
    void put(UK key, UV value) throws Exception;

    /**
     * Copies all of the mappings from the given map into the state.
     *
     * @param map The mappings to be stored in this state
     *
     * @throws Exception Thrown if the system cannot access the state.
     */
    void putAll(Map<UK, UV> map) throws Exception;

    /**
     * Deletes the mapping of the given key.
     *
     * @param key The key of the mapping
     *
     * @throws Exception Thrown if the system cannot access the state.
     */
    void remove(UK key) throws Exception;

    /**
     * Returns whether there exists the given mapping.
     *
     * @param key The key of the mapping
     * @return True if there exists a mapping whose key equals to the given key
     *
     * @throws Exception Thrown if the system cannot access the state.
     */
    boolean contains(UK key) throws Exception;

    /**
     * Returns all the mappings in the state.
     *
     * @return An iterable view of all the key-value pairs in the state.
     *
     * @throws Exception Thrown if the system cannot access the state.
     */
    Iterable<Map.Entry<UK, UV>> entries() throws Exception;

    /**
     * Returns all the keys in the state.
     *
     * @return An iterable view of all the keys in the state.
     *
     * @throws Exception Thrown if the system cannot access the state.
     */
    Iterable<UK> keys() throws Exception;

    /**
     * Returns all the values in the state.
     *
     * @return An iterable view of all the values in the state.
     *
     * @throws Exception Thrown if the system cannot access the state.
     */
    Iterable<UV> values() throws Exception;

    /**
     * Iterates over all the mappings in the state.
     *
     * @return An iterator over all the mappings in the state
     *
     * @throws Exception Thrown if the system cannot access the state.
     */
    Iterator<Map.Entry<UK, UV>> iterator() throws Exception;
}
  • MapState extends State directly; it takes two type parameters, UK and UV, the types of the map's keys and values
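
Note that UK is the key inside the map, not the key of the stream: every stream key gets its own independent map. The plain-Java sketch below models that two-level structure. It is a toy and not Flink's implementation: ToyMapState, setCurrentKey and the nested HashMap are hypothetical, and only a subset of the MapState methods is modelled.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the MapState contract (hypothetical; not Flink code). */
class ToyMapState<UK, UV> {
    // Outer map: one entry per stream key. Inner map: the user-visible map state.
    private final Map<String, Map<UK, UV>> mapsByKey = new HashMap<>();
    private String currentKey; // assumption: in Flink the runtime sets this for you

    void setCurrentKey(String key) {
        this.currentKey = key;
    }

    UV get(UK userKey) {
        Map<UK, UV> map = mapsByKey.get(currentKey);
        return map == null ? null : map.get(userKey);
    }

    void put(UK userKey, UV value) {
        mapsByKey.computeIfAbsent(currentKey, k -> new HashMap<>()).put(userKey, value);
    }

    boolean contains(UK userKey) {
        Map<UK, UV> map = mapsByKey.get(currentKey);
        return map != null && map.containsKey(userKey);
    }

    void remove(UK userKey) {
        Map<UK, UV> map = mapsByKey.get(currentKey);
        if (map != null) {
            map.remove(userKey);
        }
    }

    /** Returns the entries stored under the current stream key (empty if none). */
    Iterable<Map.Entry<UK, UV>> entries() {
        Map<UK, UV> map = mapsByKey.get(currentKey);
        if (map == null) {
            return Collections.emptyList();
        }
        return map.entrySet();
    }

    /** Mirrors State.clear(): drops the whole map of the current stream key. */
    void clear() {
        mapsByKey.remove(currentKey);
    }
}
```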

Summary

  • Flink provides several kinds of Managed Keyed State: ValueState, ListState, ReducingState, AggregatingState, FoldingState and MapState
  • ValueState and MapState extend the State interface directly; FoldingState extends AppendingState (which extends State directly); ListState, ReducingState and AggregatingState extend MergingState (which extends AppendingState)
  • FoldingState has been deprecated since Flink 1.4 and will be removed in a future release; AggregatingState can be used instead

doc

  • Using Managed Keyed State
