Flink State Management

State and Fault Tolerance (key topic)

Stateful operations and operators need to store intermediate computation state while processing DataStream elements or events, which makes state a central piece of fine-grained computation in Flink:

  • Recording the state of the data from some point in the past up to the current moment.
  • When aggregating events per minute/hour/day, state holds the pending aggregates.
  • When training a machine learning model, state holds the current version of the model parameters.

Flink manages state so that checkpoints and savepoints can provide state fault tolerance. When the scale of the computation changes, Flink can automatically redistribute state across the parallel instances. Under the hood, Flink stores computation state through a StateBackend, which determines how and where state is stored.

Flink divides all manageable state into Keyed State and Operator State. Keyed State is bound one-to-one to a key and can only be used on a KeyedStream; all state operations on non-keyed streams are Operator State. Because any given key lands in exactly one operator instance, Keyed State can be thought of as state scoped to a (key, operator instance) pair. Internally, Flink organizes Keyed State through the Key Group mechanism, and a keyed operator may need to interact with one or more Key Groups when it performs state operations.

When Flink redistributes keyed state, it does not do so key by key: the Key Group is the smallest unit of redistribution.
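
Conceptually, the assignment of keys to Key Groups and of Key Groups to operator instances works like the following sketch (a simplified illustration only; the real logic lives in Flink's KeyGroupRangeAssignment and additionally applies a murmur hash):

object KeyGroupSketch {
  // Simplified: Flink murmur-hashes key.hashCode before taking the modulo.
  def keyGroupFor(key: Any, maxParallelism: Int): Int =
    math.abs(key.hashCode % maxParallelism)

  // Key groups are assigned to operator instances as contiguous ranges.
  def operatorIndexFor(keyGroup: Int, maxParallelism: Int, parallelism: Int): Int =
    keyGroup * parallelism / maxParallelism

  def main(args: Array[String]): Unit = {
    val maxParallelism = 128 // illustrative values
    val parallelism = 4
    for (key <- Seq("this", "is", "flink")) {
      val kg = keyGroupFor(key, maxParallelism)
      println(s"key=$key keyGroup=$kg operatorIndex=${operatorIndexFor(kg, maxParallelism, parallelism)}")
    }
  }
}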

Operator State (also called non-keyed state) is bound to one parallel operator instance. Both Keyed State and Operator State exist in two forms: managed and raw. All built-in Flink operators support managed state, whereas raw state is only used in user-defined operators and does not support redistribution when the parallelism changes. So although Flink supports raw state, managed state is what is used in the vast majority of scenarios.

Keyed State

The keyed-state interfaces provide access to different types of state, all scoped to the key of the current input element.

Each state type, what it stores, and its methods:

  • ValueState: holds a single value that can be updated. Methods: update(T), T value(), clear()
  • ListState: holds a list of elements. Methods: add(T), addAll(List), Iterable get(), update(List), clear()
  • ReducingState: holds a single value representing the aggregation of all values added to the state; requires a user-provided ReduceFunction. Methods: add(T), T get(), clear()
  • AggregatingState: holds a single value representing the aggregation of all values added to the state; requires a user-provided AggregateFunction. Methods: add(IN), T get(), clear()
  • FoldingState: holds a single value representing the aggregation of all values added to the state; requires a user-provided FoldFunction. Methods: add(IN), T get(), clear()
  • MapState: holds a map. Methods: put(UK, UV), putAll(Map), entries(), keys(), values(), clear()

ValueState

import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.scala._

var fsEnv=StreamExecutionEnvironment.getExecutionEnvironment

fsEnv.socketTextStream("CentOS",9999)
.flatMap(_.split("\\s+"))
.map((_,1))
.keyBy(0)
.map(new RichMapFunction[(String,Int),(String,Int)] {
    var vs:ValueState[Int]=_
    override def open(parameters: Configuration): Unit = {
        val vsd=new ValueStateDescriptor[Int]("valueCount",createTypeInformation[Int])
        vs=getRuntimeContext.getState[Int](vsd)
    }
    override def map(value: (String, Int)): (String, Int) = {
        val historyCount = vs.value()        // previously accumulated count for this key
        val currentCount = historyCount + value._2
        vs.update(currentCount)              // write the new count back into state
        (value._1,currentCount)
    }
}).print()

fsEnv.execute("wordcount")
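
The table above also lists ReducingState, which is not demonstrated elsewhere in this article; a minimal sketch in the same style (same socket source as above, with a user-supplied ReduceFunction that sums the per-key counts):

import org.apache.flink.api.common.functions.{ReduceFunction, RichMapFunction}
import org.apache.flink.api.common.state.{ReducingState, ReducingStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.scala._

object ReducingStateDemo {
  def main(args: Array[String]): Unit = {
    val fsEnv = StreamExecutionEnvironment.getExecutionEnvironment
    fsEnv.socketTextStream("CentOS", 9999)
      .flatMap(_.split("\\s+"))
      .map((_, 1))
      .keyBy(0)
      .map(new RichMapFunction[(String, Int), (String, Int)] {
        var vs: ReducingState[Int] = _
        override def open(parameters: Configuration): Unit = {
          // ReducingState keeps one value per key, folded by the ReduceFunction
          val rsd = new ReducingStateDescriptor[Int]("wordCount",
            new ReduceFunction[Int] {
              override def reduce(v1: Int, v2: Int): Int = v1 + v2
            },
            createTypeInformation[Int])
          vs = getRuntimeContext.getReducingState(rsd)
        }
        override def map(value: (String, Int)): (String, Int) = {
          vs.add(value._2)      // the ReduceFunction merges this into the stored value
          (value._1, vs.get())  // current running sum for this key
        }
      }).print()
    fsEnv.execute("wordcount")
  }
}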

AggregatingState

import org.apache.flink.api.common.functions.{AggregateFunction, RichMapFunction}
import org.apache.flink.api.common.state.{AggregatingState, AggregatingStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.scala._

var fsEnv=StreamExecutionEnvironment.getExecutionEnvironment

fsEnv.socketTextStream("CentOS",9999)
.map(_.split("\\s+"))
.map(ts=>(ts(0),ts(1).toInt))
.keyBy(0)
.map(new RichMapFunction[(String,Int),(String,Double)] {
    var vs:AggregatingState[Int,Double]=_
    override def open(parameters: Configuration): Unit = {
        val vsd=new AggregatingStateDescriptor[Int,(Double,Int),Double]("avgCount",new AggregateFunction[Int,(Double,Int),Double] {
            // the accumulator is (running sum, element count)
            override def createAccumulator(): (Double, Int) = {
                (0.0,0)
            }
            override def add(value: Int, accumulator: (Double, Int)): (Double, Int) = {
                (accumulator._1+value,accumulator._2+1)
            }
            override def merge(a: (Double, Int), b: (Double, Int)): (Double, Int) = {
                (a._1+b._1,a._2+b._2)
            }
            override def getResult(accumulator: (Double, Int)): Double = {
                accumulator._1/accumulator._2
            }
        },createTypeInformation[(Double,Int)])
        vs=getRuntimeContext.getAggregatingState(vsd)
    }
    override def map(value: (String, Int)): (String, Double) = {
        vs.add(value._2)        // feed the new value into the aggregate
        val avgCount=vs.get()   // current running average for this key
        (value._1,avgCount)
    }
}).print()

fsEnv.execute("wordcount")

MapState

package com.hw.demo04

import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.common.state.{MapState, MapStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.streaming.api.scala._
import scala.collection.JavaConverters._

// Assumed shape of the login record; the field order is inferred from how it is used below.
case class Login(id:Int,name:String,ip:String,city:String,time:String)

/**
  * @author fql
  * @date 2019/10/16 19:41
  */
object MapStateDemo { // renamed so the object does not shadow the imported MapState type
  def main(args: Array[String]): Unit = {

    val fsEnv = StreamExecutionEnvironment.getExecutionEnvironment
    fsEnv.socketTextStream("CentOS",9999)
      .map(_.split("\\s+"))
      .map(ts=>Login(ts(0).toInt,ts(1),ts(2),ts(3),ts(4)))
      .keyBy("id","name")
      .map(new RichMapFunction[Login,String] {
        var vs:MapState[String,String]=_
        override def open(parameters: Configuration): Unit = {
          val msd=new MapStateDescriptor[String,String]("mapstate",createTypeInformation[String],createTypeInformation[String])
          vs=getRuntimeContext.getMapState(msd)
        }
        override def map(value: Login): String = {
          println("login history")
          for(k<- vs.keys().asScala){
            println(k+" "+vs.get(k))
          }
          var result=""
          if(vs.keys().iterator().asScala.isEmpty){ // first login seen for this key
            result="ok"
          }else{
            if(!value.city.equalsIgnoreCase(vs.get("city"))){ // city changed since last login
              result="error"
            }else{
              result="ok"
            }
          }
          // update the stored state
          vs.put("ip",value.ip)
          vs.put("city",value.city)
          vs.put("time",value.time)
          result
        }
      }).print()

    fsEnv.execute("wordCount")
  }
}

Summary

new Rich[Map|FlatMap]Function {
    var vs:XxxState=_ // declare the state handle
    override def open(parameters: Configuration): Unit = {
        val xxd=new XxxStateDescriptor(...) // initialize the state
        vs=getRuntimeContext.getXxxState(xxd)
    }
    override def xxx(value: Xx): Xxx = {
       // operate on the state
    }
}
  • ValueState getState(ValueStateDescriptor)
  • ReducingState getReducingState(ReducingStateDescriptor)
  • ListState getListState(ListStateDescriptor)
  • AggregatingState getAggregatingState(AggregatingStateDescriptor)
  • FoldingState getFoldingState(FoldingStateDescriptor)
  • MapState getMapState(MapStateDescriptor)

State Time-To-Live (TTL)

Basic usage

A time-to-live (TTL) can be assigned to keyed state of any type. If a TTL is configured and a state value has expired, Flink cleans up the stored value on a best-effort basis.

import org.apache.flink.api.common.state.StateTtlConfig
import org.apache.flink.api.common.state.ValueStateDescriptor
import org.apache.flink.api.common.time.Time

val ttlConfig = StateTtlConfig
    .newBuilder(Time.seconds(1))
    .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
    .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
    .build
    
val stateDescriptor = new ValueStateDescriptor[String]("text state", classOf[String])
stateDescriptor.enableTimeToLive(ttlConfig)
  • Example
package com.hw.demo04
import java.util.Properties
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.api.common.state.StateTtlConfig.{StateVisibility, UpdateType}
import org.apache.flink.api.common.state.{StateTtlConfig, ValueState, ValueStateDescriptor}
import org.apache.flink.api.common.time.Time
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer
import org.apache.flink.streaming.api.scala._
/**
  * @author fql
  * @date 2019/10/16 20:32
  */
object TTL {
  def main(args: Array[String]): Unit = {
    val fsEnv = StreamExecutionEnvironment.getExecutionEnvironment
    val props = new Properties()
    props.setProperty("bootstrap.servers", "CentOS:9092")
    props.setProperty("group.id", "g1")
    val lines=fsEnv.addSource(new FlinkKafkaConsumer("topic01",new SimpleStringSchema(),props))
      .flatMap(_.split("\\s+"))
      .map((_,1))
      .keyBy(0)
      .map(new RichMapFunction[(String,Int),(String,Int)] {
        var vs:ValueState[Int]=_
        override def open(parameters: Configuration): Unit = {
          val vsd = new ValueStateDescriptor[Int]("valueCount", createTypeInformation[Int])

          val ttlconfig = StateTtlConfig.newBuilder(Time.seconds(5)) // expire after 5s
            .setUpdateType(UpdateType.OnCreateAndWrite)              // refresh the TTL on create and write
            .setStateVisibility(StateVisibility.NeverReturnExpired)  // never return expired values
            .build()

          vsd.enableTimeToLive(ttlconfig)
          vs = getRuntimeContext.getState[Int](vsd)
        }
        override def map(value: (String, Int)): (String, Int) = {
          val historyCount = vs.value()
          val currentCount = historyCount + value._2
          vs.update(currentCount)
          (value._1,currentCount)
        }
      }).print()

    fsEnv.execute("wordcount")
  }
}

Note: once TTL is enabled, the system spends extra memory storing a timestamp (processing time) alongside each state entry. Also, if a job previously ran without TTL and the code is modified to enable TTL before a restart, state restore will fail to start, throwing a compatibility failure and a StateMigrationException.

Cleaning up Expired State

By default, expired values are removed only when they are explicitly read, e.g. by calling ValueState.value(). This means that, by default, state that is never read after expiring is never deleted, which can lead to unbounded state growth.
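
If stale reads are acceptable until the value is physically removed, the visibility can be relaxed; a small sketch of that alternative setting:

import org.apache.flink.api.common.state.StateTtlConfig
import org.apache.flink.api.common.time.Time

val ttlConfig = StateTtlConfig
    .newBuilder(Time.seconds(1))
    // return the value on read even if it has expired, as long as it has not been cleaned up yet
    .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
    .build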

Cleanup in full snapshot

When a full snapshot (checkpoint or savepoint) is taken, expired entries are excluded from it, so a restore from that snapshot starts without the expired data. This does not shrink the state stored locally; that data is only replaced at the next checkpoint. It also does not solve the problem of automatically removing expired, never-accessed data at runtime.

import org.apache.flink.api.common.state.StateTtlConfig
import org.apache.flink.api.common.time.Time

val ttlConfig = StateTtlConfig
    .newBuilder(Time.seconds(1))
    .cleanupFullSnapshot
    .build

This only applies to the memory and fs state backend implementations; it is not supported by the RocksDB state backend.

Cleanup in background

A background cleanup strategy can be enabled; the default strategy depends on the state backend implementation (different backends clean up differently).

import org.apache.flink.api.common.state.StateTtlConfig
val ttlConfig = StateTtlConfig
    .newBuilder(Time.seconds(1))
    .cleanupInBackground
    .build

Incremental cleanup (heap-based backends)

import org.apache.flink.api.common.state.StateTtlConfig
import org.apache.flink.api.common.state.StateTtlConfig.{StateVisibility, UpdateType}
import org.apache.flink.api.common.time.Time

val ttlConfig = StateTtlConfig.newBuilder(Time.seconds(5))
              .setUpdateType(UpdateType.OnCreateAndWrite)
              .setStateVisibility(StateVisibility.NeverReturnExpired)
              .cleanupIncrementally(100,true) // defaults: 5 | false
              .build()

The first parameter is the number of state entries checked each time cleanup is triggered; a cleanup pass is triggered on every state access. If the second parameter is true, cleanup is additionally triggered for every record the operator receives, even when the state is not accessed.

RocksDB compaction

RocksDB (a key-value store) compacts state asynchronously in the background, merging entries with the same key to reduce state file size, but it does not clean up expired state on its own. A compaction filter can therefore be configured so that RocksDB drops expired state during compaction. This filtering feature is disabled by default; it can be enabled in flink-conf.yaml with state.backend.rocksdb.ttl.compaction.filter.enabled: true or through the API RocksDBStateBackend::enableTtlCompactionFilter.


import org.apache.flink.api.common.state.StateTtlConfig
import org.apache.flink.api.common.state.StateTtlConfig.{StateVisibility, UpdateType}
import org.apache.flink.api.common.time.Time

val ttlConfig = StateTtlConfig.newBuilder(Time.seconds(5))
              .setUpdateType(UpdateType.OnCreateAndWrite)
              .setStateVisibility(StateVisibility.NeverReturnExpired)
              .cleanupInRocksdbCompactFilter(1000) // default is 1000
              .build()

Here 1000 means that during compaction the filter re-checks the current time after processing every 1000 entries and uses it to decide whether entries have expired; expired entries are dropped.

Operator State

To use Operator State, implement the generic CheckpointedFunction interface or the ListCheckpointed interface. Note that operator state currently only supports list-style state: the stored state must be a List, and its elements must be serializable.

CheckpointedFunction

CheckpointedFunction offers two state redistribution schemes: even-split and union.

void snapshotState(FunctionSnapshotContext context) throws Exception;
void initializeState(FunctionInitializationContext context) throws Exception;
  • snapshotState(): when a checkpoint is taken, the system calls snapshotState so the function can snapshot its state.
  • initializeState(): called on the first start, or when recovering from previous state.

Even-split: on failure recovery, the elements of the operator state are split evenly across all operator instances; each instance receives a sub-list of the data.

Union: on failure recovery, every operator instance receives the complete operator state. The getter called in initializeState chooses the scheme, as the sketch below shows.
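
A minimal sketch of choosing between the two schemes (method body only, inside a CheckpointedFunction; only one of the two getters would be used for a given state name):

import org.apache.flink.api.common.state.ListStateDescriptor
import org.apache.flink.runtime.state.FunctionInitializationContext
import org.apache.flink.streaming.api.scala._

def initializeState(context: FunctionInitializationContext): Unit = {
  val lsd = new ListStateDescriptor[(String, Int)]("buffered-elements",
    createTypeInformation[(String, Int)])
  // even-split: each parallel instance receives a sub-list on restore
  val listState = context.getOperatorStateStore.getListState(lsd)
  // union: each parallel instance would receive the complete list on restore
  // val unionState = context.getOperatorStateStore.getUnionListState(lsd)
}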

Example:

package com.hw.demo05

import org.apache.flink.api.common.state.{ListState, ListStateDescriptor}
import org.apache.flink.runtime.state.{FunctionInitializationContext, FunctionSnapshotContext}
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction
import org.apache.flink.streaming.api.functions.sink.SinkFunction
import org.apache.flink.streaming.api.scala._
import scala.collection.mutable.ListBuffer
import scala.collection.JavaConverters._
/**
  * @author fql
  * @date 2019/10/17 17:40
  */
class BufferSink (threshold:Int=0) extends SinkFunction[(String,Int)] with CheckpointedFunction{

  var listState:ListState[(String,Int)]=_
  val bufferedElements=ListBuffer[(String,Int)]()

  // writes the data to the external system
  override def invoke(value: (String, Int)): Unit = {
    bufferedElements += value               // buffer the incoming value
    if(bufferedElements.size==threshold){   // flush once the threshold is reached
      for(ele <- bufferedElements){
        println(ele)                        // emit the element
      }
      bufferedElements.clear()
    }
  }
  // persists the buffer on savepoint/checkpoint
  override def snapshotState(context: FunctionSnapshotContext): Unit = {
    listState.clear()                       // clear the previous snapshot first
    for(ele <- bufferedElements){
      listState.add(ele)                    // copy the buffered elements into the list state
    }
  }
  // state recovery / initialization: creates the state handle
  override def initializeState(context: FunctionInitializationContext): Unit = {
    val lsd = new ListStateDescriptor[(String, Int)]("buffered-elements",createTypeInformation[(String,Int)])

    listState = context.getOperatorStateStore.getListState(lsd) // even-split list state
    if(context.isRestored){                 // true when recovering from a checkpoint/savepoint
      for(element <- listState.get().asScala){
        bufferedElements += element         // refill the in-memory buffer
      }
    }
  }
}

val fsEnv = StreamExecutionEnvironment.getExecutionEnvironment

fsEnv.socketTextStream("CentOS",9999)
  .flatMap(_.split("\\s+"))
  .map((_,1))
  .keyBy(0)
  .addSink(new BufferSink(5)) // flush threshold of 5
fsEnv.execute("testoperatorstate")
  • Start the test source
[root@CentOS ~]# nc -lk 9999
  • Submit the job (screenshot omitted).
    Note: set the job's parallelism to 1 to make testing easier.

  • Send some input

[root@CentOS ~]# nc -lk 9999
a1 b1 c1 d1
  • Cancel the job and take a savepoint
[root@CentOS flink-1.8.1]# ./bin/flink list -m CentOS:8081
------------------ Running/Restarting Jobs -------------------
17.10.2019 09:49:20 : f21795e74312eb06fbf0d48cb8d90489 : testoperatorstate (RUNNING)
--------------------------------------------------------------
[root@CentOS flink-1.8.1]# ./bin/flink cancel -m CentOS:8081 -s hdfs:///savepoints f21795e74312eb06fbf0d48cb8d90489
Cancelling job f21795e74312eb06fbf0d48cb8d90489 with savepoint to hdfs:///savepoints.
Cancelled job f21795e74312eb06fbf0d48cb8d90489. Savepoint stored in hdfs://CentOS:9000/savepoints/savepoint-f21795-38e7beefe07b.

Note: for Flink to integrate with Hadoop, HADOOP_HOME or HADOOP_CLASSPATH must be present in the current environment variables.
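
To restore the state, resubmit the job with -s pointing at the savepoint; a hedged example (the entry class and jar path are placeholders for this demo's build artifact):

[root@CentOS flink-1.8.1]# ./bin/flink run -m CentOS:8081 \
    -s hdfs://CentOS:9000/savepoints/savepoint-f21795-38e7beefe07b \
    -c com.hw.demo05.TestOperatorState /root/flink-demo-1.0.jar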

  • Test the state: send more input and observe that elements buffered before the savepoint are flushed together with the new data (screenshots omitted).

ListCheckpointed

This interface is a variant of CheckpointedFunction that only supports the even-split redistribution scheme.

List<T> snapshotState(long checkpointId, long timestamp) throws Exception;
void restoreState(List<T> state) throws Exception;
  • snapshotState: when a checkpoint is taken, the system calls snapshotState so the function can snapshot its state.
  • restoreState: the counterpart of initializeState in CheckpointedFunction above; used for state recovery.

Example

import java.lang.{Long => JLong} // alias for java.lang.Long
import java.util
import java.util.Collections
import org.apache.flink.streaming.api.checkpoint.ListCheckpointed
import org.apache.flink.streaming.api.functions.source.{ParallelSourceFunction, SourceFunction}
import scala.collection.JavaConverters._
import scala.{Long => SLong} // alias for scala.Long

class CustomStatefulSourceFunction extends ParallelSourceFunction[SLong] with ListCheckpointed[JLong]{
  @volatile
  var isRunning:Boolean = true
  var offset = 0L
  override def run(ctx: SourceFunction.SourceContext[SLong]): Unit = {
    val lock = ctx.getCheckpointLock
    while(isRunning){
      Thread.sleep(1000)
      lock.synchronized({
        ctx.collect(offset) // emit and advance the offset under the checkpoint lock
        offset += 1
      })
    }
  }

  override def cancel(): Unit = {
    isRunning=false
  }

  override def snapshotState(checkpointId: Long, timestamp: Long): util.List[JLong] = {
    // stores the current source offset; when the state cannot be split,
    // a Collections.singletonList is enough
    Collections.singletonList[JLong](offset)
  }

  override def restoreState(state: util.List[JLong]): Unit = {
    for (s <- state.asScala) {
      offset = s
    }
  }
}
var fsEnv=StreamExecutionEnvironment.getExecutionEnvironment

fsEnv.addSource[Long](new CustomStatefulSourceFunction)
.print("offset:")

fsEnv.execute("testOffset")

Broadcast State

The third supported type of operator state is Broadcast State. Broadcast state was introduced to support use cases where some data coming from one stream needs to be broadcast to all downstream tasks, where it is stored locally and is used to process all incoming elements on the other stream.

Non-keyed

case class Rule(channel:String,threshold:Int)

case class UserAction(id:String,name:String,channel:String,action:String)

case class UserBuyPath(id:String,name:String,channel:String,path:Int)

package com.hw.demo06

import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.common.state.{MapState, MapStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.scala._

/**
  * @author fql
  * @date 2019/10/17 20:58
  */
class UserActionRichMapFunction extends RichMapFunction[UserAction,UserBuyPath]{

  var buyPathState:MapState[String,Int]=_

  override def open(parameters: Configuration): Unit = {
    val msd= new MapStateDescriptor[String,Int]("buy-path",createTypeInformation[String],createTypeInformation[Int])
    buyPathState=getRuntimeContext.getMapState(msd) // obtain the state handle
  }

  override def map(value: UserAction): UserBuyPath = {
    val channel=value.channel // channel of this action
    // length of the path walked so far on this channel (0 if nothing recorded yet)
    val path = if(buyPathState.contains(channel)) buyPathState.get(channel) else 0

    if(value.action.equals("buy")){    // a buy completes the path,
      buyPathState.remove(channel)     // so the channel entry is removed
    }else{
      buyPathState.put(channel,path+1) // not a buy yet: extend the path and store it
    }

    UserBuyPath(value.id,value.name,value.channel,
      if(buyPathState.contains(channel)) buyPathState.get(channel) else 0)
  }
}

package com.hw.demo06

import org.apache.flink.api.common.state.MapStateDescriptor
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction
import org.apache.flink.util.Collector
import scala.collection.JavaConverters._

/**
  * @author fql
  * @date 2019/10/17 20:54
  */
class  UserBuyPathBroadcastProcessFunction(msd:MapStateDescriptor[String,Int]) extends BroadcastProcessFunction[UserBuyPath,Rule,String]{

  // handles UserBuyPath elements; reads the broadcast state
  override def processElement(value: UserBuyPath,
                              ctx: BroadcastProcessFunction[UserBuyPath, Rule, String]#ReadOnlyContext,
                              out: Collector[String]): Unit = {
    val broadcastState = ctx.getBroadcastState(msd)     // read-only view of the broadcast state
    if(broadcastState.contains(value.channel)){         // is there a rule for this channel?
      val threshold = broadcastState.get(value.channel) // threshold configured for the channel
      if(value.path>=threshold){                        // path long enough to report?
        out.collect(value.id+" "+value.name+" "+value.channel+" "+value.path)
      }
    }

  }
  // handles the Rule records; updates the broadcast state
  override def processBroadcastElement(value: Rule,
                                       ctx: BroadcastProcessFunction[UserBuyPath,
                                         Rule, String]#Context,
                                       out: Collector[String]): Unit = {
    val broadcastState = ctx.getBroadcastState(msd)   // writable broadcast state
    broadcastState.put(value.channel,value.threshold) // store the channel's threshold

    println("=========================")
    for(entry <- broadcastState.entries().asScala){   // dump the broadcast state
      println(entry.getKey+"\t"+entry.getValue)
    }
    println()
    println()
  }
}

package com.hw.demo06

import org.apache.flink.api.common.state.MapStateDescriptor
import org.apache.flink.streaming.api.datastream.BroadcastStream
import org.apache.flink.streaming.api.scala._

object FlinkStreamNonKeyedBroadCastState {
  def main(args: Array[String]): Unit = {
    var fsEnv=StreamExecutionEnvironment.getExecutionEnvironment
    // id   name    channel  action
    // 001 zhangsan 手机      view
    // 001 zhangsan 手机      view
    // 001 zhangsan 手机      addToCart
    // 001 zhangsan 手机      buy
    val userStream = fsEnv.socketTextStream("CentOS", 9999)
      .map(line => line.split("\\s+"))
      .map(ts => UserAction(ts(0), ts(1), ts(2), ts(3)))
      .keyBy("id", "name")
      .map(new UserActionRichMapFunction)  // maintains the per-user buy-path state

    val msd=new MapStateDescriptor[String,Int]("broadcast-state",createTypeInformation[String],
      createTypeInformation[Int])
    // channel threshold
    // 手机类 10
    val broadcastStream: BroadcastStream[Rule] = fsEnv.socketTextStream("CentOS", 8888)
      .map(line => line.split("\\s+"))
      .map(ts => Rule(ts(0), ts(1).toInt))
      .broadcast(msd)  // broadcast the rules with the msd descriptor

    userStream.connect(broadcastStream)   // connect the two streams
      .process(new UserBuyPathBroadcastProcessFunction(msd))
      .print()
    fsEnv.execute("testoperatorstate")
  }

}

Keyed

import org.apache.flink.api.common.state.MapStateDescriptor
import org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction
import org.apache.flink.util.Collector
import scala.collection.JavaConverters._

class UserBuyPathKeyedBroadcastProcessFunction(msd:MapStateDescriptor[String,Int]) extends KeyedBroadcastProcessFunction[String,UserAction,Rule,String]{
  override def processElement(value: UserAction,
                              ctx: KeyedBroadcastProcessFunction[String, UserAction, Rule, String]#ReadOnlyContext,
                              out: Collector[String]): Unit = {
    println("value:"+value +" key:"+ctx.getCurrentKey)
    println("=====state======")
    for(entry <- ctx.getBroadcastState(msd).immutableEntries().asScala){
      println(entry.getKey+"\t"+entry.getValue)
    }
  }

  override def processBroadcastElement(value: Rule, ctx: KeyedBroadcastProcessFunction[String, UserAction, Rule, String]#Context, out: Collector[String]): Unit = {
    println("Rule:"+value)
    // update the broadcast state
    ctx.getBroadcastState(msd).put(value.channel,value.threshold)
  }
}
case class Rule(channel:String,threshold:Int)
case class UserAction(id:String,name:String ,channel:String,action:String)
var fsEnv=StreamExecutionEnvironment.getExecutionEnvironment
// id   name    channel  action
// 001 zhangsan 手机      view
// 001 zhangsan 手机      view
// 001 zhangsan 手机      addToCart
// 001 zhangsan 手机      buy
val userKeyedStream = fsEnv.socketTextStream("CentOS", 9999)
.map(line => line.split("\\s+"))
.map(ts => UserAction(ts(0), ts(1), ts(2), ts(3)))
.keyBy(_.id) // the key type must match the String key of the process function

val msd=new MapStateDescriptor[String,Int]("broadcast-state",createTypeInformation[String],
                                           createTypeInformation[Int])
// channel threshold
// 手机类 10
// 电子类 10
val broadcastStream: BroadcastStream[Rule] = fsEnv.socketTextStream("CentOS", 8888)
.map(line => line.split("\\s+"))
.map(ts => Rule(ts(0), ts(1).toInt))
.broadcast(msd)

userKeyedStream.connect(broadcastStream)
.process(new UserBuyPathKeyedBroadcastProcessFunction(msd))
.print()

fsEnv.execute("testoperatorstate")

Checkpoints & Savepoints

A checkpoint is Flink's fault-tolerance mechanism: the system automatically backs up the program's computation state at the configured checkpoint interval. If a failure occurs during computation, the system recovers from the most recent checkpoint.

A savepoint is an operational tool: the user manually triggers a state backup of the program. Under the hood it also performs a checkpoint.
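
For example, a savepoint can be triggered against a running job without cancelling it (the job id here is illustrative, mirroring the cancel-with-savepoint example earlier in this article):

[root@CentOS flink-1.8.1]# ./bin/flink savepoint -m CentOS:8081 f21795e74312eb06fbf0d48cb8d90489 hdfs:///savepoints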

Prerequisites for failure recovery:

  • A persistent (durable) data source that can replay records for a certain period of time (e.g. FlinkKafkaConsumer).
  • Permanent storage for state, typically a distributed file system (e.g. HDFS).
var fsEnv=StreamExecutionEnvironment.getExecutionEnvironment
// enable checkpointing
fsEnv.enableCheckpointing(5000,CheckpointingMode.EXACTLY_ONCE)
// a checkpoint must complete within 2s, otherwise it is aborted
fsEnv.getCheckpointConfig.setCheckpointTimeout(2000)
// minimum pause between checkpoints (<= the checkpoint interval)
fsEnv.getCheckpointConfig.setMinPauseBetweenCheckpoints(5)
// number of concurrent checkpoints; defaults to 1 if not configured
fsEnv.getCheckpointConfig.setMaxConcurrentCheckpoints(1)
// if a checkpoint fails, the task fails as well
fsEnv.getCheckpointConfig.setFailOnCheckpointingErrors(true)
// checkpoints are stored in an external system (filesystem, rocksdb);
// configure whether they are retained when the job is cancelled
fsEnv.getCheckpointConfig.enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION)
val props = new Properties()
props.setProperty("bootstrap.servers", "CentOS:9092")
props.setProperty("group.id", "g1")

fsEnv.addSource(new FlinkKafkaConsumer[String]("topic01",new SimpleStringSchema(),props))
.flatMap(line => line.split("\\s+"))
.map((_,1))
.keyBy(0) // only a single key field can be given here
.sum(1)
.print()

fsEnv.execute("testoperatorstate")

State backend

The state backend determines how Flink stores the system's state (in the form of checkpoints). Flink currently provides three state backend implementations:

  • Memory (jobmanager): the default implementation, normally used for testing. The system stores the computation state in the JobManager's memory; in real production environments, where state is fairly large, the memory backend easily leads to OOM (out of memory).
  • FileSystem: the computation state is kept in TaskManager memory, so it is generally suitable for production; on each checkpoint the TaskManager state data is backed up to a file system. On very large clusters, the TaskManager memory can still overflow.
  • RocksDB: the computation state is kept in TaskManager memory, and when that is not enough, RocksDB can be configured to spill state to local disk while also backing up the local state data to a remote file system. The RocksDB backend is therefore the recommended choice.

参考:https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/state/state_backends.html

Each job can configure its own state backend implementation:

var fsEnv=StreamExecutionEnvironment.getExecutionEnvironment
val fsStateBackend:StateBackend = new FsStateBackend("hdfs:///xxx") // or MemoryStateBackend, RocksDBStateBackend
fsEnv.setStateBackend(fsStateBackend)
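
For example, wiring in the RocksDB backend with incremental checkpoints enabled (a sketch: it assumes the flink-statebackend-rocksdb dependency is on the classpath, and the HDFS path is illustrative):

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend
import org.apache.flink.runtime.state.StateBackend
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment
// the second constructor argument enables incremental checkpoints
val rocksBackend: StateBackend = new RocksDBStateBackend("hdfs:///flink-checkpoints", true)
env.setStateBackend(rocksBackend)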

If the user does not configure one, the system uses the default implementation, which can be set in flink-conf.yaml:

[root@CentOS ~]# cd /usr/flink-1.8.1/
[root@CentOS flink-1.8.1]# vi conf/flink-conf.yaml
#==============================================================================
# Fault tolerance and checkpointing
#==============================================================================
# The backend that will be used to store operator state checkpoints if
# checkpointing is enabled.
#
# Supported backends are 'jobmanager', 'filesystem', 'rocksdb', or the
# <class-name-of-factory>.
#
 state.backend: rocksdb
# Directory for checkpoints filesystem, when using any of the default bundled
# state backends.
#
 state.checkpoints.dir: hdfs:///flink-checkpoints
# Default target directory for savepoints, optional.
#
 state.savepoints.dir: hdfs:///flink-savepoints
 
# Flag to enable/disable incremental checkpoints for backends that
# support incremental checkpoints (like the RocksDB state backend).
#
 state.backend.incremental: true

Note: HADOOP_CLASSPATH must be present in the environment variables.

Can operators be modified after a Flink job has been deployed?

In Spark this is not allowed, because Spark persists code fragments: once the code is modified, the checkpoint must be deleted. Flink, however, only stores the computation state of each operator, so the code can be modified as long as the user assigns a uid to every stateful operator.

fsEnv.addSource(new FlinkKafkaConsumer[String]("topic01",new SimpleStringSchema(),props))
    .uid("kafka-consumer")
    .flatMap(line => line.split("\\s+"))
    .map((_,1))
    .keyBy(0) // only a single key field can be given here
    .sum(1)
    .uid("word-count") // must be unique
    .map(t=>t._1+"->"+t._2)
    .print()
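
When resuming from a savepoint after such code changes, state for operators whose uid no longer exists can be skipped with --allowNonRestoredState (the savepoint and jar paths below are placeholders):

[root@CentOS flink-1.8.1]# ./bin/flink run -m CentOS:8081 \
    -s hdfs:///savepoints/savepoint-xxxx --allowNonRestoredState /root/flink-demo-1.0.jar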
