在开始讨论Akka中对Actor的生命周期管理前,我们先探讨一下所谓的Actor编程模式。对比起我们习惯的行令式(imperative)编程模式,Actor编程模式更接近现实中的应用场景和功能测试模式。这是因为Actor是靠消息来驱动的,每种消息代表一项功能的运算指令。由于消息驱动式的程序是松散耦合的,每项功能都是在独立的线程中运算,互不干扰依赖,所以我们可以很自然的分开来实现各项功能以及独立测试每项功能。虽然Akka同时提供了Java和Scala两种API,但可能由于Akka本身是用Scala开发的,所以感觉用Scala来开发Akka程序会更自然些:笼统来讲,Actor编程主要就是对receive函数的实现。而receive函数就是几个普通的功能函数用模式匹配的方式按消息类型进行调用。receive函数所调用的功能函数可以是任何JVM兼容语言函数,由于每个Actor的运算都在自己独立的线程里进行,所以我们不必担心Actor函数在运行中的交叉调用问题。Akka程序本就是一种原生的多线程程序,每个Actor都在一个自己的线程内独立运算它的receive函数。除此之外Actor的运算环境可以在任何不同的JVM里,只要Akka信息发送能实现跨JVM投递的话,实现分布式程序也是自然而然的事了。所以,理论上Akka编程初学者应该把主要注意力放在这个receive函数的实现上来,按照一种模版式的方式来编写Akka程序就可以了,如下面演示的这个模版例子:
import akka.actor._
object MyActor { //在这个伴生对象里申明MyActor所支持的功能指令
sealed trait ActorCommands
case object RunFuncA extends ActorCommands
case object RunFuncB extends ActorCommands
}
//假设有funcA,funcB. 它们可以从任何JVM函数库里调用
val funcA : () => Any = ???
val funcB : () => Any = ???
class MyActor extends Actor {
import MyActor._
var stateValue: Any = _ //内部状态,代表这个Actor的当前运算结果
override def receive: Receive = {
case RunFuncA => stateValue = funcA //运算funcA,更新stateValue
case RunFuncB => stateValue = funcB //运算funcB,更新stateValue
...
}
}
行令式(imperative)程序的运算流程是按代码顺序进行的。这种流程模式不方便运算流程控制,这个缺点在进行异常处理时更加明显。对于一段我们认为在运算中可能发生异常的代码,我们只能用try-catch来把这段代码包裹起来。那么对于一个安全考虑的比较详细的程序来讲就会出现许多try-catch代码段混合在运算流程里造成整体程序逻辑紊乱,不利对程序的理解和维护。再试想如果容许重试异常运算的话会是怎样的一个场景。而这个问题在Akka编程中得到了完美的解决。在Akka编程里我们可以把每段可能产生异常的代码放到一个独立的Actor中去运算。Akka的Actor组织是一个层级结构。下层Actor是由直接上一层Actor产生,形成一种父子Actor关系。父级Actor除维护自身状态之外还必须负责处理下一层子级Actor所发生的异常,形成一种树形父子层级监管结构。任何子级Actor在运算中发生异常后立即将自己和自己的子级Actor运算挂起,并将下一步行动交付给自己的父级Actor决定。父级Actor对发生异常的子级Actor有以下几种处理方式:
1、恢复运算(Resume):不必理会异常,保留当前状态,跳过当前异常消息,照常继续处理其它消息
2、重新启动(Restart):清除当前状态,保留邮箱及内容,终止当前Actor,再重新构建一个新的Actor实例,沿用原来的消息地址ActorRef继续工作
3、彻底终止(Stop):销毁当前Actor及ActorRef邮箱,把所有消息导向DeadLetter队列。
4、向上提交(Esculate):如果父级无法处理子级异常,则这种情况也视为父级出现的异常。按照规定,父级会将自己和子级Actor运算暂停挂起并把子级Actor实际产生的异常当作自己发生的异常提交给上一层父级处理(也就是说异常信息的发送者sender变成了父级Actor)。
Akka处理异常的方式简单直接:如果发生异常就先暂停挂起然后交给直属父级Actor去处理。这就把异常封闭在这个Actor的监管链条里。Akka系统的监管链条实际代表一个功能的分散封闭运算,所以一个监管链条里发生的异常不会影响其它监管链条。换句话说就是Actor发生异常是封闭在它所属的功能内部的,一个功能发生异常不会影响其它功能。而在行令式程序中,如果没有try-catch,任何一段产生异常的代码都会导致整个程序中断。
Akka提供了OneForOneStrategy和AllForOneStrategy两种对待异常Actor的策略配置,策略中定义了对下属子级发生的各种异常的处理方式。异常处理策略是以策略施用对象分类的,如下:
OneForOneStrategy:只针对发生异常的Actor施用策略
AllForOneStrategy:虽然一个直属子级Actor发生了异常,监管父级Actor把它当作所有下属子级同时发生了相同异常,对所有子级Actor施用策略
下面是一个典型的策略例子:
OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1 minute) {
case _: ArithmeticException => Resume
case _: SomeMinerExecption => Resume
case _: NullPointerException => Restart
case _: IllegalArgumentException => Stop
case _: Exception => Escalate
}
Akka对待这种父子监管的原则保证了在Akka系统中不会出现任何孤儿,也就是说保证不会出现断裂的监管树。这就要求当任何一个Actor在暂停挂起前都要保证先暂停挂起它的所有直属子级Actor,而子级则必须先暂停挂起它们的直属子级,如此递归。同样,任何Actor在重启(Restart)时也必须递归式地重启直属子级,因为重启一个Actor需要先停止再启动,我们必须肯定在停止时不会产生孤儿Actor。如果一个父级Actor无法处理子级异常需要向上提交(Esculate)的话,首先它需要采取递归方式来暂停挂起自身以下的监管链条。它的直属父级Actor会按自己的异常处理策略来对待提交上来的异常,处理的结果将会递归式沿着监管树影响属下的所有子子孙孙。但如果这个级别的Actor异常处理策略还是无法覆盖这个异常时,它又会挂起自己,再向上提交(Esculate)。那么如果到达了顶级Actor又如何向上提交呢?Akka系统最终的异常处理策略可以在config文件里配置:
# The guardian "/user" will use this class to obtain its supervisorStrategy.
# It needs to be a subclass of akka.actor.SupervisorStrategyConfigurator.
# In addition to the default there is akka.actor.StoppingSupervisorStrategy.
guardian-supervisor-strategy = "akka.actor.DefaultSupervisorStrategy"
/**
* When supervisorStrategy is not specified for an actor this
* `Decider` is used by default in the supervisor strategy.
* The child will be stopped when [[akka.actor.ActorInitializationException]],
* [[akka.actor.ActorKilledException]], or [[akka.actor.DeathPactException]] is
* thrown. It will be restarted for other `Exception` types.
* The error is escalated if it's a `Throwable`, i.e. `Error`.
*/
final val defaultDecider: Decider = {
case _: ActorInitializationException ⇒ Stop
case _: ActorKilledException ⇒ Stop
case _: DeathPactException ⇒ Stop
case _: Exception ⇒ Restart
}
/**
* When supervisorStrategy is not specified for an actor this
* is used by default. OneForOneStrategy with decider defined in
* [[#defaultDecider]].
*/
f
inal val defaultStrategy: SupervisorStrategy = {
OneForOneStrategy()(defaultDecider)
}
override val supervisorStrategy =
OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1 minute) {
case _: ArithmeticException => Resume
case _: MyException => Restart
case t =>
super.supervisorStrategy.decider.applyOrElse(t, (_: Any) => Escalate)
}
上面提到Akka绝对不容许有孤儿Actor存在(断裂的监管树),所以停止任何一个Actor,它下属的子子孙孙都会自下而上依次停止运算。为了更好的理解Actor的监管策略,我们必须先从了解Actor的生命周期(lift-cycle)开始。一个Actor从构建产生ActorRef开始到彻底终止为整个生命周期。其中可以发生多次重启(Restart)。我们在下面对Actor的开始、终止、重启这三个环节中发生的事件进行描述:
1、开始
当Akka通过Props构建一个Actor后,这个Actor可以立即开始处理消息,进入开始(started)状态。Akka提供了针对开始状态的事件接口(event hooks)preStart如下:
/**
* User overridable callback.
*
* Is called when an Actor is started.
* Actors are automatically started asynchronously when created.
* Empty default implementation.
*/
@throws(classOf[Exception]) // when changing this you MUST also change ActorDocTest
//#lifecycle-hooks
def preStart(): Unit = ()
override def preStart={
log.info ("Starting storage actor...")
initDB
}
2、终止
一个Actor可能因为完成运算、发生异常又或者人为通过发送Kill,PoisonPill强行终止等而进入停止(stopping)状态。在停止过程中这个Actor会先以递归方式停止它属下的所有子孙Actor然后停止处理消息并将所有发给它的消息导向DeadLetter队列。Akka提供了事件接口postStop:
/**
* User overridable callback.
*
* Is called asynchronously after 'actor.stop()' is invoked.
* Empty default implementation.
*/
@throws(classOf[Exception]) // when changing this you MUST also change ActorDocTest
//#lifecycle-hooks
def postStop(): Unit = ()
override def postStop={
log.info ("Stopping storage actor...")
db.release
}
3、重启
重启是Actor生命周期里一个最重要的环节。在一个Actor的生命周期里可能因为多种原因发生重启(Restart)。造成一个Actor需要重启的原因可能有下面几个:
1、在处理某特定消息时造成了系统性的异常,必须通过重启来清理系统错误
2、内部状态毁坏,必须通过重启来重新构建状态
3、在处理消息时无法使用到一些依赖资源,需要重启来重新配置资源
重启是一个先停止再开始的过程。父级Actor通过递归方式先停止下面的子孙Actor,那么在启动过程中这些停止的子孙Actor是否会自动构建呢?这里需要特别注意:因为父级Actor是通过Props重新构建的,如果子级Actor的构建是在父级Actor的类构建器内而不是在消息处理函数内构建的,那么子级Actor会自动构建。Akka提供了preRestart和postRestart两个事件接口。preRestart发生在停止之前,postRestart发生在开始前,如下:
/**
* Scala API: User overridable callback: '''By default it disposes of all children and then calls `postStop()`.'''
* @param reason the Throwable that caused the restart to happen
* @param message optionally the current message the actor processed when failing, if applicable
*
* Is called on a crashed Actor right BEFORE it is restarted to allow clean
* up of resources before Actor is terminated.
*/
@throws(classOf[Exception]) // when changing this you MUST also change ActorDocTest
//#lifecycle-hooks
def preRestart(reason: Throwable, message: Option[Any]): Unit = {
context.children foreach { child ⇒
context.unwatch(child)
context.stop(child)
}
postStop()
}
//#lifecycle-hooks
/**
* User overridable callback: By default it calls `preStart()`.
* @param reason the Throwable that caused the restart to happen
*
* Is called right AFTER restart on the newly created Actor to allow reinitialization after an Actor crash.
*/
@throws(classOf[Exception]) // when changing this you MUST also change ActorDocTest
//#lifecycle-hooks
def postRestart(reason: Throwable): Unit = {
preStart()
}
override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
log.info(s"Parent restarting with error ${message}...")
doSomeWorkBeforeStopping
super.preRestart(reason, message)
}
override def postRestart(reason: Throwable): Unit = {
log.info("need to initialize too ...")
doSomeExtraInit
super.postRestart(reason)
}
我们知道:很多时候由于外界原因,Actor的重启无法保证一次成功。这种现象在使用依赖资源如数据库、网络连接等最为明显。我们前面介绍过的异常处理策略中就包含了重试(retry)次数及最长重试时间,如下:
/**
* Applies the fault handling `Directive` (Resume, Restart, Stop) specified in the `Decider`
* to the child actor that failed, as opposed to [[akka.actor.AllForOneStrategy]] that applies
* it to all children.
*
* @param maxNrOfRetries the number of times a child actor is allowed to be restarted, negative value means no limit,
* if the limit is exceeded the child actor is stopped
* @param withinTimeRange duration of the time window for maxNrOfRetries, Duration.Inf means no window
* @param decider mapping from Throwable to [[akka.actor.SupervisorStrategy.Directive]], you can also use a
* [[scala.collection.immutable.Seq]] of Throwables which maps the given Throwables to restarts, otherwise escalates.
* @param loggingEnabled the strategy logs the failure if this is enabled (true), by default it is enabled
*/
case class OneForOneStrategy(
maxNrOfRetries: Int = -1,
withinTimeRange: Duration = Duration.Inf,
override val loggingEnabled: Boolean = true)(val decider: SupervisorStrategy.Decider)
extends SupervisorStrategy {...}
/**
* Back-off supervisor that stops and starts a child actor using a back-off algorithm when the child actor stops.
* This back-off supervisor is created by using `akka.pattern.BackoffSupervisor.props`
* with `Backoff.onStop`.
*/
final class BackoffSupervisor(
val childProps: Props,
val childName: String,
minBackoff: FiniteDuration,
maxBackoff: FiniteDuration,
val reset: BackoffReset,
randomFactor: Double,
strategy: SupervisorStrategy)
extends Actor with HandleBackoff {...}
/**
* Props for creating a [[BackoffSupervisor]] actor from [[BackoffOptions]].
*
* @param options the [[BackoffOptions]] that specify how to construct a backoff-supervisor.
*/
def props(options: BackoffOptions): Props = options.props
/**
* Builds back-off options for creating a back-off supervisor.
* You can pass `BackoffOptions` to `akka.pattern.BackoffSupervisor.props`.
* An example of creating back-off options:
* {{{
* Backoff.onFailure(childProps, childName, minBackoff, maxBackoff, randomFactor)
* .withManualReset
* .withSupervisorStrategy(
* OneforOneStrategy(){
* case e: GivingUpException => Stop
* case e: RetryableException => Restart
* }
* )
*
* }}}
*/
object Backoff {
/**
* Back-off options for creating a back-off supervisor actor that expects a child actor to restart on failure.
*
* This explicit supervisor behaves similarly to the normal implicit supervision where
* if an actor throws an exception, the decider on the supervisor will decide when to
* `Stop`, `Restart`, `Escalate`, `Resume` the child actor.
*
* When the `Restart` directive is specified, the supervisor will delay the restart
* using an exponential back off strategy (bounded by minBackoff and maxBackoff).
*
* This supervisor is intended to be transparent to both the child actor and external actors.
* Where external actors can send messages to the supervisor as if it was the child and the
* messages will be forwarded. And when the child is `Terminated`, the supervisor is also
* `Terminated`.
* Transparent to the child means that the child does not have to be aware that it is being
* supervised specifically by this actor. Just like it does
* not need to know when it is being supervised by the usual implicit supervisors.
* The only caveat is that the `ActorRef` of the child is not stable, so any user storing the
* `sender()` `ActorRef` from the child response may eventually not be able to communicate with
* the stored `ActorRef`. In general all messages to the child should be directed through this actor.
*
* An example of where this supervisor might be used is when you may have an actor that is
* responsible for continuously polling on a server for some resource that sometimes may be down.
* Instead of hammering the server continuously when the resource is unavailable, the actor will
* be restarted with an exponentially increasing back off until the resource is available again.
*
* '''***
* This supervisor should not be used with `Akka Persistence` child actors.
* `Akka Persistence` actors shutdown unconditionally on `persistFailure()`s rather
* than throw an exception on a failure like normal actors.
* [[#onStop]] should be used instead for cases where the child actor
* terminates itself as a failure signal instead of the normal behavior of throwing an exception.
* ***'''
* You can define another
* supervision strategy by using `akka.pattern.BackoffOptions.withSupervisorStrategy` on [[akka.pattern.BackoffOptions]].
*
* @param childProps the [[akka.actor.Props]] of the child actor that
* will be started and supervised
* @param childName name of the child actor
* @param minBackoff minimum (initial) duration until the child actor will
* started again, if it is terminated
* @param maxBackoff the exponential back-off is capped to this duration
* @param randomFactor after calculation of the exponential back-off an additional
* random delay based on this factor is added, e.g. `0.2` adds up to `20%` delay.
* In order to skip this additional delay pass in `0`.
*/
def onFailure(
childProps: Props,
childName: String,
minBackoff: FiniteDuration,
maxBackoff: FiniteDuration,
randomFactor: Double): BackoffOptions =
BackoffOptionsImpl(RestartImpliesFailure, childProps, childName, minBackoff, maxBackoff, randomFactor)
/**
* Back-off options for creating a back-off supervisor actor that expects a child actor to stop on failure.
*
* This actor can be used to supervise a child actor and start it again
* after a back-off duration if the child actor is stopped.
*
* This is useful in situations where the re-start of the child actor should be
* delayed e.g. in order to give an external resource time to recover before the
* child actor tries contacting it again (after being restarted).
*
* Specifically this pattern is useful for persistent actors,
* which are stopped in case of persistence failures.
* Just restarting them immediately would probably fail again (since the data
* store is probably unavailable). It is better to try again after a delay.
*
* It supports exponential back-off between the given `minBackoff` and
* `maxBackoff` durations. For example, if `minBackoff` is 3 seconds and
* `maxBackoff` 30 seconds the start attempts will be delayed with
* 3, 6, 12, 24, 30, 30 seconds. The exponential back-off counter is reset
* if the actor is not terminated within the `minBackoff` duration.
*
* In addition to the calculated exponential back-off an additional
* random delay based the given `randomFactor` is added, e.g. 0.2 adds up to 20%
* delay. The reason for adding a random delay is to avoid that all failing
* actors hit the backend resource at the same time.
*
* You can retrieve the current child `ActorRef` by sending `BackoffSupervisor.GetCurrentChild`
* message to this actor and it will reply with [[akka.pattern.BackoffSupervisor.CurrentChild]]
* containing the `ActorRef` of the current child, if any.
*
* The `BackoffSupervisor`delegates all messages from the child to the parent of the
* `BackoffSupervisor`, with the supervisor as sender.
*
* The `BackoffSupervisor` forwards all other messages to the child, if it is currently running.
*
* The child can stop itself and send a [[akka.actor.PoisonPill]] to the parent supervisor
* if it wants to do an intentional stop.
*
* Exceptions in the child are handled with the default supervisionStrategy, which can be changed by using
* [[BackoffOptions#withSupervisorStrategy]] or [[BackoffOptions#withDefaultStoppingStrategy]]. A
* `Restart` will perform a normal immediate restart of the child. A `Stop` will
* stop the child, but it will be started again after the back-off duration.
*
* @param childProps the [[akka.actor.Props]] of the child actor that
* will be started and supervised
* @param childName name of the child actor
* @param minBackoff minimum (initial) duration until the child actor will
* started again, if it is terminated
* @param maxBackoff the exponential back-off is capped to this duration
* @param randomFactor after calculation of the exponential back-off an additional
* random delay based on this factor is added, e.g. `0.2` adds up to `20%` delay.
* In order to skip this additional delay pass in `0`.
*/
def onStop(
childProps: Props,
childName: String,
minBackoff: FiniteDuration,
maxBackoff: FiniteDuration,
randomFactor: Double): BackoffOptions =
BackoffOptionsImpl(StopImpliesFailure, childProps, childName, minBackoff, maxBackoff, randomFactor)
}
/**
* Configures a back-off supervisor actor. Start with `Backoff.onStop` or `Backoff.onFailure`.
* BackoffOptions is immutable, so be sure to chain methods like:
* {{{
* val options = Backoff.onFailure(childProps, childName, minBackoff, maxBackoff, randomFactor)
* .withManualReset
* context.actorOf(BackoffSupervisor.props(options), name)
* }}}
*/
trait BackoffOptions {
/**
* Returns a new BackoffOptions with automatic back-off reset.
* The back-off algorithm is reset if the child does not crash within the specified `resetBackoff`.
* @param resetBackoff The back-off is reset if the child does not crash within this duration.
*/
def withAutoReset(resetBackoff: FiniteDuration): BackoffOptions
/**
* Returns a new BackoffOptions with manual back-off reset. The back-off is only reset
* if the child sends a `BackoffSupervisor.Reset` to its parent (the backoff-supervisor actor).
*/
def withManualReset: BackoffOptions
/**
* Returns a new BackoffOptions with the supervisorStrategy.
* @param supervisorStrategy the supervisorStrategy that the back-off supervisor will use.
* The default supervisor strategy is used as fallback if the specified supervisorStrategy (its decider)
* does not explicitly handle an exception. As the BackoffSupervisor creates a separate actor to handle the
* backoff process, only a [[OneForOneStrategy]] makes sense here.
*/
def withSupervisorStrategy(supervisorStrategy: OneForOneStrategy): BackoffOptions
/**
* Returns a new BackoffOptions with a default `SupervisorStrategy.stoppingStrategy`.
* The default supervisor strategy is used as fallback for throwables not handled by `SupervisorStrategy.stoppingStrategy`.
*/
def withDefaultStoppingStrategy: BackoffOptions
/**
* Returns the props to create the back-off supervisor.
*/
private[akka] def props: Props
}
val childProps = Props(classOf[EchoActor])
val supervisor = BackoffSupervisor.props(
Backoff.onFailure(
childProps,
childName = "myEcho",
minBackoff = 3.seconds,
maxBackoff = 30.seconds,
randomFactor = 0.2 // adds 20% "noise" to vary the intervals slightly
).withManualReset
.withSupervisorStrategy(
OneforOneStrategy(){
case _: GivingUpException => Stop
case _: RetryableException => Restart
case _: MinorException => Resume
}
)
)
system.actorOf(supervisor, name = "echoSupervisor")
以上的withManualReset是个BackoffOption的方法:要求子级Actor在成功重启后手动发送akka.pattern.BackoffSupervisor.Reset给它的监管父级Actor使其可以清除那些计数器,Akka源代码中是这样处理的:
case Reset ⇒
reset match {
case ManualReset ⇒ restartCount = 0
case msg ⇒ unhandled(msg)
}
/**
* Returns a new BackoffOptions with automatic back-off reset.
* The back-off algorithm is reset if the child does not crash within the specified `resetBackoff`.
* @param resetBackoff The back-off is reset if the child does not crash within this duration.
*/
def withAutoReset(resetBackoff: FiniteDuration): BackoffOptions
/**
* Returns a new BackoffOptions with manual back-off reset. The back-off is only reset
* if the child sends a `BackoffSupervisor.Reset` to its parent (the backoff-supervisor actor).
*/
def withManualReset: BackoffOptions
/**
* Have this FunctionRef watch the given Actor. This method must not be
* called concurrently from different threads, it should only be called by
* its parent Actor.
*
* Upon receiving the Terminated message, unwatch() must be called from a
* safe context (i.e. normally from the parent Actor).
*/
def watch(actorRef: ActorRef): Unit = {
watching += actorRef
actorRef.asInstanceOf[InternalActorRef].sendSystemMessage(Watch(actorRef.asInstanceOf[InternalActorRef], this))
}
/**
* Have this FunctionRef unwatch the given Actor. This method must not be
* called concurrently from different threads, it should only be called by
* its parent Actor.
*/
def unwatch(actorRef: ActorRef): Unit = {
watching -= actorRef
actorRef.asInstanceOf[InternalActorRef].sendSystemMessage(Unwatch(actorRef.asInstanceOf[InternalActorRef], this))
}
被监视对象的终止事件是通过Terminate消息获取的。典型的监视方式示范如下:
class DeathPactExceptionParentActor extends Actor with ActorLogging{
def receive={
case "create_child"=> {
log.info ("creating child")
val child=context.actorOf(Props[DeathPactExceptionChildActor])
context.watch(child) //watch child's death
child!"stop"
}
case "someMessage" => log.info ("some message")
case Terminated(_) => context.stop(self) //child has stopped
}
}
讲了这么多,还是感到有许多疑问,可能还是用一些代码来了解一下这些策略的具体用法。我想,无可否认的BackoffSupervisor应该是个最硬的骨头,我们先设计一个场景来示范BackoffSupervisor的用法和效果:假设一个咖啡餐厅场景,其中有个厨房,厨房内有大厨,这几个环节都可以用Actor来表达。其中餐厅是顶层Actor,直属子级是厨房,而大厨则是厨房的直属子级。我们可以把厨房Actor作为一个BackoffSupervisor,这样当大厨Actor出现任何异常时厨房Actor可以用一种逐步延时的方式来重启大厨Actor。我们先定义这个大厨Actor:
object Chef {
sealed trait Cooking
case object CookSpecial extends Cooking
class ChefBusy(msg: String) extends Exception(msg)
def props = Props(new Chef)
}
class Chef extends Actor with ActorLogging {
import Chef._
log.info(s"Chef actor created at ${System.currentTimeMillis()}")
override def receive: Receive = {
case _ => throw new ChefBusy("Chef is busy cooking!")
}
override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
super.preRestart(reason, message)
log.info(s"Restarting Chef for $message")
}
override def postRestart(reason: Throwable): Unit = {
super.postRestart(reason)
log.info(s"Chef restarted for ${reason.getMessage}")
}
override def postStop(): Unit = {
log.info("Chef stopped!")
}
}
object Kitchen {
def kitchenProps = {
import Chef._
val options = Backoff.onFailure(Chef.props, "chef", 200 millis, 10 seconds, 0.0)
.withSupervisorStrategy(OneForOneStrategy(maxNrOfRetries = 4, withinTimeRange = 30 seconds) {
case _: ChefBusy => SupervisorStrategy.Restart
})
BackoffSupervisor.props(options)
}
}
class Kitchen extends Actor with ActorLogging {
override def receive: Receive = {
case x => context.children foreach {child => child ! x}
}
}
下面我们试试BackoffSupervisor的具体效果:
object Cafe extends App {
import Kitchen._
val cafeSystem = ActorSystem("cafe")
val kitchen = cafeSystem.actorOf(kitchenProps,"kitchen")
println(s"Calling chef at ${System.currentTimeMillis()}")
kitchen ! "CookCook"
println(s"Calling chef at ${System.currentTimeMillis()}")
Thread.sleep(1000)
println(s"Calling chef at ${System.currentTimeMillis()}")
kitchen ! "CookCook"
Thread.sleep(1000)
kitchen ! "CookCook"
Thread.sleep(1000)
kitchen ! "CookCook"
Thread.sleep(1000)
kitchen ! "CookCook"
Thread.sleep(1000 * 30)
cafeSystem.terminate()
}
Calling chef at 1495108529380
[INFO] [05/18/2017 19:55:29.384] [cafe-akka.actor.default-dispatcher-2] [akka://cafe/user/kitchen/chef] Chef actor created at 1495108529382
[ERROR] [05/18/2017 19:55:29.392] [cafe-akka.actor.default-dispatcher-3] [akka://cafe/user/kitchen/chef] Chef is busy cooking!
Chef$ChefBusy: Chef is busy cooking!
at Chef$$anonfun$receive$1.applyOrElse(Cafe.scala:24)
...
[INFO] [05/18/2017 19:55:29.394] [cafe-akka.actor.default-dispatcher-2] [akka://cafe/user/kitchen/chef] Chef stopped!
[INFO] [05/18/2017 19:55:29.614] [cafe-akka.actor.default-dispatcher-4] [akka://cafe/user/kitchen/chef] Chef actor created at 1495108529614
Calling chef at 1495108530382
[ERROR] [05/18/2017 19:55:30.382] [cafe-akka.actor.default-dispatcher-3] [akka://cafe/user/kitchen/chef] Chef is busy cooking!
我们看到Chef被重启过程。值得注意的是:生命周期监控函数中只有postStop被调用过,preRestart和postRestart都没引用。如果这样的话BackoffSupervisor就是一锤子买卖,是正真的let it crash模式体现了。那如果需要重新处理造成异常的消息又怎么办呢?看来只好试试SupervisorStrategy了。我们用下面的例子来示范一下:
import akka.actor._
import scala.util.Random
import scala.concurrent.duration._
object ChildActor {
class RndException(msg: String) extends Exception(msg)
def props = Props[ChildActor]
}
class ChildActor extends Actor with ActorLogging {
import ChildActor._
override def receive: Receive = {
case msg: String => { //任意产生一些RndExcption
if (Random.nextBoolean())
throw new RndException("Any Exception!")
else
log.info(s"Processed message: $msg !!!")
}
}
override def preStart(): Unit = {
log.info("ChildActor Started.")
super.preStart()
}
//在重启时preRestart是在原来的Actor实例上调用preRestart的
override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
log.info(s"Restarting ChildActor for ${reason.getMessage}...")
message match {
case Some(msg) =>
log.info(s"Exception message: ${msg.toString}")
self ! msg //把异常消息再摆放到信箱最后
case None =>
}
super.preRestart(reason, message)
}
override def postRestart(reason: Throwable): Unit = {
super.postRestart(reason)
log.info(s"Restarted ChildActor for ${reason.getMessage}...")
}
override def postStop(): Unit = {
log.info(s"Stopped ChildActor.")
super.postStop()
}
}
//监管父级
class Parent extends Actor with ActorLogging {
def decider: PartialFunction[Throwable,SupervisorStrategy.Directive] = {
case _: ChildActor.RndException => SupervisorStrategy.Restart
}
override val supervisorStrategy =
OneForOneStrategy(maxNrOfRetries = 30, withinTimeRange = 3 seconds) {
decider.orElse(SupervisorStrategy.defaultDecider)
}
val childActor = context.actorOf(ChildActor.props,"childActor")
override def receive: Receive = {
case msg@ _ => childActor ! msg //把所有收到的消息都转给childActor
}
}
object TestMyActor extends App {
val system = ActorSystem("testSystem")
val parentActor = system.actorOf(Props[Parent],"parentActor")
parentActor ! "Hello 1"
parentActor ! "Hello 2"
parentActor ! "Hello 3"
parentActor ! "Hello 4"
parentActor ! "Hello 5"
Thread.sleep(5000)
system.terminate()
}
好了,明白了如何使用BackoffSupervior,我们还是把整个例子完善一下吧:还是这个Cafe场景。Cafe里分厨房Kitchen、收款员Cashier几个部分。上面已经介绍过Kitchen下还有Chef,而Cashier之下还有收据打印机ReceiptPrinter。整个工作流程大致如下:
1、一个客人进店要求一杯特价咖啡
2、Cafe要求厨房在三种咖啡中即时选择任意一款作为特价咖啡
3、Cafe同时要求Cashier按厨房提供的特价咖啡价钱收款并打印收据
4、以上2,3成功后完成一单完整销售,更新销售额
5、完成销售目标后通知厨房打烊
6、收款员看到厨房打烊后停业关门
另外,可能出现几种异常情况:厨房的大厨可能忙不过来准备特价咖啡、收据打印机有可能卡纸。遇到这几种情况直接通知客人迟点再来光顾。
很明显Cafe需要维护内部状态即当前销售额salesAmount,Kitchen的状态是当前特餐currentSpecial,Cashier的状态是paperJammed代表收据打印机是否卡纸。
我们先从Chef,Kitchen及BackoffSupervisor监管开始:
object Chef {
sealed trait Order //消息类型
case object MakeSpecial extends Order //烹制特饮
class ChefBusy(msg: String) extends Exception(msg) //异常类型
def props = Props[Chef]
}
class Chef extends Actor with ActorLogging {
import Chef._
log.info("Chef says: I am ready to work ...") //构建成功信息
//内部状态
var currentSpecial: Cafe.Coffee = Cafe.Original
var chefBusy: Boolean = false
val specials = Map(0 -> Cafe.Original,1 -> Cafe.Espresso, 2 -> Cafe.Cappuccino)
override def receive: Receive = {
case MakeSpecial => {
if ((Random.nextInt(6) % 6) == 0) { //任意产生异常 2/6
log.info("Chef is busy ...")
chefBusy = true
throw new ChefBusy("Busy!")
}
else {
currentSpecial = randomSpecial //选出当前特饮
log.info(s"Chef says: Current special is ${currentSpecial.toString}.")
sender() ! currentSpecial
}
}
}
def randomSpecial = specials(Random.nextInt(specials.size)) //选出当前特饮
override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
log.info(s"Restarting Chef for ${reason.getMessage}...")
super.preRestart(reason, message)
}
override def postRestart(reason: Throwable): Unit = {
log.info(s"Restarted Chef for ${reason.getMessage}.")
context.parent ! BackoffSupervisor.Reset
super.postRestart(reason)
}
override def postStop(): Unit = {
log.info("Stopped Chef.")
super.postStop()
}
}
//Kitchen只是Chef的Backoff监管,没有任何其它功能
class Kitchen extends Actor with ActorLogging {
override def receive: Receive = {
//context.children.size == 1,就是chef。 直接把所有消息转发到Chef
case msg@_ => //注意,无法使用Chef ?因为sender不明
context.children foreach ( chef => chef forward msg)
}
override def postStop(): Unit = {
log.info("Kitchen close!")
super.postStop()
}
}
object Kitchen {
//指定的异常处理策略
val kitchenDecider: PartialFunction[Throwable, SupervisorStrategy.Directive] = {
case _: Chef.ChefBusy => SupervisorStrategy.Restart
}
def kitchenProps: Props = { //定义BackoffSupervisor strategy
val option = Backoff.onFailure(Chef.props,"chef",1 seconds, 5 seconds, 0.0)
.withManualReset
.withSupervisorStrategy {
OneForOneStrategy(maxNrOfRetries = 5, withinTimeRange = 5 seconds) {
kitchenDecider.orElse(SupervisorStrategy.defaultDecider)
}
}
BackoffSupervisor.props(option)
}
}
Chef的唯一功能就是烹制当前特饮。如果太忙无法接单,产生ChefBusy异常。
Cashier和ReceiptPrinter同样是一种父子监管关系。我们用SupervisorStrategy来实现这两个Actor:
object ReceiptPrinter {
case class PrintReceipt(sendTo: ActorRef, receipt: Cafe.Receipt) //print command
class PaperJamException extends Exception
def props = Props[ReceiptPrinter]
}
class ReceiptPrinter extends Actor with ActorLogging {
import ReceiptPrinter._
var paperJammed: Boolean = false
override def receive: Receive = {
case PrintReceipt(customer, receipt) => //打印收据并发送给顾客
if ((Random.nextInt(6) % 6) == 0) {
log.info("Printer jammed paper ...")
paperJammed = true
throw new PaperJamException
} else {
log.info(s"Printing receipt $receipt and sending to ${customer.path.name}")
customer ! receipt
}
}
override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
log.info(s"Restarting ReceiptPrinter for ${reason.getMessage}...")
super.preRestart(reason, message)
}
override def postRestart(reason: Throwable): Unit = {
log.info(s"Started ReceiptPrinter for ${reason.getMessage}.")
super.postRestart(reason)
}
override def postStop(): Unit = {
log.info("Stopped ReceiptPrinter.")
super.postStop()
}
}
object Cashier {
case class RingRegister(cup: Cafe.Coffee, customer: ActorRef) //收款并出具收据
def props(kitchen: ActorRef) = Props(classOf[Cashier],kitchen)
}
class Cashier(kitchen: ActorRef) extends Actor with ActorLogging {
import Cashier._
import ReceiptPrinter._
context.watch(kitchen) //监视厨房。如果打烊了就关门歇业
val printer = context.actorOf(ReceiptPrinter.props,"printer")
//打印机卡纸后重启策略
def cashierDecider: PartialFunction[Throwable,SupervisorStrategy.Directive] = {
case _: PaperJamException => SupervisorStrategy.Restart
}
override def supervisorStrategy: SupervisorStrategy =
OneForOneStrategy(maxNrOfRetries = 5, withinTimeRange = 5 seconds){
cashierDecider.orElse(SupervisorStrategy.defaultDecider)
}
val menu = Map[Cafe.Coffee,Double](Cafe.Original -> 5.50,
Cafe.Cappuccino -> 12.95, Cafe.Espresso -> 11.80)
override def receive: Receive = {
case RingRegister(coffee, customer) => //收款并出具收据
log.info(s"Producing receipt for a cup of ${coffee.toString}...")
val amt = menu(coffee) //计价
val rcpt = Cafe.Receipt(coffee.toString,amt)
printer ! PrintReceipt(customer,rcpt) //打印收据。可能出现卡纸异常
sender() ! Cafe.Sold(rcpt) //通知Cafe销售成功 sender === Cafe
case Terminated(_) =>
log.info("Cashier says: Oh, kitchen is closed. Let's make the end of day!")
context.system.terminate() //厨房打烊,停止营业。
}
}
下面是Cafe和Customer的实现代码:
object Cafe {
sealed trait Coffee //咖啡种类
case object Original extends Coffee
case object Espresso extends Coffee
case object Cappuccino extends Coffee
case class Receipt(item: String, amt: Double)
sealed trait Routine
case object PlaceOrder extends Routine
case class Sold(receipt: Receipt) extends Routine
}
class Cafe extends Actor with ActorLogging {
import Cafe._
import Cashier._
import context.dispatcher
implicit val timeout = Timeout(1 seconds)
var totalAmount: Double = 0.0
val kitchen = context.actorOf(Kitchen.kitchenProps,"kitchen")
//Chef可能重启,但path不变。必须直接用chef ? msg,否则经Kitchen转发无法获取正确的sender
val chef = context.actorSelection("/user/cafe/kitchen/chef")
val cashier = context.actorOf(Cashier.props(kitchen),"cashier")
var customer: ActorRef = _ //当前客户
override def receive: Receive = {
case Sold(rcpt) =>
totalAmount += rcpt.amt
log.info(s"Today's sales is up to $totalAmount")
customer ! Customer.OrderServed(rcpt) //send him the order
if (totalAmount > 100.00) {
log.info("Asking kichen to clean up ...")
context.stop(kitchen)
}
case PlaceOrder =>
customer = sender() //send coffee to this customer
(for {
item <- (chef ? Chef.MakeSpecial).mapTo[Coffee]
sales <- (cashier ? RingRegister(item,sender())).mapTo[Sold]
} yield(Sold(sales.receipt))).mapTo[Sold]
.recover {
case _: AskTimeoutException => Customer.ComebackLater
}.pipeTo(self) //send receipt to be added to totalAmount
}
}
object Customer {
sealed trait CustomerOrder
case object OrderSpecial extends CustomerOrder
case class OrderServed(rcpt: Cafe.Receipt) extends CustomerOrder
case object ComebackLater extends CustomerOrder
def props(cafe: ActorRef) = Props(new Customer(cafe))
}
class Customer(cafe: ActorRef) extends Actor with ActorLogging {
import Customer._
import context.dispatcher
override def receive: Receive = {
case OrderSpecial =>
log.info("Customer place an order ...")
cafe ! Cafe.PlaceOrder
case OrderServed(rcpt) =>
log.info(s"Customer says: Oh my! got my order ${rcpt.item} for ${rcpt.amt}")
case ComebackLater =>
log.info("Customer is not so happy! says: I will be back later!")
context.system.scheduler.scheduleOnce(1 seconds){cafe ! Cafe.PlaceOrder}
}
}
在这里由于使用了?来发送消息,所以具体的发送主体sender有可能出现混乱情况。?产生Future,Future是个monad,如果需要串联多个Future可以用flatMap或者for-comprehension。
下面就是整个例子的测试运行代码:
object MyCafe extends App {
import Cafe._
import Customer._
import scala.concurrent.ExecutionContext.Implicits.global
val cafeSys = ActorSystem("cafeSystem")
val cafe = cafeSys.actorOf(Props[Cafe],"cafe")
val customer = cafeSys.actorOf(Customer.props(cafe),"customer")
cafeSys.scheduler.schedule(1 second, 1 second, customer, OrderSpecial)
}
这,对于像我这样的人来说,的确是一种全新的编程方式!
下面就是这个例子完整的示范源代码:
package mycafe
import akka.actor._
import scala.concurrent.duration._
import scala.util.Random
import akka.pattern._
import akka.util.Timeout
import scala.concurrent._
object Chef {
sealed trait Order //消息类型
case object MakeSpecial extends Order //烹制特饮
class ChefBusy(msg: String) extends Exception(msg) //异常类型
def props = Props[Chef]
}
class Chef extends Actor with ActorLogging {
import Chef._
log.info("Chef says: I am ready to work ...") //构建成功信息
//内部状态
var currentSpecial: Cafe.Coffee = Cafe.Original
var chefBusy: Boolean = false
val specials = Map(0 -> Cafe.Original,1 -> Cafe.Espresso, 2 -> Cafe.Cappuccino)
override def receive: Receive = {
case MakeSpecial => {
if ((Random.nextInt(6) % 6) == 0) { //任意产生异常 2/6
log.info("Chef is busy ...")
chefBusy = true
throw new ChefBusy("Busy!")
}
else {
currentSpecial = randomSpecial //选出当前特饮
log.info(s"Chef says: Current special is ${currentSpecial.toString}.")
sender() ! currentSpecial
}
}
}
def randomSpecial = specials(Random.nextInt(specials.size)) //选出当前特饮
override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
log.info(s"Restarting Chef for ${reason.getMessage}...")
super.preRestart(reason, message)
}
override def postRestart(reason: Throwable): Unit = {
log.info(s"Restarted Chef for ${reason.getMessage}.")
context.parent ! BackoffSupervisor.Reset
super.postRestart(reason)
}
override def postStop(): Unit = {
log.info("Stopped Chef.")
super.postStop()
}
}
//Kitchen只是Chef的Backoff监管,没有任何其它功能
class Kitchen extends Actor with ActorLogging {
override def receive: Receive = {
//context.children.size == 1,就是chef。 直接把所有消息转发到Chef
case msg@_ => //注意,无法使用Chef ?因为sender不明
context.children foreach ( chef => chef forward msg)
}
override def postStop(): Unit = {
log.info("Kitchen close!")
super.postStop()
}
}
object Kitchen {
//指定的异常处理策略
val kitchenDecider: PartialFunction[Throwable, SupervisorStrategy.Directive] = {
case _: Chef.ChefBusy => SupervisorStrategy.Restart
}
def kitchenProps: Props = { //定义BackoffSupervisor strategy
val option = Backoff.onFailure(Chef.props,"chef",1 seconds, 5 seconds, 0.0)
.withManualReset
.withSupervisorStrategy {
OneForOneStrategy(maxNrOfRetries = 5, withinTimeRange = 5 seconds) {
kitchenDecider.orElse(SupervisorStrategy.defaultDecider)
}
}
BackoffSupervisor.props(option)
}
}
object ReceiptPrinter {
case class PrintReceipt(sendTo: ActorRef, receipt: Cafe.Receipt) //print command
class PaperJamException extends Exception
def props = Props[ReceiptPrinter]
}
class ReceiptPrinter extends Actor with ActorLogging {
import ReceiptPrinter._
var paperJammed: Boolean = false
override def receive: Receive = {
case PrintReceipt(customer, receipt) => //打印收据并发送给顾客
if ((Random.nextInt(6) % 6) == 0) {
log.info("Printer jammed paper ...")
paperJammed = true
throw new PaperJamException
} else {
log.info(s"Printing receipt $receipt and sending to ${customer.path.name}")
customer ! receipt
}
}
override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
log.info(s"Restarting ReceiptPrinter for ${reason.getMessage}...")
super.preRestart(reason, message)
}
override def postRestart(reason: Throwable): Unit = {
log.info(s"Started ReceiptPrinter for ${reason.getMessage}.")
super.postRestart(reason)
}
override def postStop(): Unit = {
log.info("Stopped ReceiptPrinter.")
super.postStop()
}
}
object Cashier {
case class RingRegister(cup: Cafe.Coffee, customer: ActorRef) //收款并出具收据
def props(kitchen: ActorRef) = Props(classOf[Cashier],kitchen)
}
class Cashier(kitchen: ActorRef) extends Actor with ActorLogging {
import Cashier._
import ReceiptPrinter._
context.watch(kitchen) //监视厨房。如果打烊了就关门歇业
val printer = context.actorOf(ReceiptPrinter.props,"printer")
//打印机卡纸后重启策略
def cashierDecider: PartialFunction[Throwable,SupervisorStrategy.Directive] = {
case _: PaperJamException => SupervisorStrategy.Restart
}
override def supervisorStrategy: SupervisorStrategy =
OneForOneStrategy(maxNrOfRetries = 5, withinTimeRange = 5 seconds){
cashierDecider.orElse(SupervisorStrategy.defaultDecider)
}
val menu = Map[Cafe.Coffee,Double](Cafe.Original -> 5.50,
Cafe.Cappuccino -> 12.95, Cafe.Espresso -> 11.80)
override def receive: Receive = {
case RingRegister(coffee, customer) => //收款并出具收据
log.info(s"Producing receipt for a cup of ${coffee.toString}...")
val amt = menu(coffee) //计价
val rcpt = Cafe.Receipt(coffee.toString,amt)
printer ! PrintReceipt(customer,rcpt) //打印收据。可能出现卡纸异常
sender() ! Cafe.Sold(rcpt) //通知Cafe销售成功 sender === Cafe
case Terminated(_) =>
log.info("Cashier says: Oh, kitchen is closed. Let's make the end of day!")
context.system.terminate() //厨房打烊,停止营业。
}
}
object Cafe {
sealed trait Coffee //咖啡种类
case object Original extends Coffee
case object Espresso extends Coffee
case object Cappuccino extends Coffee
case class Receipt(item: String, amt: Double)
sealed trait Routine
case object PlaceOrder extends Routine
case class Sold(receipt: Receipt) extends Routine
}
class Cafe extends Actor with ActorLogging {
import Cafe._
import Cashier._
import context.dispatcher
implicit val timeout = Timeout(1 seconds)
var totalAmount: Double = 0.0
val kitchen = context.actorOf(Kitchen.kitchenProps,"kitchen")
//Chef可能重启,但path不变。必须直接用chef ? msg,否则经Kitchen转发无法获取正确的sender
val chef = context.actorSelection("/user/cafe/kitchen/chef")
val cashier = context.actorOf(Cashier.props(kitchen),"cashier")
var customer: ActorRef = _ //当前客户
override def receive: Receive = {
case Sold(rcpt) =>
totalAmount += rcpt.amt
log.info(s"Today's sales is up to $totalAmount")
customer ! Customer.OrderServed(rcpt) //send him the order
if (totalAmount > 100.00) {
log.info("Asking kichen to clean up ...")
context.stop(kitchen)
}
case PlaceOrder =>
customer = sender() //send coffee to this customer
(for {
item <- (chef ? Chef.MakeSpecial).mapTo[Coffee]
sales <- (cashier ? RingRegister(item,sender())).mapTo[Sold]
} yield(Sold(sales.receipt))).mapTo[Sold]
.recover {
case _: AskTimeoutException => Customer.ComebackLater
}.pipeTo(self) //send receipt to be added to totalAmount
}
}
object Customer {
sealed trait CustomerOrder
case object OrderSpecial extends CustomerOrder
case class OrderServed(rcpt: Cafe.Receipt) extends CustomerOrder
case object ComebackLater extends CustomerOrder
def props(cafe: ActorRef) = Props(new Customer(cafe))
}
class Customer(cafe: ActorRef) extends Actor with ActorLogging {
import Customer._
import context.dispatcher
override def receive: Receive = {
case OrderSpecial =>
log.info("Customer place an order ...")
cafe ! Cafe.PlaceOrder
case OrderServed(rcpt) =>
log.info(s"Customer says: Oh my! got my order ${rcpt.item} for ${rcpt.amt}")
case ComebackLater =>
log.info("Customer is not so happy! says: I will be back later!")
context.system.scheduler.scheduleOnce(1 seconds){cafe ! Cafe.PlaceOrder}
}
}
object MyCafe extends App {
import Cafe._
import Customer._
import scala.concurrent.ExecutionContext.Implicits.global
val cafeSys = ActorSystem("cafeSystem")
val cafe = cafeSys.actorOf(Props[Cafe],"cafe")
val customer = cafeSys.actorOf(Customer.props(cafe),"customer")
cafeSys.scheduler.schedule(1 second, 1 second, customer, OrderSpecial)
}