A few small pitfalls I hit while integrating akka with Spark

A multithreading pitfall

error:


 ERROR (com.ximalaya.xqlserver.xql.engine.adapter.BatchSqlRunnerEngine:74) - executor result throw 
java.lang.IllegalArgumentException: spark.sql.execution.id is already set

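Starting from this error, I found the relevant JIRA issue and then traced through the source code. The cause turned out to be an incompatibility between ThreadLocal and ForkJoinPool. I wrap Spark's execution logic inside akka, and the default thread pool used by akka actors is a ForkJoinPool, so the fix was clear: stop running the Spark logic on the default thread pool.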

Approach 1:

Wrap the execution logic in a Future, and replace the Future's default thread pool with a fixed one:

  private val executor = Executors.newFixedThreadPool(poolSize)
  private implicit val ex = ExecutionContext.fromExecutor(executor)
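
A minimal, self-contained sketch of approach 1, using only the Scala and Java standard libraries. The object name `FixedPoolFutureSketch`, the `runQuery` stand-in for the real Spark call, and the pool size of 4 are all hypothetical and only serve to make the snippet runnable:

```scala
import java.util.concurrent.{ExecutorService, Executors, TimeUnit}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FixedPoolFutureSketch {
  // Hypothetical pool size; tune it to the number of concurrent Spark jobs.
  private val poolSize = 4

  // A plain fixed thread pool instead of a ForkJoinPool-backed default.
  private val executor: ExecutorService = Executors.newFixedThreadPool(poolSize)
  private implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(executor)

  // Stand-in for the real Spark call (for example, running a SQL statement).
  def runQuery(sql: String): String =
    s"${Thread.currentThread().getName} ran: $sql"

  // Futures created here pick up the implicit fixed-pool ExecutionContext.
  def submit(sql: String): Future[String] = Future(runQuery(sql))

  // The pool threads are non-daemon, so shut the pool down when finished.
  def shutdown(): Unit = {
    executor.shutdown()
    executor.awaitTermination(10, TimeUnit.SECONDS)
  }

  def main(args: Array[String]): Unit = {
    val result = Await.result(submit("select 1"), 10.seconds)
    println(result)
    shutdown()
  }
}
```

Inside an actor, the message handler would call something like `submit(...)` rather than run the Spark logic directly, so the job executes on the fixed pool and not on the actor's dispatcher thread.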

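Approach 2:
Configure an additional thread pool in the akka conf, and when creating a new Actor, create it on this extra thread pool instead of the default one.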
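
As a rough sketch of approach 2, a dedicated dispatcher backed by a thread-pool executor could be declared roughly as follows. The dispatcher name `spark-dispatcher` and the pool sizes are illustrative assumptions, not values taken from the setup described here:

```
# Hypothetical dispatcher for actors that run Spark logic.
spark-dispatcher {
  type = Dispatcher
  # Use a ThreadPoolExecutor instead of the default fork-join-executor.
  executor = "thread-pool-executor"
  thread-pool-executor {
    core-pool-size-min = 4
    core-pool-size-factor = 2.0
    core-pool-size-max = 16
  }
  # Messages processed per actor before the thread moves on to the next actor.
  throughput = 1
}
```

An actor would then be bound to it at creation time with something like `system.actorOf(Props[MyActor].withDispatcher("spark-dispatcher"))`, where `MyActor` is a placeholder name.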

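akka remote message size limit

I need to send the resulting DataFrame from YARN back to the local side, and the communication goes through akka remote. By default akka remote limits the size of a message, so any message that exceeds the limit is discarded: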

17:18:34 691  INFO (com.ximalaya.xql.communication.engine.RemoteSystemDriver$:44) - remote config is Config(SimpleConfigObject({"akka":{"actor":{"provider":"akka.remote.RemoteActorRefProvider","remote":{"enabled-transports":["akka.remote.netty.tcp"],"netty":{"tcp":{"maximum-frame-size":"30000000b","message-frame-size":"30000000b","receive-buffer-size":"30000000b","send-buffer-size":"30000000b"}}},"serialization-bindings":{"akka.actor.ActorIdentity":"kryo","akka.actor.Identify":"kryo","akka.remote.RemoteWatcher$Rewatch":"kryo","com.ximalaya.xql.communication.common.bean.ResultXQL":"kryo"},"serializers":{"kryo":"com.twitter.chill.akka.AkkaSerializer"}},"daemonic":"off","loglevel":"INFO"}}))

[ERROR] [10/26/2016 17:19:13.613] [remoteSystem-akka.remote.default-remote-dispatcher-5] [akka.tcp://remoteSystem@192.168.17.75:2552/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FlocalSystem%40192.168.3.228%3A2222-0/endpointWriter] Transient association error (association remains live)
akka.remote.OversizedPayloadException: Discarding oversized payload sent to Actor[akka.tcp://localSystem@192.168.3.228:2222/user/localMaster#-1650884510]: max allowed size 128000 bytes, actual size of encoded class c was 133699 bytes.

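I traced through the source code and then reviewed the configuration. In the config logged above, the `remote` block appears to be nested under `akka.actor`, and the error still reports a limit of 128000 bytes, so the frame-size settings were apparently never picked up. With the `remote` block placed directly under `akka`, as below, the problem went away: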

akka{
   remote {
      enabled-transports = ["akka.remote.netty.tcp"]
      netty.tcp {
        message-frame-size =  30000000b
        send-buffer-size =  30000000b
        receive-buffer-size =  30000000b
        maximum-frame-size = 30000000b
      }
    }
}
