The RPC layer introduced in Spark 1.6 is built around RpcEnv, RpcEndpoint, and RpcEndpointRef. It wraps Akka and Netty underneath these abstractions, leaving room to plug in other transports in the future. RpcEnv is the broader environment here: it is the basic service environment for RPC communication in a Spark cluster, so at cluster startup every node (Master or Worker alike) creates an RpcEnv and registers itself with it. Every RpcEndpoint must be registered with an RpcEnv instance, and that RpcEnv manages the lifecycle of the endpoints registered to it.
The three most important abstractions in Spark RPC (the "three musketeers") are therefore RpcEnv, RpcEndpoint, and RpcEndpointRef.
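To make the division of labour concrete, here is a minimal sketch of the three abstractions working together. It is illustrative only: RpcEnv, RpcEndpoint, and RpcEndpointRef are private[spark], so the snippet assumes it is compiled inside Spark's own source tree, and the package name, the EchoEndpoint class, and the "echo" endpoint name are invented for this example.

package org.apache.spark.rpc.demo  // hypothetical package under org.apache.spark

import org.apache.spark.{SecurityManager, SparkConf}
import org.apache.spark.rpc.{RpcCallContext, RpcEndpoint, RpcEnv}

// A trivial endpoint: it is registered with an RpcEnv, and the RpcEnv manages its
// lifecycle (onStart -> receive/receiveAndReply -> onStop).
class EchoEndpoint(override val rpcEnv: RpcEnv) extends RpcEndpoint {
  override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
    case msg: String => context.reply(s"echo: $msg")   // answer ask() calls
  }
}

object EchoDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    // Server-side RpcEnv (clientMode = false), bound to localhost; port 0 asks for a free port.
    val env = RpcEnv.create("demo", "localhost", "localhost", 0, conf,
      new SecurityManager(conf), numUsableCores = 0, clientMode = false)
    // Registering the endpoint returns its RpcEndpointRef, the handle clients use to reach it.
    val ref = env.setupEndpoint("echo", new EchoEndpoint(env))
    println(ref.askSync[String]("hello"))               // prints "echo: hello"
    env.shutdown()
  }
}

The pattern is always the same: an RpcEndpoint implements the behaviour, the RpcEnv hosts and manages it, and an RpcEndpointRef is the handle through which other parties send it messages.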
The code that actually creates the RpcEnv inside Spark:
/*
Variable declarations
Package: org.apache.spark
Class: SparkEnv
*/
private[spark] val driverSystemName = "sparkDriver"
private[spark] val executorSystemName = "sparkExecutor"
val isDriver = executorId == SparkContext.DRIVER_IDENTIFIER
val systemName = if (isDriver) driverSystemName else executorSystemName
val rpcEnv = RpcEnv.create(systemName, bindAddress, advertiseAddress, port.getOrElse(-1), conf,
  securityManager, numUsableCores, !isDriver)
/*
Processing of those variables, step 1
Package: org.apache.spark.rpc
Class: RpcEnv
*/
def create(
    name: String,
    bindAddress: String,
    advertiseAddress: String,
    port: Int,
    conf: SparkConf,
    securityManager: SecurityManager,
    numUsableCores: Int,
    clientMode: Boolean): RpcEnv = {
  val config = RpcEnvConfig(conf, name, bindAddress, advertiseAddress, port, securityManager,
    numUsableCores, clientMode)
  new NettyRpcEnvFactory().create(config)
}
/*
Step 2
Package: org.apache.spark.rpc.netty
Class: NettyRpcEnvFactory (defined in NettyRpcEnv.scala)
*/
private[rpc] class NettyRpcEnvFactory extends RpcEnvFactory with Logging {
  /**
   * After NettyRpcEnvFactory has created the NettyRpcEnv, if clientMode is false (i.e. the server
   * side, the driver's RPC), it defines a function value startNettyRpcEnv that calls the new
   * NettyRpcEnv's startServer and returns (nettyEnv, nettyEnv.address.port). That function is then
   * handed to Utils.startServiceOnPort, which is what actually starts the service on the driver.
   *
   * Reading the source of Utils.startServiceOnPort shows why nettyEnv.startServer is not called
   * directly but wrapped and passed to the utility: binding the port may not succeed on the first
   * attempt, so the utility retries up to a maximum number of times until the service starts, and
   * then returns the port it actually started on.
   */
  def create(config: RpcEnvConfig): RpcEnv = {
    val sparkConf = config.conf
    // Use JavaSerializerInstance in multiple threads is safe. However, if we plan to support
    // KryoSerializer in future, we have to use ThreadLocal to store SerializerInstance
    // RPC messages are serialized with Java serialization for now; Kryo is not supported yet.
    // 1. Initialize the JavaSerializer (then, below, initialize the NettyRpcEnv and, when not
    //    in client mode, start the Netty server).
    val javaSerializerInstance =
      new JavaSerializer(sparkConf).newInstance().asInstanceOf[JavaSerializerInstance]
    // 2. Initialize the NettyRpcEnv.
    val nettyEnv =
      new NettyRpcEnv(sparkConf, javaSerializerInstance, config.advertiseAddress,
        config.securityManager)
    // 3. If we are on the driver side (clientMode == false), build the TransportServer and
    //    register with the dispatcher; otherwise return nettyEnv directly.
    if (!config.clientMode) {
      // startNettyRpcEnv is a function value that is invoked later inside startServiceOnPort.
      // It takes an Int (actualPort), and the body after => returns the pair (NettyRpcEnv, Int).
      val startNettyRpcEnv: Int => (NettyRpcEnv, Int) = { actualPort =>
        // Builds the TransportServer and registers the verifier endpoint with the dispatcher.
        nettyEnv.startServer(config.bindAddress, actualPort)
        (nettyEnv, nettyEnv.address.port)
      }
      try {
        // Internally this still calls startNettyRpcEnv on the chosen port and returns nettyEnv.
        Utils.startServiceOnPort(config.port, startNettyRpcEnv, sparkConf, config.name)._1
      } catch {
        case NonFatal(e) =>
          nettyEnv.shutdown()
          throw e
      }
    }
    nettyEnv
  }
}
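For comparison, this is what the clientMode = true branch amounts to from a caller's point of view: no server is started, and the RpcEnv is only used to connect out to endpoints hosted elsewhere. Again a sketch that assumes it lives inside Spark's source tree; the host, port, and the "echo" endpoint name refer to the earlier server-side example and are placeholders.

package org.apache.spark.rpc.demo  // hypothetical package under org.apache.spark

import org.apache.spark.{SecurityManager, SparkConf}
import org.apache.spark.rpc.{RpcAddress, RpcEnv}

object EchoClientDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    // Client-side RpcEnv: NettyRpcEnvFactory skips startServer when clientMode = true.
    val clientEnv = RpcEnv.create("demo-client", "localhost", "localhost", 0, conf,
      new SecurityManager(conf), numUsableCores = 0, clientMode = true)
    // Look up the "echo" endpoint registered by the server-side sketch; host and port are
    // whatever the server's RpcEnv reports as its address (7077 is just a placeholder here).
    val ref = clientEnv.setupEndpointRef(RpcAddress("localhost", 7077), "echo")
    println(ref.askSync[String]("ping"))
    clientEnv.shutdown()
  }
}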
/*
Package: org.apache.spark.rpc.netty
Class: NettyRpcEnv
*/
def startServer(bindAddress: String, port: Int): Unit = {
  val bootstraps: java.util.List[TransportServerBootstrap] =
    // Check whether authentication is enabled for Spark's communication protocol.
    if (securityManager.isAuthenticationEnabled()) {
      // Authenticate with Spark's auth protocol.
      java.util.Arrays.asList(new AuthServerBootstrap(transportConf, securityManager))
    } else {
      java.util.Collections.emptyList()
    }
  // Create the TransportServer.
  server = transportContext.createServer(bindAddress, port, bootstraps)
  // Create an RpcEndpointVerifier and register it with the dispatcher; remote callers use it
  // to check that a named endpoint exists before obtaining its RpcEndpointRef.
  dispatcher.registerRpcEndpoint(
    RpcEndpointVerifier.NAME, new RpcEndpointVerifier(this, dispatcher))
}
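The last two lines above register the RpcEndpointVerifier with the dispatcher. As a mental model (deliberately simplified, not Spark's actual Dispatcher code), the dispatcher is essentially a name-to-endpoint registry, and the verifier is just one more registered endpoint whose only job is to answer "does an endpoint with this name exist?", which is what a remote setupEndpointRef asks before handing out a reference:

import scala.collection.concurrent.TrieMap

object DispatcherModel {
  // Simplified stand-in for Spark's Dispatcher: a thread-safe name -> endpoint registry.
  class MiniDispatcher {
    private val endpoints = TrieMap.empty[String, AnyRef]
    def registerRpcEndpoint(name: String, endpoint: AnyRef): Unit = endpoints.update(name, endpoint)
    // In real Spark this check is served by the verifier endpoint answering a CheckExistence
    // message; here it is reduced to a plain method call.
    def verify(name: String): Boolean = endpoints.contains(name)
  }

  def main(args: Array[String]): Unit = {
    val dispatcher = new MiniDispatcher
    dispatcher.registerRpcEndpoint("echo", new Object)
    println(dispatcher.verify("echo"))     // true
    println(dispatcher.verify("missing"))  // false
  }
}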
/*
Package: org.apache.spark.util
Class: Utils

At startup (e.g. "master at spark://biluos.com:7079" in the log), the port is tried starting
from the configured port (7077 here) and incremented until the service starts successfully.
*/
def startServiceOnPort[T](
    startPort: Int,
    startService: Int => (T, Int),
    conf: SparkConf,
    serviceName: String = ""): (T, Int) = {
  // The port must be 0 (a random free port) or between 1024 and 65535.
  require(startPort == 0 || (1024 <= startPort && startPort < 65536),
    "startPort should be between 1024 and 65535 (inclusive), or 0 for a random free port.")
  val serviceString = if (serviceName.isEmpty) "" else s" '$serviceName'"
  val maxRetries = portMaxRetries(conf)
  for (offset <- 0 to maxRetries) {
    // Do not increment port if startPort is 0, which is treated as a special port
    val tryPort = if (startPort == 0) {
      startPort
    } else {
      userPort(startPort, offset)
    }
    try {
      val (service, port) = startService(tryPort)
      // Sample log lines:
      //   17/12/05 11:56:50 INFO Utils: Successfully started service 'sparkDriver' on port 55271.
      //   17/12/05 11:56:50 INFO Utils: Successfully started service 'SparkUI' on port 4040.
      //   17/12/05 11:56:51 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55290.
      logInfo(s"Successfully started service$serviceString on port $port.")
      return (service, port)
    } catch {
      case e: Exception if isBindCollision(e) =>
        if (offset >= maxRetries) {
          val exceptionMessage = if (startPort == 0) {
            s"${e.getMessage}: Service$serviceString failed after " +
              s"$maxRetries retries (on a random free port)! " +
              s"Consider explicitly setting the appropriate binding address for " +
              s"the service$serviceString (for example spark.driver.bindAddress " +
              s"for SparkDriver) to the correct binding address."
          } else {
            s"${e.getMessage}: Service$serviceString failed after " +
              s"$maxRetries retries (starting from $startPort)! Consider explicitly setting " +
              s"the appropriate port for the service$serviceString (for example spark.ui.port " +
              s"for SparkUI) to an available port or increasing spark.port.maxRetries."
          }
          val exception = new BindException(exceptionMessage)
          // restore original stack trace
          exception.setStackTrace(e.getStackTrace)
          throw exception
        }
        if (startPort == 0) {
          // As startPort 0 is for a random free port, it is most possibly binding address is
          // not correct.
          logWarning(s"Service$serviceString could not bind on a random free port. " +
            "You may check whether configuring an appropriate binding address.")
        } else {
          logWarning(s"Service$serviceString could not bind on port $tryPort. " +
            s"Attempting port ${tryPort + 1}.")
        }
    }
  }
  // Should never happen
  throw new SparkException(s"Failed to start service$serviceString on port $startPort")
}
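To see why NettyRpcEnvFactory wraps startServer in the startNettyRpcEnv function instead of calling it directly, here is a self-contained sketch of the same "hand the start function to a retrying helper" pattern, with a plain java.net.ServerSocket standing in for the NettyRpcEnv. The helper name and the ports are made up, and Spark's real logic additionally treats port 0 specially and classifies failures via isBindCollision.

import java.net.{BindException, ServerSocket}

object StartOnPortDemo {
  // The helper owns the retry policy; callers only supply a function that tries one port
  // and returns (service, boundPort).
  def startServiceOnPort[T](startPort: Int, maxRetries: Int)(startService: Int => (T, Int)): (T, Int) = {
    for (offset <- 0 to maxRetries) {
      val tryPort = startPort + offset
      try {
        return startService(tryPort)
      } catch {
        case _: BindException if offset < maxRetries =>
          println(s"Port $tryPort in use, trying ${tryPort + 1}")
      }
    }
    throw new IllegalStateException(s"Failed to start service on port $startPort")
  }

  def main(args: Array[String]): Unit = {
    val blocker = new ServerSocket(0)          // grab a random free port to force a collision
    val busyPort = blocker.getLocalPort
    val startSocket: Int => (ServerSocket, Int) = { port =>
      val s = new ServerSocket(port)           // throws BindException if the port is taken
      (s, s.getLocalPort)
    }
    val (server, port) = startServiceOnPort(busyPort, maxRetries = 3)(startSocket)
    println(s"Started on port $port")          // typically busyPort + 1, assuming it is free
    server.close(); blocker.close()
  }
}

Because the helper owns the retry loop, every service that hands it a start function (sparkDriver, SparkUI, the block transfer service, and so on) gets the same bind-collision handling for free.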
For a deeper analysis of RpcEnv, RpcEndpoint, and RpcEndpointRef, see the follow-up article 《Spark2.2——RpcEnv(二)》 (Part 2 of this series).