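Spark version: 2.0.0
The Master holds Spark's main cluster metadata and is responsible for cluster management, resource scheduling, and so on.
The start-master.sh script ultimately invokes the main method of org.apache.spark.deploy.master.Master.
Let's walk through that method: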
def main(argStrings: Array[String]) {
  // Initialize daemon logging
  Utils.initDaemon(log)
  // Spark configuration object
  val conf = new SparkConf
  // Master argument object; parses the options passed in, e.g. --host, --webui-port
  val args = new MasterArguments(argStrings, conf)
  // Start the master's RPC environment and endpoint (the core method)
  val (rpcEnv, _, _) =
    startRpcEnvAndEndpoint(args.host, args.port, args.webUiPort, conf)
  rpcEnv.awaitTermination()
}
def startRpcEnvAndEndpoint(
    host: String,
    port: Int,
    webUiPort: Int,
    conf: SparkConf): (RpcEnv, Int, Option[Int]) = {
  // Security manager
  val securityMgr = new SecurityManager(conf)
  // Create the RPC environment, currently based on Netty
  val rpcEnv = RpcEnv.create(SYSTEM_NAME, host, port, conf, securityMgr)
  // Register the Master endpoint and get back a reference to it 【1】
  val masterEndpoint = rpcEnv.setupEndpoint(ENDPOINT_NAME,
    new Master(rpcEnv, rpcEnv.address, webUiPort, securityMgr, conf))
  // Ask the Master endpoint which ports it has bound 【2】
  val portsResponse = masterEndpoint.askWithRetry[BoundPortsResponse](BoundPortsRequest)
  (rpcEnv, portsResponse.webUIPort, portsResponse.restPort)
}
Analysis of the key points:
【1】
Dispatcher.scala
----------------------------
/**
 * Registers an RPC endpoint under the given name and returns a reference to it.
 */
def registerRpcEndpoint(name: String, endpoint: RpcEndpoint): NettyRpcEndpointRef = {
  val addr = RpcEndpointAddress(nettyEnv.address, name)
  // Reference to the endpoint; callers use it to send messages
  val endpointRef = new NettyRpcEndpointRef(nettyEnv.conf, addr, nettyEnv)
  synchronized {
    if (stopped) {
      throw new IllegalStateException("RpcEnv has been stopped")
    }
    // Map the endpoint name to its EndpointData wrapper
    if (endpoints.putIfAbsent(name, new EndpointData(name, endpoint, endpointRef)) != null) {
      throw new IllegalArgumentException(s"There is already an RpcEndpoint called $name")
    }
    val data = endpoints.get(name)
    // Record the endpoint -> reference mapping
    endpointRefs.put(data.endpoint, data.ref)
    // Put the endpoint on the receivers queue, where a MessageLoop thread will pick it up
    receivers.offer(data) // for the OnStart message
  }
  endpointRef
}
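The most important line in the code above is:
receivers.offer(data)
It looks as if it merely puts the endpoint data on the receivers queue, but doing so wakes up a MessageLoop thread blocked on that queue, which then processes the message. The details are as follows: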
Dispatcher.scala
-------------------
/** Every thread in this pool runs MessageLoop.run for its whole lifetime */
private val threadpool: ThreadPoolExecutor = {
  val numThreads = nettyEnv.conf.getInt("spark.rpc.netty.dispatcher.numThreads",
    math.max(2, Runtime.getRuntime.availableProcessors()))
  // Daemon threads that keep waiting for messages
  val pool = ThreadUtils.newDaemonFixedThreadPool(numThreads, "dispatcher-event-loop")
  for (i <- 0 until numThreads) {
    pool.execute(new MessageLoop)
  }
  pool
}
/** Message loop used for dispatching messages. */
private class MessageLoop extends Runnable {
  override def run(): Unit = {
    try {
      // Loop forever
      while (true) {
        try {
          // Blocks until an endpoint has something to process
          val data = receivers.take()
          // Special shutdown marker
          if (data == PoisonPill) {
            // Put PoisonPill back so that other MessageLoops can see it.
            receivers.offer(PoisonPill)
            return
          }
          // Let the receiving endpoint's inbox process its messages
          data.inbox.process(Dispatcher.this)
        } catch {
          case NonFatal(e) => logError(e.getMessage, e)
        }
      }
    } catch {
      case ie: InterruptedException => // exit
    }
  }
}
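The pattern above (a fixed pool of threads blocked on one queue, shut down with a poison pill) can be tried in isolation. The following is only a minimal sketch, not Spark's code: `MessageLoopSketch`, `Work`, and `runDemo` are hypothetical names, and a plain function stands in for `EndpointData` and its inbox.

```scala
import java.util.concurrent.{CountDownLatch, Executors, LinkedBlockingQueue, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object MessageLoopSketch {
  // Hypothetical stand-in for EndpointData: just a unit of work.
  final class Work(val run: () => Unit)

  // Sentinel that tells every loop thread to exit.
  val PoisonPill = new Work(() => ())

  /** Starts `numThreads` loops, feeds them `jobs` tasks, shuts down, returns the processed count. */
  def runDemo(numThreads: Int, jobs: Int): Int = {
    val receivers = new LinkedBlockingQueue[Work]()
    val processed = new AtomicInteger(0)
    val done = new CountDownLatch(jobs)
    val pool = Executors.newFixedThreadPool(numThreads)

    val loop = new Runnable {
      override def run(): Unit = {
        var running = true
        while (running) {
          val work = receivers.take() // blocks until something is offered
          if (work eq PoisonPill) {
            receivers.offer(PoisonPill) // put it back so the other loops see it too
            running = false
          } else {
            work.run()
          }
        }
      }
    }
    (0 until numThreads).foreach(_ => pool.execute(loop))

    // Offering work is all it takes to wake up one of the blocked loops.
    (0 until jobs).foreach { _ =>
      receivers.offer(new Work(() => { processed.incrementAndGet(); done.countDown() }))
    }
    done.await(5, TimeUnit.SECONDS)
    receivers.offer(PoisonPill)
    pool.shutdown()
    pool.awaitTermination(5, TimeUnit.SECONDS)
    processed.get()
  }
}
```

Calling `MessageLoopSketch.runDemo(2, 10)` runs ten tasks on two loop threads. No timer is involved: the loops simply block in `take()` until `offer` hands them something.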
To explain data.inbox.process(Dispatcher.this) above,
let's take a closer look at the data.inbox field:
Dispatcher.scala
-------------------
private class EndpointData(
    val name: String,
    val endpoint: RpcEndpoint,
    val ref: NettyRpcEndpointRef) {
  // Every EndpointData creates its own Inbox when it is constructed
  val inbox = new Inbox(ref, endpoint)
}
private[netty] class Inbox(
    val endpointRef: NettyRpcEndpointRef,
    val endpoint: RpcEndpoint)
  extends Logging {
  inbox => // Give this an alias so we can use it more clearly in closures.
  // Message list. Messages put here are not handled immediately; the owning EndpointData
  // must also be offered to Dispatcher.receivers so that the thread pool processes it
  @GuardedBy("this")
  protected val messages = new java.util.LinkedList[InboxMessage]()
  /** True if the inbox (and its associated endpoint) is stopped. */
  @GuardedBy("this")
  private var stopped = false
  /** Allow multiple threads to process messages at the same time. */
  @GuardedBy("this")
  private var enableConcurrent = false
  /** The number of threads processing messages for this inbox. */
  @GuardedBy("this")
  private var numActiveThreads = 0
  // OnStart should be the first message to process,
  // so it is added every time an Inbox is created
  inbox.synchronized {
    messages.add(OnStart)
  }
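As shown above, every new EndpointData object adds an OnStart message to its inbox. So when registration calls receivers.offer(data),
an OnStart message is already waiting to be processed. Now let's look at the method that actually processes messages (that is, data.inbox.process(Dispatcher.this)):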
def process(dispatcher: Dispatcher): Unit = {
  var message: InboxMessage = null
  inbox.synchronized {
    // Another thread is already processing this inbox and concurrency is not enabled
    if (!enableConcurrent && numActiveThreads != 0) {
      return
    }
    // Take the next message
    message = messages.poll()
    if (message != null) {
      numActiveThreads += 1
    } else {
      return
    }
  }
  while (true) {
    safelyCall(endpoint) {
      // Handle each type of message
      message match {
        .......
        // Only the OnStart case discussed here is kept
        case OnStart =>
          // endpoint is the Master object here, so this calls Master.onStart
          endpoint.onStart()
          if (!endpoint.isInstanceOf[ThreadSafeRpcEndpoint]) {
            inbox.synchronized {
              if (!stopped) {
                enableConcurrent = true
              }
            }
          }
        .......
      }
    }
    .......
}
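To see why OnStart is always handled first, and how numActiveThreads keeps a second thread out, here is a stripped-down sketch. It is an assumption-laden simplification rather than Spark's Inbox: `InboxSketch`, `Msg`, `Rpc`, and `demo` are made-up names, and the stop handling and enableConcurrent flag are left out.

```scala
import scala.collection.mutable

object InboxSketch {
  sealed trait Msg
  case object OnStart extends Msg
  final case class Rpc(content: String) extends Msg

  // Hypothetical, simplified inbox: one processing thread at a time, no stop handling.
  final class Inbox(handle: Msg => Unit) {
    private val messages = new java.util.LinkedList[Msg]()
    private var numActiveThreads = 0

    // OnStart is queued at construction time, so it is always processed first.
    this.synchronized { messages.add(OnStart) }

    def post(m: Msg): Unit = this.synchronized { messages.add(m) }

    /** Drains the queue unless another thread is already draining it. */
    def process(): Unit = {
      var message: Msg = this.synchronized {
        if (numActiveThreads != 0) null
        else {
          val first = messages.poll()
          if (first != null) numActiveThreads += 1
          first
        }
      }
      while (message != null) {
        handle(message)
        message = this.synchronized {
          val next = messages.poll()
          if (next == null) numActiveThreads -= 1 // queue drained, release the inbox
          next
        }
      }
    }
  }

  /** Posts two messages and returns the order in which the handler saw everything. */
  def demo(): List[String] = {
    val seen = mutable.ListBuffer.empty[String]
    val inbox = new Inbox(m => seen += m.toString)
    inbox.post(Rpc("a"))
    inbox.post(Rpc("b"))
    inbox.process()
    seen.toList
  }
}
```

In `demo()`, the handler sees OnStart before the two posted messages even though nobody posted it explicitly, which mirrors what happens to the Master endpoint right after registration.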
Continuing from there, let's analyze the Master.onStart method:
Master.scala
----------------------
override def onStart(): Unit = {
  logInfo("Starting Spark master at " + masterUrl)
  logInfo(s"Running Spark version ${org.apache.spark.SPARK_VERSION}")
  // Create the web UI service (served by Jetty)
  webUi = new MasterWebUI(this, webUiPort)
  webUi.bind()
  masterWebUiUrl = "http://" + masterPublicAddress + ":" + webUi.boundPort
  // Periodically send CheckForWorkerTimeOut to ourselves to detect timed-out workers
  checkForWorkerTimeOutTask = forwardMessageThread.scheduleAtFixedRate(new Runnable {
    override def run(): Unit = Utils.tryLogNonFatalError {
      self.send(CheckForWorkerTimeOut)
    }
  }, 0, WORKER_TIMEOUT_MS, TimeUnit.MILLISECONDS)
  // If the REST server is enabled, start it; requests can then be submitted to the master through it
  if (restServerEnabled) {
    val port = conf.getInt("spark.master.rest.port", 6066)
    restServer = Some(new StandaloneRestServer(address.host, port, conf, self, masterUrl))
  }
  restServerBoundPort = restServer.map(_.start())
  // Metrics (not the focus here; feel free to skip)
  masterMetricsSystem.registerSource(masterSource)
  masterMetricsSystem.start()
  applicationMetricsSystem.start()
  // Attach the master and app metrics servlet handler to the web ui after the metrics systems are
  // started.
  // The metrics are exposed through the web UI as well
  masterMetricsSystem.getServletHandlers.foreach(webUi.attachHandler)
  applicationMetricsSystem.getServletHandlers.foreach(webUi.attachHandler)
  // ------------ This part belongs to master HA and will be covered separately ------------
  // Java serialization is hard-coded here; it could be made pluggable through a factory
  val serializer = new JavaSerializer(conf)
  // Choose the persistence engine and leader election agent according to the recovery mode
  val (persistenceEngine_, leaderElectionAgent_) = RECOVERY_MODE match {
    // ZOOKEEPER: persist recovery state through ZooKeeper
    case "ZOOKEEPER" =>
      logInfo("Persisting recovery state to ZooKeeper")
      val zkFactory =
        new ZooKeeperRecoveryModeFactory(conf, serializer)
      (zkFactory.createPersistenceEngine(), zkFactory.createLeaderElectionAgent(this))
    // FILESYSTEM: persist recovery state through the file system
    case "FILESYSTEM" =>
      val fsFactory =
        new FileSystemRecoveryModeFactory(conf, serializer)
      (fsFactory.createPersistenceEngine(), fsFactory.createLeaderElectionAgent(this))
    // CUSTOM: load the factory whose fully qualified class name you configured
    // and let it create the persistence engine and election agent
    case "CUSTOM" =>
      val clazz = Utils.classForName(conf.get("spark.deploy.recoveryMode.factory"))
      val factory = clazz.getConstructor(classOf[SparkConf], classOf[Serializer])
        .newInstance(conf, serializer)
        .asInstanceOf[StandaloneRecoveryModeFactory]
      (factory.createPersistenceEngine(), factory.createLeaderElectionAgent(this))
    // Any other value
    case _ =>
      (new BlackHolePersistenceEngine(), new MonarchyLeaderAgent(this))
  }
  persistenceEngine = persistenceEngine_
  leaderElectionAgent = leaderElectionAgent_
}
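Master.onStart itself is quite simple: it starts the web UI (and the REST server, if enabled), schedules the worker timeout check, starts the metrics systems, and decides the master HA recovery mode.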
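The recovery mode selection is an ordinary match on a string with a catch-all fallback. Below is a small sketch of just that decision. It is hypothetical: `RecoveryModeSketch`, `Choice`, and `choose` are invented for illustration, and plain string labels stand in for the real engine and agent objects. The labels for ZOOKEEPER and FILESYSTEM are what I believe the corresponding factories create, so treat them as an assumption rather than something shown in the excerpt above.

```scala
object RecoveryModeSketch {
  // Hypothetical labels standing in for the (persistence engine, leader election agent) pair.
  final case class Choice(persistence: String, election: String)

  /** Mirrors the shape of the RECOVERY_MODE match: three named modes plus a fallback. */
  def choose(recoveryMode: String): Choice = recoveryMode match {
    case "ZOOKEEPER"  => Choice("ZooKeeperPersistenceEngine", "ZooKeeperLeaderElectionAgent")
    case "FILESYSTEM" => Choice("FileSystemPersistenceEngine", "MonarchyLeaderAgent")
    case "CUSTOM"     => Choice("created by the configured factory", "created by the configured factory")
    // Anything else: nothing is persisted and this master is always the leader.
    case _            => Choice("BlackHolePersistenceEngine", "MonarchyLeaderAgent")
  }
}
```

The match is case-sensitive, so in this sketch `choose("zookeeper")` falls through to the last case just like any unrecognized value would.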
Everything so far has covered only one of the two key statements in the startRpcEnvAndEndpoint
method, namely val masterEndpoint = rpcEnv.setupEndpoint(ENDPOINT_NAME, new Master(rpcEnv, rpcEnv.address, webUiPort, securityMgr, conf)).
Now for the other one: val portsResponse = masterEndpoint.askWithRetry[BoundPortsResponse](BoundPortsRequest)
【2】:
RpcEndpointRef.scala
-----------------------
/**
 * Sends the request, retrying several times on failure
 */
def askWithRetry[T: ClassTag](message: Any, timeout: RpcTimeout): T = {
  // TODO: Consider removing multiple attempts
  var attempts = 0
  var lastException: Exception = null
  // As long as the maximum number of attempts has not been reached
  while (attempts < maxRetries) {
    attempts += 1
    try {
      // Send the request (the core call)
      val future = ask[T](message, timeout)
      // Wait for the result
      val result = timeout.awaitResult(future)
      if (result == null) {
        throw new SparkException("RpcEndpoint returned null")
      }
      return result
    } catch {
      case ie: InterruptedException => throw ie
      case e: Exception =>
        lastException = e
        logWarning(s"Error sending message [message = $message] in $attempts attempts", e)
    }
    // Sleep before the next attempt
    if (attempts < maxRetries) {
      Thread.sleep(retryWaitMs)
    }
  }
  throw new SparkException(
    s"Error sending message [message = $message]", lastException)
}
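The retry loop does not depend on anything RPC-specific, so it can be tried on its own. A minimal sketch follows, with these assumptions: `RetrySketch`, `withRetry`, and `demo` are hypothetical names, the operation receives the attempt number so that failures can be simulated, and a plain RuntimeException replaces SparkException.

```scala
object RetrySketch {
  /** Calls `op` up to `maxRetries` times, sleeping `retryWaitMs` between failed attempts. */
  def withRetry[T](maxRetries: Int, retryWaitMs: Long)(op: Int => T): T = {
    var attempts = 0
    var lastException: Exception = null
    while (attempts < maxRetries) {
      attempts += 1
      try {
        return op(attempts) // success: stop retrying
      } catch {
        case ie: InterruptedException => throw ie // never swallow interruption
        case e: Exception => lastException = e
      }
      // Only sleep if another attempt is coming.
      if (attempts < maxRetries) Thread.sleep(retryWaitMs)
    }
    throw new RuntimeException(s"Gave up after $attempts attempts", lastException)
  }

  /** Fails on the first two attempts and succeeds on the third; returns the result and call count. */
  def demo(): (String, Int) = {
    var calls = 0
    val result = withRetry(maxRetries = 3, retryWaitMs = 1L) { attempt =>
      calls += 1
      if (attempt < 3) throw new IllegalStateException(s"attempt $attempt failed")
      "ok"
    }
    (result, calls)
  }
}
```

If every attempt fails, the last exception is attached as the cause of the final one, the same way askWithRetry passes lastException to SparkException.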
Now let's analyze the call that sends the request, ask[T](message, timeout),
with message = BoundPortsRequest:
NettyRpcEnv.scala
---------------------
private[netty] def ask[T: ClassTag](message: RequestMessage, timeout: RpcTimeout): Future[T] = {
  val promise = Promise[Any]()
  // Address of the receiver
  val remoteAddr = message.receiver.address
  def onFailure(e: Throwable): Unit = {
    if (!promise.tryFailure(e)) {
      logWarning(s"Ignored failure: $e")
    }
  }
  def onSuccess(reply: Any): Unit = reply match {
    case RpcFailure(e) => onFailure(e)
    case rpcReply =>
      if (!promise.trySuccess(rpcReply)) {
        logWarning(s"Ignored message: $reply")
      }
  }
  try {
    // The receiver lives in this RpcEnv
    if (remoteAddr == address) {
      val p = Promise[Any]()
      // Handle the reply asynchronously
      p.future.onComplete {
        // On success onSuccess is called, which makes the data available through promise.future
        case Success(response) => onSuccess(response)
        case Failure(e) => onFailure(e)
      }(ThreadUtils.sameThread)
      // Post the message locally
      dispatcher.postLocalMessage(message, p)
    } else {
      // Wrap the request in an RPC outbox message
      val rpcMessage = RpcOutboxMessage(serialize(message),
        onFailure,
        (client, response) => onSuccess(deserialize[Any](client, response)))
      // Queue it in the outbox for the remote receiver
      postToOutbox(message.receiver, rpcMessage)
      promise.future.onFailure {
        case _: TimeoutException => rpcMessage.onTimeout()
        case _ =>
      }(ThreadUtils.sameThread)
    }
    // Timeout check
    val timeoutCancelable = timeoutScheduler.schedule(new Runnable {
      override def run(): Unit = {
        onFailure(new TimeoutException(s"Cannot receive any reply in ${timeout.duration}"))
      }
    }, timeout.duration.toNanos, TimeUnit.NANOSECONDS)
    promise.future.onComplete {
      v =>
        timeoutCancelable.cancel(true)
    }(ThreadUtils.sameThread)
  } catch {
    case NonFatal(e) =>
      onFailure(e)
  }
  // A successful reply is cast to T; a timeout is rewritten with a more descriptive message
  promise.future.mapTo[T].recover(timeout.addMessageIfTimeout)(ThreadUtils.sameThread)
}
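The code above is long, but it mainly distinguishes two kinds of receiver:
(1) remoteAddr == address: the sender and the receiver are at the same address
Key statement: dispatcher.postLocalMessage(message, p)
(2) remoteAddr != address: the receiver is at a different address
Key statement: postToOutbox(message.receiver, rpcMessage)
During master startup the request goes to the Master endpoint that was just registered in this same RpcEnv, so only the remoteAddr == address case is analyzed here; outbox handling will be covered later.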
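In the local branch the whole mechanism boils down to a Promise that is completed either by the reply or by a timeout task, whichever comes first. Here is a minimal sketch under stated assumptions: `AskSketch`, `ask`, `deliver`, `answered`, and `timedOut` are hypothetical names, `deliver` stands in for dispatcher.postLocalMessage, and a private scheduler replaces Spark's shared timeoutScheduler.

```scala
import java.util.concurrent.{Executors, TimeUnit, TimeoutException}
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.Try
import scala.util.control.NonFatal

object AskSketch {
  /** Future completed by the reply callback if it is called in time, otherwise failed with a timeout. */
  def ask(timeout: FiniteDuration)(deliver: (String => Unit) => Unit): Future[String] = {
    val promise = Promise[String]()
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    // Timeout task: fails the promise unless a reply got there first.
    val timer = scheduler.schedule(new Runnable {
      override def run(): Unit = {
        promise.tryFailure(new TimeoutException(s"Cannot receive any reply in $timeout"))
        ()
      }
    }, timeout.toMillis, TimeUnit.MILLISECONDS)
    // Whatever the outcome, cancel the timer and release its thread.
    promise.future.onComplete { _ => timer.cancel(true); scheduler.shutdown() }(ExecutionContext.global)
    try {
      // Hand the receiver a callback it can use to reply.
      deliver(reply => { promise.trySuccess(reply); () })
    } catch {
      case NonFatal(e) => promise.tryFailure(e); ()
    }
    promise.future
  }

  /** The receiver replies immediately. */
  def answered(): Try[String] =
    Try(Await.result(ask(1.second)(reply => reply("ports")), 2.seconds))

  /** The receiver never replies, so the timeout task fails the promise. */
  def timedOut(): Boolean =
    Try(Await.result(ask(50.millis)(_ => ()), 2.seconds))
      .failed.toOption.exists(_.isInstanceOf[TimeoutException])
}
```

`trySuccess` and `tryFailure` are used instead of `success` and `failure` for the same reason as in the original: a late reply or a late timeout must be ignored rather than throw.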
Let's first look at dispatcher.postLocalMessage(message, p),
which sends the message to a local endpoint through the message dispatcher:
Dispatcher.scala
-------------------
def postLocalMessage(message: RequestMessage, p: Promise[Any]): Unit = {
  val rpcCallContext =
    new LocalNettyRpcCallContext(message.senderAddress, p)
  // Build the RPC message object
  val rpcMessage = RpcMessage(message.senderAddress, message.content, rpcCallContext)
  // The key call
  postMessage(message.receiver.name, rpcMessage, (e) => p.tryFailure(e))
}
private def postMessage(
    endpointName: String,
    message: InboxMessage,
    callbackIfStopped: (Exception) => Unit): Unit = {
  val error = synchronized {
    val data = endpoints.get(endpointName)
    if (stopped) {
      Some(new RpcEnvStoppedException())
    } else if (data == null) {
      Some(new SparkException(s"Could not find $endpointName."))
    } else {
      // Add the message to the target endpoint's inbox, then offer the endpoint
      // to receivers so that the message gets processed
      data.inbox.post(message)
      receivers.offer(data)
      None
    }
  }
  // We don't need to call `onStop` in the `synchronized` block
  error.foreach(callbackIfStopped)
}
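This code should look familiar: the message is put into the endpoint's inbox, and the endpoint is offered to receivers so that a MessageLoop thread processes it.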
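Registration and posting follow the same two steps: update the endpoint's own state, then offer the endpoint to the shared queue. The sketch below puts the two side by side. It is a simplification with hypothetical names (`PostMessageSketch`, `register`, `post`, `pending`): messages are plain strings and errors are returned instead of being passed to a callback.

```scala
import java.util.concurrent.{ConcurrentHashMap, LinkedBlockingQueue}

object PostMessageSketch {
  // Hypothetical stand-in for EndpointData: a name plus a message list.
  final class EndpointData(val name: String) {
    val inbox = new java.util.LinkedList[String]()
  }

  final class Dispatcher {
    private val endpoints = new ConcurrentHashMap[String, EndpointData]()
    private val receivers = new LinkedBlockingQueue[EndpointData]()
    private var stopped = false

    /** Registers an endpoint name; a duplicate name is rejected. */
    def register(name: String): Unit = synchronized {
      if (stopped) throw new IllegalStateException("Dispatcher has been stopped")
      if (endpoints.putIfAbsent(name, new EndpointData(name)) != null) {
        throw new IllegalArgumentException(s"There is already an endpoint called $name")
      }
      receivers.offer(endpoints.get(name)) // first offer, for the initial message
      ()
    }

    /** Posts a message; returns the error instead of throwing inside the lock. */
    def post(name: String, message: String): Option[Exception] = synchronized {
      val data = endpoints.get(name)
      if (stopped) Some(new IllegalStateException("Dispatcher has been stopped"))
      else if (data == null) Some(new NoSuchElementException(s"Could not find $name."))
      else {
        data.inbox.add(message) // step 1: message into the endpoint's inbox
        receivers.offer(data)   // step 2: endpoint into the shared queue
        None
      }
    }

    def stop(): Unit = synchronized { stopped = true }

    /** Number of queue entries waiting for a loop thread. */
    def pending: Int = receivers.size()
  }
}
```

After `register("Master")` followed by one successful `post`, `pending` is 2: the same endpoint sits in the queue once per offer, and each entry gives a loop thread one chance to drain that endpoint's inbox.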
As the earlier analysis showed, this eventually leads to a call to the inbox.process
method, this time with a message of type RpcMessage,
that is:
def process(dispatcher: Dispatcher): Unit = {
  ..... only the relevant excerpt is shown, to keep the focus
  message match {
    case RpcMessage(_sender, content, context) =>
      try {
        // endpoint is the Master here, so this calls Master.receiveAndReply
        endpoint.receiveAndReply(context).applyOrElse[Any, Unit](content, {
          msg =>
            throw new SparkException(s"Unsupported message $message from ${_sender}")
        })
      } catch {
        case NonFatal(e) =>
          context.sendFailure(e)
          // Throw the exception -- this exception will be caught by the safelyCall function.
          // The endpoint's onError function will be called.
          throw e
      }
    ......
Master.scala ->
override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
.......
case BoundPortsRequest =>
context.reply(BoundPortsResponse(address.port, webUi.boundPort, restServerBoundPort))
......
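The Master endpoint's handling of BoundPortsRequest is very simple: it replies with its RPC port, the web UI port, and the REST server port (if any), so no further explanation is needed.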
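The combination of a PartialFunction handler and applyOrElse is what turns an unknown message into an error instead of a silent drop. A small self-contained sketch follows, with these assumptions: `ReceiveAndReplySketch` and `dispatch` are hypothetical, the `reply` function stands in for RpcCallContext.reply, the first field name of the response is made up, and the port numbers are illustrative values only.

```scala
object ReceiveAndReplySketch {
  case object BoundPortsRequest
  // Field names and values are illustrative; only webUIPort and restPort appear in the text above.
  final case class BoundPortsResponse(rpcPort: Int, webUIPort: Int, restPort: Option[Int])

  /** Hypothetical handler: a partial function that only knows BoundPortsRequest. */
  def receiveAndReply(reply: Any => Unit): PartialFunction[Any, Unit] = {
    case BoundPortsRequest => reply(BoundPortsResponse(7077, 8080, Some(6066)))
  }

  /** Routes `content` to the handler; an unmatched message ends up as Left with an error text. */
  def dispatch(content: Any): Either[String, Any] = {
    var answer: Option[Any] = None
    try {
      receiveAndReply(r => answer = Some(r)).applyOrElse[Any, Unit](content, { msg =>
        // Runs only when no case of the partial function matches.
        throw new IllegalArgumentException(s"Unsupported message $msg")
      })
      answer.toRight("no reply sent")
    } catch {
      case e: IllegalArgumentException => Left(e.getMessage)
    }
  }
}
```

`dispatch(BoundPortsRequest)` yields the response wrapped in Right, while `dispatch("hello")` hits the fallback and yields a Left carrying the "Unsupported message" text.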
That covers the core objects and methods involved in master startup.