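The following classes play a central role in Spark's storage layer:
1. BlockManager: a manager that runs on every node (the driver and the executors). It provides interfaces for putting and retrieving blocks, both locally and remotely, in the various stores (memory, disk, and off-heap). The object is not usable until its initialize() method has been called.
On the driver, the BlockManager is mainly responsible for managing the blocks of the whole job. On an executor, the BlockManager manages the blocks held by that executor, reports block information to the driver's BlockManager, and receives commands from it.
2. BlockManagerMaster: the BlockManagerMaster on the driver manages the BlockManagers of all executors in one place. For example, an executor registers its BlockManager with the driver, sends the latest information about its blocks, and asks where a required block is currently located; when an executor finishes, it is removed through the same channel.
3. BlockManagerMasterEndpoint: a ThreadSafeRpcEndpoint on the master node that tracks the status of the block managers on all slave nodes.
4. BlockManagerSlaveEndpoint: runs on every slave node and executes the commands it receives from the BlockManagerMasterEndpoint, for example: remove the data of an RDD, remove a block, remove shuffle data, or return the status of certain blocks.
5. BlockTransferService: this object is created in SparkEnv (the Spark runtime environment object, which holds many of the objects an executor needs at run time). Its concrete implementation is NettyBlockTransferService.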
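The division of labour described in the list above can be sketched with a toy model. This is a simplified illustration only, not Spark's implementation: ToyBlockManagerId and ToyBlockManagerMaster are hypothetical names, and the real classes exchange these requests as RPC messages rather than direct method calls.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the identity of one node's block manager.
final case class ToyBlockManagerId(executorId: String, host: String, port: Int)

// Hypothetical stand-in for the driver-side bookkeeping: it records which
// block managers have registered and which blocks each of them holds.
class ToyBlockManagerMaster {
  private val blocksByManager =
    mutable.Map.empty[ToyBlockManagerId, mutable.Set[String]]

  // An executor announces its block manager to the driver.
  def register(id: ToyBlockManagerId): Unit = {
    blocksByManager.getOrElseUpdate(id, mutable.Set.empty[String])
    ()
  }

  // An executor reports a new block; unknown managers are rejected.
  def updateBlockInfo(id: ToyBlockManagerId, blockId: String): Boolean =
    blocksByManager.get(id) match {
      case Some(blocks) =>
        blocks += blockId
        true
      case None =>
        false
    }

  // Anyone can ask which block managers currently hold a block.
  def getLocations(blockId: String): Seq[ToyBlockManagerId] =
    blocksByManager.iterator
      .collect { case (id, blocks) if blocks.contains(blockId) => id }
      .toSeq
      .sortBy(_.executorId)

  // When an executor goes away, its block managers are forgotten.
  def removeExecutor(executorId: String): Unit = {
    val toRemove = blocksByManager.keys.filter(_.executorId == executorId).toList
    blocksByManager --= toRemove
  }
}
```

A block reported by two executors is returned from both locations until one of the executors is removed, which mirrors the register, update, lookup, and remove requests named in item 2.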
/**
 * The following code is taken from the create method of SparkEnv.
 * On the driver it registers the endpoint (here, the BlockManagerMasterEndpoint)
 * with the RpcEnv; on an executor it only obtains a reference to the
 * driver's endpoint.
 */
def registerOrLookupEndpoint(
    name: String, endpointCreator: => RpcEndpoint):
  RpcEndpointRef = {
  if (isDriver) { // Driver: create and register the endpoint. Executor: look up a reference to it.
    logInfo("Registering " + name)
    rpcEnv.setupEndpoint(name, endpointCreator)
  } else {
    RpcUtils.makeDriverRef(name, conf, rpcEnv)
  }
}
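One detail of registerOrLookupEndpoint that is easy to miss is that endpointCreator is a by-name parameter (`=> RpcEndpoint`), so the endpoint object is only constructed on the branch that actually uses it. The sketch below illustrates that pattern with hypothetical toy types (ToyEndpoint, ToyEndpointRef, ToyRpcEnv, ToyEnv). It assumes, for simplicity, that the driver and the executor share one in-process registry, whereas Spark resolves the reference over the network.

```scala
import scala.collection.mutable

// Hypothetical minimal endpoint: answers a message with a reply.
trait ToyEndpoint {
  def receive(message: String): String
}

// Hypothetical reference through which callers talk to an endpoint.
final class ToyEndpointRef(target: ToyEndpoint) {
  def ask(message: String): String = target.receive(message)
}

// Hypothetical in-process registry standing in for an RPC environment.
class ToyRpcEnv {
  private val endpoints = mutable.Map.empty[String, ToyEndpoint]

  def setupEndpoint(name: String, endpoint: ToyEndpoint): ToyEndpointRef = {
    endpoints(name) = endpoint
    new ToyEndpointRef(endpoint)
  }

  // Throws NoSuchElementException if nothing was registered under `name`.
  def lookup(name: String): ToyEndpointRef = new ToyEndpointRef(endpoints(name))
}

object ToyEnv {
  // `endpointCreator` is by-name: it is evaluated only on the driver branch,
  // so an executor never constructs its own copy of the endpoint.
  def registerOrLookup(env: ToyRpcEnv, isDriver: Boolean, name: String)(
      endpointCreator: => ToyEndpoint): ToyEndpointRef =
    if (isDriver) env.setupEndpoint(name, endpointCreator)
    else env.lookup(name)
}
```

Calling `ToyEnv.registerOrLookup` once with `isDriver = true` and once with `isDriver = false` constructs the endpoint a single time, and both references reach the same object.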
// The BlockTransferService and BlockManager objects are created as follows.
// Create the remote block transfer service, implemented on top of Netty.
val blockTransferService =
  new NettyBlockTransferService(conf, securityManager, bindAddress, advertiseAddress,
    blockManagerPort, numUsableCores)
// Create the BlockManagerMaster. On the driver, the BlockManagerMasterEndpoint is
// created and registered; on an executor, only a reference to that endpoint is obtained.
val blockManagerMaster = new BlockManagerMaster(registerOrLookupEndpoint(
  BlockManagerMaster.DRIVER_ENDPOINT_NAME,
  new BlockManagerMasterEndpoint(rpcEnv, isLocal, conf, listenerBus)),
  conf, isDriver)
// NB: blockManager is not valid until initialize() is called later.
// Finally, create the BlockManager. The BlockManagerMaster it receives wraps a reference
// to the locally registered endpoint on the driver, and a reference to the driver's
// endpoint on an executor. The BlockManager also receives the block transfer service
// (blockTransferService), memoryManager, mapOutputTracker, shuffleManager and
// securityManager. It only becomes usable once initialize() has been called.
val blockManager = new BlockManager(executorId, rpcEnv, blockManagerMaster,
  serializerManager, conf, memoryManager, mapOutputTracker, shuffleManager,
  blockTransferService, securityManager, numUsableCores)
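The "not valid until initialize() is called" remark describes a two-phase construction: the object is built early, and the part that depends on late information is filled in afterwards. A minimal sketch of that pattern follows, using a hypothetical ToyBlockManager class; the explicit guard in `describe` is an assumption added for illustration and is not something the original code is shown to do.

```scala
// Hypothetical class illustrating two-phase construction.
class ToyBlockManager(val executorId: String) {
  // Unknown at construction time; supplied later through initialize().
  private var appId: Option[String] = None

  // Second phase: called once the application id is known.
  def initialize(id: String): Unit = {
    appId = Some(id)
  }

  def isInitialized: Boolean = appId.isDefined

  // Illustrative guard: fail loudly if the object is used too early.
  def describe: String = appId match {
    case Some(id) => s"$executorId@$id"
    case None =>
      throw new IllegalStateException("initialize() has not been called yet")
  }
}
```

Constructing the object and wiring it to its collaborators can then happen before the application id exists, at the cost of every caller having to respect the ordering.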
The initialize() method of BlockManager is shown below.
/**
 * Initializes the BlockManager with the given appId. This is not performed in the constructor as
 * the appId may not be known at BlockManager instantiation time (in particular for the driver,
 * where it is only learned after registration with the TaskScheduler).
 *
 * This method initializes the BlockTransferService and ShuffleClient, registers with the
 * BlockManagerMaster, starts the BlockManagerWorker endpoint, and registers with a local shuffle
 * service if configured.
 */
def initialize(appId: String): Unit = {
  // Initialize (start) the remote block transfer service. Once started, the
  // server waits for requests from other nodes.
  blockTransferService.init(this)
  // Initialize the shuffle client.
  shuffleClient.init(appId)
  // Create the blockManagerId from the executorId that identifies this node,
  // plus the host name and port of the block transfer service.
  blockManagerId = BlockManagerId(
    executorId, blockTransferService.hostName, blockTransferService.port)
  // Create the shuffleServerId. If the external shuffle service is enabled, it points
  // at that service's port on this host; otherwise it is the same as blockManagerId.
  shuffleServerId = if (externalShuffleServiceEnabled) {
    BlockManagerId(executorId, blockTransferService.hostName, externalShuffleServicePort)
  } else {
    blockManagerId
  }
  /**
   * master carries the messages between an executor's BlockManager and the driver's
   * BlockManager, for example: register a BlockManager, update block information,
   * look up which BlockManagers hold a block, remove an executor.
   */
  // Register blockManagerId with the BlockManagerMaster, which manages the
  // BlockManagers of all executors in one place.
  master.registerBlockManager(blockManagerId, maxMemory, slaveEndpoint)
  // Register Executors' configuration with the local shuffle service, if one should exist.
  // Only executors do this; the driver is skipped.
  if (externalShuffleServiceEnabled && !blockManagerId.isDriver) {
    registerWithExternalShuffleServer()
  }
}
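The choice of shuffleServerId in initialize() can be isolated into a small pure function. The sketch below uses hypothetical names (ToyServerId, ShuffleServerChoice.choose) and example port numbers that are assumptions for illustration. It shows the point of the branch: with an external shuffle service, shuffle data is served from the same host but on the service's port, not by the executor's own block transfer port.

```scala
// Hypothetical stand-in for a block manager identity.
final case class ToyServerId(executorId: String, host: String, port: Int)

object ShuffleServerChoice {
  // With an external shuffle service, keep the executor id and host but use
  // the service's port; otherwise the executor serves its own shuffle blocks.
  def choose(
      own: ToyServerId,
      externalShuffleServiceEnabled: Boolean,
      externalShuffleServicePort: Int): ToyServerId =
    if (externalShuffleServiceEnabled) own.copy(port = externalShuffleServicePort)
    else own
}
```

Because only the port differs, other nodes that fetch shuffle blocks can keep addressing the same host even after the executor process itself has exited, provided the external service is still running there.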
// When the external shuffle service is enabled, register this executor's shuffle
// configuration (local directories and shuffle manager class) with the service on this node.
private def registerWithExternalShuffleServer() {
  logInfo("Registering executor with local external shuffle service.")
  val shuffleConfig = new ExecutorShuffleInfo(
    diskBlockManager.localDirs.map(_.toString),
    diskBlockManager.subDirsPerLocalDir,
    shuffleManager.getClass.getName)
  val MAX_ATTEMPTS = 3
  val SLEEP_TIME_SECS = 5
  for (i <- 1 to MAX_ATTEMPTS) {
    try {
      // Synchronous and will throw an exception if we cannot connect.
      shuffleClient.asInstanceOf[ExternalShuffleClient].registerWithShuffleServer(
        shuffleServerId.host, shuffleServerId.port, shuffleServerId.executorId, shuffleConfig)
      return
    } catch {
      // Retry only while attempts remain; on the last attempt the guard fails
      // and the exception propagates to the caller.
      case e: Exception if i < MAX_ATTEMPTS =>
        logError(s"Failed to connect to external shuffle server, will retry ${MAX_ATTEMPTS - i}"
          + s" more times after waiting $SLEEP_TIME_SECS seconds...", e)
        Thread.sleep(SLEEP_TIME_SECS * 1000)
    }
  }