The JobManager is the master node of a Flink cluster. It contains several important components:
1. ResourceManager
The cluster resource manager of Flink. There is only one per cluster, and it is responsible for everything related to slot management and slot requests.
2. Dispatcher
Receives the JobGraph submitted by the user and then starts a JobMaster for it, a role similar to the AppMaster in a YARN cluster or the Driver of a Spark job.
3. WebMonitorEndpoint
It maintains a large number of Handlers. When a client submits a job to the Flink cluster via flink run, the request is ultimately received by the WebMonitorEndpoint, which decides which Handler should process it.
4. JobMaster/JobManager
Responsible for the execution of one concrete Job; a cluster may run several of these at the same time. The role is similar to the AppMaster in a YARN cluster or the Driver of a Spark job.
Note that the name JobManager is overloaded:
1. If we describe Flink as a master/slave architecture, JobManager refers to the master node, which contains the roles described above.
2. When a Job is submitted to YARN, it can in fact run as a small dedicated cluster whose master node is also called JobManager. Submission to YARN supports per-job and session modes; in either case one container runs the JobManager and the other containers run the StreamTasks.
In short, the master node of a Flink cluster runs a ResourceManager and a Dispatcher. When a client submits a job to the cluster (the client first builds the job into a JobGraph object), the Dispatcher launches a JobManager/JobMaster that is responsible for executing the Tasks of that job, and the JobMaster requests the resources needed to run those Tasks from the ResourceManager.
As analyzed in the previous article, the main entry class for starting the JobManager is StandaloneSessionClusterEntrypoint:
// entry point
StandaloneSessionClusterEntrypoint.main()
ClusterEntrypoint.runClusterEntrypoint(entrypoint);
clusterEntrypoint.startCluster();
runCluster(configuration, pluginManager);
// Step 1: initialize the various services (7 of them)
initializeServices(configuration, pluginManager);
// create the DispatcherResourceManagerComponentFactory, initializing the factory instances of the various components
// internally it holds three important members:
// the factory that creates the ResourceManager
// the factory that creates the Dispatcher
// the factory that creates the WebMonitorEndpoint
createDispatcherResourceManagerComponentFactory(configuration);
// create the components the cluster needs at runtime: Dispatcher, ResourceManager, etc.
// create the ResourceManager
// create the Dispatcher
// create the WebMonitorEndpoint
clusterComponent = dispatcherResourceManagerComponentFactory.create(...)
Step 1, initializeServices(), initializes a number of service components:
// initialize and start the AkkaRpcService, which internally wraps an ActorSystem
commonRpcService = AkkaRpcServiceUtils.createRemoteRpcService(...)
// initialize a thread pool responsible for IO
ioExecutor = Executors.newFixedThreadPool(...)
// initialize the HA services; the ZooKeeper-based implementation is ZooKeeperHaServices
haServices = createHaServices(configuration, ioExecutor);
// initialize and start the BlobServer
blobServer = new BlobServer(configuration, haServices.createBlobStore());
blobServer.start();
// initialize the heartbeat services (heartbeatServices = HeartbeatServices)
heartbeatServices = createHeartbeatServices(configuration);
// initialize a store for archived ExecutionGraphs; the implementation is FileArchivedExecutionGraphStore
archivedExecutionGraphStore = createSerializableExecutionGraphStore(...)
Step 2, createDispatcherResourceManagerComponentFactory(configuration), initializes the factory instances of several components:
1. DispatcherRunnerFactory, default implementation: DefaultDispatcherRunnerFactory
2. ResourceManagerFactory, default implementation: StandaloneResourceManagerFactory
3. RestEndpointFactory, default implementation: SessionRestEndpointFactory
The DispatcherRunnerFactory also instantiates a SessionDispatcherLeaderProcessFactoryFactory internally.
The code that creates the three factories:
final DispatcherResourceManagerComponentFactory dispatcherResourceManagerComponentFactory = createDispatcherResourceManagerComponentFactory(configuration);
Stepping into createDispatcherResourceManagerComponentFactory:
protected DefaultDispatcherResourceManagerComponentFactory createDispatcherResourceManagerComponentFactory(Configuration configuration) {
return DefaultDispatcherResourceManagerComponentFactory.createSessionComponentFactory(StandaloneResourceManagerFactory.getInstance());
}
// which calls the method below
public static DefaultDispatcherResourceManagerComponentFactory createSessionComponentFactory(
ResourceManagerFactory<?> resourceManagerFactory) {
return new DefaultDispatcherResourceManagerComponentFactory(
DefaultDispatcherRunnerFactory.createSessionRunner(SessionDispatcherFactory.INSTANCE), // the DispatcherRunnerFactory (wrapping SessionDispatcherFactory)
resourceManagerFactory, // the ResourceManagerFactory
SessionRestEndpointFactory.INSTANCE); // the SessionRestEndpointFactory
}
With that, the three factories have been created.
Step 3, dispatcherResourceManagerComponentFactory.create(...), creates the three important components:
clusterComponent = dispatcherResourceManagerComponentFactory.create(
configuration,
ioExecutor,
commonRpcService,
haServices,
blobServer,
heartbeatServices,
metricRegistry,
archivedExecutionGraphStore,
new RpcMetricQueryServiceRetriever(metricRegistry.getMetricQueryServiceRpcService()),
this);
Stepping into the create method:
The first component to be created and started is the webMonitorEndpoint.
/**
 * If the user submits a job via flink run, it is ultimately handled by the JobSubmitHandler inside the
 * WebMonitorEndpoint and then handed over to the dispatcher. Here the webMonitorEndpoint is created;
 * in YARN (per-job) mode a MiniDispatcherRestEndpoint is created instead.
 */
webMonitorEndpoint = restEndpointFactory.createRestEndpoint(
configuration,
dispatcherGatewayRetriever,
resourceManagerGatewayRetriever,
blobServer,
executor,
metricFetcher,
highAvailabilityServices.getClusterRestEndpointLeaderElectionService(),
fatalErrorHandler);
log.debug("Starting Dispatcher REST endpoint.");
webMonitorEndpoint.start();
The code that actually creates the WebMonitorEndpoint:
@Override
public WebMonitorEndpoint<DispatcherGateway> createRestEndpoint(
Configuration configuration,
LeaderGatewayRetriever<DispatcherGateway> dispatcherGatewayRetriever,
LeaderGatewayRetriever<ResourceManagerGateway> resourceManagerGatewayRetriever,
TransientBlobService transientBlobService,
ScheduledExecutorService executor,
MetricFetcher metricFetcher,
LeaderElectionService leaderElectionService,
FatalErrorHandler fatalErrorHandler) throws Exception {
final RestHandlerConfiguration restHandlerConfiguration = RestHandlerConfiguration.fromConfiguration(configuration);
return new DispatcherRestEndpoint(
RestServerEndpointConfiguration.fromConfiguration(configuration),
dispatcherGatewayRetriever,
configuration,
restHandlerConfiguration,
resourceManagerGatewayRetriever,
transientBlobService,
executor,
metricFetcher,
leaderElectionService,
RestEndpointFactory.createExecutionGraphCache(restHandlerConfiguration),
fatalErrorHandler);
}
Next, look at webMonitorEndpoint.start(). The start method lives in RestServerEndpoint: webMonitorEndpoint extends RestServerEndpoint, so the start() that actually runs is the parent class's method:
/**
* Starts this REST server endpoint.
*
* @throws Exception if we cannot start the RestServerEndpoint
*/
public final void start() throws Exception {
synchronized (lock) {
Preconditions.checkState(state == State.CREATED, "The RestServerEndpoint cannot be restarted.");
log.info("Starting rest endpoint.");
// create a Router
final Router router = new Router();
final CompletableFuture<String> restAddressFuture = new CompletableFuture<>();
// initialize the various handlers
handlers = initializeHandlers(restAddressFuture);
/* sort the handlers such that they are ordered the following:
* /jobs
* /jobs/overview
* /jobs/:jobid
* /jobs/:jobid/config
* /:*
*/
Collections.sort(
handlers,
RestHandlerUrlComparator.INSTANCE);
// check that all endpoints and handlers are unique
checkAllEndpointsAndHandlersAreUnique(handlers);
handlers.forEach(handler -> registerHandler(router, handler, log)); // register each handler with the router
// standard Netty server setup
ChannelInitializer<SocketChannel> initializer = new ChannelInitializer<SocketChannel>() {
@Override
protected void initChannel(SocketChannel ch) {
RouterHandler handler = new RouterHandler(router, responseHeaders);
// SSL should be the first handler in the pipeline
if (isHttpsEnabled()) {
ch.pipeline().addLast("ssl",
new RedirectingSslHandler(restAddress, restAddressFuture, sslHandlerFactory));
}
ch.pipeline()
.addLast(new HttpServerCodec())
.addLast(new FileUploadHandler(uploadDir))
.addLast(new FlinkHttpObjectAggregator(maxContentLength, responseHeaders))
.addLast(new ChunkedWriteHandler())
.addLast(handler.getName(), handler)
.addLast(new PipelineErrorHandler(log, responseHeaders));
}
};
NioEventLoopGroup bossGroup = new NioEventLoopGroup(1, new ExecutorThreadFactory("flink-rest-server-netty-boss"));
NioEventLoopGroup workerGroup = new NioEventLoopGroup(0, new ExecutorThreadFactory("flink-rest-server-netty-worker"));
bootstrap = new ServerBootstrap();
bootstrap
.group(bossGroup, workerGroup)
.channel(NioServerSocketChannel.class)
.childHandler(initializer);
Iterator<Integer> portsIterator;
try {
portsIterator = NetUtils.getPortRangeFromString(restBindPortRange);
} catch (IllegalConfigurationException e) {
throw e;
} catch (Exception e) {
throw new IllegalArgumentException("Invalid port range definition: " + restBindPortRange);
}
int chosenPort = 0;
while (portsIterator.hasNext()) {
try {
chosenPort = portsIterator.next();
final ChannelFuture channel;
if (restBindAddress == null) {
channel = bootstrap.bind(chosenPort);
} else {
channel = bootstrap.bind(restBindAddress, chosenPort);
}
serverChannel = channel.syncUninterruptibly().channel();
break;
} catch (final Exception e) {
// continue if the exception is due to the port being in use, fail early otherwise
if (!(e instanceof org.jboss.netty.channel.ChannelException || e instanceof java.net.BindException)) {
throw e;
}
}
}
if (serverChannel == null) {
throw new BindException("Could not start rest endpoint on any port in port range " + restBindPortRange);
}
log.debug("Binding rest endpoint to {}:{}.", restBindAddress, chosenPort);
final InetSocketAddress bindAddress = (InetSocketAddress) serverChannel.localAddress();
final String advertisedAddress;
if (bindAddress.getAddress().isAnyLocalAddress()) {
advertisedAddress = this.restAddress;
} else {
advertisedAddress = bindAddress.getAddress().getHostAddress();
}
final int port = bindAddress.getPort();
log.info("Rest endpoint listening at {}:{}", advertisedAddress, port);
restBaseUrl = new URL(determineProtocol(), advertisedAddress, port, "").toString();
restAddressFuture.complete(restBaseUrl);
// startup is complete
state = State.RUNNING;
// call the subclass hook; implementations differ
startInternal();
}
Finally, the subclass's startInternal() is invoked. The method is declared abstract in the parent class and implemented by each subclass; the parent's start() calls this hook:
@Override
public void startInternal() throws Exception {
// leader election: register the current contender (this) with the election service
leaderElectionService.start(this);
// start a periodic task that cleans up the ExecutionGraphCache
startExecutionGraphCacheCleanupTask();
if (hasWebUI) {
log.info("Web frontend listening at {}.", getRestBaseUrl());
}
}
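The start()/startInternal() split above is a plain template method: the parent class owns the common startup sequence and delegates the final step to an abstract hook. A tiny sketch of that shape (class and method names made up for illustration, not Flink code):
abstract class AbstractEndpoint {
    // the parent fixes the startup sequence and cannot be overridden
    public final void start() {
        System.out.println("common startup work done by the parent class");
        startInternal(); // hook implemented by the concrete endpoint
    }

    protected abstract void startInternal();
}

class RestEndpoint extends AbstractEndpoint {
    @Override
    protected void startInternal() {
        System.out.println("REST-specific startup: leader election, cache cleanup task, ...");
    }
}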
Here, the point of the election is to write the server's address and port into ZooKeeper, because the server port is chosen dynamically. The election serves two purposes: (1) HA for the component itself, and (2) service discovery, since clients look up the server address in ZooKeeper. Now step into leaderElectionService.start(this):
@Override
public final void start(LeaderContender contender) throws Exception {
checkNotNull(contender, "Contender must not be null.");
Preconditions.checkState(leaderContender == null, "Contender was already set.");
synchronized (lock) {
leaderContender = contender;
leaderElectionDriver = leaderElectionDriverFactory.createLeaderElectionDriver(
this, new LeaderElectionFatalErrorHandler(), leaderContender.getDescription());
LOG.info("Starting DefaultLeaderElectionService with {}.", leaderElectionDriver);
running = true;
}
}
Stepping into createLeaderElectionDriver:
@Override
public ZooKeeperLeaderElectionDriver createLeaderElectionDriver(
LeaderElectionEventHandler leaderEventHandler,
FatalErrorHandler fatalErrorHandler,
String leaderContenderDescription) throws Exception {
return new ZooKeeperLeaderElectionDriver(
client, latchPath, leaderPath, leaderEventHandler, fatalErrorHandler, leaderContenderDescription);
}
It returns a ZooKeeperLeaderElectionDriver:
public ZooKeeperLeaderElectionDriver(
CuratorFramework client,
String latchPath,
String leaderPath,
LeaderElectionEventHandler leaderElectionEventHandler,
FatalErrorHandler fatalErrorHandler,
String leaderContenderDescription) throws Exception {
this.client = checkNotNull(client);
this.leaderPath = checkNotNull(leaderPath);
this.leaderElectionEventHandler = checkNotNull(leaderElectionEventHandler);
this.fatalErrorHandler = checkNotNull(fatalErrorHandler);
this.leaderContenderDescription = checkNotNull(leaderContenderDescription);
leaderLatch = new LeaderLatch(client, checkNotNull(latchPath));
cache = new NodeCache(client, leaderPath);
client.getUnhandledErrorListenable().addListener(this);
running = true;
leaderLatch.addListener(this);
leaderLatch.start();
cache.getListenable().addListener(this);
cache.start();
client.getConnectionStateListenable().addListener(listener);
}
Here leaderLatch.start() kicks off the election; the result is delivered asynchronously through a listener.
public void start() throws Exception {
Preconditions.checkState(this.state.compareAndSet(LeaderLatch.State.LATENT, LeaderLatch.State.STARTED), "Cannot be started more than once");
this.startTask.set(AfterConnectionEstablished.execute(this.client, new Runnable() {
public void run() {
try {
LeaderLatch.this.internalStart();
} finally {
LeaderLatch.this.startTask.set((Object)null);
}
}
}));
}
leaderLatch.addListener(this) takes a listener argument, and every component that takes part in the election must implement this listener interface, LeaderLatchListener; here this is itself such an implementation:
public interface LeaderLatchListener {
void isLeader();
void notLeader();
}
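To see the same Curator primitive in isolation, here is a minimal, self-contained LeaderLatch usage sketch (not Flink code; the ZooKeeper address and latch path are made-up values):
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.framework.recipes.leader.LeaderLatchListener;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class LeaderLatchDemo {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        LeaderLatch latch = new LeaderLatch(client, "/demo/leaderlatch");
        latch.addListener(new LeaderLatchListener() {
            @Override
            public void isLeader() {
                // called by Curator once this latch wins the election
                System.out.println("granted leadership");
            }

            @Override
            public void notLeader() {
                // called when leadership is lost (e.g. the ZooKeeper session expires)
                System.out.println("lost leadership");
            }
        });

        latch.start();          // joins the election; the callback fires asynchronously
        Thread.sleep(10_000);   // keep the process alive long enough to observe the callback
        latch.close();
        client.close();
    }
}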
After the election completes, the Curator framework invokes these two methods on the LeaderLatchListener implementation (this). If the election is won, isLeader() is called:
@Override
public void isLeader() {
leaderElectionEventHandler.onGrantLeadership();
}
Stepping into leaderElectionEventHandler.onGrantLeadership():
@Override
@GuardedBy("lock")
public void onGrantLeadership() {
synchronized (lock) {
if (running) {
issuedLeaderSessionID = UUID.randomUUID();
clearConfirmedLeaderInformation();
if (LOG.isDebugEnabled()) {
LOG.debug(
"Grant leadership to contender {} with session ID {}.",
leaderContender.getDescription(),
issuedLeaderSessionID);
}
// grant the leader role to the current contender
leaderContender.grantLeadership(issuedLeaderSessionID);
} else {
if (LOG.isDebugEnabled()) {
LOG.debug("Ignoring the grant leadership notification since the {} has " +
"already been closed.", leaderElectionDriver);
}
}
}
}
leaderContender.grantLeadership(issuedLeaderSessionID);
@Override
public void grantLeadership(final UUID leaderSessionID) {
log.info("{} was granted leadership with leaderSessionID={}", getRestBaseUrl(), leaderSessionID);
leaderElectionService.confirmLeadership(leaderSessionID, getRestBaseUrl());
}
leaderElectionService.confirmLeadership
@Override
public void confirmLeadership(UUID leaderSessionID, String leaderAddress) {
if (LOG.isDebugEnabled()) {
LOG.debug(
"Confirm leader session ID {} for leader {}.",
leaderSessionID,
leaderAddress);
}
checkNotNull(leaderSessionID);
synchronized (lock) {
if (hasLeadership(leaderSessionID)) { // does this component currently hold leadership?
if (running) {
confirmLeaderInformation(leaderSessionID, leaderAddress);
} else {
if (LOG.isDebugEnabled()) {
LOG.debug("Ignoring the leader session Id {} confirmation, since the " +
"LeaderElectionService has already been stopped.", leaderSessionID);
}
}
} else {
// Received an old confirmation call
if (!leaderSessionID.equals(this.issuedLeaderSessionID)) {
if (LOG.isDebugEnabled()) {
LOG.debug("Receive an old confirmation call of leader session ID {}, " +
"current issued session ID is {}", leaderSessionID, issuedLeaderSessionID);
}
} else {
LOG.warn("The leader session ID {} was confirmed even though the " +
"corresponding JobManager was not elected as the leader.", leaderSessionID);
}
}
}
}
confirmLeaderInformation(leaderSessionID, leaderAddress);
@GuardedBy("lock")
private void confirmLeaderInformation(UUID leaderSessionID, String leaderAddress) {
confirmedLeaderSessionID = leaderSessionID;
confirmedLeaderAddress = leaderAddress;
leaderElectionDriver.writeLeaderInformation(
LeaderInformation.known(confirmedLeaderSessionID, confirmedLeaderAddress));
}
In other words, once leadership is confirmed, the leader information is written to ZooKeeper. The rough logic of the ZooKeeper write:
try {
final ByteArrayOutputStream baos = new ByteArrayOutputStream();
final ObjectOutputStream oos = new ObjectOutputStream(baos);
oos.writeUTF(leaderInformation.getLeaderAddress());
oos.writeObject(leaderInformation.getLeaderSessionID());
oos.close();
boolean dataWritten = false;
while (!dataWritten && leaderLatch.hasLeadership()) {
Stat stat = client.checkExists().forPath(leaderPath);
if (stat != null) {
long owner = stat.getEphemeralOwner();
long sessionID = client.getZookeeperClient().getZooKeeper().getSessionId();
if (owner == sessionID) {
try {
client.setData().forPath(leaderPath, baos.toByteArray());
dataWritten = true;
} catch (KeeperException.NoNodeException noNode) {
// node was deleted in the meantime
}
} else {
try {
client.delete().forPath(leaderPath);
} catch (KeeperException.NoNodeException noNode) {
// node was deleted in the meantime --> try again
}
}
} else {
try {
client.create().creatingParentsIfNeeded().withMode(CreateMode.EPHEMERAL).forPath(
leaderPath,
baos.toByteArray());
dataWritten = true;
} catch (KeeperException.NodeExistsException nodeExists) {
// node has been created in the meantime --> try again
}
}
}
if (LOG.isDebugEnabled()) {
LOG.debug("Successfully wrote leader information: {}.", leaderInformation);
}
}
At this point the component has written its own information to ZooKeeper. A successful write means it has acquired leadership, and clients can discover the server address through ZooKeeper. The elections of the other components below follow essentially the same procedure.
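Since the write above uses writeUTF for the address and writeObject for the session ID, a reader of the znode can decode the bytes by mirroring that order. A small sketch (printLeaderInformation is a hypothetical helper, not a Flink method):
import java.io.ByteArrayInputStream;
import java.io.ObjectInputStream;
import java.util.UUID;

public class LeaderInfoDecoder {
    // Hypothetical helper: decodes the bytes stored in the leader znode, mirroring the write logic shown above.
    static void printLeaderInformation(byte[] znodeData) throws Exception {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(znodeData))) {
            String leaderAddress = ois.readUTF();           // written with writeUTF above
            UUID leaderSessionID = (UUID) ois.readObject(); // written with writeObject above
            System.out.println("leader address = " + leaderAddress + ", session id = " + leaderSessionID);
        }
    }
}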
Next, the second component is created and started: the resourceManager.
resourceManager = resourceManagerFactory.createResourceManager(
configuration,
ResourceID.generate(),
rpcService,
highAvailabilityServices,
heartbeatServices,
fatalErrorHandler,
new ClusterInformation(hostname, blobServer.getPort()),
webMonitorEndpoint.getRestBaseUrl(),
metricRegistry,
hostname,
ioExecutor);
// ... (the Dispatcher component is created later)
log.debug("Starting ResourceManager.");
resourceManager.start(); // start it
After start() is called, the endpoint sends a message to itself, which in turn triggers its own onStart() method.
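The pattern is easier to see in a standalone Akka sketch: an actor is only considered started once it has processed a START control message that it sends to itself. The class below is an illustration under that assumption, not Flink's AkkaRpcActor:
import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;

public class StartableEndpointActor extends AbstractActor {
    enum ControlMessages { START }

    @Override
    public Receive createReceive() {
        return receiveBuilder()
                // processing the START control message is what triggers onStart()
                .matchEquals(ControlMessages.START, msg -> onStart())
                .build();
    }

    private void onStart() {
        System.out.println("endpoint started, ready to process RPCs");
    }

    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("demo");
        ActorRef endpoint = system.actorOf(Props.create(StartableEndpointActor.class));
        // same call shape as rpcEndpoint.tell(ControlMessages.START, ActorRef.noSender())
        endpoint.tell(ControlMessages.START, ActorRef.noSender());
    }
}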
The code that creates the resourceManager:
public ResourceManager<T> createResourceManager(
Configuration configuration,
ResourceID resourceId,
RpcService rpcService,
HighAvailabilityServices highAvailabilityServices,
HeartbeatServices heartbeatServices,
FatalErrorHandler fatalErrorHandler,
ClusterInformation clusterInformation,
@Nullable String webInterfaceUrl,
MetricRegistry metricRegistry,
String hostname,
Executor ioExecutor) throws Exception {
final ResourceManagerMetricGroup resourceManagerMetricGroup = ResourceManagerMetricGroup.create(metricRegistry, hostname);
final SlotManagerMetricGroup slotManagerMetricGroup = SlotManagerMetricGroup.create(metricRegistry, hostname);
//1.
final ResourceManagerRuntimeServices resourceManagerRuntimeServices = createResourceManagerRuntimeServices(
configuration, rpcService, highAvailabilityServices, slotManagerMetricGroup);
//2
return createResourceManager(
configuration,
resourceId,
rpcService,
highAvailabilityServices,
heartbeatServices,
fatalErrorHandler,
clusterInformation,
webInterfaceUrl,
resourceManagerMetricGroup,
resourceManagerRuntimeServices,
ioExecutor);
}
There are two important methods here:
(1) createResourceManagerRuntimeServices
(2) createResourceManager
(1) createResourceManagerRuntimeServices:
private ResourceManagerRuntimeServices createResourceManagerRuntimeServices(
Configuration configuration,
RpcService rpcService,
HighAvailabilityServices highAvailabilityServices,
SlotManagerMetricGroup slotManagerMetricGroup) throws ConfigurationException {
return ResourceManagerRuntimeServices.fromConfiguration(
createResourceManagerRuntimeServicesConfiguration(configuration),
highAvailabilityServices,
rpcService.getScheduledExecutor(),
slotManagerMetricGroup);
}
fromConfiguration:
public static ResourceManagerRuntimeServices fromConfiguration(
ResourceManagerRuntimeServicesConfiguration configuration,
HighAvailabilityServices highAvailabilityServices,
ScheduledExecutor scheduledExecutor,
SlotManagerMetricGroup slotManagerMetricGroup) {
// 1. create a SlotManager
final SlotManager slotManager = createSlotManager(configuration, scheduledExecutor, slotManagerMetricGroup);
// 2. create a JobLeaderIdService
final JobLeaderIdService jobLeaderIdService = new JobLeaderIdService(
highAvailabilityServices,
scheduledExecutor,
configuration.getJobTimeout());
// return a ResourceManagerRuntimeServices wrapping the SlotManager and the JobLeaderIdService
return new ResourceManagerRuntimeServices(slotManager, jobLeaderIdService);
}
So the ResourceManager relies mainly on two services: the SlotManager and the JobLeaderIdService (created here, started once leadership is granted).
(2) createResourceManager; the concrete factory is StandaloneResourceManagerFactory:
protected ResourceManager<ResourceID> createResourceManager(
Configuration configuration,
ResourceID resourceId,
RpcService rpcService,
HighAvailabilityServices highAvailabilityServices,
HeartbeatServices heartbeatServices,
FatalErrorHandler fatalErrorHandler,
ClusterInformation clusterInformation,
@Nullable String webInterfaceUrl,
ResourceManagerMetricGroup resourceManagerMetricGroup,
ResourceManagerRuntimeServices resourceManagerRuntimeServices,
Executor ioExecutor) {
final Time standaloneClusterStartupPeriodTime = ConfigurationUtils.getStandaloneClusterStartupPeriodTime(configuration);
return new StandaloneResourceManager(
rpcService,
resourceId,
highAvailabilityServices,
heartbeatServices,
resourceManagerRuntimeServices.getSlotManager(),
ResourceManagerPartitionTrackerImpl::new,
resourceManagerRuntimeServices.getJobLeaderIdService(),
clusterInformation,
fatalErrorHandler,
resourceManagerMetricGroup,
standaloneClusterStartupPeriodTime,
AkkaUtils.getTimeoutAsTime(configuration),
ioExecutor);
}
After the resourceManager is created, it is started by calling resourceManager.start():
// ------------------------------------------------------------------------
// Start & shutdown & lifecycle callbacks
// ------------------------------------------------------------------------
/**
* Triggers start of the rpc endpoint. This tells the underlying rpc server that the rpc endpoint is ready
* to process remote procedure calls.
*/
public final void start() {
rpcServer.start();
}
RpcServer.start() is a framework-level method; it ends up in the start method of the subclass AkkaInvocationHandler:
class AkkaInvocationHandler implements InvocationHandler, AkkaBasedEndpoint, RpcServer
@Override
public void start() {
rpcEndpoint.tell(ControlMessages.START, ActorRef.noSender());
}
Here rpcEndpoint is an ActorRef. rpcEndpoint.tell(ControlMessages.START, ActorRef.noSender()) sends a START message to itself, which routes execution to StandaloneResourceManager's onStart; what actually runs is the onStart method of its parent class ResourceManager:
@Override
public final void onStart() throws Exception {
try {
startResourceManagerServices();
} catch (Throwable t) {
final ResourceManagerException exception = new ResourceManagerException(String.format("Could not start the ResourceManager %s", getAddress()), t);
onFatalError(exception);
throw exception;
}
}
private void startResourceManagerServices() throws Exception {
try {
// leader election service: on success the ResourceManager writes its own information to ZooKeeper
leaderElectionService = highAvailabilityServices.getResourceManagerLeaderElectionService();
initialize();
leaderElectionService.start(this);
jobLeaderIdService.start(new JobLeaderIdActionsImpl());
registerTaskExecutorMetrics();
} catch (Exception e) {
handleStartResourceManagerServicesException(e);
}
}
The election procedure is the same as for the REST service above. Eventually this is reached:
public void onGrantLeadership() {
synchronized (lock) {
if (running) {
issuedLeaderSessionID = UUID.randomUUID();
clearConfirmedLeaderInformation();
if (LOG.isDebugEnabled()) {
LOG.debug(
"Grant leadership to contender {} with session ID {}.",
leaderContender.getDescription(),
issuedLeaderSessionID);
}
// execution reaches here
leaderContender.grantLeadership(issuedLeaderSessionID);
} else {
if (LOG.isDebugEnabled()) {
LOG.debug("Ignoring the grant leadership notification since the {} has " +
"already been closed.", leaderElectionDriver);
}
}
}
}
Finally, the ResourceManager's grantLeadership method is called:
@Override
public void grantLeadership(final UUID newLeaderSessionID) {
final CompletableFuture<Boolean> acceptLeadershipFuture = clearStateFuture
.thenComposeAsync((ignored) -> tryAcceptLeadership(newLeaderSessionID), getUnfencedMainThreadExecutor());
final CompletableFuture<Void> confirmationFuture = acceptLeadershipFuture.thenAcceptAsync(
(acceptLeadership) -> {
if (acceptLeadership) {
// confirming the leader session ID might be blocking,
leaderElectionService.confirmLeadership(newLeaderSessionID, getAddress());
}
},
ioExecutor);
confirmationFuture.whenComplete(
(Void ignored, Throwable throwable) -> {
if (throwable != null) {
onFatalError(ExceptionUtils.stripCompletionException(throwable));
}
});
}
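Note the CompletableFuture composition style used here: thenComposeAsync runs tryAcceptLeadership on the (unfenced) main thread executor, thenAcceptAsync pushes the possibly blocking confirmation onto the IO executor, and whenComplete funnels any failure into the fatal error handler. A minimal standalone sketch of that shape (names invented for illustration, not Flink code):
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LeadershipChainDemo {
    public static void main(String[] args) {
        ExecutorService mainThreadExecutor = Executors.newSingleThreadExecutor();
        ExecutorService ioExecutor = Executors.newSingleThreadExecutor();

        CompletableFuture<Void> clearStateFuture = CompletableFuture.completedFuture(null);

        // Step 1: decide whether to accept leadership on the "main thread" executor.
        CompletableFuture<Boolean> acceptLeadershipFuture = clearStateFuture
                .thenComposeAsync(ignored -> tryAcceptLeadership(), mainThreadExecutor);

        // Step 2: the confirmation might block, so it runs on the IO executor.
        CompletableFuture<Void> confirmationFuture = acceptLeadershipFuture
                .thenAcceptAsync(accepted -> {
                    if (accepted) {
                        System.out.println("confirmLeadership(...) would be called here");
                    }
                }, ioExecutor);

        // Step 3: any exception anywhere in the chain surfaces here.
        confirmationFuture.whenComplete((ignored, throwable) -> {
            if (throwable != null) {
                System.err.println("fatal error: " + throwable);
            }
            mainThreadExecutor.shutdown();
            ioExecutor.shutdown();
        });
    }

    private static CompletableFuture<Boolean> tryAcceptLeadership() {
        return CompletableFuture.completedFuture(true);
    }
}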
There are two important parts here. First, the code:
tryAcceptLeadership(newLeaderSessionID), getUnfencedMainThreadExecutor());
private CompletableFuture<Boolean> tryAcceptLeadership(final UUID newLeaderSessionID) {
if (leaderElectionService.hasLeadership(newLeaderSessionID)) {
final ResourceManagerId newResourceManagerId = ResourceManagerId.fromUuid(newLeaderSessionID);
log.info("ResourceManager {} was granted leadership with fencing token {}", getAddress(), newResourceManagerId);
// clear the state if we've been the leader before
if (getFencingToken() != null) {
clearStateInternal();
}
setFencingToken(newResourceManagerId);
// only the ResourceManager that became leader runs this and starts its services
startServicesOnLeadership();
return prepareLeadershipAsync().thenApply(ignored -> true);
} else {
return CompletableFuture.completedFuture(false);
}
}
startServicesOnLeadership(): only the ResourceManager that won the election executes this method; it starts two heartbeat services and two scheduled tasks.
private void startServicesOnLeadership() {
// start the heartbeat services
startHeartbeatServices();
// start the slotManager, which schedules two periodic tasks:
// 1. check whether TaskExecutors are still alive (considered dead after 50 s without a heartbeat)
// 2. check pending slot requests; requests that exceed the slot request timeout are abandoned
slotManager.start(getFencingToken(), getMainThreadExecutor(), new ResourceActionsImpl());
//
onLeadership();
}
1. Starting the heartbeat services:
private void startHeartbeatServices() {
// heartbeats with the TaskManagers
taskManagerHeartbeatManager = heartbeatServices.createHeartbeatManagerSender(
resourceId,
new TaskManagerHeartbeatListener(),
getMainThreadExecutor(),
log);
// heartbeats with the JobManagers (JobMasters)
jobManagerHeartbeatManager = heartbeatServices.createHeartbeatManagerSender(
resourceId,
new JobManagerHeartbeatListener(),
getMainThreadExecutor(),
log);
}
createHeartbeatManagerSender is defined in HeartbeatServices; there is a similar method createHeartbeatManager. The difference is that the "Sender" variant is the active side that initiates the heartbeat requests.
public <I, O> HeartbeatManager<I, O> createHeartbeatManagerSender(
ResourceID resourceId,
HeartbeatListener<I, O> heartbeatListener,
ScheduledExecutor mainThreadExecutor,
Logger log) {
return new HeartbeatManagerSenderImpl<>(
heartbeatInterval,
heartbeatTimeout,
resourceId,
heartbeatListener,
mainThreadExecutor,
log);
}
HeartbeatManagerSenderImpl(
long heartbeatPeriod,
long heartbeatTimeout,
ResourceID ownResourceID,
HeartbeatListener<I, O> heartbeatListener,
ScheduledExecutor mainThreadExecutor,
Logger log) {
this(
heartbeatPeriod,
heartbeatTimeout,
ownResourceID,
heartbeatListener,
mainThreadExecutor,
log,
new HeartbeatMonitorImpl.Factory<>());
}
HeartbeatManagerSenderImpl(
long heartbeatPeriod,
long heartbeatTimeout,
ResourceID ownResourceID,
HeartbeatListener<I, O> heartbeatListener,
ScheduledExecutor mainThreadExecutor,
Logger log,
HeartbeatMonitor.Factory<O> heartbeatMonitorFactory) {
super(
heartbeatTimeout,
ownResourceID,
heartbeatListener,
mainThreadExecutor,
log,
heartbeatMonitorFactory);
this.heartbeatPeriod = heartbeatPeriod;
mainThreadExecutor.schedule(this, 0L, TimeUnit.MILLISECONDS);
}
mainThreadExecutor.schedule(this, 0L, TimeUnit.MILLISECONDS) schedules this runnable on the main thread executor, kicking off the periodic heartbeat loop:
@Override
public void run() {
if (!stopped) {
log.debug("Trigger heartbeat request.");
for (HeartbeatMonitor<O> heartbeatMonitor : getHeartbeatTargets().values()) {
requestHeartbeat(heartbeatMonitor);
}
// scheduled once at startup, then rescheduled after heartbeatPeriod (every 10 s)
getMainThreadExecutor().schedule(this, heartbeatPeriod, TimeUnit.MILLISECONDS);
}
}
Each HeartbeatMonitor represents one heartbeat target, here a TaskExecutor.
Next, look at slotManager.start(getFencingToken(), getMainThreadExecutor(), new ResourceActionsImpl()):
@Override
public void start(ResourceManagerId newResourceManagerId, Executor newMainThreadExecutor, ResourceActions newResourceActions) {
LOG.info("Starting the SlotManager.");
this.resourceManagerId = Preconditions.checkNotNull(newResourceManagerId);
mainThreadExecutor = Preconditions.checkNotNull(newMainThreadExecutor);
resourceActions = Preconditions.checkNotNull(newResourceActions);
started = true;
// the first scheduled task
taskManagerTimeoutsAndRedundancyCheck = scheduledExecutor.scheduleWithFixedDelay(
() -> mainThreadExecutor.execute(
() -> checkTaskManagerTimeoutsAndRedundancy()),
0L,
taskManagerTimeout.toMilliseconds(),
TimeUnit.MILLISECONDS);
// the second scheduled task
slotRequestTimeoutCheck = scheduledExecutor.scheduleWithFixedDelay(
() -> mainThreadExecutor.execute(
() -> checkSlotRequestTimeouts()),
0L,
slotRequestTimeout.toMilliseconds(),
TimeUnit.MILLISECONDS);
registerSlotManagerMetrics();
}
The first scheduled task, checkTaskManagerTimeoutsAndRedundancy, checks which TaskManagers have timed out (heartbeat interval 10 s, check interval 30 s, timeout 50 s).
void checkTaskManagerTimeoutsAndRedundancy() {
if (!taskManagerRegistrations.isEmpty()) {
long currentTime = System.currentTimeMillis();
ArrayList<TaskManagerRegistration> timedOutTaskManagers = new ArrayList<>(taskManagerRegistrations.size());
// first retrieve the timed out TaskManagers
for (TaskManagerRegistration taskManagerRegistration : taskManagerRegistrations.values()) {
if (currentTime - taskManagerRegistration.getIdleSince() >= taskManagerTimeout.toMilliseconds()) {
// we collect the instance ids first in order to avoid concurrent modifications by the
// ResourceActions.releaseResource call
timedOutTaskManagers.add(taskManagerRegistration);
}
}
int slotsDiff = redundantTaskManagerNum * numSlotsPerWorker - freeSlots.size();
if (freeSlots.size() == slots.size()) {
// No need to keep redundant taskManagers if no job is running.
releaseTaskExecutors(timedOutTaskManagers, timedOutTaskManagers.size());
} else if (slotsDiff > 0) {
// Keep enough redundant taskManagers from time to time.
int requiredTaskManagers = MathUtils.divideRoundUp(slotsDiff, numSlotsPerWorker);
allocateRedundantTaskManagers(requiredTaskManagers);
} else {
// second we trigger the release resource callback which can decide upon the resource release
int maxReleaseNum = (-slotsDiff) / numSlotsPerWorker;
releaseTaskExecutors(timedOutTaskManagers, Math.min(maxReleaseNum, timedOutTaskManagers.size()));
}
}
}
The second scheduled task, checkSlotRequestTimeouts, checks which slot requests have timed out; the timeout is 5 minutes.
private void checkSlotRequestTimeouts() {
if (!pendingSlotRequests.isEmpty()) {
long currentTime = System.currentTimeMillis();
Iterator<Map.Entry<AllocationID, PendingSlotRequest>> slotRequestIterator = pendingSlotRequests.entrySet().iterator();
while (slotRequestIterator.hasNext()) {
PendingSlotRequest slotRequest = slotRequestIterator.next().getValue();
if (currentTime - slotRequest.getCreationTimestamp() >= slotRequestTimeout.toMilliseconds()) {
slotRequestIterator.remove();
if (slotRequest.isAssigned()) {
cancelPendingSlotRequest(slotRequest);
}
resourceActions.notifyAllocationFailure(
slotRequest.getJobId(),
slotRequest.getAllocationId(),
new TimeoutException("The allocation could not be fulfilled in time."));
}
}
}
}
So winning the election causes the ResourceManager to start two heartbeat services and two scheduled tasks.
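The timing values mentioned above (10 s heartbeat interval, 30/50 s TaskManager checks, 5 min slot request timeout) all come from configuration. As a hedged sketch (the key names below are how I recall them for Flink 1.x; please verify them against your version's documentation), they can be overridden programmatically like this:
import org.apache.flink.configuration.Configuration;

public class TimeoutConfigSketch {
    public static Configuration build() {
        Configuration conf = new Configuration();
        // NOTE: key names assumed from Flink 1.x documentation; verify against your version.
        // how often the active side triggers heartbeat requests (default 10 s)
        conf.setLong("heartbeat.interval", 10_000L);
        // after how long without a heartbeat the target is marked dead (default 50 s)
        conf.setLong("heartbeat.timeout", 50_000L);
        // how long an idle TaskManager may stay registered before release is considered (default 30 s)
        conf.setLong("resourcemanager.taskmanager-timeout", 30_000L);
        // how long a pending slot request may stay unfulfilled (default 5 minutes)
        conf.setLong("slot.request.timeout", 300_000L);
        return conf;
    }
}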
// inside create(), the dispatcher runner is created and started
log.debug("Starting Dispatcher.");
dispatcherRunner = dispatcherRunnerFactory.createDispatcherRunner(
highAvailabilityServices.getDispatcherLeaderElectionService(),
fatalErrorHandler,
new HaServicesJobGraphStoreFactory(highAvailabilityServices),
ioExecutor,
rpcService,
partialDispatcherServices);
1. Creating the runner via the factory: the dispatcherRunnerFactory.createDispatcherRunner method:
/**
 * 1. create the dispatcher
 * 2. start the dispatcher
 */
@Override
public DispatcherRunner createDispatcherRunner(
LeaderElectionService leaderElectionService,
FatalErrorHandler fatalErrorHandler,
JobGraphStoreFactory jobGraphStoreFactory,
Executor ioExecutor,
RpcService rpcService,
PartialDispatcherServices partialDispatcherServices) throws Exception {
//
final DispatcherLeaderProcessFactory dispatcherLeaderProcessFactory = dispatcherLeaderProcessFactoryFactory.createFactory(
jobGraphStoreFactory,
ioExecutor,
rpcService,
partialDispatcherServices,
fatalErrorHandler);
return DefaultDispatcherRunner.create(
leaderElectionService,
fatalErrorHandler,
dispatcherLeaderProcessFactory);
}
First a factory is created:
public DispatcherLeaderProcessFactory createFactory(
JobGraphStoreFactory jobGraphStoreFactory,
Executor ioExecutor,
RpcService rpcService,
PartialDispatcherServices partialDispatcherServices,
FatalErrorHandler fatalErrorHandler) {
final AbstractDispatcherLeaderProcess.DispatcherGatewayServiceFactory dispatcherGatewayServiceFactory = new DefaultDispatcherGatewayServiceFactory(
dispatcherFactory,
rpcService,
partialDispatcherServices);
return new SessionDispatcherLeaderProcessFactory(
dispatcherGatewayServiceFactory,
jobGraphStoreFactory,
ioExecutor,
fatalErrorHandler);
}
SessionDispatcherLeaderProcessFactory is used to create SessionDispatcherLeaderProcess instances; this factory is handed over as the DispatcherLeaderProcessFactory.
Then the create method is called:
public static DispatcherRunner create(
LeaderElectionService leaderElectionService,
FatalErrorHandler fatalErrorHandler,
DispatcherLeaderProcessFactory dispatcherLeaderProcessFactory) throws Exception {
// create the DefaultDispatcherRunner
final DefaultDispatcherRunner dispatcherRunner = new DefaultDispatcherRunner(
leaderElectionService,
fatalErrorHandler,
dispatcherLeaderProcessFactory);
// start the DefaultDispatcherRunner's lifecycle (register it for leader election)
return DispatcherRunnerLeaderElectionLifecycleManager.createFor(dispatcherRunner, leaderElectionService);
}
public static <T extends DispatcherRunner & LeaderContender> DispatcherRunner createFor(T dispatcherRunner, LeaderElectionService leaderElectionService) throws Exception {
return new DispatcherRunnerLeaderElectionLifecycleManager<>(dispatcherRunner, leaderElectionService);
}
The leader election service is passed into this constructor as the leaderElectionService parameter:
private DispatcherRunnerLeaderElectionLifecycleManager(T dispatcherRunner, LeaderElectionService leaderElectionService) throws Exception {
this.dispatcherRunner = dispatcherRunner;
this.leaderElectionService = leaderElectionService;
leaderElectionService.start(dispatcherRunner);
}
Inside, the election is started: leaderElectionService.start(dispatcherRunner):
public final void start(LeaderContender contender) throws Exception {
checkNotNull(contender, "Contender must not be null.");
Preconditions.checkState(leaderContender == null, "Contender was already set.");
synchronized (lock) {
leaderContender = contender;
leaderElectionDriver = leaderElectionDriverFactory.createLeaderElectionDriver(
this, new LeaderElectionFatalErrorHandler(), leaderContender.getDescription());
LOG.info("Starting DefaultLeaderElectionService with {}.", leaderElectionDriver);
running = true;
}
}
Here the leaderContender is the dispatcherRunner. We arrive again at ZooKeeperLeaderElectionDriver:
public ZooKeeperLeaderElectionDriver(
CuratorFramework client,
String latchPath,
String leaderPath,
LeaderElectionEventHandler leaderElectionEventHandler,
FatalErrorHandler fatalErrorHandler,
String leaderContenderDescription) throws Exception {
this.client = checkNotNull(client);
this.leaderPath = checkNotNull(leaderPath);
this.leaderElectionEventHandler = checkNotNull(leaderElectionEventHandler);
this.fatalErrorHandler = checkNotNull(fatalErrorHandler);
this.leaderContenderDescription = checkNotNull(leaderContenderDescription);
leaderLatch = new LeaderLatch(client, checkNotNull(latchPath));
cache = new NodeCache(client, leaderPath);
client.getUnhandledErrorListenable().addListener(this);
running = true;
leaderLatch.addListener(this);
leaderLatch.start();
cache.getListenable().addListener(this);
cache.start();
client.getConnectionStateListenable().addListener(listener);
}
When the election is won, ZooKeeperLeaderElectionDriver's isLeader callback fires:
@Override
public void isLeader() {
leaderElectionEventHandler.onGrantLeadership();
}
which leads to DefaultDispatcherRunner's grantLeadership method:
@Override
public void grantLeadership(UUID leaderSessionID) {
runActionIfRunning(() -> startNewDispatcherLeaderProcess(leaderSessionID));
}
which calls startNewDispatcherLeaderProcess:
private void startNewDispatcherLeaderProcess(UUID leaderSessionID) {
// stop the old leader process
stopDispatcherLeaderProcess();
// create a new one
dispatcherLeaderProcess = createNewDispatcherLeaderProcess(leaderSessionID);
// start the new one
final DispatcherLeaderProcess newDispatcherLeaderProcess = dispatcherLeaderProcess;
FutureUtils.assertNoException(
previousDispatcherLeaderProcessTerminationFuture.thenRun(newDispatcherLeaderProcess::start));
}
The start call, newDispatcherLeaderProcess::start, resolves to the start method of AbstractDispatcherLeaderProcess:
@Override
public final void start() {
runIfStateIs(
State.CREATED,
this::startInternal);
}
private void startInternal() {
log.info("Start {}.", getClass().getSimpleName());
state = State.RUNNING;
onStart();
}
onStart() here is the implementation in SessionDispatcherLeaderProcess:
@Override
protected void onStart() {
startServices();
onGoingRecoveryOperation = recoverJobsAsync() // fetch all persisted JobGraph objects
.thenAccept(this::createDispatcherIfRunning) // create the Dispatcher and hand it the recovered jobs
.handle(this::onErrorIfRunning);
}
1. startServices(): starts the jobGraphStore, which is used to store JobGraphs.
private void startServices() {
try {
jobGraphStore.start(this);
} catch (Exception e) {
throw new FlinkRuntimeException(
String.format(
"Could not start %s when trying to start the %s.",
jobGraphStore.getClass().getSimpleName(),
getClass().getSimpleName()),
e);
}
}
2. recoverJobsAsync(): asynchronously recovers the jobs waiting to be executed by fetching all JobGraphs that need recovery; the actual re-execution happens in the subsequent .thenAccept(this::createDispatcherIfRunning).
private CompletableFuture<Collection<JobGraph>> recoverJobsAsync() {
return CompletableFuture.supplyAsync(
this::recoverJobsIfRunning,
ioExecutor);
}
which calls recoverJobsIfRunning:
private Collection<JobGraph> recoverJobsIfRunning() {
return supplyUnsynchronizedIfRunning(this::recoverJobs).orElse(Collections.emptyList());
}
The recovery method, recoverJobs:
private Collection<JobGraph> recoverJobs() {
log.info("Recover all persisted job graphs.");
final Collection<JobID> jobIds = getJobIds();
final Collection<JobGraph> recoveredJobGraphs = new ArrayList<>();
for (JobID jobId : jobIds) {
recoveredJobGraphs.add(recoverJob(jobId));
}
log.info("Successfully recovered {} persisted job graphs.", recoveredJobGraphs.size());
return recoveredJobGraphs;
}
It fetches all job IDs and then recovers each job by its jobId:
private Collection<JobID> getJobIds() {
try {
return jobGraphStore.getJobIds();
} catch (Exception e) {
throw new FlinkRuntimeException(
"Could not retrieve job ids of persisted jobs.",
e);
}
}
recoverJob is the method that actually recovers a job:
private JobGraph recoverJob(JobID jobId) {
log.info("Trying to recover job with job id {}.", jobId);
try {
return jobGraphStore.recoverJobGraph(jobId);
} catch (Exception e) {
throw new FlinkRuntimeException(
String.format("Could not recover job with job id %s.", jobId),
e);
}
}
The implementation is DefaultJobGraphStore.recoverJobGraph, which retrieves the persisted JobGraph:
public JobGraph recoverJobGraph(JobID jobId) throws Exception {
checkNotNull(jobId, "Job ID");
LOG.debug("Recovering job graph {} from {}.", jobId, jobGraphStateHandleStore);
final String name = jobGraphStoreUtil.jobIDToName(jobId);
synchronized (lock) {
verifyIsRunning();
boolean success = false;
RetrievableStateHandle<JobGraph> jobGraphRetrievableStateHandle;
try {
try {
jobGraphRetrievableStateHandle = jobGraphStateHandleStore.getAndLock(name);
} catch (StateHandleStore.NotExistException ignored) {
success = true;
return null;
} catch (Exception e) {
throw new FlinkException("Could not retrieve the submitted job graph state handle " +
"for " + name + " from the submitted job graph store.", e);
}
JobGraph jobGraph;
try {
jobGraph = jobGraphRetrievableStateHandle.retrieveState();
} catch (ClassNotFoundException cnfe) {
throw new FlinkException("Could not retrieve submitted JobGraph from state handle under " + name +
". This indicates that you are trying to recover from state written by an " +
"older Flink version which is not compatible. Try cleaning the state handle store.", cnfe);
} catch (IOException ioe) {
throw new FlinkException("Could not retrieve submitted JobGraph from state handle under " + name +
". This indicates that the retrieved state handle is broken. Try cleaning the state handle " +
"store.", ioe);
}
addedJobGraphs.add(jobGraph.getJobID());
LOG.info("Recovered {}.", jobGraph);
success = true;
return jobGraph;
} finally {
if (!success) {
jobGraphStateHandleStore.release(name);
}
}
}
}
Now that the JobGraphs to recover have been fetched, .thenAccept(this::createDispatcherIfRunning) runs next:
private void createDispatcherIfRunning(Collection<JobGraph> jobGraphs) {
runIfStateIs(State.RUNNING, () -> createDispatcher(jobGraphs));
}
A Dispatcher is created to schedule the recovered JobGraphs:
private void createDispatcher(Collection<JobGraph> jobGraphs) {
final DispatcherGatewayService dispatcherService = dispatcherGatewayServiceFactory.create(
DispatcherId.fromUuid(getLeaderSessionId()),
jobGraphs,
jobGraphStore);
completeDispatcherSetup(dispatcherService);
}
The recovered JobGraphs are actually started inside dispatcherGatewayServiceFactory.create:
public AbstractDispatcherLeaderProcess.DispatcherGatewayService create(
DispatcherId fencingToken,
Collection<JobGraph> recoveredJobs,
JobGraphWriter jobGraphWriter) {
final Dispatcher dispatcher;
try {
// this returns a StandaloneDispatcher
dispatcher = dispatcherFactory.createDispatcher(
rpcService,
fencingToken,
recoveredJobs,
(dispatcherGateway, scheduledExecutor, errorHandler) -> new NoOpDispatcherBootstrap(),
PartialDispatcherServicesWithJobGraphStore.from(partialDispatcherServices, jobGraphWriter));
} catch (Exception e) {
throw new FlinkRuntimeException("Could not create the Dispatcher rpc endpoint.", e);
}
dispatcher.start();
return DefaultDispatcherGatewayService.from(dispatcher);
}
To summarize the whole flow:
onStart()
-> recoverJobsAsync() (recover all persisted JobGraphs)
-> createDispatcherIfRunning()
-> runIfStateIs(State.RUNNING, () -> createDispatcher(jobGraphs));
-> createDispatcher(jobGraphs) (create the Dispatcher with the recovered JobGraphs)
In other words, a Dispatcher is created and handed all the recovered JobGraphs so they can be scheduled and run.
createDispatcher returns a StandaloneDispatcher:
@Override
public StandaloneDispatcher createDispatcher(
RpcService rpcService,
DispatcherId fencingToken,
Collection<JobGraph> recoveredJobs,
DispatcherBootstrapFactory dispatcherBootstrapFactory,
PartialDispatcherServicesWithJobGraphStore partialDispatcherServicesWithJobGraphStore) throws Exception {
// create the default dispatcher
return new StandaloneDispatcher(
rpcService,
fencingToken,
recoveredJobs,
dispatcherBootstrapFactory,
DispatcherServices.from(partialDispatcherServicesWithJobGraphStore, DefaultJobManagerRunnerFactory.INSTANCE));
}
The dispatcher's start() again goes through rpcServer.start(), which routes to the onStart method, i.e. StandaloneDispatcher's onStart:
public final void start() {
rpcServer.start();
}
What actually runs is the onStart method of the parent class Dispatcher:
@Override
public void onStart() throws Exception {
try {
startDispatcherServices();
} catch (Throwable t) {
final DispatcherException exception = new DispatcherException(String.format("Could not start the Dispatcher %s", getAddress()), t);
onFatalError(exception);
throw exception;
}
startRecoveredJobs();
this.dispatcherBootstrap = this.dispatcherBootstrapFactory.create(
getSelfGateway(DispatcherGateway.class),
this.getRpcService().getScheduledExecutor() ,
this::onFatalError);
}
startRecoveredJobs() schedules the recovered jobs for execution:
private void startRecoveredJobs() {
for (JobGraph recoveredJob : recoveredJobs) {
runRecoveredJob(recoveredJob);
}
recoveredJobs.clear();
}
private void runRecoveredJob(final JobGraph recoveredJob) {
checkNotNull(recoveredJob);
try {
runJob(recoveredJob, ExecutionType.RECOVERY);
} catch (Throwable throwable) {
onFatalError(new DispatcherException(String.format("Could not start recovered job %s.", recoveredJob.getJobID()), throwable));
}
}
At this point the jobs are running again, with execution type ExecutionType.RECOVERY.
After the jobs are recovered, completeDispatcherSetup is called to wire up a few callbacks:
final void completeDispatcherSetup(DispatcherGatewayService dispatcherService) {
runIfStateIs(
State.RUNNING,
() -> completeDispatcherSetupInternal(dispatcherService));
}
private void completeDispatcherSetupInternal(DispatcherGatewayService createdDispatcherService) {
Preconditions.checkState(dispatcherService == null, "The DispatcherGatewayService can only be set once.");
dispatcherService = createdDispatcherService;
dispatcherGatewayFuture.complete(createdDispatcherService.getGateway());
FutureUtils.forward(createdDispatcherService.getShutDownFuture(), shutDownFuture);
handleUnexpectedDispatcherServiceTermination(createdDispatcherService);
}
At this point the three most important components of the JobManager have been started. To recap:
WebMonitorEndpoint:
- Initializes a large number of Handlers and one Router, sorts and de-duplicates the handlers, and then registers each Handler with the Router
- Starts a Netty server
- Starts its internal services and runs for election: the WebMonitorEndpoint itself is a LeaderContender; if the election is won, isLeader() is called back
- Winning the election simply means writing the WebMonitorEndpoint's address and the leader session ID into a znode in ZooKeeper
- Starts a periodic cleanup task for the ExecutionGraph cache
ResourceManager:
1. ResourceManager is a subclass of RpcEndpoint, so after the ResourceManager object is built, start() is called on the RpcEndpoint, which routes to its onStart() method.
2. ResourceManager is also a LeaderContender: it takes part in the election through the LeaderElectionService, and if it wins, isLeader() is called back.
3. The services the ResourceManager needs are then started:
Two heartbeat services:
(1) heartbeats between the ResourceManager and the TaskExecutors
(2) heartbeats between the ResourceManager and the JobMasters
Two scheduled tasks:
(1) checkTaskManagerTimeoutsAndRedundancy(), which checks for TaskExecutor timeouts
(2) checkSlotRequestTimeouts(), which checks for SlotRequest timeouts
Dispatcher:
1. Starts the JobGraphStore service
2. Recovers jobs from the JobGraphStore and runs them, which requires starting the Dispatcher
And with that, the JobManager startup is complete!