Flink JobMaster Analysis

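1. JobMaster and JobManager
The previous article focused on how the job graph is imported and distributed. Because of version iteration, JobMaster and JobManager implement essentially the same logic, so this article covers only JobMaster and leaves the old JobManager aside. As mentioned earlier, the job graph is passed along as follows: a JobManagerRunner is created and hands the graph to the JobMaster, and the ExecutionGraph then delivers it to the Tasks.
The code shown earlier contained: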

public abstract class Dispatcher extends FencedRpcEndpoint<DispatcherId> implements
    DispatcherGateway, LeaderContender, SubmittedJobGraphStore.SubmittedJobGraphListener {
......

    private final Map<JobID, CompletableFuture<JobManagerRunner>> jobManagerRunnerFutures;

    private final LeaderElectionService leaderElectionService;

    private final ArchivedExecutionGraphStore archivedExecutionGraphStore;

  // Factory object that creates JobManagerRunner instances
    private final JobManagerRunnerFactory jobManagerRunnerFactory;
  ......
}
// The JobManagerRunner is obtained here; note that ArchivedExecutionGraph and ExecutionGraph implement the common interface AccessExecutionGraph
private JobManagerRunner startJobManagerRunner(JobManagerRunner jobManagerRunner) throws Exception {
  final JobID jobId = jobManagerRunner.getJobGraph().getJobID();
  jobManagerRunner.getResultFuture().whenCompleteAsync(
    (ArchivedExecutionGraph archivedExecutionGraph, Throwable throwable) -> {
      // check if we are still the active JobManagerRunner by checking the identity
      //noinspection ObjectEquality
      if (jobManagerRunner == jobManagerRunnerFutures.get(jobId).getNow(null)) {
        if (archivedExecutionGraph != null) {
          jobReachedGloballyTerminalState(archivedExecutionGraph);
        } else {
          final Throwable strippedThrowable = ExceptionUtils.stripCompletionException(throwable);
......
        }
      } else {
        log.debug("There is a newer JobManagerRunner for the job {}.", jobId);
      }
    }, getMainThreadExecutor());

  jobManagerRunner.start();

  return jobManagerRunner;
}
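
The identity check in startJobManagerRunner (comparing the runner against whatever is currently registered for the job) can be tried in isolation. The following is a minimal sketch, not Flink code: `ActiveRunnerCheck`, `Runner`, and the use of a `String` job id are hypothetical stand-ins, and the extra null check on the map lookup is an addition that the excerpt above does not have.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

final class ActiveRunnerCheck {
    // jobId -> future of the runner currently registered for that job
    private final Map<String, CompletableFuture<Runner>> runnerFutures = new ConcurrentHashMap<>();

    void register(String jobId, Runner runner) {
        runnerFutures.put(jobId, CompletableFuture.completedFuture(runner));
    }

    // Same identity test as in startJobManagerRunner: getNow(null) yields the
    // registered runner if its future is already complete, otherwise null.
    boolean isActive(String jobId, Runner runner) {
        CompletableFuture<Runner> future = runnerFutures.get(jobId);
        return future != null && runner == future.getNow(null);
    }

    public static void main(String[] args) {
        ActiveRunnerCheck check = new ActiveRunnerCheck();
        Runner oldRunner = new Runner("old");
        Runner newRunner = new Runner("new");
        check.register("job-1", oldRunner);
        check.register("job-1", newRunner); // a newer runner replaces the old one
        System.out.println(check.isActive("job-1", oldRunner)); // false
        System.out.println(check.isActive("job-1", newRunner)); // true
    }
}

// Hypothetical stand-in for JobManagerRunner; only object identity matters here.
final class Runner {
    final String name;

    Runner(String name) {
        this.name = name;
    }
}
```

A completion callback that belongs to a replaced runner fails the check and can simply log and return, which is what the "There is a newer JobManagerRunner" branch does.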
These two snippets show where the two classes are used. Now let's look at how they are implemented:


public class JobManagerRunner implements LeaderContender, OnCompletionActions, AutoCloseableAsync {

    private static final Logger log = LoggerFactory.getLogger(JobManagerRunner.class);

    // ------------------------------------------------------------------------

    /** Lock to ensure that this runner can deal with leader election event and job completion notifies simultaneously. */
    private final Object lock = new Object();

    /** The job graph needs to run. */
    private final JobGraph jobGraph;

    /** Used to check whether a job needs to be run. */
    private final RunningJobsRegistry runningJobsRegistry;

    /** Leader election for this job. */
    private final LeaderElectionService leaderElectionService;

    private final LibraryCacheManager libraryCacheManager;

    private final Executor executor;

    private final JobMasterService jobMasterService;

    private final FatalErrorHandler fatalErrorHandler;

    private final CompletableFuture<ArchivedExecutionGraph> resultFuture;

    private final CompletableFuture<Void> terminationFuture;

    private CompletableFuture<Void> leadershipOperation;

    /** flag marking the runner as shut down. */
    private volatile boolean shutdown;

    private volatile CompletableFuture<JobMasterGateway> leaderGatewayFuture;

    // ------------------------------------------------------------------------

    /**
     * Exceptions that occur while creating the JobManager or JobManagerRunner are directly
     * thrown and not reported to the given {@code FatalErrorHandler}.
     *
     * @throws Exception Thrown if the runner cannot be set up, because either one of the
     *                   required services could not be started, or the Job could not be initialized.
     */
    public JobManagerRunner(
            final JobGraph jobGraph,
            final JobMasterServiceFactory jobMasterFactory,
            final HighAvailabilityServices haServices,
            final LibraryCacheManager libraryCacheManager,
            final Executor executor,
            final FatalErrorHandler fatalErrorHandler) throws Exception {

        this.resultFuture = new CompletableFuture<>();
        this.terminationFuture = new CompletableFuture<>();
        this.leadershipOperation = CompletableFuture.completedFuture(null);

        // make sure we cleanly shut down our JobManager services if initialization fails
        try {
            this.jobGraph = checkNotNull(jobGraph);
            this.libraryCacheManager = checkNotNull(libraryCacheManager);
            this.executor = checkNotNull(executor);
            this.fatalErrorHandler = checkNotNull(fatalErrorHandler);

            checkArgument(jobGraph.getNumberOfVertices() > 0, "The given job is empty");

            // libraries and class loader first
            try {
                libraryCacheManager.registerJob(
                        jobGraph.getJobID(), jobGraph.getUserJarBlobKeys(), jobGraph.getClasspaths());
            } catch (IOException e) {
                throw new Exception("Cannot set up the user code libraries: " + e.getMessage(), e);
            }

            final ClassLoader userCodeLoader = libraryCacheManager.getClassLoader(jobGraph.getJobID());
            if (userCodeLoader == null) {
                throw new Exception("The user code class loader could not be initialized.");
            }

            // high availability services next
            this.runningJobsRegistry = haServices.getRunningJobsRegistry();
            this.leaderElectionService = haServices.getJobManagerLeaderElectionService(jobGraph.getJobID());

            this.leaderGatewayFuture = new CompletableFuture<>();

            // now start the JobManager
            this.jobMasterService = jobMasterFactory.createJobMasterService(jobGraph, this, userCodeLoader);
        }
        catch (Throwable t) {
            terminationFuture.completeExceptionally(t);
            resultFuture.completeExceptionally(t);

            throw new JobExecutionException(jobGraph.getJobID(), "Could not set up JobManager", t);
        }
    }
......

    public void start() throws Exception {
        try {
            leaderElectionService.start(this);
        } catch (Exception e) {
            log.error("Could not start the JobManager because the leader election service did not start.", e);
            throw new Exception("Could not start the leader election service.", e);
        }
    }
......

    private void setNewLeaderGatewayFuture() {
        final CompletableFuture<JobMasterGateway> oldLeaderGatewayFuture = leaderGatewayFuture;

        leaderGatewayFuture = new CompletableFuture<>();

        if (!oldLeaderGatewayFuture.isDone()) {
            leaderGatewayFuture.whenComplete(
                (JobMasterGateway jobMasterGateway, Throwable throwable) -> {
                    if (throwable != null) {
                        oldLeaderGatewayFuture.completeExceptionally(throwable);
                    } else {
                        oldLeaderGatewayFuture.complete(jobMasterGateway);
                    }
                });
        }
    }
......
}
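
The future hand-over in setNewLeaderGatewayFuture is easy to miss: callers that obtained the old, still pending future are not left hanging, because the old future is completed with whatever the new one eventually produces. Below is a minimal sketch of that idea, assuming a hypothetical `GatewayFutureHolder` class and using `String` in place of `JobMasterGateway`; it is an illustration, not the Flink implementation.

```java
import java.util.concurrent.CompletableFuture;

final class GatewayFutureHolder {
    // String stands in for JobMasterGateway (assumption for illustration).
    private volatile CompletableFuture<String> gatewayFuture = new CompletableFuture<>();

    CompletableFuture<String> current() {
        return gatewayFuture;
    }

    // Mirrors setNewLeaderGatewayFuture: swap in a fresh future and, if the old
    // one is still pending, forward the new future's outcome to it.
    void renew() {
        final CompletableFuture<String> old = gatewayFuture;
        gatewayFuture = new CompletableFuture<>();
        if (!old.isDone()) {
            gatewayFuture.whenComplete((gateway, error) -> {
                if (error != null) {
                    old.completeExceptionally(error);
                } else {
                    old.complete(gateway);
                }
            });
        }
    }

    public static void main(String[] args) {
        GatewayFutureHolder holder = new GatewayFutureHolder();
        CompletableFuture<String> earlyCaller = holder.current(); // taken before the swap
        holder.renew();
        holder.current().complete("gateway-of-new-leader");
        System.out.println(earlyCaller.join()); // gateway-of-new-leader
    }
}
```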
// The JobManagerRunner is created in the org.apache.flink.runtime.dispatcher package
public enum DefaultJobManagerRunnerFactory implements JobManagerRunnerFactory {
    INSTANCE;

    @Override
    public JobManagerRunner createJobManagerRunner(
            JobGraph jobGraph,
            Configuration configuration,
            RpcService rpcService,
            HighAvailabilityServices highAvailabilityServices,
            HeartbeatServices heartbeatServices,
            JobManagerSharedServices jobManagerServices,
            JobManagerJobMetricGroupFactory jobManagerJobMetricGroupFactory,
            FatalErrorHandler fatalErrorHandler) throws Exception {
......

        return new JobManagerRunner(
            jobGraph,
            jobMasterFactory,
            highAvailabilityServices,
            jobManagerServices.getLibraryCacheManager(),
            jobManagerServices.getScheduledExecutorService(),
            fatalErrorHandler);
    }
}

// Below is the class that distributes the execution graph
public class ExecutionGraph implements AccessExecutionGraph {

......

    /** Job specific information like the job id, job name, job configuration, etc. */
    private final JobInformation jobInformation;

    /** Serialized job information or a blob key pointing to the offloaded job information. */
    private final Either<SerializedValue<JobInformation>, PermanentBlobKey> jobInformationOrBlobKey;

    /** The executor which is used to execute futures. */
    private final ScheduledExecutorService futureExecutor;

    /** The executor which is used to execute blocking io operations. */
    private final Executor ioExecutor;

    /** Executor that runs tasks in the job manager's main thread. */
    @Nonnull
    private ComponentMainThreadExecutor jobMasterMainThreadExecutor;

    /** {@code true} if all source tasks are stoppable. */
    private boolean isStoppable = true;

    /** All job vertices that are part of this graph. */
    private final ConcurrentHashMap<JobVertexID, ExecutionJobVertex> tasks;

    /** All vertices, in the order in which they were created. **/
    private final List<ExecutionJobVertex> verticesInCreationOrder;

    /** All intermediate results that are part of this graph. */
    private final ConcurrentHashMap<IntermediateDataSetID, IntermediateResult> intermediateResults;

    /** The currently executed tasks, for callbacks. */
    private final ConcurrentHashMap<ExecutionAttemptID, Execution> currentExecutions;

    /** Listeners that receive messages when the entire job switches its status
     * (such as from RUNNING to FINISHED). */
    private final List<JobStatusListener> jobStatusListeners;

    /** Listeners that receive messages whenever a single task execution changes its status. */
    private final List<ExecutionStatusListener> executionListeners;

    /** The implementation that decides how to recover the failures of tasks. */
    private final FailoverStrategy failoverStrategy;
......

    /** Current status of the job execution. */
    private volatile JobStatus state = JobStatus.CREATED;

    /** A future that completes once the job has reached a terminal state. */
    private volatile CompletableFuture<JobStatus> terminationFuture;

    /** On each global recovery, this version is incremented. The version breaks conflicts
     * between concurrent restart attempts by local failover strategies. */
    private volatile long globalModVersion;

    /** The exception that caused the job to fail. This is set to the first root exception
     * that was not recoverable and triggered job failure. */
    private volatile Throwable failureCause;

    /** The extended failure cause information for the job. This exists in addition to 'failureCause',
     * to let 'failureCause' be a strong reference to the exception, while this info holds no
     * strong reference to any user-defined classes.*/
    private volatile ErrorInfo failureInfo;

    /**
     * Future for an ongoing or completed scheduling action.
     */
    @Nullable
    private volatile CompletableFuture<Void> schedulingFuture;

    // ------ Fields that are relevant to the execution and need to be cleared before archiving  -------

    /** The coordinator for checkpoints, if snapshot checkpoints are enabled. */
    private CheckpointCoordinator checkpointCoordinator;

    /** Checkpoint stats tracker separate from the coordinator in order to be
     * available after archiving. */
    private CheckpointStatsTracker checkpointStatsTracker;

    // ------ Fields that are only relevant for archived execution graphs ------------
    private String jsonPlan;

    @VisibleForTesting
    ExecutionGraph(
            ScheduledExecutorService futureExecutor,
            Executor ioExecutor,
            JobID jobId,
            String jobName,
            Configuration jobConfig,
            SerializedValue<ExecutionConfig> serializedConfig,
            Time timeout,
            RestartStrategy restartStrategy,
            SlotProvider slotProvider) throws IOException {

        this(
            new JobInformation(
                jobId,
                jobName,
                serializedConfig,
                jobConfig,
                Collections.emptyList(),
                Collections.emptyList()),
            futureExecutor,
            ioExecutor,
            timeout,
            restartStrategy,
            slotProvider);
    }

......
}
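We now know the basic roles of these two classes. The first carries the job from the Dispatcher down to the JobMaster: when the Dispatcher creates a JobManagerRunner, the runner internally calls createJobMasterService. The second distributes the job from the JobMaster to the Tasks. The JobMaster is therefore a relay between the two, so let's look at its definition.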
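
The relay described above can be condensed into a toy model. The sketch below is only an illustration under stated assumptions: every `Toy*` class and `RelaySketch` are hypothetical stand-ins, the job graph is reduced to a list of vertex names, and the leader election that gates the real start (the excerpts only show `leaderElectionService.start(this)`) is collapsed into an immediate start.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RelaySketch {
    // Plays the Dispatcher's part: create a runner for the job and start it.
    static List<String> submit(List<String> jobGraph) {
        List<String> trace = new ArrayList<>();
        trace.add("dispatcher received job");
        new ToyJobManagerRunner(jobGraph, trace).start();
        return trace;
    }

    public static void main(String[] args) {
        submit(Arrays.asList("source", "map", "sink")).forEach(System.out::println);
    }
}

/** Toy JobManagerRunner: creates the job master in its constructor, as the real runner does. */
final class ToyJobManagerRunner {
    private final ToyJobMaster jobMaster;

    ToyJobManagerRunner(List<String> jobGraph, List<String> trace) {
        trace.add("runner created");
        this.jobMaster = new ToyJobMaster(jobGraph, trace);
    }

    // Assumption: leadership is granted immediately; Flink waits for leader election here.
    void start() {
        jobMaster.startJobExecution();
    }
}

/** Toy JobMaster: builds the execution graph on construction and schedules it on start. */
final class ToyJobMaster {
    private final ToyExecutionGraph executionGraph;
    private final List<String> trace;

    ToyJobMaster(List<String> jobGraph, List<String> trace) {
        this.trace = trace;
        trace.add("job master created");
        this.executionGraph = new ToyExecutionGraph(jobGraph, trace);
    }

    void startJobExecution() {
        trace.add("job master started");
        executionGraph.scheduleForExecution();
    }
}

/** Toy ExecutionGraph: hands every vertex to a "task". */
final class ToyExecutionGraph {
    private final List<String> vertices;
    private final List<String> trace;

    ToyExecutionGraph(List<String> vertices, List<String> trace) {
        this.vertices = vertices;
        this.trace = trace;
    }

    void scheduleForExecution() {
        for (String vertex : vertices) {
            trace.add("task deployed: " + vertex);
        }
    }
}
```

Reading the returned trace from top to bottom gives the order in which the job description moves: Dispatcher, runner, job master, execution graph, tasks.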

2. Structure of JobMaster
This class manages the job, and it is fairly large:

public class JobMaster extends FencedRpcEndpoint<JobMasterId> implements JobMasterGateway, JobMasterService {

    /** Default names for Flink's distributed components. */
    public static final String JOB_MANAGER_NAME = "jobmanager";
    public static final String ARCHIVE_NAME = "archive";

    // ------------------------------------------------------------------------

    private final JobMasterConfiguration jobMasterConfiguration;

    private final ResourceID resourceId;

    private final JobGraph jobGraph;

    private final Time rpcTimeout;

    private final HighAvailabilityServices highAvailabilityServices;

    private final BlobWriter blobWriter;

    private final JobManagerJobMetricGroupFactory jobMetricGroupFactory;

    private final HeartbeatManager<AccumulatorReport, Void> taskManagerHeartbeatManager;

    private final HeartbeatManager<Void, Void> resourceManagerHeartbeatManager;

    private final ScheduledExecutorService scheduledExecutorService;

    private final OnCompletionActions jobCompletionActions;

    private final FatalErrorHandler fatalErrorHandler;

    private final ClassLoader userCodeLoader;

    private final SlotPool slotPool;

    private final Scheduler scheduler;

    private final RestartStrategy restartStrategy;

    // --------- BackPressure --------

    private final BackPressureStatsTracker backPressureStatsTracker;

    // --------- ResourceManager --------

    private final LeaderRetrievalService resourceManagerLeaderRetriever;

    // --------- TaskManagers --------

    private final Map<ResourceID, Tuple2<TaskManagerLocation, TaskExecutorGateway>> registeredTaskManagers;

    // -------- Mutable fields ---------

    private ExecutionGraph executionGraph;

    @Nullable
    private JobManagerJobStatusListener jobStatusListener;

    @Nullable
    private JobManagerJobMetricGroup jobManagerJobMetricGroup;

    @Nullable
    private String lastInternalSavepoint;

    @Nullable
    private ResourceManagerAddress resourceManagerAddress;

    @Nullable
    private ResourceManagerConnection resourceManagerConnection;

    @Nullable
    private EstablishedResourceManagerConnection establishedResourceManagerConnection;

    private Map<String, Accumulator<?, ?>> accumulators;

    // ------------------------------------------------------------------------

    public JobMaster(
            RpcService rpcService,
            JobMasterConfiguration jobMasterConfiguration,
            ResourceID resourceId,
            JobGraph jobGraph,
            HighAvailabilityServices highAvailabilityService,
            SlotPoolFactory slotPoolFactory,
            SchedulerFactory schedulerFactory,
            JobManagerSharedServices jobManagerSharedServices,
            HeartbeatServices heartbeatServices,
            JobManagerJobMetricGroupFactory jobMetricGroupFactory,
            OnCompletionActions jobCompletionActions,
            FatalErrorHandler fatalErrorHandler,
            ClassLoader userCodeLoader) throws Exception {

        super(rpcService, AkkaRpcServiceUtils.createRandomName(JOB_MANAGER_NAME));

        final JobMasterGateway selfGateway = getSelfGateway(JobMasterGateway.class);

        this.jobMasterConfiguration = checkNotNull(jobMasterConfiguration);
        this.resourceId = checkNotNull(resourceId);
        this.jobGraph = checkNotNull(jobGraph);
        this.rpcTimeout = jobMasterConfiguration.getRpcTimeout();
        this.highAvailabilityServices = checkNotNull(highAvailabilityService);
        this.blobWriter = jobManagerSharedServices.getBlobWriter();
        this.scheduledExecutorService = jobManagerSharedServices.getScheduledExecutorService();
        this.jobCompletionActions = checkNotNull(jobCompletionActions);
        this.fatalErrorHandler = checkNotNull(fatalErrorHandler);
        this.userCodeLoader = checkNotNull(userCodeLoader);
        this.jobMetricGroupFactory = checkNotNull(jobMetricGroupFactory);

        this.taskManagerHeartbeatManager = heartbeatServices.createHeartbeatManagerSender(
            resourceId,
            new TaskManagerHeartbeatListener(selfGateway),
            rpcService.getScheduledExecutor(),
            log);

        this.resourceManagerHeartbeatManager = heartbeatServices.createHeartbeatManager(
                resourceId,
                new ResourceManagerHeartbeatListener(),
                rpcService.getScheduledExecutor(),
                log);

        final String jobName = jobGraph.getName();
        final JobID jid = jobGraph.getJobID();

        log.info("Initializing job {} ({}).", jobName, jid);

        final RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration =
                jobGraph.getSerializedExecutionConfig()
                        .deserializeValue(userCodeLoader)
                        .getRestartStrategy();

        this.restartStrategy = RestartStrategyResolving.resolve(restartStrategyConfiguration,
            jobManagerSharedServices.getRestartStrategyFactory(),
            jobGraph.isCheckpointingEnabled());

        log.info("Using restart strategy {} for {} ({}).", this.restartStrategy, jobName, jid);

        resourceManagerLeaderRetriever = highAvailabilityServices.getResourceManagerLeaderRetriever();

        this.slotPool = checkNotNull(slotPoolFactory).createSlotPool(jobGraph.getJobID());

        this.scheduler = checkNotNull(schedulerFactory).createScheduler(slotPool);

        this.registeredTaskManagers = new HashMap<>(4);

        this.backPressureStatsTracker = checkNotNull(jobManagerSharedServices.getBackPressureStatsTracker());
        this.lastInternalSavepoint = null;

        this.jobManagerJobMetricGroup = jobMetricGroupFactory.create(jobGraph);
        this.executionGraph = createAndRestoreExecutionGraph(jobManagerJobMetricGroup);
        this.jobStatusListener = null;

        this.resourceManagerConnection = null;
        this.establishedResourceManagerConnection = null;

        this.accumulators = new HashMap<>();
    }

    @Override
    public CompletableFuture<Acknowledge> rescaleOperators(
            Collection<JobVertexID> operators,
            int newParallelism,
            RescalingBehaviour rescalingBehaviour,
            Time timeout) {

        if (newParallelism <= 0) {
            return FutureUtils.completedExceptionally(
                new JobModificationException("The target parallelism of a rescaling operation must be larger than 0."));
        }

        // 1. Check whether we can rescale the job & rescale the respective vertices
        try {
            rescaleJobGraph(operators, newParallelism, rescalingBehaviour);
        } catch (FlinkException e) {
            final String msg = String.format("Cannot rescale job %s.", jobGraph.getName());

            log.info(msg, e);
            return FutureUtils.completedExceptionally(new JobModificationException(msg, e));
        }

        final ExecutionGraph currentExecutionGraph = executionGraph;

        final JobManagerJobMetricGroup newJobManagerJobMetricGroup = jobMetricGroupFactory.create(jobGraph);
        final ExecutionGraph newExecutionGraph;

        try {
            newExecutionGraph = createExecutionGraph(newJobManagerJobMetricGroup);
        } catch (JobExecutionException | JobException e) {
            return FutureUtils.completedExceptionally(
                new JobModificationException("Could not create rescaled ExecutionGraph.", e));
        }

        // 3. disable checkpoint coordinator to suppress subsequent checkpoints
        final CheckpointCoordinator checkpointCoordinator = currentExecutionGraph.getCheckpointCoordinator();
        checkpointCoordinator.stopCheckpointScheduler();

        // 4. take a savepoint
        final CompletableFuture<String> savepointFuture = getJobModificationSavepoint(timeout);

        final CompletableFuture<ExecutionGraph> executionGraphFuture = restoreExecutionGraphFromRescalingSavepoint(
            newExecutionGraph,
            savepointFuture)
            .handleAsync(
                (ExecutionGraph executionGraph, Throwable failure) -> {
                    if (failure != null) {
                        // in case that we couldn't take a savepoint or restore from it, let's restart the checkpoint
                        // coordinator and abort the rescaling operation
                        if (checkpointCoordinator.isPeriodicCheckpointingConfigured()) {
                            checkpointCoordinator.startCheckpointScheduler();
                        }

                        throw new CompletionException(ExceptionUtils.stripCompletionException(failure));
                    } else {
                        return executionGraph;
                    }
                },
                getMainThreadExecutor());

        // 5. suspend the current job
        final CompletableFuture<JobStatus> terminationFuture = executionGraphFuture.thenComposeAsync(
            (ExecutionGraph ignored) -> {
                suspendExecutionGraph(new FlinkException("Job is being rescaled."));
                return currentExecutionGraph.getTerminationFuture();
            },
            getMainThreadExecutor());

        final CompletableFuture<Void> suspendedFuture = terminationFuture.thenAccept(
            (JobStatus jobStatus) -> {
                if (jobStatus != JobStatus.SUSPENDED) {
                    final String msg = String.format("Job %s rescaling failed because we could not suspend the execution graph.", jobGraph.getName());
                    log.info(msg);
                    throw new CompletionException(new JobModificationException(msg));
                }
            });

        // 6. resume the new execution graph from the taken savepoint
        final CompletableFuture<Acknowledge> rescalingFuture = suspendedFuture.thenCombineAsync(
            executionGraphFuture,
            (Void ignored, ExecutionGraph restoredExecutionGraph) -> {
                // check if the ExecutionGraph is still the same
                if (executionGraph == currentExecutionGraph) {
                    clearExecutionGraphFields();
                    assignExecutionGraph(restoredExecutionGraph, newJobManagerJobMetricGroup);
                    scheduleExecutionGraph();

                    return Acknowledge.get();
                } else {
                    throw new CompletionException(new JobModificationException("Detected concurrent modification of ExecutionGraph. Aborting the rescaling."));
                }

            },
            getMainThreadExecutor());

        rescalingFuture.whenCompleteAsync(
            (Acknowledge ignored, Throwable throwable) -> {
                if (throwable != null) {
                    // fail the newly created execution graph
                    newExecutionGraph.failGlobal(
                        new SuppressRestartsException(
                            new FlinkException(
                                String.format("Failed to rescale the job %s.", jobGraph.getJobID()),
                                throwable)));
                }
            }, getMainThreadExecutor());

        return rescalingFuture;
    }

    @Override
    public CompletableFuture<Collection<SlotOffer>> offerSlots(
            final ResourceID taskManagerId,
            final Collection<SlotOffer> slots,
            final Time timeout) {

        Tuple2<TaskManagerLocation, TaskExecutorGateway> taskManager = registeredTaskManagers.get(taskManagerId);

        if (taskManager == null) {
            return FutureUtils.completedExceptionally(new Exception("Unknown TaskManager " + taskManagerId));
        }

        final TaskManagerLocation taskManagerLocation = taskManager.f0;
        final TaskExecutorGateway taskExecutorGateway = taskManager.f1;

        final RpcTaskManagerGateway rpcTaskManagerGateway = new RpcTaskManagerGateway(taskExecutorGateway, getFencingToken());

        return CompletableFuture.completedFuture(
            slotPool.offerSlots(
                taskManagerLocation,
                rpcTaskManagerGateway,
                slots));
    }

    @Override
    public void failSlot(
            final ResourceID taskManagerId,
            final AllocationID allocationId,
            final Exception cause) {

        if (registeredTaskManagers.containsKey(taskManagerId)) {
            internalFailAllocation(allocationId, cause);
        } else {
            log.warn("Cannot fail slot " + allocationId + " because the TaskManager " +
            taskManagerId + " is unknown.");
        }
    }


    @Override
    public CompletableFuture<RegistrationResponse> registerTaskManager(
            final String taskManagerRpcAddress,
            final TaskManagerLocation taskManagerLocation,
            final Time timeout) {

        final ResourceID taskManagerId = taskManagerLocation.getResourceID();

        if (registeredTaskManagers.containsKey(taskManagerId)) {
            final RegistrationResponse response = new JMTMRegistrationSuccess(resourceId);
            return CompletableFuture.completedFuture(response);
        } else {
            return getRpcService()
                .connect(taskManagerRpcAddress, TaskExecutorGateway.class)
                .handleAsync(
                    (TaskExecutorGateway taskExecutorGateway, Throwable throwable) -> {
                        if (throwable != null) {
                            return new RegistrationResponse.Decline(throwable.getMessage());
                        }

                        slotPool.registerTaskManager(taskManagerId);
                        registeredTaskManagers.put(taskManagerId, Tuple2.of(taskManagerLocation, taskExecutorGateway));

                        // monitor the task manager as heartbeat target
                        taskManagerHeartbeatManager.monitorTarget(taskManagerId, new HeartbeatTarget<Void>() {
                            @Override
                            public void receiveHeartbeat(ResourceID resourceID, Void payload) {
                                // the task manager will not request heartbeat, so this method will never be called currently
                            }

                            @Override
                            public void requestHeartbeat(ResourceID resourceID, Void payload) {
                                taskExecutorGateway.heartbeatFromJobManager(resourceID);
                            }
                        });

                        return new JMTMRegistrationSuccess(resourceId);
                    },
                    getMainThreadExecutor());
        }
    }

    private Acknowledge startJobExecution(JobMasterId newJobMasterId) throws Exception {

        validateRunsInMainThread();

        checkNotNull(newJobMasterId, "The new JobMasterId must not be null.");

        if (Objects.equals(getFencingToken(), newJobMasterId)) {
            log.info("Already started the job execution with JobMasterId {}.", newJobMasterId);

            return Acknowledge.get();
        }

        setNewFencingToken(newJobMasterId);

        startJobMasterServices();

        log.info("Starting execution of job {} ({}) under job master id {}.", jobGraph.getName(), jobGraph.getJobID(), newJobMasterId);

        resetAndScheduleExecutionGraph();

        return Acknowledge.get();
    }

    private void startJobMasterServices() throws Exception {
        // start the slot pool make sure the slot pool now accepts messages for this leader
        slotPool.start(getFencingToken(), getAddress(), getMainThreadExecutor());
        scheduler.start(getMainThreadExecutor());

        //TODO: Remove once the ZooKeeperLeaderRetrieval returns the stored address upon start
        // try to reconnect to previously known leader
        reconnectToResourceManager(new FlinkException("Starting JobMaster component."));

        // job is ready to go, try to establish connection with resource manager
        //   - activate leader retrieval for the resource manager
        //   - on notification of the leader, the connection will be established and
        //     the slot pool will start requesting slots
        resourceManagerLeaderRetriever.start(new ResourceManagerLeaderListener());
    }

    private void setNewFencingToken(JobMasterId newJobMasterId) {
        if (getFencingToken() != null) {
            log.info("Restarting old job with JobMasterId {}. The new JobMasterId is {}.", getFencingToken(), newJobMasterId);

            // first we have to suspend the current execution
            suspendExecution(new FlinkException("Old job with JobMasterId " + getFencingToken() +
                " is restarted with a new JobMasterId " + newJobMasterId + '.'));
        }

        // set new leader id
        setFencingToken(newJobMasterId);
    }


    private void assignExecutionGraph(
            ExecutionGraph newExecutionGraph,
            JobManagerJobMetricGroup newJobManagerJobMetricGroup) {
        validateRunsInMainThread();
        checkState(executionGraph.getState().isTerminalState());
        checkState(jobManagerJobMetricGroup == null);

        executionGraph = newExecutionGraph;
        jobManagerJobMetricGroup = newJobManagerJobMetricGroup;
    }


    private void scheduleExecutionGraph() {
        checkState(jobStatusListener == null);
        // register self as job status change listener
        jobStatusListener = new JobManagerJobStatusListener();
        executionGraph.registerJobStatusListener(jobStatusListener);

        try {
            executionGraph.scheduleForExecution();
        }
        catch (Throwable t) {
            executionGraph.failGlobal(t);
        }
    }

    private ExecutionGraph createAndRestoreExecutionGraph(JobManagerJobMetricGroup currentJobManagerJobMetricGroup) throws Exception {

        ExecutionGraph newExecutionGraph = createExecutionGraph(currentJobManagerJobMetricGroup);

        final CheckpointCoordinator checkpointCoordinator = newExecutionGraph.getCheckpointCoordinator();

        if (checkpointCoordinator != null) {
            // check whether we find a valid checkpoint
            if (!checkpointCoordinator.restoreLatestCheckpointedState(
                newExecutionGraph.getAllVertices(),
                false,
                false)) {

                // check whether we can restore from a savepoint
                tryRestoreExecutionGraphFromSavepoint(newExecutionGraph, jobGraph.getSavepointRestoreSettings());
            }
        }

        return newExecutionGraph;
    }

    private ExecutionGraph createExecutionGraph(JobManagerJobMetricGroup currentJobManagerJobMetricGroup) throws JobExecutionException, JobException {
        return ExecutionGraphBuilder.buildGraph(
            null,
            jobGraph,
            jobMasterConfiguration.getConfiguration(),
            scheduledExecutorService,
            scheduledExecutorService,
            scheduler,
            userCodeLoader,
            highAvailabilityServices.getCheckpointRecoveryFactory(),
            rpcTimeout,
            restartStrategy,
            currentJobManagerJobMetricGroup,
            blobWriter,
            jobMasterConfiguration.getSlotRequestTimeout(),
            log);
    }

......

    private CompletableFuture<ExecutionGraph> restoreExecutionGraphFromRescalingSavepoint(ExecutionGraph newExecutionGraph, CompletableFuture<String> savepointFuture) {
        return savepointFuture
            .thenApplyAsync(
                (@Nullable String savepointPath) -> {
                    if (savepointPath != null) {
                        try {
                            tryRestoreExecutionGraphFromSavepoint(newExecutionGraph, SavepointRestoreSettings.forPath(savepointPath, false));
                        } catch (Exception e) {
                            final String message = String.format("Could not restore from temporary rescaling savepoint. This might indicate " +
                                    "that the savepoint %s got corrupted. Deleting this savepoint as a precaution.",
                                savepointPath);

                            log.info(message);

                            CompletableFuture
                                .runAsync(
                                    () -> {
                                        if (savepointPath.equals(lastInternalSavepoint)) {
                                            lastInternalSavepoint = null;
                                        }
                                    },
                                    getMainThreadExecutor())
                                .thenRunAsync(
                                    () -> disposeSavepoint(savepointPath),
                                    scheduledExecutorService);

                            throw new CompletionException(new JobModificationException(message, e));
                        }
                    } else {
                        // No rescaling savepoint, restart from the initial savepoint or none
                        try {
                            tryRestoreExecutionGraphFromSavepoint(newExecutionGraph, jobGraph.getSavepointRestoreSettings());
                        } catch (Exception e) {
                            final String message = String.format("Could not restore from initial savepoint. This might indicate " +
                                "that the savepoint %s got corrupted.", jobGraph.getSavepointRestoreSettings().getRestorePath());

                            log.info(message);

                            throw new CompletionException(new JobModificationException(message, e));
                        }
                    }

                    return newExecutionGraph;
                }, scheduledExecutorService);
    }

    private CompletableFuture<String> getJobModificationSavepoint(Time timeout) {
        return triggerSavepoint(
......
                getMainThreadExecutor());
    }

    private void rescaleJobGraph(Collection<JobVertexID> operators, int newParallelism, RescalingBehaviour rescalingBehaviour) throws FlinkException {
        for (JobVertexID jobVertexId : operators) {
            final JobVertex jobVertex = jobGraph.findVertexByID(jobVertexId);

            // update max parallelism in case that it has not been configured
            final ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId);

            if (executionJobVertex != null) {
                jobVertex.setMaxParallelism(executionJobVertex.getMaxParallelism());
            }

            rescalingBehaviour.accept(jobVertex, newParallelism);
        }
    }

    @Override
    public JobMasterGateway getGateway() {
        return getSelfGateway(JobMasterGateway.class);
    }

    private class ResourceManagerLeaderListener implements LeaderRetrievalListener {

        @Override
        public void notifyLeaderAddress(final String leaderAddress, final UUID leaderSessionID) {
            runAsync(
                () -> notifyOfNewResourceManagerLeader(
                    leaderAddress,
                    ResourceManagerId.fromUuidOrNull(leaderSessionID)));
        }

        @Override
        public void handleError(final Exception exception) {
            handleJobMasterError(new Exception("Fatal error in the ResourceManager leader service", exception));
        }
    }

    private class ResourceManagerConnection
            extends RegisteredRpcConnection {
        private final JobID jobID;
......
    }

    //----------------------------------------------------------------------------------------------

    private class JobManagerJobStatusListener implements JobStatusListener {

        private volatile boolean running = true;
......
    }

    private class TaskManagerHeartbeatListener implements HeartbeatListener {

        private final JobMasterGateway jobMasterGateway;
......
    }

    private class ResourceManagerHeartbeatListener implements HeartbeatListener {
......
    }

}

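This class is long, but its fields and methods show that it has roughly five responsibilities:
1. Scheduling, executing, and managing the job graph
2. Resource management (leader, gateway, heartbeats, and so on)
3. Task management
4. Scheduling and allocation
5. Back-pressure control
The focus here is the scheduling and management of the job graph. The JobMaster constructor registers a large number of related services, obtains the ID of the JobGraph, and picks up the leader information. The remaining basic state and bookkeeping structures are created from the configuration; the most typical examples are createSlotPool and createScheduler, and the details can be read in the constructor code above. Note that this state includes the ExecutionGraph mentioned earlier. On a close reading, this variable runs through almost the whole JobMaster class, so its creation and use are analyzed below.
JobMaster does two main things: it processes and distributes the JobGraph (through the ExecutionGraph), and it listens for and handles the results and status of the work it has handed out. To keep this efficient it relies on asynchronous communication, so it is worth being comfortable with CompletableFuture and CompletionStage from the Java standard library.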
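The asynchronous style can be illustrated with a minimal sketch modeled loosely on restoreExecutionGraphFromRescalingSavepoint from the listing. It is not Flink code: the class FutureChainSketch and the methods restore and restoreAsync are hypothetical names chosen for this illustration, and only the java.util.concurrent calls are real.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration, not part of Flink.
class FutureChainSketch {

    /** Hypothetical stand-in for a restore step; an empty path simulates a corrupted savepoint. */
    static String restore(String savepointPath) {
        if (savepointPath == null) {
            return "restored from initial state";
        }
        if (savepointPath.isEmpty()) {
            throw new IllegalStateException("corrupted savepoint");
        }
        return "restored from " + savepointPath;
    }

    /** Runs the restore step on the given executor once the savepoint path is available. */
    static CompletableFuture<String> restoreAsync(CompletableFuture<String> savepointFuture,
                                                  ExecutorService executor) {
        return savepointFuture
            .thenApplyAsync(path -> {
                try {
                    return restore(path);
                } catch (Exception e) {
                    // Wrap the failure so that it travels down the chain, as the listing does.
                    throw new CompletionException(e);
                }
            }, executor)
            .handle((result, failure) -> {
                if (failure == null) {
                    return result;
                }
                // Unwrap the CompletionException to reach the original cause.
                Throwable cause = (failure instanceof CompletionException && failure.getCause() != null)
                    ? failure.getCause()
                    : failure;
                return "failed: " + cause.getMessage();
            });
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            System.out.println(restoreAsync(CompletableFuture.completedFuture("/tmp/sp-1"), executor).join());
            System.out.println(restoreAsync(CompletableFuture.completedFuture(""), executor).join());
        } finally {
            executor.shutdown();
        }
    }
}
```

Passing an explicit executor to thenApplyAsync decides which thread runs the callback. The listing relies on the same mechanism when it switches between getMainThreadExecutor() and scheduledExecutorService.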
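The member functions begin with the basic control methods start, suspend, onStop, cancel, and stop. What they do is fairly simple: start launches the RPC service and starts job execution asynchronously, while cancel and stop operate on the ExecutionGraph. rescaleJob and rescaleOperators involve JobVertex, which, as mentioned earlier, is made up of several operators. Each JobVertex corresponds to an ExecutionJobVertex, which in turn corresponds to a set of ExecutionVertex instances. Put another way, to allow parallel execution each JobGraph is expanded into a parallelized ExecutionGraph, which is the JobMaster's central data structure, and each ExecutionVertex is one parallel subtask of its ExecutionJobVertex.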
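The one-to-many relationship between a logical vertex and its parallel subtasks can be sketched as follows. The types LogicalVertex and Subtask are simplified, hypothetical stand-ins invented for this illustration; they are not Flink's JobVertex, ExecutionJobVertex, or ExecutionVertex classes and carry none of their real state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of Flink.
class VertexExpansionSketch {

    /** Simplified stand-in for a logical vertex of the job graph. */
    static final class LogicalVertex {
        final String name;
        final int parallelism;

        LogicalVertex(String name, int parallelism) {
            this.name = name;
            this.parallelism = parallelism;
        }
    }

    /** Simplified stand-in for one parallel subtask of a logical vertex. */
    static final class Subtask {
        final String vertexName;
        final int subtaskIndex;
        final int parallelism;

        Subtask(String vertexName, int subtaskIndex, int parallelism) {
            this.vertexName = vertexName;
            this.subtaskIndex = subtaskIndex;
            this.parallelism = parallelism;
        }

        @Override
        public String toString() {
            return vertexName + " (" + (subtaskIndex + 1) + "/" + parallelism + ")";
        }
    }

    /** Expands one logical vertex into as many subtasks as its parallelism. */
    static List<Subtask> expand(LogicalVertex vertex) {
        List<Subtask> subtasks = new ArrayList<>();
        for (int i = 0; i < vertex.parallelism; i++) {
            subtasks.add(new Subtask(vertex.name, i, vertex.parallelism));
        }
        return subtasks;
    }

    public static void main(String[] args) {
        for (Subtask subtask : expand(new LogicalVertex("map", 3))) {
            System.out.println(subtask);
        }
    }
}
```

Under this simplified model, changing the parallelism of a vertex, as rescaleJobGraph does, changes the number of subtasks the next expansion produces.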
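An important part of rescaleOperators is the rescaling action itself, which causes checkpoints and savepoints to be processed again so that the streaming computation stays safe and timely. Next, updateTaskExecutionState is driven from the Task side to report state updates, and requestNextInputSplit obtains the next input split for a Task. Inside that function, once the data has been found, the call goes through the Execution for the corresponding attempt. Note that jobs and tasks are linked through ExecutionAttemptID. After that come the two checkpoint-related functions, declineCheckpoint and acknowledgeCheckpoint.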
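The role of an attempt ID as the link between the two sides can be sketched with a small lookup table. This is only an illustration under simplifying assumptions: AttemptRegistrySketch, Attempt, and State are hypothetical names, a plain UUID stands in for ExecutionAttemptID, and the real bookkeeping in Flink is considerably richer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical illustration, not part of Flink.
class AttemptRegistrySketch {

    /** Simplified set of states; not Flink's ExecutionState enum. */
    enum State { DEPLOYING, RUNNING, FINISHED }

    /** One execution attempt, identified by a unique ID. */
    static final class Attempt {
        final UUID attemptId = UUID.randomUUID();
        State state = State.DEPLOYING;
    }

    private final Map<UUID, Attempt> attempts = new HashMap<>();

    /** Registers a new attempt and returns the ID the task side would report back with. */
    UUID register() {
        Attempt attempt = new Attempt();
        attempts.put(attempt.attemptId, attempt);
        return attempt.attemptId;
    }

    /** Applies a state update reported for the given attempt; returns false for unknown attempts. */
    boolean updateState(UUID attemptId, State newState) {
        Attempt attempt = attempts.get(attemptId);
        if (attempt == null) {
            return false;
        }
        attempt.state = newState;
        return true;
    }

    /** Returns the current state of an attempt, or null if the attempt is unknown. */
    State stateOf(UUID attemptId) {
        Attempt attempt = attempts.get(attemptId);
        return attempt == null ? null : attempt.state;
    }

    public static void main(String[] args) {
        AttemptRegistrySketch registry = new AttemptRegistrySketch();
        UUID id = registry.register();
        System.out.println(registry.updateState(id, State.RUNNING));
        System.out.println(registry.stateOf(id));
    }
}
```

Keying on the attempt rather than on the vertex means that a report from an attempt the registry no longer knows about can simply be rejected.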
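JobMaster also implements the KvStateRegistryGateway interface, which it picks up through JobMasterGateway. As the name suggests, it is essentially a lookup table of objects: an entry is looked up and a notification follows. Skipping the slot handling and related actions that come next, consider registerTaskManager, which registers a TaskManager with the slot pool over RPC. requestJobDetails is the function mentioned earlier that returns intermediate status information while the job is being controlled at runtime (only the final outcome counts as the result).
startCheckpointScheduler triggers checkpoint scheduling. startJobExecution starts execution of the job internally. startJobMasterServices starts the services the job depends on. setNewFencingToken binds the fencing token to the ID. assignExecutionGraph assigns the execution graph, which in effect fixes the way the job will be executed. resetAndScheduleExecutionGraph re-plans and reschedules the execution graph, which is very useful because in practice part of a job may have to be run again repeatedly.
scheduleExecutionGraph is where the job graph is actually scheduled so that tasks can use it. createAndRestoreExecutionGraph and createExecutionGraph both manage and schedule the ExecutionGraph. jobStatusChanged, as the name says, handles changes in job status; these are controlled by the ExecutionGraph, and the details can be found in the definition of the JobStatus enum.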
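The listener registration in scheduleExecutionGraph follows the familiar observer pattern, which can be sketched roughly as below. Everything here is a hypothetical simplification: Status is a reduced enum that does not reproduce Flink's JobStatus, and Graph, Listener, and watch are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration, not part of Flink.
class StatusListenerSketch {

    /** Reduced set of states; the flag marks states after which nothing more happens. */
    enum Status {
        CREATED(false), RUNNING(false), FAILED(true), CANCELED(true), FINISHED(true);

        final boolean terminal;

        Status(boolean terminal) {
            this.terminal = terminal;
        }
    }

    /** Callback invoked on every status change. */
    interface Listener {
        void statusChanged(Status newStatus);
    }

    /** Simplified stand-in for a graph that notifies its listeners about status changes. */
    static final class Graph {
        private final List<Listener> listeners = new ArrayList<>();

        void register(Listener listener) {
            listeners.add(listener);
        }

        void transitionTo(Status next) {
            for (Listener listener : listeners) {
                listener.statusChanged(next);
            }
        }
    }

    /** Registers a listener that completes the returned future on the first terminal status. */
    static CompletableFuture<Status> watch(Graph graph) {
        CompletableFuture<Status> result = new CompletableFuture<>();
        graph.register(status -> {
            if (status.terminal) {
                result.complete(status);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        Graph graph = new Graph();
        CompletableFuture<Status> result = watch(graph);
        graph.transitionTo(Status.RUNNING);
        System.out.println(result.isDone());
        graph.transitionTo(Status.FINISHED);
        System.out.println(result.join());
    }
}
```

Completing a future from inside the listener is one way to turn a stream of status callbacks into the single final result that a caller waits on.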
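restoreExecutionGraphFromRescalingSavepoint restores from a savepoint. rescaleJobGraph is called by rescaleOperators and resets the parallelism of the given operators.
From this analysis, requestNextInputSplit and scheduleExecutionGraph are the two functions that essentially connect the ExecutionGraph with the Task, while rescaleOperators provides a very flexible way to change parallelism at runtime.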

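III. Summary
Flink's JobMaster does a good job of carrying the job graph from job to task. Still, some of the less pleasant sides of Java show through: the inheritance hierarchy is tangled and hard to navigate, although that probably improves with familiarity. At this point the hand-off of the job graph is essentially complete; what follows is task startup and the processing around it.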
