13. Reading the Flink Source Code: How the Client Submits a Job

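Using the WordCount example that ships with Flink as the job to submit, we walk through the source code that runs on the client side when Flink submits a job.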

Entry point analysis

First, the submit command looks like this:

bin/flink run examples/batch/WordCount.jar

In the bin/flink shell script we find that the main class used for submission is org.apache.flink.client.cli.CliFrontend, so we start from the main function of CliFrontend.

Source code analysis
cli.parseParameters(args); // parse the arguments, then submit
==> Execution enters the parseParameters method and reaches the switch case ACTION_RUN
1 ==> run(params); ==> runProgram ==>
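The dispatch described above can be pictured with a small stand-alone sketch. This is not the real CliFrontend: the class name CliDispatchSketch and the string return values are hypothetical and exist only to make the flow easy to follow. The only thing taken from the walkthrough is that the first argument selects the action and that ACTION_RUN leads to run(params).

```java
import java.util.Arrays;

/** Simplified, hypothetical sketch of action dispatch; not the real CliFrontend. */
public class CliDispatchSketch {
    // Assumed to correspond to the "run" in "bin/flink run ...".
    static final String ACTION_RUN = "run";

    /** Returns a description of what would happen, so the flow is easy to check. */
    static String parseParameters(String[] args) {
        if (args.length < 1) {
            return "print help";
        }
        String action = args[0];
        // Everything after the action is handed to the action's handler.
        String[] params = Arrays.copyOfRange(args, 1, args.length);
        switch (action) {
            case ACTION_RUN:
                return run(params);
            default:
                return "unknown action: " + action;
        }
    }

    /** Stand-in for run(params); the real method goes on to runProgram. */
    static String run(String[] params) {
        return "run " + String.join(" ", params);
    }

    public static void main(String[] args) {
        System.out.println(
            parseParameters(new String[] {"run", "examples/batch/WordCount.jar"}));
    }
}
```

The point of the sketch is only that the action word is consumed by the dispatcher and the jar path arrives at run as an ordinary parameter.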

2 ==> 
if (isNewMode && clusterId == null && runOptions.getDetachedMode()) { // YARN per-job mode with -d
	...
} else {
	if (clusterId != null) { // standalone -d mode and yarn-session mode
		client = clusterDescriptor.retrieve(clusterId);
		shutdownHook = null;
	} else { // the job consists of multiple parts
	}
}
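To make the three branches easier to compare, here is a minimal sketch that reduces the condition above to a pure function. The class SubmitModeSketch and the enum Path are hypothetical names introduced for illustration. The third branch is labelled OTHER because its body is not shown in the excerpt.

```java
/** Hypothetical sketch of the branch selection in runProgram; not Flink code. */
public class SubmitModeSketch {
    enum Path { DEPLOY_JOB_CLUSTER, RETRIEVE_EXISTING_CLUSTER, OTHER }

    /** Mirrors the if/else structure of the excerpt with plain values. */
    static Path choose(boolean isNewMode, String clusterId, boolean detached) {
        if (isNewMode && clusterId == null && detached) {
            // No cluster id and detached: a dedicated cluster is deployed for the job.
            return Path.DEPLOY_JOB_CLUSTER;
        } else if (clusterId != null) {
            // A cluster id is known: connect to the cluster that already exists.
            return Path.RETRIEVE_EXISTING_CLUSTER;
        } else {
            // Branch whose body is elided in the excerpt.
            return Path.OTHER;
        }
    }

    public static void main(String[] args) {
        System.out.println(choose(true, null, true));
        System.out.println(choose(true, "some-cluster-id", false));
        System.out.println(choose(true, null, false));
    }
}
```

Note that a non-null clusterId always wins over the detached flag, which is why standalone and yarn-session submissions share the retrieve path.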

==> Let's go through them one at a time, starting with YARN per-job mode
clusterDescriptor.deployJobCluster ==> YarnClusterDescriptor.deployJobCluster() ==> AbstractYarnClusterDescriptor.deployInternal
==> startAppMaster ==> yarnClient.submitApplication(appContext); finally a ClusterClient is returned ==> the client is shut down

==> Next, standalone mode
clusterDescriptor.retrieve(clusterId); ==> StandaloneClusterDescriptor.retrieve, which returns a client of type RestClusterClient

==> And yarn-session mode. In yarn-session mode the application is already running, so here the client only needs to look up the ApplicationMaster's IP and port by application id and return a client of type ClusterClient
clusterDescriptor.retrieve(clusterId); ==> AbstractYarnClusterDescriptor.retrieve ==> createYarnClusterClient ==> return ClusterClient

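3 ==> Further down we reach executeProgram(program, client, userParallelism);
==> JobSubmissionResult result = client.run(program, parallelism); // at this point the job is submitted by a RestClusterClient
==> prog.invokeInteractiveModeForExecution(); // invokes WordCount's main function via reflection
==> When execution reaches DataSet.print, it triggers ExecutionEnvironment.execute(); the method actually invoked is ContextEnvironment.execute()
==> This calls ClusterClient.run, which ends in RestClusterClient.submitJob(), where the JobGraph is submitted to the JobManager with a REST request. At this point the client-side submission is complete.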
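The two mechanisms in step 3, calling the user's main through reflection and having execute() resolve to a client-installed environment, can be illustrated with a self-contained sketch. Everything here is hypothetical (InteractiveSubmitSketch, Executor, contextExecutor, UserProgram); it is not how Flink's ContextEnvironment is written, only a rough model of the idea that the client swaps in its own executor before invoking main.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of "invoke main reflectively, intercept execute()". */
public class InteractiveSubmitSketch {
    /** Stand-in for whatever actually runs or submits a job. */
    interface Executor { void execute(String jobName); }

    /** What user code sees when it asks for an environment; the default runs locally. */
    static Executor contextExecutor = jobName -> System.out.println("local run: " + jobName);

    /** Records jobs that the client-side executor would have submitted. */
    static final List<String> submitted = new ArrayList<>();

    /** Stand-in for a user program such as WordCount. */
    public static class UserProgram {
        public static void main(String[] args) {
            // A sink like print() would eventually end up in execute().
            contextExecutor.execute("WordCount");
        }
    }

    /** Installs the client's executor, then calls main(String[]) via reflection. */
    static void runInteractively(Class<?> entryClass, String[] args) throws Exception {
        Executor previous = contextExecutor;
        contextExecutor = jobName -> submitted.add(jobName); // "submit" instead of running locally
        try {
            Method main = entryClass.getMethod("main", String[].class);
            // Cast to Object so the array is passed as one argument, not expanded as varargs.
            main.invoke(null, (Object) args);
        } finally {
            contextExecutor = previous; // restore the default afterwards
        }
    }

    public static void main(String[] args) throws Exception {
        runInteractively(UserProgram.class, new String[0]);
        System.out.println(submitted); // prints [WordCount]
    }
}
```

This is why the user program needs no submission code of its own: the same main runs unchanged, and what execute() does depends on the environment that was installed before main was called.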
Stack trace of a standalone submission

Read from bottom to top, the frames give the call order; it is worth following them side by side with the source code.

	at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:261)
	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:474)
	at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
	at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:817)
	at org.apache.flink.api.java.DataSet.collect(DataSet.java:413)
	at org.apache.flink.api.java.DataSet.print(DataSet.java:1652)
	at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:88)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:426)
	at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:816)
	at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:290)
	at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:216)
	at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)
	at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1129)
	at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1129)
