Flink Local Execution Mode Analysis

First, write a simple entry-point program:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import java.util.ArrayList;
import java.util.List;

public class Test {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        List<String> list = new ArrayList<>();
        list.add("xxx");
        list.add("yyy");
        list.add("zzz");
        DataStream<String> stream = env.fromCollection(list);
        stream.print();
        env.execute();
    }

}

When the main method is run locally, the first step is to obtain the env, the middle part is the stream processing logic (from which the StreamGraph is generated), and the last step is to execute the env.
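In the Flink version analyzed here, getExecutionEnvironment() falls back to a local environment whenever the program is started as a plain main method (for example from the IDE) rather than submitted through a cluster client. A local environment can also be requested explicitly. Below is a minimal sketch assuming the createLocalEnvironment factory method; the class name and the parallelism of 2 are only illustrative:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalEnvTest {

    public static void main(String[] args) throws Exception {
        // explicitly ask for a local environment instead of relying on
        // getExecutionEnvironment() to detect that no cluster is present
        StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment(2);
        env.fromElements("xxx", "yyy", "zzz").print();
        env.execute("local-test");
    }

}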

The env obtained here is of type LocalStreamEnvironment. When execute() is called on this env, the code is as follows:

@Override
public JobExecutionResult execute(String jobName) throws Exception {
	// transform the streaming program into a JobGraph
	StreamGraph streamGraph = getStreamGraph();
	streamGraph.setJobName(jobName);

	JobGraph jobGraph = streamGraph.getJobGraph();

	Configuration configuration = new Configuration();
	configuration.addAll(jobGraph.getJobConfiguration());

	configuration.setLong(TaskManagerOptions.MANAGED_MEMORY_SIZE, -1L);
	configuration.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, jobGraph.getMaximumParallelism());

	// add (and override) the settings with what the user defined
	configuration.addAll(this.conf);

	if (LOG.isInfoEnabled()) {
		LOG.info("Running job on local embedded Flink mini cluster");
	}

	LocalFlinkMiniCluster exec = new LocalFlinkMiniCluster(configuration, true);
	try {
		exec.start();
		return exec.submitJobAndWait(jobGraph, getConfig().isSysoutLoggingEnabled());
	}
	finally {
		transformations.clear();
		exec.stop();
	}
}
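The first two lines of execute() build the StreamGraph from the recorded transformations and convert it into a JobGraph. The same conversion can be triggered from user code to inspect what would actually be submitted. Below is a minimal sketch assuming the public getStreamGraph()/getJobGraph() methods of this Flink version; the class name is only illustrative:

import org.apache.flink.runtime.jobgraph.JobGraph;
import org.apache.flink.runtime.jobgraph.JobVertex;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.graph.StreamGraph;

public class JobGraphInspection {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements("xxx", "yyy", "zzz").print();

        // the same two steps that execute() performs internally
        StreamGraph streamGraph = env.getStreamGraph();
        JobGraph jobGraph = streamGraph.getJobGraph();

        // print the job vertices that would be handed to the mini cluster
        for (JobVertex vertex : jobGraph.getVertices()) {
            System.out.println(vertex.getName() + " (parallelism " + vertex.getParallelism() + ")");
        }
    }

}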

To summarize, execute() does the following:

1. Convert the env's StreamGraph into a JobGraph

2. Set up the Configuration the job runs with (see the sketch after this list)

3. Start a LocalFlinkMiniCluster

4. Submit the JobGraph to the cluster and wait for the result
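Step 2 is also where user-defined settings come in: configuration.addAll(this.conf) merges the Configuration the LocalStreamEnvironment was constructed with over the defaults (managed memory size, number of task slots). Below is a minimal sketch of passing such a configuration, assuming the createLocalEnvironment(parallelism, configuration) overload of this Flink version; the class name and the values 2 and 4 are only illustrative:

import org.apache.flink.configuration.ConfigConstants;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalEnvWithConfig {

    public static void main(String[] args) throws Exception {
        // user-defined settings; execute() merges them via configuration.addAll(this.conf)
        Configuration conf = new Configuration();
        conf.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, 4);

        StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment(2, conf);
        env.fromElements("xxx", "yyy", "zzz").print();
        env.execute("local-test-with-config");
    }

}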
