// Get the execution environment
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
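This line returns a usable execution environment, which is the context in which a Flink program runs. It records configuration such as the parallelism and provides a set of methods, for example the methods that read input streams and the execute method that runs the whole program. For a distributed stream processing program, operations such as flatMap and keyBy can be understood as declarations that tell the program which operators to use (this paragraph is adapted from https://www.cnblogs.com/bethunebtj/p/9168274.html). Next we step into the code and look at how the execution environment is obtained.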
Code walkthrough
Let's start with the code:
/**
* Creates an execution environment that represents the context in which the
* program is currently executed. If the program is invoked standalone, this
* method returns a local execution environment, as returned by
* {@link #createLocalEnvironment()}.
*
* @return The execution environment of the context in which the program is
* executed.
*/
public static StreamExecutionEnvironment getExecutionEnvironment() {
    return Utils.resolveFactory(threadLocalContextEnvironmentFactory, contextEnvironmentFactory)
        .map(StreamExecutionEnvironmentFactory::createExecutionEnvironment)
        .orElseGet(StreamExecutionEnvironment::createStreamExecutionEnvironment);
}
Here threadLocalContextEnvironmentFactory is defined as follows:
/** The ThreadLocal used to store {@link StreamExecutionEnvironmentFactory}. */
private static final ThreadLocal<StreamExecutionEnvironmentFactory> threadLocalContextEnvironmentFactory =
    new ThreadLocal<>();
As you can see, it is a ThreadLocal, so each thread can hold its own factory.
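To make the per-thread behaviour concrete, here is a minimal sketch using only the JDK. The class ThreadLocalDemo and the field FACTORY_NAME are hypothetical names chosen for illustration and are not part of Flink; a String stands in for the factory.

```java
// Hypothetical demo class, not part of Flink.
class ThreadLocalDemo {
    // A String stands in for the StreamExecutionEnvironmentFactory held by Flink.
    static final ThreadLocal<String> FACTORY_NAME = new ThreadLocal<>();

    // Sets the value in the calling thread, then reads it from a second thread.
    static String readFromOtherThread() {
        FACTORY_NAME.set("set-by-caller");
        String[] seen = new String[1];
        Thread other = new Thread(() -> seen[0] = FACTORY_NAME.get());
        other.start();
        try {
            other.join(); // wait so that seen[0] is safely visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        // The other thread has its own, still unset, slot.
        System.out.println(readFromOtherThread()); // null
        // The calling thread still sees the value it set.
        System.out.println(FACTORY_NAME.get());    // set-by-caller
    }
}
```

A value stored this way is only visible to the thread that stored it, which is why a factory registered in one thread does not affect environment lookups in other threads.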
The contextEnvironmentFactory variable is defined as follows:
/**
* The environment of the context (local by default, cluster if invoked through command line).
*/
private static StreamExecutionEnvironmentFactory contextEnvironmentFactory = null;
The resolveFactory function looks like this:
/**
* Resolves the given factories. The thread local factory has preference over the static factory.
* If none is set, the method returns {@link Optional#empty()}.
*
* @param threadLocalFactory containing the thread local factory
* @param staticFactory containing the global factory
* @param <T> type of factory
* @return Optional containing the resolved factory if it exists, otherwise it's empty
*/
public static <T> Optional<T> resolveFactory(ThreadLocal<T> threadLocalFactory, @Nullable T staticFactory) {
    // Read the factory stored for the current thread
    final T localFactory = threadLocalFactory.get();
    // If the current thread has no factory, fall back to staticFactory
    final T factory = localFactory == null ? staticFactory : localFactory;
    // Wrap the factory in an Optional; ofNullable does not throw when factory is null, it returns an empty Optional
    return Optional.ofNullable(factory);
}
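The precedence rules can be checked with a small standalone sketch. The class ResolveFactoryDemo is a hypothetical name, the resolveFactory body is copied from the listing above (without the @Nullable annotation) so that it compiles without Flink on the classpath, and plain strings stand in for the factories.

```java
import java.util.Optional;

// Hypothetical demo class; only resolveFactory mirrors the logic quoted above.
class ResolveFactoryDemo {

    // Same logic as the resolveFactory listing, copied so the sketch compiles on its own.
    static <T> Optional<T> resolveFactory(ThreadLocal<T> threadLocalFactory, T staticFactory) {
        final T localFactory = threadLocalFactory.get();
        final T factory = localFactory == null ? staticFactory : localFactory;
        return Optional.ofNullable(factory);
    }

    public static void main(String[] args) {
        ThreadLocal<String> threadLocal = new ThreadLocal<>();

        // Neither factory is set: the result is an empty Optional, no exception is thrown.
        System.out.println(resolveFactory(threadLocal, null).isPresent()); // false

        // Only the static factory is set: it is used as the fallback.
        System.out.println(resolveFactory(threadLocal, "static").get()); // static

        // Both are set: the thread-local value takes precedence.
        threadLocal.set("thread-local");
        System.out.println(resolveFactory(threadLocal, "static").get()); // thread-local
    }
}
```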
The map function (from java.util.Optional) looks like this:
/**
* If a value is present, apply the provided mapping function to it,
* and if the result is non-null, return an {@code Optional} describing the
* result. Otherwise return an empty {@code Optional}.
*
* @apiNote This method supports post-processing on optional values, without
* the need to explicitly check for a return status. For example, the
* following code traverses a stream of file names, selects one that has
* not yet been processed, and then opens that file, returning an
* {@code Optional<FileInputStream>}:
*
* <pre>{@code
*     Optional<FileInputStream> fis =
*         names.stream().filter(name -> !isProcessedYet(name))
*                       .findFirst()
*                       .map(name -> new FileInputStream(name));
* }</pre>
*
* Here, {@code findFirst} returns an {@code Optional<String>}, and then
* {@code map} returns an {@code Optional<FileInputStream>} for the desired
* file if one exists.
*
* @param <U> The type of the result of the mapping function
* @param mapper a mapping function to apply to the value, if present
* @return an {@code Optional} describing the result of applying a mapping
* function to the value of this {@code Optional}, if a value is present,
* otherwise an empty {@code Optional}
* @throws NullPointerException if the mapping function is null
*/
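public <U> Optional<U> map(Function<? super T, ? extends U> mapper) {
    // Fail fast: throws NullPointerException if mapper is null
    Objects.requireNonNull(mapper);
    if (!isPresent())
        // No value is present in this Optional, so return an empty Optional
        return empty();
    else {
        // Otherwise apply mapper to the value and wrap the result in an Optional;
        // in our case the mapper is createExecutionEnvironment, which produces a StreamExecutionEnvironment
        return Optional.ofNullable(mapper.apply(value));
    }
}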
The orElseGet function looks like this:
/**
* Return the value if present, otherwise invoke {@code other} and return
* the result of that invocation.
*
* @param other a {@code Supplier} whose result is returned if no value
* is present
* @return the value if present otherwise the result of {@code other.get()}
* @throws NullPointerException if value is not present and {@code other} is
* null
*/
public T orElseGet(Supplier<? extends T> other) {
    // Return value if it is not null; otherwise return the result of other.get()
    return value != null ? value : other.get();
}
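One detail that follows from the two listings is that both callbacks are lazy: the mapper only runs when a value is present, and the supplier only runs when it is absent. The sketch below counts the invocations to show this. OptionalChainDemo and trace are hypothetical names, and strings stand in for the factory and the environment.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of Flink or the JDK.
class OptionalChainDemo {

    // Runs a map/orElseGet chain and reports how often each callback was invoked.
    static String trace(String factory) {
        AtomicInteger mapperCalls = new AtomicInteger();
        AtomicInteger supplierCalls = new AtomicInteger();
        String env = Optional.ofNullable(factory)
                .map(f -> { mapperCalls.incrementAndGet(); return "env from " + f; })
                .orElseGet(() -> { supplierCalls.incrementAndGet(); return "default env"; });
        return env + " [mapper=" + mapperCalls.get() + ", supplier=" + supplierCalls.get() + "]";
    }

    public static void main(String[] args) {
        // A factory is present: only the mapper runs.
        System.out.println(trace("custom")); // env from custom [mapper=1, supplier=0]
        // No factory: the mapper is skipped and the supplier provides the fallback.
        System.out.println(trace(null));     // default env [mapper=0, supplier=1]
    }
}
```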
To summarize, Flink obtains the execution environment in the following steps:
1. First, look in the current thread's ThreadLocal for an object that implements the StreamExecutionEnvironmentFactory interface; if there is none, fall back to the contextEnvironmentFactory variable. The result is wrapped in an Optional, which holds the factory if one was found and is empty otherwise (the resolveFactory function).
2. Then call map on that Optional. If step 1 found a factory, its createExecutionEnvironment function is called to create a StreamExecutionEnvironment, which is returned wrapped in an Optional; if step 1 found no factory, an empty Optional is returned (the map function).
3. If the steps above did not produce a StreamExecutionEnvironment, call the static function createStreamExecutionEnvironment of the StreamExecutionEnvironment class to obtain one (the orElseGet function).
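The three steps above can be sketched end to end with stand-in types. Everything in this sketch is hypothetical and chosen for illustration only: Env and EnvFactory stand in for StreamExecutionEnvironment and StreamExecutionEnvironmentFactory, and createDefaultEnvironment stands in for createStreamExecutionEnvironment.

```java
import java.util.Optional;

// Hypothetical sketch of the lookup order; none of these types are Flink classes.
class EnvironmentLookupSketch {

    // Stand-in for StreamExecutionEnvironment; origin records who created it.
    static class Env {
        final String origin;
        Env(String origin) { this.origin = origin; }
    }

    // Stand-in for StreamExecutionEnvironmentFactory.
    interface EnvFactory {
        Env createExecutionEnvironment();
    }

    static final ThreadLocal<EnvFactory> threadLocalFactory = new ThreadLocal<>();
    static EnvFactory contextFactory = null;

    // Stand-in for createStreamExecutionEnvironment.
    static Env createDefaultEnvironment() {
        return new Env("default");
    }

    static Env getExecutionEnvironment() {
        EnvFactory local = threadLocalFactory.get();
        return Optional.ofNullable(local != null ? local : contextFactory)     // step 1
                .map(EnvFactory::createExecutionEnvironment)                   // step 2
                .orElseGet(EnvironmentLookupSketch::createDefaultEnvironment); // step 3
    }

    public static void main(String[] args) {
        System.out.println(getExecutionEnvironment().origin); // default

        contextFactory = () -> new Env("context factory");
        System.out.println(getExecutionEnvironment().origin); // context factory

        threadLocalFactory.set(() -> new Env("thread-local factory"));
        System.out.println(getExecutionEnvironment().origin); // thread-local factory
    }
}
```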
The createStreamExecutionEnvironment function looks like this:
private static StreamExecutionEnvironment createStreamExecutionEnvironment() {
    // because the streaming project depends on "flink-clients" (and not the other way around)
    // we currently need to intercept the data set environment and create a dependent stream env.
    // this should be fixed once we rework the project dependencies
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    if (env instanceof ContextEnvironment) {
        return new StreamContextEnvironment((ContextEnvironment) env);
    } else if (env instanceof OptimizerPlanEnvironment || env instanceof PreviewPlanEnvironment) {
        return new StreamPlanEnvironment(env);
    } else {
        return createLocalEnvironment();
    }
}
We will continue with the createStreamExecutionEnvironment function in the next post and look at what it does internally.