The SQL to be executed:
insert into sink
select emp_no, birth_date, first_name, last_name, gender, hire_date
from source u;
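Before tracing the internals, here is a minimal driver sketch showing how a statement like this enters the Table API (the "source" and "sink" tables are assumed to have been registered earlier via CREATE TABLE DDL; this is illustrative code, not Dlink's):
// Sketch: executeSql() routes INSERT statements down to
// TableEnvironmentImpl#executeInternal, which we trace below.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.TableResult;

public class InsertDemo {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(
                        EnvironmentSettings.newInstance().inStreamingMode().build());
        // "source" and "sink" are assumed to exist already.
        TableResult result =
                tEnv.executeSql(
                        "insert into sink "
                                + "select emp_no, birth_date, first_name, last_name, gender, hire_date "
                                + "from source u");
        result.getJobClient().ifPresent(c -> System.out.println("Job id: " + c.getJobID()));
    }
}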
In 《Dlink0.7.0初探》 (A First Look at Dlink 0.7.0), we saw that an insert statement ultimately ends up in the executeInternal method of the Flink Table API:
public class TableEnvironmentImpl implements TableEnvironmentInternal {
    @Override
    public TableResult executeInternal(List<ModifyOperation> operations) {
        List<Transformation<?>> transformations = translate(operations);
        List<String> sinkIdentifierNames = extractSinkIdentifierNames(operations);
        String jobName = getJobName("insert-into_" + String.join(",", sinkIdentifierNames));
        Pipeline pipeline = execEnv.createPipeline(transformations, tableConfig, jobName);
        try {
            JobClient jobClient = execEnv.executeAsync(pipeline);
            TableSchema.Builder builder = TableSchema.builder();
            Object[] affectedRowCounts = new Long[operations.size()];
            for (int i = 0; i < operations.size(); ++i) {
                // use sink identifier name as field name
                builder.field(sinkIdentifierNames.get(i), DataTypes.BIGINT());
                affectedRowCounts[i] = -1L;
            }
            return TableResultImpl.builder()
                    .jobClient(jobClient)
                    .resultKind(ResultKind.SUCCESS_WITH_CONTENT)
                    .tableSchema(builder.build())
                    .data(
                            new InsertResultIterator(
                                    jobClient, Row.of(affectedRowCounts), userClassLoader))
                    .build();
        } catch (Exception e) {
            throw new TableException("Failed to execute sql", e);
        }
    }
}
Let's trace the first statement, translate:
The work is handed to the planner, which performs the conversion. The planner here is a StreamPlanner, quite a heavyweight object, and in this case it is a Scala class (StreamPlanner.scala).
StreamPlanner.scala extends PlannerBase.scala, which means it also inherits the translate method: when TableEnvironmentImpl's translate calls planner.translate, it is really invoking the translate method that StreamPlanner inherits from PlannerBase.scala.
After translate finishes, it returns a List<Transformation<?>>; in the debugger you can see objects of several subclasses of the abstract Transformation class, which are introduced below.
A Transformation is Flink's wrapper around the extra information about an operator, such as its name, id, output type, inputs, parallelism, and so on; a Transformation represents an operation that creates a new DataStream from one or more DataStreams.
@Internal
public abstract class Transformation<T> {
    // This is used to assign a unique ID to every Transformation
    protected static Integer idCounter = 0;

    protected final int id;

    protected String name;

    // The output type is wrapped in a TypeInformation, which is used to create
    // serializers and comparators, and to perform some type checks.
    protected TypeInformation<T> outputType;

    // This is used to handle MissingTypeInfo. As long as the outputType has not been queried
    // it can still be changed using setOutputType(). Afterwards an exception is thrown when
    // trying to change the output type.
    protected boolean typeUsed;

    private int parallelism;

    // ......
}
Under the hood, a DataStream is essentially a Transformation that describes how the DataStream came to be. Looking at DataStream, one of its key fields is a Transformation:
package org.apache.flink.streaming.api.datastream;

@Public
public class DataStream<T> {
    protected final StreamExecutionEnvironment environment;

    protected final Transformation<T> transformation;

    /**
     * Create a new {@link DataStream} in the given execution environment with partitioning set to
     * forward by default.
     *
     * @param environment The StreamExecutionEnvironment
     */
    public DataStream(StreamExecutionEnvironment environment, Transformation<T> transformation) {
        this.environment =
                Preconditions.checkNotNull(environment, "Execution Environment must not be null.");
        this.transformation =
                Preconditions.checkNotNull(
                        transformation, "Stream Transformation must not be null.");
    }

    @Internal
    public int getId() {
        return transformation.getId();
    }

    public int getParallelism() {
        return transformation.getParallelism();
    }

    @PublicEvolving
    public ResourceSpec getMinResources() {
        return transformation.getMinResources();
    }

    // ......
}
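To make the relationship concrete, here is a small sketch (plain DataStream API, nothing Dlink-specific; getTransformation() is marked @Internal but is handy for inspection):
// Sketch: every DataStream carries the Transformation describing its origin.
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TransformationPeek {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> words = env.fromElements("a", "bb", "ccc");
        DataStream<Integer> lengths = words.map(s -> s.length()).returns(Types.INT);
        // Prints the wrapped Transformation's name, id and parallelism, e.g. "Map".
        System.out.println(lengths.getTransformation().getName());
        System.out.println(lengths.getTransformation().getId());
        System.out.println(lengths.getTransformation().getParallelism());
    }
}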
Let's look at the class hierarchy of Transformation:
The diagram above shows the full picture, which may be hard to read, so here is a close-up of the key classes to make their inheritance relationships clear:
Transformation has many implementation classes, including many direct subclasses. The one that matters here is PhysicalTransformation, which creates physical operations; its subclasses LegacySinkTransformation and OneInputTransformation are the ones involved in our example.
PhysicalTransformation extends the abstract Transformation class and adds a single method, setChainingStrategy, which enables setting the {@link ChainingStrategy}, i.e. defining how the operator may be chained to others:
@Internal
public abstract class PhysicalTransformation<T> extends Transformation<T> {

    /**
     * Creates a new {@code Transformation} with the given name, output type and parallelism.
     *
     * @param name The name of the {@code Transformation}, this will be shown in Visualizations and
     *     the Log
     * @param outputType The output type of this {@code Transformation}
     * @param parallelism The parallelism of this {@code Transformation}
     */
    PhysicalTransformation(String name, TypeInformation<T> outputType, int parallelism) {
        super(name, outputType, parallelism);
    }

    /** Sets the chaining strategy of this {@code Transformation}. */
    public abstract void setChainingStrategy(ChainingStrategy strategy);
}
ChainingStrategy is an enum that defines an operator's chaining scheme; when an operator is chained to its predecessor, the two run in the same thread:
@PublicEvolving
public enum ChainingStrategy {

    /**
     * Operators will be eagerly chained whenever possible.
     *
     * <p>To optimize performance, it is generally a good practice to allow maximal chaining and
     * increase operator parallelism.
     */
    ALWAYS,

    /** The operator will not be chained to the preceding or succeeding operators. */
    NEVER,

    /**
     * The operator will not be chained to the predecessor, but successors may chain to this
     * operator.
     */
    HEAD,

    /**
     * This operator will run at the head of a chain (similar as in {@link #HEAD}, but it will
     * additionally try to chain source inputs if possible. This allows multi-input operators to be
     * chained with multiple sources into one task.
     */
    HEAD_WITH_SOURCES;

    public static final ChainingStrategy DEFAULT_CHAINING_STRATEGY = ALWAYS;
}
The ChainingStrategy values are summarized below:
Strategy | Description |
---|---|
ALWAYS | Operators are chained together whenever possible (to optimize performance, it is generally best to allow maximal chaining and increase operator parallelism) |
NEVER | The operator is chained to neither its predecessor nor its successor |
HEAD | Successors may chain to this operator, but it will not chain to its predecessor |
HEAD_WITH_SOURCES | Like HEAD, but this strategy additionally tries to chain source operators |
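In the DataStream API these strategies surface through a couple of convenience methods; a short sketch (illustrative, not Dlink code) to connect the dots:
// Sketch: startNewChain() puts HEAD on the underlying PhysicalTransformation,
// disableChaining() puts NEVER; without either call the default ALWAYS applies.
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ChainingDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<Integer> out =
                env.fromElements(1, 2, 3)
                        .map(x -> x * 2).returns(Types.INT)
                        .startNewChain()       // HEAD: successors may chain to this operator
                        .filter(x -> x > 2)
                        .disableChaining();    // NEVER: runs in its own chain
        out.print();
        env.execute("chaining-demo");
    }
}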
OneInputTransformation represents the application of an org.apache.flink.streaming.api.operators.OneInputStreamOperator to an input Transformation.
From the code alone there is honestly not much logic to see; a lot of the work is done invisibly by the Flink framework.
It has one key field, input:
public class OneInputTransformation<IN, OUT> extends PhysicalTransformation<OUT> {
    // ......
    private final Transformation<IN> input;
    // ......
}
In our example, the input is a LegacySourceTransformation, i.e. the source table we created with "create table source":
public class LegacySourceTransformation<T> extends PhysicalTransformation<T>
        implements WithBoundedness {

    private final StreamOperatorFactory<T> operatorFactory;

    private final Boundedness boundedness;

    public LegacySourceTransformation(
            String name,
            StreamSource<T, ?> operator,
            TypeInformation<T> outputType,
            int parallelism,
            Boundedness boundedness) {
        this(name, SimpleOperatorFactory.of(operator), outputType, parallelism, boundedness);
    }

    // ......
}
Internally, a LegacySourceTransformation is made up of a StreamOperatorFactory, the source name, the output type, the parallelism, and the boundedness of the data source; the boundedness property determines whether the source is treated as streaming or batch.
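For reference, the boundedness flag is the Boundedness enum from flink-core; a trivial sketch:
// Boundedness has exactly two values: a BOUNDED source is finite (batch-style
// execution is possible), a CONTINUOUS_UNBOUNDED one never ends (streaming).
import org.apache.flink.api.connector.source.Boundedness;

public class BoundednessDemo {
    public static void main(String[] args) {
        System.out.println(Boundedness.BOUNDED);
        System.out.println(Boundedness.CONTINUOUS_UNBOUNDED);
    }
}
The sink side of our pipeline is represented by a LegacySinkTransformation: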
public class LegacySinkTransformation<T> extends PhysicalTransformation<Object> {

    private final Transformation<T> input;

    private final StreamOperatorFactory<Object> operatorFactory;

    // We need this because sinks can also have state that is partitioned by key
    private KeySelector<T, ?> stateKeySelector;

    private TypeInformation<?> stateKeyType;

    public LegacySinkTransformation(
            Transformation<T> input, String name, StreamSink<T> operator, int parallelism) {
        this(input, name, SimpleOperatorFactory.of(operator), parallelism);
    }

    public LegacySinkTransformation(
            Transformation<T> input,
            String name,
            StreamOperatorFactory<Object> operatorFactory,
            int parallelism) {
        super(name, TypeExtractor.getForClass(Object.class), parallelism);
        this.input = input;
        this.operatorFactory = operatorFactory;
    }

    // ......
}
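Putting the three together: the sink's input points at the OneInputTransformation, whose input in turn is the LegacySourceTransformation. A small sketch (a hypothetical helper, not Flink or Dlink code) that walks such a graph via Transformation#getInputs():
// Sketch: recursively print a Transformation DAG, e.g.
//   LegacySinkTransformation (Sink: ...)
//     OneInputTransformation (...)
//       LegacySourceTransformation (Source: ...)
import org.apache.flink.api.dag.Transformation;

public class TransformationWalker {
    public static void dump(Transformation<?> transformation, String indent) {
        System.out.println(
                indent + transformation.getClass().getSimpleName()
                        + " (" + transformation.getName() + ")");
        for (Transformation<?> input : transformation.getInputs()) {
            dump(input, indent + "  ");
        }
    }
}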
So in this example, translate returns a List<Transformation<?>> built from the transformations above. Next, the second statement, extractSinkIdentifierNames:
public class TableEnvironmentImpl implements TableEnvironmentInternal {
    // ......
    private List<String> extractSinkIdentifierNames(List<ModifyOperation> operations) {
        List<String> tableNames = new ArrayList<>(operations.size());
        Map<String, Integer> tableNameToCount = new HashMap<>();
        for (ModifyOperation operation : operations) {
            if (operation instanceof CatalogSinkModifyOperation) {
                ObjectIdentifier identifier =
                        ((CatalogSinkModifyOperation) operation).getTableIdentifier();
                String fullName = identifier.asSummaryString();
                tableNames.add(fullName);
                tableNameToCount.put(fullName, tableNameToCount.getOrDefault(fullName, 0) + 1);
            } else {
                throw new UnsupportedOperationException("Unsupported operation: " + operation);
            }
        }
        Map<String, Integer> tableNameToIndex = new HashMap<>();
        return tableNames.stream()
                .map(
                        tableName -> {
                            if (tableNameToCount.get(tableName) == 1) {
                                return tableName;
                            } else {
                                Integer index = tableNameToIndex.getOrDefault(tableName, 0) + 1;
                                tableNameToIndex.put(tableName, index);
                                return tableName + "_" + index;
                            }
                        })
                .collect(Collectors.toList());
    }
    // ......
}
There is no particularly complex logic here: it extracts the sink identifiers, and it only returns names when each incoming operation is a CatalogSinkModifyOperation; anything else throws.
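The only subtlety is the index suffix: when the same sink appears more than once, each occurrence gets a numbered name. A tiny standalone demo of just that renaming rule (plain Java, mirroring the stream pipeline above):
// Demo: duplicate sink names become name_1, name_2, ...; unique names stay as-is.
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SinkNameDedup {
    public static void main(String[] args) {
        List<String> tableNames = Arrays.asList("cat.db.sink", "cat.db.sink", "cat.db.other");
        Map<String, Integer> tableNameToCount = new HashMap<>();
        tableNames.forEach(n -> tableNameToCount.merge(n, 1, Integer::sum));
        Map<String, Integer> tableNameToIndex = new HashMap<>();
        List<String> result =
                tableNames.stream()
                        .map(n -> tableNameToCount.get(n) == 1
                                ? n
                                : n + "_" + tableNameToIndex.merge(n, 1, Integer::sum))
                        .collect(Collectors.toList());
        System.out.println(result); // [cat.db.sink_1, cat.db.sink_2, cat.db.other]
    }
}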
The next step, execEnv.createPipeline, was briefly mentioned last time; let's go through it in detail now.
At its core, it constructs a StreamGraphGenerator and then produces a StreamGraph via its generate() method. The important part of generating the StreamGraph is iterating over the transformations and turning each element into nodes of the computation graph; the process is fairly involved and the call chain is deep. The figure above tries to show the key code in a single picture.
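In plain DataStream terms, a sketch of what createPipeline boils down to:
// Sketch: StreamExecutionEnvironment drives a StreamGraphGenerator over the
// transformations it has recorded; getExecutionPlan() renders the StreamGraph as JSON.
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StreamGraphDemo {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(1, 2, 3).map(x -> x + 1).returns(Types.INT).print();
        System.out.println(env.getExecutionPlan()); // JSON view of the StreamGraph
    }
}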
Then comes executeAsync: from TableEnvironmentImpl's executeAsync to ExecutorBase's executeAsync, then to StreamExecutionEnvironment's executeAsync, which in turn calls DefaultExecutorServiceLoader's getExecutorFactory.
getExecutorFactory has to return a PipelineExecutorFactory. It first loads and instantiates the implementation classes of PipelineExecutorFactory from the service definition files, then obtains the list of loaded factories (the values of knownProviders inside the factories variable in the lower variable pane of the figure above; there are 4 of them here), then iterates over all loaded factories, puts those compatible with the configuration into the result, and returns it after an exception check.
PipelineExecutorFactory is a class from the Flink source code; searching the Flink sources for its service definition files turns up 5 of them, located in the flink-clients, flink-java, flink-kubernetes, flink-streaming-java, and flink-yarn subprojects.
The service definition file org.apache.flink.core.execution.PipelineExecutorFactory in flink-clients has the following content:
org.apache.flink.client.deployment.executors.RemoteExecutorFactory
org.apache.flink.client.deployment.executors.LocalExecutorFactory
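These two entries are standard Java SPI: DefaultExecutorServiceLoader is built on java.util.ServiceLoader, which reads exactly such META-INF/services files. A simplified sketch of the lookup it performs (error handling omitted):
// Sketch: load all PipelineExecutorFactory implementations on the classpath
// and keep the one compatible with the current configuration.
import java.util.ServiceLoader;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.execution.PipelineExecutorFactory;

public class FactoryLookupSketch {
    static PipelineExecutorFactory find(Configuration configuration) {
        for (PipelineExecutorFactory factory :
                ServiceLoader.load(PipelineExecutorFactory.class)) {
            if (factory.isCompatibleWith(configuration)) {
                return factory;
            }
        }
        throw new IllegalStateException("No executor factory found for the configuration.");
    }
}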
Let's take a look at LocalExecutorFactory:
/** An {@link PipelineExecutorFactory} for {@link LocalExecutor local executors}. */
@Internal
public class LocalExecutorFactory implements PipelineExecutorFactory {

    @Override
    public String getName() {
        return LocalExecutor.NAME;
    }

    @Override
    public boolean isCompatibleWith(final Configuration configuration) {
        return LocalExecutor.NAME.equalsIgnoreCase(configuration.get(DeploymentOptions.TARGET));
    }

    @Override
    public PipelineExecutor getExecutor(final Configuration configuration) {
        return LocalExecutor.create(configuration);
    }
}
The isCompatibleWith method here is the one that was called during executeAsync; it simply compares LocalExecutor.NAME against Flink's current runtime Configuration (specifically the execution.target property; running in IDEA here, it defaults to local) and returns true when they match.
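In other words, the whole compatibility check reduces to a single configuration key; a sketch:
// Sketch: execution.target decides which PipelineExecutorFactory wins.
import org.apache.flink.client.deployment.executors.LocalExecutorFactory;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.DeploymentOptions;

public class TargetDemo {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set(DeploymentOptions.TARGET, "local"); // key: "execution.target"
        System.out.println(new LocalExecutorFactory().isCompatibleWith(conf)); // true
    }
}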
A PipelineExecutor's main job is to convert the StreamGraph into a JobGraph.
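For orientation, the executor contract itself is tiny; its shape (matching the LocalExecutor#execute signature shown further below) is:
// The PipelineExecutor contract: take a Pipeline (here a StreamGraph) and the
// Configuration, and asynchronously return a JobClient for the submitted job.
public interface PipelineExecutor {
    CompletableFuture<JobClient> execute(
            Pipeline pipeline, Configuration configuration, ClassLoader userCodeClassloader)
            throws Exception;
}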
Once more, the path runs from TableEnvironmentImpl's executeAsync to ExecutorBase's executeAsync, then to StreamExecutionEnvironment's executeAsync, which now calls LocalExecutorFactory's getExecutor:
Immediately afterwards, LocalExecutor's execute method runs; let's look at it, as the call chain is fairly deep:
Let's look at the contents of the JobGraph:
The JobGraph is about to be sent to the Flink cluster; as you can see, many of its key fields hold binary content, such as inputs, chainedTaskConfig_, udf, and chainedOutputs.
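Those binary fields live inside each vertex's serialized configuration; a sketch of inspecting a JobGraph right before submission (an illustrative helper, not Flink code):
// Sketch: enumerate the JobGraph's vertices; each operator chain built from
// the StreamGraph becomes one JobVertex here.
import org.apache.flink.runtime.jobgraph.JobGraph;
import org.apache.flink.runtime.jobgraph.JobVertex;

public class JobGraphPeek {
    static void dump(JobGraph jobGraph) {
        System.out.println("job id: " + jobGraph.getJobID());
        for (JobVertex vertex : jobGraph.getVertices()) {
            System.out.println(
                    vertex.getID() + " " + vertex.getName()
                            + " parallelism=" + vertex.getParallelism());
        }
    }
}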
The figure above focuses on how LocalExecutor's execute method obtains the JobGraph; what follows is the logic that submits the JobGraph to the Flink cluster, the heaviest operation of all. Since we are debugging in IDEA, a MiniCluster is used:
public class LocalExecutor implements PipelineExecutor {
    // ......
    public CompletableFuture<JobClient> execute(
            Pipeline pipeline, Configuration configuration, ClassLoader userCodeClassloader)
            throws Exception {
        // ......
        return PerJobMiniClusterFactory.createWithFactory(effectiveConfig, miniClusterFactory)
                .submitJob(jobGraph, userCodeClassloader);
    }
    // ......
}
The snippet above shows the final step of LocalExecutor's execute, submitting the JobGraph to the Flink cluster: it first creates a PerJobMiniClusterFactory and then submits the JobGraph through it.
PerJobMiniClusterFactory's submitJob method first creates a MiniCluster and starts it via its start method; start is fairly long and mainly launches a whole set of services.
Once MiniCluster's start method has run, PerJobMiniClusterFactory's submitJob calls the MiniCluster object's submitJob method, which mainly creates and returns a CompletableFuture.
The original CompletableFuture here is in fact produced by a utility class, org.apache.flink.runtime.concurrent.FutureUtils:
public class FutureUtils {
    // ......
    public static <T> CompletableFuture<T> retryWithDelay(
            final Supplier<CompletableFuture<T>> operation,
            final RetryStrategy retryStrategy,
            final ScheduledExecutor scheduledExecutor) {
        return retryWithDelay(operation, retryStrategy, (throwable) -> true, scheduledExecutor);
    }

    public static <T> CompletableFuture<T> retryWithDelay(
            final Supplier<CompletableFuture<T>> operation,
            final RetryStrategy retryStrategy,
            final Predicate<Throwable> retryPredicate,
            final ScheduledExecutor scheduledExecutor) {
        final CompletableFuture<T> resultFuture = new CompletableFuture<>();
        retryOperationWithDelay(
                resultFuture, operation, retryStrategy, retryPredicate, scheduledExecutor);
        return resultFuture;
    }

    private static <T> void retryOperationWithDelay(
            final CompletableFuture<T> resultFuture,
            final Supplier<CompletableFuture<T>> operation,
            final RetryStrategy retryStrategy,
            final Predicate<Throwable> retryPredicate,
            final ScheduledExecutor scheduledExecutor) {
        if (!resultFuture.isDone()) {
            final CompletableFuture<T> operationResultFuture = operation.get();
            operationResultFuture.whenComplete(
                    (t, throwable) -> {
                        if (throwable != null) {
                            if (throwable instanceof CancellationException) {
                                resultFuture.completeExceptionally(
                                        new RetryException(
                                                "Operation future was cancelled.", throwable));
                            } else {
                                throwable = ExceptionUtils.stripExecutionException(throwable);
                                if (!retryPredicate.test(throwable)) {
                                    resultFuture.completeExceptionally(throwable);
                                } else if (retryStrategy.getNumRemainingRetries() > 0) {
                                    long retryDelayMillis =
                                            retryStrategy.getRetryDelay().toMillis();
                                    final ScheduledFuture<?> scheduledFuture =
                                            scheduledExecutor.schedule(
                                                    (Runnable)
                                                            () ->
                                                                    retryOperationWithDelay(
                                                                            resultFuture,
                                                                            operation,
                                                                            retryStrategy.getNextRetryStrategy(),
                                                                            retryPredicate,
                                                                            scheduledExecutor),
                                                    retryDelayMillis,
                                                    TimeUnit.MILLISECONDS);
                                    resultFuture.whenComplete(
                                            (innerT, innerThrowable) ->
                                                    scheduledFuture.cancel(false));
                                } else {
                                    RetryException retryException =
                                            new RetryException(
                                                    "Could not complete the operation. Number of retries has been exhausted.",
                                                    throwable);
                                    resultFuture.completeExceptionally(retryException);
                                }
                            }
                        } else {
                            resultFuture.complete(t);
                        }
                    });
            resultFuture.whenComplete((t, throwable) -> operationResultFuture.cancel(false));
        }
    }
    // ......
}
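To close the loop, here is a sketch of how such a retrying future is typically used. FixedRetryStrategy and ScheduledExecutorServiceAdapter are assumed helper classes from flink-runtime; check the RetryStrategy implementations available in your Flink version:
// Sketch: retry an async operation up to 3 times with a fixed 100 ms delay.
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import org.apache.flink.runtime.concurrent.FixedRetryStrategy;
import org.apache.flink.runtime.concurrent.FutureUtils;
import org.apache.flink.runtime.concurrent.ScheduledExecutorServiceAdapter;

public class RetryDemo {
    public static void main(String[] args) {
        // FixedRetryStrategy(maxRetries, delay) is an assumed RetryStrategy
        // implementation; any other implementation is wired in the same way.
        CompletableFuture<String> result =
                FutureUtils.retryWithDelay(
                        () -> CompletableFuture.supplyAsync(() -> "ok"),
                        new FixedRetryStrategy(3, Duration.ofMillis(100)),
                        new ScheduledExecutorServiceAdapter(
                                Executors.newSingleThreadScheduledExecutor()));
        System.out.println(result.join());
    }
}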