A Deep Dive into TableEnvironment's executeInternal Execution Process (Dinky Flink)

1. The insert into execution process in detail

The SQL to be executed:

insert into sink
select emp_no, birth_date, first_name, last_name, gender, hire_date
from source u;
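
For context, here is a hedged sketch of how such a statement is typically submitted from user code (the environment setup is illustrative; "source" and "sink" are assumed to have been registered earlier via CREATE TABLE). executeSql parses the INSERT into a ModifyOperation and routes it into executeInternal:

// Minimal sketch, assuming a streaming TableEnvironment and pre-registered tables:
EnvironmentSettings settings = EnvironmentSettings.newInstance().inStreamingMode().build();
TableEnvironment tEnv = TableEnvironment.create(settings);

// executeSql hands an INSERT statement to executeInternal under the hood
TableResult result = tEnv.executeSql(
        "insert into sink "
                + "select emp_no, birth_date, first_name, last_name, gender, hire_date "
                + "from source u");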

In 《Dlink0.7.0初探》, we saw that the insert statement ultimately calls the Flink Table API's executeInternal method:

public class TableEnvironmentImpl implements TableEnvironmentInternal {
    @Override
    public TableResult executeInternal(List<ModifyOperation> operations) {
        List<Transformation<?>> transformations = translate(operations);
        List<String> sinkIdentifierNames = extractSinkIdentifierNames(operations);
        String jobName = getJobName("insert-into_" + String.join(",", sinkIdentifierNames));
        Pipeline pipeline = execEnv.createPipeline(transformations, tableConfig, jobName);
        try {
            JobClient jobClient = execEnv.executeAsync(pipeline);
            TableSchema.Builder builder = TableSchema.builder();
            Object[] affectedRowCounts = new Long[operations.size()];
            for (int i = 0; i < operations.size(); ++i) {
                // use sink identifier name as field name
                builder.field(sinkIdentifierNames.get(i), DataTypes.BIGINT());
                affectedRowCounts[i] = -1L;
            }

            return TableResultImpl.builder()
                    .jobClient(jobClient)
                    .resultKind(ResultKind.SUCCESS_WITH_CONTENT)
                    .tableSchema(builder.build())
                    .data(
                            new InsertResultIterator(
                                    jobClient, Row.of(affectedRowCounts), userClassLoader))
                    .build();
        } catch (Exception e) {
            throw new TableException("Failed to execute sql", e);
        }
    }
}
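
On the caller side, the returned TableResult can be used to track the job; a small hedged example (tEnv is the TableEnvironment from the earlier sketch, and the SQL string is abbreviated here for illustration):

TableResult result = tEnv.executeSql("insert into sink select emp_no from source u");
result.getJobClient().ifPresent(client -> System.out.println(client.getJobID()));
result.await(); // blocks until the insert job created by executeInternal finishes
                // (await throws InterruptedException/ExecutionException)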

1.1 translate

Let's trace the first statement, translate:
[Figure 1]
The work is handed off to the planner to perform the translation. The planner is a StreamPlanner; as the figure shows, it is quite a heavyweight object, and here it is implemented in Scala (StreamPlanner.scala);
[Figure 2]
StreamPlanner.scala extends PlannerBase.scala, which means it also inherits the translate method; so when TableEnvironmentImpl.java's translate calls planner.translate, it is actually invoking the translate method that StreamPlanner.scala inherits from PlannerBase.scala;
[Figure 3]
After translate finishes, it returns a List<Transformation<?>>; let's look at its value:
[Figure 4]
We can see objects of several subclasses of the abstract Transformation class; they are introduced below;

1.1.1 Transformation

The translate method returns Transformation objects. A Transformation is essentially Flink's wrapper for an operator's metadata — its name, id, output type, inputs, parallelism, and so on; a Transformation represents an operation that produces a new DataStream from one or more DataStreams;

@Internal
public abstract class Transformation<T> {

    // This is used to assign a unique ID to every Transformation
    protected static Integer idCounter = 0;

    protected final int id;

    protected String name;

    // The output type, wrapped in a TypeInformation; used to create the serializers
    // for serialization and the comparators for ordering, and for some type checks.
    protected TypeInformation<T> outputType;
    // This is used to handle MissingTypeInfo. As long as the outputType has not been queried
    // it can still be changed using setOutputType(). Afterwards an exception is thrown when
    // trying to change the output type.
    protected boolean typeUsed;

    private int parallelism;

    // ......
}
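
The idCounter above feeds a static id factory on the same class; for reference, simplified from the Flink source:

// Transformation ids are process-wide and monotonically increasing:
public static int getNewNodeId() {
    idCounter++;
    return idCounter;
}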

Under the hood, a DataStream is just a Transformation that describes how that DataStream is produced. Looking at DataStream, one of its key fields is indeed a Transformation:

package org.apache.flink.streaming.api.datastream;

@Public
public class DataStream<T> {

    protected final StreamExecutionEnvironment environment;

    protected final Transformation<T> transformation;

    /**
     * Create a new {@link DataStream} in the given execution environment with partitioning set to
     * forward by default.
     *
     * @param environment The StreamExecutionEnvironment
     */
    public DataStream(StreamExecutionEnvironment environment, Transformation<T> transformation) {
        this.environment =
                Preconditions.checkNotNull(environment, "Execution Environment must not be null.");
        this.transformation =
                Preconditions.checkNotNull(
                        transformation, "Stream Transformation must not be null.");
    }

    @Internal
    public int getId() {
        return transformation.getId();
    }

    public int getParallelism() {
        return transformation.getParallelism();
    }

    @PublicEvolving
    public ResourceSpec getMinResources() {
        return transformation.getMinResources();
    }
    
    // ......
}
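
To make the relationship concrete, a hedged sketch: each DataStream operation wraps the previous Transformation in a new one (env and the element types here are illustrative, not from the example job):

DataStream<String> source = env.fromElements("a", "bb"); // a source Transformation inside
DataStream<Integer> lengths = source.map(String::length); // a OneInputTransformation inside

// Each DataStream is just a handle to its Transformation:
Transformation<Integer> t = lengths.getTransformation();
System.out.println(t.getName() + ", parallelism=" + t.getParallelism());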

Now look at the Transformation class hierarchy:
[Figure 5]
The figure above is the full picture and may be hard to read, so here is a close-up of the key classes and their inheritance relationships:
[Figure 6]
Transformation has many implementations, including many direct subclasses. The focus here is on PhysicalTransformation, which creates physical operations; of its subclasses, LegacySinkTransformation and OneInputTransformation are the ones involved in this example;

1.1.1.1 PhysicalTransformation

A PhysicalTransformation creates a physical operation and enables setting a {@link ChainingStrategy}.

PhysicalTransformation extends the abstract Transformation class and adds a setChainingStrategy method, through which the chaining mode of an operator can be defined:

@Internal
public abstract class PhysicalTransformation<T> extends Transformation<T> {

    /**
     * Creates a new {@code Transformation} with the given name, output type and parallelism.
     *
     * @param name The name of the {@code Transformation}, this will be shown in Visualizations and
     *     the Log
     * @param outputType The output type of this {@code Transformation}
     * @param parallelism The parallelism of this {@code Transformation}
     */
    PhysicalTransformation(String name, TypeInformation<T> outputType, int parallelism) {
        super(name, outputType, parallelism);
    }

    /** Sets the chaining strategy of this {@code Transformation}. */
    public abstract void setChainingStrategy(ChainingStrategy strategy);
}

ChainingStrategy is an enum that defines how operators are chained; when an operator is chained to its predecessor, the two run in the same thread.

@PublicEvolving
public enum ChainingStrategy {

    /**
     * Operators will be eagerly chained whenever possible.
     *
     * <p>To optimize performance, it is generally a good practice to allow maximal chaining and
     * increase operator parallelism.
     */
    ALWAYS,

    /** The operator will not be chained to the preceding or succeeding operators. */
    NEVER,

    /**
     * The operator will not be chained to the predecessor, but successors may chain to this
     * operator.
     */
    HEAD,

    /**
     * This operator will run at the head of a chain (similar as in {@link #HEAD}, but it will
     * additionally try to chain source inputs if possible. This allows multi-input operators to be
     * chained with multiple sources into one task.
     */
    HEAD_WITH_SOURCES;

    public static final ChainingStrategy DEFAULT_CHAINING_STRATEGY = ALWAYS;
}

The ChainingStrategy values are summarized below:

Strategy            Description
ALWAYS              Operators are chained together whenever possible (to optimize performance, it is generally best to allow maximal chaining and increase operator parallelism)
NEVER               The operator is chained to neither its predecessor nor its successor
HEAD                Successor operators may chain to this operator, but it will not chain to its predecessor
HEAD_WITH_SOURCES   Like HEAD, but this strategy additionally tries to chain source operators
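
In the DataStream API these strategies surface through convenience methods on the operator handle; a small hedged example (source and the map call are illustrative):

SingleOutputStreamOperator<Integer> mapped = source.map(String::length);
mapped.disableChaining(); // ChainingStrategy.NEVER: chain to neither neighbor
// or: mapped.startNewChain(); // ChainingStrategy.HEAD: start a new chain at this operator
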
1.1.1.2 OneInputTransformation

Represents the application of an org.apache.flink.streaming.api.operators.OneInputStreamOperator to an input Transformation.
[Figure 7]
Frankly, not much of the logic is visible in the code itself; a lot happens invisibly inside the Flink framework;

Note its one key field, input:

public class OneInputTransformation<IN, OUT> extends PhysicalTransformation<OUT> {
    // ......
    private final Transformation<IN> input;
    // ......
}

In our example, the input value is a LegacySourceTransformation — the source defined by our "create table source" statement;

1.1.1.3 LegacySourceTransformation

public class LegacySourceTransformation<T> extends PhysicalTransformation<T>
        implements WithBoundedness {

    private final StreamOperatorFactory<T> operatorFactory;

    private final Boundedness boundedness;

    public LegacySourceTransformation(
            String name,
            StreamSource<T, ?> operator,
            TypeInformation<T> outputType,
            int parallelism,
            Boundedness boundedness) {
        this(name, SimpleOperatorFactory.of(operator), outputType, parallelism, boundedness);
    }
    // ......
}

Internally, LegacySourceTransformation consists of a StreamOperatorFactory, the source name, the output type, the parallelism, the boundedness of the source, and so on; the boundedness property determines whether the source is streaming or batch;
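
For reference, Boundedness (org.apache.flink.api.connector.source.Boundedness) is a simple enum:

public enum Boundedness {
    /** A bounded (finite) input; batch-style execution is possible. */
    BOUNDED,
    /** An unbounded (infinite) input; must be executed in streaming mode. */
    CONTINUOUS_UNBOUNDED
}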

1.1.1.4 LegacySinkTransformation

public class LegacySinkTransformation<T> extends PhysicalTransformation<Object> {

    private final Transformation<T> input;

    private final StreamOperatorFactory<Object> operatorFactory;

    // We need this because sinks can also have state that is partitioned by key
    private KeySelector<T, ?> stateKeySelector;

    private TypeInformation<?> stateKeyType;

    public LegacySinkTransformation(
            Transformation<T> input, String name, StreamSink<T> operator, int parallelism) {
        this(input, name, SimpleOperatorFactory.of(operator), parallelism);
    }

    public LegacySinkTransformation(
            Transformation<T> input,
            String name,
            StreamOperatorFactory<Object> operatorFactory,
            int parallelism) {
        super(name, TypeExtractor.getForClass(Object.class), parallelism);
        this.input = input;
        this.operatorFactory = operatorFactory;
    }
    // ......
}

In this example, the List<Transformation<?>> returned by translate has exactly one element, a LegacySinkTransformation; its input field holds a OneInputTransformation, whose input in turn holds a LegacySourceTransformation — a chain much in the spirit of the chain-of-responsibility pattern;
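
A hedged sketch of walking that chain, assuming a Flink version (1.12+) where Transformation exposes getInputs():

// transformations is the list returned by translate(operations)
Transformation<?> sink = transformations.get(0);        // LegacySinkTransformation
Transformation<?> oneInput = sink.getInputs().get(0);   // OneInputTransformation
Transformation<?> source = oneInput.getInputs().get(0); // LegacySourceTransformation
System.out.println(sink.getName() + " <- " + oneInput.getName() + " <- " + source.getName());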

1.2 extractSinkIdentifierNames

public class TableEnvironmentImpl implements TableEnvironmentInternal {
    // ......
    private List<String> extractSinkIdentifierNames(List<ModifyOperation> operations) {
        List<String> tableNames = new ArrayList<>(operations.size());
        Map<String, Integer> tableNameToCount = new HashMap<>();
        for (ModifyOperation operation : operations) {
            if (operation instanceof CatalogSinkModifyOperation) {
                ObjectIdentifier identifier =
                        ((CatalogSinkModifyOperation) operation).getTableIdentifier();
                String fullName = identifier.asSummaryString();
                tableNames.add(fullName);
                tableNameToCount.put(fullName, tableNameToCount.getOrDefault(fullName, 0) + 1);
            } else {
                throw new UnsupportedOperationException("Unsupported operation: " + operation);
            }
        }
        Map<String, Integer> tableNameToIndex = new HashMap<>();
        return tableNames.stream()
                .map(
                        tableName -> {
                            if (tableNameToCount.get(tableName) == 1) {
                                return tableName;
                            } else {
                                Integer index = tableNameToIndex.getOrDefault(tableName, 0) + 1;
                                tableNameToIndex.put(tableName, index);
                                return tableName + "_" + index;
                            }
                        })
                .collect(Collectors.toList());
    }
    // ......
}

Nothing particularly complex here: it extracts the sink identifiers, and only a CatalogSinkModifyOperation input can produce a result (anything else throws UnsupportedOperationException); when the same sink name appears more than once, each occurrence is suffixed with a 1-based index;
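
To see the naming rule in isolation, here is a self-contained sketch (the class and method names are mine, not Flink's) reproducing the same logic — unique names pass through, duplicated names get an index suffix:

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Standalone sketch of the dedup rule used by extractSinkIdentifierNames.
public class SinkNameDedup {

    static List<String> dedup(List<String> tableNames) {
        // count occurrences of each full sink name
        Map<String, Long> counts = tableNames.stream()
                .collect(Collectors.groupingBy(n -> n, Collectors.counting()));
        Map<String, Integer> nextIndex = new HashMap<>();
        return tableNames.stream()
                .map(n -> counts.get(n) == 1 ? n : n + "_" + nextIndex.merge(n, 1, Integer::sum))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(dedup(Arrays.asList("cat.db.sink", "cat.db.sink", "cat.db.other")));
        // prints: [cat.db.sink_1, cat.db.sink_2, cat.db.other]
    }
}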

1.3 createPipeline

This step was touched on last time; let's now look at it in detail:
[Figure 8]
At its core, it constructs a StreamGraphGenerator and calls its generate() method to produce the StreamGraph. The important part of generating the StreamGraph is iterating over the transformations and turning each one into part of the computation graph; the process is involved and the call chain deep — the figure above tries to capture the key code in a single picture;
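
Stripped of configuration plumbing, the core of createPipeline is roughly the following — a sketch against the Flink 1.12/1.13-era API (the real code also copies the state backend, checkpoint settings, and other options onto the generator; the local variables here are illustrative):

StreamGraphGenerator generator =
        new StreamGraphGenerator(transformations, executionConfig, checkpointConfig)
                .setJobName(jobName);
StreamGraph streamGraph = generator.generate();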

1.4 executeAsync

The call goes from TableEnvironmentImpl's executeAsync to ExecutorBase's executeAsync, then to StreamExecutionEnvironment's executeAsync, which in turn calls DefaultExecutorServiceLoader's getExecutorFactory;
[Figure 9]
getExecutorFactory must return a PipelineExecutorFactory. It first loads and instantiates the PipelineExecutorFactory implementations declared in the service configuration files, then obtains the list of loaded factories (the values of the factories variable's knownProvides in the variables pane of the figure above — four here), iterates over them, keeps only the factories compatible with the configuration, and returns after the sanity checks;
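
Conceptually the discovery boils down to Java's ServiceLoader mechanism; a simplified sketch (the real logic in DefaultExecutorServiceLoader additionally validates that exactly one factory matches):

ServiceLoader<PipelineExecutorFactory> loader =
        ServiceLoader.load(PipelineExecutorFactory.class);

List<PipelineExecutorFactory> compatibleFactories = new ArrayList<>();
for (PipelineExecutorFactory factory : loader) {
    // keep only the factories that match the current configuration
    if (factory != null && factory.isCompatibleWith(configuration)) {
        compatibleFactories.add(factory);
    }
}
// exactly one compatible factory is expected; zero or several is an error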

1.4.1 PipelineExecutorFactory

PipelineExecutorFactory is a class in the Flink source tree. Searching for it there turns up five service declarations, in the flink-clients, flink-java, flink-kubernetes, flink-streaming-java, and flink-yarn subprojects, as shown below:
[Figure 10]
The content of the org.apache.flink.core.execution.PipelineExecutorFactory service file in flink-clients is:

org.apache.flink.client.deployment.executors.RemoteExecutorFactory
org.apache.flink.client.deployment.executors.LocalExecutorFactory

Let's look at LocalExecutorFactory:

/** An {@link PipelineExecutorFactory} for {@link LocalExecutor local executors}. */
@Internal
public class LocalExecutorFactory implements PipelineExecutorFactory {

    @Override
    public String getName() {
        return LocalExecutor.NAME;
    }

    @Override
    public boolean isCompatibleWith(final Configuration configuration) {
        return LocalExecutor.NAME.equalsIgnoreCase(configuration.get(DeploymentOptions.TARGET));
    }

    @Override
    public PipelineExecutor getExecutor(final Configuration configuration) {
        return LocalExecutor.create(configuration);
    }
}

Its isCompatibleWith method was called during executeAsync; it simply compares against the Configuration of the current Flink runtime (derived from the execution.target property — running inside IDEA here, the default is local) and returns true on a match;
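
In other words, the match is driven purely by execution.target; a hedged illustration:

Configuration configuration = new Configuration();
configuration.set(DeploymentOptions.TARGET, "local"); // same as execution.target: local

// LocalExecutorFactory matches; RemoteExecutorFactory and the YARN/K8s ones do not:
boolean compatible = new LocalExecutorFactory().isCompatibleWith(configuration); // true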

1.4.2 PipelineExecutor

PipelineExecutor is mainly responsible for turning the StreamGraph into a JobGraph;
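
For reference, the interface is tiny — a single asynchronous execute method taking the Pipeline (here, the StreamGraph) plus configuration, matching the LocalExecutor shown below:

@Internal
public interface PipelineExecutor {

    /** Executes the given Pipeline and returns a handle to the running job. */
    CompletableFuture<JobClient> execute(
            Pipeline pipeline, Configuration configuration, ClassLoader userCodeClassloader)
            throws Exception;
}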

Again the call goes from TableEnvironmentImpl's executeAsync to ExecutorBase's executeAsync, then to StreamExecutionEnvironment's executeAsync, which then calls LocalExecutorFactory's getExecutor:
[Figure 11]
Next, LocalExecutor's execute method runs; the call chain is fairly deep:
[Figure 12]
Here is the content of the JobGraph:
[Figure 13]
This JobGraph is about to be sent to the Flink cluster; note that many of its key fields are binary, such as inputs, chainedTaskConfig_, udf, and chainedOutputs;
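
The StreamGraph-to-JobGraph translation itself is done by StreamingJobGraphGenerator; a stripped-down sketch (in the client path it is reached through PipelineExecutorUtils):

// Inside the executor, the Pipeline (a StreamGraph here) becomes a JobGraph:
JobGraph jobGraph = PipelineExecutorUtils.getJobGraph(pipeline, configuration);

// which, for a StreamGraph, ultimately delegates to:
JobGraph sameGraph = StreamingJobGraphGenerator.createJobGraph(streamGraph);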

The figure above focuses on how LocalExecutor's execute method obtains the JobGraph; what follows is the submission of the JobGraph to the Flink cluster — the heaviest operation of all. Since we are debugging inside IDEA, a MiniCluster is used:

public class LocalExecutor implements PipelineExecutor {
    // ......
    public CompletableFuture<JobClient> execute(
            Pipeline pipeline, Configuration configuration, ClassLoader userCodeClassloader)
            throws Exception {
        // ......
        return PerJobMiniClusterFactory.createWithFactory(effectiveConfig, miniClusterFactory)
                .submitJob(jobGraph, userCodeClassloader);
    }
    
    //......
}

[Figure 14]
The figure above shows the tail end of LocalExecutor's execute: submitting the JobGraph to the Flink cluster — first a PerJobMiniClusterFactory is created, then the JobGraph is submitted through it;

PerJobMiniClusterFactory's submitJob method first creates a MiniCluster and calls its start method. start is fairly long and mainly brings up a pile of services (a condensed sketch of the submit flow follows the list):

  1. Initialize the I/O-related configuration
  2. Create the metrics-related configuration
  3. If a single RPC service is shared: create the local RPC service, create the common RPC service factory, and start the RPC query service
  4. If RPC services are not shared: obtain the JobManager and TaskManager addresses and port ranges, and create each component's factory and services separately
  5. Start the metrics query service
  6. Create the process-wide metric group
  7. Create the I/O thread pool
  8. Create the high-availability services
  9. Start the BlobServer
  10. Create the heartbeat services
  11. Create the blob cache service
  12. Start the TaskManagers
  13. Create the metrics query retrieval service
  14. Create the Dispatcher and ResourceManager, which run in the same process
  15. Create the ResourceManager leader retriever
  16. Create the Dispatcher gateway retriever
  17. Create the WebMonitor leader retriever
  18. Start those of the services created above that have not yet been started
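
Condensed, the submit flow looks roughly like this (a simplified sketch — the real PerJobMiniClusterFactory#submitJob also wires result handling and cluster shutdown; the local variables are as in the source but abbreviated):

MiniCluster miniCluster = miniClusterFactory.apply(miniClusterConfig);
miniCluster.start(); // brings up the services enumerated above

// hand the JobGraph to the in-process cluster:
CompletableFuture<JobSubmissionResult> submission = miniCluster.submitJob(jobGraph);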

Once MiniCluster's start has run, PerJobMiniClusterFactory's submitJob calls the MiniCluster object's submitJob method, which mainly creates and returns a CompletableFuture object.
[Figure 15]
The original CompletableFuture here is actually produced by a utility class, org.apache.flink.runtime.concurrent.FutureUtils:

public class FutureUtils {
    // ......

    public static <T> CompletableFuture<T> retryWithDelay(
            final Supplier<CompletableFuture<T>> operation,
            final RetryStrategy retryStrategy,
            final ScheduledExecutor scheduledExecutor) {
        return retryWithDelay(operation, retryStrategy, (throwable) -> true, scheduledExecutor);
    }

    public static <T> CompletableFuture<T> retryWithDelay(
            final Supplier<CompletableFuture<T>> operation,
            final RetryStrategy retryStrategy,
            final Predicate<Throwable> retryPredicate,
            final ScheduledExecutor scheduledExecutor) {

        final CompletableFuture<T> resultFuture = new CompletableFuture<>();

        retryOperationWithDelay(
                resultFuture, operation, retryStrategy, retryPredicate, scheduledExecutor);

        return resultFuture;
    }

    private static <T> void retryOperationWithDelay(
            final CompletableFuture<T> resultFuture,
            final Supplier<CompletableFuture<T>> operation,
            final RetryStrategy retryStrategy,
            final Predicate<Throwable> retryPredicate,
            final ScheduledExecutor scheduledExecutor) {

        if (!resultFuture.isDone()) {
            final CompletableFuture<T> operationResultFuture = operation.get();

            operationResultFuture.whenComplete(
                    (t, throwable) -> {
                        if (throwable != null) {
                            if (throwable instanceof CancellationException) {
                                resultFuture.completeExceptionally(
                                        new RetryException(
                                                "Operation future was cancelled.", throwable));
                            } else {
                                throwable = ExceptionUtils.stripExecutionException(throwable);
                                if (!retryPredicate.test(throwable)) {
                                    resultFuture.completeExceptionally(throwable);
                                } else if (retryStrategy.getNumRemainingRetries() > 0) {
                                    long retryDelayMillis =
                                            retryStrategy.getRetryDelay().toMillis();
                                    final ScheduledFuture<?> scheduledFuture =
                                            scheduledExecutor.schedule(
                                                    (Runnable)
                                                            () ->
                                                                    retryOperationWithDelay(
                                                                            resultFuture,
                                                                            operation,
                                                                            retryStrategy
                                                                                    .getNextRetryStrategy(),
                                                                            retryPredicate,
                                                                            scheduledExecutor),
                                                    retryDelayMillis,
                                                    TimeUnit.MILLISECONDS);

                                    resultFuture.whenComplete(
                                            (innerT, innerThrowable) ->
                                                    scheduledFuture.cancel(false));
                                } else {
                                    RetryException retryException =
                                            new RetryException(
                                                    "Could not complete the operation. Number of retries has been exhausted.",
                                                    throwable);
                                    resultFuture.completeExceptionally(retryException);
                                }
                            }
                        } else {
                            resultFuture.complete(t);
                        }
                    });

            resultFuture.whenComplete((t, throwable) -> operationResultFuture.cancel(false));
        }
    }
    // ......
}
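
To close, a hedged usage example of retryWithDelay: retry an asynchronous operation up to three times with a fixed delay. FixedRetryStrategy and ScheduledExecutorServiceAdapter are assumed to be the RetryStrategy and ScheduledExecutor implementations available in the same runtime concurrency package of this Flink version; the supplier is illustrative:

CompletableFuture<String> result =
        FutureUtils.retryWithDelay(
                () -> CompletableFuture.supplyAsync(() -> "attempt"), // illustrative operation
                new FixedRetryStrategy(3, Duration.ofMillis(100)),    // 3 retries, 100 ms apart
                new ScheduledExecutorServiceAdapter(                  // Flink's ScheduledExecutor wrapper
                        Executors.newSingleThreadScheduledExecutor()));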
