This article uses HoodieTableFactory and HoodieCatalogFactory from Hudi to explain when createDynamicTableSource, createDynamicTableSink, and createCatalog are called in Flink.
The core logic lives in the translate method of PlannerBase:
override def translate(
    modifyOperations: util.List[ModifyOperation]): util.List[Transformation[_]] = {
  validateAndOverrideConfiguration()
  if (modifyOperations.isEmpty) {
    return List.empty[Transformation[_]]
  }
  // ModifyOperation -> RelNode (logical plan)
  val relNodes = modifyOperations.map(translateToRel)
  // logical plan -> optimized plan
  val optimizedRelNodes = optimize(relNodes)
  // optimized plan -> ExecNode graph
  val execGraph = translateToExecNodeGraph(optimizedRelNodes)
  // ExecNode graph -> Flink transformations
  val transformations = translateToPlan(execGraph)
  cleanupInternalConfigurations()
  transformations
}
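The logic above is the flow that converts SQL into Flink transformations: ModifyOperation -> RelNode -> optimized RelNode -> ExecNode graph -> Transformation.
Mapped onto our case, the call chain is: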
PlannerBase.translateToRel
||
\/
PlannerBase.getTableSink
||
\/
FactoryUtil.createTableSink
||
\/
HoodieTableFactory.createDynamicTableSink
In other words, this method is called during the logical plan generation phase. (createDynamicTableSource follows the same call path.)
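The call chain above can be sketched as a small stand-alone Java model. This is only an illustration under stated assumptions: ToySinkFactory, ToyHoodieTableFactory, and ToyPlanner are hypothetical stand-ins, not Flink's or Hudi's real classes, and the real lookup in FactoryUtil is considerably more involved. The point it shows is that the factory is resolved by its identifier and invoked while the operation is being translated to a relational node, not earlier.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SinkFactoryLookupSketch {

    /** Hypothetical stand-in for a sink factory; NOT Flink's DynamicTableSinkFactory. */
    interface ToySinkFactory {
        String factoryIdentifier();
        String createDynamicTableSink(Map<String, String> options);
    }

    /** Records calls in order so we can see when the factory is invoked. */
    static final List<String> CALL_LOG = new ArrayList<>();

    /** Toy counterpart of HoodieTableFactory (assumption: identified by "hudi"). */
    static class ToyHoodieTableFactory implements ToySinkFactory {
        public String factoryIdentifier() {
            return "hudi";
        }

        public String createDynamicTableSink(Map<String, String> options) {
            CALL_LOG.add("createDynamicTableSink");
            return "HoodieSink(path=" + options.get("path") + ")";
        }
    }

    /** Toy planner: looks up the factory by the 'connector' option during translateToRel. */
    static class ToyPlanner {
        private final Map<String, ToySinkFactory> factories = new HashMap<>();

        void addFactory(ToySinkFactory factory) {
            factories.put(factory.factoryIdentifier(), factory);
        }

        String translateToRel(Map<String, String> tableOptions) {
            CALL_LOG.add("translateToRel");
            String connector = tableOptions.get("connector");
            ToySinkFactory factory = factories.get(connector);
            if (factory == null) {
                throw new IllegalArgumentException("No factory for connector: " + connector);
            }
            // The sink is created here, while the logical plan is being built.
            return "LogicalSink(" + factory.createDynamicTableSink(tableOptions) + ")";
        }
    }

    static String demo() {
        CALL_LOG.clear();
        ToyPlanner planner = new ToyPlanner();
        planner.addFactory(new ToyHoodieTableFactory());
        Map<String, String> options = new HashMap<>();
        options.put("connector", "hudi");
        options.put("path", "/tmp/t1"); // hypothetical table path
        return planner.translateToRel(options);
    }

    public static void main(String[] args) {
        System.out.println(demo());   // LogicalSink(HoodieSink(path=/tmp/t1))
        System.out.println(CALL_LOG); // [translateToRel, createDynamicTableSink]
    }
}
```

The call log makes the ordering visible: the factory method only runs once translation to the logical plan has started.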
To create and use a custom catalog in Flink, you can use either of the following approaches:
// In Java
tableEnv.registerCatalog("myhive", catalog);
// In SQL
CREATE CATALOG hoodie_catalog
WITH (
'type'='hudi',
'catalog.path' = '${catalog root path}', -- only valid if the table options has no explicit declaration of table path
'hive.conf.dir' = '${dir path where hive-site.xml is located}',
'mode'='hms' -- also support 'dfs' mode so that all the table metadata are stored with the filesystem
);
For the SQL statement, the corresponding call chain is:
TableEnvironmentImpl.executeInternal
||
\/
TableEnvironmentImpl.createCatalog
||
\/
FactoryUtil.createCatalog
||
\/
HoodieCatalogFactory.createCatalog
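Finally, catalogManager.registerCatalog is called, which places the new catalog under the management of the CatalogManager. When the catalog is needed later, catalogManager.getCatalog is called to retrieve it by name.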
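The create-then-register flow can be sketched with a minimal stand-alone Java model. All names here (ToyCatalog, ToyCatalogFactory, ToyHoodieCatalogFactory, ToyCatalogManager, createAndRegister) are hypothetical and only mimic the shape of the real classes; the duplicate-name check is an assumption of this sketch rather than a statement about Flink's exact behavior.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class CatalogRegistrySketch {

    /** Hypothetical stand-in for a catalog; NOT Flink's Catalog interface. */
    interface ToyCatalog {
        String describe();
    }

    /** Hypothetical stand-in for a catalog factory. */
    interface ToyCatalogFactory {
        String factoryIdentifier();
        ToyCatalog createCatalog(String name, Map<String, String> options);
    }

    /** Toy counterpart of HoodieCatalogFactory, selected by 'type'='hudi'. */
    static class ToyHoodieCatalogFactory implements ToyCatalogFactory {
        public String factoryIdentifier() {
            return "hudi";
        }

        public ToyCatalog createCatalog(String name, Map<String, String> options) {
            String mode = options.get("mode");
            return () -> name + "[mode=" + mode + "]";
        }
    }

    /** Toy catalog manager: stores catalogs by name and hands them back on request. */
    static class ToyCatalogManager {
        private final Map<String, ToyCatalog> catalogs = new LinkedHashMap<>();

        void registerCatalog(String name, ToyCatalog catalog) {
            // Assumption of this sketch: registering the same name twice is an error.
            if (catalogs.containsKey(name)) {
                throw new IllegalStateException("Catalog " + name + " already exists");
            }
            catalogs.put(name, catalog);
        }

        Optional<ToyCatalog> getCatalog(String name) {
            return Optional.ofNullable(catalogs.get(name));
        }
    }

    /** Mirrors the chain: pick factory by 'type', create the catalog, register it. */
    static ToyCatalogManager createAndRegister(String name, Map<String, String> options) {
        Map<String, ToyCatalogFactory> factories = new HashMap<>();
        ToyCatalogFactory hudi = new ToyHoodieCatalogFactory();
        factories.put(hudi.factoryIdentifier(), hudi);

        ToyCatalogFactory factory = factories.get(options.get("type"));
        if (factory == null) {
            throw new IllegalArgumentException("No catalog factory for type: " + options.get("type"));
        }
        ToyCatalogManager manager = new ToyCatalogManager();
        manager.registerCatalog(name, factory.createCatalog(name, options));
        return manager;
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("type", "hudi");
        options.put("mode", "hms");
        ToyCatalogManager manager = createAndRegister("hoodie_catalog", options);
        // Later lookups go through the manager, not the factory.
        System.out.println(manager.getCatalog("hoodie_catalog").get().describe());
    }
}
```

Keeping the factory call and the registry separate is what lets later statements resolve the catalog by name without touching the factory again.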