This post gives an overview of our Flink platform practice at my company. It is only a rough outline; I can't go into too much detail.
Our platform is built around Uber AthenaX (https://athenax.readthedocs.io/en/latest/). I won't describe the project itself at length, since the official documentation already covers it well; its core function is to take a Flink SQL job, compile it into a JobGraph, and submit it to a YARN cluster to run.
The most important piece of the AthenaX source code is the method in JobCompiler that builds the JobGraph:
JobGraph getJobGraph() throws IOException {
    StreamExecutionEnvironment exeEnv = env.execEnv();
    exeEnv.setParallelism(job.parallelism());
    // Register the UDFs and the input catalogs (sources) on the table environment.
    this
        .registerUdfs()
        .registerInputCatalogs();
    // Wrap the job's SQL into a Table and wire it to every configured sink.
    Table table = env.sqlQuery(job.sql());
    for (String t : job.outputs().listTables()) {
        table.writeToSink(getOutputTable(job.outputs().getTable(t)));
    }
    // Translate the stream graph into the JobGraph that gets submitted.
    StreamGraph streamGraph = exeEnv.getStreamGraph();
    return streamGraph.getJobGraph();
}
Here env is a field declared as private final StreamTableEnvironment env;. The neat part is that the StreamTableEnvironment gives access to the underlying StreamExecutionEnvironment, on which we can set parallelism, checkpointing, the state backend and other properties. Under the hood this is still Flink's Table API: the SQL is wrapped into a Table object for execution, and sources, UDFs, sinks, side outputs and so on are registered on the table environment. Together this packages up a complete Flink job, which is then built into a JobGraph.
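The bodies of registerUdfs() and registerInputCatalogs() are not shown above. Just to give a feel for what registration on a Flink 1.x StreamTableEnvironment looks like, here is a minimal sketch of mine; the names my_udf/my_source/my_sink and the myUdf/kafkaSource/esSink objects are placeholders, not AthenaX's actual code:

// Sketch only: myUdf, kafkaSource and esSink are hypothetical placeholder objects.
env.registerFunction("my_udf", myUdf);                  // a ScalarFunction subclass
env.registerTableSource("my_source", kafkaSource);      // a TableSource, e.g. backed by Kafka
env.registerTableSink("my_sink",
        new String[]{"word", "cnt"},
        new TypeInformation<?>[]{Types.STRING, Types.LONG},   // org.apache.flink.api.common.typeinfo.Types
        esSink);                                        // a TableSink, e.g. Elasticsearch
Table result = env.sqlQuery("SELECT my_udf(word) AS word, cnt FROM my_source");
result.writeToSink(esSink);                             // wire the query result to the sink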
We made some changes on top of this, borrowing AthenaX's ideas. The main one is attaching many more settings to the job, all of which are configured on the frontend and passed to the backend: setCheckpointingMode, setCheckpointInterval, enableExternalizedCheckpoints, restartAttempts, delayBetweenAttempts, retentionTime, setRestartStrategy, and so on.
We also enriched the sources (currently only Kafka), the sinks (HBase, Redis, Kafka, Elasticsearch, MySQL and others), and the UDFs (we implemented many UDFs for the business teams to use); a simplified sink sketch is shown below.
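At the DataStream level, a custom sink comes down to a RichSinkFunction; wrapping it as a TableSink so it can be used from SQL is an extra step that I omit here. The class below is a deliberately simplified, hypothetical Redis sink, not our production code; the host, port and key/value layout are placeholders:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import redis.clients.jedis.Jedis;

// Hypothetical sketch: write (key, value) pairs to Redis.
public class RedisSink extends RichSinkFunction<Tuple2<String, String>> {
    private transient Jedis jedis;

    @Override
    public void open(Configuration parameters) {
        // Placeholder address; a real sink would read this from configuration.
        jedis = new Jedis("redis-host", 6379);
    }

    @Override
    public void invoke(Tuple2<String, String> value, Context context) {
        jedis.set(value.f0, value.f1);   // f0 is the key, f1 the value
    }

    @Override
    public void close() {
        if (jedis != null) {
            jedis.close();
        }
    }
}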
As for how to extend UDFs and sinks, the official Flink documentation explains it quite clearly; a minimal UDF sketch follows below.
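To give a feel for it, a scalar UDF can be as simple as this; Mask is just an illustrative example of mine, not one of our actual UDFs:

import org.apache.flink.table.functions.ScalarFunction;

// Hypothetical example UDF: keep the first `keep` characters and mask the rest with '*'.
public class Mask extends ScalarFunction {
    public String eval(String s, int keep) {
        if (s == null || s.length() <= keep) {
            return s;
        }
        StringBuilder sb = new StringBuilder(s.substring(0, keep));
        for (int i = keep; i < s.length(); i++) {
            sb.append('*');
        }
        return sb.toString();
    }
}

After env.registerFunction("mask", new Mask()), it can be called directly in SQL, e.g. SELECT mask(phone, 3) FROM my_source.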
Borrowing that structure, our own getJobGraph ended up looking like this:

JobGraph getJobGraph() throws Exception {
    // Check whether the job references any dimension (side) tables
    hasSideTable = false;
    Map sidesMap = job.getSidesMap();
    if (sidesMap.size() != 0) {
        hasSideTable = true;
    }
    StreamQueryConfig queryConfig = env.queryConfig();
    StreamExecutionEnvironment exeEnv = env.execEnv();
    Map<String, String> extraProps = job.getExtraProps();

    // Checkpointing mode, interval and externalized-checkpoint cleanup
    String cpm = extraProps.getOrDefault(ParamConstant.CheckpointingMode, "EXACTLY_ONCE");
    if (cpm.equalsIgnoreCase("EXACTLY_ONCE")) {
        exeEnv.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
    } else {
        exeEnv.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.AT_LEAST_ONCE);
    }
    String cpi = extraProps.getOrDefault(ParamConstant.CheckpointInterval, "300000");
    exeEnv.getCheckpointConfig().setCheckpointInterval(Long.valueOf(cpi));
    String cleanup = extraProps.getOrDefault(ParamConstant.ExternalizedCheckpointCleanup, "RETAIN_ON_CANCELLATION");
    if (cleanup.equalsIgnoreCase("RETAIN_ON_CANCELLATION")) {
        exeEnv.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
    } else {
        exeEnv.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.DELETE_ON_CANCELLATION);
    }

    // State backend for checkpoints
    String cpsb = extraProps.getOrDefault(ParamConstant.StateBackendPath, "hdfs:/mlink/flink-checkpoints");
    exeEnv.setStateBackend(new FsStateBackend(cpsb));

    // Expose all job properties as global job parameters
    Map<String, String> jobProps = new HashMap<>();
    jobProps.putAll(extraProps);
    ParameterTool parameter = ParameterTool.fromMap(jobProps);
    exeEnv.getConfig().setGlobalJobParameters(parameter);

    // Restart strategy and idle-state retention time
    String restartAttempts = extraProps.getOrDefault(ParamConstant.RestartAttempts, "5");
    String delayTime = extraProps.getOrDefault(ParamConstant.DelayBetweenAttempts, "10000");
    String retentionTime = extraProps.getOrDefault(ParamConstant.RetentionTime, "1h");
    if (retentionTime != null && !"".equals(retentionTime) && !"null".equals(retentionTime)) {
        Integer time = Integer.valueOf(retentionTime.substring(0, retentionTime.length() - 1));
        Integer seconds = 0;
        if (retentionTime.toLowerCase().endsWith("d")) {        // days
            seconds = time * 24 * 60 * 60;
        } else if (retentionTime.toLowerCase().endsWith("h")) { // hours
            seconds = time * 60 * 60;
        } else if (retentionTime.toLowerCase().endsWith("m")) { // minutes
            seconds = time * 60;
        } else if (retentionTime.toLowerCase().endsWith("s")) { // seconds
            seconds = time;
        }
        queryConfig.withIdleStateRetentionTime(Time.seconds(seconds), Time.seconds(seconds + 5 * 60));
    }
    exeEnv.setRestartStrategy(RestartStrategies.fixedDelayRestart(Integer.valueOf(restartAttempts), Long.valueOf(delayTime)));
    exeEnv.setParallelism(job.parallelism());

    // Time characteristic: processing time or event time
    String flinktime = extraProps.get(ParamConstant.FlinkTime);
    if (flinktime == null) {
        LOG.info("there is no timestamp !!! ");
    } else if (flinktime.equalsIgnoreCase(ParamConstant.ProcTime)) {
        LOG.info("proctime is used !!! ");
        exeEnv.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime);
    } else {
        LOG.info("rowtime is used !!! ");
        exeEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
    }

    this
        .registerUdfs()
        .registerInputTableSource()
        .registerOutputTableSink();

    Map<String, LinkedHashMap<String, String>> sqls = job.getSqls();
    // Handle view definitions first
    if (sqls.containsKey(SqlConstant.TABLEVIEW)) {
        // view name -> the query that defines it
        LinkedHashMap<String, String> viewMap = sqls.get(SqlConstant.TABLEVIEW);
        Set<String> viewNames = viewMap.keySet();
        for (String name : viewNames) {
            String sql = viewMap.get(name);
            Table table = env.sqlQuery(sql);
            LOG.info("sql query is -----> " + sql);
            env.registerTable(name, table);
            putSideTable2Map(name, table);
        }
    }
    if (!hasSideTable) {
        // Plain INSERT INTO statements
        if (sqls.containsKey(SqlConstant.INSERT_QUERY)) {
            LinkedHashMap<String, String> updateMap = sqls.get(SqlConstant.INSERT_QUERY);
            Collection<String> values = updateMap.values();
            for (String sql : values) {
                LOG.info("sql insert into is -----> " + sql);
                env.sqlUpdate(sql, queryConfig);
            }
        }
    } else {
        // INSERT INTO statements that join dimension (side) tables
        if (sqls.containsKey(SqlConstant.INSERT_QUERY)) {
            LinkedHashMap<String, String> updateMap = sqls.get(SqlConstant.INSERT_QUERY);
            Collection<String> values = updateMap.values();
            for (String sql : values) {
                LOG.info("sql insert into is -----> " + sql);
                SideSqlExec sideSqlExec = new SideSqlExec();
                sideSqlExec.exec(sql, sidesMap, env, localTableCache, localTableSourceCache);
            }
        }
    }
    StreamGraph streamGraph = exeEnv.getStreamGraph();
    return streamGraph.getJobGraph();
}
With the JobGraph built, the next step is submitting it to YARN. Here we imitate the yarn per-job mode of the Flink shell: we wrote our own YarnClusterDescriptor class that extends AbstractYarnClusterDescriptor, from which we can obtain the IP and port of the launched ApplicationMaster, and then a call to ClusterClient's client.runDetached(jobGraph, null) completes the submission (a rough sketch follows below). The other key piece is the frontend. Ours is drag-and-drop: users drag a source box, a compute node (which is just a block of SQL text) and a sink box onto a canvas, connect them with lines, and fill in a few settings such as the checkpoint parameters; the backend then only has to parse all of that and assemble the job. The platform has been fairly stable so far and is simple to use. I can't go into more detail than this, for fear of getting myself into trouble.
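As a rough sketch of that submission flow against Flink 1.4/1.5-era APIs: MyYarnClusterDescriptor stands in for our own subclass of AbstractYarnClusterDescriptor, its constructor arguments and the resource numbers are placeholders, and exception handling is omitted.

// Sketch only: MyYarnClusterDescriptor is a hypothetical subclass of AbstractYarnClusterDescriptor.
AbstractYarnClusterDescriptor descriptor = new MyYarnClusterDescriptor(flinkConf, yarnConf, flinkConfDir);
ClusterSpecification spec = new ClusterSpecification.ClusterSpecificationBuilder()
        .setMasterMemoryMB(1024)            // JobManager / ApplicationMaster memory
        .setTaskManagerMemoryMB(2048)
        .setNumberTaskManagers(2)
        .setSlotsPerTaskManager(2)
        .createClusterSpecification();
// Bring the cluster up on YARN; the returned client knows the ApplicationMaster's host and port.
ClusterClient client = descriptor.deploySessionCluster(spec);
// Detached submission: ship the JobGraph and return without waiting for the job to finish.
client.runDetached(jobGraph, null);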