A Simple Flink SQL Test Program
The following is a simple Flink SQL example that shows how to use the Flink Table API and Flink SQL for basic data stream processing.
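// Imports required by the snippets below
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;
import org.apache.flink.util.Collector;
import static org.apache.flink.table.api.Expressions.$;

// 1. Define the data entity (a Flink POJO: public fields plus a no-argument constructor)
public static class CC {
    public String character;
    public long count;
    public CC() {
    }
    public CC(String character, long count) {
        this.character = character;
        this.count = count;
    }
}
// 2. Create the execution environment and simulate a data stream
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);
EnvironmentSettings environmentSettings = EnvironmentSettings
        .newInstance()
        .useBlinkPlanner() // already the default planner since Flink 1.11
        .inStreamingMode()
        .build();
StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env, environmentSettings);
DataStream<String> inputStream = env.fromElements(
        "hello",
        "world",
        "!!!"
).uid("source").name("source");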
// 3. Apply flatMap() to split each string into single characters, each with a count of 1
SingleOutputStreamOperator<CC> streamOperator = inputStream.flatMap(new FlatMapFunction<String, CC>() {
    @Override
    public void flatMap(String value, Collector<CC> out) throws Exception {
        for (char c : value.toCharArray()) {
            out.collect(new CC(String.valueOf(c), 1L));
        }
    }
});
// 4. Convert the data stream to a Table
Table table = tableEnv.fromDataStream(streamOperator);
// 5. Process the data with the Table API and print the result
Table filter = table
        .select($("character"), $("count"))
        .filter($("character").isNotEqual(""));
Table result = filter
        .groupBy($("character"))
        .select($("character"), $("count").sum().as("character_count"));
tableEnv.toRetractStream(result, Row.class).print();
// 6. Process the data with Flink SQL and print the result
tableEnv.createTemporaryView("CC", table);
Table result2 = tableEnv.sqlQuery("SELECT `character`, SUM(`count`) FROM CC GROUP BY `character`");
tableEnv.toRetractStream(result2, Row.class).print();
// 7. Execute the job
env.execute("Flink Sql Test");
Output (the listing shows the changelog of one query; the other query prints the same messages):
(true,+I[h, 1])
(true,+I[e, 1])
(true,+I[l, 1])
(false,-U[l, 1])
(true,+U[l, 2])
(true,+I[o, 1])
(true,+I[w, 1])
(false,-U[o, 1])
(true,+U[o, 2])
(true,+I[r, 1])
(false,-U[l, 2])
(true,+U[l, 3])
(true,+I[d, 1])
(true,+I[!, 1])
(false,-U[!, 1])
(true,+U[!, 2])
(false,-U[!, 2])
(true,+U[!, 3])
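The changelog above can be reproduced without Flink, which may help when reading the flags. The following plain-Java sketch is only an illustration under simplifying assumptions, not how Flink implements the aggregation: the class name RetractSketch and the method changelog are hypothetical, and the code just keeps a running sum per character. The first time a character appears it emits an insert (+I); on every later occurrence it first retracts the old row (-U, flag false) and then emits the updated row (+U, flag true).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Flink: mimics the retract messages
// printed for SUM(count) ... GROUP BY character.
class RetractSketch {

    static List<String> changelog(String... words) {
        Map<Character, Long> sums = new LinkedHashMap<>(); // running sum per character
        List<String> out = new ArrayList<>();
        for (String word : words) {
            for (char c : word.toCharArray()) {
                Long old = sums.get(c);
                if (old == null) {
                    // first occurrence: a plain insert
                    out.add("(true,+I[" + c + ", 1])");
                    sums.put(c, 1L);
                } else {
                    // later occurrence: retract the old row, then add the new one
                    out.add("(false,-U[" + c + ", " + old + "])");
                    out.add("(true,+U[" + c + ", " + (old + 1) + "])");
                    sums.put(c, old + 1);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        changelog("hello", "world", "!!!").forEach(System.out::println);
    }
}
```

Running main with the three input strings of the example prints the same 18 lines as the listing above, in the same order.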
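This code shows how to use the Flink Table API and Flink SQL for simple processing and analysis of a data stream: splitting, selecting, filtering, grouping, and aggregating. Finally, toRetractStream converts each result table into a stream, and print() writes it to the console. In the output, the Boolean flag marks each message as an add (true) or a retraction (false): when a character's count changes, the old row is retracted (-U) before the updated row (+U) is emitted.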
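As a cross-check on the final values, the same split, filter, group, and sum steps can be written as a one-off batch computation over the three input strings. This is a plain-Java sketch using java.util.stream, not Flink code; the names FinalCounts and count are hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink: computes the final per-character sums
// that the streaming query converges to.
class FinalCounts {

    static Map<String, Long> count(String... words) {
        return Stream.of(words)
                // split: one single-character string per input character
                .flatMap(w -> w.chars().mapToObj(c -> String.valueOf((char) c)))
                // filter: same condition as isNotEqual("") in the Table API query
                .filter(ch -> !ch.isEmpty())
                // group and sum: each character contributes a count of 1
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        LinkedHashMap::new, // keeps first-seen order
                        Collectors.summingLong(ch -> 1L)));
    }

    public static void main(String[] args) {
        System.out.println(count("hello", "world", "!!!"));
    }
}
```

Running main prints {h=1, e=1, l=3, o=2, w=1, r=1, d=1, !=3}, which matches the last value emitted for each character in the output above. The streaming job cannot wait for the end of its input, so it emits every intermediate update instead of only these final rows.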