Flink DataStream: setting the return type with returns()

When a Flink map returns a Tuple3, an error is thrown if returns() is not specified.

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);
Properties kafkaProp = new Properties();
FlinkKafkaConsumer010<String> myConsumer = new FlinkKafkaConsumer010<>("test", new SimpleStringSchema(), kafkaProp);

DataStream<Tuple3<Integer, String, Integer>> dataStream = env
                .addSource(myConsumer)
                .map(record -> {
                    JSONObject jsonObject = JSON.parseObject(record);
                    return new Tuple3<>(jsonObject.getInteger("id"), jsonObject.getString("name"), jsonObject.getInteger("age"));
                });
env.execute();

Running the above code produces the following error:

Exception in thread "main" org.apache.flink.api.common.functions.InvalidTypesException: The return type of function 'main(TestFlinkTable.java:43)' could not be determined automatically, due to type erasure. You can give type information hints by using the returns(...) method on the result of the transformation call, or by letting your function implement the 'ResultTypeQueryable' interface.
	at org.apache.flink.streaming.api.transformations.StreamTransformation.getOutputType(StreamTransformation.java:420)
	at org.apache.flink.streaming.api.datastream.DataStream.getType(DataStream.java:175)
	at org.apache.flink.streaming.api.datastream.DataStream.union(DataStream.java:217)
	at com.miaoke.sync.test.TestFlinkTable.main(TestFlinkTable.java:50)
Caused by: org.apache.flink.api.common.functions.InvalidTypesException: The generic type parameters of 'Tuple3' are missing. In many cases lambda methods don't provide enough information for automatic type extraction when Java generics are involved. An easy workaround is to use an (anonymous) class instead that implements the 'org.apache.flink.api.common.functions.MapFunction' interface. Otherwise the type has to be specified explicitly using type information.
	at org.apache.flink.api.java.typeutils.TypeExtractionUtils.validateLambdaType(TypeExtractionUtils.java:350)
	at org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:579)
	at org.apache.flink.api.java.typeutils.TypeExtractor.getMapReturnTypes(TypeExtractor.java:175)
	at org.apache.flink.streaming.api.datastream.DataStream.map(DataStream.java:585)
	at com.miaoke.sync.test.TestFlinkTable.main(TestFlinkTable.java:43)
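The "Caused by" part of the message also points to one workaround: replace the lambda with an (anonymous) class that implements MapFunction, so the generic parameters stay visible in the class signature and the type extractor can read them by reflection. A minimal sketch against the same Kafka source (same JSON fields assumed, no returns() needed):

DataStream<Tuple3<Integer, String, Integer>> dataStream = env
                .addSource(myConsumer)
                .map(new MapFunction<String, Tuple3<Integer, String, Integer>>() {
                    @Override
                    public Tuple3<Integer, String, Integer> map(String record) {
                        // The output type is visible in the anonymous class signature,
                        // so Flink can extract it without an explicit hint.
                        JSONObject jsonObject = JSON.parseObject(record);
                        return new Tuple3<>(jsonObject.getInteger("id"), jsonObject.getString("name"), jsonObject.getInteger("age"));
                    }
                });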

Alternatively, as the error message suggests, adding returns() makes the code pass normally:

DataStream<Tuple3<Integer, String, Integer>> dataStream = env
                .addSource(myConsumer)
                .map(record -> {
                    JSONObject jsonObject = JSON.parseObject(record);
                    return new Tuple3<>(jsonObject.getInteger("id"), jsonObject.getString("name"), jsonObject.getInteger("age"));
                }).returns(Types.TUPLE(Types.INT, Types.STRING, Types.INT));
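The same hint can also be expressed with a TypeHint instead of the Types factory; an anonymous TypeHint subclass captures the generic parameters and is equivalent here. A sketch where only the final call of the chain above changes:

                // Equivalent type hint using an anonymous TypeHint subclass
                .returns(new TypeHint<Tuple3<Integer, String, Integer>>() {});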

In general, Java erases generic type information. Flink tries to reconstruct as much type information as possible via reflection, using the few bits Java does preserve (mainly function signatures and subclass information). For cases where a function's return type depends on its input type, this logic also contains some simple type inference:

public class AppendOne<T> implements MapFunction<T, Tuple2<T, Long>> {

    public Tuple2<T, Long> map(T value) {
        return new Tuple2<>(value, 1L);
    }
}
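With such a signature the output type depends on the input type, and Flink can infer it once the input type is concrete. A minimal sketch, assuming the AppendOne class above:

// The input is a DataStream<String>, so Flink infers the output type Tuple2<String, Long>
DataStream<String> words = env.fromElements("hello", "flink");
DataStream<Tuple2<String, Long>> appended = words.map(new AppendOne<String>());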

Where Flink cannot reconstruct the erased generic type information, the Java API offers so-called type hints. A type hint tells the system the type of the DataStream or DataSet produced by a function:

DataSet<SomeType> result = dataSet
    .map(new MyGenericNonInferrableFunction<Long, SomeType>())
    .returns(SomeType.class);
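The exception message above also mentions a third option: let the function implement ResultTypeQueryable, so it reports its produced type itself and no returns() call is needed. A minimal sketch, using a hypothetical JsonToTuple function for the Kafka example:

public class JsonToTuple implements MapFunction<String, Tuple3<Integer, String, Integer>>,
        ResultTypeQueryable<Tuple3<Integer, String, Integer>> {

    @Override
    public Tuple3<Integer, String, Integer> map(String record) {
        JSONObject jsonObject = JSON.parseObject(record);
        return new Tuple3<>(jsonObject.getInteger("id"), jsonObject.getString("name"), jsonObject.getInteger("age"));
    }

    @Override
    public TypeInformation<Tuple3<Integer, String, Integer>> getProducedType() {
        // Reports the output type directly, so neither reflection nor returns(...) is required
        return Types.TUPLE(Types.INT, Types.STRING, Types.INT);
    }
}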
