Flink InvalidTypesException: The return type of function could not be determined automatically...

A small problem I ran into while learning Flink. It turns out to be quite common, but I am noting it down anyway.

The WordCount example contains the following code:

public static final class Tokenizer implements FlatMapFunction<String, Tuple2<String, Integer>> {
    @Override
    public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
        // normalize and split the line
        String[] tokens = value.toLowerCase().split("\\W+");

        // emit the pairs
        for (String token : tokens) {
            if (token.length() > 0) {
                out.collect(new Tuple2<>(token, 1));
            }
        }
    }
}

FlatMapFunction is a functional interface, so I tried rewriting the Tokenizer as a lambda expression:

.flatMap((String value, Collector<Tuple2<String, Integer>> out) -> {
    // normalize and split the line
    String[] tokens = value.toLowerCase().split("\\W+");

    // emit the pairs
    for (String token : tokens) {
        if (token.length() > 0) {
            out.collect(new Tuple2<>(token, 1));
        }
    }
})

Running it fails with the following error:

Executing WordCount example with default input data set.
Use --input to specify file input.
Exception in thread "main" org.apache.flink.api.common.functions.InvalidTypesException: The return type of function 'main(WordCount.java:85)' could not be determined automatically, due to type erasure. You can give type information hints by using the returns(...) method on the result of the transformation call, or by letting your function implement the 'ResultTypeQueryable' interface.
	at org.apache.flink.api.java.DataSet.getType(DataSet.java:178)
	at org.apache.flink.api.java.DataSet.groupBy(DataSet.java:701)
	at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:98)
Caused by: org.apache.flink.api.common.functions.InvalidTypesException: The generic type parameters of 'Collector' are missing. In many cases lambda methods don't provide enough information for automatic type extraction when Java generics are involved. An easy workaround is to use an (anonymous) class instead that implements the 'org.apache.flink.api.common.functions.FlatMapFunction' interface. Otherwise the type has to be specified explicitly using type information.
	at org.apache.flink.api.java.typeutils.TypeExtractionUtils.validateLambdaType(TypeExtractionUtils.java:350)
	at org.apache.flink.api.java.typeutils.TypeExtractionUtils.extractTypeFromLambda(TypeExtractionUtils.java:176)
	at org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:571)
	at org.apache.flink.api.java.typeutils.TypeExtractor.getFlatMapReturnTypes(TypeExtractor.java:196)
	at org.apache.flink.api.java.DataSet.flatMap(DataSet.java:266)
	at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:85)
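
The "due to type erasure" part of the message can be reproduced without Flink. The following is a minimal sketch in plain Java, not Flink's actual type extractor: it uses java.util.function.Function as a stand-in for FlatMapFunction, and the class name ErasureDemo and the helper keepsTypeArguments are hypothetical names chosen for illustration. It shows that an anonymous class records its generic type arguments in its class metadata, while the class generated for a lambda only exposes the raw interface.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.function.Function;

public class ErasureDemo {

    /**
     * Returns true if the runtime class of fn declares at least one
     * interface together with its generic type arguments
     * (for example Function<String, Integer> rather than raw Function).
     */
    public static boolean keepsTypeArguments(Object fn) {
        for (Type t : fn.getClass().getGenericInterfaces()) {
            if (t instanceof ParameterizedType) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Anonymous class: the compiler writes Function<String, Integer>
        // into the class file, so reflection can read it back.
        Function<String, Integer> anonymous = new Function<String, Integer>() {
            @Override
            public Integer apply(String s) {
                return s.length();
            }
        };

        // Lambda: the class generated at runtime implements only the raw
        // Function interface, so the type arguments are not available.
        Function<String, Integer> lambda = s -> s.length();

        System.out.println(keepsTypeArguments(anonymous)); // prints "true"
        System.out.println(keepsTypeArguments(lambda));    // prints "false"
    }
}
```

This is why the error message suggests either an anonymous class or an explicit type hint: with a lambda, the output type has to be supplied by hand.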

Solution:

https://stackoverflow.com/questions/50945509/apache-flink-return-type-of-function-could-not-be-determined-automatically-due

.returns(Types.TUPLE(Types.STRING, Types.INT)) // when a lambda is used for the functional interface here, the generic return type must be declared explicitly
