Flink in Depth (11): The flatMap Call Flow in the DataStream Class


// Process the data
DataStream<WordWithCount> windowCount = text.flatMap(new FlatMapFunction<String, WordWithCount>() {
			public void flatMap(String value, Collector<WordWithCount> out) throws Exception {
				String[] splits = value.split(" ");
				for (String word : splits) {
					out.collect(new WordWithCount(word, 1L));
				}
			}
		});


	/**
	 * Applies a FlatMap transformation on a {@link DataStream}. The
	 * transformation calls a {@link FlatMapFunction} for each element of the
	 * DataStream. Each FlatMapFunction call can return any number of elements
	 * including none. The user can also extend {@link RichFlatMapFunction} to
	 * gain access to other features provided by the
	 * {@link org.apache.flink.api.common.functions.RichFunction} interface.
	 *
	 * @param flatMapper
	 *            The FlatMapFunction that is called for each element of the
	 *            DataStream
	 * @param <R>
	 *            output type
	 * @return The transformed {@link DataStream}.
	 */
	public <R> SingleOutputStreamOperator<R> flatMap(FlatMapFunction<T, R> flatMapper) {
		TypeInformation<R> outType = TypeExtractor.getFlatMapReturnTypes(clean(flatMapper),
				getType(), Utils.getCallLocationName(), true);
		return transform("Flat Map", outType, new StreamFlatMap<>(clean(flatMapper)));
	}
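The contract described in the Javadoc, where each call may emit any number of elements including none, can be illustrated without Flink. The following is a minimal sketch in plain Java; `SimpleFlatMap` and `applyAll` are hypothetical stand-ins invented for this illustration and are not part of the Flink API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class FlatMapSketch {

    /** Hypothetical stand-in for a flat-map function: may emit 0..n outputs per input. */
    interface SimpleFlatMap<T, R> {
        void flatMap(T value, Consumer<R> out);
    }

    /** Applies the function to every input and gathers whatever it emits, in order. */
    static <T, R> List<R> applyAll(List<T> inputs, SimpleFlatMap<T, R> fn) {
        List<R> results = new ArrayList<>();
        for (T input : inputs) {
            fn.flatMap(input, results::add);
        }
        return results;
    }

    public static void main(String[] args) {
        // Split each line on single spaces and skip empty tokens,
        // so an empty line emits nothing at all.
        SimpleFlatMap<String, String> splitter = (line, out) -> {
            for (String word : line.split(" ")) {
                if (!word.isEmpty()) {
                    out.accept(word);
                }
            }
        };
        List<String> words = applyAll(Arrays.asList("hello flink", "", "a b c"), splitter);
        System.out.println(words); // prints [hello, flink, a, b, c]
    }
}
```

The three input lines produce two, zero, and three outputs respectively, which is the difference between a flat map and a one-to-one map.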




	/**
	 * Gets the type of the stream.
	 *
	 * @return The type of the datastream.
	 */
	public TypeInformation<T> getType() {
		return transformation.getOutputType();
	}

getType calls getOutputType on the stream's Transformation object to obtain its output type. Here the Transformation belongs to the source operator, that is, the DataStreamSource object text discussed earlier.



	public static <IN, OUT> TypeInformation<OUT> getFlatMapReturnTypes(FlatMapFunction<IN, OUT> flatMapInterface, TypeInformation<IN> inType,
			String functionName, boolean allowMissing) {
		return getUnaryOperatorReturnType(
			(Function) flatMapInterface,
			FlatMapFunction.class, 0, 1,
			new int[]{1, 0},
			inType, functionName, allowMissing);
	}
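To make the index pair `new int[]{1, 0}` concrete: the first index selects a parameter and the second selects a generic argument of that parameter's type. The sketch below applies the same idea to an ordinary static method using standard Java reflection. `emitLength`, `select`, and `emittedType` are hypothetical names for this illustration, and `List<Integer>` merely plays the role of `Collector<OUT>`; this is not Flink's extraction code.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

public class IndexPairSketch {

    // Hypothetical method shaped like flatMap(IN value, Collector<OUT> out);
    // List<Integer> plays the role of Collector<OUT>.
    static void emitLength(String value, List<Integer> out) {
        out.add(value.length());
    }

    /** Follows {paramIndex, typeArgIndex} into a method's generic signature. */
    static Type select(Method method, int paramIndex, int typeArgIndex) {
        Type param = method.getGenericParameterTypes()[paramIndex];
        return ((ParameterizedType) param).getActualTypeArguments()[typeArgIndex];
    }

    /** Looks up emitLength and applies the index pair {1, 0} to it. */
    static Type emittedType() {
        try {
            Method m = IndexPairSketch.class.getDeclaredMethod("emitLength", String.class, List.class);
            return select(m, 1, 0);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(emittedType()); // prints "class java.lang.Integer"
    }
}
```

Index 1 picks the second parameter (`List<Integer>`), and index 0 picks its first type argument (`Integer`), which corresponds to the output type.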


	/**
	 * Returns the unary operator's return type.
	 *
	 * This method can extract a type in 4 different ways:
	 *
	 * 1. By using the generics of the base class like MyFunction<X, Y, Z, IN, OUT>.
	 *    This is what outputTypeArgumentIndex (in this example "4") is good for.
	 *
	 * 2. By using input type inference SubMyFunction<T, String, String, String, T>.
	 *    This is what inputTypeArgumentIndex (in this example "0") and inType is good for.
	 *
	 * 3. By using the static method that a compiler generates for Java lambdas.
	 *    This is what lambdaOutputTypeArgumentIndices is good for. Given that MyFunction has
	 *    the following single abstract method:
	 *
	 *    void apply(IN value, Collector<OUT> value)
	 *
	 *    Lambda type indices allow the extraction of a type from lambdas. To extract the
	 *    output type OUT from the function one should pass {@code new int[] {1, 0}}.
	 *    "1" for selecting the parameter and 0 for the first generic in this type.
	 *    Use {@code TypeExtractor.NO_INDEX} for selecting the return type of the lambda for
	 *    extraction or if the class cannot be a lambda because it is not a single abstract
	 *    method interface.
	 *
	 * 4. By using interfaces such as {@link TypeInfoFactory} or {@link ResultTypeQueryable}.
	 *
	 * See also comments in the header of this class.
	 *
	 * @param function Function to extract the return type from
	 * @param baseClass Base class of the function
	 * @param inputTypeArgumentIndex Index of input generic type in the base class specification (ignored if inType is null)
	 * @param outputTypeArgumentIndex Index of output generic type in the base class specification
	 * @param lambdaOutputTypeArgumentIndices Table of indices of the type argument specifying the input type. See example.
	 * @param inType Type of the input elements (In case of an iterable, it is the element type) or null
	 * @param functionName Function name
	 * @param allowMissing Can the type information be missing (this generates a MissingTypeInfo for postponing an exception)
	 * @param <IN> Input type
	 * @param <OUT> Output type
	 * @return TypeInformation of the return type of the function
	 */
	@SuppressWarnings("unchecked")
	@PublicEvolving
	public static <IN, OUT> TypeInformation<OUT> getUnaryOperatorReturnType(
		Function function,
		Class<?> baseClass,
		int inputTypeArgumentIndex,
		int outputTypeArgumentIndex,
		int[] lambdaOutputTypeArgumentIndices,
		TypeInformation<IN> inType,
		String functionName,
		boolean allowMissing) {

		Preconditions.checkArgument(inType == null || inputTypeArgumentIndex >= 0, "Input type argument index was not provided");
		Preconditions.checkArgument(outputTypeArgumentIndex >= 0, "Output type argument index was not provided");
		Preconditions.checkArgument(
			lambdaOutputTypeArgumentIndices != null,
			"Indices for output type arguments within lambda not provided");

		// explicit result type has highest precedence
		// If the function implements ResultTypeQueryable, return the output type
		// directly from its getProducedType method
		if (function instanceof ResultTypeQueryable) {
			return ((ResultTypeQueryable<OUT>) function).getProducedType();
		}

		// perform extraction
		try {
			final LambdaExecutable exec;
			try {
				// Check whether the function is a lambda expression
				exec = checkAndExtractLambda(function);
			} catch (TypeExtractionException e) {
				throw new InvalidTypesException("Internal error occurred.", e);
			}

			// If the function is implemented as a lambda, derive the return type
			// from the lambda's own signature
			if (exec != null) {

				// parameters must be accessed from behind, since JVM can add additional parameters e.g. when using local variables inside lambda function
				// paramLen is the total number of parameters of the provided lambda, it includes parameters added through closure
				final int paramLen = exec.getParameterTypes().length;

				final Method sam = TypeExtractionUtils.getSingleAbstractMethod(baseClass);

				// number of parameters the SAM of implemented interface has; the parameter indexing applies to this range
				final int baseParametersLen = sam.getParameterTypes().length;

				final Type output;
				if (lambdaOutputTypeArgumentIndices.length > 0) {
					output = TypeExtractionUtils.extractTypeFromLambda(
						baseClass,
						exec,
						lambdaOutputTypeArgumentIndices,
						paramLen,
						baseParametersLen);
				} else {
					output = exec.getReturnType();
					TypeExtractionUtils.validateLambdaType(baseClass, output);
				}

				return new TypeExtractor().privateCreateTypeInfo(output, inType, null);
			} else {
				// Otherwise obtain the output type via reflection
				if (inType != null) {
					validateInputType(baseClass, function.getClass(), inputTypeArgumentIndex, inType);
				}
				return new TypeExtractor().privateCreateTypeInfo(baseClass, function.getClass(), outputTypeArgumentIndex, inType, null);
			}
		} catch (InvalidTypesException e) {
			if (allowMissing) {
				return (TypeInformation<OUT>) new MissingTypeInfo(functionName != null ? functionName : function.toString(), e);
			} else {
				throw e;
			}
		}
	}
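The reflection branch relies on the fact that a class implementing a generic interface records the concrete type arguments in its class file. The sketch below shows that mechanism with standard Java reflection only. `Mapper` and `typeArgument` are hypothetical names for this illustration, and the lookup covers only directly implemented interfaces, which is much simpler than what TypeExtractor handles.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class GenericArgSketch {

    /** Hypothetical two-parameter function interface; OUT sits at type-argument index 1. */
    interface Mapper<IN, OUT> {
        OUT map(IN value);
    }

    /**
     * Returns the type argument at the given index of a directly implemented
     * generic interface, or null if the class does not declare it with type arguments.
     */
    static Type typeArgument(Class<?> implClass, Class<?> baseInterface, int index) {
        for (Type t : implClass.getGenericInterfaces()) {
            if (t instanceof ParameterizedType) {
                ParameterizedType pt = (ParameterizedType) t;
                if (pt.getRawType() == baseInterface) {
                    return pt.getActualTypeArguments()[index];
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // An anonymous class keeps Mapper<String, Integer> in its generic signature.
        Mapper<String, Integer> anonymous = new Mapper<String, Integer>() {
            @Override
            public Integer map(String value) {
                return value.length();
            }
        };
        // A lambda's runtime class implements the raw interface only.
        Mapper<String, Integer> lambda = value -> value.length();

        System.out.println(typeArgument(anonymous.getClass(), Mapper.class, 1)); // class java.lang.Integer
        System.out.println(typeArgument(lambda.getClass(), Mapper.class, 1));    // null
    }
}
```

The second result shows why a lambda cannot be handled by this path: its class carries no parameterized interface to read the output type from, so a separate lambda-specific extraction is needed.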


	/**
	 * Checks if the given function has been implemented using a Java 8 lambda. If yes, a LambdaExecutable
	 * is returned describing the method/constructor. Otherwise null.
	 *
	 * @throws TypeExtractionException lambda extraction is pretty hacky, it might fail for unknown JVM issues.
	 */
	public static LambdaExecutable checkAndExtractLambda(Function function) throws TypeExtractionException {
		try {
			// get serialized lambda
			SerializedLambda serializedLambda = null;
			for (Class<?> clazz = function.getClass(); clazz != null; clazz = clazz.getSuperclass()) {
				try {
					Method replaceMethod = clazz.getDeclaredMethod("writeReplace");
					replaceMethod.setAccessible(true);
					Object serialVersion = replaceMethod.invoke(function);

					// check if class is a lambda function
					if (serialVersion != null && serialVersion.getClass() == SerializedLambda.class) {
						serializedLambda = (SerializedLambda) serialVersion;
						break;
					}
				}
				catch (NoSuchMethodException e) {
					// thrown if the method is not there. fall through the loop
				}
			}

			// not a lambda method -> return null
			if (serializedLambda == null) {
				return null;
			}

			// find lambda method
			String className = serializedLambda.getImplClass();
			String methodName = serializedLambda.getImplMethodName();
			String methodSig = serializedLambda.getImplMethodSignature();

			Class<?> implClass = Class.forName(className.replace('/', '.'), true, Thread.currentThread().getContextClassLoader());

			// find constructor
			if (methodName.equals("<init>")) {
				Constructor<?>[] constructors = implClass.getDeclaredConstructors();
				for (Constructor<?> constructor : constructors) {
					if (getConstructorDescriptor(constructor).equals(methodSig)) {
						return new LambdaExecutable(constructor);
					}
				}
			}
			// find method
			else {
				List<Method> methods = getAllDeclaredMethods(implClass);
				for (Method method : methods) {
					if (method.getName().equals(methodName) && getMethodDescriptor(method).equals(methodSig)) {
						return new LambdaExecutable(method);
					}
				}
			}
			throw new TypeExtractionException("No lambda method found.");
		}
		catch (Exception e) {
			throw new TypeExtractionException("Could not extract lambda method out of function: " +
				e.getClass().getSimpleName() + " - " + e.getMessage(), e);
		}
	}
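The writeReplace lookup works because a serializable lambda's runtime class has a private writeReplace method that returns a java.lang.invoke.SerializedLambda. The sketch below reproduces only that detection step in plain Java. `SerFn` and `probe` are hypothetical names for this illustration; `SerFn` is made Serializable explicitly because only serializable lambdas get a writeReplace method (Flink's Function interface extends Serializable).

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Function;

public class LambdaProbe {

    /** Hypothetical serializable function type used only for this demonstration. */
    public interface SerFn<A, B> extends Function<A, B>, Serializable {}

    /** Returns the SerializedLambda behind fn, or null if fn is not a serializable lambda. */
    static SerializedLambda probe(Object fn) {
        for (Class<?> c = fn.getClass(); c != null; c = c.getSuperclass()) {
            try {
                Method writeReplace = c.getDeclaredMethod("writeReplace");
                writeReplace.setAccessible(true); // the method is private
                Object replaced = writeReplace.invoke(fn);
                if (replaced instanceof SerializedLambda) {
                    return (SerializedLambda) replaced;
                }
            } catch (NoSuchMethodException e) {
                // no writeReplace on this class, continue with the superclass
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SerFn<String, Integer> methodRef = String::length;
        SerFn<String, Integer> anonymous = new SerFn<String, Integer>() {
            @Override
            public Integer apply(String s) {
                return s.length();
            }
        };
        System.out.println(probe(methodRef).getImplMethodName()); // prints "length"
        System.out.println(probe(anonymous));                     // prints "null"
    }
}
```

For the method reference, the SerializedLambda names the implementing method, which is the information the listing then uses to locate the matching Method or Constructor. The anonymous class has no writeReplace method, so it is reported as "not a lambda".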


Once the output type has been obtained, flatMap calls transform("Flat Map", outType, new StreamFlatMap<>(clean(flatMapper)));


[Figure: operator inheritance hierarchy]


	/**
	 * Method for passing user defined operators along with the type
	 * information that will transform the DataStream.
	 *
	 * @param operatorName
	 *            name of the operator, for logging purposes
	 * @param outTypeInfo
	 *            the output type of the operator
	 * @param operator
	 *            the object containing the transformation logic
	 * @param <R>
	 *            type of the return stream
	 * @return the data stream constructed
	 */
	public <R> SingleOutputStreamOperator<R> transform(String operatorName, TypeInformation<R> outTypeInfo, OneInputStreamOperator<T, R> operator) {

		// read the output type of the input Transform to coax out errors about MissingTypeInfo
		transformation.getOutputType();

		OneInputTransformation<T, R> resultTransform = new OneInputTransformation<>(
				this.transformation,
				operatorName,
				operator,
				outTypeInfo,
				environment.getParallelism());

		@SuppressWarnings({ "unchecked", "rawtypes" })
		SingleOutputStreamOperator<R> returnStream = new SingleOutputStreamOperator(environment, resultTransform);

		getExecutionEnvironment().addOperator(resultTransform);

		return returnStream;
	}
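Conceptually, transform does not process any data. It records a new node that points back at the node of the stream it was called on, and returns a new stream handle wrapping that node. The toy model below sketches this bookkeeping in plain Java. `Node`, `Env`, `Stream`, and `lineage` are hypothetical names for this illustration, and the registration of the node with the environment is an assumption of the sketch, not a copy of Flink's classes.

```java
import java.util.ArrayList;
import java.util.List;

public class TransformSketch {

    /** Hypothetical graph node: an operator name plus a link to its upstream node. */
    static final class Node {
        final String name;
        final Node input; // null for a source

        Node(String name, Node input) {
            this.name = name;
            this.input = input;
        }
    }

    /** Hypothetical environment that collects every node created by transform. */
    static final class Env {
        final List<Node> operators = new ArrayList<>();
    }

    /** Hypothetical stream handle: remembers its environment and the node that produces it. */
    static final class Stream {
        final Env env;
        final Node node;

        Stream(Env env, Node node) {
            this.env = env;
            this.node = node;
        }

        /** Creates a node downstream of this stream, registers it, and wraps it in a new handle. */
        Stream transform(String operatorName) {
            Node result = new Node(operatorName, this.node);
            env.operators.add(result);
            return new Stream(env, result);
        }
    }

    /** Walks the upstream links to render the lineage of a stream, source first. */
    static String lineage(Stream s) {
        StringBuilder sb = new StringBuilder();
        for (Node n = s.node; n != null; n = n.input) {
            if (sb.length() > 0) {
                sb.insert(0, " -> ");
            }
            sb.insert(0, n.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Env env = new Env();
        Stream text = new Stream(env, new Node("Socket Source", null));
        Stream counts = text.transform("Flat Map");
        System.out.println(lineage(counts));      // prints "Socket Source -> Flat Map"
        System.out.println(env.operators.size()); // prints "1"
    }
}
```

Each call returns a fresh handle, so further operations chain onto the new node while the original stream stays unchanged.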




