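A Look at Flink's BoltWrapper

Preface

This article takes a look at Flink's BoltWrapper, the operator that lets a Storm bolt run inside a Flink streaming program.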

BoltWrapper

flink-storm_2.11-1.6.2-sources.jar!/org/apache/flink/storm/wrappers/BoltWrapper.java

/**
 * A {@link BoltWrapper} wraps an {@link IRichBolt} in order to execute the Storm bolt within a Flink Streaming program.
 * It takes the Flink input tuples of type {@code IN} and transforms them into {@link StormTuple}s that the bolt can
 * process. Furthermore, it takes the bolt's output tuples and transforms them into Flink tuples of type {@code OUT}
 * (see {@link AbstractStormCollector} for supported types).
 *
 * Works for single input streams only! See {@link MergedInputsBoltWrapper} for multi-input stream
 * Bolts.
 */
public class BoltWrapper<IN, OUT> extends AbstractStreamOperator<OUT> implements OneInputStreamOperator<IN, OUT> {

    @Override
    public void open() throws Exception {
        super.open();

        this.flinkCollector = new TimestampedCollector<>(this.output);

        GlobalJobParameters config = getExecutionConfig().getGlobalJobParameters();
        StormConfig stormConfig = new StormConfig();

        if (config != null) {
            if (config instanceof StormConfig) {
                stormConfig = (StormConfig) config;
            } else {
                stormConfig.putAll(config.toMap());
            }
        }

        this.topologyContext = WrapperSetupHelper.createTopologyContext(
                getRuntimeContext(), this.bolt, this.name, this.stormTopology, stormConfig);

        final OutputCollector stormCollector = new OutputCollector(new BoltCollector(
                this.numberOfAttributes, this.topologyContext.getThisTaskId(), this.flinkCollector));

        if (this.stormTopology != null) {
            Map<GlobalStreamId, Grouping> inputs = this.topologyContext.getThisSources();

            for (GlobalStreamId inputStream : inputs.keySet()) {
                for (Integer tid : this.topologyContext.getComponentTasks(inputStream
                        .get_componentId())) {
                    this.inputComponentIds.put(tid, inputStream.get_componentId());
                    this.inputStreamIds.put(tid, inputStream.get_streamId());
                    this.inputSchemas.put(tid,
                            this.topologyContext.getComponentOutputFields(inputStream));
                }
            }
        }

        this.bolt.prepare(stormConfig, this.topologyContext, stormCollector);
    }

    @Override
    public void dispose() throws Exception {
        super.dispose();
        this.bolt.cleanup();
    }

    @Override
    public void processElement(final StreamRecord<IN> element) throws Exception {
        this.flinkCollector.setTimestamp(element);

        IN value = element.getValue();

        if (this.stormTopology != null) {
            Tuple tuple = (Tuple) value;
            Integer producerTaskId = tuple.getField(tuple.getArity() - 1);

            this.bolt.execute(new StormTuple<>(value, this.inputSchemas.get(producerTaskId),
                    producerTaskId, this.inputStreamIds.get(producerTaskId), this.inputComponentIds
                    .get(producerTaskId), MessageId.makeUnanchored()));

        } else {
            this.bolt.execute(new StormTuple<>(value, this.inputSchemas.get(null), -1, null, null,
                    MessageId.makeUnanchored()));
        }
    }


}

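Flink uses BoltWrapper to wrap a Storm IRichBolt. BoltWrapper implements the OneInputStreamOperator interface and extends the AbstractStreamOperator class.
OneInputStreamOperator extends the StreamOperator interface and adds three methods: processElement, processWatermark, and processLatencyMarker.
AbstractStreamOperator implements only the StreamOperator interface, but it already provides implementations of processWatermark and processLatencyMarker.
BoltWrapper therefore mainly implements processElement from OneInputStreamOperator, and overrides the open and dispose methods defined by StreamOperator.
The key step in open is the call to the bolt's prepare method, which receives an OutputCollector wrapping a BoltCollector. The BoltCollector forwards the tuples emitted by the bolt to Flink, and the Flink collector behind it is a TimestampedCollector.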
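
The lifecycle described above can be illustrated with a small sketch that uses only the Java standard library. It is not Flink's or Storm's API: `SimpleBolt`, `SimpleBoltWrapper`, `TokenizerBolt`, and `BoltWrapperSketch` are hypothetical stand-ins, and a plain `List<String>` plays the role of the Flink output collector. The sketch only shows the shape of the adapter: `open` hands the bolt a collector, `processElement` forwards each record to `execute`, and `dispose` calls `cleanup`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for Storm's IRichBolt: prepare once, execute per tuple, clean up at the end.
interface SimpleBolt {
    void prepare(Consumer<String> collector);
    void execute(String input);
    void cleanup();
}

// Hypothetical stand-in for BoltWrapper: maps the operator lifecycle onto the bolt lifecycle.
class SimpleBoltWrapper {
    private final SimpleBolt bolt;
    private final List<String> output; // plays the role of the Flink output collector

    SimpleBoltWrapper(SimpleBolt bolt, List<String> output) {
        this.bolt = bolt;
        this.output = output;
    }

    // Like open(): give the bolt a collector that forwards emitted values downstream.
    void open() {
        bolt.prepare(output::add);
    }

    // Like processElement(): pass each incoming record to the bolt.
    void processElement(String value) {
        bolt.execute(value);
    }

    // Like dispose(): let the bolt release its resources.
    void dispose() {
        bolt.cleanup();
    }
}

// A toy bolt that splits each line into lower-cased words.
class TokenizerBolt implements SimpleBolt {
    private Consumer<String> collector;
    boolean cleanedUp = false;

    @Override
    public void prepare(Consumer<String> collector) {
        this.collector = collector;
    }

    @Override
    public void execute(String input) {
        for (String word : input.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                collector.accept(word);
            }
        }
    }

    @Override
    public void cleanup() {
        cleanedUp = true;
    }
}

class BoltWrapperSketch {
    // Runs the full lifecycle over the given lines and returns everything the bolt emitted.
    static List<String> run(TokenizerBolt bolt, List<String> lines) {
        List<String> out = new ArrayList<>();
        SimpleBoltWrapper wrapper = new SimpleBoltWrapper(bolt, out);
        wrapper.open();
        for (String line : lines) {
            wrapper.processElement(line);
        }
        wrapper.dispose();
        return out;
    }
}
```

The real BoltWrapper does more than this sketch: it also builds a TopologyContext, wraps each record in a StormTuple, and propagates the record timestamp through the TimestampedCollector.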

BoltCollector

flink-storm_2.11-1.6.2-sources.jar!/org/apache/flink/storm/wrappers/BoltCollector.java

/**
 * A {@link BoltCollector} is used by {@link BoltWrapper} to provided an Storm compatible
 * output collector to the wrapped bolt. It transforms the emitted Storm tuples into Flink tuples
 * and emits them via the provide {@link Output} object.
 */
class BoltCollector<OUT> extends AbstractStormCollector<OUT> implements IOutputCollector {

    /** The Flink output Collector. */
    private final Collector<OUT> flinkOutput;

    /**
     * Instantiates a new {@link BoltCollector} that emits Flink tuples to the given Flink output object. If the
     * number of attributes is negative, any output type is supported (ie, raw type). If the number of attributes is
     * between 0 and 25, the output type is {@link Tuple0} to {@link Tuple25}, respectively.
     *
     * @param numberOfAttributes
     *            The number of attributes of the emitted tuples per output stream.
     * @param taskId
     *            The ID of the producer task (negative value for unknown).
     * @param flinkOutput
     *            The Flink output object to be used.
     * @throws UnsupportedOperationException
     *             if the specified number of attributes is greater than 25
     */
    BoltCollector(final HashMap<String, Integer> numberOfAttributes, final int taskId,
            final Collector<OUT> flinkOutput) throws UnsupportedOperationException {
        super(numberOfAttributes, taskId);
        assert (flinkOutput != null);
        this.flinkOutput = flinkOutput;
    }

    @Override
    protected List<Integer> doEmit(final OUT flinkTuple) {
        this.flinkOutput.collect(flinkTuple);
        // TODO
        return null;
    }

    @Override
    public void reportError(final Throwable error) {
        // not sure, if Flink can support this
    }

    @Override
    public List<Integer> emit(final String streamId, final Collection<Tuple> anchors, final List<Object>
