Configuring executors and tasks (threads and instances) in JStorm/Storm

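Note: JStorm is about to be merged into the Storm core, which means there will no longer be a separate Storm. Meanwhile, Twitter has publicly presented its Heron system.

An analysis by the JStorm author is available in the blog post "In-Depth Analysis of Twitter Heron" [http://www.longda.us/?p=529]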



Configuring executors and tasks (threads and instances)

Always keep the title in mind: executors and tasks (threads and instances). An executor is a thread; a task is an instance of a spout or bolt.



By default, Storm creates one task for each component (spout/bolt), and each task is run by one thread (executor).

> setSpout(String id, IRichSpout spout,Number parallelism_hint)

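The configuration above means that the topology creates parallelism_hint tasks, and each task is run by its own thread (executor). In other words, the number of tasks equals the number of threads.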

> setSpout(String id, IRichSpout spout,Number parallelism_hint).setNumTasks(Number val)

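The configuration above means that the topology is configured with val tasks, and these val tasks are distributed evenly across parallelism_hint threads (val/parallelism_hint tasks per thread).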
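The two cases above can be illustrated with a small stand-alone sketch. It does not use the Storm API: `TaskDistribution` and `assign` are hypothetical names made up for this example, task ids starting at 0 are illustrative, and the rule that leftover tasks go to the first executors is an assumption about how an "even" split might look, not a statement of what the Storm scheduler guarantees. The sketch also only handles the case where there are at least as many tasks as executors.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Storm or JStorm: models how numTasks
// task instances could be spread over parallelismHint executor threads.
class TaskDistribution {

    // Returns one list of task ids per executor. Assumes an even split in
    // which the first (numTasks % parallelismHint) executors take one extra task.
    static List<List<Integer>> assign(int numTasks, int parallelismHint) {
        if (parallelismHint <= 0 || numTasks < parallelismHint) {
            throw new IllegalArgumentException(
                "need parallelismHint > 0 and numTasks >= parallelismHint");
        }
        int base = numTasks / parallelismHint;   // tasks every executor gets
        int extra = numTasks % parallelismHint;  // executors that get one more
        List<List<Integer>> executors = new ArrayList<>();
        int nextTaskId = 0;
        for (int e = 0; e < parallelismHint; e++) {
            int count = base + (e < extra ? 1 : 0);
            List<Integer> tasks = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                tasks.add(nextTaskId++);
            }
            executors.add(tasks);
        }
        return executors;
    }

    public static void main(String[] args) {
        // Like setSpout(id, spout, 2): 2 tasks, one per executor.
        System.out.println(assign(2, 2)); // prints [[0], [1]]
        // Like setSpout(id, spout, 2).setNumTasks(4): 4 tasks shared by 2 executors.
        System.out.println(assign(4, 2)); // prints [[0, 1], [2, 3]]
    }
}
```

With parallelism_hint = 2 and no setNumTasks call, each thread runs exactly one instance; with setNumTasks(4), each of the two threads runs two instances in turn.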

Note: in JStorm, this configuration method is declared as follows:

> @Deprecated
    T setNumTasks(Number val)

As shown above, JStorm marks this configuration method as deprecated.

> /**
     * Define a new spout in this topology with the specified parallelism. If
     * the spout declares itself as non-distributed, the parallelism_hint will
     * be ignored and only one task will be allocated to this component.
     *
     * @param id
     *            the id of this component. This id is referenced by other
     *            components that want to consume this spout's outputs.
     * @param parallelism_hint
     *            the number of tasks that should be assigned to execute this
     *            spout. Each task will run on a thread in a process somewhere
     *            around the cluster.
     * @param spout
     *            the spout
     */
    public SpoutDeclarer setSpout(String id, IRichSpout spout,
            Number parallelism_hint)

In JStorm, the parallelism_hint setting specifies both the number of instances and the number of threads; the two are always equal.
