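A Storm cluster runs topologies, and a topology is a graph of spouts and bolts. After developing a topology we package it as a jar and run storm jar xxxxxx.jar xxxxxxxClass in a shell, which uploads the jar to the cluster's Nimbus and starts the topology. This article traces how the topology jar gets uploaded to Nimbus. We start from the storm jar command, which is implemented in bin/storm under the Storm root directory and is defined as follows: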
def jar(jarfile, klass, *args):
    """Syntax: [storm jar topology-jar-path class ...]
    Runs the main method of class with the specified arguments.
    The storm jars and configs in ~/.storm are put on the classpath.
    The process is configured so that StormSubmitter
    (http://nathanmarz.github.com/storm/doc/backtype/storm/StormSubmitter.html)
    will upload the jar at topology-jar-path when the topology is submitted.
    """
    exec_storm_class(
        klass,
        jvmtype="-client",
        extrajars=[jarfile, USER_CONF_DIR, STORM_DIR + "/bin"],
        args=args,
        jvmopts=[' '.join(filter(None, [JAR_JVM_OPTS, "-Dstorm.jar=" + jarfile]))])
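The jar command is implemented in Python (why it was not written in Clojure is unclear). jarfile is the path of the jar; klass is the topology's entry point, i.e. the class that contains the main method; *args are the arguments passed to main. jvmtype="-client" selects the client JVM (the JVM comes in client and server flavors, and server is the default on server-class machines). extrajars lists the extra entries to put on the classpath: the topology jar itself, the user configuration directory (~/.storm, per the docstring) and Storm's bin directory. jvmopts holds additional JVM options. The important one is -Dstorm.jar, whose value is jarfile, so that submitTopology can later read the jar path from the storm.jar system property (in effect, a method argument passed through a JVM option). exec_storm_class is fairly simple; its implementation is as follows: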
def exec_storm_class(klass, jvmtype="-server", jvmopts=[], extrajars=[], args=[], fork=False):
    global CONFFILE
    all_args = [
        "java", jvmtype, get_config_opts(),
        "-Dstorm.home=" + STORM_DIR,
        "-Djava.library.path=" + confvalue("java.library.path", extrajars),
        "-Dstorm.conf.file=" + CONFFILE,
        "-cp", get_classpath(extrajars),
    ] + jvmopts + [klass] + list(args)
    print "Running: " + " ".join(all_args)
    if fork:
        os.spawnvp(os.P_WAIT, "java", all_args)
    else:
        os.execvp("java", all_args)  # replaces the current process and never returns
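To make the resulting command line concrete, here is a minimal Python 3 sketch of how jar and exec_storm_class combine their arguments. It is an illustration only: build_jar_command and its classpath parameter are hypothetical names, and the real script adds further options (storm.home, java.library.path, the configuration file) that are omitted here.

```python
def build_jar_command(jarfile, klass, args, jar_jvm_opts="", classpath="storm.jar"):
    """Return an argv list shaped like the one a `storm jar` launcher execs.

    Hypothetical helper: only the pieces discussed in the text are modeled.
    """
    # filter(None, ...) drops empty strings, so an unset JAR_JVM_OPTS
    # does not leave a stray leading space in the joined option.
    jvmopts = [" ".join(filter(None, [jar_jvm_opts, "-Dstorm.jar=" + jarfile]))]
    return (["java", "-client", "-cp", classpath + ":" + jarfile]
            + jvmopts + [klass] + list(args))


cmd = build_jar_command("wc.jar", "storm.starter.WordCountTopology", ["word-count"])
print(" ".join(cmd))
```

The point to notice is that the jar path travels twice: once on the classpath so that main can be loaded, and once as the storm.jar system property so that the submitter knows which file to upload.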
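get_config_opts() returns the configuration options to pass to the JVM, confvalue("java.library.path", extrajars) looks up the load path of the native libraries Storm uses (JZMQ), and get_classpath(extrajars) builds the full classpath including the extra entries. These pieces are joined into a java -cp command that runs the topology's main method. Execution then moves into main; we use the main method of WordCountTopology from the storm-starter project as an example: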
public static void main(String[] args) throws Exception {
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("spout", new RandomSentenceSpout(), 6);
    builder.setBolt("split", new SplitSentence(), 12).shuffleGrouping("spout");
    builder.setBolt("count", new WordCount(), 10).fieldsGrouping("split", new Fields("word"));
    Config conf = new Config();
    conf.setDebug(true);
    if (args != null && args.length > 0) {
        conf.setNumWorkers(4);
        StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
    } else {
        conf.setMaxTaskParallelism(3);
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("word-count", conf, builder.createTopology());
        Thread.sleep(10000);
        cluster.shutdown();
    }
}
After building the topology, main calls the submitTopology method of the StormSubmitter class to submit it. submitTopology is as follows:
/**
 * Submits a topology to run on the cluster. A topology runs forever or until
 * explicitly killed.
 *
 *
 * @param name the name of the storm.
 * @param stormConf the topology-specific configuration. See {@link Config}.
 * @param topology the processing to execute.
 * @throws AlreadyAliveException if a topology with this name is already running
 * @throws InvalidTopologyException if an invalid topology was submitted
 */
public static void submitTopology(String name, Map stormConf, StormTopology topology)
        throws AlreadyAliveException, InvalidTopologyException {
    submitTopology(name, stormConf, topology, null);
}
/**
 * Submits a topology to run on the cluster. A topology runs forever or until
 * explicitly killed.
 *
 *
 * @param name the name of the storm.
 * @param stormConf the topology-specific configuration. See {@link Config}.
 * @param topology the processing to execute.
 * @param options to manipulate the starting of the topology
 * @throws AlreadyAliveException if a topology with this name is already running
 * @throws InvalidTopologyException if an invalid topology was submitted
 */
public static void submitTopology(String name, Map stormConf, StormTopology topology, SubmitOptions opts)
        throws AlreadyAliveException, InvalidTopologyException {
    if (!Utils.isValidConf(stormConf)) {
        throw new IllegalArgumentException("Storm conf is not valid. Must be json-serializable");
    }
    stormConf = new HashMap(stormConf);
    stormConf.putAll(Utils.readCommandLineOpts());
    Map conf = Utils.readStormConfig();
    conf.putAll(stormConf);
    try {
        String serConf = JSONValue.toJSONString(stormConf);
        if (localNimbus != null) {
            LOG.info("Submitting topology " + name + " in local mode");
            localNimbus.submitTopology(name, null, serConf, topology);
        } else {
            NimbusClient client = NimbusClient.getConfiguredClient(conf);
            if (topologyNameExists(conf, name)) {
                throw new RuntimeException("Topology with name `" + name + "` already exists on cluster");
            }
            submitJar(conf);
            try {
                LOG.info("Submitting topology " + name + " in distributed mode with conf " + serConf);
                if (opts != null) {
                    client.getClient().submitTopologyWithOpts(name, submittedJar, serConf, topology, opts);
                } else {
                    // this is for backwards compatibility
                    client.getClient().submitTopology(name, submittedJar, serConf, topology);
                }
            } catch (InvalidTopologyException e) {
                LOG.warn("Topology submission exception", e);
                throw e;
            } catch (AlreadyAliveException e) {
                LOG.warn("Topology already alive exception", e);
                throw e;
            } finally {
                client.close();
            }
        }
        LOG.info("Finished submitting topology: " + name);
    } catch (TException e) {
        throw new RuntimeException(e);
    }
}
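The way the method above layers its configuration maps can be sketched in a few lines of Python. This is only an illustration of the precedence order visible in the code (cluster settings, then topology settings, then command-line options); layer_conf and the sample keys are hypothetical.

```python
import json


def layer_conf(cluster_conf, topology_conf, command_line_opts):
    """Mimic the putAll order used by submitTopology (illustrative only)."""
    # Topology-specific settings first, command-line options on top of them.
    storm_conf = dict(topology_conf)
    storm_conf.update(command_line_opts)
    # Cluster-wide settings are the base; topology-level values win on conflict.
    conf = dict(cluster_conf)
    conf.update(storm_conf)
    # Only the topology-level map is serialized to JSON and sent to Nimbus.
    return conf, json.dumps(storm_conf, sort_keys=True)


conf, ser_conf = layer_conf(
    {"topology.workers": 1, "nimbus.host": "localhost"},
    {"topology.workers": 4, "topology.debug": True},
    {"topology.debug": False},
)
```

Here conf is what the client uses locally (for example to find Nimbus), while ser_conf carries only the topology-level settings.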
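submitTopology does three main things:
1. Prepare the configuration
Command-line options are put into stormConf; the configuration from conf/storm.yaml is read into conf; then stormConf is also put into conf, so command-line options take precedence. stormConf is converted to JSON because this configuration has to be sent to the server.
2. Call the submitJar method
submitJar(conf)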
private static void submitJar(Map conf) {
    if (submittedJar == null) {
        LOG.info("Jar not uploaded to master yet. Submitting jar...");
        String localJar = System.getProperty("storm.jar");
        submittedJar = submitJar(conf, localJar);
    } else {
        LOG.info("Jar already uploaded to master. Not submitting jar.");
    }
}
System.getProperty("storm.jar") reads the storm.jar JVM system property, i.e. the path of the topology jar, and then the overloaded submitJar method is called:
public static String submitJar(Map conf, String localJar) {
    if (localJar == null) {
        throw new RuntimeException("Must submit topologies using the 'storm' client script so that StormSubmitter knows which jar to upload.");
    }
    NimbusClient client = NimbusClient.getConfiguredClient(conf);
    try {
        String uploadLocation = client.getClient().beginFileUpload();
        LOG.info("Uploading topology jar " + localJar + " to assigned location: " + uploadLocation);
        BufferFileInputStream is = new BufferFileInputStream(localJar);
        while (true) {
            byte[] toSubmit = is.read();
            if (toSubmit.length == 0) break;
            client.getClient().uploadChunk(uploadLocation, ByteBuffer.wrap(toSubmit));
        }
        client.getClient().finishFileUpload(uploadLocation);
        LOG.info("Successfully uploaded topology jar to assigned location: " + uploadLocation);
        return uploadLocation;
    } catch (Exception e) {
        throw new RuntimeException(e);
    } finally {
        client.close();
    }
}
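The begin / chunk / finish protocol used above can be sketched without Thrift. In the Python sketch below, FakeNimbus is a hypothetical in-memory stand-in for the three RPCs, and the chunk size is an arbitrary value chosen so the loop runs several times; it is not the buffer size Storm uses.

```python
import io
import uuid


class FakeNimbus:
    """Hypothetical in-memory stand-in for the Nimbus upload RPCs."""

    def __init__(self):
        self.uploaders = {}  # upload location -> buffer still being written
        self.files = {}      # upload location -> bytes of a finished upload

    def begin_file_upload(self):
        location = "inbox/stormjar-%s.jar" % uuid.uuid4()
        self.uploaders[location] = io.BytesIO()
        return location

    def upload_chunk(self, location, chunk):
        self.uploaders[location].write(chunk)

    def finish_file_upload(self, location):
        self.files[location] = self.uploaders.pop(location).getvalue()


def submit_jar(nimbus, jar_stream, chunk_size=4):
    location = nimbus.begin_file_upload()
    while True:
        chunk = jar_stream.read(chunk_size)
        if len(chunk) == 0:  # an empty read marks the end of the file
            break
        nimbus.upload_chunk(location, chunk)
    nimbus.finish_file_upload(location)
    return location


nimbus = FakeNimbus()
payload = b"fake jar contents"
location = submit_jar(nimbus, io.BytesIO(payload))
```

The server chooses the destination and the client only echoes it back on every call, which is why the location returned by the first call ends up being the value stored in submittedJar.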
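StormSubmitter is essentially a Thrift client and Nimbus is a Thrift server, so every operation goes through Thrift RPC. submitJar first creates a client and then calls beginFileUpload() on the Nimbus Thrift server to obtain the location on Nimbus where the jar will be stored. beginFileUpload is as follows: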
(beginFileUpload [this]
  (let [fileloc (str (inbox nimbus) "/stormjar-" (uuid) ".jar")]
    (.put (:uploaders nimbus)
          fileloc
          (Channels/newChannel (FileOutputStream. fileloc)))
    (log-message "Uploading file from client to " fileloc)
    fileloc
    ))
(inbox nimbus) in turn calls master-inbox, which creates the directory {storm.local.dir}/nimbus/inbox and returns its full path. The topology jar is therefore uploaded through uploadChunk to {storm.local.dir}/nimbus/inbox/stormjar-{uuid}.jar on Nimbus.
3. Create the Thrift client and call submitTopologyWithOpts or submitTopology on the Nimbus Thrift server (both are defined in nimbus.clj). submitTopologyWithOpts is as follows:
(^void submitTopologyWithOpts
  [this ^String storm-name ^String uploadedJarLocation ^String serializedConf ^StormTopology topology
   ^SubmitOptions submitOptions]
  (try
    (assert (not-nil? submitOptions))
    (validate-topology-name! storm-name)
    (check-storm-active! nimbus storm-name false)
    (let [topo-conf (from-json serializedConf)]
      (try
        (validate-configs-with-schemas topo-conf)
        (catch IllegalArgumentException ex
          (throw (InvalidTopologyException. (.getMessage ex)))))
      (.validate ^backtype.storm.nimbus.ITopologyValidator (:validator nimbus)
                 storm-name
                 topo-conf
                 topology))
    (swap! (:submitted-count nimbus) inc)
    (let [storm-id (str storm-name "-" @(:submitted-count nimbus) "-" (current-time-secs))
          storm-conf (normalize-conf
                      conf
                      (-> serializedConf
                          from-json
                          (assoc STORM-ID storm-id)
                          (assoc TOPOLOGY-NAME storm-name))
                      topology)
          total-storm-conf (merge conf storm-conf)
          topology (normalize-topology total-storm-conf topology)
          storm-cluster-state (:storm-cluster-state nimbus)]
      (system-topology! total-storm-conf topology) ;; this validates the structure of the topology
      (log-message "Received topology submission for " storm-name " with conf " storm-conf)
      ;; lock protects against multiple topologies being submitted at once and
      ;; cleanup thread killing topology in b/w assignment and starting the topology
      (locking (:submit-lock nimbus)
        (setup-storm-code conf storm-id uploadedJarLocation storm-conf topology)
        (.setup-heartbeats! storm-cluster-state storm-id)
        (let [thrift-status->kw-status {TopologyInitialStatus/INACTIVE :inactive
                                        TopologyInitialStatus/ACTIVE :active}]
          (start-storm nimbus storm-name storm-id (thrift-status->kw-status (.get_initial_status submitOptions))))
        (mk-assignments nimbus)))
    (catch Throwable e
      (log-warn-error e "Topology submission exception. (topology name='" storm-name "')")
      (throw e))))
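One detail of the function above worth isolating is how the topology id is built: the name, the running submission counter and the current time in seconds, joined by hyphens. A rough Python sketch follows; make_storm_id is a hypothetical name and itertools.count stands in for the :submitted-count atom.

```python
import itertools
import time

# Stands in for the :submitted-count atom, which starts at 0 and is
# incremented before the id is built, so the first id carries 1.
submitted_count = itertools.count(1)


def make_storm_id(storm_name, now=None):
    """Build an id shaped like <name>-<submission counter>-<epoch seconds>."""
    secs = int(time.time()) if now is None else int(now)
    return "%s-%d-%d" % (storm_name, next(submitted_count), secs)


first = make_storm_id("word-count", now=1400000000)
second = make_storm_id("word-count", now=1400000000)
```

Because the counter is part of the id, two submissions under the same name within the same second still get different ids.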
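storm-name is the topology name, uploadedJarLocation is the location of the jar on Nimbus, serializedConf is the serialized topology configuration, and topology is the Thrift structure describing the topology. That structure is defined in storm.thrift as follows: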
struct StormTopology {
  //ids must be unique across maps
  // #workers to use is in conf
  1: required map<string, SpoutSpec> spouts;
  2: required map<string, Bolt> bolts;
  3: required map<string, StateSpoutSpec> state_spouts;
}
spouts maps spout ids to spouts and bolts maps bolt ids to bolts; StateSpoutSpec is not implemented yet. SpoutSpec is defined as follows:
struct SpoutSpec {
  1: required ComponentObject spout_object;
  2: required ComponentCommon common;
  // can force a spout to be non-distributed by overriding the component configuration
  // and setting TOPOLOGY_MAX_TASK_PARALLELISM to 1
}
Bolt is defined as follows:
struct Bolt {
  1: required ComponentObject bolt_object;
  2: required ComponentCommon common;
}
Bolt and SpoutSpec have the same shape: each consists of one ComponentObject and one ComponentCommon. ComponentObject is defined as follows:
union ComponentObject {
  1: binary serialized_java;
  2: ShellComponent shell;
  3: JavaObject java_object;
}
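ComponentObject is the actual implementation of the bolt. It can be one of three types:
1. A serialized Java object (one that implements the IBolt interface).
2. A ShellComponent object, meaning the bolt is implemented in another language. When a bolt is defined this way, Storm instantiates a ShellBolt object to handle the communication between the JVM-based worker process and the non-JVM implementation of the component (the bolt).
3. A JavaObject structure, which tells Storm the class name and constructor arguments needed to instantiate the bolt. This is useful when you define a topology in a non-JVM language: you can use JVM-based spouts and bolts without having to create and serialize their Java objects yourself.
ComponentCommon is defined as follows: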
struct ComponentCommon {
  1: required map<GlobalStreamId, Grouping> inputs;
  2: required map<string, StreamInfo> streams; //key is stream id
  3: optional i32 parallelism_hint; //how many threads across the cluster should be dedicated to this component
  // component specific configuration respects:
  // topology.debug: false
  // topology.max.task.parallelism: null // can replace isDistributed with this
  // topology.max.spout.pending: null
  // topology.kryo.register // this is the only additive one
  // component specific configuration
  4: optional string json_conf;
}
GlobalStreamId is defined as follows:
struct GlobalStreamId {
  1: required string componentId;
  2: required string streamId;
  #Going to need to add an enum for the stream type (NORMAL or FAILURE)
}
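ComponentCommon defines all the other properties of a component, including:
1. Which streams the component receives (a map keyed by component_id:stream_id, with the stream grouping to use as the value)
2. Which streams the component emits, together with each stream's metadata (whether it is a direct stream, and the declared fields)
3. The parallelism of the component
4. The component-specific configuration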
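As a reading aid, the nesting of these Thrift structures can be mirrored with Python dataclasses. This is a simplified model written for this walkthrough, not generated Thrift code: groupings and StreamInfo are reduced to plain strings and lists, only the serialized_java arm of the ComponentObject union is represented, and the stream and field names in the example are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ComponentCommon:
    # (component id, stream id) -> grouping, mirroring map<GlobalStreamId, Grouping>
    inputs: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # stream id -> declared output fields (a simplification of StreamInfo)
    streams: Dict[str, List[str]] = field(default_factory=dict)
    parallelism_hint: Optional[int] = None
    json_conf: Optional[str] = None


@dataclass
class SpoutSpec:
    spout_object: bytes  # stands for the serialized_java arm of the union
    common: ComponentCommon


@dataclass
class Bolt:
    bolt_object: bytes
    common: ComponentCommon


@dataclass
class StormTopology:
    spouts: Dict[str, SpoutSpec] = field(default_factory=dict)
    bolts: Dict[str, Bolt] = field(default_factory=dict)


topology = StormTopology()
topology.spouts["spout"] = SpoutSpec(
    b"", ComponentCommon(streams={"default": ["sentence"]}, parallelism_hint=6))
topology.bolts["split"] = Bolt(
    b"", ComponentCommon(inputs={("spout", "default"): "shuffle"},
                         streams={"default": ["word"]},
                         parallelism_hint=12))
```

The example shows that wiring lives on the consuming side: the split bolt names the spout's stream in its inputs, while the spout itself only declares what it emits.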
(assert (not-nil? submitOptions)) throws java.lang.AssertionError if submitOptions is nil. (validate-topology-name! storm-name) validates the topology name; validate-topology-name! is defined as follows:
(defn validate-topology-name! [name]
  (if (some #(.contains name %) DISALLOWED-TOPOLOGY-NAME-STRS)
    (throw (InvalidTopologyException.
            (str "Topology name cannot contain any of the following: " (pr-str DISALLOWED-TOPOLOGY-NAME-STRS))))
    (if (clojure.string/blank? name)
      (throw (InvalidTopologyException.
              ("Topology name cannot be blank"))))))
DISALLOWED-TOPOLOGY-NAME-STRS is defined as follows:
(def DISALLOWED-TOPOLOGY-NAME-STRS #{"/" "." ":" "\\"})
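The intent of the name check can be sketched in Python as follows. This mirrors the logic as it reads in the Clojure source rather than reproducing Storm's code; InvalidTopologyError and is_valid are hypothetical names introduced for the example.

```python
DISALLOWED_TOPOLOGY_NAME_STRS = {"/", ".", ":", "\\"}


class InvalidTopologyError(Exception):
    """Hypothetical stand-in for InvalidTopologyException."""


def validate_topology_name(name):
    # any() plays the role of Clojure's `some`: it stops at the first match.
    if any(s in name for s in DISALLOWED_TOPOLOGY_NAME_STRS):
        raise InvalidTopologyError(
            "Topology name cannot contain any of the following: %s"
            % sorted(DISALLOWED_TOPOLOGY_NAME_STRS))
    if not name.strip():
        raise InvalidTopologyError("Topology name cannot be blank")


def is_valid(name):
    """Convenience wrapper that turns the exception into a boolean."""
    try:
        validate_topology_name(name)
        return True
    except InvalidTopologyError:
        return False
```

With this sketch, is_valid("word-count") holds while names such as "word.count" or a blank string are rejected.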
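DISALLOWED-TOPOLOGY-NAME-STRS contains the strings that may not appear in a topology name. The first argument of some is an anonymous function that is applied to each element of DISALLOWED-TOPOLOGY-NAME-STRS; some returns as soon as one application yields true. validate-topology-name! therefore checks whether the name contains an illegal character and, if it does not, whether the name is blank. Note that in the blank-name branch as quoted, the message string is wrapped in an extra pair of parentheses, which Clojure evaluates as a function call on a string, so that branch would fail with a ClassCastException instead of the intended InvalidTopologyException. check-storm-active! checks whether a topology with the given name is currently active and throws if that differs from what the caller expects. It is defined as follows: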
(defn check-storm-active! [nimbus storm-name active?]
  (if (= (not active?)
         (storm-active? (:storm-cluster-state nimbus)
                        storm-name))
    (if active?
      (throw (NotAliveException. (str storm-name " is not alive")))
      (throw (AlreadyAliveException. (str storm-name " is already active"))))
    ))
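The condition in check-storm-active! is compact enough to misread, so here is a Python sketch of the same branching. The set of active names replaces the ZooKeeper lookup, and the exception classes and the outcome helper are hypothetical names used only for this illustration.

```python
class NotAliveError(Exception):
    """Hypothetical stand-in for NotAliveException."""


class AlreadyAliveError(Exception):
    """Hypothetical stand-in for AlreadyAliveException."""


def check_storm_active(active_names, storm_name, expect_active):
    is_active = storm_name in active_names
    # The check fails exactly when reality differs from the expectation,
    # which is what (= (not active?) (storm-active? ...)) expresses.
    if (not expect_active) == is_active:
        if expect_active:
            raise NotAliveError(storm_name + " is not alive")
        raise AlreadyAliveError(storm_name + " is already active")


def outcome(active_names, storm_name, expect_active):
    """Convenience wrapper mapping each result to a short label."""
    try:
        check_storm_active(active_names, storm_name, expect_active)
        return "ok"
    except NotAliveError:
        return "not-alive"
    except AlreadyAliveError:
        return "already-alive"
```

During submission the function is called with the expectation set to false, so the check passes only when no topology with that name is running.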
nimbus is a map that holds the current state of the Nimbus Thrift server. It is created by the nimbus-data function, shown below:
(defn nimbus-data [conf inimbus]
  (let [forced-scheduler (.getForcedScheduler inimbus)]
    {:conf conf
     :inimbus inimbus
     :submitted-count (atom 0)
     :storm-cluster-state (cluster/mk-storm-cluster-state conf)
     :submit-lock (Object.)
     :heartbeats-cache (atom {})
     :downloaders (file-cache-map conf)
     :uploaders (file-cache-map conf)
     :uptime (uptime-computer)
     :validator (new-instance (conf NIMBUS-TOPOLOGY-VALIDATOR))
     :timer (mk-timer :kill-fn (fn [t]
                                 (log-error t "Error when processing event")
                                 (exit-process! 20 "Error when processing an event")
                                 ))
     :scheduler (mk-scheduler conf inimbus)
     }))
conf holds the cluster configuration, inimbus is the current Nimbus instance, and cluster/mk-storm-cluster-state returns an instance that implements the StormClusterState protocol. storm-active? is defined as follows:
(defn storm-active? [storm-cluster-state storm-name]
  (not-nil? (get-storm-id storm-cluster-state storm-name)))
It calls get-storm-id to look up the topology id for the given topology name and returns true if an id exists, false otherwise. get-storm-id is as follows:
(defn get-storm-id [storm-cluster-state storm-name]
  (let [active-storms (.active-storms storm-cluster-state)]
    (find-first
      #(= storm-name (:storm-name (.storm-base storm-cluster-state % nil)))
      active-storms)
    ))
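active-storms lists all children of the storms node in ZooKeeper (/storm/storms/); /storm/storms/{topology-id} stores information about a currently running topology. For the stored content, see the StormBase record in common.clj: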
(defrecord StormBase [storm-name launch-time-secs status num-workers component->executors])
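find-first returns the id of the first topology whose name equals storm-name. When we submit a new topology correctly, ZooKeeper has no {topology-id} node for it under /storm/storms, so storm-active? returns false. Since check-storm-active! was called with active? set to false, the condition of its first if is (= true false), which is false; no exception is thrown and the check passes. Next, the topology configuration is bound to topo-conf and validate-configs-with-schemas verifies it. validate-configs-with-schemas is defined as follows: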
(defn validate-configs-with-schemas
  [conf]
  (doseq [[k v] conf
          :let [schema (CONFIG-SCHEMA-MAP k)]]
    (if (not (nil? schema))
      (.validateField schema k v))))
CONFIG-SCHEMA-MAP is defined as follows:
;; Create a mapping of config-string -> validator
;; Config fields must have a _SCHEMA field defined
(def CONFIG-SCHEMA-MAP
  (->> (.getFields Config)
       (filter #(not (re-matches #".*_SCHEMA$" (.getName %))))
       (map (fn [f] [(.get f nil)
                     (get-FieldValidator
                      (-> Config
                          (.getField (str (.getName f) "_SCHEMA"))
                          (.get nil)))]))
       (into {})))
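Config.java mainly contains two kinds of static fields: configuration keys, and the validators for those keys, whose field names end in _SCHEMA. CONFIG-SCHEMA-MAP maps each configuration key string (the value of the static field) to its validator, i.e. config-string -> validator.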
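A rough Python sketch of this lookup-then-validate pattern is shown below. The miniature Config class, the use of Python types as schemas, and the decision to skip None values are all assumptions of the sketch; Storm's real validators live in ConfigValidation.

```python
class Config:
    """Hypothetical miniature of a config class with paired *_SCHEMA fields."""
    TOPOLOGY_WORKERS = "topology.workers"
    TOPOLOGY_WORKERS_SCHEMA = int
    TOPOLOGY_DEBUG = "topology.debug"
    TOPOLOGY_DEBUG_SCHEMA = bool


def build_schema_map(cls):
    # Keep the key fields (upper-case names not ending in _SCHEMA) and map
    # each key *string* to the schema stored in its *_SCHEMA sibling.
    names = [n for n in vars(cls) if n.isupper() and not n.endswith("_SCHEMA")]
    return {getattr(cls, n): getattr(cls, n + "_SCHEMA") for n in names}


CONFIG_SCHEMA_MAP = build_schema_map(Config)


def validate_configs_with_schemas(conf):
    for key, value in conf.items():
        schema = CONFIG_SCHEMA_MAP.get(key)
        # Keys without a registered schema are skipped, as in the Clojure code.
        if schema is not None and value is not None and not isinstance(value, schema):
            raise ValueError("Field %s must be of type %s" % (key, schema.__name__))
```

Unknown keys pass through untouched, so user-defined settings in the topology configuration are never rejected by this step.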
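validate-configs-with-schemas looks up the validator for each configuration key and validates the configured value with it. For the validators themselves, see FieldValidator, an inner class of ConfigValidation. (:validator nimbus) returns an instance that implements the backtype.storm.nimbus.ITopologyValidator interface (an instance of backtype.storm.nimbus.DefaultTopologyValidator), and its validate method is then called. The DefaultTopologyValidator class is as follows: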
public class DefaultTopologyValidator implements ITopologyValidator {
    @Override
    public void prepare(Map StormConf){
    }
    @Override
    public void validate(String topologyName, Map topologyConf, StormTopology topology) throws InvalidTopologyException {
    }
}
By default, validate is an empty implementation.
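swap! increments (:submitted-count nimbus), an atom (comparable to the atomic types in Java), which counts the topologies submitted so far. storm-id is bound to the topology id. storm-conf is bound to the normalized configuration obtained after merging the topology configuration with the cluster configuration: the serializers, the classes that need to be serialized, the number of ackers and the maximum task parallelism. total-storm-conf is bound to the complete configuration. normalize-topology mainly adds the "topology.tasks" setting (the number of tasks) to the topology's components.
normalize-topology is defined as follows: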
(defn normalize-topology [storm-conf ^StormTopology topology]
  (let [ret (.deepCopy topology)]
    (doseq [[_ component] (all-components ret)]
      (.set_json_conf
        (.get_common component)
        (->> {TOPOLOGY-TASKS (component-parallelism storm-conf component)}
             (merge (component-conf component))
             to-json )))
    ret ))
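ret is bound to a deep copy of the topology. all-components returns a map from component id to spout/bolt object for every component of the topology, and get_common retrieves the ComponentCommon of each spout/bolt. ->> is a Clojure macro: here it passes {......} as the last argument of merge, then passes the return value of merge as the last argument of to-json. component-parallelism is defined as follows: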
(defn- component-parallelism [storm-conf component]
  (let [storm-conf (merge storm-conf (component-conf component))
        num-tasks (or (storm-conf TOPOLOGY-TASKS) (num-start-executors component))
        max-parallelism (storm-conf TOPOLOGY-MAX-TASK-PARALLELISM)
        ]
    (if max-parallelism
      (min max-parallelism num-tasks)
      num-tasks)))
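component-parallelism is a private function whose job is to determine the value of "topology.tasks". num-start-executors returns the parallelism of the spout/bolt, defaulting to 1 when none was set. num-tasks is bound to the number of tasks of the component and max-parallelism to the maximum task parallelism; the function returns the smaller of num-tasks and max-parallelism, or num-tasks alone when no maximum is configured. normalize-topology stores the configuration with "topology.tasks" added in the json_conf of the spout/bolt's ComponentCommon and returns the modified topology.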
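The rule just described can be written out in Python as a sketch. The function signature is an adaptation for illustration: the component's own configuration and parallelism hint are passed in directly instead of being read from a Thrift object.

```python
def component_parallelism(storm_conf, component_conf, parallelism_hint=None):
    """Sketch of the task-count rule; arguments are simplified stand-ins."""
    # Component-level settings override topology-level ones.
    conf = {**storm_conf, **component_conf}
    # The executor count defaults to 1 when no parallelism hint was given.
    num_start_executors = 1 if parallelism_hint is None else parallelism_hint
    num_tasks = conf.get("topology.tasks")
    if num_tasks is None:
        num_tasks = num_start_executors
    max_parallelism = conf.get("topology.max.task.parallelism")
    if max_parallelism is None:
        return num_tasks
    return min(max_parallelism, num_tasks)
```

For the WordCountTopology above, running locally with setMaxTaskParallelism(3) would cap the split bolt's hint of 12 at 3 tasks.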
system-topology! is defined as follows:
(defn system-topology! [storm-conf ^StormTopology topology]
  (validate-basic! topology)
  (let [ret (.deepCopy topology)]
    (add-acker! storm-conf ret)
    (add-metric-components! storm-conf ret)
    (add-system-components! storm-conf ret)
    (add-metric-streams! ret)
    (add-system-streams! ret)
    (validate-structure! ret)
    ret
    ))
validate-basic! validates the basic properties of the topology, and add-acker! adds the acker bolt. add-acker! is defined as follows:
(defn add-acker! [storm-conf ^StormTopology ret]
  (let [num-executors (if (nil? (storm-conf TOPOLOGY-ACKER-EXECUTORS)) (storm-conf TOPOLOGY-WORKERS) (storm-conf TOPOLOGY-ACKER-EXECUTORS))
        acker-bolt (thrift/mk-bolt-spec* (acker-inputs ret)
                                         (new backtype.storm.daemon.acker)
                                         {ACKER-ACK-STREAM-ID (thrift/direct-output-fields ["id"])
                                          ACKER-FAIL-STREAM-ID (thrift/direct-output-fields ["id"])
                                          }
                                         :p num-executors
                                         :conf {TOPOLOGY-TASKS num-executors
                                                TOPOLOGY-TICK-TUPLE-FREQ-SECS (storm-conf TOPOLOGY-MESSAGE-TIMEOUT-SECS)})]
    (dofor [[_ bolt] (.get_bolts ret)
            :let [common (.get_common bolt)]]
      (do
        (.put_to_streams common ACKER-ACK-STREAM-ID (thrift/output-fields ["id" "ack-val"]))
        (.put_to_streams common ACKER-FAIL-STREAM-ID (thrift/output-fields ["id"]))
        ))
    (dofor [[_ spout] (.get_spouts ret)
            :let [common (.get_common spout)
                  spout-conf (merge
                               (component-conf spout)
                               {TOPOLOGY-TICK-TUPLE-FREQ-SECS (storm-conf TOPOLOGY-MESSAGE-TIMEOUT-SECS)})]]
      (do
        ;; this set up tick tuples to cause timeouts to be triggered
        (.set_json_conf common (to-json spout-conf))
        (.put_to_streams common ACKER-INIT-STREAM-ID (thrift/output-fields ["id" "init-val" "spout-task"]))
        (.put_to_inputs common
                        (GlobalStreamId. ACKER-COMPONENT-ID ACKER-ACK-STREAM-ID)
                        (thrift/mk-direct-grouping))
        (.put_to_inputs common
                        (GlobalStreamId. ACKER-COMPONENT-ID ACKER-FAIL-STREAM-ID)
                        (thrift/mk-direct-grouping))
        ))
    (.put_to_bolts ret "__acker" acker-bolt)
    ))
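The number of acker executors depends on whether "topology.acker.executors" is configured: if it is not, num-executors is bound to the value of "topology.workers"; otherwise it is bound to the value of "topology.acker.executors". acker-bolt is bound to the generated acker bolt spec. acker-inputs is defined as follows: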
(defn acker-inputs [^StormTopology topology]
  (let [bolt-ids (.. topology get_bolts keySet)
        spout-ids (.. topology get_spouts keySet)
        spout-inputs (apply merge
                            (for [id spout-ids]
                              {[id ACKER-INIT-STREAM-ID] ["id"]}
                              ))
        bolt-inputs (apply merge
                           (for [id bolt-ids]
                             {[id ACKER-ACK-STREAM-ID] ["id"]
                              [id ACKER-FAIL-STREAM-ID] ["id"]}
                             ))]
    (merge spout-inputs bolt-inputs)))
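The shape of the map that acker-inputs returns is easier to see with concrete ids. In the Python sketch below the three stream id strings are placeholder values assumed for the example, and tuples stand in for the [component-id stream-id] vector keys.

```python
# Placeholder stream ids, assumed for this sketch.
ACKER_INIT_STREAM_ID = "__ack_init"
ACKER_ACK_STREAM_ID = "__ack_ack"
ACKER_FAIL_STREAM_ID = "__ack_fail"


def acker_inputs(spout_ids, bolt_ids):
    """Map (component id, stream id) -> grouping fields for the acker bolt."""
    inputs = {}
    for spout_id in spout_ids:
        # Spouts announce new tuple trees on the init stream.
        inputs[(spout_id, ACKER_INIT_STREAM_ID)] = ["id"]
    for bolt_id in bolt_ids:
        # Bolts report acks and failures on two separate streams.
        inputs[(bolt_id, ACKER_ACK_STREAM_ID)] = ["id"]
        inputs[(bolt_id, ACKER_FAIL_STREAM_ID)] = ["id"]
    return inputs


inputs = acker_inputs(["spout"], ["split", "count"])
```

Every entry groups on the "id" field, so all messages about one tuple tree reach the same acker task.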
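bolt-ids is bound to the ids of all bolts in the topology and spout-ids to the ids of all spouts. spout-inputs is bound to the input streams coming from spouts and bolt-inputs to those coming from bolts; the function returns the merged inputs (a map). ACKER-ACK-STREAM-ID and ACKER-FAIL-STREAM-ID identify the acker's output streams. TOPOLOGY-TICK-TUPLE-FREQ-SECS is the tick tuple frequency, initialized to the message timeout. The first dofor adds the ACKER-ACK-STREAM-ID and ACKER-FAIL-STREAM-ID output streams to every bolt, which are used to send ack values to the acker bolt. The second dofor sets the tick tuple frequency for every spout, and adds the ACKER-INIT-STREAM-ID output stream towards the acker bolt as well as the two input streams coming from the acker bolt. With this in place, the acker bolt can exchange ack information with spouts and bolts. add-metric-components! adds the metric bolts to the topology definition; metric bolts collect statistics about executors (threads). add-metric-components! is defined as follows: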
(defn add-metric-components! [storm-conf ^StormTopology topology]
  (doseq [[comp-id bolt-spec] (metrics-consumer-bolt-specs storm-conf topology)]
    (.put_to_bolts topology comp-id bolt-spec)))
metrics-consumer-bolt-specs is defined as follows:
(defn metrics-consumer-bolt-specs [storm-conf topology]
  (let [component-ids-that-emit-metrics (cons SYSTEM-COMPONENT-ID (keys (all-components topology)))
        inputs (->> (for [comp-id component-ids-that-emit-metrics]
                      {[comp-id METRICS-STREAM-ID] :shuffle})
                    (into {}))
        mk-bolt-spec (fn [class arg p]
                       (thrift/mk-bolt-spec*
                        inputs
                        (backtype.storm.metric.MetricsConsumerBolt. class arg)
                        {} :p p :conf {TOPOLOGY-TASKS p}))]
    (map
     (fn [component-id register]
       [component-id (mk-bolt-spec (get register "class")
                                   (get register "argument")
                                   (or (get register "parallelism.hint") 1))])
     (metrics-consumer-register-ids storm-conf)
     (get storm-conf TOPOLOGY-METRICS-CONSUMER-REGISTER))))
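component-ids-that-emit-metrics is bound to the ids of all spouts and bolts plus the system bolt. inputs is bound to the input streams of a metric bolt, all of which use shuffle grouping. mk-bolt-spec is bound to an anonymous function that builds a bolt spec. metrics-consumer-register-ids produces a list of component ids, one per registered metrics consumer, and the get call returns all registered metrics consumers; map then returns a list of pairs ([component-id bolt-spec] [component-id bolt-spec] ......). add-system-components! adds the system bolt to the topology definition. The system bolt collects statistics about the worker process, such as memory usage, GC activity and network throughput. Each worker process has exactly one system bolt. add-system-components! is defined as follows: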
(defn add-system-components! [conf ^StormTopology topology]
  (let [system-bolt-spec (thrift/mk-bolt-spec*
                          {}
                          (SystemBolt.)
                          {SYSTEM-TICK-STREAM-ID (thrift/output-fields ["rate_secs"])
                           METRICS-TICK-STREAM-ID (thrift/output-fields ["interval"])}
                          :p 0
                          :conf {TOPOLOGY-TASKS 0})]
    (.put_to_bolts topology SYSTEM-COMPONENT-ID system-bolt-spec)))
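The first argument of thrift/mk-bolt-spec*, {}, shows that the system bolt has no input streams; the third argument shows that it has two output streams, used to send tick tuples. Its parallelism is 0: because the system bolt belongs to the worker process, there is no need to specify a parallelism, and it does not need to run any tasks either. add-metric-streams! adds the metric stream definitions to the topology. It is defined as follows: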
(defn add-metric-streams! [^StormTopology topology]
  (doseq [[_ component] (all-components topology)
          :let [common (.get_common component)]]
    (.put_to_streams common METRICS-STREAM-ID
                     (thrift/output-fields ["task-info" "data-points"]))))
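This adds the metric stream identified by METRICS-STREAM-ID to every spout and bolt. add-system-streams! is similar and adds the system stream identified by SYSTEM-STREAM-ID to every spout and bolt. After calling system-topology!, submitTopologyWithOpts first acquires the lock and then calls setup-storm-code, which copies the jar uploaded to Nimbus, the topology and the configuration into the directory {storm.local.dir}/nimbus/stormdist/{topology id}. It is defined as follows: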
(defn- setup-storm-code [conf storm-id tmp-jar-location storm-conf topology]
  (let [stormroot (master-stormdist-root conf storm-id)]
    (FileUtils/forceMkdir (File. stormroot))
    (FileUtils/cleanDirectory (File. stormroot))
    (setup-jar conf tmp-jar-location stormroot)
    (FileUtils/writeByteArrayToFile (File. (master-stormcode-path stormroot)) (Utils/serialize topology))
    (FileUtils/writeByteArrayToFile (File. (master-stormconf-path stormroot)) (Utils/serialize storm-conf))
    ))
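The resulting on-disk layout can be reproduced with a short Python sketch that works in a temporary directory. It is an approximation for illustration: pickle stands in for the Java serialization performed by Utils/serialize, and the jar, topology and configuration values are dummies.

```python
import os
import pickle
import shutil
import tempfile


def setup_storm_code(local_dir, storm_id, tmp_jar_location, storm_conf, topology):
    stormroot = os.path.join(local_dir, "nimbus", "stormdist", storm_id)
    # Start from an empty directory for this topology id.
    shutil.rmtree(stormroot, ignore_errors=True)
    os.makedirs(stormroot)
    # The uploaded jar is copied from the inbox and renamed.
    shutil.copyfile(tmp_jar_location, os.path.join(stormroot, "stormjar.jar"))
    # pickle is only a stand-in for Java serialization here.
    with open(os.path.join(stormroot, "stormcode.ser"), "wb") as f:
        pickle.dump(topology, f)
    with open(os.path.join(stormroot, "stormconf.ser"), "wb") as f:
        pickle.dump(storm_conf, f)
    return stormroot


local_dir = tempfile.mkdtemp()
inbox = os.path.join(local_dir, "nimbus", "inbox")
os.makedirs(inbox)
jar = os.path.join(inbox, "stormjar-example.jar")
with open(jar, "wb") as f:
    f.write(b"fake jar")

root = setup_storm_code(local_dir, "word-count-1-1400000000", jar,
                        {"topology.workers": 4}, {"spouts": ["spout"]})
created = sorted(os.listdir(root))
```

After the call, the topology directory contains exactly three files: the jar, the serialized topology and the serialized configuration.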
setup-jar copies the jar from {storm.local.dir}/nimbus/inbox/ to {storm.local.dir}/nimbus/stormdist/{topology id} and renames it to stormjar.jar. FileUtils/writeByteArrayToFile saves the serialized topology object and storm-conf to stormcode.ser and stormconf.ser respectively. setup-heartbeats! is defined in cluster.clj as a function of the StormClusterState protocol; it creates the ZooKeeper directory that holds the topology's heartbeat information. The heartbeat directory is /storm/workerbeats/{topology id}/.
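start-storm reads the cluster configuration and the Nimbus configuration, deserializes the topology configuration from stormconf.ser and the topology from stormcode.ser, and then calls activate-storm! to write the topology's metadata, a StormBase object, to /storm/storms/{topology id} in ZooKeeper. It is defined as follows: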
(defn- start-storm [nimbus storm-name storm-id topology-initial-status]
  {:pre [(#{:active :inactive} topology-initial-status)]}
  (let [storm-cluster-state (:storm-cluster-state nimbus)
        conf (:conf nimbus)
        storm-conf (read-storm-conf conf storm-id)
        topology (system-topology! storm-conf (read-storm-topology conf storm-id))
        num-executors (->> (all-components topology) (map-val num-start-executors))]
    (log-message "Activating " storm-name ": " storm-id)
    (.activate-storm! storm-cluster-state
                      storm-id
                      (StormBase. storm-name
                                  (current-time-secs)
                                  {:type topology-initial-status}
                                  (storm-conf TOPOLOGY-WORKERS)
                                  num-executors))))
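Finally, submitTopologyWithOpts calls mk-assignments to assign tasks. Task assignment is an important part of the Storm architecture; for reasons of space, its source code will be analyzed in a later article.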