Getting Started with Storm: Common Problems
 

 
Error 1: When submitting topologies to a remote cluster, a "Nimbus host is not set" exception is thrown. The output is as follows:

[root@xop-dev-a bin]# ./storm jar /home/clx/storm-starter.jar storm.starter.WordCountTopology wordcount

Running: export STORM_JAR=/home/clx/storm-starter.jar; java -client -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib  -cp /home/clx/storm-0.5.4/storm-0.5.4.jar:/home/clx/storm-0.5.4/lib/log4j-1.2.16.jar:/home/clx/storm-0.5.4/lib/tools.macro-0.1.0.jar:/home/clx/storm-0.5.4/lib/jline-0.9.94.jar:/home/clx/storm-0.5.4/lib/commons-lang-2.5.jar:/home/clx/storm-0.5.4/lib/core.incubator-0.1.0.jar:/home/clx/storm-0.5.4/lib/junit-3.8.1.jar:/home/clx/storm-0.5.4/lib/compojure-0.6.4.jar:/home/clx/storm-0.5.4/lib/zookeeper-3.3.2.jar:/home/clx/storm-0.5.4/lib/clojure-contrib-1.2.0.jar:/home/clx/storm-0.5.4/lib/httpcore-4.0.1.jar:/home/clx/storm-0.5.4/lib/commons-logging-1.1.1.jar:/home/clx/storm-0.5.4/lib/commons-io-1.4.jar:/home/clx/storm-0.5.4/lib/ring-core-0.3.10.jar:/home/clx/storm-0.5.4/lib/httpclient-4.0.1.jar:/home/clx/storm-0.5.4/lib/commons-codec-1.3.jar:/home/clx/storm-0.5.4/lib/jzmq-2.1.0.jar:/home/clx/storm-0.5.4/lib/jvyaml-1.0.0.jar:/home/clx/storm-0.5.4/lib/commons-fileupload-1.2.1.jar:/home/clx/storm-0.5.4/lib/slf4j-log4j12-1.5.8.jar:/home/clx/storm-0.5.4/lib/servlet-api-2.5.jar:/home/clx/storm-0.5.4/lib/json-simple-1.1.jar:/home/clx/storm-0.5.4/lib/ring-jetty-adapter-0.3.11.jar:/home/clx/storm-0.5.4/lib/slf4j-api-1.5.8.jar:/home/clx/storm-0.5.4/lib/jetty-util-6.1.26.jar:/home/clx/storm-0.5.4/lib/joda-time-1.6.jar:/home/clx/storm-0.5.4/lib/libthrift7-0.7.0.jar:/home/clx/storm-0.5.4/lib/commons-exec-1.1.jar:/home/clx/storm-0.5.4/lib/clojure-1.2.0.jar:/home/clx/storm-0.5.4/lib/ring-servlet-0.3.11.jar:/home/clx/storm-0.5.4/lib/clj-time-0.3.0.jar:/home/clx/storm-0.5.4/lib/hiccup-0.3.6.jar:/home/clx/storm-0.5.4/lib/clout-0.4.1.jar:/home/clx/storm-0.5.4/lib/jetty-6.1.26.jar:/home/clx/storm-0.5.4/lib/servlet-api-2.5-20081211.jar:/home/clx/storm-starter.jar:/root/.storm:/home/clx/storm-0.5.4/bin storm.starter.WordCountTopology wordcount

0    [main] INFO  backtype.storm.StormSubmitter  - Jar not uploaded to master yet. Submitting jar...

Exception in thread "main" java.lang.IllegalArgumentException: Nimbus host is not set

         at backtype.storm.utils.NimbusClient.<init>(NimbusClient.java:30)

         at backtype.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:17)

         at backtype.storm.StormSubmitter.submitJar(StormSubmitter.java:78)

         at backtype.storm.StormSubmitter.submitJar(StormSubmitter.java:71)

         at backtype.storm.StormSubmitter.submitTopology(StormSubmitter.java:50)

         at storm.starter.WordCountTopology.main(WordCountTopology.java:81)

 

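Solution: Create a storm.yaml file in the ~/.storm/ directory (~ is the user's home directory; the storm client puts this directory on its classpath, as the command output above shows). The content of storm.yaml is: nimbus.host: "10.0.0.24". After restarting the nimbus daemon, the exception no longer occurs.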
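
Before resubmitting, it can help to confirm that the client-side file really exists and contains the key. The following is a minimal sketch, not part of Storm: the class name NimbusHostCheck and the method findNimbusHost are hypothetical, and the check is a naive line scan rather than a real YAML parser.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

public class NimbusHostCheck {
    // Returns the value of the first "nimbus.host" entry, if any.
    // Naive line scan (assumption: the key sits on one line); not a YAML parser.
    static Optional<String> findNimbusHost(List<String> lines) {
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("nimbus.host:")) {
                String value = trimmed.substring("nimbus.host:".length()).trim();
                // Strip one optional leading and trailing quote character.
                value = value.replaceAll("^[\"']|[\"']$", "");
                return value.isEmpty() ? Optional.empty() : Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        // Same location as in the solution above: ~/.storm/storm.yaml
        Path yaml = Paths.get(System.getProperty("user.home"), ".storm", "storm.yaml");
        if (!Files.exists(yaml)) {
            System.out.println("Missing " + yaml);
            return;
        }
        Optional<String> host = findNimbusHost(Files.readAllLines(yaml));
        System.out.println(host.map(h -> "nimbus.host = " + h)
                               .orElse("nimbus.host is not set in " + yaml));
    }
}
```

Run it as the same user that runs ./storm jar, since ~ resolves per user (root in the log above).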

 

Error 2: When starting the Supervisor, the following exception is thrown: java.lang.UnsatisfiedLinkError: /usr/local/lib/libjzmq.so.0.0.0: libzmq.so.1: cannot open shared object file: No such file or directory. The log is as follows:

2011-12-01 11:58:47 worker [ERROR] Error on initialization of server mk-worker

java.lang.UnsatisfiedLinkError: /usr/local/lib/libjzmq.so.0.0.0: libzmq.so.1: cannot open shared object file: No such file or directory

         at java.lang.ClassLoader$NativeLibrary.load(Native Method)

         at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1803)

         at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1728)

         at java.lang.Runtime.loadLibrary0(Runtime.java:823)

         at java.lang.System.loadLibrary(System.java:1028)

         at org.zeromq.ZMQ.<clinit>(ZMQ.java:34)

         at java.lang.Class.forName0(Native Method)

         at java.lang.Class.forName(Class.java:169)

         at zilch.mq__init.load(Unknown Source)

         at zilch.mq__init.<clinit>(Unknown Source)

         at java.lang.Class.forName0(Native Method)

         at java.lang.Class.forName(Class.java:247)

         at clojure.lang.RT.loadClassForName(RT.java:1578)

         at clojure.lang.RT.load(RT.java:399)

         at clojure.lang.RT.load(RT.java:381)

         at clojure.core$load$fn__4511.invoke(core.clj:4905)

         at clojure.core$load.doInvoke(core.clj:4904)

         at clojure.lang.RestFn.invoke(RestFn.java:409)

         at clojure.core$load_one.invoke(core.clj:4729)

         at clojure.core$load_lib.doInvoke(core.clj:4766)

         at clojure.lang.RestFn.applyTo(RestFn.java:143)

         at clojure.core$apply.invoke(core.clj:542)

         at clojure.core$load_libs.doInvoke(core.clj:4800)

         at clojure.lang.RestFn.applyTo(RestFn.java:138)

         at clojure.core$apply.invoke(core.clj:542)

         at clojure.core$require.doInvoke(core.clj:4869)

         at clojure.lang.RestFn.invoke(RestFn.java:422)

         at backtype.storm.messaging.zmq$loading__4410__auto__.invoke(zmq.clj:1)

         at backtype.storm.messaging.zmq__init.load(Unknown Source)

         at backtype.storm.messaging.zmq__init.<clinit>(Unknown Source)

         at java.lang.Class.forName0(Native Method)

         at java.lang.Class.forName(Class.java:247)

         at clojure.lang.RT.loadClassForName(RT.java:1578)

         at clojure.lang.RT.load(RT.java:399)

         at clojure.lang.RT.load(RT.java:381)

         at clojure.core$load$fn__4511.invoke(core.clj:4905)

         at clojure.core$load.doInvoke(core.clj:4904)

         at clojure.lang.RestFn.invoke(RestFn.java:409)

         at clojure.core$load_one.invoke(core.clj:4729)

         at clojure.core$load_lib.doInvoke(core.clj:4766)

         at clojure.lang.RestFn.applyTo(RestFn.java:143)

         at clojure.core$apply.invoke(core.clj:542)

         at clojure.core$load_libs.doInvoke(core.clj:4800)

         at clojure.lang.RestFn.applyTo(RestFn.java:138)

         at clojure.core$apply.invoke(core.clj:542)

         at clojure.core$require.doInvoke(core.clj:4869)

         at clojure.lang.RestFn.invoke(RestFn.java:409)

         at backtype.storm.messaging.loader$mk_zmq_context.doInvoke(loader.clj:8)

         at clojure.lang.RestFn.invoke(RestFn.java:437)

         at backtype.storm.daemon.worker$fn__3074$exec_fn__858__auto____3075.invoke(worker.clj:109)

         at clojure.lang.AFn.applyToHelper(AFn.java:187)

         at clojure.lang.AFn.applyTo(AFn.java:151)

         at clojure.core$apply.invoke(core.clj:540)

         at backtype.storm.daemon.worker$fn__3074$mk_worker__3216.doInvoke(worker.clj:78)

         at clojure.lang.RestFn.invoke(RestFn.java:513)

         at backtype.storm.daemon.worker$_main.invoke(worker.clj:247)

         at clojure.lang.AFn.applyToHelper(AFn.java:174)

         at clojure.lang.AFn.applyTo(AFn.java:151)

         at backtype.storm.daemon.worker.main(Unknown Source)

 
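Solution 1: In the shell that starts the Supervisor, run export LD_LIBRARY_PATH=/usr/local/lib
Solution 2: Edit /etc/ld.so.conf and add the line /usr/local/lib, then run sudo ldconfig and restart the Supervisor. The exception no longer occurs.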
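
To verify the fix without starting a whole Supervisor, the native library can be loaded from a tiny standalone program. This is a minimal sketch under assumptions: NativeLibCheck and tryLoad are hypothetical names, and the library short name "jzmq" is inferred from the file name libjzmq.so in the error above.

```java
public class NativeLibCheck {
    // Tries to load a native library by short name (e.g. "jzmq" -> libjzmq.so on Linux).
    // Returns null on success, or the linker's error message on failure.
    static String tryLoad(String name) {
        try {
            System.loadLibrary(name);
            return null;
        } catch (UnsatisfiedLinkError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Where the JVM looks for the library itself.
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
        // Where the dynamic linker looks for its dependencies (may be null if unset).
        System.out.println("LD_LIBRARY_PATH   = " + System.getenv("LD_LIBRARY_PATH"));
        String error = tryLoad(args.length > 0 ? args[0] : "jzmq");
        System.out.println(error == null ? "loaded OK" : "load failed: " + error);
    }
}
```

Running it with java -Djava.library.path=/usr/local/lib NativeLibCheck mirrors the path shown in the storm command line earlier. If libjzmq.so is found but libzmq.so.1 is not, the message should match the one in the worker log.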
 
Error 3: When submitting a DRPC topology to a remote cluster, a NullPointerException is thrown and the connection to the DRPC server fails.
 
Solution: Add the DRPC server configuration to conf/storm.yaml, then start every DRPC server listed in that file. The configuration looks like this:
drpc.servers:
  - "DRPC server IP"
 
Error 4: When a client calls a DRPC service, "Failing message" appears in the worker log and none of the bolts receive any data. The log is as follows:

2011-12-02 09:59:16 task [INFO] Failing message backtype.storm.drpc.DRPCSpout$DRPCMessageId@3770bdf7: source: 1:27, stream: 1, id: {-5919451531315711689=-5919451531315711689}, [foo.com/blog/1, {"port":3772,"id":"5","host":"10.0.0.24"}]

 

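Solution: This kind of error is caused by an incorrect host name, domain name, or hosts file. Check and correct the host name, domain name, and hosts file on every machine used by Storm, then restart the network service: service network restart. After restarting Storm and calling the DRPC service again, the call succeeds. The hosts file must contain the following entries: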

<nimbus host IP> <nimbus host name> <nimbus host alias>
<supervisor host IP> <supervisor host name> <supervisor host alias>
<zookeeper host IP> <zookeeper host name> <zookeeper host alias>
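
A quick way to confirm that the hosts file entries work is to resolve each name from the JVM, which is what the Storm daemons run in. This is a minimal sketch using only java.net.InetAddress; the class name HostResolutionCheck and the method resolve are hypothetical, and the host names to check are passed as arguments.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostResolutionCheck {
    // Resolves a host name to an IP string, or returns null if it cannot be resolved.
    static String resolve(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        // The name this machine reports for itself. If this call throws,
        // the local host name does not resolve and the hosts file needs fixing.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("local: " + local.getHostName() + " -> " + local.getHostAddress());
        // Check every name given on the command line.
        for (String host : args) {
            String ip = resolve(host);
            System.out.println(host + " -> " + (ip == null ? "UNRESOLVED" : ip));
        }
    }
}
```

Running it on each machine with the nimbus, supervisor, and zookeeper host names as arguments should print the IPs from the hosts file; any UNRESOLVED line points at a missing or misspelled entry.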
 
Error 5: When submitting topologies, an exception is thrown because log4j.Logger cannot be serialized.
Solution: Use slf4j instead of log4j.
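
The likely cause is that the logger is held in an instance field of a spout or bolt, so it is pulled in when the object is Java-serialized. The sketch below illustrates that mechanism, and a common alternative to switching libraries: declare the logger static so it is never serialized. It is an illustration under assumptions, not the original fix: it uses java.util.logging.Logger (also not Serializable) as a stand-in for log4j so that it runs without extra jars, and BadBolt, GoodBolt, and canSerialize are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.logging.Logger;

public class LoggerSerializationDemo {
    // Instance field: the logger becomes part of the serialized object graph.
    static class BadBolt implements Serializable {
        private final Logger log = Logger.getLogger("BadBolt");
    }

    // Static field: static state is never serialized, so the logger is skipped.
    static class GoodBolt implements Serializable {
        private static final Logger LOG = Logger.getLogger("GoodBolt");
    }

    // Returns true if the object survives Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) { // includes NotSerializableException
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("instance logger serializable: " + canSerialize(new BadBolt()));
        System.out.println("static logger serializable:   " + canSerialize(new GoodBolt()));
    }
}
```

Marking the field transient has the same effect on serialization, but the field is then null after deserialization and has to be re-created before use.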
 
Error 6: While a bolt is processing messages, "Failing message" appears in the worker log.
 
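Solution: When submitting the topology, set a message timeout that is longer than the original one and long enough for the bolts to finish processing, as shown below:
                     conf.setMessageTimeoutSecs(60); // The default is 30 seconds.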