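Original article; reposting is welcome. Please credit the source when reposting: http://blog.csdn.net/jmppok/article/details/17073397
After I submitted a topology to Storm, it stayed in the assignment state indefinitely. The supervisor log showed: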
2013-12-02 14:49:52 supervisor [INFO] 46b25fa5-b333-4985-9c1d-3f112d5c615a still hasn't started
2013-12-02 14:49:52 supervisor [INFO] 46b25fa5-b333-4985-9c1d-3f112d5c615a still hasn't started
2013-12-02 14:49:53 supervisor [INFO] 46b25fa5-b333-4985-9c1d-3f112d5c615a still hasn't started
2013-12-02 14:49:53 supervisor [INFO] 46b25fa5-b333-4985-9c1d-3f112d5c615a still hasn't started
2013-12-02 14:49:54 supervisor [INFO] 46b25fa5-b333-4985-9c1d-3f112d5c615a still hasn't started
2013-12-02 14:49:54 supervisor [INFO] 46b25fa5-b333-4985-9c1d-3f112d5c615a still hasn't started
2013-12-02 14:49:55 supervisor [INFO] 46b25fa5-b333-4985-9c1d-3f112d5c615a still hasn't started
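Only when this message keeps repeating without end does it mean that the worker responsible for running the tasks has failed to start.
The worker log contains the detailed error: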
2013-12-02 13:28:02 worker [ERROR] Error on initialization of server mk-worker
org.zeromq.ZMQException: Invalid argument(0x16)
    at org.zeromq.ZMQ$Socket.connect(Native Method)
    at zilch.mq$connect.invoke(mq.clj:74)
    at backtype.storm.messaging.zmq.ZMQContext.connect(zmq.clj:65)
    at backtype.storm.daemon.worker$mk_refresh_connections$this__4293$iter__4300__4304$fn__4305.invoke(worker.clj:244)
    at clojure.lang.LazySeq.sval(LazySeq.java:42)
    at clojure.lang.LazySeq.seq(LazySeq.java:60)
    at clojure.lang.RT.seq(RT.java:473)
    at clojure.core$seq.invoke(core.clj:133)
    at clojure.core$dorun.invoke(core.clj:2725)
    at clojure.core$doall.invoke(core.clj:2741)
    at backtype.storm.daemon.worker$mk_refresh_connections$this__4293.invoke(worker.clj:238)
    at backtype.storm.daemon.worker$fn__4348$exec_fn__1228__auto____4349.invoke(worker.clj:351)
    at clojure.lang.AFn.applyToHelper(AFn.java:185)
    at clojure.lang.AFn.applyTo(AFn.java:151)
    at clojure.core$apply.invoke(core.clj:601)
    at backtype.storm.daemon.worker$fn__4348$mk_worker__4404.doInvoke(worker.clj:323)
    at clojure.lang.RestFn.invoke(RestFn.java:512)
    at backtype.storm.daemon.worker$_main.invoke(worker.clj:433)
    at clojure.lang.AFn.applyToHelper(AFn.java:172)
    at clojure.lang.AFn.applyTo(AFn.java:151)
    at backtype.storm.daemon.worker.main(Unknown Source)
2013-12-02 14:49:51 supervisor [INFO] Launching worker with command: java -server -Xmx768m -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dlogfile.name=worker-6703.log -Dstorm.home=/opt/storm -Dlog4j.configuration=storm.log.properties -cp /opt/storm/storm-0.8.2.jar:/opt/storm/lib/commons-exec-1.1.jar:/opt/storm/lib/jetty-util-6.1.26.jar:/opt/storm/lib/minlog-1.2.jar:/opt/storm/lib/snakeyaml-1.9.jar:/opt/storm/lib/clj-time-0.4.1.jar:/opt/storm/lib/compojure-1.1.3.jar:/opt/storm/lib/curator-framework-1.0.1.jar:/opt/storm/lib/joda-time-2.0.jar:/opt/storm/lib/reflectasm-1.07-shaded.jar:/opt/storm/lib/log4j-1.2.16.jar:/opt/storm/lib/json-simple-1.1.jar:/opt/storm/lib/jline-0.9.94.jar:/opt/storm/lib/hiccup-0.3.6.jar:/opt/storm/lib/slf4j-log4j12-1.5.8.jar:/opt/storm/lib/clojure-1.4.0.jar:/opt/storm/lib/asm-4.0.jar:/opt/storm/lib/carbonite-1.5.0.jar:/opt/storm/lib/servlet-api-2.5.jar:/opt/storm/lib/servlet-api-2.5-20081211.jar:/opt/storm/lib/disruptor-2.10.1.jar:/opt/storm/lib/ring-servlet-0.3.11.jar:/opt/storm/lib/junit-3.8.1.jar:/opt/storm/lib/ring-jetty-adapter-0.3.11.jar:/opt/storm/lib/core.incubator-0.1.0.jar:/opt/storm/lib/tools.macro-0.1.0.jar:/opt/storm/lib/math.numeric-tower-0.0.1.jar:/opt/storm/lib/zookeeper-3.3.3.jar:/opt/storm/lib/curator-client-1.0.1.jar:/opt/storm/lib/libthrift7-0.7.0.jar:/opt/storm/lib/tools.cli-0.2.2.jar:/opt/storm/lib/tools.logging-0.2.3.jar:/opt/storm/lib/jgrapht-0.8.3.jar:/opt/storm/lib/kryo-2.17.jar:/opt/storm/lib/guava-13.0.jar:/opt/storm/lib/commons-logging-1.1.1.jar:/opt/storm/lib/ring-core-1.1.5.jar:/opt/storm/lib/commons-codec-1.4.jar:/opt/storm/lib/httpclient-4.1.1.jar:/opt/storm/lib/commons-lang-2.5.jar:/opt/storm/lib/commons-io-1.4.jar:/opt/storm/lib/slf4j-api-1.5.8.jar:/opt/storm/lib/jetty-6.1.26.jar:/opt/storm/lib/jzmq-2.1.0.jar:/opt/storm/lib/httpcore-4.1.jar:/opt/storm/lib/clout-1.0.1.jar:/opt/storm/lib/commons-fileupload-1.2.1.jar:/opt/storm/lib/objenesis-1.2.jar:/opt/storm/log4j:/opt/storm/conf:/tmp/storm_tmp/supervisor/stormdist/mytest-2-1385966991/stormjar.jar backtype.storm.daemon.worker mytest-2-1385966991 dc89a2b5-267f-4ed8-b94a-f900ed6300e4 6703 0916c7a9-c47d-43ae-9d88-13ec574ee5e6
2013-12-02 14:49:51 supervisor [INFO] 0916c7a9-c47d-43ae-9d88-13ec574ee5e6 still hasn't started
2013-12-02 14:49:52 supervisor [INFO] 0916c7a9-c47d-43ae-9d88-13ec574ee5e6 still hasn't started
2013-12-02 14:49:52 supervisor [INFO] 0916c7a9-c47d-43ae-9d88-13ec574ee5e6 still hasn't started
2013-12-02 14:49:53 supervisor [INFO] 0916c7a9-c47d-43ae-9d88-13ec574ee5e6 still hasn't started
2013-12-02 14:49:53 supervisor [INFO] 0916c7a9-c47d-43ae-9d88-13ec574ee5e6 still hasn't started
2013-12-02 14:49:54 supervisor [INFO] 0916c7a9-c47d-43ae-9d88-13ec574ee5e6 still hasn't started
2013-12-02 14:49:54 supervisor [INFO] 0916c7a9-c47d-43ae-9d88-13ec574ee5e6 still hasn't started
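In Storm, ZMQ and ZooKeeper connection errors are usually caused by a misconfigured hosts file on the local machine (for example /etc/hosts on Linux), which makes the connection fail. Make the following changes on every node in the Storm cluster:
1) Add the local machine's IP address and hostname, e.g. 192.168.0.2 node1
2) Add entries for the other hosts in the Storm cluster, e.g. 192.168.0.3 node2
192.168.0.4 node3
This lets ZMQ and ZooKeeper resolve each hostname to the correct host when connecting.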
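To check a node before restarting Storm, the hosts entries can be verified with a small script. The following is only a sketch in Python: the function names `parse_hosts` and `missing_hosts` are made up for illustration, and the sample text reuses the example addresses above. It is not part of Storm. It parses hosts-file text and reports which required cluster hostnames have no entry.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry seen for a name
    return mapping


def missing_hosts(text, required):
    """Return the required hostnames that have no entry in the hosts text."""
    mapping = parse_hosts(text)
    return [host for host in required if host not in mapping]


# Hypothetical hosts content for one node; node3 was forgotten.
sample = """\
127.0.0.1 localhost
192.168.0.2 node1
192.168.0.3 node2
"""

print(missing_hosts(sample, ["node1", "node2", "node3"]))  # prints ['node3']
```

On a real node, the text would be read from the hosts file itself (for example with `open("/etc/hosts").read()` on Linux), and the required list would hold the hostnames of every machine in the cluster.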