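A Storm cluster can be set up on AWS or deployed by hand on the cluster machines. This post deploys everything by hand on a single machine.
Environment: Ubuntu 13.10 64-bit
1. Java installation
Installing Java 1.6 is not covered here.
2. Python installation
Ubuntu ships with Python.
3. ZooKeeper
Standalone (single-server) deployment: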
http://zookeeper.apache.org/doc/r3.3.3/zookeeperStarted.html#sc_InstallingSingleMode
wget http://apache.fayea.com/apache-mirror/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5.tar.gz
chmod a+x zookeeper-3.4.5.tar.gz
tar zxvf zookeeper-3.4.5.tar.gz
Go into the conf directory and create a file named zoo.cfg:
tickTime=2000
dataDir=/var/zookeeper
clientPort=2181
dataDir is the directory where zk stores its data. Make sure the user that runs zk has permission to read and write it.
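Because zoo.cfg is a list of plain key=value lines, the dataDir permission check can be sketched in Java with java.util.Properties. This is only an illustrative pre-flight check: the class ZooCfgCheck and its methods are hypothetical names, not part of ZooKeeper, and the default path conf/zoo.cfg assumes the program is run from the ZooKeeper installation directory.

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

// Hypothetical helper, not part of ZooKeeper: checks the dataDir named in zoo.cfg.
class ZooCfgCheck {

    // Parses zoo.cfg content (key=value lines) and returns the dataDir value, or null if absent.
    static String readDataDir(Reader cfg) throws IOException {
        Properties props = new Properties();
        props.load(cfg);
        return props.getProperty("dataDir");
    }

    // True when the path is an existing directory the current user can read and write.
    static boolean isUsable(String dir) {
        if (dir == null) {
            return false;
        }
        File f = new File(dir);
        return f.isDirectory() && f.canRead() && f.canWrite();
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location: run from the ZooKeeper installation directory.
        String path = args.length > 0 ? args[0] : "conf/zoo.cfg";
        Reader reader = new FileReader(path);
        try {
            String dataDir = readDataDir(reader);
            System.out.println("dataDir=" + dataDir + " usable=" + isUsable(dataDir));
        } finally {
            reader.close();
        }
    }
}
```

Run it as the same user that will start zk. If it reports usable=false, create the directory or adjust its ownership before starting the server.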
Start zk:
bin/zkServer.sh start
Verify that it is running:
echo ruok|nc localhost 2181
A reply of imok means zk is up.
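The same ruok check can be made from Java, which is handy when nc is not installed. The sketch below sends the four-letter command over a plain TCP socket and reads the reply until the server closes the connection. ZkRuok is a hypothetical class name, and the 3-second timeouts are arbitrary choices.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: sends a ZooKeeper four-letter command over a plain TCP socket.
class ZkRuok {

    // Sends the command and returns the server's reply with surrounding whitespace removed.
    static String send(String host, int port, String command) throws IOException {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), 3000); // 3 s connect timeout (arbitrary)
            socket.setSoTimeout(3000);                                // 3 s read timeout (arbitrary)
            OutputStream out = socket.getOutputStream();
            out.write(command.getBytes("US-ASCII"));
            out.flush();
            socket.shutdownOutput(); // like nc: signal end of input, then wait for the reply

            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] chunk = new byte[64];
            int n;
            while ((n = in.read(chunk)) != -1) {
                reply.write(chunk, 0, n);
            }
            return reply.toString("US-ASCII").trim();
        } finally {
            socket.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Host and port match the clientPort=2181 setting used in this post.
        String reply = send("localhost", 2181, "ruok");
        System.out.println("imok".equals(reply) ? "ZooKeeper is up" : "unexpected reply: " + reply);
    }
}
```

If zk is not running, send throws an IOException (connection refused) instead of returning a reply.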
4. ZeroMQ
wget http://download.zeromq.org/zeromq-2.1.7.tar.gz
tar -xzf zeromq-2.1.7.tar.gz
cd zeromq-2.1.7
./configure
make
sudo make install
The build may fail because some packages are not installed on Ubuntu. Install whatever is reported missing with sudo apt-get install xxx.
5. JZMQ
git clone https://github.com/nathanmarz/jzmq.git
cd jzmq
./autogen.sh
./configure
make
sudo make install
Installing and configuring git (needed for the clone above):
sudo apt-get install git
git config --global user.name author    # set the user name to author
git config --global user.email [email protected]    # set the user email to [email protected]
Problems encountered while building JZMQ:
(1). make[1]: *** No rule to make target 'classdist_noinst.stamp', needed by 'org/zeromq/ZMQ.class'. Stop.
Fix: create the classdist_noinst.stamp file,
touch src/classdist_noinst.stamp
(2). error: cannot access org.zeromq.ZMQ
Fix: go into the src directory and compile the Java sources by hand
javac -d . org/zeromq/*.java
6. Storm
Download release 0.9.0.1, the latest at the time of writing.
Extract it:
tar zxvf storm-0.9.0.1.tar.gz
Edit the Storm configuration file conf/storm.yaml:
storm.zookeeper.servers:
    - "localhost"
storm.local.dir: "/home/username/storm-0.9.0.1/workdir"
nimbus.host: "localhost"
Notes:
storm.zookeeper.servers: the zk instance used here runs on the local machine, so the value is localhost
nimbus.host: the machine that runs nimbus
Startup:
Start the master node, nimbus
bin/storm nimbus >/dev/null 2>&1 &
Start the worker node, supervisor
bin/storm supervisor >/dev/null 2>&1 &
Start the UI (run this on the nimbus node)
bin/storm ui >/dev/null 2>&1 &
Once the UI is up, the state of the cluster can be viewed at http://localhost:8080.
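Since the daemons are started in the background with their output discarded, a quick way to see whether they came up is to probe their TCP ports. The sketch below is a hypothetical helper (PortProbe is not part of Storm). Port 2181 comes from zoo.cfg and 8080 from the UI address above. Port 6627 is an assumption: it is the usual default of nimbus.thrift.port and is not set anywhere in this post. The supervisor is left out because the sketch only probes these three ports.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: reports whether the daemons accept TCP connections on their ports.
class PortProbe {

    // Returns true if something accepts a connection on host:port within the timeout.
    static boolean isOpen(String host, int port, int timeoutMillis) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, unreachable, or timed out
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // nothing useful to do if close fails
            }
        }
    }

    public static void main(String[] args) {
        String[] names = {"zookeeper", "nimbus", "ui"};
        int[] ports = {2181, 6627, 8080}; // 6627 is an assumed default, see the note above
        for (int i = 0; i < names.length; i++) {
            boolean open = isOpen("localhost", ports[i], 2000);
            System.out.println(names[i] + " port " + ports[i] + ": " + (open ? "open" : "closed"));
        }
    }
}
```

An open port only shows that a process is listening. The UI page remains the place to confirm that nimbus and the supervisor have registered with each other.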
7. HelloWorld
This uses the word count example from the book Getting Started with Storm:
https://github.com/storm-book/examples-ch02-getting_started/zipball/master
The pom file needs a few changes:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>storm.book</groupId>
  <artifactId>Getting-Started</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.3.2</version>
        <configuration>
          <source>1.6</source>
          <target>1.6</target>
          <compilerVersion>1.6</compilerVersion>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <repositories>
    <!-- Repository where we can find the storm dependencies -->
    <repository>
      <id>clojars.org</id>
      <url>http://clojars.org/repo</url>
    </repository>
  </repositories>
  <dependencies>
    <!-- Storm Dependency -->
    <dependency>
      <groupId>storm</groupId>
      <artifactId>storm</artifactId>
      <version>0.9.0.1</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>com.esotericsoftware.kryo</groupId>
      <artifactId>kryo</artifactId>
      <version>2.17</version>
    </dependency>
  </dependencies>
</project>
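The example has no output mechanism of its own, so the results are written to a file here to verify that the program works.
Modify the WordCounter class by adding a field: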
private FileWriter fileWriter;
Modify the prepare method:
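// Requires: import java.io.FileWriter; import java.io.IOException;
@Override
public void prepare(Map stormConf, TopologyContext context) {
    this.counters = new HashMap<String, Integer>();
    this.name = context.getThisComponentId();
    this.id = context.getThisTaskId();
    try {
        // "outFile" is the output path placed in the topology config by TopologyMain
        this.fileWriter = new FileWriter((String) stormConf.get("outFile"));
    } catch (IOException e) {
        // Keep the original exception as the cause so the real reason is not lost
        throw new RuntimeException("Error opening file [" + stormConf.get("outFile") + "]", e);
    }
}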
Modify the execute method:
@Override
public void execute(Tuple input, BasicOutputCollector collector) {
    String str = input.getString(0);
    /**
     * If the word doesn't exist in the map we create an entry
     * for it; otherwise we add 1 to its count
     */
    if (!counters.containsKey(str)) {
        counters.put(str, 1);
    } else {
        Integer c = counters.get(str) + 1;
        counters.put(str, c);
    }
    // Log the whole map after every tuple so the progress can be followed in the file
    if (this.fileWriter != null) {
        try {
            fileWriter.write("Thread " + Thread.currentThread().getName()
                    + " log counters===================" + counters);
            fileWriter.write("\r\n");
            fileWriter.write("====================================================");
            fileWriter.write("\r\n");
            fileWriter.flush();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
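The counting branch of execute can be tried without a cluster. The following Storm-free sketch applies the same containsKey/put logic to the sample words used in this post. WordCountSketch is a hypothetical class, and the lower-casing is an assumption that stands in for the upstream WordNormalizer bolt, judging by the lower-case keys in the result file.

```java
import java.util.Map;
import java.util.TreeMap;

// Storm-free sketch of the counting logic in execute(); names are hypothetical.
class WordCountSketch {

    // Counts whitespace-separated words, ignoring case.
    static Map<String, Integer> count(String text) {
        // TreeMap is used only so the printed order is stable; the bolt itself uses a HashMap.
        Map<String, Integer> counters = new TreeMap<String, Integer>();
        for (String raw : text.split("\\s+")) {
            // Assumption: lower-casing mimics what WordNormalizer does before the counter sees a word.
            String word = raw.trim().toLowerCase();
            if (word.length() == 0) {
                continue;
            }
            if (!counters.containsKey(word)) {
                counters.put(word, 1);
            } else {
                counters.put(word, counters.get(word) + 1);
            }
        }
        return counters;
    }

    public static void main(String[] args) {
        String words = "Storm test are great is an Storm simple application "
                + "but very powerful really Storm is great";
        System.out.println(count(words));
    }
}
```

For these words the map ends with storm=3, is=2 and great=2 and every other word at 1, which is what the last entry of the result file should show as well.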
Modify the main method of TopologyMain so that it submits to the cluster instead of running a LocalCluster:
public class TopologyMain {
    public static void main(String[] args)
            throws InterruptedException, AlreadyAliveException, InvalidTopologyException {
        //Topology definition
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("word-reader", new WordReader());
        builder.setBolt("word-normalizer", new WordNormalizer())
                .shuffleGrouping("word-reader");
        builder.setBolt("word-counter", new WordCounter(), (Number) 1)
                .fieldsGrouping("word-normalizer", new Fields("word"));

        //Configuration
        Config conf = new Config();
        conf.put("wordsFile", args[0]);
        conf.put("outFile", args[1]);
        conf.setDebug(true);

        //Topology run
        conf.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 1);
        conf.setNumWorkers(3);
        StormSubmitter.submitTopology("Getting-Started-Toplogie", conf, builder.createTopology());

        // LocalCluster cluster = new LocalCluster();
        // cluster.submitTopology("Getting-Started-Toplogie", conf, builder.createTopology());
        // Thread.sleep(1000);
        // cluster.shutdown();
    }
}
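Packaging with mvn install produces the file Getting-Started-0.0.1-SNAPSHOT.jar.
Create the data source file words.txt (the run command in this post expects it at /tmp/words.txt):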
Storm test are great is an Storm simple application but very powerful really Storm is great
Submit the topology to the cluster.
Go to the Storm installation directory:
bin/storm jar Getting-Started-0.0.1-SNAPSHOT.jar TopologyMain /tmp/words.txt /tmp/words-result.txt
After the run, a new file words-result.txt appears in /tmp with the following content:
Thread Thread-16-word-counter log counters==================={storm=1}
====================================================
Thread Thread-16-word-counter log counters==================={test=1, storm=1}
====================================================
Thread Thread-16-word-counter log counters==================={are=1, test=1, storm=1}
====================================================
Thread Thread-16-word-counter log counters==================={great=1, are=1, test=1, storm=1}
====================================================
Thread Thread-16-word-counter log counters==================={is=1, great=1, are=1, test=1, storm=1}
====================================================
Thread Thread-16-word-counter log counters==================={is=1, great=1, are=1, test=1, an=1, storm=1}
====================================================
Thread Thread-16-word-counter log counters==================={is=1, great=1, are=1, test=1, an=1, storm=2}
====================================================
Thread Thread-16-word-counter log counters==================={is=1, great=1, are=1, test=1, simple=1, an=1, storm=2}
====================================================
Thread Thread-16-word-counter log counters==================={application=1, is=1, great=1, are=1, test=1, simple=1, an=1, storm=2}
====================================================
Thread Thread-16-word-counter log counters==================={but=1, application=1, is=1, great=1, are=1, test=1, simple=1, an=1, storm=2}
====================================================
Thread Thread-16-word-counter log counters==================={but=1, application=1, is=1, great=1, are=1, test=1, simple=1, an=1, storm=2, very=1}
====================================================
Thread Thread-16-word-counter log counters==================={but=1, application=1, is=1, great=1, are=1, test=1, simple=1, an=1, storm=2, powerful=1, very=1}
====================================================
Thread Thread-16-word-counter log counters==================={really=1, but=1, application=1, is=1, great=1, are=1, test=1, simple=1, an=1, storm=2, powerful=1, very=1}
====================================================
Thread Thread-16-word-counter log counters==================={really=1, but=1, application=1, is=1, great=1, are=1, test=1, simple=1, an=1, storm=3, powerful=1, very=1}
====================================================
Thread Thread-16-word-counter log counters==================={really=1, but=1, application=1, is=2, great=1, are=1, test=1, simple=1, an=1, storm=3, powerful=1, very=1}
====================================================
Thread Thread-16-word-counter log counters==================={really=1, but=1, application=1, is=2, great=2, are=1, test=1, simple=1, an=1, storm=3, powerful=1, very=1}
====================================================
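The file grows by one full copy of the map per tuple, so only its last entry holds the final tally. A small sketch for pulling out that entry is given below. LastCounters is a hypothetical helper, and it assumes the "log counters" prefix written by the modified execute method.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper: extracts the final counters map from the result file.
class LastCounters {

    static final String MARKER = "log counters";

    // Returns the map text from the last line containing MARKER, or null if no line has it.
    static String last(Reader source) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        String result = null;
        String line;
        while ((line = reader.readLine()) != null) {
            int at = line.indexOf(MARKER);
            if (at < 0) {
                continue; // separator line or unrelated text
            }
            int start = at + MARKER.length();
            // Skip the run of '=' characters that follows the marker.
            while (start < line.length() && line.charAt(start) == '=') {
                start++;
            }
            result = line.substring(start).trim();
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String path = args.length > 0 ? args[0] : "/tmp/words-result.txt";
        Reader file = new FileReader(path);
        try {
            System.out.println(last(file));
        } finally {
            file.close();
        }
    }
}
```

Run against the file above, it should print only the map from the final entry, the one with storm=3, is=2 and great=2.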
References:
https://github.com/nathanmarz/storm/wiki/Setting-up-a-Storm-cluster
http://zookeeper.apache.org/doc/r3.3.3/zookeeperStarted.html
http://blog.csdn.net/thermosym/article/details/9254799