Storm wordcount

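Preface:

One spout supplies the data source.

Two bolts: the first splits each sentence it receives into words, and the second counts how often each word occurs.


Create a Java project, then either import the JARs from Storm's lib directory or manage the dependencies with Maven.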
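
Before the Storm classes, it may help to see the same data flow (sentence source, split step, count step) in plain Java. The sketch below is only an illustration and uses no Storm API; the class name WordCountFlowSketch and its methods are made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of the spout -> split -> count data flow.
// Nothing here is Storm API; all names are hypothetical.
class WordCountFlowSketch {

	// Stage 1 (the split bolt's job): one sentence in, its words out.
	static List<String> split(String sentence) {
		List<String> words = new ArrayList<>();
		for (String word : sentence.split(" ")) {
			words.add(word);
		}
		return words;
	}

	// Stage 2 (the count bolt's job): add each word to a running tally.
	static void count(List<String> words, Map<String, Integer> counts) {
		for (String word : words) {
			counts.merge(word, 1, Integer::sum);
		}
	}

	// Feed sentences through both stages, as the spout would over time.
	static Map<String, Integer> run(String... sentences) {
		Map<String, Integer> counts = new LinkedHashMap<>();
		for (String sentence : sentences) {
			count(split(sentence), counts);
		}
		return counts;
	}

	public static void main(String[] args) {
		System.out.println(run("I have a dream", "my dream is to be a data analyst"));
	}
}
```

In the Storm version each stage becomes its own component, and the tuples travel between them instead of being passed as method arguments.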

Spout code:

package com.storm.stu01;

import java.util.Map;
import java.util.Random;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

public class WordSpout extends BaseRichSpout {

	private SpoutOutputCollector collector;
	private static final String[] msgs = new String[] { "I have a dream",
			"my dream is to be a data analyst",
			"you know do what you are dreaming", "don't give up your dreams",
			"it's just so so",
			"We need change the traditional ideas and practice boldly",
			"Storm enterprise real time calculation of actual combat",
			"you can be what uou want be" };
	private static final Random random = new Random();

	@Override
	public void open(Map conf, TopologyContext context,
			SpoutOutputCollector collector) {
		// Keep the collector so that nextTuple() can emit tuples.
		this.collector = collector;
	}

	@Override
	public void nextTuple() {
		// Emit one randomly chosen sentence roughly once per second.
		Utils.sleep(1000);
		String sentence = msgs[random.nextInt(msgs.length)];
		collector.emit(new Values(sentence));
	}

	@Override
	public void declareOutputFields(OutputFieldsDeclarer declarer) {
		// Each emitted tuple has a single field named "sentence".
		declarer.declare(new Fields("sentence"));
	}

}

Sentence-splitting bolt code:

package com.storm.stu01;

import java.util.Map;

import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.IBasicBolt;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class SplitSentenceBolt implements IBasicBolt {

	@Override
	public void declareOutputFields(OutputFieldsDeclarer declarer) {
		// Each emitted tuple has a single field named "word".
		declarer.declare(new Fields("word"));
	}

	@Override
	public Map getComponentConfiguration() {
		// No component-specific configuration.
		return null;
	}

	@Override
	public void prepare(Map stormConf, TopologyContext context) {
		// Nothing to set up: this bolt keeps no state.
	}

	@Override
	public void execute(Tuple input, BasicOutputCollector collector) {
		// Split the incoming sentence on single spaces and emit one tuple per word.
		String sentence = input.getString(0);
		for (String word : sentence.split(" ")) {
			collector.emit(new Values(word));
		}
	}

	@Override
	public void cleanup() {
		// Nothing to release.
	}

}
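
The sample sentences in WordSpout are all single-spaced, so split(" ") is enough for them. With less tidy input (double spaces, tabs, leading blanks) that call produces empty strings, which would then be counted as words. The following plain-Java sketch compares the two behaviours; SplitSketch and both method names are made up for illustration and are not part of the original project.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper comparing two ways of splitting a sentence.
class SplitSketch {

	// Same rule as SplitSentenceBolt: a single space is the delimiter.
	static List<String> splitOnSpace(String sentence) {
		List<String> words = new ArrayList<>();
		for (String word : sentence.split(" ")) {
			words.add(word);
		}
		return words;
	}

	// Alternative: treat any run of whitespace as one delimiter
	// and drop empty tokens.
	static List<String> splitOnWhitespace(String sentence) {
		List<String> words = new ArrayList<>();
		for (String word : sentence.trim().split("\\s+")) {
			if (!word.isEmpty()) {
				words.add(word);
			}
		}
		return words;
	}

	public static void main(String[] args) {
		System.out.println(splitOnSpace("so  so"));      // contains an empty token
		System.out.println(splitOnWhitespace("so  so")); // only the two words
	}
}
```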


Word-count bolt code:

The listing for WordCountBolt is not reproduced here. As the topology code and the console output further down show, this bolt subscribes to the "word" stream of SplitSentenceBolt, keeps a running count for every word in a map, prints that map after each update, and emits a two-value tuple such as [traditional, 1].
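
The topology code refers to a WordCountBolt class. What follows is a minimal sketch of the counting logic such a bolt needs, written in plain Java so that it runs without Storm on the classpath. It is not the author's implementation: WordCounterSketch, add, and snapshot are hypothetical names, and the printed line merely imitates the "count:" lines in the console output.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper holding the state a word-count bolt would keep.
class WordCounterSketch {

	private final Map<String, Integer> counts = new HashMap<>();

	// Record one occurrence of the word and return its updated count.
	int add(String word) {
		return counts.merge(word, 1, Integer::sum);
	}

	// Copy of the current tallies, safe to print or inspect.
	Map<String, Integer> snapshot() {
		return new HashMap<>(counts);
	}

	public static void main(String[] args) {
		WordCounterSketch counter = new WordCounterSketch();
		for (String word : "you can be what uou want be".split(" ")) {
			int count = counter.add(word);
			// A bolt's execute() would print the map and emit (word, count) here.
			System.out.println("====================count:" + counter.snapshot()
					+ " latest=[" + word + ", " + count + "]");
		}
	}
}
```

Inside an IBasicBolt, the same steps would go in execute(Tuple input, BasicOutputCollector collector): read the word with input.getString(0), update the map, then call collector.emit(new Values(word, count)), with declareOutputFields declaring two fields (for example new Fields("word", "count"), names assumed here). Because the map lives in the memory of one task, the totals are only complete while every occurrence of a word reaches the same task.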


Code that builds and submits the topology:

package com.storm.stu01;

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.generated.AlreadyAliveException;
import org.apache.storm.generated.AuthorizationException;
import org.apache.storm.generated.InvalidTopologyException;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;

public class Tese {
	public static void main(String[] args) throws AlreadyAliveException,
			InvalidTopologyException, AuthorizationException {
		TopologyBuilder builder = new TopologyBuilder();
		builder.setSpout("1", new WordSpout(), 1);
		builder.setBolt("2", new SplitSentenceBolt(), 10).shuffleGrouping("1");
		builder.setBolt("3", new WordCountBolt(), 1).fieldsGrouping("2",
				new Fields("word"));

		Config config = new Config();
		config.setDebug(true);
		config.setNumWorkers(2);

//		// Local mode
//		LocalCluster cluster = new LocalCluster();
//		cluster.submitTopology("mywordcount", config, builder.createTopology());

		// Cluster mode
		StormSubmitter.submitTopology("wordcount", config,
				builder.createTopology());

	}
}
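
The fieldsGrouping("2", new Fields("word")) call asks Storm to route tuples with the same "word" value to the same task of bolt "3". Conceptually this is a hash of the grouping values taken modulo the number of target tasks. The sketch below is a simplified model of that idea in plain Java, not Storm's actual code; FieldsGroupingSketch and chooseTask are made-up names.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of fields grouping: equal grouping values
// always map to the same task index.
class FieldsGroupingSketch {

	// Pick a task index in [0, numTasks) from the grouping-field values.
	static int chooseTask(List<?> groupingValues, int numTasks) {
		return Math.floorMod(groupingValues.hashCode(), numTasks);
	}

	public static void main(String[] args) {
		int tasks = 4; // assumed parallelism, for illustration only
		for (String word : Arrays.asList("dream", "be", "dream")) {
			System.out.println(word + " -> task " + chooseTask(Arrays.asList(word), tasks));
		}
	}
}
```

In the topology above, bolt "3" has a parallelism of 1, so every word reaches the same task whatever the grouping. The grouping starts to matter once that number is raised, because each task then holds only the counts for its own share of the words.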


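How to run:

1. In local mode (with the LocalCluster lines uncommented and the StormSubmitter call commented out), simply run the main class.

2. In distributed mode, package the project into a JAR.

Then run the following on the Nimbus server:

./storm jar <path-to-packaged-jar> <fully-qualified-main-class> <topology-name>

Note that main() as written ignores its arguments and always submits the topology under the name "wordcount".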


Results:

In local mode, the console shows output like the following:

====================count:{need=1, can=2, dreams=1, have=1, give=1, change=1, I=1, Storm=1, calculation=1, real=1, of=1, time=1, We=1, combat=1, uou=2, want=2, be=4, just=1, so=2, dream=1, a=1, your=1, enterprise=1, don't=1, traditional=1, you=2, it's=1, the=1, up=1, what=2, actual=1}
13140 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.task - Emitting: 3 default [traditional, 1]
13141 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - BOLT ack TASK: 12 TIME:  TUPLE: source: 2:4, stream: default, id: {}, [traditional]
13141 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Execute done TUPLE source: 2:4, stream: default, id: {}, [traditional] TASK: 12 DELTA: 
13141 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Processing received message FOR 12 TUPLE: source: 2:4, stream: default, id: {}, [ideas]
====================count:{need=1, can=2, dreams=1, have=1, give=1, change=1, I=1, Storm=1, calculation=1, real=1, of=1, time=1, We=1, combat=1, uou=2, want=2, be=4, just=1, ideas=1, so=2, dream=1, a=1, your=1, enterprise=1, don't=1, traditional=1, you=2, it's=1, the=1, up=1, what=2, actual=1}
13142 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.task - Emitting: 3 default [ideas, 1]
13142 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - BOLT ack TASK: 12 TIME:  TUPLE: source: 2:4, stream: default, id: {}, [ideas]
13142 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Execute done TUPLE source: 2:4, stream: default, id: {}, [ideas] TASK: 12 DELTA: 
13142 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Processing received message FOR 12 TUPLE: source: 2:4, stream: default, id: {}, [and]
====================count:{need=1, can=2, dreams=1, have=1, give=1, change=1, I=1, Storm=1, calculation=1, real=1, of=1, time=1, We=1, combat=1, uou=2, want=2, be=4, just=1, ideas=1, so=2, dream=1, a=1, your=1, enterprise=1, don't=1, traditional=1, you=2, it's=1, the=1, and=1, up=1, what=2, actual=1}
13143 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.task - Emitting: 3 default [and, 1]
13143 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - BOLT ack TASK: 12 TIME:  TUPLE: source: 2:4, stream: default, id: {}, [and]
13143 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Execute done TUPLE source: 2:4, stream: default, id: {}, [and] TASK: 12 DELTA: 
13144 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Processing received message FOR 12 TUPLE: source: 2:4, stream: default, id: {}, [practice]
====================count:{need=1, can=2, dreams=1, have=1, give=1, change=1, I=1, Storm=1, calculation=1, real=1, of=1, time=1, We=1, combat=1, uou=2, want=2, be=4, just=1, ideas=1, so=2, dream=1, a=1, your=1, enterprise=1, don't=1, traditional=1, you=2, it's=1, the=1, and=1, up=1, what=2, practice=1, actual=1}
13144 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.task - Emitting: 3 default [practice, 1]
13144 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - BOLT ack TASK: 12 TIME:  TUPLE: source: 2:4, stream: default, id: {}, [practice]
13145 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Execute done TUPLE source: 2:4, stream: default, id: {}, [practice] TASK: 12 DELTA: 
13145 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Processing received message FOR 12 TUPLE: source: 2:4, stream: default, id: {}, [boldly]
====================count:{need=1, can=2, dreams=1, have=1, give=1, change=1, boldly=1, I=1, Storm=1, calculation=1, real=1, of=1, time=1, We=1, combat=1, uou=2, want=2, be=4, just=1, ideas=1, so=2, dream=1, a=1, your=1, enterprise=1, don't=1, traditional=1, you=2, it's=1, the=1, and=1, up=1, what=2, practice=1, actual=1}
13145 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.task - Emitting: 3 default [boldly, 1]
13146 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - BOLT ack TASK: 12 TIME:  TUPLE: source: 2:4, stream: default, id: {}, [boldly]
13146 [Thread-38-3-executor[12 12]] INFO  o.a.s.d.executor - Execute done TUPLE source: 2:4, strea

