Spark Streaming sample

Set up the Kafka environment

Download Kafka

wget http://mirrors.tuna.tsinghua.edu.cn/apache/kafka/0.11.0.0/kafka_2.11-0.11.0.0.tgz
tar zxvf kafka_2.11-0.11.0.0.tgz
cd kafka_2.11-0.11.0.0

Start ZooKeeper

bin/zookeeper-server-start.sh config/zookeeper.properties &

Edit the configuration file config/server.properties and add the following lines:

host.name=localhost
advertised.host.name=localhost

Start the Kafka server

bin/kafka-server-start.sh config/server.properties &

Create a topic

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
On success, the command returns test.

List topics

bin/kafka-topics.sh --list --zookeeper localhost:2181

Spark

Run KafkaWordCountProducer

bin/run-example org.apache.spark.examples.streaming.KafkaWordCountProducer localhost:9092 test 3 5
localhost:9092 is the address and port of the Kafka broker the producer connects to
test is the topic
3 is the number of messages sent per second
5 is the number of words in each message
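
For orientation, here is a minimal Scala sketch of what a producer driven by these four arguments could look like. It is not the actual example source; SimpleWordProducer and its argument handling are hypothetical, and it assumes the plain kafka-clients KafkaProducer API with messages made of random digits.

import java.util.Properties
import scala.util.Random
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Hypothetical sketch: every second send <messagesPerSec> messages,
// each containing <wordsPerMessage> random "words", to <topic> on <brokers>.
object SimpleWordProducer {
  def main(args: Array[String]): Unit = {
    val Array(brokers, topic, messagesPerSec, wordsPerMessage) = args

    val props = new Properties()
    props.put("bootstrap.servers", brokers)  // e.g. localhost:9092
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](props)

    while (true) {
      (1 to messagesPerSec.toInt).foreach { _ =>
        // One message: <wordsPerMessage> space-separated random digits.
        val message = (1 to wordsPerMessage.toInt).map(_ => Random.nextInt(10).toString).mkString(" ")
        producer.send(new ProducerRecord[String, String](topic, message))
      }
      Thread.sleep(1000)  // crude pacing to roughly <messagesPerSec> messages per second
    }
  }
}

A quick way to confirm messages are arriving is to run bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test from the Kafka directory.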

Run KafkaWordCount

bin/run-example org.apache.spark.examples.streaming.KafkaWordCount localhost:2181 test-consumer-group test 1
localhost:2181 is the address and port ZooKeeper listens on
test-consumer-group is the name of the consumer group; it must match the group.id setting in $KAFKA_HOME/config/consumer.properties
test is the topic
1 is the number of consumer threads for the topic
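
For comparison, a minimal Scala sketch of a streaming word count driven by the same four arguments. It assumes the receiver-based KafkaUtils.createStream API from the spark-streaming-kafka-0-8 connector; the bundled example also applies a sliding window, which is omitted here, and SimpleKafkaWordCount is a hypothetical name.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Hypothetical sketch: consume <topics> via <zkQuorum> as consumer group <group>,
// using <numThreads> receiver threads per topic, and count words in each 2-second batch.
object SimpleKafkaWordCount {
  def main(args: Array[String]): Unit = {
    val Array(zkQuorum, group, topics, numThreads) = args

    val conf = new SparkConf().setAppName("SimpleKafkaWordCount")
    val ssc = new StreamingContext(conf, Seconds(2))

    // Map each topic to the number of receiver threads to use for it.
    val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap
    val lines = KafkaUtils.createStream(ssc, zkQuorum, group, topicMap).map(_._2)

    // Split each message into words and count them within the batch.
    val wordCounts = lines.flatMap(_.split(" ")).map(x => (x, 1L)).reduceByKey(_ + _)
    wordCounts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}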
