Spark Streaming Kafka Consumer Example

1. Component Versions

Spark version: spark-2.1.1-bin-hadoop2.7

Kafka version: kafka_2.11-0.11.0.0

Scala version: 2.11.8

Tip: building against Scala 2.12.x fails with "method does not exist" errors, because the Spark artifacts below are compiled for Scala 2.11.

2. POM Dependencies

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.11</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
            <version>2.2.0</version>
        </dependency>
    </dependencies>

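The _2.11 suffix on each artifactId is the Scala binary version, which is why the Scala 2.12.x build fails as noted in the tip above. One way to keep the suffix and the compiler version aligned is to factor them into Maven properties; this is a sketch, and the property names here are just a common convention, not anything Maven requires:

    <properties>
        <scala.version>2.11.8</scala.version>
        <scala.binary.version>2.11</scala.binary.version>
        <spark.version>2.2.0</spark.version>
    </properties>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
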
3. Example Program

import org.apache.spark.SparkConf
import org.apache.spark.streaming._
import org.apache.spark.streaming.kafka010._


object StreamingTest {
    def main(args: Array[String]): Unit = {
      // Create the StreamingContext with a 10-second batch interval
      val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
      val ssc = new StreamingContext(conf, Seconds(10))

      // Create direct kafka stream with brokers and topics
      val topics="test"
      val brokers="localhost:9092"
      val topicsSet = topics.split(",").toSet
      val kafkaParams = Map[String, String]("bootstrap.servers" -> brokers,
        "value.deserializer" -> "org.apache.kafka.common.serialization.StringDeserializer",
        "key.deserializer" -> "org.apache.kafka.common.serialization.StringDeserializer",
        "group.id" -> "test-consumer-group")
      val messages = KafkaUtils.createDirectStream[String, String](
        ssc,
        LocationStrategies.PreferConsistent,
        ConsumerStrategies.Subscribe[String, String](topicsSet, kafkaParams))

      // Extract the message values and print each batch
      val lines = messages.map(_.value)
      lines.print()

      // Start the computation
      ssc.start()
      ssc.awaitTermination()
    }
}
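
The value DStream can be transformed like any other. As a minimal extension, assuming a whitespace-separated word count is what you are after, the lines.print() call above can be replaced with a split-and-count step, in the style of the classic streaming word count:

// Split each message into words, count each word within the batch, and print
val words = lines.flatMap(_.split(" "))
val wordCounts = words.map(word => (word, 1L)).reduceByKey(_ + _)
wordCounts.print()

For a quick test, messages can be fed into the test topic with the kafka-console-producer.sh script that ships with Kafka, pointed at localhost:9092.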

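One production concern worth noting: the 0-10 direct stream tracks offsets in Kafka itself, and with the parameters above the consumer's default auto-commit behavior applies, so offsets can be committed before a batch has actually been processed. A common pattern, sketched here under the assumption that "enable.auto.commit" -> "false" is added to kafkaParams, is to commit offsets manually after each batch succeeds. This block would be registered before ssc.start(); HasOffsetRanges and CanCommitOffsets come from the kafka010 package already imported above:

// Register per-batch processing plus a manual offset commit
messages.foreachRDD { rdd =>
  // Offset ranges consumed by this batch, one per Kafka partition
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  // ... process the batch here ...

  // Commit the offsets back to Kafka once processing has succeeded
  messages.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}

Note that the HasOffsetRanges cast only works on the RDD obtained directly from createDirectStream, before any transformations, which is why the commit is attached to messages rather than to a derived stream.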