SparkStreaming wordCountDemo: A Basic Example

 

To demonstrate Spark Streaming's second-level, near-real-time processing, we need a source that can feed data in continuously.

1. Install nc on CentOS


Create a Scala project and add the required dependencies to pom.xml



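<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.shiao</groupId>
    <artifactId>spark-01</artifactId>
    <version>1.0</version>

    <packaging>jar</packaging>

    <properties>
        <scala.version>2.11.8</scala.version>
        <hadoop.version>2.7.4</hadoop.version>
        <spark.version>2.0.2</spark.version>
    </properties>

    <dependencies>
        <!-- Scala -->
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>

        <!-- Spark core -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <!-- Hadoop client -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>

        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.30</version>
        </dependency>

        <!-- Spark Streaming -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.11</artifactId>
            <version>2.0.2</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <!-- Compiles the Scala sources -->
            <plugin>
                <groupId>org.scala-tools</groupId>
                <artifactId>maven-scala-plugin</artifactId>
                <version>2.15.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

            <!-- Builds a single jar that bundles all dependencies -->
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <mainClass>WordCount</mainClass>
                        </manifest>
                    </archive>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>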

Create an object


Write the code

 

import org.apache.spark.streaming.dstream.{DStream, ReceiverInputDStream}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.{SparkConf, SparkContext}

object SparkStreamingWordCount {
  def main(args: Array[String]): Unit = {

    // Create the SparkContext. local[2] gives the job two threads:
    // one is occupied by the socket receiver, the other processes the batches.
    val conf = new SparkConf().setAppName("SparkStreamingWordCount").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Create the StreamingContext with a 5-second batch interval
    val ssc = new StreamingContext(sc, Seconds(5))

    // Suppress INFO logs so the results are easy to read
    sc.setLogLevel("WARN")

    // Create a receiver that reads text lines from the socket
    val lines: ReceiverInputDStream[String] = ssc.socketTextStream("192.168.52.110", 9999)

    // Split each line on commas, map every word to (word, 1),
    // sum the counts per word within the batch, then print the result
    val wordCounts: DStream[(String, Int)] = lines.flatMap(_.split(",")).map((_, 1)).reduceByKey(_ + _)
    wordCounts.print()

    // Start the computation and block the main thread so data keeps being received
    ssc.start()
    ssc.awaitTermination()
  }
}
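To see what the counting step computes for a single batch, here is a minimal sketch on plain Scala collections, with no Spark involved. The object name `BatchWordCount` and the sample input are made up for illustration; `groupBy` plus a sum stands in for `reduceByKey`.

```scala
object BatchWordCount {
  // Counts comma-separated words in one batch of lines, mirroring the
  // flatMap -> map -> reduceByKey chain on ordinary collections.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(","))          // one entry per word
      .map(word => (word, 1))         // pair every word with a count of 1
      .groupBy(_._1)                  // stand-in for reduceByKey: group by word...
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // ...and sum the 1s

  def main(args: Array[String]): Unit = {
    // Hypothetical lines that might arrive within one 5-second batch
    val batch = Seq("spark,hadoop,spark", "hive,hadoop")
    countWords(batch).toSeq.sortBy(_._1).foreach(println)
  }
}
```

Because `reduceByKey` on a DStream works batch by batch, the streaming job starts from zero in every 5-second interval; the counts are not accumulated across batches.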

Run the program


 

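Use nc to open a socket on port 9999 and send data

Test: type comma-separated words into the nc session. The word counts for each 5-second batch are printed in the application's console.
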
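With nc, the listener is typically started as `nc -lk 9999`, as in the Spark Streaming quick example. If nc is not available on the machine, a small Scala stand-in might look like the sketch below. `LineServer` and `serveLines` are hypothetical names, and the sketch assumes a single client (the Spark receiver) and that it runs on the host the job connects to (192.168.52.110 in the code above).

```scala
import java.io.PrintWriter
import java.net.ServerSocket
import scala.io.StdIn

object LineServer {
  // Waits for one client to connect, then writes every line from `lines` to it.
  def serveLines(server: ServerSocket, lines: Iterator[String]): Unit = {
    val client = server.accept()
    // autoFlush = true so each println is sent immediately
    val out = new PrintWriter(client.getOutputStream, true)
    try lines.foreach(line => out.println(line))
    finally { out.close(); client.close() }
  }

  def main(args: Array[String]): Unit = {
    // Port 9999 matches the socketTextStream call; override it with the first argument
    val port = if (args.nonEmpty) args(0).toInt else 9999
    val server = new ServerSocket(port)
    println(s"Listening on port $port")
    // Forward everything typed on stdin until end of input (Ctrl-D)
    try serveLines(server, Iterator.continually(StdIn.readLine()).takeWhile(_ != null))
    finally server.close()
  }
}
```

Start it before the streaming job, then type lines such as `spark,hadoop,spark` into its console.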

 
