Building a Standalone Spark Application | Building and Packaging with sbt

Learn how to use sbt to build and package a simple word-count example.

Step 1: Create the Scala word-count application

WordCount.scala

/**
 * Illustrates flatMap + reduceByKey for word count.
 */
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
    def main(args: Array[String]): Unit = {
      // args(0): input file path, args(1): output directory.
      val inputFile = args(0)
      val outputFile = args(1)
      val conf = new SparkConf().setAppName("wordCount")
      // Create a Scala Spark Context.
      val sc = new SparkContext(conf)
      // Load our input data.
      val input = sc.textFile(inputFile)
      // Split each line into words on single spaces.
      val words = input.flatMap(line => line.split(" "))
      // Pair each word with 1, then sum the 1s per word.
      val counts = words.map(word => (word, 1)).reduceByKey(_ + _)
      // Save the word count back out to a text file, causing evaluation.
      counts.saveAsTextFile(outputFile)
      // Release the context once the job has finished.
      sc.stop()
    }
}
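To see what the three transformations do without starting Spark, here is a minimal sketch that applies the same steps to an in-memory Seq. It uses plain Scala collections, not the RDD API: groupBy plus sum stands in for reduceByKey, and the names LocalWordCount and wordCount are made up for this illustration.

```scala
object LocalWordCount {
  // Same three steps as the Spark job, but on an in-memory Seq instead of an RDD.
  def wordCount(lines: Seq[String]): Map[String, Int] = {
    // Split each line into words on single spaces.
    val words = lines.flatMap(line => line.split(" "))
    // Pair each word with 1.
    val pairs = words.map(word => (word, 1))
    // Local stand-in for reduceByKey: group the pairs by word, then sum the 1s.
    pairs.groupBy(_._1).map { case (word, ps) => (word, ps.map(_._2).sum) }
  }

  def main(args: Array[String]): Unit = {
    val counts = wordCount(Seq("to be or", "not to be"))
    // Sort by word only to make the printed order predictable.
    counts.toSeq.sortBy(_._1).foreach(println)
  }
}
```

Because the split is on single spaces only, punctuation stays attached to words, and two consecutive spaces produce an empty-string "word" that gets counted like any other.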

Step 2: Create the sbt build configuration

build.sbt

name := "learning-spark-mini-example"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0" % "provided"
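In the dependency line above, %% tells sbt to append the Scala binary version to the artifact name, and "provided" keeps Spark out of the packaged jar because spark-submit supplies it at run time. As a sketch, assuming the 2.11.x scalaVersion set above, the same dependency can be written with the suffix spelled out:

```scala
// build.sbt fragment: equivalent to the %% form when scalaVersion is 2.11.x.
// %% resolves "spark-core" to the artifact "spark-core_2.11".
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.2.0" % "provided"
```

The same _2.11 suffix shows up in the name of the jar that sbt package produces for this project.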

Step 3: Project layout

./
./build.sbt
./src/
./src/main
./src/main/scala
./src/main/scala/WordCount.scala

Step 4: Build and run

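Run the following from the project's root directory. sbt package writes the jar under target/scala-2.11/, so the jar path passed to spark-submit has to match the directory the command is run from: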

sbt package

$SPARK_HOME/bin/spark-submit \
--class "WordCount" \
--master local \
learning-spark-mini-example_2.11-1.0.jar \
../opt/modules/spark-2.2.1-bin-hadoop2.7/README.md \
./wordcounts
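One practical point when running the command more than once: saveAsTextFile fails if the output directory already exists, so ./wordcounts has to be removed before each rerun. Below is a minimal sketch of a cleanup helper, assuming the output is on the local filesystem (not HDFS). The names CleanOutputDir and deleteRecursively are hypothetical and not part of Spark.

```scala
import java.nio.file.{Files, Path, Paths}
import scala.collection.JavaConverters._

object CleanOutputDir {
  // Delete a local directory tree if it exists; returns true if anything was removed.
  def deleteRecursively(dir: Path): Boolean = {
    if (!Files.exists(dir)) false
    else {
      val stream = Files.walk(dir)
      try {
        // Files.walk lists parents before their children, so reverse the list
        // to delete children first.
        stream.iterator().asScala.toList.reverse.foreach(p => Files.delete(p))
      } finally stream.close()
      true
    }
  }

  def main(args: Array[String]): Unit = {
    // "./wordcounts" mirrors the output path used in the spark-submit command.
    val out = Paths.get(if (args.nonEmpty) args(0) else "./wordcounts")
    val removed = deleteRecursively(out)
    println(if (removed) s"Removed $out" else s"Nothing to remove at $out")
  }
}
```

Deleting the directory by hand from the shell works just as well; the helper only matters if the cleanup should be scripted alongside the build.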

Step 5: Word-count results

[elon@hadoop scala]$ cd wordcounts/
[elon@hadoop wordcounts]$ ls
part-00000  _SUCCESS
[elon@hadoop wordcounts]$ cat part-00000 
(package,1)
(For,3)
(Programs,1)
(processing.,1)
(Because,1)
(The,1)
(page](http://spark.apache.org/documentation.html).,1) 
......
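Each line of part-00000 is simply the string form of a (word, count) tuple. To rank the words, the lines can be read back and parsed. Here is a minimal sketch in plain Scala. ParseCounts, parseLine and topWords are names invented for this example, and splitting on the last comma is a choice made here because a word may itself contain commas or parentheses, as the documentation link in the output does.

```scala
import scala.util.Try

object ParseCounts {
  // Parse one output line such as "(For,3)" into ("For", 3).
  // Returns None for lines that are not in "(word,count)" form.
  def parseLine(line: String): Option[(String, Int)] = {
    val t = line.trim
    if (t.length < 2 || !t.startsWith("(") || !t.endsWith(")")) None
    else {
      val body = t.substring(1, t.length - 1)
      // Split on the LAST comma: everything before it is the word.
      val i = body.lastIndexOf(',')
      if (i < 0) None
      else Try(body.substring(i + 1).toInt).toOption.map(n => (body.substring(0, i), n))
    }
  }

  // Keep the parseable lines and return the n most frequent words.
  def topWords(lines: Seq[String], n: Int): Seq[(String, Int)] =
    lines.flatMap(l => parseLine(l)).sortBy(p => -p._2).take(n)

  def main(args: Array[String]): Unit = {
    // Sample lines copied from the output above.
    val sample = Seq("(package,1)", "(For,3)", "(Programs,1)", "......")
    topWords(sample, 2).foreach(println)
  }
}
```

To run it against the real output, the sample Seq can be replaced with the lines of part-00000, for example via scala.io.Source.fromFile.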

Please credit the source when reposting: http://blog.csdn.net/coder__cs/article/details/78992764
This article is from elon33's blog.
