Implementing the Classic MapReduce WordCount Example in Scala

Without further ado, here is the code. It contains a detailed step-by-step version and a condensed version:

object WordCount {
    def main(args: Array[String]): Unit = {
        val lines = List("qiusuo zhao hello spark", "zhao spark scala qiusuo zhao hello")
        // 1. Split each line on spaces to get the individual words
        val res1: List[String] = lines.flatMap((x: String) => x.split(" "))
        // 2. Reshape each element: word => (word, 1)
        val res2: List[(String, Int)] = res1.map((x: String) => (x, 1))
        // 3. Group the pairs by word
        val res3: Map[String, List[(String, Int)]] = res2.groupBy((x: (String, Int)) => x._1)
        // 4. Count the occurrences of each word (the size of its group)
        val res4: Map[String, Int] = res3.map((x: (String, List[(String, Int)])) => (x._1, x._2.size))
        // 5. Sort by count, ascending
        val res5: List[(String, Int)] = res4.toList.sortBy((x: (String, Int)) => x._2)
        // 6. Print the result
        println(res5)

        // Condensed version: the same pipeline as one chained expression
        val res: List[(String, Int)] = lines.flatMap(_.split(" ")).map((_, 1)).groupBy(_._1)
            .map(x => (x._1, x._2.size)).toList.sortBy(_._2)
        println("Condensed version: res=" + res)
    }
}
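
One thing to watch in the listing above: `sortBy(_._2)` orders by count only, and the iteration order of a `Map` is not specified, so words with equal counts can come out in any relative order. Below is a minimal sketch of one way to make the output deterministic, while also counting in a single pass with `foldLeft` instead of `groupBy`. The names `WordCountSketch`, `countWords`, and `ranked` are made up for this illustration and are not part of the original code; the descending order and the whitespace regex are also my own choices.

```scala
object WordCountSketch {
  // Hypothetical helper: count words in one pass over the tokens,
  // accumulating the counts in an immutable Map.
  def countWords(lines: List[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))   // "map" side: break each line into tokens
      .filter(_.nonEmpty)         // drop the empty token that leading whitespace produces
      .foldLeft(Map.empty[String, Int]) { (acc, word) =>
        // "reduce" side: bump the running count for this word
        acc.updated(word, acc.getOrElse(word, 0) + 1)
      }

  // Hypothetical helper: sort by descending count, breaking ties
  // alphabetically so the result does not depend on Map iteration order.
  def ranked(counts: Map[String, Int]): List[(String, Int)] =
    counts.toList.sortBy { case (word, count) => (-count, word) }

  def main(args: Array[String]): Unit = {
    val lines = List("qiusuo zhao hello spark", "zhao spark scala qiusuo zhao hello")
    // prints List((zhao,3), (hello,2), (qiusuo,2), (spark,2), (scala,1))
    println(ranked(countWords(lines)))
  }
}
```

The `groupBy` version in the original mirrors the MapReduce shuffle step more literally, which is probably why it is the usual teaching form. The fold variant simply avoids building the intermediate groups.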
