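Spark Learning Series, Part 2: WordCount

WordCount is the classic first example.

First upload a file to HDFS, then start spark-shell and run the computation there.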

[bdpos@BJHC-Client-18562 spark]$ hdfs dfs -mkdir /spark/input
[bdpos@BJHC-Client-18562 spark]$ hdfs dfs -put ./README.md /spark/input
[bdpos@BJHC-Client-18562 spark]$ hdfs dfs -ls /spark/input
Found 1 items
-rw-r--r--   2 bdpos supergroup       3818 2018-03-20 19:07 /spark/input/README.md
scala> sc.textFile("/spark/input/README.md").flatMap(line => line.split(" ")).map(word => (word,1)).reduceByKey(_+_).sortBy(t=>t._2,false).take(10)
res9: Array[(String, Int)] = Array(("",71), (the,24), (to,17), (Spark,16), (for,12), (and,9), (a,8), (##,8), (run,7), (on,7))
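
The top entry in the result, ("",71), is an empty string. It appears because split(" ") returns empty tokens for consecutive spaces and for blank lines. Below is a minimal sketch of the same pipeline that splits on runs of whitespace and drops empty tokens. It uses plain Scala collections so it runs without a cluster; topWords and the sample lines are hypothetical names and data made up for illustration, and groupBy stands in locally for reduceByKey.

```scala
// Plain Scala collections sketch of the word-count pipeline (no Spark required).
// topWords is a hypothetical helper name, not part of the Spark API.
def topWords(lines: Seq[String], n: Int): Seq[(String, Int)] =
  lines
    .flatMap(_.split("\\s+").toSeq)   // split each line on runs of whitespace
    .filter(_.nonEmpty)               // drop empty tokens (e.g. from leading whitespace)
    .groupBy(identity)                // local stand-in for reduceByKey
    .map { case (word, occurrences) => (word, occurrences.size) }
    .toSeq
    .sortBy { case (word, count) => (-count, word) } // count descending, ties broken by word
    .take(n)

// Made-up sample input, not the README.md used above.
val sample = Seq("the quick  brown fox", "  the lazy dog", "the fox")
val result = topWords(sample, 2)
println(result) // contains the pairs (the,3) and (fox,2)
```

On the RDD from the transcript, the same two changes should carry over: split("\\s+") in place of split(" "), followed by filter(_.nonEmpty) before the map step.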


