Setting up a Scala and Spark build environment in Eclipse and uploading to a cluster to run
Local environment: Windows + Eclipse 4.3.2 + Scala 2.10.5 + JDK 1.7
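1. Install Scala and the JDK. Both are straightforward, so look up the steps yourself.
2. Install Eclipse: http://www.eclipse.org/downloads/packages/release/Kepler/SR2
If you install Eclipse 4.5 instead, the plugin will fail to install.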
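3. Install the Scala plugin in Eclipse
Help -> Install New Software
On http://scala-ide.org/download/prev-stable.html, find the update site http://download.scala-ide.org/sdk/helium/e37/scala210/stable/site, enter it in the "Work with" field, and wait for the installation to finish. The remaining wizard steps are not covered in detail here.
Note: because of network problems, the install may have to be repeated. The first few attempts can fail, and sometimes five or six tries are needed.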
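4. Once the plugin is installed, you can create a new Scala project.
5. Then add Spark's spark-assembly-1.5.2-hadoop2.6.0.jar to the project's build path.
6. Run locally. The source is listed first, followed by the console output: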
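One quick way to confirm that the jar really is on the project's build path is to try loading a Spark class by name. This is a minimal sketch, not part of the original setup; the object name `ClasspathCheck` and the helper `isOnClasspath` are hypothetical names chosen for illustration.

```scala
import scala.util.Try

/** Hypothetical helper: checks whether a class can be loaded from the classpath. */
object ClasspathCheck {
  // Returns true if the named class can be loaded, false if it is missing.
  def isOnClasspath(className: String): Boolean =
    Try(Class.forName(className)).isSuccess

  def main(args: Array[String]): Unit = {
    // org.apache.spark.SparkContext is the class the SparkPi example instantiates.
    val found = isOnClasspath("org.apache.spark.SparkContext")
    println("SparkContext on classpath: " + found)
  }
}
```

If this reports `false` when run from the Scala project, the assembly jar is probably not on the build path yet.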
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

// scalastyle:off println
package test1

import scala.math.random

import org.apache.spark._

/** Computes an approximation to pi */
object SparkPi {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark Pi ").setMaster("local")
    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    println("slices:\n"+slices)
    println("args.length:\n"+args.length)
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow
    val count = spark.parallelize(1 until n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x*x + y*y < 1) 1 else 0
    }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}
// scalastyle:on println
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/03/03 17:55:12 INFO SparkContext: Running Spark version 1.5.2
16/03/03 17:55:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/03 17:55:13 INFO SecurityManager: Changing view acls to: xubo
16/03/03 17:55:13 INFO SecurityManager: Changing modify acls to: xubo
16/03/03 17:55:13 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(xubo); users with modify permissions: Set(xubo)
16/03/03 17:55:15 INFO Slf4jLogger: Slf4jLogger started
16/03/03 17:55:15 INFO Remoting: Starting remoting
16/03/03 17:55:15 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:50812]
16/03/03 17:55:15 INFO Utils: Successfully started service 'sparkDriver' on port 50812.
16/03/03 17:55:15 INFO SparkEnv: Registering MapOutputTracker
16/03/03 17:55:15 INFO SparkEnv: Registering BlockManagerMaster
16/03/03 17:55:15 INFO DiskBlockManager: Created local directory at C:\Users\xubo\AppData\Local\Temp\blockmgr-caa750e6-8702-4649-a5e8-2ba73598a383
16/03/03 17:55:15 INFO MemoryStore: MemoryStore started with capacity 730.6 MB
16/03/03 17:55:15 INFO HttpFileServer: HTTP File server directory is C:\Users\xubo\AppData\Local\Temp\spark-77137efd-98f7-465d-a2a1-da56af107107\httpd-40ccad09-750c-4574-b019-47a2a77b003c
16/03/03 17:55:15 INFO HttpServer: Starting HTTP Server
16/03/03 17:55:15 INFO Utils: Successfully started service 'HTTP file server' on port 50813.
16/03/03 17:55:15 INFO SparkEnv: Registering OutputCommitCoordinator
16/03/03 17:55:16 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/03/03 17:55:16 INFO SparkUI: Started SparkUI at http://202.38.84.241:4040
16/03/03 17:55:16 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
16/03/03 17:55:16 INFO Executor: Starting executor ID driver on host localhost
16/03/03 17:55:16 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 50820.
16/03/03 17:55:16 INFO NettyBlockTransferService: Server created on 50820
16/03/03 17:55:16 INFO BlockManagerMaster: Trying to register BlockManager
16/03/03 17:55:16 INFO BlockManagerMasterEndpoint: Registering block manager localhost:50820 with 730.6 MB RAM, BlockManagerId(driver, localhost, 50820)
16/03/03 17:55:16 INFO BlockManagerMaster: Registered BlockManager
slices:
2
args.length:
0
16/03/03 17:55:17 INFO SparkContext: Starting job: reduce at SparkPi.scala:38
16/03/03 17:55:17 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 2 output partitions
16/03/03 17:55:17 INFO DAGScheduler: Final stage: ResultStage 0(reduce at SparkPi.scala:38)
16/03/03 17:55:17 INFO DAGScheduler: Parents of final stage: List()
16/03/03 17:55:17 INFO DAGScheduler: Missing parents: List()
16/03/03 17:55:17 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
16/03/03 17:55:17 INFO MemoryStore: ensureFreeSpace(1848) called with curMem=0, maxMem=766075207
16/03/03 17:55:17 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1848.0 B, free 730.6 MB)
16/03/03 17:55:17 INFO MemoryStore: ensureFreeSpace(1195) called with curMem=1848, maxMem=766075207
16/03/03 17:55:17 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1195.0 B, free 730.6 MB)
16/03/03 17:55:17 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:50820 (size: 1195.0 B, free: 730.6 MB)
16/03/03 17:55:17 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:861
16/03/03 17:55:17 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34)
16/03/03 17:55:17 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
16/03/03 17:55:17 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 2085 bytes)
16/03/03 17:55:17 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
16/03/03 17:55:17 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1031 bytes result sent to driver
16/03/03 17:55:17 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, PROCESS_LOCAL, 2085 bytes)
16/03/03 17:55:17 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
16/03/03 17:55:17 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 154 ms on localhost (1/2)
16/03/03 17:55:17 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 1031 bytes result sent to driver
16/03/03 17:55:17 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 46 ms on localhost (2/2)
16/03/03 17:55:17 INFO DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 0.203 s
16/03/03 17:55:17 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/03/03 17:55:17 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 0.517522 s
Pi is roughly 3.14172
16/03/03 17:55:17 INFO SparkUI: Stopped Spark web UI at http://202.38.84.241:4040
16/03/03 17:55:17 INFO DAGScheduler: Stopping DAGScheduler
16/03/03 17:55:17 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/03 17:55:17 INFO MemoryStore: MemoryStore cleared
16/03/03 17:55:17 INFO BlockManager: BlockManager stopped
16/03/03 17:55:17 INFO BlockManagerMaster: BlockManagerMaster stopped
16/03/03 17:55:17 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/03/03 17:55:17 INFO SparkContext: Successfully stopped SparkContext
16/03/03 17:55:17 INFO ShutdownHookManager: Shutdown hook called
16/03/03 17:55:17 INFO ShutdownHookManager: Deleting directory C:\Users\xubo\AppData\Local\Temp\spark-77137efd-98f7-465d-a2a1-da56af107107
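The sampling behind the "Pi is roughly ..." line can be sketched in plain Scala without Spark: draw points uniformly in the square from -1 to 1 on both axes, count those that fall inside the unit circle, and multiply that fraction by 4. This is an illustrative sketch only. The names `LocalPi` and `estimatePi` and the fixed seed are assumptions, not part of the original example.

```scala
import scala.util.Random

/** Hypothetical single-machine version of the SparkPi sampling loop. */
object LocalPi {
  // Estimates pi from `samples` random points; the seed makes runs repeatable.
  def estimatePi(samples: Int, seed: Long): Double = {
    require(samples > 0, "samples must be positive")
    val rng = new Random(seed)
    var inside = 0
    var i = 0
    while (i < samples) {
      val x = rng.nextDouble() * 2 - 1 // uniform in [-1, 1)
      val y = rng.nextDouble() * 2 - 1
      if (x * x + y * y < 1) inside += 1 // point lies inside the unit circle
      i += 1
    }
    4.0 * inside / samples
  }

  def main(args: Array[String]): Unit = {
    // Same sizing rule as SparkPi: 100000 samples per slice, capped at Int.MaxValue.
    val slices = 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt
    println("Pi is roughly " + estimatePi(n, 42L))
  }
}
```

In SparkPi the same per-point test runs inside `map`, and `reduce(_ + _)` sums the hits across the slices.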