Prerequisite setup: https://mp.csdn.net/mdeditor/84717937#
1. Write the following code in runcount.scala:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object runcount {
  def main(args: Array[String]): Unit = {
    // Run locally, using all available cores
    val conf = new SparkConf().setAppName("simple application").setMaster("local[*]")
    val filepath = "/hadoop/hadoop/README.txt"
    val sc = new SparkContext(conf)
    val file = sc.textFile(filepath).cache()
    // Classic word count: split each line on spaces, map each word to (word, 1), then sum by key
    val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
    counts.saveAsTextFile("data/output")
  }
}
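To sanity-check the result without opening the output files, a small optional addition (mine, not part of the original example) is to print the counts to the console; note also that saveAsTextFile will refuse to write if data/output already exists, so delete that directory before re-running the job.

// Optional: pull the (word, count) pairs back to the driver and print them.
// collect() is fine for a small file like README.txt, but avoid it on large datasets.
counts.collect().foreach(println)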
2. Click Run -> Run Configurations. In the list on the left, select Scala Application and click the New launch configuration button at the top left, then enter the class you want to run (runcount here) in the Main class field.
3. Right-click runcount.scala and choose Run As -> Scala Application.
The run fails with the following error:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/12/02 15:15:02 INFO SparkContext: Running Spark version 2.2.0
18/12/02 15:15:02 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/12/02 15:15:03 INFO SparkContext: Submitted application: simple application
18/12/02 15:15:03 INFO SecurityManager: Changing view acls to: hadoop01
18/12/02 15:15:03 INFO SecurityManager: Changing modify acls to: hadoop01
18/12/02 15:15:03 INFO SecurityManager: Changing view acls groups to:
18/12/02 15:15:03 INFO SecurityManager: Changing modify acls groups to:
18/12/02 15:15:03 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop01); groups with view permissions: Set(); users with modify permissions: Set(hadoop01); groups with modify permissions: Set()
18/12/02 15:15:03 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
[the same WARN line repeats 15 more times as the driver retries the bind]
18/12/02 15:15:03 ERROR SparkContext: Error initializing SparkContext.
java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:127)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:501)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1218)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:496)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:481)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:965)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:210)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:353)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:446)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)
18/12/02 15:15:03 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:127)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:501)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1218)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:496)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:481)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:965)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:210)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:353)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:446)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)
The key hint in the log is:
ERROR SparkContext: Error initializing SparkContext.
java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
Searching the error on Baidu pointed to a misconfigured host IP address. Running ifconfig in the terminal confirmed that the machine's IP address had indeed changed, so I updated the corresponding entry in /etc/hosts.
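Alternatively, as the error message itself suggests, the driver can be told explicitly which address to bind to, so startup no longer depends on the hostname resolving correctly. A minimal sketch, assuming a purely local run (the 127.0.0.1 value is my assumption, not part of the original post):

// Pin the driver's bind address instead of relying on hostname resolution.
val conf = new SparkConf()
  .setAppName("simple application")
  .setMaster("local[*]")
  .set("spark.driver.bindAddress", "127.0.0.1") // assumed local-only run

Setting the SPARK_LOCAL_IP environment variable to the machine's address has a similar effect.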
Running it again still failed, and this time the log pointed to insufficient memory, so I changed the SparkConf line above to:
val conf = new SparkConf().setAppName("simple application").setMaster("local[*]").set("spark.testing.memory", "2147480000")
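For context, spark.testing.memory is specified in bytes (2147480000 is roughly 2 GB) and overrides the amount of memory Spark thinks the JVM has, which gets it past the driver's minimum-memory check when the IDE launches the process with a small default heap. As a sketch of an alternative that avoids hard-coding the value (my suggestion, not from the original post), the same property, or simply a larger heap, can be passed as VM arguments in the Eclipse run configuration:

# either override Spark's memory estimate ...
-Dspark.testing.memory=2147480000
# ... or give the JVM a larger heap
-Xmx2g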
With this change, the job runs successfully:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/12/02 15:33:49 INFO SparkContext: Running Spark version 2.2.0
18/12/02 15:33:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/12/02 15:33:50 INFO SparkContext: Submitted application: simple application
18/12/02 15:33:50 INFO SecurityManager: Changing view acls to: hadoop01
18/12/02 15:33:50 INFO SecurityManager: Changing modify acls to: hadoop01
18/12/02 15:33:50 INFO SecurityManager: Changing view acls groups to:
18/12/02 15:33:50 INFO SecurityManager: Changing modify acls groups to:
18/12/02 15:33:50 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop01); groups with view permissions: Set(); users with modify permissions: Set(hadoop01); groups with modify permissions: Set()
18/12/02 15:33:50 INFO Utils: Successfully started service 'sparkDriver' on port 43090.
18/12/02 15:33:50 INFO SparkEnv: Registering MapOutputTracker
18/12/02 15:33:50 INFO SparkEnv: Registering BlockManagerMaster
18/12/02 15:33:50 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/12/02 15:33:50 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/12/02 15:33:50 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-7a1cde82-5603-4ea2-8936-67c14001a2e0
18/12/02 15:33:50 INFO MemoryStore: MemoryStore started with capacity 1048.8 MB
18/12/02 15:33:50 INFO SparkEnv: Registering OutputCommitCoordinator
18/12/02 15:33:51 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/12/02 15:33:51 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.127.135:4040
18/12/02 15:33:51 INFO Executor: Starting executor ID driver on host localhost
18/12/02 15:33:51 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38110.
18/12/02 15:33:51 INFO NettyBlockTransferService: Server created on 192.168.127.135:38110
18/12/02 15:33:51 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/12/02 15:33:51 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.127.135, 38110, None)
18/12/02 15:33:51 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.127.135:38110 with 1048.8 MB RAM, BlockManagerId(driver, 192.168.127.135, 38110, None)
18/12/02 15:33:51 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.127.135, 38110, None)
18/12/02 15:33:51 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.127.135, 38110, None)
18/12/02 15:33:51 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 214.5 KB, free 1048.6 MB)
18/12/02 15:33:52 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 20.4 KB, free 1048.6 MB)
18/12/02 15:33:52 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.127.135:38110 (size: 20.4 KB, free: 1048.8 MB)
18/12/02 15:33:52 INFO SparkContext: Created broadcast 0 from textFile at runcount.scala:11
18/12/02 15:33:52 INFO FileInputFormat: Total input paths to process : 1
18/12/02 15:33:52 INFO SparkContext: Starting job: saveAsTextFile at runcount.scala:13
18/12/02 15:33:52 INFO DAGScheduler: Registering RDD 3 (map at runcount.scala:12)
18/12/02 15:33:52 INFO DAGScheduler: Got job 0 (saveAsTextFile at runcount.scala:13) with 2 output partitions
18/12/02 15:33:52 INFO DAGScheduler: Final stage: ResultStage 1 (saveAsTextFile at runcount.scala:13)
18/12/02 15:33:52 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
18/12/02 15:33:52 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)
18/12/02 15:33:52 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[3] at map at runcount.scala:12), which has no missing parents
18/12/02 15:33:52 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.7 KB, free 1048.6 MB)
18/12/02 15:33:52 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.7 KB, free 1048.6 MB)
18/12/02 15:33:52 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.127.135:38110 (size: 2.7 KB, free: 1048.8 MB)
18/12/02 15:33:52 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
18/12/02 15:33:52 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[3] at map at runcount.scala:12) (first 15 tasks are for partitions Vector(0, 1))
18/12/02 15:33:52 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
18/12/02 15:33:53 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 4838 bytes)
18/12/02 15:33:53 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 4838 bytes)
18/12/02 15:33:53 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
18/12/02 15:33:53 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
18/12/02 15:33:53 INFO HadoopRDD: Input split: file:/hadoop/hadoop/README.txt:683+683
18/12/02 15:33:53 INFO HadoopRDD: Input split: file:/hadoop/hadoop/README.txt:0+683
18/12/02 15:33:53 INFO MemoryStore: Block rdd_1_1 stored as values in memory (estimated size 1928.0 B, free 1048.6 MB)
18/12/02 15:33:53 INFO BlockManagerInfo: Added rdd_1_1 in memory on 192.168.127.135:38110 (size: 1928.0 B, free: 1048.8 MB)
18/12/02 15:33:53 INFO MemoryStore: Block rdd_1_0 stored as values in memory (estimated size 2.1 KB, free 1048.6 MB)
18/12/02 15:33:53 INFO BlockManagerInfo: Added rdd_1_0 in memory on 192.168.127.135:38110 (size: 2.1 KB, free: 1048.8 MB)
18/12/02 15:33:53 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1854 bytes result sent to driver
18/12/02 15:33:53 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 1854 bytes result sent to driver
18/12/02 15:33:53 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 388 ms on localhost (executor driver) (1/2)
18/12/02 15:33:53 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 372 ms on localhost (executor driver) (2/2)
18/12/02 15:33:53 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
18/12/02 15:33:53 INFO DAGScheduler: ShuffleMapStage 0 (map at runcount.scala:12) finished in 0.442 s
18/12/02 15:33:53 INFO DAGScheduler: looking for newly runnable stages
18/12/02 15:33:53 INFO DAGScheduler: running: Set()
18/12/02 15:33:53 INFO DAGScheduler: waiting: Set(ResultStage 1)
18/12/02 15:33:53 INFO DAGScheduler: failed: Set()
18/12/02 15:33:53 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[5] at saveAsTextFile at runcount.scala:13), which has no missing parents
18/12/02 15:33:53 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 65.3 KB, free 1048.5 MB)
18/12/02 15:33:53 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 23.3 KB, free 1048.5 MB)
18/12/02 15:33:53 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.127.135:38110 (size: 23.3 KB, free: 1048.7 MB)
18/12/02 15:33:53 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
18/12/02 15:33:53 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 1 (MapPartitionsRDD[5] at saveAsTextFile at runcount.scala:13) (first 15 tasks are for partitions Vector(0, 1))
18/12/02 15:33:53 INFO TaskSchedulerImpl: Adding task set 1.0 with 2 tasks
18/12/02 15:33:53 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, localhost, executor driver, partition 0, ANY, 4621 bytes)
18/12/02 15:33:53 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, localhost, executor driver, partition 1, ANY, 4621 bytes)
18/12/02 15:33:53 INFO Executor: Running task 0.0 in stage 1.0 (TID 2)
18/12/02 15:33:53 INFO Executor: Running task 1.0 in stage 1.0 (TID 3)
18/12/02 15:33:53 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
18/12/02 15:33:53 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
18/12/02 15:33:53 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 8 ms
18/12/02 15:33:53 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 9 ms
18/12/02 15:33:53 INFO FileOutputCommitter: Saved output of task 'attempt_20181202153352_0001_m_000001_3' to file:/home/hadoop01/workspace/count/data/output/_temporary/0/task_20181202153352_0001_m_000001
18/12/02 15:33:53 INFO FileOutputCommitter: Saved output of task 'attempt_20181202153352_0001_m_000000_2' to file:/home/hadoop01/workspace/count/data/output/_temporary/0/task_20181202153352_0001_m_000000
18/12/02 15:33:53 INFO SparkHadoopMapRedUtil: attempt_20181202153352_0001_m_000000_2: Committed
18/12/02 15:33:53 INFO SparkHadoopMapRedUtil: attempt_20181202153352_0001_m_000001_3: Committed
18/12/02 15:33:53 INFO Executor: Finished task 0.0 in stage 1.0 (TID 2). 1224 bytes result sent to driver
18/12/02 15:33:53 INFO Executor: Finished task 1.0 in stage 1.0 (TID 3). 1181 bytes result sent to driver
18/12/02 15:33:53 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 244 ms on localhost (executor driver) (1/2)
18/12/02 15:33:53 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 253 ms on localhost (executor driver) (2/2)
18/12/02 15:33:53 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
18/12/02 15:33:53 INFO DAGScheduler: ResultStage 1 (saveAsTextFile at runcount.scala:13) finished in 0.256 s
18/12/02 15:33:53 INFO DAGScheduler: Job 0 finished: saveAsTextFile at runcount.scala:13, took 1.177447 s
18/12/02 15:33:53 INFO SparkContext: Invoking stop() from shutdown hook
18/12/02 15:33:53 INFO SparkUI: Stopped Spark web UI at http://192.168.127.135:4040
18/12/02 15:33:53 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/12/02 15:33:53 INFO MemoryStore: MemoryStore cleared
18/12/02 15:33:53 INFO BlockManager: BlockManager stopped
18/12/02 15:33:53 INFO BlockManagerMaster: BlockManagerMaster stopped
18/12/02 15:33:53 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/12/02 15:33:53 INFO SparkContext: Successfully stopped SparkContext
18/12/02 15:33:53 INFO ShutdownHookManager: Shutdown hook called
18/12/02 15:33:53 INFO ShutdownHookManager: Deleting directory /tmp/spark-fefc55dc-fdad-4055-8edb-d3441b374f0b