This example analyzes the log produced by a Monte Carlo model that estimates Pi. It runs in local mode; the code is as follows:
val conf = new SparkConf().setAppName("Spark Pi").setMaster("local[2]")
val spark = new SparkContext(conf)
val slices = 100
val n = 1000 * slices
val count = spark.parallelize(1 to n, slices).map { i =>
  def random: Double = java.lang.Math.random()
  val x = random * 2 - 1
  val y = random * 2 - 1
  if (x * x + y * y < 1) 1 else 0
}.reduce(_ + _)
println("Pi is roughly " + 4.0 * count / n)
spark.stop()
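The Monte Carlo idea behind the Spark job can be sketched without Spark at all: sample points uniformly in the square [-1, 1] x [-1, 1] and count how many land inside the unit circle; the hit ratio approximates Pi/4. This is a minimal, hypothetical `PiEstimate` helper (the seeded `Random` is an assumption added here for reproducibility, not part of the original code):

```scala
import scala.util.Random

object PiEstimate {
  // Estimate Pi by sampling n points uniformly in [-1, 1] x [-1, 1].
  // The unit circle's area (Pi) over the square's area (4) equals
  // the expected fraction of points that fall inside the circle.
  def estimatePi(n: Int, seed: Long = 42L): Double = {
    val rng = new Random(seed)
    val hits = (1 to n).count { _ =>
      val x = rng.nextDouble() * 2 - 1
      val y = rng.nextDouble() * 2 - 1
      x * x + y * y < 1
    }
    4.0 * hits / n
  }

  def main(args: Array[String]): Unit = {
    println("Pi is roughly " + estimatePi(100000))
  }
}
```

The Spark version distributes exactly this loop across `slices` partitions with `parallelize(...).map(...).reduce(_ + _)`.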
Log analysis:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties // the default log4j profile is used for log output
17/04/10 18:43:09 INFO SparkContext: Running Spark version 1.3.1 // the SparkContext is running on Spark version 1.3.1
17/04/10 18:43:10 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable // since this run does not use YARN, this warning does not affect execution
17/04/10 18:43:10 INFO SecurityManager: Changing view acls to: Administrator
17/04/10 18:43:10 INFO SecurityManager: Changing modify acls to: Administrator
17/04/10 18:43:10 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Administrator); users with modify permissions: Set(Administrator)
17/04/10 18:43:10 INFO Slf4jLogger: Slf4jLogger started
17/04/10 18:43:10 INFO Remoting: Starting remoting
17/04/10 18:43:10 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@jdlzy:57217] // Spark uses Akka for low-level communication; a randomly chosen port is used for listening
17/04/10 18:43:10 INFO Utils: Successfully started service 'sparkDriver' on port 57217. // the driver's temporary listening port was created successfully
17/04/10 18:43:11 INFO SparkEnv: Registering MapOutputTracker
17/04/10 18:43:11 INFO SparkEnv: Registering BlockManagerMaster
17/04/10 18:43:11 INFO DiskBlockManager: Created local directory at C:\Users\ADMINI~1.USE\AppData\Local\Temp\spark-7b2ae0d5-95ce-4727-9179-74cee0fa6dab\blockmgr-7c7505c5-c9cb-4e03-bead-67d3ef882930 // the DiskBlockManager created a local directory for managing on-disk data blocks
17/04/10 18:43:11 INFO MemoryStore: MemoryStore started with capacity 969.8 MB // 969.8 MB of memory is available to this job
17/04/10 18:43:11 INFO HttpFileServer: HTTP File server directory is C:\Users\ADMINI~1.USE\AppData\Local\Temp\spark-f9e6ad82-19d9-439b-893b-f7f505b84b95\httpd-baf7acc4-154a-448b-b613-88b243249c03
17/04/10 18:43:11 INFO HttpServer: Starting HTTP Server
17/04/10 18:43:11 INFO Server: jetty-8.y.z-SNAPSHOT
17/04/10 18:43:11 INFO AbstractConnector: Started [email protected]:57218
17/04/10 18:43:11 INFO Utils: Successfully started service 'HTTP file server' on port 57218.
17/04/10 18:43:11 INFO SparkEnv: Registering OutputCommitCoordinator
17/04/10 18:43:11 INFO Server: jetty-8.y.z-SNAPSHOT
17/04/10 18:43:11 INFO AbstractConnector: Started [email protected]:4040
17/04/10 18:43:11 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/04/10 18:43:11 INFO SparkUI: Started SparkUI at http://jdlzy:4040
17/04/10 18:43:11 INFO Executor: Starting executor ID
17/04/10 18:43:11 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@jdlzy:57217/user/HeartbeatReceiver
17/04/10 18:43:12 INFO NettyBlockTransferService: Server created on 57238
17/04/10 18:43:12 INFO BlockManagerMaster: Trying to register BlockManager
17/04/10 18:43:12 INFO BlockManagerMasterActor: Registering block manager localhost:57238 with 969.8 MB RAM, BlockManagerId(
17/04/10 18:43:12 INFO BlockManagerMaster: Registered BlockManager
17/04/10 18:43:12 INFO SparkContext: Starting job: reduce at MySparkPi.scala:25 // the job starts
17/04/10 18:43:12 INFO DAGScheduler: Got job 0 (reduce at MySparkPi.scala:25) with 100 output partitions (allowLocal=false) // the DAGScheduler received the RDD and will split the job into 100 output partitions
17/04/10 18:43:12 INFO DAGScheduler: Final stage: Stage 0(reduce at MySparkPi.scala:25) // the final stage is Stage 0
17/04/10 18:43:12 INFO DAGScheduler: Parents of final stage: List() // the list of parent stages is empty
17/04/10 18:43:12 INFO DAGScheduler: Missing parents: List()
17/04/10 18:43:12 INFO DAGScheduler: Submitting Stage 0 (MapPartitionsRDD[1] at map at MySparkPi.scala:15), which has no missing parents // Stage 0 has no parent stages, so it is submitted directly
17/04/10 18:43:12 INFO MemoryStore: ensureFreeSpace(1832) called with curMem=0, maxMem=1016950947 // shows how much memory is currently in use
17/04/10 18:43:12 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1832.0 B, free 969.8 MB)
17/04/10 18:43:12 INFO MemoryStore: ensureFreeSpace(1293) called with curMem=1832, maxMem=1016950947
17/04/10 18:43:12 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1293.0 B, free 969.8 MB)
17/04/10 18:43:12 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:57238 (size: 1293.0 B, free: 969.8 MB)
17/04/10 18:43:12 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
17/04/10 18:43:12 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:839
17/04/10 18:43:12 INFO DAGScheduler: Submitting 100 missing tasks from Stage 0 (MapPartitionsRDD[1] at map at MySparkPi.scala:15) // the 100 tasks of Stage 0 are submitted
17/04/10 18:43:12 INFO TaskSchedulerImpl: Adding task set 0.0 with 100 tasks
17/04/10 18:43:12 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1260 bytes) // individual tasks start being submitted
17/04/10 18:43:12 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, PROCESS_LOCAL, 1260 bytes)
17/04/10 18:43:12 INFO Executor: Running task 1.0 in stage 0.0 (TID 1) // the executor starts running task 1 of stage 0
17/04/10 18:43:12 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
17/04/10 18:43:12 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 736 bytes result sent to driver // task 1 of stage 0 finished, and a 736-byte result was sent back to the driver
17/04/10 18:43:12 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 736 bytes result sent to driver
...
17/04/10 18:43:13 INFO TaskSetManager: Finished task 99.0 in stage 0.0 (TID 99) in 15 ms on localhost (100/100)
17/04/10 18:43:13 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/04/10 18:43:13 INFO DAGScheduler: Stage 0 (reduce at MySparkPi.scala:25) finished in 0.883 s // Stage 0 finished
17/04/10 18:43:13 INFO DAGScheduler: Job 0 finished: reduce at MySparkPi.scala:25, took 1.218286 s // Job 0 finished
Pi is roughly 3.12704
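The estimate 3.12704 is off from Pi by about 0.015. That is consistent with the sample size: each sample is a Bernoulli hit with probability p = Pi/4, scaled by 4, so with n = 100000 samples the estimator's standard error is roughly 4 * sqrt(p * (1 - p) / n). A quick sketch of that calculation (this derivation is an addition, not part of the original log):

```scala
// Standard error of the Monte Carlo Pi estimate:
// each sample is Bernoulli(p) with p = Pi/4, and the estimate
// scales the hit ratio by 4, so stderr = 4 * sqrt(p * (1 - p) / n).
val n = 100000
val p = math.Pi / 4
val stderr = 4 * math.sqrt(p * (1 - p) / n)
println(f"standard error for n=$n%d: $stderr%.4f")
```

This gives a standard error of roughly 0.005, so a single-run deviation of 0.015 (a few standard errors) is plausible; increasing `slices` or the per-slice sample count tightens the estimate.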
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
17/04/10 18:43:13 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
17/04/10 18:43:13 INFO SparkUI: Stopped Spark web UI at http://jdlzy:4040
17/04/10 18:43:13 INFO DAGScheduler: Stopping DAGScheduler
17/04/10 18:43:13 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
17/04/10 18:43:13 INFO MemoryStore: MemoryStore cleared
17/04/10 18:43:13 INFO BlockManager: BlockManager stopped
17/04/10 18:43:13 INFO BlockManagerMaster: BlockManagerMaster stopped
17/04/10 18:43:13 INFO SparkContext: Successfully stopped SparkContext
17/04/10 18:43:13 INFO OutputCommitCoordinator$OutputCommitCoordinatorActor: OutputCommitCoordinator stopped!