16/03/11 17:19:40 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
16/03/11 17:19:40 ERROR executor.Executor: Exception in task 0.0 in stage 147.0 (TID 343)
java.lang.OutOfMemoryError: Java heap space
at org.apache.spark.network.netty.NettyBlockTransferService.uploadBlock(NettyBlockTransferService.scala:133)
at org.apache.spark.network.BlockTransferService.uploadBlockSync(BlockTransferService.scala:118)
at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$replicate(BlockManager.scala:955)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:860)
at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:645)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:153)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
16/03/11 17:19:40 INFO storage.DiskBlockManager: Shutdown hook called
ExecutorLostFailure (executor 10 exited caused by one of the running tasks) Reason: Container marked as failed: container_e02_1457595842637_0021_02_000019 on host: datanode4. Exit status: 143. Diagnostics: Container killed on request. Exit code is 143
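The task ran out of memory while executing. A task that hits an OOM is retried; if it still fails after 4 attempts, the application is restarted, and if the restarted attempt also fails, the job fails and exits.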
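The retry behaviour described here lines up with two Spark-on-YARN properties: spark.task.maxFailures (default 4) and spark.yarn.maxAppAttempts (which falls back to YARN's AM attempt limit, 2 by default). The sketch below is only an illustration of how they could be set explicitly. It collects the pairs in a plain map so it runs without Spark on the classpath; the class name, helper names, and values are assumptions, not the author's configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RetrySettings {
    // Spark property names with illustrative values (assumed, not tuned recommendations).
    static Map<String, String> retrySettings() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.task.maxFailures", "4");    // failures of one task before the job is aborted
        conf.put("spark.yarn.maxAppAttempts", "2"); // application attempts before YARN gives up
        return conf;
    }

    // Renders the map as spark-submit style "--conf key=value" arguments.
    static String toSubmitArgs(Map<String, String> conf) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append("--conf ").append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSubmitArgs(retrySettings()));
    }
}
```

In a real job each pair would be passed to SparkConf.set(key, value) or to spark-submit with --conf. Raising the limits only delays the failure if the underlying OOM is not addressed.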
16/03/09 15:30:58 INFO receiver.BlockGenerator: Pushed block input-0-1457508633200
16/03/09 15:30:58 INFO storage.MemoryStore: Block input-0-1457508633400 stored as bytes in memory (estimated size 14.9 MB, free 437.9 MB)
16/03/09 15:31:00 INFO receiver.BlockGenerator: Pushed block input-0-1457508633400
16/03/09 15:31:05 INFO zookeeper.ClientCnxn: Client session timed out, have not heard from server in 6690ms for sessionid 0x2519f45091a19f6, closing socket connection and attempting reconnect
16/03/09 15:31:05 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
16/03/09 15:31:05 WARN storage.BlockManager: Putting block input-0-1457508633600 failed
16/03/09 15:31:05 ERROR util.SparkUncaughtExceptionHandler: [Container in shutdown] Uncaught exception in thread Thread[Thread-5,5,main]
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2271)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1785)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1188)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:44)
at org.apache.spark.serializer.SerializationStream.writeAll(Serializer.scala:153)
at org.apache.spark.storage.BlockManager.dataSerializeStream(BlockManager.scala:1196)
at org.apache.spark.storage.BlockManager.dataSerialize(BlockManager.scala:1202)
at org.apache.spark.storage.MemoryStore.putArray(MemoryStore.scala:136)
at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:173)
at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:147)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:798)
at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:645)
at org.apache.spark.streaming.receiver.BlockManagerBasedBlockHandler.storeBlock(ReceivedBlockHandler.scala:77)
at org.apache.spark.streaming.receiver.ReceiverSupervisorImpl.pushAndReportBlock(ReceiverSupervisorImpl.scala:157)
at org.apache.spark.streaming.receiver.ReceiverSupervisorImpl.pushArrayBuffer(ReceiverSupervisorImpl.scala:128)
at org.apache.spark.streaming.receiver.ReceiverSupervisorImpl$$anon$3.onPushBlock(ReceiverSupervisorImpl.scala:109)
at org.apache.spark.streaming.receiver.BlockGenerator.pushBlock(BlockGenerator.scala:296)
at org.apache.spark.streaming.receiver.BlockGenerator.org$apache$spark$streaming$receiver$BlockGenerator$$keepPushingBlocks(BlockGenerator.scala:268)
at org.apache.spark.streaming.receiver.BlockGenerator$$anon$1.run(BlockGenerator.scala:109)
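The Receiver ran out of memory shortly after starting, having written about 440 MB of data into memory in 40 s. The log shows that after receiving data the Receiver starts pushing blocks, and each push serializes the data; the OOM occurs during that serialization. Switching to Kryo for serialization should be considered later.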
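The stack trace above ends in ByteArrayOutputStream.grow and Arrays.copyOf: Java serialization writes into an in-memory buffer that grows by copying its backing array, so the old and the new array briefly coexist on the heap. The following is a minimal stdlib-only sketch of that path, assuming an illustrative list of small JSON-like strings (the record shape and count are made up). It only measures the serialized size; it does not reproduce the OOM.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class JavaSerializationSize {
    // Serializes a value with plain Java serialization into a growing in-memory buffer
    // and returns the number of bytes produced.
    static int serializedSize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        // Hypothetical payload standing in for one received block.
        List<String> records = new ArrayList<>();
        int payloadChars = 0;
        for (int i = 0; i < 10000; i++) {
            String record = "{\"id\":" + i + "}";
            records.add(record);
            payloadChars += record.length();
        }
        System.out.println("payload chars:    " + payloadChars);
        System.out.println("serialized bytes: " + serializedSize(records));
    }
}
```

To try Kryo in Spark, the property spark.serializer can be set to org.apache.spark.serializer.KryoSerializer; whether that is enough to avoid this OOM depends on the data and has not been verified here.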
16/03/09 15:26:57 INFO storage.BlockManagerInfo: Added input-0-1457508369000 on disk on datanode2:40023 (size: 2.8 MB)
16/03/09 15:26:57 INFO cluster.YarnClusterSchedulerBackend: Disabling executor 2.
16/03/09 15:26:57 INFO scheduler.DAGScheduler: Executor lost: 2 (epoch 1)
16/03/09 15:26:57 INFO storage.BlockManagerMasterEndpoint: Trying to remove executor 2 from BlockManagerMaster.
16/03/09 15:26:57 INFO storage.BlockManagerMasterEndpoint: Removing block manager BlockManagerId(2, datanode5, 57965)
16/03/09 15:26:57 INFO storage.BlockManagerMaster: Removed 2 successfully in removeExecutor
16/03/09 15:26:57 INFO yarn.YarnAllocator: Completed container container_1457053973969_0047_01_000003 on host: datanode5 (state: COMPLETE, exit status: 143)
16/03/09 15:26:57 WARN yarn.YarnAllocator: Container marked as failed: container_1457053973969_0047_01_000003 on host: datanode5. Exit status: 143. Diagnostics: Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Killed by external signal
16/03/09 15:26:57 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_1457053973969_0047_01_000003 on host: datanode5. Exit status: 143. Diagnostics: Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Killed by external signal
16/03/09 15:26:57 ERROR cluster.YarnClusterScheduler: Lost executor 2 on datanode5: Container marked as failed: container_1457053973969_0047_01_000003 on host: datanode5. Exit status: 143. Diagnostics: Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Killed by external signal
16/03/09 15:26:57 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 2.0 (TID 70, datanode5): ExecutorLostFailure (executor 2 exited caused by one of the running tasks) Reason: Container marked as failed: container_1457053973969_0047_01_000003 on host: datanode5. Exit status: 143. Diagnostics: Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Killed by external signal
16/03/09 15:26:57 INFO cluster.YarnClusterSchedulerBackend: Asked to remove non-existent executor 2
This is the same problem as above, caused by losing the Receiver.
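The exit status 143 in these logs follows the usual convention of 128 plus the signal number, so it corresponds to signal 15 (SIGTERM), which matches the "RECEIVED SIGNAL 15: SIGTERM" lines and the "Killed by external signal" diagnostics. A small sketch of that arithmetic, with a hypothetical helper name:

```java
final class ExitStatus {
    // For exit statuses above 128, returns the signal number under the
    // "128 + signal" convention; returns -1 for ordinary exit codes.
    static int signalFromExitStatus(int exitStatus) {
        return exitStatus > 128 ? exitStatus - 128 : -1;
    }

    public static void main(String[] args) {
        System.out.println(signalFromExitStatus(143)); // prints 15 (SIGTERM)
    }
}
```

In other words, the container did not crash on its own: it was asked to terminate from outside, after the executor JVM had already run into memory trouble.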
16/03/09 15:28:57 WARN scheduler.TaskSetManager: Lost task 8.0 in stage 8.0 (TID 234, datanode2): java.lang.OutOfMemoryError: Unable to acquire 262144 bytes of memory, got 139413
at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:91)
at org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:735)
at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:197)
at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:212)
at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.<init>(UnsafeFixedWidthAggregationMap.java:103)
at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:483)
at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
16/03/09 15:28:57 INFO storage.BlockManagerInfo: Added input-0-1457508343400 on disk on datanode2:40023 (size: 2.7 MB)
16/03/09 15:28:57 INFO storage.BlockManagerInfo: Added input-0-1457508343600 on disk on datanode2:40023 (size: 3.0 MB)
16/03/09 15:28:57 INFO scheduler.TaskSetManager: Starting task 8.1 in stage 8.0 (TID 236, datanode2, partition 8,NODE_LOCAL, 2006 bytes)
16/03/09 15:28:57 WARN scheduler.TaskSetManager: Lost task 9.0 in stage 8.0 (TID 235, datanode2): java.io.FileNotFoundException: /data/yarn/local/usercache/deployop/appcache/application_1457053973969_0047/blockmgr-e75070dd-691b-43ba-8df6-34a8de5549ce/02/input-0-1457508343800 (No such file or directory)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
at java.io.FileOutputStream.<init>(FileOutputStream.java:171)
at org.apache.spark.storage.DiskStore.putBytes(DiskStore.scala:47)
at org.apache.spark.storage.BlockManager.dropFromMemory(BlockManager.scala:1050)
at org.apache.spark.storage.BlockManager.dropFromMemory(BlockManager.scala:1009)
at org.apache.spark.storage.MemoryStore$$anonfun$evictBlocksToFreeSpace$2.apply(MemoryStore.scala:456)
at org.apache.spark.storage.MemoryStore$$anonfun$evictBlocksToFreeSpace$2.apply(MemoryStore.scala:445)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.storage.MemoryStore.evictBlocksToFreeSpace(MemoryStore.scala:445)
at org.apache.spark.memory.StorageMemoryPool.shrinkPoolToFreeSpace(StorageMemoryPool.scala:133)
at org.apache.spark.memory.UnifiedMemoryManager.org$apache$spark$memory$UnifiedMemoryManager$$maybeGrowExecutionPool$1(UnifiedMemoryManager.scala:102)
at org.apache.spark.memory.UnifiedMemoryManager$$anonfun$acquireExecutionMemory$1.apply$mcVJ$sp(UnifiedMemoryManager.scala:127)
at org.apache.spark.memory.ExecutionMemoryPool.acquireMemory(ExecutionMemoryPool.scala:114)
at org.apache.spark.memory.UnifiedMemoryManager.acquireExecutionMemory(UnifiedMemoryManager.scala:126)
at org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:140)
at org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:244)
at org.apache.spark.memory.MemoryConsumer.allocatePage(MemoryConsumer.java:112)
at org.apache.spark.unsafe.map.BytesToBytesMap.acquireNewPage(BytesToBytesMap.java:707)
at org.apache.spark.unsafe.map.BytesToBytesMap.access$1800(BytesToBytesMap.java:64)
at org.apache.spark.unsafe.map.BytesToBytesMap$Location.putNewKey(BytesToBytesMap.java:662)
at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.getAggregationBufferFromUnsafeRow(UnsafeFixedWidthAggregationMap.java:132)
at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:517)
at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686)
at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
16/03/09 15:28:57 INFO storage.BlockManagerInfo: Added input-0-1457508537400 in memory on datanode2:40169 (size: 4.0 MB, free: 215.6 MB)
16/03/09 15:28:57 INFO scheduler.TaskSetManager: Starting task 9.1 in stage 8.0 (TID 237, datanode2, partition 9,NODE_LOCAL, 2006 bytes)
16/03/09 15:28:57 WARN scheduler.TaskSetManager: Lost task 8.1 in stage 8.0 (TID 236, datanode2): java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:165)
at org.apache.spark.memory.UnifiedMemoryManager.acquireExecutionMemory(UnifiedMemoryManager.scala:80)
at org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:140)
at org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:244)
at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:83)
at org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:735)
at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:197)
at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:212)
at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.<init>(UnsafeFixedWidthAggregationMap.java:103)
at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:483)
at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
16/03/09 15:28:57 INFO scheduler.TaskSetManager: Lost task 9.1 in stage 8.0 (TID 237) on executor datanode2: java.lang.AssertionError (assertion failed) [duplicate 1]
16/03/09 15:28:57 INFO scheduler.TaskSetManager: Starting task 9.2 in stage 8.0 (TID 238, datanode2, partition 9,NODE_LOCAL, 2006 bytes)
Spark on YARN retries the task on other nodes, and it still fails:
16/03/09 15:28:58 ERROR executor.Executor: Exception in task 9.3 in stage 8.0 (TID 240)
java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:165)
at org.apache.spark.memory.UnifiedMemoryManager.acquireExecutionMemory(UnifiedMemoryManager.scala:80)
at org.apache.spark.memory.TaskMemoryManager.acquireExecutionMemory(TaskMemoryManager.java:140)
at org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:244)
at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:83)
at org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:735)
at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:197)
at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:212)
at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.<init>(UnsafeFixedWidthAggregationMap.java:103)
at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:483)
at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
16/03/09 15:28:58 INFO util.ShutdownHookManager: Shutdown hook called
The RDD ran out of memory inside computeOrReadCheckpoint while computing the mapPartitions step.
16/03/09 15:29:10 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 10.0 (TID 242, datanode4): java.lang.Exception: Could not compute split, block input-0-1457508327800 not found
at org.apache.spark.rdd.BlockRDD.compute(BlockRDD.scala:51)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Spark on YARN keeps retrying; once the data is confirmed lost, the program terminates.
16/03/09 15:29:13 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 24 (MapPartitionsRDD[419] at json at AppBolt.java:77), which has no missing parents
Exception in thread "streaming-job-executor-0" java.lang.Error: java.lang.InterruptedException
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1151)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:612)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1952)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:1025)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.reduce(RDD.scala:1007)
at org.apache.spark.rdd.RDD$$anonfun$treeAggregate$1.apply(RDD.scala:1136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.treeAggregate(RDD.scala:1113)
at org.apache.spark.sql.execution.datasources.json.InferSchema$.infer(InferSchema.scala:65)
at org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$4.apply(JSONRelation.scala:114)
at org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$4.apply(JSONRelation.scala:109)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.execution.datasources.json.JSONRelation.dataSchema$lzycompute(JSONRelation.scala:109)
at org.apache.spark.sql.execution.datasources.json.JSONRelation.dataSchema(JSONRelation.scala:108)
at org.apache.spark.sql.sources.HadoopFsRelation.schema$lzycompute(interfaces.scala:636)
at org.apache.spark.sql.sources.HadoopFsRelation.schema(interfaces.scala:635)
at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:37)
at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:288)
at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:275)
at com.paic.data.cep.app.AppBolt$1.call(AppBolt.java:77)
at com.paic.data.cep.app.AppBolt$1.call(AppBolt.java:54)
at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$4.apply(JavaDStreamLike.scala:343)
at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$4.apply(JavaDStreamLike.scala:343)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:50)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:50)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:50)
at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:426)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:49)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:49)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:49)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:224)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:224)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:224)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:223)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
... 2 more
16/03/09 15:29:13 INFO scheduler.JobScheduler: Stopped JobScheduler
16/03/09 15:34:15 INFO storage.BlockManager: Dropping block input-0-1457508827000 from memory
16/03/09 15:34:15 INFO storage.BlockManager: Writing block input-0-1457508827000 to disk
16/03/09 15:34:15 INFO storage.BlockManager: Dropping block input-0-1457508827200 from memory
16/03/09 15:34:15 INFO storage.BlockManager: Writing block input-0-1457508827200 to disk
16/03/09 15:34:16 INFO storage.MemoryStore: Block input-0-1457508850400 stored as bytes in memory (estimated size 54.6 MB, free 497.9 MB)
16/03/09 15:34:29 INFO zookeeper.ClientCnxn: Client session timed out, have not heard from server in 4140ms for sessionid 0x2519f45091a1a6d, closing socket connection and attempting reconnect
16/03/09 15:34:38 INFO zkclient.ZkClient: zookeeper state changed (Disconnected)
16/03/09 15:34:38 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
16/03/09 15:34:38 WARN server.TransportChannelHandler: Exception in connection from /10.20.9.17:55968
java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
at org.apache.spark.storage.BlockManager$$anonfun$doGetLocal$9.apply(BlockManager.scala:530)
at org.apache.spark.storage.BlockManager$$anonfun$doGetLocal$9.apply(BlockManager.scala:525)
at org.apache.spark.storage.MemoryStore.bytes$lzycompute$1(MemoryStore.scala:112)
at org.apache.spark.storage.MemoryStore.org$apache$spark$storage$MemoryStore$$bytes$1(MemoryStore.scala:112)
at org.apache.spark.storage.MemoryStore$$anonfun$3.apply(MemoryStore.scala:114)
at org.apache.spark.storage.MemoryStore$$anonfun$3.apply(MemoryStore.scala:114)
at org.apache.spark.storage.MemoryStore.tryToPut(MemoryStore.scala:386)
at org.apache.spark.storage.MemoryStore.putBytes(MemoryStore.scala:114)
at org.apache.spark.storage.BlockManager.doGetLocal(BlockManager.scala:525)
at org.apache.spark.storage.BlockManager.getBlockData(BlockManager.scala:293)
at org.apache.spark.network.netty.NettyBlockRpcServer$$anonfun$2.apply(NettyBlockRpcServer.scala:58)
at org.apache.spark.network.netty.NettyBlockRpcServer$$anonfun$2.apply(NettyBlockRpcServer.scala:58)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:58)
at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:149)
at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:102)
at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
16/03/09 15:34:42 INFO zookeeper.ClientCnxn: Opening socket connection to server cnsz141659.app.paic.com.cn/10.25.161.128:2181. Will not attempt to authenticate using SASL (unknown error)
16/03/09 15:34:43 INFO zookeeper.ClientCnxn: Socket connection established to cnsz141659.app.paic.com.cn/10.25.161.128:2181, initiating session
16/03/09 15:34:50 ERROR util.SparkUncaughtExceptionHandler: [Container in shutdown] Uncaught exception in thread Thread[Thread-5,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
16/03/09 15:34:50 INFO storage.MemoryStore: 1 blocks selected for dropping
16/03/09 15:34:50 INFO storage.BlockManager: Dropping block input-0-1457508827400 from memory
16/03/09 15:34:50 INFO storage.BlockManager: Writing block input-0-1457508827400 to disk
16/03/09 15:34:50 INFO zookeeper.ClientCnxn: Client session timed out, have not heard from server in 7620ms for sessionid 0x2519f45091a1a6d, closing socket connection and attempting reconnect
16/03/09 15:34:50 WARN util.ShutdownHookManager: ShutdownHook '$anon$2' failed, java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
at sun.net.www.protocol.jar.Handler.parseContextSpec(Handler.java:206)
at sun.net.www.protocol.jar.Handler.parseURL(Handler.java:152)
at java.net.URL.<init>(URL.java:614)
at java.net.URL.<init>(URL.java:482)
at sun.misc.URLClassPath$JarLoader.checkResource(URLClassPath.java:757)
at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:842)
at sun.misc.URLClassPath.getResource(URLClassPath.java:199)
at java.net.URLClassLoader$1.run(URLClassLoader.java:358)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:218)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
16/03/09 15:34:50 INFO consumer.SimpleConsumer: Reconnect due to socket error: java.lang.OutOfMemoryError: GC overhead limit exceeded
Oracle's explanation:
Exception in thread thread_name: java.lang.OutOfMemoryError: GC Overhead limit exceeded
Cause: The detail message “GC overhead limit exceeded” indicates that the garbage collector is running all the time and Java program is making very slow progress. After a garbage collection, if the Java process is spending more than approximately 98% of its time doing garbage collection and if it is recovering less than 2% of the heap and has been doing so far the last 5 (compile time constant) consecutive garbage collections, then a java.lang.OutOfMemoryError is thrown. This exception is typically thrown because the amount of live data barely fits into the Java heap having little free space for new allocations.
Action: Increase the heap size. The java.lang.OutOfMemoryError exception for GC Overhead limit exceeded can be turned off with the command line flag -XX:-UseGCOverheadLimit.
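To get a rough feel for how close a process is to the threshold described above, the share of time spent in GC can be read from the standard java.lang.management beans. The following is only a minimal sketch: the class name GcOverheadProbe is made up for illustration, and it reports the average over the whole JVM lifetime, not the per-collection check over the last 5 collections that the JVM itself performs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the JDK or Spark.
public class GcOverheadProbe {

    /** Total milliseconds spent in GC since JVM start, summed over all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Fraction of JVM uptime spent in GC, clamped to the range [0, 1]. */
    public static double gcTimeFraction() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        if (uptime <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) totalGcMillis() / uptime);
    }

    public static void main(String[] args) {
        System.out.printf("GC time so far: %d ms (%.2f%% of uptime)%n",
                totalGcMillis(), gcTimeFraction() * 100.0);
    }
}
```

A value that keeps climbing toward 100% while the heap stays nearly full is the situation the Oracle text describes.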
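I am using JDK 1.6.0_37 and JDK 1.7.0_60. In both versions the JVM starts with -XX:+UseGCOverheadLimit by default, so this feature is enabled. It is essentially a prediction made by the JVM: if garbage collection takes 98% of the time but reclaims less than 2% of the memory, the JVM concludes that an OOM is imminent and ends the program early. We can of course turn the feature off with -XX:-UseGCOverheadLimit.
I don't quite understand why the JDK provides this parameter. Hitting this error tells us only one of two things: either there is not enough memory, or there is a memory leak. What is actually useful at that point is a heap dump, which lets us analyze the heap and determine whether the code has a problem.
As we know, if the JVM is started with the following parameters, it writes a heap dump when it fails with an OutOfMemoryError.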
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=C:/
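It is easy to set these flags on the wrong JVM. In Spark they have to reach the executor JVMs, which is commonly done through spark.executor.extraJavaOptions. A small check like the one below can confirm what a running HotSpot JVM actually has in effect. This is a sketch under assumptions: the class name HeapDumpFlagCheck is hypothetical, and com.sun.management.HotSpotDiagnosticMXBean is only available on HotSpot-based JDKs.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class; requires a HotSpot-based JVM.
public class HeapDumpFlagCheck {

    /** Returns the current value of a HotSpot VM option as a string, e.g. "true" or "false". */
    public static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // Both options are the ones named in the flags above.
        System.out.println("HeapDumpOnOutOfMemoryError = " + vmOption("HeapDumpOnOutOfMemoryError"));
        System.out.println("HeapDumpPath = " + vmOption("HeapDumpPath"));
    }
}
```

Running it with and without -XX:+HeapDumpOnOutOfMemoryError on the command line should show the first value switch between true and false.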
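Personally, I think that once a "GC overhead limit exceeded" error really occurs, an actual OOM is not far off anyway, so having the JVM predict it and end the program early does not seem to add much.
In this case, the OOM was probably caused by the Receiver taking in data too quickly while it was saving data to disk.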
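If the receiver really is ingesting faster than the job can process, one option to try is capping the ingestion rate. Spark Streaming has the settings spark.streaming.receiver.maxRate (records per second per receiver) and spark.streaming.backpressure.enabled (Spark 1.5 and later). The sketch below only assembles them into spark-submit arguments. The class name ReceiverRateLimitConf and the value 1000 are placeholders for illustration, not values taken from this job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class that only builds command-line arguments; it does not talk to Spark.
public class ReceiverRateLimitConf {

    /** Builds rate-limiting settings; maxRatePerReceiver is in records per second. */
    public static Map<String, String> rateLimitSettings(int maxRatePerReceiver) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.streaming.backpressure.enabled", "true");
        conf.put("spark.streaming.receiver.maxRate", Integer.toString(maxRatePerReceiver));
        return conf;
    }

    /** Renders the settings as space-separated "--conf key=value" arguments for spark-submit. */
    public static String toSubmitArgs(Map<String, String> conf) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append("--conf ").append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 1000 records/s is an arbitrary placeholder; tune it against the batch processing time.
        System.out.println(toSubmitArgs(rateLimitSettings(1000)));
    }
}
```

Whether a rate cap actually removes the OOM here is an assumption that would need to be verified against the receiver's memory usage.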
16/03/10 13:28:55 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 416.0 (TID 45932, datanode2, partition 2,NODE_LOCAL, 2017 bytes)
16/03/10 13:28:55 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 416.0 (TID 45931) in 1868 ms on datanode2 (2/4)
Exception in thread "streaming-job-executor-0" java.lang.Error: java.lang.InterruptedException
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1151)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:612)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1314)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.take(RDD.scala:1288)
at org.apache.spark.rdd.RDD$$anonfun$isEmpty$1.apply$mcZ$sp(RDD.scala:1416)
at org.apache.spark.rdd.RDD$$anonfun$isEmpty$1.apply(RDD.scala:1416)
at org.apache.spark.rdd.RDD$$anonfun$isEmpty$1.apply(RDD.scala:1416)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.isEmpty(RDD.scala:1415)
at org.apache.spark.api.java.JavaRDDLike$class.isEmpty(JavaRDDLike.scala:501)
at org.apache.spark.api.java.AbstractJavaRDDLike.isEmpty(JavaRDDLike.scala:46)
at com.paic.data.cep.app.AppBolt$1.call(AppBolt.java:57)
at com.paic.data.cep.app.AppBolt$1.call(AppBolt.java:54)
at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$4.apply(JavaDStreamLike.scala:343)
at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$4.apply(JavaDStreamLike.scala:343)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:50)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:50)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:50)
at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:426)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:49)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:49)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:49)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:224)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:224)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:224)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:223)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
... 2 more
16/03/10 13:28:55 INFO scheduler.JobScheduler: Stopped JobScheduler
Four jobs failed in a cluster around 2016/03/10 13:28:00. The failure log:
java.io.IOException: java.lang.AssertionError: assertion failed
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1222)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:165)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:88)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:65)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:165)
at org.apache.spark.memory.UnifiedMemoryManager.acquireStorageMemory(UnifiedMemoryManager.scala:140)
at org.apache.spark.storage.MemoryStore.tryToPut(MemoryStore.scala:383)
at org.apache.spark.storage.MemoryStore.tryToPut(MemoryStore.scala:342)
at org.apache.spark.storage.MemoryStore.putBytes(MemoryStore.scala:99)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:803)
at org.apache.spark.storage.BlockManager.putBytes(BlockManager.scala:690)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$anonfun$$getRemote$1$1.apply(TorrentBroadcast.scala:130)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$anonfun$$getRemote$1$1.apply(TorrentBroadcast.scala:127)
at scala.Option.map(Option.scala:145)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.org$apache$spark$broadcast$TorrentBroadcast$$anonfun$$getRemote$1(TorrentBroadcast.scala:127)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$1.apply(TorrentBroadcast.scala:137)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$1.apply(TorrentBroadcast.scala:137)
at scala.Option.orElse(Option.scala:257)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:137)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:120)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:175)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1219)
... 12 more
Looking at the Executors list, the Address of Executor 7 shows CANNOT FIND ADDRESS. I'm speechless...