On a 4-core / 16 GB virtual machine I installed the full Hadoop ecosystem stack, including the community build of Spark 2.3.
After installation I smoke-tested it with the bundled example program org.apache.spark.examples.SparkPi, which failed with the following error:
Spark context stopped while waiting for backend
The complete error log:
[root@cdh-001 conf]# spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --driver-memory 600m --executor-memory 600m --executor-cores 2 $SPARK_HOME/examples/jars/spark-examples*.jar 10
2021-03-12 15:05:46 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-03-12 15:05:46 INFO SparkContext:54 - Running Spark version 2.3.0
2021-03-12 15:05:46 INFO SparkContext:54 - Submitted application: Spark Pi
2021-03-12 15:05:46 INFO SecurityManager:54 - Changing view acls to: root
2021-03-12 15:05:46 INFO SecurityManager:54 - Changing modify acls to: root
2021-03-12 15:05:46 INFO SecurityManager:54 - Changing view acls groups to:
2021-03-12 15:05:46 INFO SecurityManager:54 - Changing modify acls groups to:
2021-03-12 15:05:46 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2021-03-12 15:05:46 INFO Utils:54 - Successfully started service 'sparkDriver' on port 40299.
2021-03-12 15:05:46 INFO SparkEnv:54 - Registering MapOutputTracker
2021-03-12 15:05:46 INFO SparkEnv:54 - Registering BlockManagerMaster
2021-03-12 15:05:46 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2021-03-12 15:05:46 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2021-03-12 15:05:46 INFO DiskBlockManager:54 - Created local directory at /tmp/blockmgr-b49e6294-3bf9-42f3-956e-897aa442cf58
2021-03-12 15:05:46 INFO MemoryStore:54 - MemoryStore started with capacity 140.1 MB
2021-03-12 15:05:47 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2021-03-12 15:05:47 INFO log:192 - Logging initialized @1691ms
2021-03-12 15:05:47 INFO Server:346 - jetty-9.3.z-SNAPSHOT
2021-03-12 15:05:47 INFO Server:414 - Started @1762ms
2021-03-12 15:05:47 INFO AbstractConnector:278 - Started ServerConnector@2a551a63{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2021-03-12 15:05:47 INFO Utils:54 - Successfully started service 'SparkUI' on port 4040.
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7cd1ac19{/jobs,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2a2bb0eb{/jobs/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3c291aad{/jobs/job,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@733037{/jobs/job/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7728643a{/stages,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@320e400{/stages/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5167268{/stages/stage,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2c444798{/stages/stage/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1af7f54a{/stages/pool,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6ebd78d1{/stages/pool/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@436390f4{/storage,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4d157787{/storage/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@68ed96ca{/storage/rdd,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6d1310f6{/storage/rdd/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3228d990{/environment,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@54e7391d{/environment/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@50b8ae8d{/executors,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@255990cc{/executors/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@51c929ae{/executors/threadDump,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3c8bdd5b{/executors/threadDump/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@29d2d081{/static,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@60afd40d{/,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@28a2a3e7{/api,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@33a2499c{/jobs/job/kill,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@e72dba7{/stages/stage/kill,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://cdh-001:4040
2021-03-12 15:05:47 INFO SparkContext:54 - Added JAR file:/data/my_bdc_apps/spark/examples/jars/spark-examples_2.11-2.3.0.jar at spark://cdh-001:40299/jars/spark-examples_2.11-2.3.0.jar with timestamp 1615532747291
2021-03-12 15:05:47 INFO RMProxy:98 - Connecting to ResourceManager at cdh-002/10.6.2.245:8032
2021-03-12 15:05:48 INFO Client:54 - Requesting a new application from cluster with 4 NodeManagers
2021-03-12 15:05:48 INFO Client:54 - Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
2021-03-12 15:05:48 INFO Client:54 - Will allocate AM container, with 896 MB memory including 384 MB overhead
2021-03-12 15:05:48 INFO Client:54 - Setting up container launch context for our AM
2021-03-12 15:05:48 INFO Client:54 - Setting up the launch environment for our AM container
2021-03-12 15:05:48 INFO Client:54 - Preparing resources for our AM container
2021-03-12 15:05:49 WARN Client:66 - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
2021-03-12 15:05:50 INFO Client:54 - Uploading resource file:/tmp/spark-bf6b7bbf-3512-4522-8b31-47a6a65187ac/__spark_libs__6875100117796151482.zip -> hdfs://nameservice1/user/root/.sparkStaging/application_1615455919840_0004/__spark_libs__6875100117796151482.zip
2021-03-12 15:05:51 INFO Client:54 - Uploading resource file:/tmp/spark-bf6b7bbf-3512-4522-8b31-47a6a65187ac/__spark_conf__38977155164719923.zip -> hdfs://nameservice1/user/root/.sparkStaging/application_1615455919840_0004/__spark_conf__.zip
2021-03-12 15:05:51 INFO SecurityManager:54 - Changing view acls to: root
2021-03-12 15:05:51 INFO SecurityManager:54 - Changing modify acls to: root
2021-03-12 15:05:51 INFO SecurityManager:54 - Changing view acls groups to:
2021-03-12 15:05:51 INFO SecurityManager:54 - Changing modify acls groups to:
2021-03-12 15:05:51 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2021-03-12 15:05:51 INFO Client:54 - Submitting application application_1615455919840_0004 to ResourceManager
2021-03-12 15:05:51 INFO YarnClientImpl:273 - Submitted application application_1615455919840_0004
2021-03-12 15:05:51 INFO SchedulerExtensionServices:54 - Starting Yarn extension services with app application_1615455919840_0004 and attemptId None
2021-03-12 15:05:52 INFO Client:54 - Application report for application_1615455919840_0004 (state: ACCEPTED)
2021-03-12 15:05:52 INFO Client:54 -
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1615532751522
final status: UNDEFINED
tracking URL: http://cdh-002:8088/proxy/application_1615455919840_0004/
user: root
2021-03-12 15:05:53 INFO Client:54 - Application report for application_1615455919840_0004 (state: ACCEPTED)
2021-03-12 15:05:54 INFO Client:54 - Application report for application_1615455919840_0004 (state: ACCEPTED)
2021-03-12 15:05:55 INFO Client:54 - Application report for application_1615455919840_0004 (state: ACCEPTED)
2021-03-12 15:05:55 INFO YarnClientSchedulerBackend:54 - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> cdh-002, PROXY_URI_BASES -> http://cdh-002:8088/proxy/application_1615455919840_0004), /proxy/application_1615455919840_0004
2021-03-12 15:05:55 INFO JettyUtils:54 - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2021-03-12 15:05:56 INFO YarnSchedulerBackend$YarnSchedulerEndpoint:54 - ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
2021-03-12 15:05:56 INFO Client:54 - Application report for application_1615455919840_0004 (state: RUNNING)
2021-03-12 15:05:56 INFO Client:54 -
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.6.2.248
ApplicationMaster RPC port: 0
queue: default
start time: 1615532751522
final status: UNDEFINED
tracking URL: http://cdh-002:8088/proxy/application_1615455919840_0004/
user: root
2021-03-12 15:05:56 INFO YarnClientSchedulerBackend:54 - Application application_1615455919840_0004 has started running.
2021-03-12 15:05:56 INFO Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 46294.
2021-03-12 15:05:56 INFO NettyBlockTransferService:54 - Server created on cdh-001:46294
2021-03-12 15:05:56 INFO BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2021-03-12 15:05:56 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, cdh-001, 46294, None)
2021-03-12 15:05:56 INFO BlockManagerMasterEndpoint:54 - Registering block manager cdh-001:46294 with 140.1 MB RAM, BlockManagerId(driver, cdh-001, 46294, None)
2021-03-12 15:05:56 INFO BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, cdh-001, 46294, None)
2021-03-12 15:05:56 INFO BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, cdh-001, 46294, None)
2021-03-12 15:05:56 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@626b639e{/metrics/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:59 INFO YarnClientSchedulerBackend:54 - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> cdh-002, PROXY_URI_BASES -> http://cdh-002:8088/proxy/application_1615455919840_0004), /proxy/application_1615455919840_0004
2021-03-12 15:05:59 INFO JettyUtils:54 - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2021-03-12 15:05:59 INFO YarnSchedulerBackend$YarnSchedulerEndpoint:54 - ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
2021-03-12 15:06:03 ERROR YarnClientSchedulerBackend:70 - Yarn application has already exited with state FINISHED!
2021-03-12 15:06:03 INFO AbstractConnector:318 - Stopped Spark@2a551a63{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2021-03-12 15:06:03 INFO SparkUI:54 - Stopped Spark web UI at http://cdh-001:4040
2021-03-12 15:06:03 ERROR TransportClient:233 - Failed to send RPC 8719934078843937066 to /10.6.2.248:50270: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
2021-03-12 15:06:03 ERROR YarnSchedulerBackend$YarnSchedulerEndpoint:91 - Sending RequestExecutors(0,0,Map(),Set()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 8719934078843937066 to /10.6.2.248:50270: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
at io.netty.util.concurrent.DefaultPromise.access$000(DefaultPromise.java:34)
at io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:431)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedChannelException
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
2021-03-12 15:06:03 INFO SchedulerExtensionServices:54 - Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
2021-03-12 15:06:03 ERROR Utils:91 - Uncaught exception in thread Yarn application state monitor
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.requestTotalExecutors(CoarseGrainedSchedulerBackend.scala:566)
at org.apache.spark.scheduler.cluster.YarnSchedulerBackend.stop(YarnSchedulerBackend.scala:95)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.stop(YarnClientSchedulerBackend.scala:155)
at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:508)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1752)
at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1924)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1357)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1923)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:112)
Caused by: java.io.IOException: Failed to send RPC 8719934078843937066 to /10.6.2.248:50270: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
at io.netty.util.concurrent.DefaultPromise.access$000(DefaultPromise.java:34)
at io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:431)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedChannelException
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
2021-03-12 15:06:03 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2021-03-12 15:06:03 INFO MemoryStore:54 - MemoryStore cleared
2021-03-12 15:06:03 INFO BlockManager:54 - BlockManager stopped
2021-03-12 15:06:03 INFO BlockManagerMaster:54 - BlockManagerMaster stopped
2021-03-12 15:06:03 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2021-03-12 15:06:03 ERROR SparkContext:91 - Error initializing SparkContext.
java.lang.IllegalStateException: Spark context stopped while waiting for backend
at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:669)
at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:177)
at org.apache.spark.SparkContext.(SparkContext.scala:558)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2021-03-12 15:06:03 INFO SparkContext:54 - SparkContext already stopped.
2021-03-12 15:06:03 INFO SparkContext:54 - Successfully stopped SparkContext
Exception in thread "main" java.lang.IllegalStateException: Spark context stopped while waiting for backend
at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:669)
at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:177)
at org.apache.spark.SparkContext.(SparkContext.scala:558)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2021-03-12 15:06:03 INFO ShutdownHookManager:54 - Shutdown hook called
2021-03-12 15:06:03 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-ad5a6ab4-8aa4-4c06-a783-02195cd8c569
2021-03-12 15:06:03 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-bf6b7bbf-3512-4522-8b31-47a6a65187ac
My initial guess: with so many Hadoop components already running on this server, little memory remained by the time Spark was installed on top. During spark-submit, Spark checks the memory actually available to it, and when it comes up short it fails with the error above.
Since memory was tight after installing the full Hadoop stack, I experimented with the two memory-related submit parameters, --driver-memory and --executor-memory.
By pushing these two parameters to extreme values and reading the resulting logs, we can infer the memory floor the program requires versus the limits the server currently imposes. Test results below:
①、Test 1
[root@cdh-001 conf]# spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --driver-memory 500m --executor-memory 100m --executor-cores 2 $SPARK_HOME/examples/jars/spark-examples*.jar 10
2021-03-12 15:05:17 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-03-12 15:05:18 INFO SparkContext:54 - Running Spark version 2.3.0
2021-03-12 15:05:18 INFO SparkContext:54 - Submitted application: Spark Pi
2021-03-12 15:05:18 INFO SecurityManager:54 - Changing view acls to: root
2021-03-12 15:05:18 INFO SecurityManager:54 - Changing modify acls to: root
2021-03-12 15:05:18 INFO SecurityManager:54 - Changing view acls groups to:
2021-03-12 15:05:18 INFO SecurityManager:54 - Changing modify acls groups to:
2021-03-12 15:05:18 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2021-03-12 15:05:18 INFO Utils:54 - Successfully started service 'sparkDriver' on port 43552.
2021-03-12 15:05:18 INFO SparkEnv:54 - Registering MapOutputTracker
2021-03-12 15:05:18 ERROR SparkContext:91 - Error initializing SparkContext.
java.lang.IllegalArgumentException: System memory 466092032 must be at least 471859200. Please increase heap size using the --driver-memory option or spark.driver.memory in Spark configuration.
at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:217)
at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:199)
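The 471859200-byte figure in the error is not arbitrary: Spark's UnifiedMemoryManager reserves 300 MB for the system and requires the usable JVM heap to be at least 1.5× that reservation. A quick sketch of the arithmetic (the 300 MB constant is Spark's reserved system memory):

```shell
# Spark 2.x reserves 300 MB and refuses to start the UnifiedMemoryManager
# unless the usable heap is at least 1.5 x that reservation.
RESERVED_BYTES=$(( 300 * 1024 * 1024 ))
MIN_SYSTEM_BYTES=$(( RESERVED_BYTES * 3 / 2 ))
echo "$MIN_SYSTEM_BYTES"    # 471859200, the floor quoted in the error

# With --driver-memory 500m, Runtime.maxMemory() inside the JVM comes out
# slightly below 500 MB (466092032 bytes in the log, because the JVM
# excludes a survivor space), so the check fails; 600m clears the floor.
```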
②、Test 2
[root@cdh-001 conf]# spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --driver-memory 600m --executor-memory 100m --executor-cores 2 $SPARK_HOME/examples/jars/spark-examples*.jar 10
2021-03-12 15:05:31 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-03-12 15:05:31 INFO SparkContext:54 - Running Spark version 2.3.0
2021-03-12 15:05:31 INFO SparkContext:54 - Submitted application: Spark Pi
2021-03-12 15:05:31 INFO SecurityManager:54 - Changing view acls to: root
2021-03-12 15:05:31 INFO SecurityManager:54 - Changing modify acls to: root
2021-03-12 15:05:31 INFO SecurityManager:54 - Changing view acls groups to:
2021-03-12 15:05:31 INFO SecurityManager:54 - Changing modify acls groups to:
2021-03-12 15:05:31 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2021-03-12 15:05:31 INFO Utils:54 - Successfully started service 'sparkDriver' on port 39530.
2021-03-12 15:05:31 INFO SparkEnv:54 - Registering MapOutputTracker
2021-03-12 15:05:31 ERROR SparkContext:91 - Error initializing SparkContext.
java.lang.IllegalArgumentException: Executor memory 104857600 must be at least 471859200. Please increase executor memory using the --executor-memory option or spark.executor.memory in Spark configuration.
at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:225)
at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:199)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:330)
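The executor-side failure has the same cause: 100m is only 104857600 bytes, far below the same 450 MB floor. For the record:

```shell
# --executor-memory 100m in bytes, versus Spark's 450 MB minimum
EXECUTOR_BYTES=$(( 100 * 1024 * 1024 ))
MIN_SYSTEM_BYTES=$(( 300 * 1024 * 1024 * 3 / 2 ))
echo "$EXECUTOR_BYTES vs $MIN_SYSTEM_BYTES"   # 104857600 vs 471859200
```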
On the surface, the server simply lacks the resources to allocate the requested memory when Spark runs.
The machine has 16 GB of RAM, and even with the full stack installed, free -g showed more available memory than the limits quoted in the errors.
But since we are running Spark on YARN, could a YARN restriction or misconfiguration be the real cause?
It turns out the YARN NodeManager checks the memory usage of submitted tasks (here, Spark on YARN) against both physical-memory and virtual-memory limits; on a machine with modest memory, containers may fail these checks and be killed.
One option is to disable these checks, so the job can start without the pre-flight memory validation (constrained by physical memory, it may of course still fail later).
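If you would rather not disable the virtual-memory check entirely, a softer alternative (my suggestion, not part of the original troubleshooting) is to raise the allowed virtual-to-physical ratio, which defaults to 2.1:

```xml
<!-- yarn-site.xml: let containers use more virtual memory per MB of
     physical memory before the NodeManager kills them (default 2.1) -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>4</value>
</property>
```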
Solution:
Add the following properties to yarn-site.xml and restart YARN. With these in place, the program now runs successfully with "--driver-memory 600m --executor-memory 600m":

<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>