19/07/09 17:00:48 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
19/07/09 17:00:49 WARN Utils: Service 'Driver' could not bind on a random free port. You may check whether configuring an appropriate binding address.
Exception in thread "main" java.net.BindException: Cannot assign requested address: Service 'Driver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'Driver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1283)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:989)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:364)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
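For context, the bottom of the stack trace (sun.nio.ch.Net.bind0) is the operating system refusing the bind. The "Cannot assign requested address" form of BindException is what the JVM reports when a socket is bound to an IP address that no local network interface owns. The following is a minimal sketch of that behavior using only the JDK. The class name BindAddressCheck and the address 192.0.2.1 (a documentation-only address, assumed not to be configured on the machine) are illustrative and do not come from the original setup. It only reproduces the OS-level error and is not a diagnosis of this particular cluster.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of Spark: checks whether this machine can bind to an address.
final class BindAddressCheck {

    /**
     * Tries to bind a server socket to the given address on a random free port (port 0).
     * Returns true if the bind succeeds and false if the OS rejects it with a BindException.
     */
    static boolean canBind(String address) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(InetAddress.getByName(address), 0));
            return true;
        } catch (BindException e) {
            // Typical cause: the address does not belong to any local interface.
            System.out.println(address + " -> " + e.getMessage());
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The wildcard address is always bindable; 192.0.2.1 is assumed not to be local.
        System.out.println("0.0.0.0 bindable: " + canBind("0.0.0.0"));
        System.out.println("192.0.2.1 bindable: " + canBind("192.0.2.1"));
    }
}
```

Running a check like this on the machine where the driver process actually starts can help confirm whether a given bind address is valid there. In cluster deploy mode the driver is started by a worker rather than on the submitting machine, so that is where the address has to be valid.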
The code inside the jar that Spark executes is as follows:
SparkSession.Builder builder = SparkSession.builder()/*.master("local[*]")*/
        .appName("SparkCalculateRecommend")
        .config("spark.mongodb.input.uri", "mongodb://xx:[email protected]:27018/sns.igomoMemberInfo_Spark_input")
        .config("spark.mongodb.output.uri", "mongodb://xx:[email protected]:27018/sns.igomoMemberInfo_Spark_output")
        .config("spark.driver.bindAddress", "127.0.0.1")
        .config("spark.executor.memory", "1g")
        .config("es.nodes", esIpAddr)
        .config("es.port", "9200")
        .config("es.nodes.wan.only", "true");
The Java code I use to submit the job to the cluster is as follows:
SparkLauncher spark = new SparkLauncher()
        .setDeployMode("cluster")
        .setMainClass("com.fsdn.zaodian.spark.XunMeiSpark")
        .setMaster("spark://192.168.31.205:8180")
        .setConf(SparkLauncher.EXECUTOR_MEMORY, "512m")
        .setConf(SparkLauncher.EXECUTOR_CORES, "2")
        .setSparkHome("/data/spark-2.4.3")
        // .setSparkHome("D:\\soft\\spark-2.4.3-bin-hadoop2.7")
        .setVerbose(true)
        .setAppResource("/data/spark-2.4.3/examples/jars/zaodian-0.0.1-SNAPSHOT.jar")
        .addAppArgs(memberIds);
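After running into a series of pitfalls, it turned out that the DeployMode parameter was set incorrectly. Commenting it out solved the problem. The code after commenting it out is as follows: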
SparkLauncher spark = new SparkLauncher()
        // .setDeployMode("cluster")
        .setMainClass("com.fsdn.zaodian.spark.XunMeiSpark")
        .setMaster("spark://192.168.31.205:8180")
        .setConf(SparkLauncher.EXECUTOR_MEMORY, "512m")
        .setConf(SparkLauncher.EXECUTOR_CORES, "2")
        .setSparkHome("/data/spark-2.4.3")
        // .setSparkHome("D:\\soft\\spark-2.4.3-bin-hadoop2.7")
        .setVerbose(true)
        .setAppResource("/data/spark-2.4.3/examples/jars/zaodian-0.0.1-SNAPSHOT.jar")
        .addAppArgs(memberIds);
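The detailed troubleshooting process was as follows:
After many experiments, I found that submitting to the Spark cluster from the Linux shell ran normally, while submitting to the same cluster from Java code failed. I therefore suspected that the submission parameters were the problem.
Parameters logged after submitting from the shell: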
Spark Executor Command:
"/usr/java/jdk1.8.0_171/bin/java"
"-cp" "/data/spark-2.4.3/conf/:/data/spark-2.4.3/jars/*"
"-Xmx512M" "-Dspark.ui.port=4349"
"-Dspark.driver.port=35938" "org.apache.spark.executor.CoarseGrainedExecutorBackend"
"--driver-url" "spark://CoarseGrainedScheduler@okdiz:35938"
"--executor-id" "1"
"--hostname" "192.168.31.207"
"--cores" "1"
"--app-id" "app-20190710174108-0011"
"--worker-url" "spark://[email protected]:20157"
Parameters logged after submitting from Java code:
Launch Command: "/usr/java/jdk1.8.0_171/bin/java"
"-cp" "/data/spark-2.4.3/conf/:/data/spark-2.4.3/jars/*"
"-Xmx1024M" "-Dspark.executor.memory=512m"
"-Dspark.driver.supervise=false"
"-Dspark.app.name=com.fsdn.zaodian.spark.XunMeiSpark"
"-Dspark.submit.deployMode=cluster"
"-Dspark.ui.port=4349"
"-Dspark.master=spark://192.168.31.205:8180"
"-Dspark.jars=file:/data/spark-2.4.3/examples/jars/zaodian-0.0.1-SNAPSHOT.jar"
"-Dspark.rpc.askTimeout=10s"
"-Dspark.executor.cores=2" "org.apache.spark.deploy.worker.DriverWrapper"
"spark://[email protected]:10440"
"/data/spark-2.4.3/work/driver-20190710181012-0001/zaodian-0.0.1-SNAPSHOT.jar"
"com.fsdn.zaodian.spark.XunMeiSpark"
"420388870078501"
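As you can see, the two sets of parameters differ in many items. Experimenting showed that specifying "-Dspark.submit.deployMode=cluster" was what made it impossible to connect to the cluster port, hence this post.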
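The comparison of the two logged commands can also be done mechanically. The following is a rough sketch that extracts the quoted "-Dkey=value" system properties from each command and lists those that appear only in, or differ in, the Java-launched one. The class name LaunchCommandDiff and its methods are hypothetical helpers, not part of Spark, and the input strings are shortened excerpts of the logs above. Note also that the shell log shows an executor command while the Java log shows the driver launch command, so the comparison is only approximate.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for comparing the -D properties of two logged JVM commands.
final class LaunchCommandDiff {

    // Matches quoted JVM system properties such as "-Dspark.ui.port=4349".
    private static final Pattern SYS_PROP = Pattern.compile("\"-D([^=\"]+)=([^\"]*)\"");

    /** Extracts every "-Dkey=value" argument from a logged command line, in order. */
    static Map<String, String> systemProperties(String command) {
        Map<String, String> props = new LinkedHashMap<>();
        Matcher m = SYS_PROP.matcher(command);
        while (m.find()) {
            props.put(m.group(1), m.group(2));
        }
        return props;
    }

    /** Returns the properties of {@code other} that are missing from, or different in, {@code base}. */
    static Map<String, String> onlyIn(Map<String, String> other, Map<String, String> base) {
        Map<String, String> diff = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : other.entrySet()) {
            if (!e.getValue().equals(base.get(e.getKey()))) {
                diff.put(e.getKey(), e.getValue());
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        // Shortened excerpts of the two logged commands.
        String shellCmd = "\"-Xmx512M\" \"-Dspark.ui.port=4349\" \"-Dspark.driver.port=35938\"";
        String javaCmd = "\"-Xmx1024M\" \"-Dspark.executor.memory=512m\" "
                + "\"-Dspark.submit.deployMode=cluster\" \"-Dspark.ui.port=4349\"";

        Map<String, String> extra = onlyIn(systemProperties(javaCmd), systemProperties(shellCmd));
        extra.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With these excerpts, spark.ui.port is filtered out because both commands carry the same value, which leaves spark.executor.memory and spark.submit.deployMode as candidates to test one at a time.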