Problems with Spark on YARN [WordCount test error: ipc.Client fails to connect to the YARN port]

Source: http://www.myexception.cn/open-source/1585678.html


Problems with Spark on YARN (1)
Testing Spark on YARN

Spark version: spark-0.9.0-incubating-bin-hadoop2

WordCount.scala code:

import org.apache.spark._
import org.apache.spark.SparkContext._

object WordCount {
  def main(args: Array[String]) {
    if (args.length != 3) {
      println("usage is org.test.WordCount <master> <input> <output>")
      return
    }
    // args(0): master URL, args(1): input path, args(2): output path
    val sc = new SparkContext(args(0), "WordCount", System.getenv("SPARK_HOME"),
      Seq(System.getenv("SPARK_TEST_JAR")))
    val textFile = sc.textFile(args(1))
    val result = textFile
      .flatMap(line => line.split("\\s+"))   // split each line into words
      .map(word => (word, 1))                // pair each word with a count of 1
      .reduceByKey(_ + _)                    // sum the counts per word
    result.saveAsTextFile(args(2))
  }
}
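The flatMap/map/reduceByKey pipeline above can be sketched on plain Scala collections, without a cluster, to see what the job computes (a local illustration only; `groupBy` plus `sum` here stands in for Spark's `reduceByKey`):

```scala
object WordCountLocal {
  // Same pipeline as the Spark job, but on an in-memory Seq of lines.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))              // split each line into words
      .filter(_.nonEmpty)                    // drop empty tokens from leading whitespace
      .map(word => (word, 1))                // pair each word with a count of 1
      .groupBy(_._1)                         // local analogue of reduceByKey
      .map { case (w, pairs) => (w, pairs.map(_._2).sum) }
}
```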


Running this program on Spark on YARN raised the following exception:

14/03/05 15:57:48 DEBUG ipc.Client: The ping interval is 60000 ms.
14/03/05 15:57:48 DEBUG ipc.Client: Connecting to 0.0.0.0/0.0.0.0:8030
14/03/05 15:57:49 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:50 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:51 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:52 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:53 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:54 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:55 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:56 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:57 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:58 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:58 DEBUG ipc.Client: closing ipc connection to 0.0.0.0/0.0.0.0:8030: Connection refused
java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:547)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
	at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
	at org.apache.hadoop.ipc.Client.call(Client.java:1318)
	at org.apache.hadoop.ipc.Client.call(Client.java:1300)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
	at com.sun.proxy.$Proxy11.registerApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106)
	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at com.sun.proxy.$Proxy12.registerApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:197)
	at org.apache.spark.deploy.yarn.ApplicationMaster.registerApplicationMaster(ApplicationMaster.scala:138)
	at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:102)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:429)
	at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
14/03/05 15:57:58 DEBUG ipc.Client: IPC Client (1553281132) connection to 0.0.0.0/0.0.0.0:8030 from hadoop: closed


The exception shows that the client cannot connect to port 8030, which is the ResourceManager's scheduler port. That port was already configured in yarn-site.xml:
        <property>
                <name>yarn.resourcemanager.scheduler.address.rm2</name>
                <value>master2:8030</value>
        </property>
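The `0.0.0.0/0.0.0.0:8030` in the log hints at what is happening. Roughly (a simplified sketch of YARN's defaulting behavior, not its actual code): when the unsuffixed `yarn.resourcemanager.scheduler.address` key is absent, the address is derived from `yarn.resourcemanager.hostname`, and that hostname itself defaults to `0.0.0.0`:

```scala
object SchedulerAddress {
  // Sketch of YARN's config fallback (assumption, simplified): an explicit
  // scheduler address wins; otherwise the address is derived from the RM
  // hostname, which defaults to 0.0.0.0 -- exactly what the log shows.
  val DefaultHostname = "0.0.0.0"
  val SchedulerPort   = 8030

  def resolveSchedulerAddress(conf: Map[String, String]): String =
    conf.getOrElse("yarn.resourcemanager.scheduler.address",
      conf.getOrElse("yarn.resourcemanager.hostname", DefaultHostname) + ":" + SchedulerPort)
}
```

Under this model, the `.rm2`-suffixed key in the config above is simply never consulted on this code path, so the client ends up dialing `0.0.0.0:8030`.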


Yet the connection kept failing no matter what. The solution to this problem is:

Add the following to yarn-site.xml to explicitly specify the RM's hostname:

        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>master1</value>
        </property>


With this in place the connection succeeds; pinning down the exact cause would require reading Spark's source. A plausible explanation (unverified) is that this code path reads the unsuffixed `yarn.resourcemanager.scheduler.address` key, ignores the `.rm2`-suffixed one, and falls back to `yarn.resourcemanager.hostname`, whose default of `0.0.0.0` matches the address in the log.
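If the `.rm2` suffix in the original config was meant for ResourceManager HA, a cleaner variant (a hedged sketch based on Hadoop's RM HA configuration scheme; `rm1`/`rm2` ids and the `master1`/`master2` hosts are taken from this setup) is to set the per-RM hostname keys, from which YARN derives each RM's scheduler, admin, and webapp addresses:

```xml
<!-- Sketch only: per-rm-id hostnames for an RM HA setup; YARN derives the
     per-RM service addresses (including the :8030 scheduler port) from these. -->
<property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>master1</value>
</property>
<property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>master2</value>
</property>
```

Whether this helps depends on the Hadoop version in use; on a non-HA cluster, the plain `yarn.resourcemanager.hostname` fix above is the one that matters.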
