Errors Encountered While Installing Spark on YARN

Error 1:

When running in yarn-client mode:

[main] client.RMProxy (RMProxy.java:createRMProxy(92)) - Connecting to ResourceManager at /0.0.0.0:8032

2014-11-26 15:16:35,416 INFO  [main] ipc.Client (Client.java:handleConnectionFailure(842)) - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2014-11-26 15:16:36,418 INFO  [main] ipc.Client (Client.java:handleConnectionFailure(842)) - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

 

Solution:

Add the following to spark-env.sh:

export HADOOP_CONF_DIR=/etc/hadoop/conf

Although the config file already contains

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-etc/hadoop/conf}

it does not take effect; the explicit line above still needs to be added. (Note that the quoted default is the relative path etc/hadoop/conf rather than /etc/hadoop/conf, which may be why the fallback never resolves.)
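
The underlying cause is that Spark cannot locate yarn-site.xml and therefore falls back to the default ResourceManager address of 0.0.0.0:8032. As a quick sanity check (a sketch, assuming the standard /etc/hadoop/conf layout), verify that the file defines the real ResourceManager address:

grep -A1 "yarn.resourcemanager.address" /etc/hadoop/conf/yarn-site.xml

An empty result means the client will keep retrying 0.0.0.0:8032.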

 

Error 2:

ERROR yarn.Client: Required executor memory (3072 MB), is above the max threshold (2048 MB) of this cluster.

This happens because we are running on YARN: the cluster caps container size via yarn.scheduler.maximum-allocation-mb, which is set to 2048 MB, while spark-defaults.conf sets spark.executor.memory to 3g. Commenting out that parameter in spark-defaults.conf (or lowering it below the YARN limit) resolves the error.
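
Concretely, either side of the mismatch can be adjusted; a sketch assuming the stock file locations:

spark-defaults.conf (comment out or shrink the executor memory):

#spark.executor.memory    3g

or raise the per-container ceiling in yarn-site.xml:

<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>

Whichever route you take, the requested executor memory (plus any memory overhead Spark adds) must stay at or below yarn.scheduler.maximum-allocation-mb.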

 

Error 3:

After switching to the full Hadoop cluster mode, the following errors appeared:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

14/11/28 13:50:13 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.

 

14/11/28 09:56:07 ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library

java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path

       at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)

       at java.lang.Runtime.loadLibrary0(Runtime.java:849)

       at java.lang.System.loadLibrary(System.java:1088)

       at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)

       at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)

 

Solution: add the following environment settings (e.g., in spark-env.sh):

# add new by dw for native-hadoop library

export HADOOP_HOME=/usr/lib/hadoop

 

CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib/hadoop-lzo.jar

export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:$HADOOP_HOME/lib/native/

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/

 

With that in place everything works; the log now shows:

14/11/28 14:27:00 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library

14/11/28 14:27:00 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev c3bcb9c70b90b75fc1ddbb73dfe18dfddd16dc67]
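
As an additional check, reasonably recent Hadoop 2.x builds ship a checknative command that reports whether libhadoop and the bundled codecs load (hadoop-lzo is a separate GPL add-on, so it is usually not listed, but "hadoop: true" confirms the native library path is correct):

hadoop checknative -a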

 

Addendum:

If Hadoop runs in HA mode and Spark is to record its history logs in HDFS, the corresponding configuration files need to be adjusted to use the HA nameservice (here, zxcluster):

spark-defaults.conf

spark.eventLog.dir    hdfs://zxcluster/user/spark/applicationHistory

spark.yarn.jar        hdfs://zxcluster/user/spark/share/lib/spark-assembly.jar

 

/etc/default/spark

export SPARK_HISTORY_SERVER_LOG_DIR=hdfs://zxcluster/user/spark/applicationHistory
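
For this to work end to end, the HDFS paths must exist and the assembly jar must be uploaded; a minimal preparation sketch (the local jar location /usr/lib/spark/lib is an assumption, adjust it to your install), and note that spark.eventLog.enabled must also be set to true for event logging:

hdfs dfs -mkdir -p /user/spark/applicationHistory

hdfs dfs -mkdir -p /user/spark/share/lib

hdfs dfs -put /usr/lib/spark/lib/spark-assembly.jar /user/spark/share/lib/

Because zxcluster is a nameservice ID rather than a host:port pair, the client resolves it through the HA settings in hdfs-site.xml, so HADOOP_CONF_DIR must be visible to Spark here as well.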

