Error 1:
When running in yarn-client mode:
[main] client.RMProxy (RMProxy.java:createRMProxy(92)) - Connecting to ResourceManager at /0.0.0.0:8032
2014-11-26 15:16:35,416 INFO [main] ipc.Client (Client.java:handleConnectionFailure(842)) - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-11-26 15:16:36,418 INFO [main] ipc.Client (Client.java:handleConnectionFailure(842)) - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
Solution:
Add the following to spark-env.sh:
export HADOOP_CONF_DIR=/etc/hadoop/conf
Although the config file already contains
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-etc/hadoop/conf}
that default does not take effect; the explicit export above is still needed.
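Connecting to 0.0.0.0:8032 means Spark never found a yarn-site.xml containing a real ResourceManager address, so it fell back to the default. A quick sanity check (a sketch; `check_rm_config` is a hypothetical helper name, and /etc/hadoop/conf is the path exported above):

```shell
# check_rm_config: verify that a Hadoop conf dir defines an explicit
# ResourceManager address; without one, YARN clients fall back to the
# default 0.0.0.0:8032 seen in the retry messages above.
check_rm_config() {
    conf_dir="$1"
    if [ -f "$conf_dir/yarn-site.xml" ] \
        && grep -q "yarn.resourcemanager" "$conf_dir/yarn-site.xml"; then
        echo "ok: ResourceManager settings found in $conf_dir/yarn-site.xml"
        return 0
    fi
    echo "missing: no ResourceManager settings in $conf_dir/yarn-site.xml"
    return 1
}

# Check the directory exported in spark-env.sh:
check_rm_config /etc/hadoop/conf || true
```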
Error 2:
ERROR yarn.Client: Required executor memory (3072 MB), is above the max threshold (2048 MB) of this cluster.
This happens because YARN mode is in use: the maximum container memory configured in YARN's scheduler (yarn.scheduler.maximum-allocation-mb) is 2048 MB, while spark-defaults.conf sets spark.executor.memory to 3g. Commenting out that parameter in spark-defaults.conf resolves it.
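The comparison behind that error message can be sketched as follows (a simplified illustration, not Spark's actual code; Spark's real check also adds the executor memory overhead on top of the requested amount):

```shell
# fits_in_yarn: does a requested executor memory fit under YARN's
# per-container cap (yarn.scheduler.maximum-allocation-mb)? Values in MB.
fits_in_yarn() {
    requested_mb="$1"
    yarn_max_mb="$2"
    if [ "$requested_mb" -gt "$yarn_max_mb" ]; then
        echo "Required executor memory (${requested_mb} MB) is above the max threshold (${yarn_max_mb} MB)"
        return 1
    fi
    echo "ok: ${requested_mb} MB <= ${yarn_max_mb} MB"
}

# The failing combination from this post (spark.executor.memory=3g vs a 2048 MB cap):
fits_in_yarn 3072 2048 || true
```

Besides commenting out spark.executor.memory, you could pass a smaller --executor-memory on the spark-submit command line (leaving headroom for the executor memory overhead), or raise yarn.scheduler.maximum-allocation-mb in yarn-site.xml.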
Error 3:
After switching to Hadoop cluster mode, the following errors appear.
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/28 13:50:13 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
14/11/28 09:56:07 ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)
    at java.lang.Runtime.loadLibrary0(Runtime.java:849)
    at java.lang.System.loadLibrary(System.java:1088)
    at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>
    at com.hadoop.compression.lzo.LzoCodec.<clinit>
Solution:
Add the following environment settings:
# add new by dw for native-hadoop library
export HADOOP_HOME=/usr/lib/hadoop
CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib/hadoop-lzo.jar
export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:$HADOOP_HOME/lib/native/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/
With that, everything works; the log now shows:
14/11/28 14:27:00 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
14/11/28 14:27:00 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev c3bcb9c70b90b75fc1ddbb73dfe18dfddd16dc67]
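To confirm the fix took hold, you can also check that the shared libraries named in the errors actually exist under the native directory (a sketch; `check_native_libs` is a hypothetical helper, and the path is the one added to LD_LIBRARY_PATH above; on Hadoop 2.4+ `hadoop checknative` gives a similar report):

```shell
# check_native_libs: report whether the native libraries the warnings
# refer to (libhadoop, libgplcompression) are present in a directory.
check_native_libs() {
    native_dir="$1"
    missing=0
    for lib in libhadoop libgplcompression; do
        if ls "$native_dir/$lib"*.so* >/dev/null 2>&1; then
            echo "found: $lib in $native_dir"
        else
            echo "missing: $lib in $native_dir"
            missing=1
        fi
    done
    return $missing
}

# The directory exported above:
check_native_libs /usr/lib/hadoop/lib/native || true
```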
Addendum:
If Hadoop runs in HA mode and Spark writes its history logs into HDFS, the corresponding configuration files must be updated to use the HA nameservice:
spark-defaults.conf
spark.eventLog.dir hdfs://zxcluster/user/spark/applicationHistory
spark.yarn.jar hdfs://zxcluster/user/spark/share/lib/spark-assembly.jar
/etc/default/spark
export SPARK_HISTORY_SERVER_LOG_DIR=hdfs://zxcluster/user/spark/applicationHistory
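Note that hdfs://zxcluster works without a host:port because in HA mode the nameservice is a logical name that HDFS clients resolve to the active NameNode via hdfs-site.xml. The log directory also has to exist before the history server starts; a sketch of the one-time setup (assuming a `spark` user owns the directory; adjust users and permissions to your cluster):

```shell
# One-time setup for the history log directory on the HA nameservice.
# "zxcluster" is the logical nameservice defined in hdfs-site.xml.
hdfs dfs -mkdir -p hdfs://zxcluster/user/spark/applicationHistory
hdfs dfs -chown -R spark:spark hdfs://zxcluster/user/spark
```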