Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream

Running spark-submit or spark-shell on CDH 5.16.2 fails with the following error:

[root@hadoop103 ~]# spark-submit
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
	at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:123)
	at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:123)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.deploy.SparkSubmitArguments.mergeDefaultSparkProperties(SparkSubmitArguments.scala:123)
	at org.apache.spark.deploy.SparkSubmitArguments.&lt;init&gt;(SparkSubmitArguments.scala:109)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:114)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 7 more

The root cause: since version 1.4, the CDH build of Spark has been compiled without Hadoop's classpath baked in, so all of Hadoop's jars must be added explicitly in spark-env.sh.
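
To confirm the diagnosis, check whether hadoop-common (the jar that provides FSDataInputStream) is visible to Spark at all. A minimal sketch, assuming SPARK_HOME and HADOOP_HOME are set and that your Spark build keeps its jars under jars/ (older 1.x builds use lib/ instead):

# hadoop-common is the jar that contains org.apache.hadoop.fs.FSDataInputStream.
# On a "without-hadoop" Spark build this typically prints nothing:
ls ${SPARK_HOME}/jars | grep hadoop-common

# This prints the colon-separated list of Hadoop conf dirs and jar globs
# that SPARK_DIST_CLASSPATH needs to end up containing:
${HADOOP_HOME}/bin/hadoop classpath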

Solution:

Add one line to spark-env.sh to pull in Hadoop's classpath. Adjust ${HADOOP_HOME} to your environment (writing the absolute path directly also works). Note that every node in the cluster needs this change.

export SPARK_DIST_CLASSPATH=$(${HADOOP_HOME}/bin/hadoop classpath)
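
Because the $( ... ) substitution is evaluated each time Spark starts, a quick smoke test on one node looks like this (a hedged sketch; the location of spark-env.sh may differ per install):

# Load the edited file and confirm the variable expands to a long
# colon-separated list of Hadoop directories and jars:
source ${SPARK_HOME}/conf/spark-env.sh
echo ${SPARK_DIST_CLASSPATH}

# This command previously died with NoClassDefFoundError; it should
# now print version information instead:
spark-submit --version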

With Cloudera Manager (CM):

In the CM web UI, edit the Spark 2.2 configuration to set SPARK_DIST_CLASSPATH, then restart the services with stale configuration.

export SPARK_DIST_CLASSPATH=$(${HADOOP_HOME}/bin/hadoop classpath)
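
After saving the change in CM and restarting the stale services, you can confirm the line made it into the generated client configuration on a worker node. The path below is typical for CDH Spark 2 deployments but is an assumption; adjust it to your install:

# /etc/spark2/conf is the usual CDH client-config location for Spark 2;
# the exported line should appear in the generated spark-env.sh:
grep SPARK_DIST_CLASSPATH /etc/spark2/conf/spark-env.sh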
