Fixing the error that occurs when Spark writes data to HBase with saveAsHadoopDataset

When Spark writes data to HBase with saveAsHadoopDataset, the following error can appear:


Exception in thread "main" org.apache.spark.SparkException: Job aborted.
	at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:100)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1096)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 62.0 failed 4 times, most recent failure: Lost task 2.3 in stage 62.0 (TID 92, 10.108.21.148, executor 1): java.lang.ClassNotFoundException: MLModel.FlowDetection$1
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	
Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1661)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1649)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1648)
Caused by: java.lang.ClassNotFoundException: MLModel.FlowDetection$1
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
Attempted fix (did not work): pointing SPARK_CLASSPATH at the HBase lib directory had no effect. (SPARK_CLASSPATH has been deprecated since Spark 1.0 in favor of spark.driver.extraClassPath / spark.executor.extraClassPath, so this setting is not a reliable way to add jars.)
export SPARK_CLASSPATH=/home/hadoop/hbase-2.0.5/lib
Working fix: copy the HBase-related jars into Spark's jars directory:
cp hbase-protocol-2.0.5.jar hbase-common-2.0.5.jar hbase-client-2.0.5.jar metrics-core-3.2.1.jar htrace-core-3.2.0-incubating.jar /home/hadoop/spark-2.3.3-bin-hadoop2.7/jars
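
For reference, here is a minimal sketch of the kind of write that hits this error, and that works once the jars above are visible to the executors. The class name, table name, and column family/qualifier are placeholders, not taken from the original job:

import java.util.Arrays;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapred.TableOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapred.JobConf;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class HBaseWriteExample {
  public static void main(String[] args) {
    JavaSparkContext sc =
        new JavaSparkContext(new SparkConf().setAppName("HBaseWriteExample"));

    // saveAsHadoopDataset uses the old mapred API, so configure a JobConf
    // with the (mapred) TableOutputFormat and the target table name.
    JobConf jobConf = new JobConf(HBaseConfiguration.create());
    jobConf.setOutputFormat(TableOutputFormat.class);
    jobConf.set(TableOutputFormat.OUTPUT_TABLE, "flow_detection"); // placeholder table name

    // Build (rowkey, Put) pairs; "cf" and "q" are placeholder family/qualifier names.
    JavaPairRDD<ImmutableBytesWritable, Put> puts =
        sc.parallelize(Arrays.asList("row1", "row2"))
          .mapToPair(row -> {
            Put put = new Put(Bytes.toBytes(row));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(row));
            return new Tuple2<>(new ImmutableBytesWritable(Bytes.toBytes(row)), put);
          });

    // This is the call that fails with the ClassNotFoundException above when
    // the HBase classes cannot be loaded on the executors.
    puts.saveAsHadoopDataset(jobConf);

    sc.stop();
  }
}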

Error 2: org.apache.spark.SparkException: Job aborted: Task not serializable: java.io.NotSerializableException

This one shows up when a closure passed to a Spark transformation references the enclosing class (or one of its non-serializable fields), so Spark cannot serialize the task and ship it to the executors. The fix is to make the enclosing class implement Serializable:

public class JavaSparkPi implements Serializable {
  ...
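
A fuller sketch of the same fix. The field `factor` is hypothetical, added only to show the mechanism: any access to an instance field inside a lambda captures `this`, which pulls the whole enclosing object into the closure, so that object must be serializable:

import java.io.Serializable;
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class JavaSparkPi implements Serializable {
  // Hypothetical field read inside the closure below; reading it captures `this`.
  private final int factor = 10;

  public void run() {
    SparkConf conf = new SparkConf().setAppName("JavaSparkPi");
    try (JavaSparkContext sc = new JavaSparkContext(conf)) {
      List<Integer> data = Arrays.asList(1, 2, 3, 4);
      // Without `implements Serializable` on the enclosing class, this map()
      // fails with java.io.NotSerializableException.
      long count = sc.parallelize(data).map(x -> x * factor).count();
      System.out.println("count = " + count);
    }
  }

  public static void main(String[] args) {
    new JavaSparkPi().run();
  }
}

Alternatively, move the captured values into local variables (or mark non-serializable fields transient) so the closure no longer drags the enclosing object along.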
