spark成功之后运行例子报错
问题一:
spark.SparkContext: Added JAR file:/home/hadoop-cdh/app/test/sparktest/EmarOlap-0.0.1-SNAPSHOT.jar at http://192.168.5.143:32252/jars/EmarOlap-0.0.1-SNAPSHOT.jar with timestamp 1428464475056
Exception in thread "main" java.lang.VerifyError: class org.apache.hadoop.yarn.proto.YarnProtos$PriorityProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:791)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2404)
at java.lang.Class.getConstructor0(Class.java:2714)
at java.lang.Class.getConstructor(Class.java:1674)
at org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java:62)
at org.apache.hadoop.yarn.util.Records.newRecord(Records.java:36)
at org.apache.hadoop.yarn.api.records.Priority.newInstance(Priority.java:39)
at org.apache.hadoop.yarn.api.records.Priority.<clinit>(Priority.java:34)
at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$.<init>(YarnSparkHadoopUtil.scala:101)
at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$.<clinit>(YarnSparkHadoopUtil.scala)
at org.apache.spark.deploy.yarn.ClientArguments.<init>(ClientArguments.scala:38)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:55)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:381)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
at com.emar.common.spark.examples.SparkInputFormatExample.main(SparkInputFormatExample.java:31)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
protobuf版本不一致,编译前统一yarn和spark的pb版本,修改spark的pom.xml文件
问题二:
WARN [sparkDriver-akka.actor.default-dispatcher-5] remote.ReliableDeliverySupervisor:
Association with remote system [akka.tcp://sparkExecutor@host127:37972] has failed, address is now gated for [5000] ms. Reason is:
[org.apache.spark.TaskState$; local class incompatible: stream classdesc serialVersionUID = -2913614267616900700, local class serialVersionUID = 746799155515967470].
这是spark的版本不一致,1.3.2不能兼容1.3.0的,spark版本搞得真垃圾
问题三:
spark had a not serializable result
spark1.x的版本,需要将返回的RDD都序列化好,java中就是继承与serializable,否则他默认的序列化程序就不能工作
解决方法,在spark-default中添加
spark.serializer=org.apache.spark.serializer.KryoSerializer