After upgrading Spark to 2.0, the spark-streaming-kafka test fails with java.lang.NoClassDefFoundError: org/apache/spark/Logging

- After recently upgrading from Spark 1.5.2 to 2.0, running the spark-streaming-kafka test code fails with the following error:

java.lang.NoClassDefFoundError: org/apache/spark/Logging
  at java.lang.ClassLoader.defineClass1(Native Method)
  at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
  at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
  at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
  at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  at org.apache.spark.streaming.kafka.KafkaUtils$.createStream(KafkaUtils.scala:91)
  at org.apache.spark.streaming.kafka.KafkaUtils$.createStream(KafkaUtils.scala:66)
  ... 53 elided
Caused by: java.lang.ClassNotFoundException: org.apache.spark.Logging
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 67 more

- Comparing spark-core_2.11-2.0.0.jar with spark-core_2.11-1.5.2.jar confirms that the 2.0 jar no longer contains org.apache.spark.Logging. The class only exists in 1.5.2 and earlier; in Spark 2.0 it was removed from the public API (it moved into the internal package org.apache.spark.internal). One way to verify this is shown below.
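
To check the jar contents directly (paths are illustrative; adjust them to where the jars live on your machine), the first command lists the Logging class entries, while the second should print nothing:

$ jar tf spark-core_2.11-1.5.2.jar | grep 'org/apache/spark/Logging'
$ jar tf spark-core_2.11-2.0.0.jar | grep 'org/apache/spark/Logging'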


- I searched online for a solution but did not find one, so the workaround I settled on was to package the missing classes into a jar, put it under Spark's lib directory, and restart Spark. After that, the test code ran without errors (a rough sketch of the shim is shown below).
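
For reference, here is a minimal sketch of what such a shim might look like. This is a simplified stand-in, not the original Spark 1.5.2 source; it only needs to be binary-compatible enough for the old spark-streaming-kafka classes to link, and it should be compiled with Scala 2.11 against slf4j before being packaged into a jar:

package org.apache.spark

import org.slf4j.{Logger, LoggerFactory}

// Simplified stand-in for the Logging trait that Spark 2.0 removed
// from the public API.
trait Logging {
  @transient private var _log: Logger = null

  protected def logName: String = this.getClass.getName.stripSuffix("$")

  protected def log: Logger = {
    if (_log == null) _log = LoggerFactory.getLogger(logName)
    _log
  }

  protected def logInfo(msg: => String): Unit = if (log.isInfoEnabled) log.info(msg)
  protected def logDebug(msg: => String): Unit = if (log.isDebugEnabled) log.debug(msg)
  protected def logTrace(msg: => String): Unit = if (log.isTraceEnabled) log.trace(msg)
  protected def logWarning(msg: => String): Unit = if (log.isWarnEnabled) log.warn(msg)
  protected def logError(msg: => String): Unit = if (log.isErrorEnabled) log.error(msg)
}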



- The test procedure is as follows:


1. Start HDFS
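
Assuming a standard Hadoop installation with its sbin scripts on the PATH, something like:

$ start-dfs.sh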

2. Start ZooKeeper

$ zookeeper-server-start ./kafka_2.11-0.10.0.1/config/zookeeper.properties


3. Start Kafka

$ kafka-server-start ./kafka_2.11-0.10.0.1/config/server.properties


4. Create a topic

$ kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
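
Optionally, you can confirm the topic exists before moving on:

$ kafka-topics --list --zookeeper localhost:2181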


5. Start Spark

$ spark-shell --driver-memory 1G
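
If the shim jar from the workaround above is not picked up automatically from Spark's lib directory, it can also be passed explicitly via --jars (the path here is just an assumption for illustration):

$ spark-shell --driver-memory 1G --jars /path/to/spark-logging-shim.jar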


6. Enter the test code

scala> :paste
// Entering paste mode (ctrl-D to finish)

import org.apache.spark.streaming.{Seconds, Minutes, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Streaming context with a 2-second batch interval
val ssc = new StreamingContext(sc, Seconds(2))

val zkQuorum = "localhost:2181"
val group = "test-group"
val topics = "test"
val numThreads = 1

// Map of topic -> number of receiver threads
val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap

// Receiver-based stream of (key, message) pairs from Kafka via ZooKeeper
val lineMap = KafkaUtils.createStream(ssc, zkQuorum, group, topicMap)
val lines = lineMap.map(_._2)
val words = lines.flatMap(_.split(" "))
val pair = words.map(x => (x, 1))

// Windowed word count over the last 10 minutes, sliding every 2 seconds,
// with 2 partitions; the inverse function (_ - _) requires checkpointing
val wordCounts = pair.reduceByKeyAndWindow(_ + _, _ - _, Minutes(10), Seconds(2), 2)
wordCounts.print

// The checkpoint directory must be set before start
ssc.checkpoint("hdfs://localhost:9000/checkpoint")
ssc.start
ssc.awaitTermination

Press ctrl-D to run the pasted code; you should see output like the following:


-------------------------------------------
Time: 1474428268000 ms
-------------------------------------------


7. Send a test message from a Kafka producer

$ kafka-console-producer --broker-list localhost:9092 --topic test

hello world


8. In the Spark shell you should see the following output, confirming the test succeeded:

-------------------------------------------
Time: 1474428396000 ms
-------------------------------------------
(hello,1)
(world,1)
