Errors when reading from Kafka and writing to HBase with Spark Streaming

I used Spark Streaming to consume messages from a Kafka topic and write them to HBase. Submitting the job with spark-submit produced several errors, which are summarized here:

1. io.netty.handler.codec.EncoderException: java.lang.NoSuchMethodError: io.netty.channel.DefaultFileRegion.<init>(Ljava/io/File;JJ)V

Problem: the environment was spark-2.1.0-bin-hadoop2.7 with scala-2.11.8.
Solution: switch to a newer Spark release: spark-2.4.0-bin-hadoop2.7.
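
A NoSuchMethodError like this one usually means the class was loaded from a different jar version than the one the caller was compiled against. Before changing Spark versions, it can help to confirm which jar actually supplies the class. The following is a minimal sketch that uses only the JDK; the names ClasspathProbe and locationOf are made up for illustration, and the result is only meaningful when the code runs with the same classpath as the failing job.

```scala
import scala.util.Try

// Hypothetical helper, not part of Spark: reports where a class is loaded from.
object ClasspathProbe {

  /** Location (usually a jar URL) that supplies `className`.
    * None if the class is not on the classpath or has no code source
    * (for example JDK bootstrap classes such as java.lang.String).
    */
  def locationOf(className: String): Option[String] =
    // initialize = false: load the class without running its static initializers
    Try(Class.forName(className, false, getClass.getClassLoader)).toOption
      .flatMap(c => Option(c.getProtectionDomain.getCodeSource))
      .flatMap(cs => Option(cs.getLocation))
      .map(_.toString)

  def main(args: Array[String]): Unit = {
    // Class name taken from the error message above.
    val name = "io.netty.channel.DefaultFileRegion"
    println(s"$name -> ${locationOf(name).getOrElse("not found")}")
  }
}
```

If the printed location is not the netty jar shipped with the Spark installation, another dependency is probably shadowing it.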

2. After switching to spark-2.4.0, a different error appeared

Exception in thread "main" java.lang.AbstractMethodError
        at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:99)
        at org.apache.spark.streaming.kafka010.KafkaUtils$.initializeLogIfNecessary(KafkaUtils.scala:40)
        at org.apache.spark.internal.Logging$class.log(Logging.scala:46)
        at org.apache.spark.streaming.kafka010.KafkaUtils$.log(KafkaUtils.scala:40)
        at org.apache.spark.internal.Logging$class.logWarning(Logging.scala:66)
        at org.apache.spark.streaming.kafka010.KafkaUtils$.logWarning(KafkaUtils.scala:40)
        at org.apache.spark.streaming.kafka010.KafkaUtils$.fixKafkaParams(KafkaUtils.scala:208)
        at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.<init>(DirectKafkaInputDStream.scala:66)
        at org.apache.spark.streaming.kafka010.KafkaUtils$.createDirectStream(KafkaUtils.scala:150)
        at org.apache.spark.streaming.kafka010.KafkaUtils$.createDirectStream(KafkaUtils.scala:127)
        at a.SparkafkaStreaming$.readKafkaAndWriteHbase(SparkafkaStreaming.scala:55)
        at a.SparkafkaStreaming$.main(SparkafkaStreaming.scala:21)
        at a.SparkafkaStreaming.main(SparkafkaStreaming.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)

Problem: the spark-sql-kafka dependency was at version 2.2 while Spark itself was 2.4.
Solution: change the spark-sql-kafka version to 2.4 so that it matches Spark.
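
An AbstractMethodError inside Logging$class is a common symptom of a connector jar compiled against a different Spark release than the one running the job. One way to avoid the mismatch is to derive every Spark artifact version from a single value. Below is a sketch of a build.sbt fragment. It assumes the project is built with sbt, which the original setup does not state, so adapt it for Maven or Gradle as needed. The stack trace goes through org.apache.spark.streaming.kafka010, which comes from the spark-streaming-kafka-0-10 artifact, so that artifact is listed alongside spark-sql-kafka-0-10.

```scala
// build.sbt (sketch; using sbt is an assumption)
scalaVersion := "2.11.8"

// Single source of truth for the Spark version; must match the installed Spark.
val sparkVersion = "2.4.0"

libraryDependencies ++= Seq(
  // Supplied by the Spark installation at runtime.
  "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"       % sparkVersion % "provided",
  // Kafka connectors are not part of the Spark distribution and are packaged with the job.
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion,
  "org.apache.spark" %% "spark-sql-kafka-0-10"       % sparkVersion
)
```

With this layout, upgrading Spark means changing sparkVersion in one place, and the connectors follow automatically.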

3. Error when Spark writes to HBase

java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
        at a.SparkafkaStreaming$.sparkWriteHbase(SparkafkaStreaming.scala:93)
        at a.SparkafkaStreaming$$anonfun$readKafkaAndWriteHbase$2$$anonfun$apply$1.apply(SparkafkaStreaming.scala:76)
        at a.SparkafkaStreaming$$anonfun$readKafkaAndWriteHbase$2$$anonfun$apply$1.apply(SparkafkaStreaming.scala:74)
        at scala.collection.Iterator$class.foreach(Iterator.scala:891)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)

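Problem: spark-env.sh was not linked to HBase, so the HBase jars were missing from Spark's classpath.
Solution: add the following to spark-env.sh: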
export HIVE_CONF_DIR=/soft/hive/conf
export SPARK_CLASSPATH=$SPARK_CLASSPATH:/soft/hbase/lib/*

4. Error when Spark connects to Hadoop

java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.getPassword(Ljava/lang/String;)[C
        at org.apache.spark.SSLOptions$$anonfun$8.apply(SSLOptions.scala:188)
        at org.apache.spark.SSLOptions$$anonfun$8.apply(SSLOptions.scala:188)
        at scala.Option.orElse(Option.scala:289)
        at org.apache.spark.SSLOptions$.parse(SSLOptions.scala:188)
        at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:117)
        at org.apache.spark.deploy.SparkSubmit.secMgr$lzycompute$1(SparkSubmit.scala:359)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$secMgr$1(SparkSubmit.scala:359)
        at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$7.apply(SparkSubmit.scala:367)
        at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$7.apply(SparkSubmit.scala:367)
        at scala.Option.map(Option.scala:146)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:366)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:143)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Solution: change the Spark version to spark 2.2.1.
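
The missing method here is Configuration.getPassword(String), which suggests that the Configuration class loaded at runtime comes from a Hadoop jar older than the one this Spark build expects. As far as I know the method arrived around Hadoop 2.6; treat that as an assumption. A small reflective check can confirm whether the method exists on the classpath in use. This is only a sketch: MethodProbe and hasMethod are hypothetical names, and the check has to run with the same classpath as spark-submit.

```scala
import scala.util.Try

// Hypothetical helper, not part of Spark or Hadoop.
object MethodProbe {

  /** True if `className` can be loaded and exposes a public method
    * `methodName` with exactly the given parameter types.
    */
  def hasMethod(className: String, methodName: String, paramTypes: Class[_]*): Boolean =
    Try {
      Class.forName(className, false, getClass.getClassLoader)
        .getMethod(methodName, paramTypes: _*)
    }.isSuccess

  def main(args: Array[String]): Unit = {
    // Class and method names taken from the error message above.
    val ok = hasMethod("org.apache.hadoop.conf.Configuration", "getPassword", classOf[String])
    println(s"Configuration.getPassword(String) available: $ok")
  }
}
```

A result of false means either that the Hadoop jars are missing from the classpath or that an older Hadoop version is being picked up ahead of the expected one.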
