Spark serialization overflow

Serialization buffer overflow

Caused by: org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 21. To avoid this, increase spark.kryoserializer.buffer.max value.
    at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:240)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

 

 val sparkConf = new SparkConf().setAppName(Constants.SPARK_NAME_APP)
     .set("spark.kryoserializer.buffer.max", "128m")

For this property Spark reads a bare number as MiB, so the original "128" already means 128 MiB; writing "128m" simply makes the unit explicit.

 

Root-cause analysis: RDD extends scala.AnyRef with scala.Serializable, so whenever the job creates large numbers of new RDD/DataFrame/Dataset objects, for example with textFile or when reading table data, Kryo may have to serialize single objects larger than the default buffer. In those cases, remember to raise this value.
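To make the failure mode concrete: the Executor$TaskRunner / KryoSerializerInstance.serialize frames in the trace above correspond to an executor serializing a task result with the configured Kryo serializer. A sketch of that situation, with a hypothetical input path and illustrative config values:

import org.apache.spark.sql.SparkSession

object KryoOverflowScenario {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kryo-overflow-scenario")  // placeholder app name
      .master("local[*]")                 // for a standalone run; normally supplied by spark-submit
      .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Must be larger than the biggest single object Kryo serializes,
      // e.g. one task's result when collect() pulls back a big partition.
      .config("spark.kryoserializer.buffer.max", "256m")
      .getOrCreate()

    // textFile creates a new RDD; collect() makes each task serialize its whole
    // partition as one result object on the executor, which is where the
    // "Buffer overflow" error appears if the partition outgrows buffer.max.
    val lines = spark.sparkContext.textFile("hdfs:///path/to/large-input.txt")  // hypothetical path
    val collected = lines.collect()
    println(s"collected ${collected.length} lines")

    spark.stop()
  }
}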
