org.apache.spark.SparkException: Job aborted due to stage failure: total size of serialized results is bigger than spark.driver.maxResultSize (1024)

Running in local mode, the job fails because the serialized results sent back to the driver exceed the spark.driver.maxResultSize limit of 1024 MB.

The following covers the fix, the meaning of the parameter, and its default value.
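For context, here is a minimal sketch (names and sizes are illustrative assumptions, not from the original job) of the kind of code that triggers this failure: collecting more serialized result data to the driver than the 1 GB default allows.

import org.apache.spark.{SparkConf, SparkContext}

// Illustrative reproduction: collect() roughly 2 GB of serialized results,
// which exceeds the default spark.driver.maxResultSize of 1g and aborts the job.
object MaxResultSizeRepro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("maxResultSize-repro")
    val sc = new SparkContext(conf)

    // 2000 partitions, each returning about 1 MB of bytes (~2 GB in total).
    val big = sc.parallelize(1 to 2000, 2000)
      .map(_ => Array.fill(1 << 20)(0.toByte))

    big.collect() // aborted: total size of serialized results exceeds spark.driver.maxResultSize

    sc.stop()
  }
}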

1. Solution:

Increase spark.driver.maxResultSize. It can be set programmatically:
sparkConf.set("spark.driver.maxResultSize", "4g")
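A slightly fuller sketch of where that call fits (the master, app name and the 4g value are placeholder assumptions; pick a value that still leaves headroom inside spark.driver.memory):

import org.apache.spark.{SparkConf, SparkContext}

// Set the cap before the SparkContext is created; results larger than this
// value still abort the job, but 4g gives collect() four times the default room.
val sparkConf = new SparkConf()
  .setMaster("local[*]")                      // placeholder master
  .setAppName("my-app")                       // placeholder app name
  .set("spark.driver.maxResultSize", "4g")    // default is 1g; "0" means unlimited
val sc = new SparkContext(sparkConf)

The same setting can also be supplied without code changes, e.g. spark-submit --conf spark.driver.maxResultSize=4g, or a line in spark-defaults.conf.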
2. Parameter meaning and default value:

Limit of total size of serialized results of all partitions for each Spark action (e.g. collect). Should be at least 1M, or 0 for unlimited. Jobs will be aborted if the total size is above this limit. Having a high limit may cause out-of-memory errors in driver (depends on spark.driver.memory and memory overhead of objects in JVM). Setting a proper limit can protect the driver from out-of-memory errors.

In plain terms: this caps the total size of the serialized results of all partitions for each Spark action (e.g. the collect action). It must be at least 1M, or 0 for unlimited; the default is 1g, which is where the 1024 MB in the error message comes from. A job is aborted as soon as the total exceeds the configured limit, and setting the limit very high can itself cause driver out-of-memory errors (bounded by spark.driver.memory and the memory overhead of objects in the JVM), so a sensible limit protects the driver.
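A quick way to confirm what a running application actually uses (a small sketch; sc is an existing SparkContext):

// Read back the effective setting; the second argument is only the fallback
// printed when nothing was configured explicitly (Spark's built-in default is 1g).
val maxResultSize = sc.getConf.get("spark.driver.maxResultSize", "1g")
println(s"spark.driver.maxResultSize = $maxResultSize")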

References:

1、https://www.fashici.com/tech/215.html

2、http://spark.apache.org/docs/1.6.1/configuration.html

3、http://bourneli.github.io/scala/spark/2016/09/21/spark-driver-maxResultSize-puzzle.html
 
