Spark parameters

spark.default.parallelism: the default number of partitions in RDDs returned by transformations such as join, reduceByKey, and parallelize when the user does not set a partition count explicitly.
Number of reducers
spark.sql.shuffle.partitions: the number of partitions (reducers) used when data is shuffled for a join or group-by operation on DataFrames. The default is 200.
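
To see how the two settings act on different APIs, here is a minimal PySpark sketch. It assumes pyspark is installed and runs in local mode. The value 8, the app name, and the helper names build_session and partition_counts are hypothetical, chosen only for illustration. Adaptive query execution is switched off here because it may otherwise coalesce shuffle partitions and hide the configured number.

```python
# Hypothetical values for illustration; tune them for your cluster and data size.
SHUFFLE_SETTINGS = {
    "spark.default.parallelism": "8",       # RDD API: join, reduceByKey, parallelize
    "spark.sql.shuffle.partitions": "8",    # DataFrame API: join / group-by shuffles
    "spark.sql.adaptive.enabled": "false",  # keep AQE from coalescing shuffle partitions
}


def build_session(settings=SHUFFLE_SETTINGS):
    """Create a local SparkSession with the given settings (requires pyspark)."""
    # Imported lazily so this module can be loaded on machines without Spark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.master("local[2]").appName("partition-demo")
    for key, value in settings.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def partition_counts(spark):
    """Return partition counts for an RDD, a reduced RDD, and a grouped DataFrame."""
    # parallelize without an explicit numSlices falls back to the default parallelism.
    rdd = spark.sparkContext.parallelize(range(100))
    # reduceByKey without numPartitions also uses the default parallelism when it is set.
    reduced = rdd.map(lambda x: (x % 10, 1)).reduceByKey(lambda a, b: a + b)

    # A DataFrame group-by triggers a shuffle governed by spark.sql.shuffle.partitions.
    df = spark.range(100)
    grouped = df.groupBy((df.id % 10).alias("k")).count()

    return (
        rdd.getNumPartitions(),
        reduced.getNumPartitions(),
        grouped.rdd.getNumPartitions(),
    )
```

Calling partition_counts(build_session()) in a fresh Python process returns three numbers that can be compared with the configured values. Note that spark.default.parallelism has to be set before the SparkContext is created, while spark.sql.shuffle.partitions can also be changed later with spark.conf.set on a running session.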
