The TimeoutException: Futures timed out after [300 seconds] problem

Caused by:
java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:220)
org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:146)
org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:387)
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:144)
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:140)
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:140)

Explanation of the cause of this error:

With Spark configured as spark.sql.autoBroadcastJoinThreshold=10485760000 (roughly 10 GB, since the value is in bytes), broadcast joins are used aggressively: any table smaller than spark.sql.autoBroadcastJoinThreshold (10 MB by default) is broadcast to the other compute nodes, skipping the shuffle stage entirely, which is usually faster. But if the table is too large, the broadcast itself can time out. So on one hand it is not advisable to set autoBroadcastJoinThreshold too high; on the other hand, if you estimate the data volume is modest, even a shuffle join will not cost much time. The broadcast timeout (spark.sql.broadcastTimeout, 300 seconds by default, which is exactly the value in the stack trace above) can of course also be raised.
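A minimal sketch (Scala) of how these two settings can be tuned at runtime on a SparkSession; the concrete values below are illustrative assumptions, not recommendations:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("broadcast-timeout-demo")  // hypothetical app name
  .getOrCreate()

// Threshold in bytes below which a table is auto-broadcast.
// 10485760 (10 MB) is the default; set it to -1 to disable
// automatic broadcast joins entirely.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", 10485760L)

// Broadcast timeout in seconds; 300 is the default that appears
// in the stack trace. Raising it only helps if the broadcast is
// legitimately slow rather than hopelessly oversized.
spark.conf.set("spark.sql.broadcastTimeout", 600L)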

For more on BroadcastJoin, see https://blog.csdn.net/dabokele/article/details/65963401
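If you would rather control broadcasting per join than raise the global threshold, Spark's broadcast() hint can mark the small side of a join explicitly. A minimal sketch with hypothetical dim/fact DataFrames:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

val spark = SparkSession.builder().appName("broadcast-hint-demo").getOrCreate()
import spark.implicits._

// Hypothetical small dimension table and larger fact table.
val dim  = Seq((1, "a"), (2, "b")).toDF("id", "name")
val fact = Seq((1, 100.0), (2, 200.0), (1, 300.0)).toDF("id", "amount")

// broadcast() marks dim for broadcasting regardless of
// spark.sql.autoBroadcastJoinThreshold, so only this join is
// affected rather than every join in the application.
val joined = fact.join(broadcast(dim), Seq("id"))
joined.show()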
