Spark on YARN: Abnormal VCores Used Value (DefaultResourceCalculator / DominantResourceCalculator)

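The VCores Used value for Spark on YARN looks wrong. The cluster currently has two applications running, each using 1 core.

When the script below is run, resource usage looks like this: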

[Figure 1: YARN ResourceManager UI screenshot of cluster resource usage]

 

The script:

 

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.yss.aml.core.analysis.Analysis1201 \
  --driver-memory 5g \
  --num-executors 9 \
  --executor-memory 5g \
  --executor-cores 10 \
  /data/temp/core.jar \
  "20190128"

 

 

 

 

 

 

 

 

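According to the script, 9 executors are requested with 10 cores each, so the job should account for at least 9 * 10 + 1 = 91 VCores Used (the extra 1 is for the ApplicationMaster).

Adding the 2 vcores already in use on the cluster, the value should in theory be at least 93 while the job runs.

Yet the screenshot shows only 12 VCores Used. Why?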

 

 

Property: yarn.scheduler.capacity.resource-calculator
Description: The ResourceCalculator implementation to be used to compare Resources in the scheduler. The default, i.e. org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, only uses Memory, while DominantResourceCalculator uses Dominant-resource to compare multi-dimensional resources such as Memory, CPU etc. A Java ResourceCalculator class name is expected.

 

 

 

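By default, YARN's CapacityScheduler schedules on memory only. So when Spark runs on YARN, even if --executor-cores sets N vcores per executor, the YARN resource management page still counts each container as 1 vcore. That explains the 12: 9 executor containers + 1 ApplicationMaster container + the 2 vcores already in use. The relevant setting is in capacity-scheduler.xml: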

 


<property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
    <description>
      The ResourceCalculator implementation to be used to compare
      Resources in the scheduler.
      The default i.e. DefaultResourceCalculator only uses Memory while
      DominantResourceCalculator uses dominant-resource to compare
      multi-dimensional resources such as Memory, CPU etc.
    </description>
</property>

 

 

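We should consider using DominantResourceCalculator instead. It takes both CPU and memory into account when computing resources, and YARN bases its scheduling decisions on both.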

 


<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
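To make the difference between the two calculators concrete, here is a rough model of the VCores Used figure for the job in this post. It is only a sketch: the function names are hypothetical and not part of YARN or Spark, and it assumes that the memory-only calculator counts every container as 1 vcore while the dominant-resource calculator counts the vcores each container requested.

```bash
#!/usr/bin/env bash
# Rough model of "VCores Used" under each resource calculator.
# Illustrative only: these functions are hypothetical helpers, not YARN or Spark APIs.

# DefaultResourceCalculator: memory only, so every container counts as 1 vcore.
vcores_default() {
  local executors=$1 am_containers=$2 other_vcores=$3
  echo $(( executors + am_containers + other_vcores ))
}

# DominantResourceCalculator: each container counts the vcores it requested.
vcores_dominant() {
  local executors=$1 cores_per_executor=$2 am_cores=$3 other_vcores=$4
  echo $(( executors * cores_per_executor + am_cores + other_vcores ))
}

vcores_default 9 1 2       # 9 executors + 1 AM + 2 vcores already in use -> 12
vcores_dominant 9 10 1 2   # 9 * 10 + 1 + 2 -> 93
```

The first call reproduces the 12 seen in the screenshot, the second the 93 that the submit parameters suggest.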

 


The setting as changed in Ambari:

yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
 

[Figure 2: the resource-calculator setting on the Ambari configuration page]

 

Once the setting is added, remember to verify it.

First, kill all applications running on YARN.

Then submit with these parameters:

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.yss.aml.core.analysis.Analysis1201 \
  --driver-memory 5g \
  --num-executors 9 \
  --executor-memory 3g \
  --executor-cores 10 \
  /data/temp/core.jar "20190128"

 

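CPU: number of executors * cores per executor + ApplicationMaster cores = 9 * 10 + 1 = 91

Memory: number of executors * executor memory + driver memory = 9 * 3 + 5 = 32 GB requested. With default settings this becomes 9 * 4 + 6 = 42 GB, once Spark's per-container memory overhead (max(384 MB, 10% of the requested memory)) is added and YARN rounds each container up to a multiple of its minimum allocation (1 GB by default).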
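The 42 GB memory figure can be reproduced with a small sketch. It assumes default settings that the post does not state explicitly: Spark adds an overhead of max(384 MB, 10% of the requested memory) to each container, and YARN rounds each container up to a multiple of a 1024 MB minimum allocation. The helper name container_mb is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch of how a Spark memory request becomes a YARN container size.
# Assumed defaults: overhead = max(384 MB, 10% of the request);
# YARN rounds up to a multiple of the minimum allocation (1024 MB).

container_mb() {
  local requested_mb=$1 min_alloc_mb=${2:-1024}
  local overhead_mb=$(( requested_mb / 10 ))
  if (( overhead_mb < 384 )); then overhead_mb=384; fi
  local total_mb=$(( requested_mb + overhead_mb ))
  # Round up to the next multiple of min_alloc_mb.
  echo $(( (total_mb + min_alloc_mb - 1) / min_alloc_mb * min_alloc_mb ))
}

executor_mb=$(container_mb 3072)   # --executor-memory 3g
driver_mb=$(container_mb 5120)     # --driver-memory 5g
total_mb=$(( 9 * executor_mb + driver_mb ))

echo "executor container: ${executor_mb} MB"    # 4096 MB
echo "driver container:   ${driver_mb} MB"      # 6144 MB
echo "total:              $(( total_mb / 1024 )) GB"   # 42 GB
```

If the cluster overrides the overhead or the minimum allocation, pass the actual values instead of relying on these defaults.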

[Figure 3: YARN ResourceManager UI screenshot of cluster resource usage after the change]
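Instead of reading the numbers off the web UI, the same check can be scripted against the ResourceManager REST API, whose cluster metrics endpoint (/ws/v1/cluster/metrics) reports allocatedVirtualCores and allocatedMB. The sketch below is one way to do it. The address rm-host:8088 is a placeholder for your ResourceManager, the helper names are hypothetical, and the JSON extraction is deliberately crude text matching rather than a real parser.

```bash
#!/usr/bin/env bash
# Sketch: read "VCores Used" from the ResourceManager REST API.

# Extract the first numeric value of a field from a JSON string.
# Crude text matching, adequate for the flat clusterMetrics object.
json_number() {
  local field=$1 json=$2
  printf '%s' "$json" \
    | grep -o "\"$field\" *: *[0-9][0-9]*" \
    | head -n 1 \
    | sed 's/.*: *//'
}

# Fetch cluster metrics and print allocated vcores and memory.
# $1 is the ResourceManager web address, e.g. rm-host:8088 (placeholder).
print_cluster_usage() {
  local rm_address=$1 metrics
  metrics=$(curl -s "http://${rm_address}/ws/v1/cluster/metrics") || return 1
  echo "allocatedVirtualCores=$(json_number allocatedVirtualCores "$metrics")"
  echo "allocatedMB=$(json_number allocatedMB "$metrics")"
}

# Usage (needs a live cluster, so it is not run here):
#   print_cluster_usage rm-host:8088
```

With only the test job running under DominantResourceCalculator, allocatedVirtualCores should match the 91 computed above.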

 

 

 

Official documentation: http://hadoop.apache.org/docs/r2.7.3/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 
