Yarn Scheduler

https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html

How to find the scheduler in use on an existing cluster?

There are two ways:

  1. Open the ResourceManager Web UI and go to Scheduler, e.g. http://vm-slaver1:8088/cluster/scheduler
  2. Check the yarn.resourcemanager.scheduler.class property in yarn-site.xml. If it is not set there, check the yarn-default.xml that corresponds to your Hadoop version.
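
The second way can be scripted. The following is a minimal sketch, assuming the file uses the standard Hadoop `<property><name>/<value>` layout; the helper name `find_property` and the sample XML are made up for illustration, and a real run would read the text of your own yarn-site.xml instead.

```python
import xml.etree.ElementTree as ET

SCHEDULER_KEY = "yarn.resourcemanager.scheduler.class"

def find_property(xml_text, key):
    """Return the value of the property named `key`, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == key:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None

# Hypothetical yarn-site.xml content, used only to exercise the helper.
sample_yarn_site = """
<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
</configuration>
"""

scheduler = find_property(sample_yarn_site, SCHEDULER_KEY)
if scheduler is None:
    print("Not set in yarn-site.xml; check yarn-default.xml for your version")
else:
    print(scheduler)
```

A return value of None corresponds to the "not found" case above, where the answer has to come from yarn-default.xml.
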

Example for Capacity Scheduler

Put capacity-scheduler.xml in the same directory as yarn-site.xml.

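    <configuration>

      <property>
        <name>yarn.scheduler.capacity.maximum-applications</name>
        <value>10000</value>
        <description>Maximum number of applications that can be pending and running.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
        <value>0.1</value>
        <description>Maximum percent of resources in the cluster which can be used to run application masters, i.e. controls the number of concurrently running applications.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.resource-calculator</name>
        <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
        <description>The ResourceCalculator implementation used to compare Resources in the scheduler. The default, DefaultResourceCalculator, only uses Memory, while DominantResourceCalculator uses the dominant resource to compare multi-dimensional resources such as Memory, CPU etc.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.queues</name>
        <value>q0</value>
        <description>The queues at this level (root is the root queue).</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.queues</name>
        <value>q01,q02</value>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.capacity</name>
        <value>100</value>
        <description>Target capacity of queue q0.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.q01.capacity</name>
        <value>60</value>
        <description>Target capacity of queue q01.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.q02.capacity</name>
        <value>40</value>
        <description>Target capacity of queue q02.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.user-limit-factor</name>
        <value>1</value>
        <description>Multiple of the queue capacity that a single user may consume. A value of 1 keeps each user within the configured capacity of queue q0.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.maximum-capacity</name>
        <value>100</value>
        <description>The maximum capacity of queue q0.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.q01.maximum-capacity</name>
        <value>80</value>
        <description>The maximum capacity of queue q01.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.q02.maximum-capacity</name>
        <value>80</value>
        <description>The maximum capacity of queue q02.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.state</name>
        <value>RUNNING</value>
        <description>The state of queue q0. State can be one of RUNNING or STOPPED.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.q01.state</name>
        <value>RUNNING</value>
        <description>The state of queue q01. State can be one of RUNNING or STOPPED.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.q02.state</name>
        <value>STOPPED</value>
        <description>The state of queue q02. State can be one of RUNNING or STOPPED.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.acl_submit_applications</name>
        <value>*</value>
        <description>The ACL of who can submit jobs to queue q0.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.root.q0.acl_administer_queue</name>
        <value>*</value>
        <description>The ACL of who can administer jobs on queue q0.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.node-locality-delay</name>
        <value>40</value>
        <description>Number of missed scheduling opportunities after which the CapacityScheduler attempts to schedule rack-local containers. Typically this should be set to the number of nodes in the cluster. By default it is set to approximately the number of nodes in one rack, which is 40.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.queue-mappings</name>
        <value></value>
        <description>A list of mappings that will be used to assign jobs to queues. The syntax for this list is [u|g]:[name]:[queue_name][,next mapping]*. Typically this list will be used to map users to queues; for example, u:%user:%user maps all users to queues with the same name as the user.</description>
      </property>

      <property>
        <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
        <value>false</value>
        <description>If a queue mapping is present, will it override the value specified by the user? This can be used by administrators to place jobs in queues that are different from the one specified by the user. The default is false.</description>
      </property>

    </configuration>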
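
The queue-mappings syntax described above, [u|g]:[name]:[queue_name][,next mapping]*, can be illustrated with a small parser. This is only a sketch of the syntax as the description states it, not the scheduler's own parsing code; `QueueMapping`, `parse_queue_mappings`, and the group name `analysts` are hypothetical names chosen for the example.

```python
from typing import List, NamedTuple

class QueueMapping(NamedTuple):
    kind: str   # "u" for a user mapping, "g" for a group mapping
    name: str   # user or group name, or a placeholder such as %user
    queue: str  # target queue name

def parse_queue_mappings(value: str) -> List[QueueMapping]:
    """Split a queue-mappings value into (kind, name, queue) entries."""
    mappings = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # an empty value means no mappings are configured
        parts = entry.split(":")
        if len(parts) != 3 or parts[0] not in ("u", "g"):
            raise ValueError(f"malformed queue mapping: {entry!r}")
        mappings.append(QueueMapping(*parts))
    return mappings

# "analysts" is a made-up group name; q01 is the leaf queue from the example.
for mapping in parse_queue_mappings("u:%user:%user,g:analysts:q01"):
    print(mapping.kind, mapping.name, "->", mapping.queue)
```

With the empty value used in the example configuration, the parser returns an empty list, which matches a cluster where no mappings are defined.
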

Tips:
At each level, the capacities of all sibling queues must add up to 100, e.g. q0.q01 + q0.q02 = 60 + 40 = 100.
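
The rule in this tip can be checked before the file is deployed. The sketch below assumes the properties have already been loaded into a plain dict of name to value; the function name `capacity_sum_errors` is hypothetical, and the property values are the ones from the example above.

```python
PREFIX = "yarn.scheduler.capacity."

def capacity_sum_errors(props):
    """Return {parent_queue: total} for every parent whose children do not sum to 100."""
    errors = {}
    for key, value in props.items():
        if not (key.startswith(PREFIX) and key.endswith(".queues")):
            continue
        parent = key[len(PREFIX):-len(".queues")]
        children = [c.strip() for c in value.split(",") if c.strip()]
        # A child without a capacity entry is counted as 0 in this sketch.
        total = sum(
            float(props.get(f"{PREFIX}{parent}.{child}.capacity", 0))
            for child in children
        )
        if abs(total - 100.0) > 1e-6:
            errors[parent] = total
    return errors

props = {
    "yarn.scheduler.capacity.root.queues": "q0",
    "yarn.scheduler.capacity.root.q0.queues": "q01,q02",
    "yarn.scheduler.capacity.root.q0.capacity": "100",
    "yarn.scheduler.capacity.root.q0.q01.capacity": "60",
    "yarn.scheduler.capacity.root.q0.q02.capacity": "40",
}

print(capacity_sum_errors(props))
```

An empty result means every level adds up to 100; any entry names the parent queue whose children need to be adjusted.
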

Example for Fair Scheduler

By default the allocation file is named fair-scheduler.xml and sits in the same directory as yarn-site.xml. You can point to a different file with yarn.scheduler.fair.allocation.file in yarn-site.xml.

Example: fair-scheduler-allocations.xml

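    <allocations>
      <queue name="...">
        <minResources>10000 mb,0vcores</minResources>
        <maxResources>90000 mb,1vcores</maxResources>
        <maxRunningApps>50</maxRunningApps>
        <maxAMShare>0.1</maxAMShare>
        <weight>2.0</weight>
        <schedulingPolicy>fair</schedulingPolicy>
        <queue name="...">
          <aclSubmitApps>hadoop</aclSubmitApps>
          <minResources>5000 mb,0vcores</minResources>
        </queue>
      </queue>

      <queueMaxAMShareDefault>0.5</queueMaxAMShareDefault>

      <queue name="...">
        <weight>3.0</weight>
      </queue>

      <user name="...">
        <maxRunningApps>30</maxRunningApps>
      </user>
      <userMaxAppsDefault>5</userMaxAppsDefault>

      <queuePlacementPolicy>
        <rule name="..." />
        <rule name="..." />
        <rule name="...">
            <rule name="..." />
        </rule>
        <rule name="..." />
      </queuePlacementPolicy>
    </allocations>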

Tips:
The queue name you set in your application should be a leaf queue, given by its path name (with or without the root prefix), e.g. q0.q01 or root.q0.q01.
![yarn-img-1](/Users/canhuamei/Desktop/screamshot/yarn-1.png)

(Note: later, while researching Spark, I found that in Spark it is enough to set the leaf queue name directly, e.g. sparkConf.set("spark.yarn.queue","q01");)
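
To make the naming options concrete, here is a small sketch that lists the leaf queues of a hierarchy and the spellings the tip and the Spark note describe. It only mirrors those observations and is not the scheduler's actual name resolution; the nested-dict layout and both function names are assumptions made for the example.

```python
def leaf_queue_paths(tree, prefix="root"):
    """Return the full path of every leaf queue in a nested dict of queues."""
    paths = []
    for name, children in tree.items():
        path = f"{prefix}.{name}"
        if children:
            paths.extend(leaf_queue_paths(children, path))
        else:
            paths.append(path)
    return paths

def described_names(leaf_path):
    """Spellings of one leaf queue mentioned in the notes above."""
    without_root = leaf_path[len("root."):] if leaf_path.startswith("root.") else leaf_path
    bare_leaf = leaf_path.rsplit(".", 1)[-1]  # the form reported to work in Spark
    return {leaf_path, without_root, bare_leaf}

# Queue hierarchy from the examples: q0 with children q01 and q02.
queues = {"q0": {"q01": {}, "q02": {}}}

for path in leaf_queue_paths(queues):
    print(path, sorted(described_names(path)))
```

Parent queues such as q0 never appear in the output, which matches the requirement that applications target a leaf queue.
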

HelloWorld for Yarn Application
