Capacity Scheduler Algorithm
Summary
Setting the scheduling algorithm
Setting the detailed capacity parameters
mapred.capacity-scheduler.queue.
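See capacity-scheduler.xml for details.
The current capacity-scheduler.xml configuration is used below to illustrate how the algorithm behaves.
It sets the capacity of the whole cluster and the shares of the two queues: the default queue takes 85% and secondqueue takes 15%.
It also sets the share given to each user within a queue.
With these parameters, the scheduler logs the following when it starts: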
2011-09-11 15:07:00,705 INFO org.apache.hadoop.mapred.CapacityTaskScheduler: Initializing 'default' queue with cap=85.0, maxCap=-1.0, ulMin=100, ulMinFactor=1.0, supportsPriorities=false, maxJobsToInit=9, maxJobsToAccept=90, maxActiveTasks=200000, maxJobsPerUserToInit=9, maxJobsPerUserToAccept=90, maxActiveTasksPerUser=100000
2011-09-11 15:07:00,706 INFO org.apache.hadoop.mapred.CapacityTaskScheduler: Initializing 'secondqueue' queue with cap=15.0, maxCap=-1.0, ulMin=25, ulMinFactor=1.0, supportsPriorities=true, maxJobsToInit=2, maxJobsToAccept=6, maxActiveTasks=200000, maxJobsPerUserToInit=1, maxJobsPerUserToAccept=3, maxActiveTasksPerUser=100000
How the values for the default queue are computed
maxJobsToInit: the maximum number of jobs that can run concurrently in the queue
int maxJobsToInit = (int)Math.ceil(maxSystemJobs * capacityPercent/100.0);
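With maxSystemJobs = 10: 9 = ceil(10 * 85 / 100) = ceil(8.5)
maxJobsPerUserToInit: the maximum number of jobs a single user can run concurrently in the queue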
int maxJobsPerUserToInit =
(int)Math.ceil(maxSystemJobs * capacityPercent/100.0 * ulMin/100.0);
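For secondqueue, the maximum number of concurrently running jobs is 2; with minimum-user-limit-percent = 25, this gives
maxJobsPerUserToInit = ceil(10 * 15 / 100 * 25 / 100) = ceil(0.375) = 1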
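The arithmetic above can be checked outside Hadoop with a minimal sketch. The class name QueueLimitsSketch and its method signatures are hypothetical; only the two expressions are taken from the scheduler lines quoted above, and maxSystemJobs = 10 is assumed from the "10 * 85%" calculation.

```java
// Sketch only: re-evaluates the two quoted formulas for both queues.
final class QueueLimitsSketch {

    // Same expression as the quoted maxJobsToInit line.
    static int maxJobsToInit(int maxSystemJobs, double capacityPercent) {
        return (int) Math.ceil(maxSystemJobs * capacityPercent / 100.0);
    }

    // Same expression as the quoted maxJobsPerUserToInit line.
    static int maxJobsPerUserToInit(int maxSystemJobs, double capacityPercent, int ulMin) {
        return (int) Math.ceil(maxSystemJobs * capacityPercent / 100.0 * ulMin / 100.0);
    }

    public static void main(String[] args) {
        int maxSystemJobs = 10; // assumption: inferred from the text, not read from any config

        // default queue: cap=85.0, ulMin=100
        System.out.println("default maxJobsToInit = " + maxJobsToInit(maxSystemJobs, 85.0));
        System.out.println("default maxJobsPerUserToInit = "
                + maxJobsPerUserToInit(maxSystemJobs, 85.0, 100));

        // secondqueue: cap=15.0, ulMin=25
        System.out.println("secondqueue maxJobsToInit = " + maxJobsToInit(maxSystemJobs, 15.0));
        System.out.println("secondqueue maxJobsPerUserToInit = "
                + maxJobsPerUserToInit(maxSystemJobs, 15.0, 25));
    }
}
```

The results (9 and 9 for default, 2 and 1 for secondqueue) agree with the maxJobsToInit and maxJobsPerUserToInit values in the two log lines. Because of Math.ceil, even a very small share still yields a limit of at least 1.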
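With this configuration, secondqueue behaves as follows: the queue supports job priority ordering, runs at most 2 jobs concurrently, and holds at most 6 jobs, so up to 4 jobs can be in the waiting state. maxActiveTasks and maxActiveTasksPerUser cap the number of tasks while jobs run; both are set directly by configuration parameters.
Within secondqueue, a single user can run at most 1 job and can have at most 3 jobs in the queue.
Once the number of jobs in the queue reaches maxJobsToAccept, or a user's jobs reach maxJobsPerUserToAccept, further submissions are rejected immediately with a message that the queue is full.
These conclusions were obtained by testing with the dfs and mine accounts on static-1.space|app-25.space.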
The corresponding code is in the checkJobSubmissionLimits() method of CapacitySchedulerQueue:
// Across all jobs in the queue
int queueWaitingJobs = getNumWaitingJobs();
int queueInitializingJobs = getNumInitializingJobs();
int queueRunningJobs = getNumRunningJobs();
if ((queueWaitingJobs + queueInitializingJobs + queueRunningJobs) >= maxJobsToAccept) {
  throw new IOException(
      "Job '" + job.getJobID() + "' from user '" + user +
      "' rejected since queue '" + queueName + "' already has " +
      queueWaitingJobs + " waiting jobs, " +
      queueInitializingJobs + " initializing jobs and " +
      queueRunningJobs + " running jobs - Exceeds limit of " +
      maxJobsToAccept + " jobs to accept");
}
// Across all jobs of the user
int userWaitingJobs = getNumWaitingJobsByUser(user);
int userInitializingJobs = getNumInitializingJobsByUser(user);
int userRunningJobs = getNumRunningJobsByUser(user);
if ((userWaitingJobs + userInitializingJobs + userRunningJobs) >= maxJobsPerUserToAccept) {
  throw new IOException(
      "Job '" + job.getJobID() + "' rejected since user '" + user +
      "' already has " + userWaitingJobs + " waiting jobs, " +
      userInitializingJobs + " initializing jobs and " +
      userRunningJobs + " running jobs - " +
      " Exceeds limit of " + maxJobsPerUserToAccept + " jobs to accept" +
      " in queue '" + queueName + "' per user");
}
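To see how the two checks interact, here is a simplified standalone sketch, not the Hadoop class itself. SubmissionLimitSketch, submit, trySubmit, and the single combined counter (waiting + initializing + running) are hypothetical simplifications. The limits 6 and 3 are the maxJobsToAccept and maxJobsPerUserToAccept values logged for secondqueue, and the account names dfs and mine come from the test described above.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Sketch only: models the queue-level and per-user acceptance checks with plain counters.
final class SubmissionLimitSketch {
    private final String queueName;
    private final int maxJobsToAccept;
    private final int maxJobsPerUserToAccept;
    private int jobsInQueue = 0; // stands in for waiting + initializing + running
    private final Map<String, Integer> jobsPerUser = new HashMap<>();

    SubmissionLimitSketch(String queueName, int maxJobsToAccept, int maxJobsPerUserToAccept) {
        this.queueName = queueName;
        this.maxJobsToAccept = maxJobsToAccept;
        this.maxJobsPerUserToAccept = maxJobsPerUserToAccept;
    }

    // The queue-wide check runs first, then the per-user check, as in the quoted code.
    void submit(String user) throws IOException {
        if (jobsInQueue >= maxJobsToAccept) {
            throw new IOException("queue '" + queueName + "' already has " + jobsInQueue
                    + " jobs - Exceeds limit of " + maxJobsToAccept + " jobs to accept");
        }
        int userJobs = jobsPerUser.getOrDefault(user, 0);
        if (userJobs >= maxJobsPerUserToAccept) {
            throw new IOException("user '" + user + "' already has " + userJobs
                    + " jobs - Exceeds limit of " + maxJobsPerUserToAccept
                    + " jobs to accept in queue '" + queueName + "' per user");
        }
        jobsInQueue++;
        jobsPerUser.put(user, userJobs + 1);
    }

    // Convenience wrapper: returns false instead of throwing when a job is rejected.
    boolean trySubmit(String user) {
        try {
            submit(user);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    int jobsInQueue() {
        return jobsInQueue;
    }

    public static void main(String[] args) {
        SubmissionLimitSketch queue = new SubmissionLimitSketch("secondqueue", 6, 3);
        String[] submitters = {"dfs", "dfs", "dfs", "dfs", "mine", "mine", "mine", "mine"};
        for (String user : submitters) {
            try {
                queue.submit(user);
                System.out.println("accepted job from " + user);
            } catch (IOException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        }
    }
}
```

In this run the fourth dfs job is rejected by the per-user limit, while the fourth mine job is rejected by the queue-wide limit because the queue already holds 6 jobs. Since the comparison is >=, a submission is refused as soon as the count reaches the limit.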
Final summary
Queue Name State Scheduling Information
default running Queue configuration
Capacity Percentage: 85.0%
User Limit: 100%
Priority Supported: NO
-------------
Map tasks
Capacity: 17 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
-------------
Reduce tasks
Capacity: 17 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
-------------
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 0
secondqueue running Queue configuration
Capacity Percentage: 15.0%
User Limit: 25%
Priority Supported: YES
-------------
Map tasks
Capacity: 3 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
-------------
Reduce tasks
Capacity: 3 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
-------------
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 0
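The slot capacities in the listing above are consistent with each queue receiving its percentage of the cluster's slots. The sketch below checks that arithmetic under two assumptions that the source does not state: the cluster has 20 map slots in total (inferred from 17 + 3), and the share is truncated to an integer. SlotShareSketch and slotCapacity are hypothetical names, not Hadoop APIs.

```java
// Sketch only: splits an assumed cluster slot total by queue percentage.
final class SlotShareSketch {

    // Assumption: a queue's slot capacity is percent * total / 100, truncated to an int.
    static int slotCapacity(int clusterSlots, double capacityPercent) {
        return (int) (capacityPercent * clusterSlots / 100.0);
    }

    public static void main(String[] args) {
        int clusterSlots = 20; // assumption: inferred from 17 + 3 in the listing
        System.out.println("default: " + slotCapacity(clusterSlots, 85.0) + " slots");
        System.out.println("secondqueue: " + slotCapacity(clusterSlots, 15.0) + " slots");
    }
}
```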
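In an environment where multiple users run jobs in parallel, capacity-scheduler divides the cluster's existing slots among queues by percentage,
and in turn controls the share given to each user within a queue.
Compared with Hadoop's default JobQueueTaskScheduler, a conventional single-queue FIFO scheduler with no such controls, it is a major improvement and is recommended.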
Source code analysis of the Hadoop CapacityScheduler scheduling algorithm: http://blog.csdn.net/zhoujq/article/details/6737441