This topic applies to YARN clusters only, and describes how to tune and optimize YARN for your cluster.
Worker nodes also run system support services, including the Linux operating system itself and possibly third-party monitoring or asset management services. In addition, you must allow resources for task buffers, such as the HDFS Sort I/O buffer. For vcore demand, use the number of concurrent processes or tasks each service runs as an initial guide; for the operating system, start with a count of two. The following table shows an example allocation for a worker host with 24 vcores and 262,144 MB (256 GB) of RAM:
Service | vcores | Memory (MB) |
---|---|---|
Operating system | 2 | |
YARN NodeManager | 1 | |
HDFS DataNode | 1 | 1,024 |
Impala Daemon | 1 | 16,348 |
HBase RegionServer | 0 | 0 |
Solr Server | 0 | 0 |
Cloudera Manager agent | 1 | 1,024 |
Task overhead | 0 | 52,429 |
YARN containers | 18 | 137,830 |
Total | 24 | 262,144 |
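As a quick check of the arithmetic, the last two rows follow from the totals: whatever is not reserved for other services is left for YARN containers. The sketch below is illustrative only; the reserved-memory figure is simply the sum of the non-container rows above (including the cells left blank), and you would substitute your own host's numbers.

```sh
#!/bin/sh
# Illustrative sketch: resources left for YARN containers on the example host.
# RESERVED_MB stands for the sum of all non-container memory rows above
# (operating system, Hadoop service roles, agent, and task overhead).

TOTAL_VCORES=24
TOTAL_MB=262144        # 256 GB worker host

RESERVED_VCORES=6      # 2 (OS) + 1 (NodeManager) + 1 (DataNode) + 1 (Impala) + 1 (CM agent)
RESERVED_MB=124314     # substitute the sum of your own reservations

echo "vcores for YARN containers: $((TOTAL_VCORES - RESERVED_VCORES))"   # 18
echo "memory for YARN containers: $((TOTAL_MB - RESERVED_MB)) MB"        # 137830
```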
You can now configure YARN to use the remaining resources for its supervisory processes and task containers. Start with the NodeManager, which has the following settings:
Property | Description | Default |
---|---|---|
yarn.nodemanager.resource.cpu-vcores | Number of virtual CPU cores that can be allocated for containers. | 8 |
yarn.nodemanager.resource.memory-mb | Amount of physical memory, in MB, that can be allocated for containers. | 8 GB |
For the example worker node, these properties are set as follows:

Property | Value |
---|---|
yarn.nodemanager.resource.cpu-vcores | min(24 – 6, 2 x 10) = 18 |
yarn.nodemanager.resource.memory-mb | 137,830 MB |
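The vcore figure appears to be the smaller of two bounds: the vcores left after other services (24 – 6) and twice the number of physical drives (2 x 10). That reading is an inference from the table, so treat the following sketch as illustrative rather than a prescribed formula.

```sh
#!/bin/sh
# Sketch of the vcore calculation in the table above. The "2 x physical drives"
# bound is inferred from the "2 x 10" term; adjust the inputs for your hosts.

TOTAL_VCORES=24
RESERVED_VCORES=6
PHYSICAL_DRIVES=10

REMAINING=$((TOTAL_VCORES - RESERVED_VCORES))
DRIVE_BOUND=$((2 * PHYSICAL_DRIVES))

if [ "$REMAINING" -lt "$DRIVE_BOUND" ]; then
  NM_VCORES=$REMAINING
else
  NM_VCORES=$DRIVE_BOUND
fi

echo "yarn.nodemanager.resource.cpu-vcores = $NM_VCORES"   # 18 for this example
```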
Next, configure the scheduler properties that govern how container requests are sized:

Property | Description | Default |
---|---|---|
yarn.scheduler.minimum-allocation-vcores | The smallest number of virtual CPU cores that can be requested for a container. | 1 |
yarn.scheduler.maximum-allocation-vcores | The largest number of virtual CPU cores that can be requested for a container. | 32 |
yarn.scheduler.increment-allocation-vcores | If using the Fair Scheduler, virtual core requests are rounded up to the nearest multiple of this number. | 1 |
yarn.scheduler.minimum-allocation-mb | The smallest amount of physical memory, in MB, that can be requested for a container. | 1 GB |
yarn.scheduler.maximum-allocation-mb | The largest amount of physical memory, in MB, that can be requested for a container. | 64 GB |
yarn.scheduler.increment-allocation-mb | If you are using the Fair Scheduler, memory requests are rounded up to the nearest multiple of this number. | 512 MB |
If a NodeManager has 50 GB or more of RAM available for containers, consider increasing the minimum allocation to 2 GB. The default memory increment is 512 MB; with the default 1 GB minimum, a container that requests 1.2 GB is rounded up to 1.5 GB. You can set the maximum memory allocation equal to yarn.nodemanager.resource.memory-mb.
The default minimum and increment value for vcores is 1. Because application tasks are not commonly multithreaded, you generally do not need to change this value. The maximum value is usually equal to yarn.nodemanager.resource.cpu-vcores. Reduce this value to limit the number of containers running concurrently on one node.
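The memory rounding described above can be illustrated with the default values. This sketch simply reproduces the 1.2 GB example; it is not the scheduler's actual code path.

```sh
#!/bin/sh
# Reproduce the rounding example: with a 1 GB minimum and a 512 MB increment,
# a request of roughly 1.2 GB (1,229 MB) is granted 1.5 GB (1,536 MB).

MIN_MB=1024
INC_MB=512
REQUEST_MB=1229

ALLOC_MB=$REQUEST_MB
[ "$ALLOC_MB" -lt "$MIN_MB" ] && ALLOC_MB=$MIN_MB
# Round up to the next multiple of the increment.
ALLOC_MB=$(( (ALLOC_MB + INC_MB - 1) / INC_MB * INC_MB ))

echo "granted allocation: ${ALLOC_MB} MB"   # 1536
```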
The example leaves more than 50 GB RAM available for containers, which accommodates the following settings:
Property | Value |
---|---|
yarn.scheduler.minimum-allocation-mb | 2,048 MB |
yarn.scheduler.maximum-allocation-mb | 137,830 MB |
yarn.scheduler.maximum-allocation-vcores | 18 |
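Because these maximums are per-container caps, they should not exceed what a single NodeManager offers. A trivial check of the example values (purely illustrative):

```sh
#!/bin/sh
# Sanity check: a container cannot be larger than one NodeManager's resources.

NM_MB=137830
NM_VCORES=18
MAX_ALLOC_MB=137830
MAX_ALLOC_VCORES=18

[ "$MAX_ALLOC_MB" -le "$NM_MB" ] \
  || echo "WARNING: yarn.scheduler.maximum-allocation-mb exceeds NodeManager memory"
[ "$MAX_ALLOC_VCORES" -le "$NM_VCORES" ] \
  || echo "WARNING: yarn.scheduler.maximum-allocation-vcores exceeds NodeManager vcores"
```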
The remaining settings apply to MapReduce tasks and the ApplicationMaster. They are configured on the MapReduce gateway/client and can also be overridden per job:

Property | Description | Default |
---|---|---|
mapreduce.map.memory.mb | The amount of physical memory, in MB, allocated for each map task of a job. | 1 GB |
mapreduce.map.java.opts.max.heap | The maximum Java heap size, in bytes, of the map processes. | 800 MB |
mapreduce.map.cpu.vcores | The number of virtual CPU cores allocated for each map task of a job. | 1 |
mapreduce.reduce.memory.mb | The amount of physical memory, in MB, allocated for each reduce task of a job. | 1 GB |
mapreduce.reduce.java.opts.max.heap | The maximum Java heap size, in bytes, of the reduce processes. | 800 MB |
mapreduce.reduce.cpu.vcores | The number of virtual CPU cores for each reduce task of a job. | 1 |
yarn.app.mapreduce.am.resource.mb | The physical memory requirement, in MB, for the ApplicationMaster. | 1 GB |
ApplicationMaster Java maximum heap size | The maximum heap size, in bytes, of the Java MapReduce ApplicationMaster. Exposed in Cloudera Manager as part of the YARN service configuration. This value is folded into the property yarn.app.mapreduce.am.command-opts. | 800 MB |
yarn.app.mapreduce.am.resource.cpu-vcores | The virtual CPU cores requirement for the ApplicationMaster. | 1 |
The mapreduce.[map | reduce].java.opts.max.heap settings specify the default heap sizes for map and reduce tasks, respectively. The mapreduce.[map | reduce].memory.mb settings specify the memory allotted to their containers, and the value assigned should allow overhead beyond the task heap size; Cloudera recommends sizing container memory at roughly 1.2 times the configured heap, though the optimal ratio depends on the actual tasks. Cloudera also recommends setting mapreduce.map.memory.mb to 1–2 GB and setting mapreduce.reduce.memory.mb to twice the mapper value. The ApplicationMaster heap size is 1 GB by default, and can be increased if your jobs contain many concurrent tasks. Using these guidelines, size the example worker node as follows:
Property | Value |
---|---|
mapreduce.map.memory.mb | 2,048 MB |
mapreduce.reduce.memory.mb | 4,096 MB |
mapreduce.map.java.opts.max.heap | 0.8 x 2,048 = 1,638 MB |
mapreduce.reduce.java.opts.max.heap | 0.8 x 4,096 = 3,277 MB |
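These values can also be supplied per job on the command line, the way the benchmark script at the end of this topic does. The sketch below derives the heap sizes with the same 0.8 factor (bc truncates, so the reduce heap comes out as 3,276 MB rather than the 3,277 MB shown in the table); the jar, class, and paths are placeholders, and the job is assumed to use ToolRunner so that the -D options are honored.

```sh
#!/bin/sh
# Apply the example task sizing to a single job submission.
# your-job.jar, YourJob, and the HDFS paths are placeholders.

MAP_MB=2048
RED_MB=4096
MAP_HEAP=`echo "($MAP_MB*0.8)/1" | bc`   # 1638
RED_HEAP=`echo "($RED_MB*0.8)/1" | bc`   # 3276

hadoop jar your-job.jar YourJob \
    -Dmapreduce.map.memory.mb=$MAP_MB \
    -Dmapreduce.map.java.opts.max.heap=$MAP_HEAP \
    -Dmapreduce.reduce.memory.mb=$RED_MB \
    -Dmapreduce.reduce.java.opts.max.heap=$RED_HEAP \
    /input/path /output/path
```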
With YARN worker resources configured, you can determine how many containers best support a MapReduce application, based on the job type and the system resources. For example, a CPU-bound workload such as a Monte Carlo simulation requires very little data but complex, iterative processing, so its ratio of concurrent containers to spindles is likely higher than for an ETL workload, which tends to be I/O-bound. For applications that use a lot of memory in the map or reduce phase, the number of containers that can be scheduled is limited by the RAM available to each container and the RAM required by the task. Other applications may be limited by the vcores not already in use by other YARN applications, or by the rules of dynamic resource pools (if used).
To calculate the number of containers for mappers and reducers based on actual system constraints, start with the following formulas:
Property | Value |
---|---|
mapreduce.job.maps | MIN(yarn.nodemanager.resource.memory-mb / mapreduce.map.memory.mb, yarn.nodemanager.resource.cpu-vcores / mapreduce.map.cpu.vcores, number of physical drives x workload factor) x number of worker nodes |
mapreduce.job.reduces | MIN(yarn.nodemanager.resource.memory-mb / mapreduce.reduce.memory.mb, yarn.nodemanager.resource.cpu-vcores / mapreduce.reduce.cpu.vcores, number of physical drives x workload factor) x number of worker nodes |
The workload factor can be set to 2.0 for most workloads. Consider a higher setting for CPU-bound workloads.
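As a worked example, using the example worker-node values with the drive count (10) and workload factor (2.0) assumed earlier, plus a hypothetical 10-node cluster, the formulas give 18 map and 18 reduce containers per node:

```sh
#!/bin/sh
# Worked example of the container-count formulas. The drive count and workload
# factor are assumptions carried from earlier; NODES is a hypothetical cluster size.

NM_MB=137830;  NM_VCORES=18
MAP_MB=2048;   MAP_VCORES=1
RED_MB=4096;   RED_VCORES=1
DRIVES=10;     FACTOR=2
NODES=10

min3() {
  m=$1
  [ "$2" -lt "$m" ] && m=$2
  [ "$3" -lt "$m" ] && m=$3
  echo "$m"
}

MAPS_PER_NODE=`min3 $((NM_MB / MAP_MB)) $((NM_VCORES / MAP_VCORES)) $((DRIVES * FACTOR))`
REDS_PER_NODE=`min3 $((NM_MB / RED_MB)) $((NM_VCORES / RED_VCORES)) $((DRIVES * FACTOR))`

echo "mapreduce.job.maps    = $((MAPS_PER_NODE * NODES))"   # 18 x 10 = 180
echo "mapreduce.job.reduces = $((REDS_PER_NODE * NODES))"   # 18 x 10 = 180
```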
You may also need to maximize or limit cluster utilization for a given workload, or to meet Service Level Agreements (SLAs). To find the best resource configuration for an application, try various container and gateway/client settings and record the results.
For example, the following TeraGen/TeraSort script supports throughput testing with a 10-GB data load and a loop of varying YARN container and gateway/client settings. You can observe which configuration yields the best results.
```sh
#!/bin/sh

HADOOP_PATH=/opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce

for i in 2 4 8 16 32 64        # Number of mapper containers to test
do
  for j in 2 4 8 16 32 64      # Number of reducer containers to test
  do
    for k in 1024 2048         # Container memory for mappers/reducers to test
    do
      MAP_MB=`echo "($k*0.8)/1" | bc`   # JVM heap size for mappers
      RED_MB=`echo "($k*0.8)/1" | bc`   # JVM heap size for reducers

      # Generate 10 GB of input data (100,000,000 rows x 100 bytes).
      hadoop jar $HADOOP_PATH/hadoop-examples.jar teragen \
          -Dmapreduce.job.maps=$i \
          -Dmapreduce.map.memory.mb=$k \
          -Dmapreduce.map.java.opts.max.heap=$MAP_MB \
          100000000 /results/tg-10GB-${i}-${j}-${k} \
          1>tera_${i}_${j}_${k}.out 2>tera_${i}_${j}_${k}.err

      # Sort the generated data (terasort reads the teragen output directory).
      hadoop jar $HADOOP_PATH/hadoop-examples.jar terasort \
          -Dmapreduce.job.maps=$i \
          -Dmapreduce.job.reduces=$j \
          -Dmapreduce.map.memory.mb=$k \
          -Dmapreduce.map.java.opts.max.heap=$MAP_MB \
          -Dmapreduce.reduce.memory.mb=$k \
          -Dmapreduce.reduce.java.opts.max.heap=$RED_MB \
          /results/tg-10GB-${i}-${j}-${k} /results/ts-10GB-${i}-${j}-${k} \
          1>>tera_${i}_${j}_${k}.out 2>>tera_${i}_${j}_${k}.err

      # Clean up so the next iteration starts fresh.
      hadoop fs -rmr -skipTrash /results/tg-10GB-${i}-${j}-${k}
      hadoop fs -rmr -skipTrash /results/ts-10GB-${i}-${j}-${k}
    done
  done
done
```