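ResourceManager (RM): The RM is the global resource manager, responsible for managing and allocating resources across the whole system. It consists of two main components: the Scheduler and the Applications Manager (ASM).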
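- Multiple queues: the Capacity Scheduler is built on multiple scheduling queues, and each queue uses a FIFO scheduling policy.
- Capacity guarantee: each queue can be given a minimum and a maximum resource share, which guarantees that its jobs can run.
- Flexibility: resources can be shared between queues. A queue with spare resources can temporarily lend them to other queues, but as soon as it needs them again the other queues must give them back. The lending queue keeps absolute ownership of those resources.
- Multiple users: a queue can hold jobs submitted by several users, and a resource cap can be set per user so that no single user's jobs consume all of the queue's resources.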
- The task and its data are on the same node.
- The task and its data are on the same rack.
First allocation: 100 / 3 = 33.33, so queueA=33.33% (13.33% surplus), queueB=33.33% (16.67% short), queueC=33.33% (3.33% surplus).
Second allocation: (13.33 + 3.33) / 1 ≈ 16.67, so queueA=20% (33.33 - 13.33), queueB=50% (33.33 + 16.67), queueC=30% (33.33 - 3.33).
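a) Unweighted: assume there are 12 resource units in total and 4 jobs whose demands are job1=1, job2=2, job3=6, job4=5.
First allocation: 12 / 4 = 3, so job1=3 (2 surplus), job2=3 (1 surplus), job3=3 (3 short), job4=3 (2 short).
Second allocation: the surplus of 2 + 1 = 3 is split between the two jobs that are still short, 3 / 2 = 1.5, so job1=1 (3 - 2), job2=2 (3 - 1), job3=4.5 (3 + 1.5), job4=4.5 (3 + 1.5).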
b) Weighted: assume there are 16 resource units in total and 4 jobs, each with a weight (shown in parentheses). Their demands are job1(5)=4, job2(8)=2, job3(1)=10, job4(2)=4.
First allocation: 16 / (5 + 8 + 1 + 2) = 1, so job1=5 (5 * 1, 1 surplus), job2=8 (8 * 1, 6 surplus), job3=1 (1 * 1, 9 short), job4=2 (2 * 1, 2 short).
Second allocation: (1 + 6) / (1 + 2) = 2.33, so job1=4 (5 - 1), job2=2 (8 - 6), job3=3.33 (1 + 1 * 2.33, 6.67 short), job4=6.67 (2 + 2 * 2.33, 2.67 surplus).
Third allocation: 2.67 / 1 = 2.67, so job1=4, job2=2, job3=6 (3.33 + 1 * 2.67, still 4 short), job4=4 (6.67 - 2.67, demand met exactly).
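The round-by-round redistribution used in the examples above can be sketched in Python. This is only an illustration of the arithmetic, not the scheduler's actual implementation; the function name `fair_share` and its parameters are hypothetical. Each round splits the unallocated resources among the jobs that are still short, in proportion to their weights, caps every job at its demand, and carries the surplus into the next round.

```python
def fair_share(total, demands, weights=None):
    """Round-based (weighted) fair allocation, mirroring the worked examples.

    total   -- total amount of resource to hand out
    demands -- how much each job asks for
    weights -- optional per-job weights (defaults to 1 for every job)
    """
    n = len(demands)
    weights = weights or [1.0] * n
    alloc = [0.0] * n
    active = set(range(n))      # jobs whose demand is not yet satisfied
    remaining = float(total)    # resources to distribute in the next round
    eps = 1e-9                  # tolerance for floating-point comparisons

    while active and remaining > eps:
        # Share per unit of weight among the jobs that are still short.
        unit = remaining / sum(weights[i] for i in active)
        remaining = 0.0
        for i in list(active):
            alloc[i] += weights[i] * unit
            if alloc[i] >= demands[i] - eps:
                # Demand met: return the surplus and drop out of later rounds.
                remaining += alloc[i] - demands[i]
                alloc[i] = float(demands[i])
                active.remove(i)
    return alloc


if __name__ == "__main__":
    # Unweighted example a): 12 units, demands 1, 2, 6, 5.
    print(fair_share(12, [1, 2, 6, 5]))  # [1.0, 2.0, 4.5, 4.5]
    # Weighted example b): 16 units, weights 5, 8, 1, 2.
    weighted = fair_share(16, [4, 2, 10, 4], [5, 8, 1, 2])
    print([round(x, 2) for x in weighted])  # [4.0, 2.0, 6.0, 4.0]
```

The loop stops either when every job is satisfied or when a round produces no surplus, which is why job3 and job4 in example a) end up below their demands.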
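Kill an application: yarn application -kill application_1645869756054_0001
List the attempts of an application (the output includes the container ID): yarn applicationattempt -list application_1645869756054_0001
View the logs of an application: yarn logs -applicationId application_1645869756054_0001
View the logs of one container of an application: yarn logs -applicationId application_1645869756054_0001 -containerId container_1645869756054_0001_01_000001
View the status of an application attempt: yarn applicationattempt -status appattempt_1645869756054_0001_000001
List the running containers (only works while the application is running, because containers are released once it finishes): yarn container -list appattempt_1645869756054_0001_000001
View the current status of a container (only works while the application is running, because containers are released once it finishes): yarn container -status container_1645869756054_0001_01_000001
Reload the queue configuration (use this after changing queue settings while the cluster is running): yarn rmadmin -refreshQueues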
<description>Number of threads to handle scheduler interface.</description>
<description>The class to use as the resource scheduler.</description>
<description>Enable auto-detection of node capabilities such as
memory and CPU.
<description>Flag to determine if logical processors (such as
hyperthreads) should be counted as cores. Only applicable on Linux
when yarn.nodemanager.resource.cpu-vcores is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true.
<description>Multiplier to determine how to convert physical cores to
vcores. This value is used if yarn.nodemanager.resource.cpu-vcores
is set to -1 (which implies auto-calculate vcores) and
yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The
number of vcores will be calculated as
number of CPUs * multiplier.
<description>Amount of physical memory, in MB, that can be allocated
for containers. If set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically calculated (in case of Windows and Linux).
In other cases, the default is 8192MB.
<description>Number of vcores that can be allocated
for containers. This is used by the RM scheduler when allocating
resources for containers. This is not used to limit the number of
CPUs used by YARN containers. If it is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically determined from the hardware in case of Windows and Linux.
In other cases, number of vcores is 8 by default.</description>
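The vcore rules quoted above can be summarised in a small Python sketch. It only restates the descriptions and is not NodeManager code; the function `node_vcores` and its parameter names are hypothetical, and truncating the product to an integer is an assumption.

```python
def node_vcores(configured_vcores, detect_hardware, physical_cores,
                logical_processors, count_logical_as_cores=False,
                multiplier=1.0, default_vcores=8):
    """Number of vcores a node would advertise under the quoted rules."""
    # An explicit yarn.nodemanager.resource.cpu-vcores value is used as-is.
    if configured_vcores != -1:
        return configured_vcores
    # -1 without hardware detection falls back to the default of 8.
    if not detect_hardware:
        return default_vcores
    # With detection enabled: number of CPUs * multiplier, where "CPUs" means
    # logical processors only if the corresponding flag is set.
    cpus = logical_processors if count_logical_as_cores else physical_cores
    return int(cpus * multiplier)  # truncation is an assumption of this sketch


if __name__ == "__main__":
    # 8 physical cores with hyperthreading (16 logical processors).
    print(node_vcores(-1, True, 8, 16))                               # 8
    print(node_vcores(-1, True, 8, 16, count_logical_as_cores=True))  # 16
    print(node_vcores(-1, False, 8, 16))                              # 8 (default)
```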
<description>The minimum allocation for every container request at the RM
in MBs. Memory requests lower than this will be set to the value of this
property. Additionally, a node manager that is configured to have less memory
than this value will be shut down by the resource manager.</description>
<description>The maximum allocation for every container request at the RM
in MBs. Memory requests higher than this will throw an
<description>The minimum allocation for every container request at the RM
in terms of virtual CPU cores. Requests lower than this will be set to the
value of this property. Additionally, a node manager that is configured to
have fewer virtual cores than this value will be shut down by the resource
<description>The maximum allocation for every container request at the RM
in terms of virtual CPU cores. Requests higher than this will throw an
<description>Whether virtual memory limits will be enforced for
<description>Ratio between virtual memory to physical memory when
setting memory limits for containers. Container allocations are
expressed in terms of physical memory, and virtual memory usage
is allowed to exceed this allocation by this ratio.
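As a rough illustration of the allocation limits and the virtual-memory ratio described above, the following Python sketch clamps a container request and derives the virtual memory limit. The function names and the numbers in the usage lines are illustrative assumptions, and `ValueError` merely stands in for whatever exception the ResourceManager raises.

```python
def normalize_request(requested, minimum, maximum):
    """Apply the minimum/maximum allocation rules to one request (MB or vcores)."""
    if requested > maximum:
        # Requests above the maximum are rejected; ValueError is a stand-in.
        raise ValueError(f"requested {requested} exceeds maximum {maximum}")
    # Requests below the minimum are raised to the minimum.
    return max(requested, minimum)


def vmem_limit_mb(physical_mb, vmem_pmem_ratio):
    """Virtual memory a container may use for a given physical allocation."""
    return physical_mb * vmem_pmem_ratio


if __name__ == "__main__":
    # Illustrative values only: minimum 1024 MB, maximum 8192 MB, ratio 2.1.
    granted = normalize_request(512, 1024, 8192)
    print(granted)                      # 1024
    print(vmem_limit_mb(granted, 2.1))  # about 2150.4
```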
Edit /hadoop-3.2.2/etc/hadoop/capacity-scheduler.xml and distribute the modified file to every node. To apply the change without restarting the cluster, run yarn rmadmin -refreshQueues.
The queues at this level (root is the root queue).
<description>Default queue target capacity.</description>
Default queue user limit, a percentage from 0.0 to 1.0.
The maximum capacity of the default queue.
The state of the default queue. State can be one of RUNNING or STOPPED.
The ACL of who can submit jobs to the default queue.
The ACL of who can submit applications with configured priority.
For e.g, [user={name} group={name} max_priority={priority} default_priority={priority}]
Maximum lifetime of an application which is submitted to a queue
in seconds. Any value less than or equal to zero will be considered as
This will be a hard time limit for all applications in this
queue. If positive value is configured then any application submitted
to this queue will be killed after it exceeds the configured lifetime.
User can also specify lifetime per application basis in
application submission context. But user lifetime will be
overridden if it exceeds queue maximum lifetime. It is point-in-time
Note: Configuring too low a value will result in applications being killed
sooner. This feature is applicable only for leaf queue.
Default lifetime of an application which is submitted to a queue
in seconds. Any value less than or equal to zero will be considered as
If the user has not submitted application with lifetime value then this
value will be taken. It is point-in-time configuration.
Note : Default lifetime can't exceed maximum lifetime. This feature is
applicable only for leaf queue.
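How the per-application, default and maximum lifetimes interact can be sketched as follows. This Python snippet only restates the rules in the descriptions above; `effective_lifetime` is a hypothetical helper, and treating a value less than or equal to zero as "not set" is an assumption of the sketch.

```python
def effective_lifetime(requested, queue_default, queue_max):
    """Lifetime in seconds that would apply to an application in a leaf queue.

    requested     -- lifetime from the application submission context
    queue_default -- default lifetime configured on the queue
    queue_max     -- maximum lifetime configured on the queue
    Values <= 0 are treated as "not set" (assumption of this sketch).
    """
    # Without a user-supplied lifetime, the queue default is taken.
    lifetime = requested if requested > 0 else queue_default
    # A lifetime above the queue maximum is overridden by the maximum.
    if queue_max > 0 and lifetime > queue_max:
        lifetime = queue_max
    return lifetime


if __name__ == "__main__":
    print(effective_lifetime(0, 600, 3600))     # 600: queue default applies
    print(effective_lifetime(7200, 600, 3600))  # 3600: capped at queue maximum
    print(effective_lifetime(1200, 600, 3600))  # 1200: user value kept
```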
(1) Specify the queue directly when running the jar: hadoop jar /orkasgb/software/hadoop-3.2.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.2.jar wordcount -D mapreduce.job.queuename=hive /wordcount/input /wordcount/output5
configuration.set("mapreduce.job.queuename", "hive");
Defines maximum application priority in a cluster.
If an application is submitted with a priority higher than this value, it will be
reset to this maximum value.
(1) Set the job priority to 5 and submit the job: hadoop jar /orkasgb/software/hadoop-3.2.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.2.jar wordcount -D mapreduce.job.priority=5 /wordcount/input /wordcount/output8
(2) Change the priority of an application that has already been submitted: yarn application -appId application_1646114756125_0004 -updatePriority 3
export HDFS_NAMENODE_OPTS="-Dhadoop.security.logger=INFO,RFAS -Xmx1024m"
export HDFS_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS -Xmx1024m"
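<!-- Maximum number of concurrent DataNode requests the NameNode can handle; the default is 10 -->
<!-- Enable the Hadoop trash feature; 0 disables it, any other value is the number of minutes after which the trash is emptied -->
<!-- How often the trash is checked. The default is 0, which means the same interval as fs.trash.interval. Each time the checkpointer runs, it creates a new checkpoint from the current trash contents and deletes checkpoints created more than fs.trash.interval minutes ago -->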
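// Deleting from code: move the file to the trash through the Trash API
Trash trash = new Trash(conf);
trash.moveToTrash(new Path("/test.txt"));
// Deleting from the command line
hadoop fs -rm /test.txt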
<description>Determines where on the local filesystem a DFS data node
should store its blocks. If this is a comma-delimited
list of directories, then data will be stored in all named
directories, typically on different devices. The directories should be tagged
with corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for HDFS
storage policies. The default storage type will be DISK if the directory does
not have a storage type tagged explicitly. Directories that do not exist will
be created if local filesystem permission allows.
<description>Determines where on the local filesystem the DFS name node
should store the name table (fsimage). If this is a comma-delimited list
of directories then the name table is replicated in all of the
directories, for redundancy. </description>
// Generate a disk balancing plan
hdfs diskbalancer -plan node-02
// Execute the disk balancing plan
hdfs diskbalancer -execute node-02.plan.json
// Check the progress of the disk balancing run
hdfs diskbalancer -query node-02
// Cancel the disk balancing plan
hdfs diskbalancer -cancel node-02.plan.json
<description>Names a file that contains a list of hosts that are
permitted to connect to the namenode. The full pathname of the file
must be specified. If the value is empty, all hosts are
<description>Names a file that contains a list of hosts that are
not permitted to connect to the namenode. The full pathname of the
file must be specified. If the value is empty, no hosts are
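3.7 On the new node, start the DataNode and the NodeManager individually with hdfs --daemon start datanode and yarn --daemon start nodemanager. ZooKeeper and HBase can also be started on the node if needed.
3.8 On the NameNode, run hdfs dfsadmin -refreshNodes to refresh the list of nodes.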
<!-- Enable uber mode; disabled by default -->
<description>Whether to enable the small-jobs "ubertask" optimization,
which runs "sufficiently small" jobs sequentially within a single JVM.
"Small" is defined by the following maxmaps, maxreduces, and maxbytes
settings. Note that configurations for application masters also affect
the "Small" definition - yarn.app.mapreduce.am.resource.mb must be
larger than both mapreduce.map.memory.mb and mapreduce.reduce.memory.mb,
and yarn.app.mapreduce.am.resource.cpu-vcores must be larger than
both mapreduce.map.cpu.vcores and mapreduce.reduce.cpu.vcores to enable
ubertask. Users may override this value.
<!-- Maximum number of map tasks allowed in uber mode; the default is 9, and the value can only be lowered (less than or equal to 9) -->
<description>Threshold for number of maps, beyond which job is considered
too big for the ubertasking optimization. Users may override this value,
but only downward.
<!-- Maximum number of reduce tasks allowed in uber mode; the default is 1, and the value can only be lowered (less than or equal to 1) -->
<description>Threshold for number of reduces, beyond which job is considered
too big for the ubertasking optimization. CURRENTLY THE CODE CANNOT SUPPORT
MORE THAN ONE REDUCE and will ignore larger values. (Zero is a valid max,
however.) Users may override this value, but only downward.
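<!-- Maximum amount of input data uber mode can process; the default is the block size, and the value can only be lowered (less than or equal to the block size) -->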
<description>Threshold for number of input bytes, beyond which job is
considered too big for the ubertasking optimization. If no value is
specified, dfs.block.size is used as a default. Be sure to specify a
default value in mapred-site.xml if the underlying filesystem is not HDFS.
Users may override this value, but only downward.
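Taken together, the three thresholds quoted above decide whether a job is "small" enough for uber mode. The Python sketch below restates that check; `is_uber_eligible` is a hypothetical helper, the 128 MB default for `max_bytes` is an assumed block size, and the additional application-master memory and vcore conditions mentioned in the first description are deliberately not modelled.

```python
def is_uber_eligible(enabled, num_maps, num_reduces, input_bytes,
                     max_maps=9, max_reduces=1,
                     max_bytes=128 * 1024 * 1024):
    """Return True if a job stays within all uber-mode thresholds."""
    return (
        enabled                          # uber mode must be switched on
        and num_maps <= max_maps         # not too many map tasks
        and num_reduces <= max_reduces   # at most one reduce task is supported
        and input_bytes <= max_bytes     # input no larger than the byte limit
    )


if __name__ == "__main__":
    one_mb = 1024 * 1024
    print(is_uber_eligible(True, 3, 1, 10 * one_mb))   # True
    print(is_uber_eligible(True, 10, 1, 10 * one_mb))  # False: too many maps
    print(is_uber_eligible(False, 3, 1, 10 * one_mb))  # False: uber mode is off
```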
Copy data from the cluster behind NameNode address 1 to the cluster behind NameNode address 2: hadoop distcp hdfs://namenode-address-1:8020/hbase hdfs://namenode-address-2:8020/hbase