Hadoop YARN error: ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: RECEIVED SIGNAL

Running a simple example program fails:

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'

The symptom is that the job hangs, with progress stuck at 20%. The NodeManager log shows:

2023-08-24 15:02:43,732 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: RECEIVED SIGNAL 15: SIGTERM

2023-08-24 15:02:43,749 INFO org.mortbay.log: Stopped [email protected]:8042
2023-08-24 15:02:43,755 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Exception when trying to cleanup container container_1692860510744_0002_01_000003: ExitCodeException exitCode=143: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.killContainer(DefaultContainerExecutor.java:450)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.signalContainer(DefaultContainerExecutor.java:406)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:419)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
    at java.lang.Thread.run(Thread.java:750)
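
The exitCode=143 in the warning lines up with the SIGTERM in the first log line: by shell convention, a process terminated by signal N reports exit code 128 + N, and 143 - 128 = 15. A minimal Python sketch of that arithmetic follows; the helper name signal_from_exit_code is made up for illustration and is not part of Hadoop.

```python
import signal


def signal_from_exit_code(exit_code: int) -> str:
    """Map a shell-style exit code (128 + N) back to the name of signal N."""
    if exit_code <= 128:
        raise ValueError("exit codes of 128 or below do not encode a signal")
    # signal.Signals is the standard-library enum of signal numbers.
    return signal.Signals(exit_code - 128).name


# The exit code reported in the container cleanup warning above.
print(signal_from_exit_code(143))
```

So the container was not crashing on its own; it was being killed with the same signal that stopped the NodeManager.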

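I first tried the workaround below. With the memory values set to 25000 (25 followed by three zeros), the job got to 73% before dying; with 250000 (25 followed by four zeros), it got to 100% before dying.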

#RM (yarn-site.xml) memory settings. These two parameters set the minimum and maximum memory that a single container can request.

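	<property>
		<name>yarn.scheduler.minimum-allocation-mb</name>
		<value>1024</value>
	</property>
	<property>
		<name>yarn.scheduler.maximum-allocation-mb</name>
		<value>250000</value>
	</property>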

 
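#NM (yarn-site.xml) The first parameter is the maximum memory available on a single node; neither of the two RM values should exceed it.
The second is the virtual-to-physical memory ratio, that is, how much virtual memory a task may use relative to its physical memory allocation. The default is 2.1.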

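	<property>
		<name>yarn.nodemanager.resource.memory-mb</name>
		<value>250000</value>
	</property>
	<property>
		<name>yarn.nodemanager.vmem-pmem-ratio</name>
		<value>2.1</value>
	</property>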
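
The relationship between these four values can be checked with a few lines of Python. This is only a sketch of the rules stated above (minimum <= maximum <= node memory, and virtual memory = physical memory x ratio). The function names check_yarn_memory and vmem_limit_mb are hypothetical helpers, not part of any Hadoop API.

```python
def check_yarn_memory(min_alloc_mb, max_alloc_mb, nm_memory_mb):
    """Return a list of problems with the memory settings (empty if none)."""
    problems = []
    if min_alloc_mb > max_alloc_mb:
        problems.append("minimum-allocation-mb exceeds maximum-allocation-mb")
    if max_alloc_mb > nm_memory_mb:
        problems.append("maximum-allocation-mb exceeds nodemanager memory-mb")
    return problems


def vmem_limit_mb(container_mb, vmem_pmem_ratio=2.1):
    """Virtual memory allowed for a container of the given physical size."""
    return container_mb * vmem_pmem_ratio


# Values from the configuration above.
print(check_yarn_memory(1024, 250000, 250000))
print(round(vmem_limit_mb(1024), 1))
```

With the values used here the check reports no problems, which fits the result that changing the memory settings only delayed the failure rather than fixing it.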

 

I then found a description of the problem: [YARN-4459] container-executor should only kill process groups - ASF JIRA

Since I was running Hadoop 2.7.2, I decided to try upgrading to 2.7.3.


After redeploying with 2.7.3, the problem was indeed gone.
