Why Hadoop throws DiskChecker$DiskErrorException while running a job

Version:

$ hadoop version 
Hadoop 0.20.2-cdh3u4
Subversion git://ubuntu-slave01/var/lib/jenkins/workspace/CDH3u4-Full-RC/build/cdh3/hadoop20/0.20.2-cdh3u4/source -r 214dd731e3bdb687cb55988d3f47dd9e248c5690
Compiled by jenkins on Mon May  7 13:01:39 PDT 2012
From source with checksum a60c9795e41a3248b212344fb131c12c


Problem description:

When Hadoop runs a MapReduce job, it throws org.apache.hadoop.util.DiskChecker$DiskErrorException. The full stack trace:

org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/map_4.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:376)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
        at org.apache.hadoop.mapred.MapOutputFile.getInputFileForWrite(MapOutputFile.java:176)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2374)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:582)

Cause:

org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(...) throws this exception when none of the directories configured in mapred.local.dir has enough free disk space for the file being written; here, the reduce-side merge could not find room for the intermediate file output/map_4.out.
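
To confirm, check the free space on every directory listed in mapred.local.dir on the affected TaskTracker. A minimal sketch, assuming a CDH-style config path and the same local dir used in the commands below (both paths are assumptions; adjust for your install):

# Show the configured local dirs (config path is an assumption)
grep -A1 mapred.local.dir /etc/hadoop/conf/mapred-site.xml

# Check free space on the filesystem holding each local dir
df -h /yourpath/mapred/local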

Workaround:

1. Check the userlogs directory to find which job's logs are taking the most space:

du -m --max-depth=1 /yourpath/mapred/local/userlogs | sort -n

2. If that job has already finished, its log directory can simply be removed with rm -rf (a scripted sketch follows below).
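
If stale logs accumulate regularly, the cleanup can be scripted. A minimal sketch, assuming the userlogs path used above and that directories untouched for more than a day belong to finished jobs (verify this on your cluster before deleting anything):

# Path and the one-day cutoff are assumptions; adjust for your cluster
find /yourpath/mapred/local/userlogs -mindepth 1 -maxdepth 1 -type d -mtime +1 -exec rm -rf {} +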

Long-term fix:

Shorten how long Hadoop retains user logs by setting mapred.userlog.retain.hours (default 24) in mapred-site.xml:

  <property>
    <name>mapred.userlog.retain.hours</name>
    <value>10</value>
  </property>
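
In 0.20-era Hadoop this property is read by the TaskTracker, so each TaskTracker must be restarted before the new retention period takes effect. A hedged example, assuming the CDH3 packaged init scripts (the service name may differ on your install):

$ sudo service hadoop-0.20-tasktracker restart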
