[HDFS] Moving a File into Trash: the rename Operation

Continued from http://blog.csdn.net/tracymkgld/article/details/17552189

The previous post did not cover how Trash actually works, so let's look at that next:

    if(!skipTrash) {
      try {
        Trash trashTmp = new Trash(srcFs, getConf());
        if (trashTmp.moveToTrash(src)) { // create a Trash, then hand it the path of the file to delete
          System.out.println("Moved to trash: " + src);
          return;
        }
      } catch (IOException e) {
        Exception cause = (Exception) e.getCause();
        String msg = "";
        if(cause != null) {
          msg = cause.getLocalizedMessage();
        }
        System.err.println("Problem with Trash." + msg +". Consider using -skipTrash option");        
        throw e;
      }
    }
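Deleting a file normally goes through Trash. Judging from the FsShell code, it is as simple as creating a Trash object and passing it the path of the file to delete.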

So what exactly is a Trash? Let's step inside:

  private final FileSystem fs;
  private final Path trash; // private static final Path TRASH = new Path(".Trash/");
  private final Path current;
  private final long interval;

  public Trash(FileSystem fs, Configuration conf) throws IOException {
    super(conf);
    this.fs = fs;
    this.trash = new Path(fs.getHomeDirectory(), TRASH);
    this.current = new Path(trash, CURRENT); // private static final Path CURRENT = new Path("Current");
    // The default trash-cleanup interval is 1 hour. It can be tuned freely, and Trash can also
    // be emptied by hand; our production cluster currently uses 2 days.
    this.interval = conf.getLong("fs.trash.interval", 60) * MSECS_PER_MINUTE;
  }
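As you can see, a Trash object is initialized with a handle to the HDFS file system. It holds a Path called trash, which points to the .Trash directory under the user's home directory.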
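
The interval arithmetic in the constructor is easy to check by hand. Below is a minimal sketch that uses java.util.Properties as a stand-in for Hadoop's Configuration; the class name TrashIntervalDemo and the helper intervalMillis are made up for illustration and are not part of Hadoop.

```java
import java.util.Properties;

public class TrashIntervalDemo {
    static final long MSECS_PER_MINUTE = 60 * 1000L;

    // Hypothetical stand-in for conf.getLong("fs.trash.interval", 60) * MSECS_PER_MINUTE.
    static long intervalMillis(Properties conf) {
        long minutes = Long.parseLong(conf.getProperty("fs.trash.interval", "60"));
        return minutes * MSECS_PER_MINUTE;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(intervalMillis(conf)); // key not set: 60 minutes -> 3600000

        conf.setProperty("fs.trash.interval", "2880"); // 2 days expressed in minutes
        System.out.println(intervalMillis(conf)); // 172800000
    }
}
```

The value is configured in minutes and only converted to milliseconds inside the constructor, so a 2-day retention is written as 2880.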

What is the home directory? One look at the code tells you:

public Path getHomeDirectory() {
    return new Path("/user/"+System.getProperty("user.name"))
      .makeQualified(this);
  }

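The home directory is /user/<username> on HDFS, where the username is the one your client runs as. Kerberos unified authentication is out of scope here. All you need to know is where the home directory is; it is analogous to /home/username/ on Linux.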
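
To make this concrete, here is a minimal plain-JDK sketch of the string part of getHomeDirectory(). It skips makeQualified() (which adds the file system's scheme and authority), and the class and method names are invented for this example.

```java
public class HomeDirDemo {
    // Hypothetical helper: the unqualified home directory for a given user name.
    static String homeDirectory(String userName) {
        return "/user/" + userName;
    }

    public static void main(String[] args) {
        // The real method reads the client-side JVM property "user.name".
        System.out.println(homeDirectory(System.getProperty("user.name")));
        System.out.println(homeDirectory("alice")); // /user/alice
    }
}
```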

Now look at a fragment of the moveToTrash method:

    Path trashPath = makeTrashRelativePath(current, path);
    Path baseTrashPath = makeTrashRelativePath(current, path.getParent());
What does that mean?

private Path makeTrashRelativePath(Path basePath, Path rmFilePath) {
    return new Path(basePath + rmFilePath.toUri().getPath());
  }
It works like this. Suppose I create a file: hadoop dfs -touchz /hello/world/wo/ca/123
Its original path is /hello/world/wo/ca/123. Now rm it, and you will find the file at /user/username/.Trash/Current/hello/world/wo/ca/123

In other words, /user/username/.Trash/Current/ becomes the new root for trashed files.

A rename is then performed: the original file is renamed by prefixing its path with /user/username/.Trash/Current. So deleting into Trash is nothing more than renaming the file.
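
The path mapping can be reproduced with plain string handling. The sketch below mirrors makeTrashRelativePath using Strings instead of Hadoop Path objects; TrashPathDemo, currentDir and parentOf are hypothetical names, and the user alice is just an example.

```java
public class TrashPathDemo {
    static final String TRASH = ".Trash";
    static final String CURRENT = "Current";

    // <home>/.Trash/Current, the root under which trashed files are placed.
    static String currentDir(String homeDir) {
        return homeDir + "/" + TRASH + "/" + CURRENT;
    }

    // Same idea as the original: base path followed by the absolute path of the removed file.
    static String makeTrashRelativePath(String basePath, String rmFilePath) {
        return basePath + rmFilePath;
    }

    // Parent of an absolute path, e.g. /a/b/c -> /a/b.
    static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }

    public static void main(String[] args) {
        String current = currentDir("/user/alice");
        String file = "/hello/world/wo/ca/123";

        // Destination of the rename.
        System.out.println(makeTrashRelativePath(current, file));
        // -> /user/alice/.Trash/Current/hello/world/wo/ca/123

        // Parent directory of that destination (baseTrashPath in the fragment above).
        System.out.println(makeTrashRelativePath(current, parentOf(file)));
        // -> /user/alice/.Trash/Current/hello/world/wo/ca
    }
}
```

baseTrashPath matters because, as the namenode code further down shows, a rename fails when the destination's parent directory does not exist.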

I won't trace rename step by step; let's go straight to the namenode:

  private synchronized boolean renameToInternal(String src, String dst
      ) throws IOException {
    NameNode.stateChangeLog.debug("DIR* NameSystem.renameTo: " + src + " to " + dst);
    if (isInSafeMode())
      throw new SafeModeException("Cannot rename " + src, safeMode);
    if (!DFSUtil.isValidName(dst)) {
      throw new IOException("Invalid name: " + dst);
    } // note the safe-mode check above: nothing can be renamed while in safe mode

    if (isPermissionEnabled) {
      //We should not be doing this.  This is move() not renameTo().
      //but for now,
      String actualdst = dir.isDir(dst)? // if the destination is an existing directory, the source keeps its own name inside it: renaming /a/b/c to /1/2/ yields /1/2/c
          dst + Path.SEPARATOR + new Path(src).getName(): dst;
      checkParentAccess(src, FsAction.WRITE);
      checkAncestorAccess(actualdst, FsAction.WRITE); // write-permission checks on the source's parent and on the destination's ancestor inode; the renameTo below needs both
    }

    HdfsFileStatus dinfo = dir.getFileInfo(dst);
    if (dir.renameTo(src, dst)) {
      changeLease(src, dst, dinfo);     // update lease with new filename
      return true;
    }
    return false;
  }
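
The actualdst rule is worth trying on a couple of inputs. This is a rough sketch only: a Set of known directories stands in for dir.isDir(), and RenameDstDemo, actualDst and lastComponent are invented names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class RenameDstDemo {
    // Last path component, like new Path(src).getName().
    static String lastComponent(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // If dst is an existing directory, the source is moved into it under its own name.
    static String actualDst(String src, String dst, Set<String> existingDirs) {
        return existingDirs.contains(dst) ? dst + "/" + lastComponent(src) : dst;
    }

    public static void main(String[] args) {
        Set<String> dirs = new HashSet<>(Arrays.asList("/1", "/1/2"));
        System.out.println(actualDst("/a/b/c", "/1/2", dirs)); // /1/2/c
        System.out.println(actualDst("/a/b/c", "/1/x", dirs)); // /1/x
    }
}
```

Only the last component of the source is appended, so intermediate directories of the source path are not recreated under the destination.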
Remember dir? It is the FSDirectory object, and it performs the renameTo operation. I'll skip the audit logging and the like and go to the key code:
  boolean unprotectedRenameTo(String src, String dst, long timestamp) 
  throws QuotaExceededException {
    synchronized (rootDir) {
      INode[] srcInodes = rootDir.getExistingPathINodes(src); // fetch the inode at every level of the path of the file being renamed
      // see http://blog.csdn.net/tracymkgld/article/details/17553173
      // check the validation of the source
      if (srcInodes[srcInodes.length-1] == null) { // the last element is the target inode, i.e. the inode of the file being renamed
        NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: "
            + "failed to rename " + src + " to " + dst
            + " because source does not exist");
        return false;
      } 
      if (srcInodes.length == 1) {
        NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: "
            +"failed to rename "+src+" to "+dst+ " because source is the root");
        return false; // the root cannot be renamed
      }
      if (isDir(dst)) {
        dst += Path.SEPARATOR + new Path(src).getName();
      }
      // as in renameToInternal: if dst is an existing directory, src is moved into it under its own name
      // check the validity of the destination
      if (dst.equals(src)) {
        return true;
      } // src and dst are the same name, so there is nothing to do
      // dst cannot be directory or a file under src
      if (dst.startsWith(src) && // a path cannot be moved underneath itself
          dst.charAt(src.length()) == Path.SEPARATOR_CHAR) {
        NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: "
            + "failed to rename " + src + " to " + dst
            + " because destination starts with src");
        return false;
      }
      
      byte[][] dstComponents = INode.getPathComponents(dst);
      INode[] dstInodes = new INode[dstComponents.length];
      rootDir.getExistingPathINodes(dstComponents, dstInodes);
      if (dstInodes[dstInodes.length-1] != null) { // the src inodes are done, now the dst inodes: dst must not already exist
        NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: "
                                     +"failed to rename "+src+" to "+dst+ 
                                     " because destination exists");
        return false;
      }
      if (dstInodes[dstInodes.length-2] == null) {
        NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: "
            +"failed to rename "+src+" to "+dst+ 
            " because destination's parent does not exist");
        return false;
      }
      
      // Ensure dst has quota to accommodate rename
      verifyQuotaForRename(srcInodes,dstInodes); // quota management, not covered here
      
      INode dstChild = null;
      INode srcChild = null;
      String srcChildName = null;
      try {
        // remove src
        srcChild = removeChild(srcInodes, srcInodes.length-1); // first detach the file being renamed from its parent directory
        if (srcChild == null) {
          NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: "
              + "failed to rename " + src + " to " + dst
              + " because the source can not be removed");
          return false;
        }
        srcChildName = srcChild.getLocalName();
        srcChild.setLocalName(dstComponents[dstInodes.length-1]);
        
        // add src to the destination
        dstChild = addChildNoQuotaCheck(dstInodes, dstInodes.length - 1,
            srcChild, -1, false); // attach it to the dst parent inode by adding it to that inode's children list; quota accounting, mtime updates, etc. also happen here
        if (dstChild != null) {
          srcChild = null;
          if (NameNode.stateChangeLog.isDebugEnabled()) {
            NameNode.stateChangeLog.debug("DIR* FSDirectory.unprotectedRenameTo: " + src
                    + " is renamed to " + dst);
          }
          // update modification time of dst and the parent of src
          srcInodes[srcInodes.length-2].setModificationTime(timestamp); // update the mtime of both the source and destination parent directories
          dstInodes[dstInodes.length-2].setModificationTime(timestamp);
          return true;
        }
      } finally {
        if (dstChild == null && srcChild != null) {
          // put it back
          srcChild.setLocalName(srcChildName);
          addChildNoQuotaCheck(srcInodes, srcInodes.length - 1, srcChild, -1,
              false);
        }
      }
      NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: "
          +"failed to rename "+src+" to "+dst);
      return false;
    }
  }
Summary:

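When a file is deleted and goes to trash, the deletion is just a rename of the file to a path under /user/username/.Trash/Current/. The namenode then performs the following namespace operations on the source file:

1. Detach it from its parent by removing it from the children list of the parent INodeDirectory.

2. Add it to the children list of the destination INodeDirectory.

3. Update the mtime of both the source and destination directories, and at the same time decrease the quota usage of the source directory and increase that of the destination.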
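
The three steps above can be sketched with a toy in-memory tree. This is not the namenode's code: NamespaceRenameDemo and its Node class are invented for illustration, and quota handling, locking, rollback and the "dst is an existing directory" case are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

public class NamespaceRenameDemo {
    static class Node {
        String name;
        long mtime;
        final boolean isDir;
        final List<Node> children = new ArrayList<>();
        Node(String name, boolean isDir) { this.name = name; this.isDir = isDir; }
        Node child(String n) {
            for (Node c : children) if (c.name.equals(n)) return c;
            return null;
        }
    }

    final Node root = new Node("", true);

    // One entry per path level starting at the root; missing levels stay null.
    Node[] resolve(String path) {
        String[] parts = path.equals("/") ? new String[0] : path.substring(1).split("/");
        Node[] nodes = new Node[parts.length + 1];
        nodes[0] = root;
        for (int i = 0; i < parts.length && nodes[i] != null; i++) {
            nodes[i + 1] = nodes[i].child(parts[i]);
        }
        return nodes;
    }

    boolean exists(String path) {
        Node[] n = resolve(path);
        return n[n.length - 1] != null;
    }

    // Creates the path, making missing parents directories; the last level is a file unless isDir.
    Node create(String path, boolean isDir) {
        String[] parts = path.substring(1).split("/");
        Node cur = root;
        for (int i = 0; i < parts.length; i++) {
            Node next = cur.child(parts[i]);
            if (next == null) {
                next = new Node(parts[i], isDir || i < parts.length - 1);
                cur.children.add(next);
            }
            cur = next;
        }
        return cur;
    }

    boolean rename(String src, String dst, long timestamp) {
        Node[] s = resolve(src);
        if (s[s.length - 1] == null || s.length == 1) return false; // missing source, or root
        if (dst.equals(src)) return true;                           // nothing to do
        if (dst.startsWith(src + "/")) return false;                // dst lies under src
        Node[] d = resolve(dst);
        if (d.length < 2 || d[d.length - 1] != null) return false;  // dst already exists
        Node dstParent = d[d.length - 2];
        if (dstParent == null || !dstParent.isDir) return false;    // dst parent missing
        Node srcParent = s[s.length - 2];
        Node child = s[s.length - 1];
        srcParent.children.remove(child);                           // step 1: detach
        child.name = dst.substring(dst.lastIndexOf('/') + 1);
        dstParent.children.add(child);                              // step 2: attach
        srcParent.mtime = timestamp;                                // step 3: mtimes
        dstParent.mtime = timestamp;
        return true;
    }

    public static void main(String[] args) {
        NamespaceRenameDemo ns = new NamespaceRenameDemo();
        ns.create("/hello/world/wo/ca/123", false);
        ns.create("/user/alice/.Trash/Current/hello/world/wo/ca", true);
        String dst = "/user/alice/.Trash/Current/hello/world/wo/ca/123";
        System.out.println(ns.rename("/hello/world/wo/ca/123", dst, 42L)); // true
        System.out.println(ns.exists("/hello/world/wo/ca/123"));           // false
        System.out.println(ns.exists(dst));                                // true
    }
}
```

No file data is touched at any point: the rename only moves one node between two children lists, which is why moving a file into trash is cheap regardless of file size.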

As for when trash performs the 6-step delete described at http://blog.csdn.net/tracymkgld/article/details/17552189, or how trash is emptied automatically, see http://blog.csdn.net/tracymkgld/article/details/17557655




