Hadoop NameNode Startup: Loading the FSImage (Part 1)

  One important step in NameNode startup is loading the fsimage file. This post walks through that flow.

NameNode.main -> NameNode(conf) -> NameNode.initialize(conf) -> FSNamesystem(this, conf) -> FSNamesystem.initialize(nn, conf) -> FSNamesystem.dir.loadFSImage(getNamespaceDirs(conf), getNamespaceEditsDirs(conf), startOpt)

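The function to focus on is the last one, loadFSImage. After a series of checks it loads the FSImage, merging the edits log into the FSImage along the way. It takes three parameters: the first two are the return values of getNamespaceDirs and getNamespaceEditsDirs, and startOpt is an enum whose value is REGULAR on a normal startup. First, the getNamespaceDirs function.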

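public static Collection<File> getNamespaceDirs(Configuration conf)
{
    // Read the FSImage directories from the configuration. There may be several,
    // so they are held in a collection. The property is set in hdfs-site.xml.
    Collection<String> dirNames = conf.getStringCollection("dfs.name.dir");
    // If no directory is configured, fall back to the default /tmp/hadoop/dfs/name
    if (dirNames.isEmpty())
      dirNames.add("/tmp/hadoop/dfs/name");
    Collection<File> dirs = new ArrayList<File>(dirNames.size());
    // Wrap each directory name in a File, add it to the ArrayList, and return the list
    for (String name : dirNames) {
      dirs.add(new File(name));
    }
    return dirs;
  }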

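// Get the edits directories. These are usually the same as the FSImage directories, but if the
// namespace is updated frequently and the NameNode is I/O bound, the FSImage and edits can be
// stored separately to spread the I/O load.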

public static Collection<File> getNamespaceEditsDirs(Configuration conf) {
    Collection<String> editsDirNames =
            conf.getStringCollection("dfs.name.edits.dir");
    if (editsDirNames.isEmpty())
      editsDirNames.add("/tmp/hadoop/dfs/name");
    Collection<File> dirs = new ArrayList<File>(editsDirNames.size());
    for (String name : editsDirNames) {
      dirs.add(new File(name));
    }
    return dirs;
  }
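
Both helpers follow the same pattern: read a comma-separated list from the configuration and fall back to a default when nothing is set. A minimal, self-contained sketch of that pattern follows. It uses java.util.Properties as a stand-in for Hadoop's Configuration, and the names NameDirResolver and resolveDirs are hypothetical, chosen for illustration only.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of Hadoop.
class NameDirResolver {
    static final String DEFAULT_DIR = "/tmp/hadoop/dfs/name";

    // Splits a comma-separated property into File objects and
    // falls back to DEFAULT_DIR when the property is unset or blank.
    static List<File> resolveDirs(Properties conf, String key) {
        List<File> dirs = new ArrayList<File>();
        String raw = conf.getProperty(key, "");
        for (String name : raw.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                dirs.add(new File(trimmed));
            }
        }
        if (dirs.isEmpty()) {
            dirs.add(new File(DEFAULT_DIR));
        }
        return dirs;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Two image directories are configured; the edits property is left unset.
        conf.setProperty("dfs.name.dir", "/data1/name, /data2/name");
        System.out.println(resolveDirs(conf, "dfs.name.dir"));
        System.out.println(resolveDirs(conf, "dfs.name.edits.dir"));
    }
}
```

With the edits property unset, the second call falls back to the single default directory, which mirrors what the two functions above do.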

// Once both functions have returned, the loading phase begins. The important calls in it are:

// 1. fsImage.recoverTransitionRead

// 2. fsImage.saveNamespace

// 3. fsImage.setCheckpointDirectories

 
void loadFSImage(Collection<File> dataDirs,
                   Collection<File> editsDirs,
                   StartupOption startOpt) throws IOException {
    // Format first if formatting was requested
    if (startOpt == StartupOption.FORMAT) {
      fsImage.setStorageDirectories(dataDirs, editsDirs);
      fsImage.format();
      startOpt = StartupOption.REGULAR;
    }
    try {
      if (fsImage.recoverTransitionRead(dataDirs, editsDirs, startOpt)) {
        fsImage.saveNamespace(true);
      }
      FSEditLog editLog = fsImage.getEditLog();
      assert editLog != null : "editLog must be initialized";
      if (!editLog.isOpen())
        editLog.open();
      fsImage.setCheckpointDirectories(null, null);
    } catch (IOException e) {
      fsImage.close();
      throw e;
    }
    synchronized (this) {
      this.ready = true;
      this.nameCache.initialized();
      this.notifyAll();
    }
  }
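
The synchronized block at the end sets the ready flag and calls notifyAll, which suggests that other threads block on this object until the namespace has been loaded. A minimal sketch of that wait/notify pattern follows. ReadyGate, markReady, and awaitReady are hypothetical names used for illustration; they are not Hadoop APIs.

```java
// Hypothetical class for illustration; not Hadoop's FSDirectory.
class ReadyGate {
    private boolean ready = false;

    // Called once by the loading thread when initialization has finished.
    synchronized void markReady() {
        ready = true;
        notifyAll(); // wake every thread blocked in awaitReady()
    }

    // Blocks the caller until markReady() has been called.
    synchronized void awaitReady() throws InterruptedException {
        while (!ready) { // the loop guards against spurious wakeups
            wait();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final ReadyGate gate = new ReadyGate();
        Thread client = new Thread(new Runnable() {
            public void run() {
                try {
                    gate.awaitReady();
                    System.out.println("namespace is ready, serving requests");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        client.start();
        gate.markReady(); // in the flow above this happens after the image is loaded
        client.join();
    }
}
```

Checking the flag inside a while loop, rather than a single if, is what makes the wait safe even if the waiter starts after markReady has already run.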

fsImage.recoverTransitionRead checks whether the directories under dfs.name.dir are in a sane state, using the analyzeStorage function. On a normal startup the flow is as follows:

 

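1. First it checks what exists on disk, for example whether the VERSION file is present and whether the directory contains any temporary directories. The main code is:

The analyzeStorage function: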

      // Check whether the VERSION file exists
      File versionFile = getVersionFile();
      boolean hasCurrent = versionFile.exists();
 
      // Check for the various temporary directories
      boolean hasPrevious = getPreviousDir().exists();
      boolean hasPreviousTmp = getPreviousTmp().exists();
      boolean hasRemovedTmp = getRemovedTmp().exists();
      boolean hasFinalizedTmp = getFinalizedTmp().exists();
      boolean hasCheckpointTmp = getLastCheckpointTmp().exists();
 
      // In the normal case NORMAL is returned; the condition is that none of
      // these temporary directories exist
      if (!(hasPreviousTmp || hasRemovedTmp
          || hasFinalizedTmp || hasCheckpointTmp)) {
        // no temp dirs - no recovery
        if (hasCurrent)
          return StorageState.NORMAL;
        if (hasPrevious)
          throw new InconsistentFSStateException(root,
                             "version file in current directory is missing.");
        return StorageState.NOT_FORMATTED;
      }
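
The decision in this branch depends only on a handful of boolean flags, so it can be modeled in isolation. The sketch below is a simplified model, not the Hadoop implementation: StorageCheck and its State enum are hypothetical, and every case where a temporary directory exists is collapsed into a single NEEDS_RECOVERY value, whereas the real code distinguishes several recovery states.

```java
// Simplified, hypothetical model of the branch shown above.
class StorageCheck {
    enum State { NORMAL, NOT_FORMATTED, NEEDS_RECOVERY }

    static State analyze(boolean hasCurrent, boolean hasPrevious, boolean hasAnyTmp) {
        if (hasAnyTmp) {
            // Assumption: all recovery cases are collapsed into one value here.
            return State.NEEDS_RECOVERY;
        }
        if (hasCurrent) {
            return State.NORMAL; // VERSION file present, nothing to recover
        }
        if (hasPrevious) {
            // A previous directory without a current VERSION file is inconsistent.
            throw new IllegalStateException("version file in current directory is missing.");
        }
        return State.NOT_FORMATTED; // empty directory that still needs formatting
    }

    public static void main(String[] args) {
        System.out.println(analyze(true, false, false));
        System.out.println(analyze(false, false, false));
        System.out.println(analyze(true, true, true));
    }
}
```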

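2. VERSION file validation: sd.read() -> read(getVersionFile()) -> getFields()

// Read the VERSION file into a Properties object and check it for consistency. If the check
// passes, the properties are used to initialize the member variables of the parent class StorageInfo.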

  protected void getFields(Properties props,
                           StorageDirectory sd
                           ) throws IOException {
    String sv, st, sid, sct;
    sv = props.getProperty("layoutVersion");
    st = props.getProperty("storageType");
    sid = props.getProperty("namespaceID");
    sct = props.getProperty("cTime");
    // Property validation starts here
    if (sv == null || st == null || sid == null || sct == null)
      throw new InconsistentFSStateException(sd.root,
                                             "file " + STORAGE_FILE_VERSION +
                                             " is invalid.");
    int rv = Integer.parseInt(sv);
    NodeType rt = NodeType.valueOf(st);
    int rid = Integer.parseInt(sid);
    long rct = Long.parseLong(sct);
    if (!storageType.equals(rt) ||
        !((namespaceID == 0) || (rid == 0) || namespaceID == rid))
      throw new InconsistentFSStateException(sd.root,
                                             "is incompatible with others.");
    if (rv < FSConstants.LAYOUT_VERSION) // future version
      throw new IncorrectVersionException(rv, "storage directory "
                                          + sd.root.getCanonicalPath());
    // Initialize the StorageInfo member variables
    layoutVersion = rv;
    storageType = rt;
    namespaceID = rid;
    cTime = rct;
}
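
To make the checks in getFields concrete, here is a minimal sketch that applies the same rules to VERSION-style text held in a string. VersionFileCheck and validate are hypothetical names, the value of CURRENT_LAYOUT_VERSION is an assumption picked for the example, and plain IOException stands in for Hadoop's own exception types. The storageType comparison is reduced to a presence check.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical stand-in for the validation done in getFields.
class VersionFileCheck {
    static final int CURRENT_LAYOUT_VERSION = -32; // assumed value for illustration

    // Parses VERSION-style text and returns the namespaceID once all checks pass.
    static int validate(String text, int knownNamespaceId) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        String sv = props.getProperty("layoutVersion");
        String st = props.getProperty("storageType");
        String sid = props.getProperty("namespaceID");
        String sct = props.getProperty("cTime");
        if (sv == null || st == null || sid == null || sct == null) {
            throw new IOException("VERSION file is invalid.");
        }
        int rv = Integer.parseInt(sv);
        int rid = Integer.parseInt(sid);
        Long.parseLong(sct); // cTime must at least be a valid number
        // 0 means "not yet known" on either side; otherwise the IDs must agree.
        if (!(knownNamespaceId == 0 || rid == 0 || knownNamespaceId == rid)) {
            throw new IOException("namespaceID is incompatible with others.");
        }
        // Layout versions are negative and decrease over time,
        // so a smaller value was written by newer software.
        if (rv < CURRENT_LAYOUT_VERSION) {
            throw new IOException("unsupported future layout version " + rv);
        }
        return rid;
    }

    public static void main(String[] args) throws IOException {
        String text = "namespaceID=42\ncTime=0\nstorageType=NAME_NODE\nlayoutVersion=-32\n";
        System.out.println("namespaceID = " + validate(text, 0));
    }
}
```

The comparison direction is the easy part to get wrong: because layout versions count downward, "less than the current version" means the directory came from a newer release.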

3. Loop over the dfs.name.dir directories again to see whether any of them need formatting

   for (Iterator<StorageDirectory> it =
                     dirIterator(); it.hasNext();) {
      StorageDirectory sd = it.next();
      StorageState curState = dataDirStates.get(sd);
      switch(curState) {
      case NON_EXISTENT:
        assert false : StorageState.NON_EXISTENT + " state cannot be here";
      case NOT_FORMATTED:
        LOG.info("Storage directory " + sd.getRoot() +
                 " is not formatted.");
        LOG.info("Formatting ...");
        sd.clearDirectory(); // create empty current dir
        break;
      default:
        break;
      }
 }

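4. Check the startup option: upgrade, import, rollback, or regular? Since this is a normal startup, the branch taken is the one that loads the FSImage:

loadFSImage()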

  boolean loadFSImage() throws IOException {
    // Now check all curFiles and see which is the newest
    long latestNameCheckpointTime = Long.MIN_VALUE;
    long latestEditsCheckpointTime = Long.MIN_VALUE;
    StorageDirectory latestNameSD = null;
    StorageDirectory latestEditsSD = null;
    boolean needToSave = false;
    isUpgradeFinalized = true;
    Collection<String> imageDirs = new ArrayList<String>();
    Collection<String> editsDirs = new ArrayList<String>();
 
    // Loop over the directories given by dfs.name.dir, add the valid ones to the collections,
    // and read fstime to determine the checkpoint time. With several directories the newest
    // checkpoint time wins, because latestNameCheckpointTime keeps the newest timestamp
    // seen in this loop.
    for (Iterator<StorageDirectory> it = dirIterator(); it.hasNext();) {
      StorageDirectory sd = it.next();
      if (!sd.getVersionFile().exists()) {
        needToSave |= true;
        continue; // some of them might have just been formatted
      }
      boolean imageExists = false, editsExists = false;
      if (sd.getStorageDirType().isOfType(NameNodeDirType.IMAGE)) {
        imageExists = getImageFile(sd, NameNodeFile.IMAGE).exists();
        imageDirs.add(sd.getRoot().getCanonicalPath());
      }
      if (sd.getStorageDirType().isOfType(NameNodeDirType.EDITS)) {
        editsExists = getImageFile(sd, NameNodeFile.EDITS).exists();
        editsDirs.add(sd.getRoot().getCanonicalPath());
      }
     
      checkpointTime = readCheckpointTime(sd);
      if ((checkpointTime != Long.MIN_VALUE) &&
          ((checkpointTime != latestNameCheckpointTime) ||
           (checkpointTime != latestEditsCheckpointTime))) {
        // Force saving of new image if checkpoint time
        // is not same in all of the storage directories.
        needToSave |= true;
      }
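      // Determine the valid (latest) checkpoint time
      if (sd.getStorageDirType().isOfType(NameNodeDirType.IMAGE) &&
         (latestNameCheckpointTime < checkpointTime) && imageExists) {
        latestNameCheckpointTime = checkpointTime;
        latestNameSD = sd;
      }
      if (sd.getStorageDirType().isOfType(NameNodeDirType.EDITS) &&
           (latestEditsCheckpointTime < checkpointTime) && editsExists) {
        latestEditsCheckpointTime = checkpointTime;
        latestEditsSD = sd;
      }
      if (checkpointTime <= 0L)
        needToSave |= true;
      // set finalized flag
      isUpgradeFinalized = isUpgradeFinalized && !sd.getPreviousDir().exists();
    }

    // We should have at least one image and one edits dirs
    if (latestNameSD == null)
      throw new IOException("Image file is not found in " + imageDirs);
    if (latestEditsSD == null)
      throw new IOException("Edits file is not found in " + editsDirs);

    // Make sure we are loading image and edits from same checkpoint
    if (latestNameCheckpointTime > latestEditsCheckpointTime
        && latestNameSD != latestEditsSD
        && latestNameSD.getStorageDirType() == NameNodeDirType.IMAGE
        && latestEditsSD.getStorageDirType() == NameNodeDirType.EDITS) {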
      // This is a rare failure when NN has image-only and edits-only
      // storage directories, and fails right after saving images,
      // in some of the storage directories, but before purging edits.
      // See -NOTE- in saveNamespace().
      LOG.error("This is a rare failure scenario!!!");
      LOG.error("Image checkpoint time " + latestNameCheckpointTime +
                " > edits checkpoint time " + latestEditsCheckpointTime);
      LOG.error("Name-node will treat the image as the latest state of " +
                "the namespace. Old edits will be discarded.");
    } else if (latestNameCheckpointTime != latestEditsCheckpointTime)
      throw new IOException("Inconsistent storage detected, " +
                      "image and edits checkpoint times do not match. " +
                      "image checkpoint time = " + latestNameCheckpointTime +
                      "edits checkpoint time = " + latestEditsCheckpointTime);
   
    // Recover from previous interrupted checkpoint if any
    needToSave |= recoverInterruptedCheckpoint(latestNameSD, latestEditsSD);
   
    long startTime = FSNamesystem.now();
    long imageSize = getImageFile(latestNameSD, NameNodeFile.IMAGE).length();
    
    //
    // Load in bits
    //
    latestNameSD.read(); // the VERSION file is read yet again here, which is rather tedious
    needToSave |= loadFSImage(getImageFile(latestNameSD, NameNodeFile.IMAGE)); // note: this is where the fsimage file actually starts to load
    LOG.info("Image file of size " + imageSize + " loaded in "
        + (FSNamesystem.now() - startTime)/1000 + " seconds.");
   
    // Load latest edits
    if (latestNameCheckpointTime > latestEditsCheckpointTime)
      // the image is already current, discard edits
      needToSave |= true;
    else // latestNameCheckpointTime == latestEditsCheckpointTime
      needToSave |= (loadFSEdits(latestEditsSD) > 0);
   
    return needToSave;
  }
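
The core of this function is choosing the newest image directory and the newest edits directory, then deciding whether the two belong to the same checkpoint. The sketch below models only that decision. CheckpointPicker, Dir, and plan are hypothetical names, and a directory is reduced to two flags plus a checkpoint time, so "image-only" is approximated as "has an image but no edits".

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the checkpoint selection logic.
class CheckpointPicker {
    static class Dir {
        final String name;
        final boolean hasImage;
        final boolean hasEdits;
        final long checkpointTime;

        Dir(String name, boolean hasImage, boolean hasEdits, long checkpointTime) {
            this.name = name;
            this.hasImage = hasImage;
            this.hasEdits = hasEdits;
            this.checkpointTime = checkpointTime;
        }
    }

    // Returns a short description of what would be loaded, or throws if inconsistent.
    static String plan(List<Dir> dirs) throws IOException {
        long imageTime = Long.MIN_VALUE, editsTime = Long.MIN_VALUE;
        Dir imageDir = null, editsDir = null;
        for (Dir d : dirs) {
            if (d.hasImage && imageTime < d.checkpointTime) {
                imageTime = d.checkpointTime;
                imageDir = d;
            }
            if (d.hasEdits && editsTime < d.checkpointTime) {
                editsTime = d.checkpointTime;
                editsDir = d;
            }
        }
        if (imageDir == null || editsDir == null) {
            throw new IOException("image or edits not found");
        }
        if (imageTime == editsTime) {
            return "load image from " + imageDir.name + ", replay edits from " + editsDir.name;
        }
        // Rare case: image saved but old edits not yet purged, in separate directories.
        if (imageTime > editsTime && !imageDir.hasEdits && !editsDir.hasImage) {
            return "load image from " + imageDir.name + ", discard edits";
        }
        throw new IOException("image and edits checkpoint times do not match");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(plan(Arrays.asList(
                new Dir("a", true, true, 100L), new Dir("b", true, true, 200L))));
        System.out.println(plan(Arrays.asList(
                new Dir("img", true, false, 200L), new Dir("ed", false, true, 100L))));
    }
}
```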

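After this long series of checks, it is only at

needToSave |= loadFSImage(getImageFile(latestNameSD, NameNodeFile.IMAGE));

that the fsimage file actually starts to load.

Note that the call chain at the top of this post includes FSNamesystem.dir.loadFSImage. The code for that call is in FSDirectory.java, whereas the code that actually loads the fsimage is in FSImage.java, so do not confuse the two. The detailed loading flow is covered in the next part.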

