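When the NameNode starts, one important step is loading the fsimage file. This post walks through that process.
NameNode.main -> NameNode(conf) -> NameNode.initialize(conf) -> FSNamesystem(this, conf) -> FSNamesystem.initialize(nn, conf) -> FSNamesystem.dir.loadFSImage(getNamespaceDirs(conf), getNamespaceEditsDirs(conf), startOpt)
The function to focus on is the last one, loadFSImage. After a series of checks it loads the FSImage, merging the edits log into it along the way. It takes three arguments: the first two are the return values of getNamespaceDirs and getNamespaceEditsDirs, and startOpt is an enum whose value is REGULAR on a normal startup. Let's look at getNamespaceDirs first.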
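public static Collection<File> getNamespaceDirs(Configuration conf) {
  // Read the FSImage directories from the configuration. There may be several,
  // so they are kept in a collection. The property is set in hdfs-site.xml.
  Collection<String> dirNames = conf.getStringCollection("dfs.name.dir");
  // If no directory is configured, fall back to the default /tmp/hadoop/dfs/name
  if (dirNames.isEmpty())
    dirNames.add("/tmp/hadoop/dfs/name");
  Collection<File> dirs = new ArrayList<File>(dirNames.size());
  // Wrap each directory name in a File, add it to the ArrayList, and return the list
  for (String name : dirNames) {
    dirs.add(new File(name));
  }
  return dirs;
}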
// Get the edits directories. They are usually the same as the FSImage directories,
// but if the namespace is updated frequently and the NameNode sees heavy I/O, consider
// storing FSImage and edits in separate locations to spread the I/O load.
public static Collection<File> getNamespaceEditsDirs(Configuration conf) {
  Collection<String> editsDirNames =
      conf.getStringCollection("dfs.name.edits.dir");
  if (editsDirNames.isEmpty())
    editsDirNames.add("/tmp/hadoop/dfs/name");
  Collection<File> dirs = new ArrayList<File>(editsDirNames.size());
  for (String name : editsDirNames) {
    dirs.add(new File(name));
  }
  return dirs;
}
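The two functions above share one pattern: read a multi-valued property, fall back to a default path when nothing is configured, and wrap each name in a File. Here is a minimal sketch of that pattern. It uses java.util.Properties as a stand-in for Hadoop's Configuration and assumes the value is a comma-separated list. The class NameDirResolver and the method resolveDirs are hypothetical names used only for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class NameDirResolver {
    // Same fallback path that the listing above hard-codes.
    static final String DEFAULT_DIR = "/tmp/hadoop/dfs/name";

    // Hypothetical helper: split a comma-separated property into File objects,
    // falling back to DEFAULT_DIR when the key is missing or blank.
    static List<File> resolveDirs(Properties conf, String key) {
        String raw = conf.getProperty(key, "");
        List<File> dirs = new ArrayList<File>();
        for (String name : raw.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                dirs.add(new File(trimmed));
            }
        }
        if (dirs.isEmpty()) {
            dirs.add(new File(DEFAULT_DIR));
        }
        return dirs;
    }
}
```

If dfs.name.dir is set to two directories and dfs.name.edits.dir is left unset, resolveDirs returns two entries for the first key and the single default entry for the second.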
// Once both functions have returned, loading begins. The important calls are:
// 1. fsImage.recoverTransitionRead
// 2. fsImage.saveNamespace
// 3. fsImage.setCheckpointDirectories
void loadFSImage(Collection<File> dataDirs,
                 Collection<File> editsDirs,
                 StartupOption startOpt) throws IOException {
  // Format first if formatting was requested
  if (startOpt == StartupOption.FORMAT) {
    fsImage.setStorageDirectories(dataDirs, editsDirs);
    fsImage.format();
    startOpt = StartupOption.REGULAR;
  }
  try {
    if (fsImage.recoverTransitionRead(dataDirs, editsDirs, startOpt)) {
      fsImage.saveNamespace(true);
    }
    FSEditLog editLog = fsImage.getEditLog();
    assert editLog != null : "editLog must be initialized";
    if (!editLog.isOpen())
      editLog.open();
    fsImage.setCheckpointDirectories(null, null);
  } catch (IOException e) {
    fsImage.close();
    throw e;
  }
  synchronized (this) {
    this.ready = true;
    this.nameCache.initialized();
    this.notifyAll();
  }
}
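The synchronized block at the end of loadFSImage sets a ready flag and calls notifyAll, which is the standard monitor idiom for telling waiting threads that initialization has finished. The sketch below shows both sides of the idiom. The class ReadyFlag and its method names are hypothetical, and the waiting side is an assumption, since the post does not show how other threads wait.

```java
final class ReadyFlag {
    private boolean ready = false;

    // Loader side: mirrors the synchronized block at the end of loadFSImage.
    synchronized void markReady() {
        ready = true;
        notifyAll();
    }

    // Waiter side: blocks until markReady() has run or the timeout expires.
    // The loop re-checks the flag because wait() may wake up spuriously.
    synchronized boolean waitForReady(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!ready) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                // Restore the interrupt status and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

Setting the flag and notifying under the same lock matters: a waiter can never observe ready == false, miss the notification, and then sleep until the timeout.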
fsImage.recoverTransitionRead checks whether each directory under dfs.name.dir is in a usable state by calling analyzeStorage. On a normal startup the relevant code is as follows:
1. First check what exists on disk, for example whether the version file is present and whether the directory contains any temporary directories. The main code is:
analyzeStorage:
// Check whether the version file exists
File versionFile = getVersionFile();
boolean hasCurrent = versionFile.exists();
// Check for each of the temporary directories
boolean hasPrevious = getPreviousDir().exists();
boolean hasPreviousTmp = getPreviousTmp().exists();
boolean hasRemovedTmp = getRemovedTmp().exists();
boolean hasFinalizedTmp = getFinalizedTmp().exists();
boolean hasCheckpointTmp = getLastCheckpointTmp().exists();
// Normally this returns NORMAL; the condition is that none of the temporary directories exist
if (!(hasPreviousTmp || hasRemovedTmp
    || hasFinalizedTmp || hasCheckpointTmp)) {
  // no temp dirs - no recovery
  if (hasCurrent)
    return StorageState.NORMAL;
  if (hasPrevious)
    throw new InconsistentFSStateException(root,
        "version file in current directory is missing.");
  return StorageState.NOT_FORMATTED;
}
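To make the decision above concrete, here is a small standalone sketch that classifies a directory with plain java.io.File checks. It rests on several assumptions: the on-disk names (current/VERSION, previous, and the four *.tmp directories) are not shown in the post, NEEDS_RECOVERY collapses all of the real recovery states into one, and an unchecked exception stands in for InconsistentFSStateException. StorageAnalyzer is a hypothetical class name.

```java
import java.io.File;

final class StorageAnalyzer {
    enum StorageState { NON_EXISTENT, NOT_FORMATTED, NORMAL, NEEDS_RECOVERY }

    // Assumed names of the temporary directories; the post only shows their getters.
    static final String[] TEMP_DIRS = {
        "previous.tmp", "removed.tmp", "finalized.tmp", "lastcheckpoint.tmp"
    };

    static StorageState analyze(File root) {
        if (!root.isDirectory()) {
            return StorageState.NON_EXISTENT;
        }
        boolean hasCurrent = new File(new File(root, "current"), "VERSION").exists();
        boolean hasPrevious = new File(root, "previous").exists();
        for (String tmp : TEMP_DIRS) {
            if (new File(root, tmp).exists()) {
                // A leftover temporary directory means a transition was interrupted.
                return StorageState.NEEDS_RECOVERY;
            }
        }
        // No temporary directories: no recovery is needed.
        if (hasCurrent) {
            return StorageState.NORMAL;
        }
        if (hasPrevious) {
            throw new IllegalStateException(
                "version file in current directory is missing: " + root);
        }
        return StorageState.NOT_FORMATTED;
    }
}
```

An empty directory is reported as NOT_FORMATTED, and the same directory with a current/VERSION file is reported as NORMAL, which is the path a regular startup takes.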
2. Validate the version file: sd.read() -> read(getVersionFile()) -> getFields()
// Read the VERSION file into a Properties object and check it for consistency. If the
// check passes, use the properties to initialize the fields inherited from StorageInfo.
protected void getFields(Properties props,
                         StorageDirectory sd
                         ) throws IOException {
  String sv, st, sid, sct;
  sv = props.getProperty("layoutVersion");
  st = props.getProperty("storageType");
  sid = props.getProperty("namespaceID");
  sct = props.getProperty("cTime");
  // Property validation starts here
  if (sv == null || st == null || sid == null || sct == null)
    throw new InconsistentFSStateException(sd.root,
        "file " + STORAGE_FILE_VERSION +
        " is invalid.");
  int rv = Integer.parseInt(sv);
  NodeType rt = NodeType.valueOf(st);
  int rid = Integer.parseInt(sid);
  long rct = Long.parseLong(sct);
  if (!storageType.equals(rt) ||
      !((namespaceID == 0) || (rid == 0) || namespaceID == rid))
    throw new InconsistentFSStateException(sd.root,
        "is incompatible with others.");
  if (rv < FSConstants.LAYOUT_VERSION) // future version
    throw new IncorrectVersionException(rv, "storage directory "
        + sd.root.getCanonicalPath());
  // Initialize the StorageInfo fields
  layoutVersion = rv;
  storageType = rt;
  namespaceID = rid;
  cTime = rct;
}
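The same validation can be tried outside Hadoop with nothing but java.util.Properties. The sketch below keeps the four property names from the listing above and simplifies everything else: the storage type is compared as a plain string, the layout-version check is left out, and unchecked exceptions replace the Hadoop exception types. VersionFields and parse are hypothetical names.

```java
import java.util.Properties;

final class VersionFields {
    final int layoutVersion;
    final String storageType;
    final int namespaceID;
    final long cTime;

    private VersionFields(int layoutVersion, String storageType, int namespaceID, long cTime) {
        this.layoutVersion = layoutVersion;
        this.storageType = storageType;
        this.namespaceID = namespaceID;
        this.cTime = cTime;
    }

    // expectedNamespaceID == 0 means "not known yet", as in the listing above.
    static VersionFields parse(Properties props, String expectedType, int expectedNamespaceID) {
        String sv = props.getProperty("layoutVersion");
        String st = props.getProperty("storageType");
        String sid = props.getProperty("namespaceID");
        String sct = props.getProperty("cTime");
        if (sv == null || st == null || sid == null || sct == null) {
            throw new IllegalArgumentException("VERSION file is invalid");
        }
        int rid = Integer.parseInt(sid);
        if (!expectedType.equals(st)) {
            throw new IllegalStateException("storage type is incompatible: " + st);
        }
        // A zero ID on either side means that side carries no constraint.
        boolean idUnset = expectedNamespaceID == 0 || rid == 0;
        if (!idUnset && expectedNamespaceID != rid) {
            throw new IllegalStateException("namespaceID is incompatible: " + rid);
        }
        return new VersionFields(Integer.parseInt(sv), st, rid, Long.parseLong(sct));
    }
}
```

In a real run the Properties object would be filled from the VERSION file with Properties.load. The zero checks let the first directory that is read establish the namespace ID that later directories must match.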
3. Loop over the dfs.name.dir directories again and format any that need it
for (Iterator<StorageDirectory> it =
         dirIterator(); it.hasNext();) {
  StorageDirectory sd = it.next();
  StorageState curState = dataDirStates.get(sd);
  switch (curState) {
  case NON_EXISTENT:
    assert false : StorageState.NON_EXISTENT + " state cannot be here";
  case NOT_FORMATTED:
    LOG.info("Storage directory " + sd.getRoot() +
        " is not formatted.");
    LOG.info("Formatting ...");
    sd.clearDirectory(); // create empty current dir
    break;
  default:
    break;
  }
}
4. Check the startup option: upgrade, import, rollback, or regular. Since this is a normal startup, the branch taken is the one that loads the FSImage:
loadFSImage()
boolean loadFSImage() throws IOException {
  // Now check all curFiles and see which is the newest
  long latestNameCheckpointTime = Long.MIN_VALUE;
  long latestEditsCheckpointTime = Long.MIN_VALUE;
  StorageDirectory latestNameSD = null;
  StorageDirectory latestEditsSD = null;
  boolean needToSave = false;
  isUpgradeFinalized = true;
  Collection<String> imageDirs = new ArrayList<String>();
  Collection<String> editsDirs = new ArrayList<String>();
  // Loop over the directories given by dfs.name.dir, add the valid ones to the
  // collections, and read fstime to get each checkpoint time. With several directories
  // the newest checkpoint time wins, because latestNameCheckpointTime keeps the
  // latest timestamp seen in this loop.
  for (Iterator<StorageDirectory> it = dirIterator(); it.hasNext();) {
    StorageDirectory sd = it.next();
    if (!sd.getVersionFile().exists()) {
      needToSave |= true;
      continue; // some of them might have just been formatted
    }
    boolean imageExists = false, editsExists = false;
    if (sd.getStorageDirType().isOfType(NameNodeDirType.IMAGE)) {
      imageExists = getImageFile(sd, NameNodeFile.IMAGE).exists();
      imageDirs.add(sd.getRoot().getCanonicalPath());
    }
    if (sd.getStorageDirType().isOfType(NameNodeDirType.EDITS)) {
      editsExists = getImageFile(sd, NameNodeFile.EDITS).exists();
      editsDirs.add(sd.getRoot().getCanonicalPath());
    }
    checkpointTime = readCheckpointTime(sd);
    if ((checkpointTime != Long.MIN_VALUE) &&
        ((checkpointTime != latestNameCheckpointTime) ||
         (checkpointTime != latestEditsCheckpointTime))) {
      // Force saving of new image if checkpoint time
      // is not same in all of the storage directories.
      needToSave |= true;
    }
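    // Determine the latest valid checkpoint times.
    // NOTE: this listing lost the text between a '<' and a '>' at this point. The lines
    // from here down to the "same checkpoint" check are restored on the assumption that
    // they follow the Hadoop 0.20-era FSImage.loadFSImage source.
    if (sd.getStorageDirType().isOfType(NameNodeDirType.IMAGE) &&
        (latestNameCheckpointTime < checkpointTime) && imageExists) {
      latestNameCheckpointTime = checkpointTime;
      latestNameSD = sd;
    }
    if (sd.getStorageDirType().isOfType(NameNodeDirType.EDITS) &&
        (latestEditsCheckpointTime < checkpointTime) && editsExists) {
      latestEditsCheckpointTime = checkpointTime;
      latestEditsSD = sd;
    }
    if (checkpointTime <= 0L)
      needToSave |= true;
    // set finalized flag
    isUpgradeFinalized = isUpgradeFinalized && !sd.getPreviousDir().exists();
  }
  // We should have at least one image and one edits dirs
  if (latestNameSD == null)
    throw new IOException("Image file is not found in " + imageDirs);
  if (latestEditsSD == null)
    throw new IOException("Edits file is not found in " + editsDirs);
  // Make sure we are loading image and edits from same checkpoint
  if (latestNameCheckpointTime > latestEditsCheckpointTime
      && latestNameSD != latestEditsSD
      && latestNameSD.getStorageDirType() == NameNodeDirType.IMAGE
      && latestEditsSD.getStorageDirType() == NameNodeDirType.EDITS) {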
    // This is a rare failure when NN has image-only and edits-only
    // storage directories, and fails right after saving images,
    // in some of the storage directories, but before purging edits.
    // See -NOTE- in saveNamespace().
    LOG.error("This is a rare failure scenario!!!");
    LOG.error("Image checkpoint time " + latestNameCheckpointTime +
        " > edits checkpoint time " + latestEditsCheckpointTime);
    LOG.error("Name-node will treat the image as the latest state of " +
        "the namespace. Old edits will be discarded.");
  } else if (latestNameCheckpointTime != latestEditsCheckpointTime)
    throw new IOException("Inconsistent storage detected, " +
        "image and edits checkpoint times do not match. " +
        "image checkpoint time = " + latestNameCheckpointTime +
        "edits checkpoint time = " + latestEditsCheckpointTime);
  // Recover from previous interrupted checkpoint if any
  needToSave |= recoverInterruptedCheckpoint(latestNameSD, latestEditsSD);
  long startTime = FSNamesystem.now();
  long imageSize = getImageFile(latestNameSD, NameNodeFile.IMAGE).length();
  //
  // Load in bits
  //
  latestNameSD.read(); // the VERSION file is read once more here, which is rather tedious
  // Note: this is where the fsimage file actually starts to load
  needToSave |= loadFSImage(getImageFile(latestNameSD, NameNodeFile.IMAGE));
  LOG.info("Image file of size " + imageSize + " loaded in "
      + (FSNamesystem.now() - startTime)/1000 + " seconds.");
  // Load latest edits
  if (latestNameCheckpointTime > latestEditsCheckpointTime)
    // the image is already current, discard edits
    needToSave |= true;
  else // latestNameCheckpointTime == latestEditsCheckpointTime
    needToSave |= (loadFSEdits(latestEditsSD) > 0);
  return needToSave;
}
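Stripped of the Hadoop types, the function above picks the image directory and the edits directory with the newest checkpoint time and then checks that the two agree. The sketch below models only that selection. It is a simplification and not the Hadoop code: a directory is described by two booleans and a timestamp, the needToSave bookkeeping is omitted, and an unchecked exception replaces IOException. CheckpointSelector, Dir, and Selection are hypothetical names.

```java
import java.util.List;

final class CheckpointSelector {
    // Hypothetical stand-in for a storage directory and the value read from its fstime file.
    static final class Dir {
        final String name;
        final boolean holdsImage;
        final boolean holdsEdits;
        final long checkpointTime;

        Dir(String name, boolean holdsImage, boolean holdsEdits, long checkpointTime) {
            this.name = name;
            this.holdsImage = holdsImage;
            this.holdsEdits = holdsEdits;
            this.checkpointTime = checkpointTime;
        }
    }

    static final class Selection {
        final Dir image;
        final Dir edits;
        final boolean discardEdits;

        Selection(Dir image, Dir edits, boolean discardEdits) {
            this.image = image;
            this.edits = edits;
            this.discardEdits = discardEdits;
        }
    }

    static Selection select(List<Dir> dirs) {
        Dir image = null;
        Dir edits = null;
        // Keep the newest directory of each kind.
        for (Dir d : dirs) {
            if (d.holdsImage && (image == null || d.checkpointTime > image.checkpointTime)) {
                image = d;
            }
            if (d.holdsEdits && (edits == null || d.checkpointTime > edits.checkpointTime)) {
                edits = d;
            }
        }
        if (image == null) throw new IllegalStateException("no image directory found");
        if (edits == null) throw new IllegalStateException("no edits directory found");

        // Rare case: an image-only directory is newer than an edits-only directory,
        // so the image is treated as current and the old edits are discarded.
        if (image.checkpointTime > edits.checkpointTime
                && !image.holdsEdits && !edits.holdsImage) {
            return new Selection(image, edits, true);
        }
        if (image.checkpointTime != edits.checkpointTime) {
            throw new IllegalStateException(
                "image and edits checkpoint times do not match: "
                + image.checkpointTime + " vs " + edits.checkpointTime);
        }
        return new Selection(image, edits, false);
    }
}
```

With two combined directories stamped 100 and 200, both selections land on the newer one. An edits directory that is newer than every image is rejected, because replaying edits on top of an older image would not reproduce the namespace.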
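Only after this long series of checks, at
needToSave |= loadFSImage(getImageFile(latestNameSD, NameNodeFile.IMAGE));
does the fsimage file actually start to load.
Note that the call chain at the top of this post includes FSNamesystem.dir.loadFSImage, whose code lives in FSDirectory.java, while the code that actually loads the fsimage lives in FSImage.java. Do not confuse the two. The detailed loading process will be covered in the next post.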