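A Datanode periodically sends heartbeat messages to the Namenode through the sendHeartbeat method. A heartbeat tells the Namenode that the Datanode is still alive and healthy. It also carries status information about the Datanode, including its total capacity, the space consumed by DFS, and the space still available. In response, the Namenode returns a list of DatanodeCommand objects that tell the Datanode which operations to perform. The main DatanodeCommand action types are: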
DNA_UNKNOWN = 0; // unknown action
DNA_TRANSFER = 1; // transfer blocks to another datanode
DNA_INVALIDATE = 2; // invalidate blocks
DNA_SHUTDOWN = 3; // shutdown node
DNA_REGISTER = 4; // re-register
DNA_FINALIZE = 5; // finalize previous upgrade
DNA_RECOVERBLOCK = 6; // request a block recovery
DNA_ACCESSKEYUPDATE = 7; // update access key
DNA_BALANCERBANDWIDTHUPDATE = 8; // update balancer bandwidth
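To make the role of these action codes concrete, here is a minimal sketch of how a receiver might branch on them. The class CommandDispatchSketch and its describe helper are hypothetical names introduced for illustration; this is not the actual Datanode code, only the constant values are taken from the list above.

```java
/** Illustrative sketch only; not the actual Datanode implementation. */
final class CommandDispatchSketch {
    // Action codes copied from the list above.
    static final int DNA_UNKNOWN = 0;
    static final int DNA_TRANSFER = 1;
    static final int DNA_INVALIDATE = 2;
    static final int DNA_SHUTDOWN = 3;
    static final int DNA_REGISTER = 4;
    static final int DNA_FINALIZE = 5;
    static final int DNA_RECOVERBLOCK = 6;
    static final int DNA_ACCESSKEYUPDATE = 7;
    static final int DNA_BALANCERBANDWIDTHUPDATE = 8;

    /** Hypothetical helper: maps an action code to a short description. */
    static String describe(int action) {
        switch (action) {
            case DNA_TRANSFER:                return "transfer blocks to another datanode";
            case DNA_INVALIDATE:              return "invalidate blocks";
            case DNA_SHUTDOWN:                return "shutdown node";
            case DNA_REGISTER:                return "re-register";
            case DNA_FINALIZE:                return "finalize previous upgrade";
            case DNA_RECOVERBLOCK:            return "request a block recovery";
            case DNA_ACCESSKEYUPDATE:         return "update access key";
            case DNA_BALANCERBANDWIDTHUPDATE: return "update balancer bandwidth";
            default:                          return "unknown action";
        }
    }

    public static void main(String[] args) {
        // Codes outside the known range fall through to the default branch.
        int[] received = { DNA_TRANSFER, DNA_INVALIDATE, 42 };
        for (int action : received) {
            System.out.println(action + " -> " + describe(action));
        }
    }
}
```

Routing unrecognized codes to a default branch keeps the receiver tolerant of action types it does not know about.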
Each DatanodeDescriptor kept by the Namenode contains several queues:
BlockQueue replicateBlocks; /* blocks that this Datanode needs to replicate */
BlockQueue recoverBlocks; /* blocks that this Datanode needs to recover */
Set<Block> invalidateBlocks; /* blocks that this Datanode needs to invalidate */
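The sketch below shows one way such per-Datanode queues could be filled and then drained in bounded batches, which is what a limit such as blockInvalidateLimit suggests. It is an assumption-laden illustration: PendingWorkSketch and its enqueue/poll methods are hypothetical names, and block IDs are plain long values instead of Hadoop Block objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Illustrative sketch; block IDs are plain longs, not Hadoop Block objects. */
final class PendingWorkSketch {
    private final Queue<Long> replicateBlocks = new ArrayDeque<>();
    private final Queue<Long> recoverBlocks = new ArrayDeque<>();
    // A set, so scheduling the same block twice has no additional effect.
    private final Set<Long> invalidateBlocks = new LinkedHashSet<>();

    synchronized void enqueueReplication(long blockId)  { replicateBlocks.add(blockId); }
    synchronized void enqueueRecovery(long blockId)     { recoverBlocks.add(blockId); }
    synchronized void enqueueInvalidation(long blockId) { invalidateBlocks.add(blockId); }

    /** Removes and returns at most max blocks waiting for replication. */
    synchronized List<Long> pollReplication(int max) {
        return drain(replicateBlocks.iterator(), max);
    }

    /** Removes and returns at most max blocks waiting for recovery. */
    synchronized List<Long> pollRecovery(int max) {
        return drain(recoverBlocks.iterator(), max);
    }

    /** Removes and returns at most max blocks waiting for invalidation. */
    synchronized List<Long> pollInvalidation(int max) {
        return drain(invalidateBlocks.iterator(), max);
    }

    // Moves up to max elements from the underlying collection into a new list.
    private static List<Long> drain(Iterator<Long> it, int max) {
        List<Long> out = new ArrayList<>();
        while (it.hasNext() && out.size() < max) {
            out.add(it.next());
            it.remove();
        }
        return out;
    }
}
```

Draining in bounded batches means a single heartbeat reply never carries more work than the caller asked for; whatever is left stays queued for the next heartbeat.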
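The Namenode handles sendHeartbeat as follows:
1. First, look up the DatanodeDescriptor of the Datanode that sent the heartbeat in datanodeMap. If the hostname recorded in datanodeMap differs from the hostname of the sending Datanode, an UnregisteredDatanodeException is thrown, and the DNA_REGISTER command is returned to that Datanode.
2. If the DatanodeDescriptor found in datanodeMap is not null but the Datanode is one that should be shut down, the node is marked dead and a DisallowedDatanodeException is thrown.
3. If the DatanodeDescriptor is null, or its state is not alive, the DNA_REGISTER command is returned to the Datanode.
4. If the DatanodeDescriptor is in a normal state, the statistics of the FSNamesystem and of the DatanodeDescriptor are updated, mainly capacity, dfsUsed, remaining, and xceiverCount.
5. Next, the pending commands are collected. If a block recovery command (DNA_RECOVERBLOCK) is pending, it is returned on its own immediately. Otherwise the DNA_TRANSFER and DNA_INVALIDATE commands, a KeyUpdateCommand, and a BalancerBandwidthCommand are gathered in that order and returned to the Datanode.
6. Finally, if none of the commands above is pending, check whether there is an UpgradeCommand; if so, return it to the Datanode for execution.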
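Step 4 can be read as a subtract-then-add update of cluster-wide totals: the node's old figures are removed from the totals, the node is overwritten with the reported values, and the new figures are added back. The sketch below illustrates that pattern under stated assumptions. ClusterStatsSketch, NodeStats, and the total fields are hypothetical names chosen for this example, not the actual FSNamesystem members.

```java
/** Illustrative sketch of a subtract-then-add statistics update. */
final class ClusterStatsSketch {
    /** Per-node figures as last reported by a heartbeat. */
    static final class NodeStats {
        long capacity;
        long dfsUsed;
        long remaining;
        int xceiverCount;
    }

    // Cluster-wide totals (hypothetical field names).
    long capacityTotal;
    long capacityUsed;
    long capacityRemaining;
    int totalLoad;

    // Adds the node's current figures to the totals, or removes them.
    private void updateStats(NodeStats n, boolean isAdded) {
        int sign = isAdded ? 1 : -1;
        capacityTotal += sign * n.capacity;
        capacityUsed += sign * n.dfsUsed;
        capacityRemaining += sign * n.remaining;
        totalLoad += sign * n.xceiverCount;
    }

    /** Applies one heartbeat: remove old figures, store new ones, add them back. */
    synchronized void heartbeat(NodeStats n, long capacity, long dfsUsed,
                                long remaining, int xceiverCount) {
        updateStats(n, false);
        n.capacity = capacity;
        n.dfsUsed = dfsUsed;
        n.remaining = remaining;
        n.xceiverCount = xceiverCount;
        updateStats(n, true);
    }
}
```

With this pattern the totals stay consistent without rescanning every node on each heartbeat. The original handleHeartbeat method, which the steps above describe, follows.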
DatanodeCommand[] handleHeartbeat(DatanodeRegistration nodeReg,
        long capacity, long dfsUsed, long remaining, int xceiverCount,
        int xmitsInProgress) throws IOException {
    DatanodeCommand cmd = null;
    synchronized (heartbeats) {
        synchronized (datanodeMap) {
            DatanodeDescriptor nodeinfo = null;
            try {
                nodeinfo = getDatanode(nodeReg);
            } catch (UnregisteredDatanodeException e) {
                return new DatanodeCommand[] { DatanodeCommand.REGISTER };
            }
            // Check if this datanode should actually be shutdown instead.
            if (nodeinfo != null && shouldNodeShutdown(nodeinfo)) {
                setDatanodeDead(nodeinfo);
                throw new DisallowedDatanodeException(nodeinfo);
            }
            if (nodeinfo == null || !nodeinfo.isAlive) {
                return new DatanodeCommand[] { DatanodeCommand.REGISTER };
            }
            updateStats(nodeinfo, false);
            nodeinfo.updateHeartbeat(capacity, dfsUsed, remaining,
                    xceiverCount);
            updateStats(nodeinfo, true);
            // check lease recovery
            cmd = nodeinfo.getLeaseRecoveryCommand(Integer.MAX_VALUE);
            if (cmd != null) {
                return new DatanodeCommand[] { cmd };
            }
            ArrayList<DatanodeCommand> cmds = new ArrayList<DatanodeCommand>();
            // check pending replication
            cmd = nodeinfo.getReplicationCommand(maxReplicationStreams
                    - xmitsInProgress);
            if (cmd != null) {
                cmds.add(cmd);
            }
            // check block invalidation
            cmd = nodeinfo.getInvalidateBlocks(blockInvalidateLimit);
            if (cmd != null) {
                cmds.add(cmd);
            }
            // check access key update
            if (isAccessTokenEnabled && nodeinfo.needKeyUpdate) {
                cmds.add(new KeyUpdateCommand(accessTokenHandler
                        .exportKeys()));
                nodeinfo.needKeyUpdate = false;
            }
            // check for balancer bandwidth update
            if (nodeinfo.getBalancerBandwidth() > 0) {
                cmds.add(new BalancerBandwidthCommand(nodeinfo
                        .getBalancerBandwidth()));
                // set back to 0 to indicate that datanode has been sent the
                // new value
                nodeinfo.setBalancerBandwidth(0);
            }
            if (!cmds.isEmpty()) {
                return cmds.toArray(new DatanodeCommand[cmds.size()]);
            }
        }
    }
    // check distributed upgrade
    cmd = getDistributedUpgradeCommand();
    if (cmd != null) {
        return new DatanodeCommand[] { cmd };
    }
    return null;
}
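Because handleHeartbeat depends on many Namenode internals, it cannot be run in isolation. The toy model below mirrors only its control flow so that the order of the checks can be tried out. Everything in it is an assumption made for illustration: HeartbeatFlowSketch, its Node class, the pending counters, and the use of plain strings in place of DatanodeCommand objects. The key update and balancer bandwidth commands are left out for brevity, and the whole method is synchronized, whereas the original performs the upgrade check outside its synchronized blocks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the heartbeat control flow; all names here are hypothetical. */
final class HeartbeatFlowSketch {
    static final class Node {
        final String name;
        boolean alive = true;
        boolean shouldShutdown = false;
        boolean recoveryPending = false;
        int transfersPending = 0;
        int invalidationsPending = 0;
        Node(String name) { this.name = name; }
    }

    private final Map<String, Node> datanodeMap = new HashMap<>(); // keyed by storage ID
    boolean upgradePending = false;

    Node register(String storageId, String name) {
        Node n = new Node(name);
        datanodeMap.put(storageId, n);
        return n;
    }

    /** Returns the command names for one heartbeat, or null if there is nothing to do. */
    synchronized List<String> handleHeartbeat(String storageId, String name, int maxTransfers) {
        Node node = datanodeMap.get(storageId);
        // A name mismatch stands in for the caught UnregisteredDatanodeException.
        if (node != null && !node.name.equals(name)) {
            return Collections.singletonList("DNA_REGISTER");
        }
        // A node that should shut down is marked dead and rejected.
        if (node != null && node.shouldShutdown) {
            node.alive = false;
            throw new IllegalStateException("datanode not allowed: " + name);
        }
        // Unknown or dead nodes are asked to register again.
        if (node == null || !node.alive) {
            return Collections.singletonList("DNA_REGISTER");
        }
        // A pending recovery is returned alone, ahead of everything else.
        if (node.recoveryPending) {
            node.recoveryPending = false;
            return Collections.singletonList("DNA_RECOVERBLOCK");
        }
        List<String> cmds = new ArrayList<>();
        if (node.transfersPending > 0 && maxTransfers > 0) {
            node.transfersPending -= Math.min(node.transfersPending, maxTransfers);
            cmds.add("DNA_TRANSFER");
        }
        if (node.invalidationsPending > 0) {
            node.invalidationsPending = 0;
            cmds.add("DNA_INVALIDATE");
        }
        if (!cmds.isEmpty()) {
            return cmds;
        }
        // The upgrade command is considered only when nothing else was returned.
        if (upgradePending) {
            upgradePending = false;
            return Collections.singletonList("UPGRADE");
        }
        return null;
    }
}
```

Note that, as in the listing, an idle heartbeat yields null instead of an empty collection, so a caller has to handle both cases.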