Removing a node from a hadoop-0.20.2-cdh3u5 cluster


1. First check the cluster's Average block replication. If it is greater than 2, no data will be lost even if a node is pulled out of the cluster directly.

Run hadoop fsck / to check the health of the cluster's file system.

You can also set a file's replication factor by hand. For extra safety, critical data can be replicated with: hadoop fs -setrep -w 3 -R <path>
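As a sketch, the average replication factor can also be pulled out of the fsck summary on the command line. The sample line below only mimics the summary format that hadoop fsck / prints (the figure 2.0 is a placeholder, not real cluster output):

```shell
# On a live cluster the input would come from:  hadoop fsck /
# Canned sample of the summary line fsck prints:
fsck_summary='Average block replication: 2.0'

# Split on the colon and strip whitespace to get the bare number
avg=$(echo "$fsck_summary" | awk -F: '{gsub(/[ \t]/, "", $2); print $2}')
echo "average replication = $avg"
```

If the number printed is comfortably above 2, the cluster can tolerate losing one node's replicas while decommissioning proceeds.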


2. Add the configuration below to the following two files, and list the nodes to be decommissioned in the excludes file.

mapred-site.xml

<property>
  <name>mapred.hosts</name>
  <value></value>
  <description>Names a file that contains the list of nodes that may
  connect to the jobtracker.  If the value is empty, all hosts are
  permitted.</description>
</property>


<property>
  <name>mapred.hosts.exclude</name>
  <value>HADOOP_HOME/conf/excludes</value>
  <description>Names a file that contains the list of hosts that
  should be excluded by the jobtracker.  If the value is empty, no
  hosts are excluded.</description>
</property>


hdfs-site.xml

<property>
  <name>dfs.hosts</name>
  <value></value>
  <description>Names a file that contains a list of hosts that are
  permitted to connect to the namenode. The full pathname of the file
  must be specified.  If the value is empty, all hosts are
  permitted.</description>
</property>


<property>
  <name>dfs.hosts.exclude</name>
  <value>HADOOP_HOME/conf/excludes</value>
  <description>Names a file that contains a list of hosts that are
  not permitted to connect to the namenode.  The full pathname of the
  file must be specified.  If the value is empty, no hosts are
  excluded.</description>
</property>


The excludes file simply lists the hostnames of the machines to decommission, one per line.
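For illustration, the file can be created like this (the hostnames below are placeholders; substitute the nodes you are actually retiring, and write the file at the path configured in dfs.hosts.exclude / mapred.hosts.exclude):

```shell
# Placeholder path; in this setup it would be HADOOP_HOME/conf/excludes
EXCLUDES=excludes

# One hostname per line, nothing else
cat > "$EXCLUDES" <<'EOF'
datanode-03.example.com
datanode-07.example.com
EOF

cat "$EXCLUDES"
```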


Run on the namenode:    hadoop dfsadmin -refreshNodes

Run on the jobtracker:  hadoop mradmin -refreshNodes

Running "hadoop dfsadmin -refreshNodes" triggers the decommission process, during which the cluster re-replicates the data blocks stored on the decommissioning node to other nodes.


Decommission is not instant since it requires replication of potentially a large number of blocks and we do not want the cluster to be overwhelmed with just this one job. 

The decommission progress can be monitored on the name-node Web UI. Until all blocks are replicated the node will be in "Decommission In Progress" state. 

When decommission is done the state will change to "Decommissioned". The nodes can be removed whenever decommission is finished.

The decommission process can be terminated at any time by editing the configuration or the exclude files and repeating the -refreshNodes command.
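As an alternative to watching the Web UI, the state can also be grepped out of "hadoop dfsadmin -report". The sketch below parses a canned sample of the report format; on a live cluster you would pipe the real command's output through the same grep:

```shell
# Canned sample of two datanode entries from "hadoop dfsadmin -report"
report='Name: 10.0.0.7:50010
Decommission Status : Decommission in progress
Name: 10.0.0.8:50010
Decommission Status : Normal'

# Count how many nodes are still mid-decommission; 0 means it is done
count=$(echo "$report" | grep -c 'Decommission in progress')
echo "$count node(s) still decommissioning"
```

Once the count reaches zero (the node reports "Decommissioned"), it is safe to shut the machine down.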


Decommission reference:

http://wiki.apache.org/hadoop/FAQ


Cloudera cdh3u5 version reference:

http://blog.csdn.net/rzhzhz/article/details/7577352

