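1. Prerequisites
(1). Set up an ordinary HDFS cluster; the high-availability cluster is built by modifying and extending it. See: Setting up the Hadoop Distributed File System (HDFS)
(2). Set up a ZooKeeper cluster (used to switch between the active and standby NameNodes). See: Setting up a ZooKeeper cluster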
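2. Overview of the high-availability HDFS cluster
(1). The active NameNode is the live node. Its main jobs are: 1. serving client read and write requests; 2. storing the metadata (the fsimage and edits files), which is first created when the node is formatted.
(2). The standby NameNode is the backup node. A high-availability cluster no longer uses the SecondaryNameNode; the shared edit log is stored in an internal JournalNode cluster instead.
(3). ZooKeeper's election mechanism switches the active and standby roles. Each NameNode has a companion process, the failover controller (ZKFC): one for the active NameNode and one for the standby NameNode.
(4). The failover controller has two jobs: 1. it logs in remotely over passwordless SSH to promote the standby NameNode to active; 2. it monitors the health of its NameNode.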
3. Machine allocation
There are 5 machines: node1, node2, node3, node4, node5
node1, node2, node3 run ZooKeeper
node3, node4, node5 run the internal JournalNodes
node1 and node2 run the active NameNode and the standby NameNode
4. Rework the existing cluster
(1). Delete the previously configured masters file on every machine in the cluster
rm /usr/hadoop-2.5.1/etc/hadoop/masters
(2). Delete the files generated by formatting and by earlier use on every machine in the cluster
rm -rf /opt/hadoop-2.5
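The two cleanup commands above have to run on every machine. A minimal sketch of doing that from one host is shown here; it assumes passwordless SSH as root to all five nodes, and the helper name clean_host and the DRY_RUN switch are hypothetical, introduced only for illustration. By default it only prints what it would run.

```shell
# Sketch: run the step 4 cleanup on every node over SSH.
# Assumption: root can SSH to each node without a password.
NODES="node1 node2 node3 node4 node5"
MASTERS_FILE="/usr/hadoop-2.5.1/etc/hadoop/masters"
DATA_DIR="/opt/hadoop-2.5"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

clean_host() {
    # $1 is the host to clean (hypothetical helper).
    if [ "$DRY_RUN" = "1" ]; then
        echo "ssh root@$1 \"rm -f $MASTERS_FILE; rm -rf $DATA_DIR\""
    else
        ssh "root@$1" "rm -f $MASTERS_FILE; rm -rf $DATA_DIR"
    fi
}

for host in $NODES; do
    clean_host "$host"
done
```

Review the printed commands first, then rerun with DRY_RUN=0 to actually delete the files.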
5. Edit the Hadoop configuration file hdfs-site.xml on node1
vi /usr/hadoop-2.5.1/etc/hadoop/hdfs-site.xml
Delete the existing entries inside configuration and write the following configuration:
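# custom name for dfs.nameservices
# the part after the dot is the dfs.nameservices value; the same applies to the entries below
# active NameNode: RPC (file transfer) port
# standby NameNode
# external web monitoring port
# JournalNode hosts
# used by clients to connect
# locally generated private key; see the earlier post on passwordless login
# directory where the shared metadata (edits) is written
# enable automatic failover through ZooKeeper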
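The comments above describe each entry, but the property values themselves are not shown. As a rough sketch of what such a file could look like, the script below writes a candidate hdfs-site.xml into a scratch directory for review. The property names are the standard HDFS HA ones; the nameservice name mycluster, the NameNode IDs nn1 and nn2, the ports (8020, 50070, 8485 are Hadoop 2.x defaults), the key path /root/.ssh/id_rsa and the directory /opt/journalnode/data are all assumptions, not values from the original post.

```shell
# Sketch: write a candidate hdfs-site.xml to a scratch directory.
# All values marked "assumed" are illustrative and must be adapted.
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"

cat > "$OUT_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- custom nameservice name (assumed: mycluster) -->
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <!-- NameNode IDs inside the nameservice (assumed: nn1, nn2) -->
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <!-- RPC addresses of the two NameNodes -->
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>node1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>node2:8020</value>
  </property>
  <!-- web UI (external monitoring) addresses -->
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>node1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>node2:50070</value>
  </property>
  <!-- JournalNode quorum that stores the shared edit log -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://node3:8485;node4:8485;node5:8485/mycluster</value>
  </property>
  <!-- class clients use to find the active NameNode -->
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- SSH fencing with a local private key (path assumed) -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <!-- local directory where each JournalNode writes edits (assumed) -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/journalnode/data</value>
  </property>
  <!-- let ZooKeeper drive automatic failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
EOF

echo "wrote $OUT_DIR/hdfs-site.xml"
```

Writing to a scratch directory first keeps the live configuration untouched until the values have been checked against the actual cluster.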
6. Edit the Hadoop configuration file core-site.xml on node1
vi /usr/hadoop-2.5.1/etc/hadoop/core-site.xml
Delete the existing entries inside configuration and write the following configuration:
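The core-site.xml content is not shown either. A hedged sketch of the entries an HA setup of this shape typically needs is given below, again written to a scratch directory. The nameservice name mycluster and the ZooKeeper client port 2181 are assumptions; /opt/hadoop-2.5 as hadoop.tmp.dir is inferred from the directory that the other steps delete and copy, not stated outright in the post.

```shell
# Sketch: write a candidate core-site.xml to a scratch directory.
# Assumed: nameservice "mycluster", ZooKeeper on port 2181.
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"

cat > "$OUT_DIR/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- clients address the logical nameservice, not a single host -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <!-- ZooKeeper ensemble used by the failover controllers -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>node1:2181,node2:2181,node3:2181</value>
  </property>
  <!-- base directory for HDFS data (inferred from the other steps) -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-2.5</value>
  </property>
</configuration>
EOF

echo "wrote $OUT_DIR/core-site.xml"
```

The value of fs.defaultFS has to match the nameservice name chosen in hdfs-site.xml.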
7. Copy the configuration files from node1 to all the other machines (node2 shown here; repeat for node3, node4 and node5)
scp /usr/hadoop-2.5.1/etc/hadoop/* root@node2:/usr/hadoop-2.5.1/etc/hadoop/
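Since the same copy is needed for every other node, it can be wrapped in a loop. This is a small sketch under the assumption that root can scp to each host without a password; the helper name copy_conf and the DRY_RUN switch are hypothetical. By default it only prints the commands.

```shell
# Sketch: push node1's Hadoop configuration to the other nodes.
CONF_DIR="/usr/hadoop-2.5.1/etc/hadoop"
TARGETS="node2 node3 node4 node5"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

copy_conf() {
    # $1 is the destination host (hypothetical helper).
    if [ "$DRY_RUN" = "1" ]; then
        echo "scp $CONF_DIR/* root@$1:$CONF_DIR/"
    else
        scp "$CONF_DIR"/* "root@$1:$CONF_DIR/"
    fi
}

for host in $TARGETS; do
    copy_conf "$host"
done
```

Run it on node1 with DRY_RUN=0 once the printed commands look right.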
8. Start the JournalNodes on node3, node4 and node5; run the following on each of the three machines
hadoop-daemon.sh start journalnode
9. Check that each JournalNode started successfully: look at its log, and treat the start as successful if there are no errors
tail -300 /usr/hadoop-2.5.1/logs/hadoop-root-journalnode-node5.log
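Reading 300 lines by eye is easy to get wrong, so the check can be narrowed to the lines that matter. The sketch below assumes the usual Hadoop log format, where problems show up as ERROR or FATAL levels or as Java exception traces; the helper name log_is_clean is hypothetical.

```shell
# Sketch: report whether a Hadoop daemon log contains error lines.
LOG="${LOG:-/usr/hadoop-2.5.1/logs/hadoop-root-journalnode-node5.log}"

log_is_clean() {
    # Returns 0 when the file has no ERROR, FATAL or Exception lines.
    if grep -E -q 'ERROR|FATAL|Exception' "$1"; then
        return 1
    fi
    return 0
}

if [ -f "$LOG" ]; then
    if log_is_clean "$LOG"; then
        echo "no errors found in $LOG"
    else
        # Show the offending lines with their line numbers.
        grep -E -n 'ERROR|FATAL|Exception' "$LOG"
    fi
else
    echo "log file not found: $LOG"
fi
```

Set LOG to the journalnode log of the machine being checked; the file name ends with that machine's host name.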
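10. Format the NameNode on node1 and sync the resulting metadata to the NameNode on node2
(1). Run on node1
hdfs namenode -format
If this throws an exception, check that the firewall is turned off on all the other nodes: service iptables stop
(2). Copy the formatted metadata to the NameNode on node2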
scp -r /opt/hadoop-2.5/ root@node2:/opt
11. Initialize the HDFS high-availability state in the ZooKeeper cluster; run on node1
hdfs zkfc -formatZK
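One way to confirm that the format step took effect is to look for the znode it creates under /hadoop-ha. The sketch below assumes zkCli.sh is on the PATH, that ZooKeeper listens on node1:2181, and that the nameservice is called mycluster; all three are assumptions, and the helper name ha_znode_present is hypothetical.

```shell
# Sketch: check that hdfs zkfc -formatZK created /hadoop-ha/<nameservice>.
ZKCLI="${ZKCLI:-zkCli.sh}"
ZK_SERVER="${ZK_SERVER:-node1:2181}"
NAMESERVICE="${NAMESERVICE:-mycluster}"

ha_znode_present() {
    # zkCli.sh prints the children of a znode as a bracketed list,
    # so look for the nameservice name inside brackets.
    $ZKCLI -server "$ZK_SERVER" ls /hadoop-ha 2>/dev/null \
        | grep -q "\[.*$NAMESERVICE.*]"
}

if command -v "$ZKCLI" >/dev/null 2>&1; then
    if ha_znode_present; then
        echo "/hadoop-ha/$NAMESERVICE exists"
    else
        echo "/hadoop-ha/$NAMESERVICE not found"
    fi
else
    echo "$ZKCLI not on PATH; skipping check"
fi
```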
12. Start the cluster
start-dfs.sh
To stop the cluster, use stop-dfs.sh
[root@node1 hadoop]# start-dfs.sh
Starting namenodes on [node1 node2]    (starts the active and standby NameNodes)
node2: starting namenode, logging to /usr/hadoop-2.5.1/logs/hadoop-root-namenode-node2.out
node1: starting namenode, logging to /usr/hadoop-2.5.1/logs/hadoop-root-namenode-node1.out
192.168.108.15: starting datanode, logging to /usr/hadoop-2.5.1/logs/hadoop-root-datanode-node5.out    (starts the DataNodes)
192.168.108.14: starting datanode, logging to /usr/hadoop-2.5.1/logs/hadoop-root-datanode-node4.out
192.168.108.13: starting datanode, logging to /usr/hadoop-2.5.1/logs/hadoop-root-datanode-node3.out
Starting journal nodes [192.168.108.13 192.168.108.14 192.168.108.15]
192.168.108.15: starting journalnode, logging to /usr/hadoop-2.5.1/logs/hadoop-root-journalnode-node5.out    (starts the JournalNodes)
192.168.108.13: starting journalnode, logging to /usr/hadoop-2.5.1/logs/hadoop-root-journalnode-node3.out
192.168.108.14: starting journalnode, logging to /usr/hadoop-2.5.1/logs/hadoop-root-journalnode-node4.out
Starting ZK Failover Controllers on NN hosts [node1 node2]    (starts the failover controllers, which connect to the ZooKeeper cluster)
node2: starting zkfc, logging to /usr/hadoop-2.5.1/logs/hadoop-root-zkfc-node2.out
node1: starting zkfc, logging to /usr/hadoop-2.5.1/logs/hadoop-root-zkfc-node1.out
13. Command to start a single NameNode
hadoop-daemon.sh start namenode
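14. Verification
Open the NameNode web UI on the external monitoring port. If you kill the NameNode process on node1, the cluster fails over to the NameNode on node2, which becomes active.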
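The failover can also be checked from the command line with hdfs haadmin -getServiceState, which prints active or standby for a given NameNode ID. The sketch below assumes the NameNode IDs are nn1 (node1) and nn2 (node2), which depends on what was configured in hdfs-site.xml; the helper name active_namenode is hypothetical.

```shell
# Sketch: print the ID of the NameNode that is currently active.
# Assumption: the configured NameNode IDs are nn1 and nn2.
HDFS_CMD="${HDFS_CMD:-hdfs}"
NN_IDS="${NN_IDS:-nn1 nn2}"

active_namenode() {
    for nn in $NN_IDS; do
        state=$($HDFS_CMD haadmin -getServiceState "$nn" 2>/dev/null)
        if [ "$state" = "active" ]; then
            echo "$nn"
            return 0
        fi
    done
    return 1
}

if command -v "$HDFS_CMD" >/dev/null 2>&1; then
    active_namenode || echo "no active NameNode found"
else
    echo "$HDFS_CMD not on PATH; skipping check"
fi
```

To exercise the failover, run the script once, find the NameNode process ID on node1 with jps and kill it, then run the script again after a few seconds; the reported ID should change from the node1 NameNode to the node2 one.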