1. ZooKeeper cluster installation
(This is a cluster I have personally set up successfully. There are many details involved and I may not have captured every one of them here; if you run into problems, feel free to leave a comment.)
Prerequisite: prepare three CentOS 7.0 virtual machines.
(1) First, set each VM's network adapter to NAT mode, give each VM a static IP address (in the VM's network settings), and set each machine's hostname (/etc/hostname). Finally, write the IP-to-hostname mappings below into /etc/hosts on every virtual machine; note that this must be done on every machine. The details are as follows. (This article focuses on building the cluster; if you are unsure how to do any of the preparation above, please look it up yourself. Make sure this step is fully completed, otherwise many problems will appear later.)
192.168.70.103 node1
192.168.70.104 node2
192.168.70.105 node3
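For example, on the machine that will become node1, the hostname and the hosts entries can be set roughly like this (a minimal sketch for CentOS 7; run the matching hostnamectl command on each machine):
hostnamectl set-hostname node1
cat >> /etc/hosts <<EOF
192.168.70.103 node1
192.168.70.104 node2
192.168.70.105 node3
EOF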
(2) Download and extract ZooKeeper
Log in to node1, change to the /opt directory, and run the following commands:
[root@node1 opt]$ wget http://apache.fayea.com/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz
[root@node1 opt]$ tar -zxvf zookeeper-3.4.10.tar.gz
[root@node1 opt]$ chmod +wxr zookeeper-3.4.10
(3) Edit the ZooKeeper configuration file and create the data and log directories
[root@node1 opt]$ cd zookeeper-3.4.10
[root@node1 zookeeper-3.4.10]$ mkdir data
[root@node1 zookeeper-3.4.10]$ mkdir logs
[root@node1 zookeeper-3.4.10]$ vi conf/zoo.cfg  (zoo.cfg does not exist by default; copy conf/zoo_sample.cfg to conf/zoo.cfg first, then set the fields below)
# example sakes.
dataDir=/opt/zookeeper-3.4.10/data
dataLogDir=/opt/zookeeper-3.4.10/logs
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
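For reference, a complete minimal zoo.cfg for this cluster might look as follows; the tickTime/initLimit/syncLimit/clientPort values are the defaults shipped in zoo_sample.cfg, so treat this as a sketch rather than the exact file used above:
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/opt/zookeeper-3.4.10/data
dataLogDir=/opt/zookeeper-3.4.10/logs
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888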
[root@node1 zookeeper-3.4.10]$ cd data
[root@node1 data]$ vi myid
1
(4) Copy the zookeeper-3.4.10 directory from node1 to node2 and node3
[root@node1 opt]$ scp -r zookeeper-3.4.10 root@node2:/opt/zookeeper-3.4.10
[root@node1 opt]$ scp -r zookeeper-3.4.10 root@node3:/opt/zookeeper-3.4.10
(5) On node2 and node3, change the value in myid to 2 and 3 respectively
[root@node2 zookeeper-3.4.10]$ vi data/myid
2
[root@node3 zookeeper-3.4.10]$ vi data/myid
3
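Equivalently, the myid files can be written non-interactively instead of via vi, for example:
[root@node1 zookeeper-3.4.10]$ echo 1 > data/myid
[root@node2 zookeeper-3.4.10]$ echo 2 > data/myid
[root@node3 zookeeper-3.4.10]$ echo 3 > data/myid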
(6) Start ZooKeeper on node1, node2, and node3
[root@node1 zookeeper-3.4.10]$ bin/zkServer.sh start
[root@node2 zookeeper-3.4.10]$ bin/zkServer.sh start
[root@node3 zookeeper-3.4.10]$ bin/zkServer.sh start
(7) Check the ZooKeeper status (make sure ZooKeeper has been started on all three VMs before checking; the "status" here means each server's role in the ensemble, i.e. leader or follower)
[root@node1 zookeeper-3.4.10]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
[root@node2 zookeeper-3.4.10]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: leader
[root@node3 zookeeper-3.4.10]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
(8) Verify the ZooKeeper cluster
[root@node1 zookeeper-3.4.10]$ bin/zkCli.sh
Connecting to c7003:2181
2017-04-02 03:06:12,251 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
2017-04-02 03:06:12,257 [myid:] - INFO [main:Environment@100] - Client environment:host.name=c7003.ambari.apache.org
2017-04-02 03:06:12,257 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.8.0_121
2017-04-02 03:06:12,260 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2017-04-02 03:06:12,260 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/opt/jdk1.8.0_121/jre
2017-04-02 03:06:12,260 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/opt/zookeeper-3.4.10/bin/../build/classes:/opt/zookeeper-3.4.10/bin/../build/lib/*.jar:/opt/zookeeper-3.4.10/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/zookeeper-3.4.10/bin/../lib/slf4j-api-1.6.1.jar:/opt/zookeeper-3.4.10/bin/../lib/netty-3.10.5.Final.jar:/opt/zookeeper-3.4.10/bin/../lib/log4j-1.2.16.jar:/opt/zookeeper-3.4.10/bin/../lib/jline-0.9.94.jar:/opt/zookeeper-3.4.10/bin/../zookeeper-3.4.10.jar:/opt/zookeeper-3.4.10/bin/../src/java/lib/*.jar:/opt/zookeeper-3.4.10/bin/../conf:
2017-04-02 03:06:12,260 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2017-04-02 03:06:12,260 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2017-04-02 03:06:12,260 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=
2017-04-02 03:06:12,260 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
2017-04-02 03:06:12,260 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
2017-04-02 03:06:12,261 [myid:] - INFO [main:Environment@100] - Client environment:os.version=4.1.12-32.el7uek.x86_64
2017-04-02 03:06:12,261 [myid:] - INFO [main:Environment@100] - Client environment:user.name=vagrant
2017-04-02 03:06:12,261 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/home/vagrant
2017-04-02 03:06:12,261 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/opt/zookeeper-3.4.10
2017-04-02 03:06:12,262 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=c7003:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@506c589e
Welcome to ZooKeeper!
ls /
[zookeeper, zk_test]
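To further confirm that the ensemble replicates data, you can create a test znode on one server and read it from another (a minimal sketch; the /zk_test node listed above was presumably created the same way). Inside zkCli.sh on node1:
create /zk_test "hello"
Then on node2:
[root@node2 zookeeper-3.4.10]$ bin/zkCli.sh -server node2:2181
ls /
get /zk_test
If node2 lists zk_test and returns "hello", replication is working.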
At this point the ZooKeeper cluster installation is complete!
2. Hadoop HA distributed cluster setup
Overview
In Hadoop 2 there can be more than one NameNode (currently only two are supported), each with the same responsibilities. One is in the active state and the other in standby. While the cluster is running, only the active NameNode does the actual work; the standby NameNode waits on standby and continuously synchronizes the active NameNode's data. If the active NameNode fails, the standby NameNode can switch to active and keep the cluster working.
The metadata of the two NameNodes is effectively shared in real time. The new HDFS uses a shared-storage mechanism for this: either a Quorum Journal Node (JournalNode) cluster or a Network File System (NFS). NFS works at the operating-system level, while JournalNodes work at the Hadoop level; here we use a JournalNode cluster for the shared edits (which is also the mainstream approach).
To keep their data in sync, the two NameNodes communicate through a group of independent processes called JournalNodes. Whenever the active NameNode's namespace changes, it records the change on a majority of the JournalNodes. The standby NameNode reads the changes from the JournalNodes, constantly monitoring the edit log and applying the changes to its own namespace. This ensures that when the cluster fails over, the standby's namespace is already fully synchronized.
For an HA cluster it is critical that only one NameNode is active at any given time; otherwise the two NameNodes would diverge and data could be lost or corrupted. This is where ZooKeeper comes in: both NameNodes register with ZooKeeper, and when the active NameNode fails, ZooKeeper detects it and automatically switches the standby NameNode to active.
Hadoop HA covers both HDFS HA and YARN HA; the setup of both components is described below.
Environment:
OS: CentOS 7.0
Hadoop: 2.8.0
ZooKeeper: 3.4.10
Five virtual machines, with services deployed as follows:
master1
ip: 192.168.70.101
Installed software: Hadoop (HA)
Running processes: NameNode, ResourceManager, DFSZKFailoverController
master2
ip: 192.168.70.102
Installed software: Hadoop (HA)
Running processes: NameNode, ResourceManager, DFSZKFailoverController
node1
ip: 192.168.70.103
Installed software: Hadoop, ZooKeeper
Running processes: DataNode, NodeManager, QuorumPeerMain, JournalNode
node2
ip: 192.168.70.104
Installed software: Hadoop, ZooKeeper
Running processes: DataNode, NodeManager, QuorumPeerMain, JournalNode
node3
ip: 192.168.70.105
Installed software: Hadoop, ZooKeeper
Running processes: DataNode, NodeManager, QuorumPeerMain, JournalNode
(1) Preparation <*** this is where most problems occur; if you run into issues, search online or leave a comment ***>
Create two more virtual machines and configure their hostnames and static IPs as follows, the same way as the hosts used for the ZooKeeper setup above:
192.168.70.101 master1
192.168.70.102 master2
On master1, edit the /etc/hosts file as follows (the same mappings should also be present on master2, node1, node2, and node3):
192.168.70.101 master1
192.168.70.102 master2
192.168.70.103 node1
192.168.70.104 node2
192.168.70.105 node3
Passwordless SSH login
In the home directory on master1, run: ssh-keygen -t rsa
and press Enter at every prompt.
Then run:
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master2
ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2
ssh-copy-id -i ~/.ssh/id_rsa.pub root@node3
This gives master1 passwordless login to the other virtual machines.
To enable passwordless login between any pair of master1, master2, node1, node2, and node3, repeat the steps above on each machine.
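A quick sanity check from master1 (each command should print the remote hostname without prompting for a password):
for h in master1 master2 node1 node2 node3; do ssh $h hostname; done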
(2) Install the JDK under /opt on master1, master2, node1, node2, and node3, and set the environment variables
cd /opt/
wget -O jdk-8u121-linux-x64.tar.gz "http://download.oracle.com/otn-pub/java/jdk/8u121-b13/e9e7ea248e2c4826b92b3f075a80e441/jdk-8u121-linux-x64.tar.gz?AuthParam=1491205869_4d911aca9d38a4b869d2a6ecaa9bbf47"
tar zxvf jdk-8u121-linux-x64.tar.gz
vi ~/.bash_profile
export JAVA_HOME=/opt/jdk1.8.0_121
export PATH=$PATH:$JAVA_HOME/bin
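After editing ~/.bash_profile, reload it and verify that the JDK is on the PATH:
source ~/.bash_profile
java -version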
(3) Install the Hadoop cluster
Download and extract Hadoop.
In the /opt directory on master1, master2, node1, node2, and node3, run the following command:
wget http://219.238.4.196/files/705200000559DFDC/apache.communilink.net/hadoop/common/hadoop-2.8.0/hadoop-2.8.0.tar.gz
Then extract Hadoop on each machine:
tar zxvf hadoop-2.8.0.tar.gz
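Optionally, you can sanity-check the extracted distribution by printing its version (this assumes the JAVA_HOME set in the previous step is already in effect):
cd /opt/hadoop-2.8.0
bin/hadoop version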
(4) On master1, edit the Hadoop configuration files under /opt/hadoop-2.8.0/etc/hadoop. Seven files need changes: core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, hadoop-env.sh, mapred-env.sh, and yarn-env.sh.
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://bdcluster</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-2.8.0/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>node1:2181,node2:2181,node3:2181</value>
  </property>
  <property>
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>3000</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>bdcluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.bdcluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.bdcluster.nn1</name>
    <value>master1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.bdcluster.nn2</name>
    <value>master2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.bdcluster.nn1</name>
    <value>master1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.bdcluster.nn2</name>
    <value>master2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://node1:8485;node2:8485;node3:8485/bdcluster</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/hadoop-2.8.0/tmp/journal</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.bdcluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///opt/hadoop-2.8.0/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///opt/hadoop-2.8.0/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>0.0.0.0:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>0.0.0.0:19888</value>
  </property>
</configuration>
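Note: the stock Hadoop 2.8.0 distribution ships only etc/hadoop/mapred-site.xml.template; if mapred-site.xml does not exist yet, copy the template before editing it:
cp /opt/hadoop-2.8.0/etc/hadoop/mapred-site.xml.template /opt/hadoop-2.8.0/etc/hadoop/mapred-site.xml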
yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>master1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>master2</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>node1:2181,node2:2181,node3:2181</value>
  </property>
  <!-- ZooKeeper connection addresses -->
  <property>
    <name>yarn.resourcemanager.zk-state-store.address</name>
    <value>node1:2181,node2:2181,node3:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>node1:2181,node2:2181,node3:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
    <value>/yarn-leader-election</value>
    <description>Optional setting. The default value is /yarn-leader-election</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
hadoop-env.sh, mapred-env.sh, and yarn-env.sh (add the following to all three)
export JAVA_HOME=/opt/jdk1.8.0_121
export CLASS_PATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export HADOOP_HOME=/opt/hadoop-2.8.0
export HADOOP_PID_DIR=/opt/hadoop-2.8.0/pids
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_PREFIX=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
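One file not shown in the list above is worth a note: sbin/hadoop-daemons.sh and sbin/start-yarn.sh start worker daemons on every host listed in etc/hadoop/slaves. Assuming the DataNodes, NodeManagers, and JournalNodes all run on node1-node3 as in the environment table, that file would simply contain:
node1
node2
node3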
(5) Run the following commands to sync the configuration files edited on master1 to master2, node1, node2, and node3:
scp -r /opt/hadoop-2.8.0/etc/hadoop root@master2:/opt/hadoop-2.8.0/etc/
scp -r /opt/hadoop-2.8.0/etc/hadoop root@node1:/opt/hadoop-2.8.0/etc/
scp -r /opt/hadoop-2.8.0/etc/hadoop root@node2:/opt/hadoop-2.8.0/etc/
scp -r /opt/hadoop-2.8.0/etc/hadoop root@node3:/opt/hadoop-2.8.0/etc/
At this point, all of the Hadoop configuration files are in place.
(6) Start the ZooKeeper cluster
Start ZooKeeper by running the following command on node1, node2, and node3:
[root@node1 bin]$ sh zkServer.sh start
Then verify that the ZooKeeper cluster is up by running the following command on node1, node2, and node3; if the cluster started successfully, there will be two followers and one leader:
[root@node1 bin]$ sh zkServer.sh status
JMX enabled by default
Using config: /opt/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
(7) Start the JournalNode cluster
On master1, run the following command to start the JournalNodes:
[root@master1 hadoop-2.8.0]$ sbin/hadoop-daemons.sh start journalnode
Run jps on node1, node2, and node3; you should see a JournalNode Java process on each.
(8) Format ZKFC to create the HA znode in ZooKeeper
Run the following command on master1 to perform the formatting:
hdfs zkfc -formatZK
After the formatting succeeds, you can see the following in ZooKeeper:
[zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha
[bdcluster]
(9) Format HDFS
Run the following command on master1 (the JournalNodes started in step (7) must be running):
hdfs namenode -format
(10) Start the NameNodes
First start the NameNode that will become active; on master1, run:
[root@master1 hadoop-2.8.0]$ sbin/hadoop-daemon.sh start namenode
On master2, sync the NameNode metadata and start the standby NameNode with the following commands:
# sync the NameNode metadata to master2
[root@master2 hadoop-2.8.0]$ bin/hdfs namenode -bootstrapStandby
# start the NameNode on master2 as the standby
[root@master2 hadoop-2.8.0]$ sbin/hadoop-daemon.sh start namenode
(11) Start the DataNodes
Run the following command on master1:
[root@master1 hadoop-2.8.0]$ sbin/hadoop-daemons.sh start datanode
(12) Start YARN
Start it on the machine acting as the ResourceManager, which here is master1; run the following command:
[root@master1 hadoop-2.8.0]$ sbin/start-yarn.sh
(13) Start ZKFC
The ZKFC daemon must run on both NameNode hosts, so start it on master1 and master2 (note the singular hadoop-daemon.sh here; the plural hadoop-daemons.sh would start it on the slave hosts instead):
[root@master1 hadoop-2.8.0]$ sbin/hadoop-daemon.sh start zkfc
[root@master2 hadoop-2.8.0]$ sbin/hadoop-daemon.sh start zkfc
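Once the ZKFC daemons are running, one of the two NameNodes should become active. A quick way to check is the HA admin command (a sketch using the nn1/nn2 IDs defined in hdfs-site.xml; the web UIs at http://master1:50070 and http://master2:50070 show the same state):
[root@master1 hadoop-2.8.0]$ bin/hdfs haadmin -getServiceState nn1
[root@master1 hadoop-2.8.0]$ bin/hdfs haadmin -getServiceState nn2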
(14) Once everything is started, running jps on master1, master2, node1, node2, and node3 should show exactly the processes listed in the environment overview above: NameNode, ResourceManager, and DFSZKFailoverController on master1 and master2; DataNode, NodeManager, QuorumPeerMain, and JournalNode on node1, node2, and node3.
(15) ResourceManager HA
After the NameNode HA steps you will find that only one ResourceManager (master1 here) has been started; the ResourceManager on the other node (master2) must be started manually:
[root@master2 hadoop-2.8.0]$ sbin/yarn-daemon.sh start resourcemanager
Then check the ResourceManager state with the following commands:
yarn rmadmin -getServiceState rm1
The result shows rm1 as active,
yarn rmadmin -getServiceState rm2
while rm2 is standby.
Verifying the HA behaviour works the same way as for the NameNodes: kill the active ResourceManager and the standby ResourceManager will switch to active.
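A minimal failover test, assuming rm1 on master1 is currently active (kill -9 simulates a crash; take the ResourceManager pid from the jps output):
[root@master1 hadoop-2.8.0]$ jps | grep ResourceManager
[root@master1 hadoop-2.8.0]$ kill -9 <ResourceManager pid>
[root@master1 hadoop-2.8.0]$ yarn rmadmin -getServiceState rm2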
There is also a command that can force a transition (with automatic failover enabled, it may be rejected unless --forcemanual is added):
yarn rmadmin -transitionToStandby rm1
Note: on master2 be sure to set yarn.resourcemanager.ha.id to rm2 in yarn-site.xml:
<property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>rm2</value>
  <description>If we want to launch more than one RM in single node, we need this configuration</description>
</property>
On master1 this property is configured as rm1, while on master2 it must be changed to rm2; if it is not changed, the ResourceManager on master2 will not start.