Please credit the source when reposting: http://blog.csdn.net/l1028386804/article/details/51353063
Cluster plan:

Hostname        IP             Installed software       Running processes
liuyazhuang01   192.168.1.201  jdk, hadoop              NameNode, DFSZKFailoverController
liuyazhuang02   192.168.1.202  jdk, hadoop              NameNode, DFSZKFailoverController
liuyazhuang03   192.168.1.203  jdk, hadoop              ResourceManager
liuyazhuang04   192.168.1.204  jdk, hadoop, zookeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
liuyazhuang05   192.168.1.205  jdk, hadoop, zookeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
liuyazhuang06   192.168.1.206  jdk, hadoop, zookeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
In hadoop 2.0, HDFS HA normally consists of two NameNodes: one in active state and one in standby state. The active NameNode serves all client requests, while the standby NameNode serves none; it only synchronizes the active NameNode's state so that it can take over quickly if the active one fails.
hadoop 2.0 officially provides two HDFS HA solutions: NFS and QJM. Here we use the simpler QJM. In this scheme, the active and standby NameNodes share edit-log metadata through a group of JournalNodes; a write is considered successful once it reaches a majority of the JournalNodes. An odd number of JournalNodes is therefore usually configured.
A ZooKeeper ensemble is also configured here for ZKFC (DFSZKFailoverController) automatic failover: when the active NameNode goes down, the standby NameNode is automatically promoted to active.
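Once the whole cluster is up, you can confirm from the command line which NameNode holds which role. A quick check with hdfs haadmin, using the nameservice and NameNode IDs (ns1, nn1, nn2) configured later in this post:

#Prints "active" or "standby" for each NameNode of nameservice ns1
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2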
Install and configure the ZooKeeper cluster (unpack on liuyazhuang04):
tar -zxvf zookeeper-3.4.5.tar.gz -C /liuyazhuang/
cd /liuyazhuang/zookeeper-3.4.5/conf/
cp zoo_sample.cfg zoo.cfg
vim zoo.cfg
#Change: dataDir=/liuyazhuang/zookeeper-3.4.5/tmp
#Append at the end:
server.1=liuyazhuang04:2888:3888
server.2=liuyazhuang05:2888:3888
server.3=liuyazhuang06:2888:3888
#Save and exit, then create the tmp directory
mkdir /liuyazhuang/zookeeper-3.4.5/tmp
#Create an empty myid file
touch /liuyazhuang/zookeeper-3.4.5/tmp/myid
#Finally, write this node's ID into it
echo 1 > /liuyazhuang/zookeeper-3.4.5/tmp/myid
Copy the configured zookeeper to the other nodes (first create a /liuyazhuang directory on liuyazhuang05 and liuyazhuang06: mkdir /liuyazhuang):
scp -r /liuyazhuang/zookeeper-3.4.5/ liuyazhuang05:/liuyazhuang/
scp -r /liuyazhuang/zookeeper-3.4.5/ liuyazhuang06:/liuyazhuang/
#Note: change the content of /liuyazhuang/zookeeper-3.4.5/tmp/myid on liuyazhuang05 and liuyazhuang06 accordingly
#liuyazhuang05:
echo 2 > /liuyazhuang/zookeeper-3.4.5/tmp/myid
#liuyazhuang06:
echo 3 > /liuyazhuang/zookeeper-3.4.5/tmp/myid
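Before moving on, a quick sanity check that each node got the right ID is worthwhile; a duplicated or mismatched myid is the most common reason a ZooKeeper ensemble fails to form a quorum. Run on each of the three nodes:

#Should print 1 on liuyazhuang04, 2 on liuyazhuang05, 3 on liuyazhuang06
cat /liuyazhuang/zookeeper-3.4.5/tmp/myid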
Install and configure the hadoop cluster (unpack on liuyazhuang01):
tar -zxvf hadoop-2.2.0.tar.gz -C /liuyazhuang/
#Add hadoop to the environment variables
vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_55
export HADOOP_HOME=/liuyazhuang/hadoop-2.2.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
#All of hadoop 2.0's configuration files live under $HADOOP_HOME/etc/hadoop
cd /liuyazhuang/hadoop-2.2.0/etc/hadoop
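To make the new variables take effect in the current shell and confirm they point at the right install:

#Reload the profile, then verify PATH picks up the hadoop binaries
source /etc/profile
echo $HADOOP_HOME
hadoop version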
Modify hadoop-env.sh:
export JAVA_HOME=/usr/java/jdk1.7.0_55
Modify core-site.xml:

<configuration>
    <!-- Set the HDFS nameservice to ns1 -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1</value>
    </property>
    <!-- Hadoop temp directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/liuyazhuang/hadoop-2.2.0/tmp</value>
    </property>
    <!-- ZooKeeper ensemble address -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>liuyazhuang04:2181,liuyazhuang05:2181,liuyazhuang06:2181</value>
    </property>
</configuration>
Modify hdfs-site.xml:

<configuration>
    <!-- The HDFS nameservice is ns1; must match core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <!-- ns1 has two NameNodes: nn1 and nn2 -->
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>liuyazhuang01:9000</value>
    </property>
    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>liuyazhuang01:50070</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>liuyazhuang02:9000</value>
    </property>
    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>liuyazhuang02:50070</value>
    </property>
    <!-- Where the NameNode metadata is stored on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://liuyazhuang04:8485;liuyazhuang05:8485;liuyazhuang06:8485/ns1</value>
    </property>
    <!-- Where each JournalNode stores its data on local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/liuyazhuang/hadoop-2.2.0/journal</value>
    </property>
    <!-- Enable automatic NameNode failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- Client-side proxy provider used to locate the active NameNode -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Fencing methods; multiple methods are separated by newlines, one per line -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>
    <!-- sshfence needs passwordless SSH; point it at the private key -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- sshfence connect timeout -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>
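With this in place, clients address the filesystem by the logical nameservice rather than by a specific NameNode host, so paths keep working across a failover. For example, once the cluster is running:

#The logical nameservice ns1 resolves to whichever NameNode is currently active
hadoop fs -ls hdfs://ns1/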
Modify mapred-site.xml:

<configuration>
    <!-- Run MapReduce on the YARN framework -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
Modify yarn-site.xml:

<configuration>
    <!-- ResourceManager address -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>liuyazhuang03</value>
    </property>
    <!-- Auxiliary service loaded by the NodeManagers: the MapReduce shuffle server -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
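After YARN is started later on, you can verify that all three NodeManagers registered with this ResourceManager:

#List the NodeManagers known to the ResourceManager; expect liuyazhuang04-06
yarn node -list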
Modify slaves (slaves specifies the worker nodes; because HDFS is started on liuyazhuang01 and YARN is started on liuyazhuang03, the slaves file on liuyazhuang01 specifies the DataNode locations while the slaves file on liuyazhuang03 specifies the NodeManager locations):
liuyazhuang04
liuyazhuang05
liuyazhuang06
#Configure passwordless SSH
#First, passwordless login from liuyazhuang01 to liuyazhuang02, liuyazhuang03, liuyazhuang04, liuyazhuang05 and liuyazhuang06
#Generate a key pair on liuyazhuang01
ssh-keygen -t rsa
#Copy the public key to every node, including liuyazhuang01 itself
ssh-copy-id liuyazhuang01
ssh-copy-id liuyazhuang02
ssh-copy-id liuyazhuang03
ssh-copy-id liuyazhuang04
ssh-copy-id liuyazhuang05
ssh-copy-id liuyazhuang06
#Next, passwordless login from liuyazhuang03 to liuyazhuang04, liuyazhuang05 and liuyazhuang06
#Generate a key pair on liuyazhuang03
ssh-keygen -t rsa
#Copy the public key to the other nodes
ssh-copy-id liuyazhuang04
ssh-copy-id liuyazhuang05
ssh-copy-id liuyazhuang06
#Note: the two NameNodes need passwordless SSH between them; don't forget liuyazhuang02 to liuyazhuang01
#Generate a key pair on liuyazhuang02
ssh-keygen -t rsa
ssh-copy-id -i liuyazhuang01
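A quick way to confirm the keys landed correctly (run on liuyazhuang01; each line should print the remote hostname without asking for a password):

for h in liuyazhuang02 liuyazhuang03 liuyazhuang04 liuyazhuang05 liuyazhuang06; do
    ssh $h hostname
done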
Copy the configured hadoop to the other nodes:
scp -r /liuyazhuang/ liuyazhuang02:/
scp -r /liuyazhuang/ liuyazhuang03:/
scp -r /liuyazhuang/hadoop-2.2.0/ root@liuyazhuang04:/liuyazhuang/
scp -r /liuyazhuang/hadoop-2.2.0/ root@liuyazhuang05:/liuyazhuang/
scp -r /liuyazhuang/hadoop-2.2.0/ root@liuyazhuang06:/liuyazhuang/
###Note: strictly follow the steps below in order
(Start ZooKeeper on liuyazhuang04, liuyazhuang05 and liuyazhuang06 respectively.)
cd /liuyazhuang/zookeeper-3.4.5/bin/
./zkServer.sh start
#Check the status: one leader and two followers
./zkServer.sh status
(Start all the JournalNodes from liuyazhuang01. Note: this invokes the hadoop-daemons.sh script, the one with the plural s.)
cd /liuyazhuang/hadoop-2.2.0
sbin/hadoop-daemons.sh start journalnode
#Run jps to verify: liuyazhuang04, liuyazhuang05 and liuyazhuang06 now each have a JournalNode process
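If you would rather not log in to each node, the same check can be driven from liuyazhuang01 over SSH (this assumes jps is on the PATH of the remote non-interactive shell; if not, run jps locally on each node):

for h in liuyazhuang04 liuyazhuang05 liuyazhuang06; do
    echo "== $h =="; ssh $h jps | grep JournalNode
done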
#Format HDFS. On liuyazhuang01, run:
hdfs namenode -format
#Formatting generates files under the directory set by hadoop.tmp.dir in core-site.xml,
#here /liuyazhuang/hadoop-2.2.0/tmp. Copy that directory to /liuyazhuang/hadoop-2.2.0/ on liuyazhuang02:
scp -r tmp/ liuyazhuang02:/liuyazhuang/hadoop-2.2.0/
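As an aside, Hadoop 2.x also lets the standby fetch this metadata itself instead of copying tmp by hand; a sketch of that alternative, assuming the JournalNodes are up and nn1 is started first:

#On liuyazhuang01, temporarily start the freshly formatted NameNode
sbin/hadoop-daemon.sh start namenode
#On liuyazhuang02, pull the metadata from nn1
hdfs namenode -bootstrapStandby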
#Format the ZKFC state in ZooKeeper (on liuyazhuang01):
hdfs zkfc -formatZK
#Start HDFS (on liuyazhuang01):
sbin/start-dfs.sh
(#####Note#####: start-yarn.sh is executed on liuyazhuang03. The NameNode and ResourceManager are separated for performance reasons: both consume a lot of resources, so they run on different machines and therefore have to be started on their respective machines.)
sbin/start-yarn.sh
Open a browser and visit:
http://192.168.1.201:50070
    NameNode 'liuyazhuang01:9000' (active)
http://192.168.1.202:50070
    NameNode 'liuyazhuang02:9000' (standby)
Verify HDFS HA. First upload a file to HDFS:
hadoop fs -put /etc/profile /profile
hadoop fs -ls /
Then kill the active NameNode:
kill -9 <pid of NN>
Visit http://192.168.1.202:50070 in a browser:
    NameNode 'liuyazhuang02:9000' (active)
The NameNode on liuyazhuang02 has now become active. Run:
hadoop fs -ls /
-rw-r--r--   3 root supergroup       1926 2014-02-06 15:36 /profile
The file uploaded earlier is still there!!!
Manually start the NameNode that was killed:
sbin/hadoop-daemon.sh start namenode
Visit http://192.168.1.201:50070 in a browser:
    NameNode 'liuyazhuang01:9000' (standby)
Verify YARN by running the WordCount program from the demos that ship with hadoop:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /profile /out
OK, all done!!!
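To look at the result once the job finishes (part-r-00000 is the default name of a single reducer's output file):

#Inspect the WordCount output
hadoop fs -ls /out
hadoop fs -cat /out/part-r-00000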