Hadoop Notes: Deploying a Hadoop Cluster (HA) + YARN


Prerequisites

The virtual machines are deployed and meet the following conditions:

  • Networking configured
  • UseDNS set to no (in /etc/ssh/sshd_config)
  • Firewall disabled
  • SELinux disabled
  • Passwordless SSH login set up
  • JDK installed
Role assignment (★ = the daemon runs on that node):

| Server | NN | DN | JNN | RM | NM | ZK | ZKFC |
|--------|----|----|-----|----|----|----|------|
| node1  | ★  | ★  | ★   |    | ★  |    | ★    |
| node2  | ★  | ★  | ★   |    | ★  | ★  | ★    |
| node3  |    | ★  | ★   | ★  | ★  | ★  |      |
| node4  |    | ★  |     | ★  | ★  | ★  |      |

Setting up the ZooKeeper cluster

Install on node2, node3, and node4.

a) Upload the installation package

Upload zookeeper-3.4.6.tar.gz to node2, node3, and node4.

b) Extract

tar -zxf zookeeper-3.4.6.tar.gz -C /opt

c) Configure environment variables

Append to the end of /etc/profile:

vim /etc/profile

export ZOOKEEPER_PREFIX=/opt/zookeeper-3.4.6
export PATH=$PATH:$ZOOKEEPER_PREFIX/bin

Copy it to the other machines, then source it so the change takes effect:

for node in node2 node3 node4;do scp /etc/profile $node:/etc/;done
# run on every machine
. /etc/profile
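A quick sanity check, a minimal sketch that mirrors the two export lines above, confirms the ZooKeeper bin directory actually landed on PATH:

```shell
# Sanity-check sketch: re-apply the exports from /etc/profile and
# confirm the ZooKeeper bin directory is on PATH.
export ZOOKEEPER_PREFIX=/opt/zookeeper-3.4.6
export PATH=$PATH:$ZOOKEEPER_PREFIX/bin
case ":$PATH:" in
  *":$ZOOKEEPER_PREFIX/bin:"*) echo "zookeeper bin on PATH" ;;
  *)                           echo "PATH not updated" ;;
esac
```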

d) zoo.cfg

Go to $ZOOKEEPER_PREFIX/conf.

Copy zoo_sample.cfg to zoo.cfg and open it:

#cd /opt/zookeeper-3.4.6/conf/
cp /opt/zookeeper-3.4.6/conf/zoo_sample.cfg  /opt/zookeeper-3.4.6/conf/zoo.cfg
vim /opt/zookeeper-3.4.6/conf/zoo.cfg

Edit zoo.cfg and add the following lines:

server.1=node2:2881:3881
server.2=node3:2881:3881
server.3=node4:2881:3881

Change the dataDir line to:

dataDir=/var/zookeeper/data

Copy the edited zoo.cfg to the other ZooKeeper nodes over the network:

for node in node2 node3 node4;do scp -r /opt/zookeeper-3.4.6/conf/zoo.cfg $node:/opt/zookeeper-3.4.6/conf;done

e) Create myid

Create the /var/zookeeper/data directory and place a file named myid in it.
Write each server's ZooKeeper id into its myid file (the number must match the server.N entries in zoo.cfg):

#node2
mkdir -p /var/zookeeper/data
echo 1 > /var/zookeeper/data/myid
#node3
mkdir -p /var/zookeeper/data
echo 2 > /var/zookeeper/data/myid
#node4
mkdir -p /var/zookeeper/data
echo 3 > /var/zookeeper/data/myid
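The numbering above has to stay in step with the server.N lines in zoo.cfg (server.1 = node2, and so on). As a sketch of that mapping, assuming a local scratch directory ./zk-scratch in place of /var/zookeeper/data on each host, the same ids can be derived from one list:

```shell
# Sketch: derive each node's myid from its index in the same list that
# produced the server.N lines in zoo.cfg (node2 -> 1, node3 -> 2,
# node4 -> 3). ./zk-scratch stands in for /var/zookeeper/data per host.
nodes=(node2 node3 node4)
for i in "${!nodes[@]}"; do
  dir="./zk-scratch/${nodes[$i]}"
  mkdir -p "$dir"
  echo $((i + 1)) > "$dir/myid"
done
```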

f) Test

Start/stop commands:

zkServer.sh start|stop|status

View the log (zookeeper.out is written to the directory from which zkServer.sh was started):

less zookeeper.out

Installing Hadoop

1. Upload the Hadoop installation package

2. Extract it on all machines

tar -zxf hadoop-2.6.5.tar.gz -C /opt

3. Configure global environment variables

vi /etc/profile

Add two lines:

export HADOOP_PREFIX=/opt/hadoop-2.6.5
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin

Copy it to the other machines:

for node in node2 node3 node4;do scp /etc/profile $node:/etc/;done

Run the following on every machine so the change takes effect:

source /etc/profile

Modifying the configuration files

1. Configure Hadoop's JAVA_HOME

Edit /opt/hadoop-2.6.5/etc/hadoop/hadoop-env.sh:

vi /opt/hadoop-2.6.5/etc/hadoop/hadoop-env.sh

Set:

export JAVA_HOME=/usr/java/jdk1.8.0_172-amd64

2. The slaves file (the hosts that will run DataNodes)

vi /opt/hadoop-2.6.5/etc/hadoop/slaves

Add:

node1
node2
node3
node4
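Since slaves is just one hostname per line, it can also be generated from a list to avoid typos; this is a sketch writing to a scratch path ./slaves rather than the real /opt/hadoop-2.6.5/etc/hadoop/slaves:

```shell
# Sketch: generate the slaves file (one DataNode hostname per line).
# ./slaves is a scratch path standing in for etc/hadoop/slaves.
printf '%s\n' node1 node2 node3 node4 > ./slaves
```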

3. core-site.xml

vim /opt/hadoop-2.6.5/etc/hadoop/core-site.xml

########## add ##############
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/bjsxt/hadoop/ha</value>
  </property>
  <!-- Location and client port of every ZooKeeper server -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>node2:2181,node3:2181,node4:2181</value>
  </property>

</configuration>

4. mapred-site.xml

If mapred-site.xml does not exist yet, copy it from mapred-site.xml.template in the same directory, then set:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

5. yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>

  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>rmhacluster1</value>
  </property>

  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>

  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>node3</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>node4</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>node3:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>node4:8088</value>
  </property>

  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>node2:2181,node3:2181,node4:2181</value>
  </property>
</configuration>

6. hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>

  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>

  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>

  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>node1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>node2:8020</value>
  </property>

  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://node1:8485;node2:8485;node3:8485/mycluster</value>
  </property>

  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>

  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>

  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_dsa</value>
  </property>

  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/var/hadoop/ha/jnn</value>
  </property>

  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>

Starting the cluster (initialization)

a) Start the JournalNodes

Start the three JournalNodes on node1, node2, and node3:

hadoop-daemon.sh start journalnode

b) Format HDFS

On either node1 or node2 (pick one), format HDFS:

hdfs namenode -format

c) Start the NameNode process (on the node you just formatted)

hadoop-daemon.sh start namenode

d) Synchronize metadata

On the other NameNode (whichever of node1/node2 was not formatted), synchronize the metadata:

hdfs namenode -bootstrapStandby

e) Initialize ZooKeeper

Start ZooKeeper (zkServer.sh start on node2, node3, and node4), then initialize the HA state in ZooKeeper:

hdfs zkfc -formatZK

f) Start the Hadoop cluster

This can be run from any of the four servers (node1 through node4):

start-dfs.sh

g) Start YARN

start-yarn.sh starts the ResourceManager on the local node plus the NodeManagers, but it does not start the standby ResourceManager, so node4's ResourceManager is started separately:

ssh node3 "source /etc/profile; start-yarn.sh"
ssh node4 "source /etc/profile; yarn-daemon.sh start resourcemanager"

Starting the cluster (after initialization)

Upload the following script and run it:

. start-ha.sh

############### start script #################
#!/bin/bash
for node in node2 node3 node4
do
  ssh $node "source /etc/profile; zkServer.sh start"
done

#sleep 3

start-dfs.sh

sleep 3

ssh node3 "source /etc/profile; start-yarn.sh"
ssh node4 "source /etc/profile; yarn-daemon.sh start resourcemanager"


echo "---------------node1-jps---------------"
jps

for node in node2 node3 node4
do 
  echo "----------------$node-jps-----------------------"
  ssh $node "source /etc/profile; jps"
done
############### start script #################

Stopping the cluster

Upload the following script and run it:

. stop-ha.sh

############### stop script #################

#!/bin/bash

ssh node3 "source /etc/profile; stop-yarn.sh"
ssh node4 "source /etc/profile; yarn-daemon.sh stop resourcemanager"

stop-dfs.sh

for node in node2 node3 node4
do
  ssh $node "source /etc/profile; zkServer.sh stop"
done

echo "---------------node1-jps---------------"
jps

for node in node2 node3 node4
do 
echo "----------------$node-jps-----------------------"
  ssh $node "source /etc/profile; jps"
done
############### stop script #################
