Building a Hadoop 2.6.5 / Spark 1.6.3 HA Cluster

For Spark-based big data development we use CentOS 6.5 and write the code in Scala with IntelliJ IDEA. The Spark cluster consists of 9 machines: 6 Workers that execute tasks, 2 Masters that manage cluster resources, and 1 Client that submits jobs. The Master and Worker machines may have different hardware; the details are as follows:
 Worker hardware:
 Memory: 1 GB
 Disk: 1 TB
 Network: 1 Gbit
 CPU: 1 core
 Master hardware:
 Memory: 2 GB
 Disk: 1 TB
 Network: 1 Gbit
 CPU: 1 core
 Software environment:
 OS: CentOS 6.5
 root password: 124077
 sparker user password: sparker

1 Deployment Plan

9 machines in total, HA architecture: 2 Masters, 6 Workers, 1 Client

1.1 Network plan:


Hostname: sparker001  IP: 192.168.1.101  Gateway: 192.168.1.254
Hostname: sparker002  IP: 192.168.1.102  Gateway: 192.168.1.254
Hostname: sparker003  IP: 192.168.1.103  Gateway: 192.168.1.254
Hostname: sparker004  IP: 192.168.1.104  Gateway: 192.168.1.254
Hostname: sparker005  IP: 192.168.1.105  Gateway: 192.168.1.254
Hostname: sparker006  IP: 192.168.1.106  Gateway: 192.168.1.254
Hostname: sparker007  IP: 192.168.1.107  Gateway: 192.168.1.254
Hostname: sparker008  IP: 192.168.1.108  Gateway: 192.168.1.254
Hostname: sparker009  IP: 192.168.1.109  Gateway: 192.168.1.254
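The plan above is regular (host sparker00N maps to IP 192.168.1.10N), so the /etc/hosts lines can be generated instead of typed by hand; a minimal sketch:

```shell
# Emit the /etc/hosts entries for the nine-node plan above;
# append the output to /etc/hosts on every machine.
for i in 1 2 3 4 5 6 7 8 9; do
  printf '192.168.1.10%s sparker00%s\n' "$i" "$i"
done
```

Run it once and paste the nine lines into /etc/hosts on each node.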


1.2 Set the hostname and configure hosts

(all nodes)

Set each machine's hostname, then edit /etc/hosts on every machine so that all nine hostname/IP pairs are listed:

    vi /etc/hosts




Verify by pinging each hostname to confirm the machines can reach one another:

2 Disable the Firewall (all nodes)

Check the firewall status:
    service iptables status

Permanently disable the firewall (takes effect after reboot):
    chkconfig iptables off


After rebooting, check the firewall status again:

    service iptables status


3 Disable SELinux (all nodes)

Check the SELinux status:
    getenforce

Change SELINUX=enforcing to SELINUX=disabled:

    vi /etc/selinux/config
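The same change can be made non-interactively with sed; a sketch run against a scratch copy of the file (on a real node, point it at /etc/selinux/config as root):

```shell
# Flip SELINUX=... to disabled in a scratch copy of the config.
cfg=/tmp/selinux-config-demo
printf 'SELINUXTYPE=targeted\nSELINUX=enforcing\n' > "$cfg"
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"   # only the SELINUX= line matches
grep '^SELINUX=' "$cfg"                           # prints: SELINUX=disabled
```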


After rebooting, check again with getenforce; it should report Disabled.


4 Passwordless SSH Login

4.1 Enable RSA key authentication (all nodes)

vim /etc/ssh/sshd_config

Uncomment the public-key options (typically RSAAuthentication, PubkeyAuthentication, and AuthorizedKeysFile), then restart sshd.


4.2 Generate the key pair (all nodes)

Note: generate the keys as the user that will run the cluster, so they end up under that user's home directory.

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

Options:
-t  key type (rsa)
-P  passphrase ('' means none, so there is no prompt)
-f  output file for the private key


cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
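The keygen/append/chmod sequence can be rehearsed in a throwaway directory before touching ~/.ssh; a sketch (the temporary directory is illustrative):

```shell
# Rehearse the key setup in a scratch directory instead of ~/.ssh.
dir=$(mktemp -d)
ssh-keygen -t rsa -P '' -f "$dir/id_rsa" -q      # -q: suppress banner output
cat "$dir/id_rsa.pub" >> "$dir/authorized_keys"  # authorize this key
chmod 0600 "$dir/authorized_keys"                # sshd rejects looser permissions
ls "$dir"                                        # authorized_keys  id_rsa  id_rsa.pub
```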


On the master, go into the .ssh directory and copy the key file to each worker:

cd ~/.ssh
scp authorized_keys 192.168.1.102:~/.ssh/authorized_keys_from_master



On each worker, go into the .ssh directory and append the master's key:

cat authorized_keys_from_master >> authorized_keys


Set the authorized_keys permissions on each worker (as before):

chmod 0600 ~/.ssh/authorized_keys


5 Install the JDK (all nodes)

5.1 Remove the OpenJDK bundled with CentOS 6.5

List the installed packages:

rpm -qa | grep jdk

Remove them:

rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64
rpm -e --nodeps tzdata-java-2013g-1.el6.noarch


5.2 Install the Oracle JDK

Download and unpack the tarball (omitted).
Append the following lines to the end of /etc/profile:
# houjingru 2017 07.09

export JAVA_HOME=/home/sparker/cluster/java/jdk1.7.0_80
export PATH=$PATH:/home/sparker/cluster/java/jdk1.7.0_80/bin
export CLASSPATH=.:/home/sparker/cluster/java/jdk1.7.0_80/jre/lib

Apply immediately:

source /etc/profile


6 Install ZooKeeper

6.1 Download and unpack (omitted)

6.2 Create zoo.cfg

cd zookeeper-3.4.10/conf
cp zoo_sample.cfg zoo.cfg

6.3 Edit the configuration

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/home/sparker/cluster/zookeeper-3.4.10/data
server.1=sparker001:2888:3888
server.2=sparker002:2888:3888
server.3=sparker009:2888:3888

# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1


6.4 Create the data directory

Create the data directory referenced by dataDir (/home/sparker/cluster/zookeeper-3.4.10/data), then create a file named myid inside it containing only this server's number from the server.N lines; on sparker001 that is 1.
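Since each node's myid must equal the N of its own server.N line, the id can be derived from zoo.cfg instead of remembered; a sketch demonstrated against a scratch copy of the config (the hostname is hard-coded for the demo; on a real node use $(hostname)):

```shell
# Derive this host's ZooKeeper id from zoo.cfg and write it to myid.
datadir=$(mktemp -d)
cat > "$datadir/zoo.cfg" <<'EOF'
server.1=sparker001:2888:3888
server.2=sparker002:2888:3888
server.3=sparker009:2888:3888
EOF
host=sparker009                # demo value; normally: host=$(hostname)
id=$(sed -n "s/^server\.\([0-9][0-9]*\)=$host:.*/\1/p" "$datadir/zoo.cfg")
echo "$id" > "$datadir/myid"
cat "$datadir/myid"            # prints: 3
```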

6.5 Distribute ZooKeeper to the other machines

[sparker@sparker001 zookeeper]$ scp -r zookeeper-3.4.10 192.168.1.102:/home/sparker/cluster/zookeeper/

[sparker@sparker001 zookeeper]$ scp -r zookeeper-3.4.10 192.168.1.109:/home/sparker/cluster/zookeeper/

On 192.168.1.102 and 192.168.1.109, edit data/myid to match each machine's server.N entry in zoo.cfg: 2 on sparker002 and 3 on sparker009 (the id must equal N, and sparker009 is server.3 above).

6.6 Edit /etc/profile

export ZOOKEEPER_HOME=/home/sparker/cluster/zookeeper/zookeeper-3.4.10
export PATH=$PATH:$ZOOKEEPER_HOME/bin

Apply immediately:

source /etc/profile

6.7 Start ZooKeeper

On each of the three machines (sparker001, sparker002, sparker009), start ZooKeeper with: zkServer.sh start


7 Install Hadoop

7.1 Download and unpack (omitted)

7.2 Edit the configuration files

core-site.xml (common I/O settings):

Because HDFS runs in HA mode with two NameNodes, the whole cluster needs a logical name, here hdfs://hjrCluster. hadoop.tmp.dir sets Hadoop's default temporary directory, and ha.zookeeper.quorum lists the three ZooKeeper nodes.
vi /home/sparker/cluster/hadoop/hadoop-2.6.5/etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hjrCluster</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/sparker/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>sparker001:2181,sparker002:2181,sparker009:2181</value>
  </property>
  <property>
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>1000</value>
    <description>ms</description>
  </property>
</configuration>


vi /home/sparker/cluster/hadoop/hadoop-2.6.5/etc/hadoop/hdfs-site.xml






<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/sparker/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/sparker/cluster/data0/dfs/data,/home/sparker/cluster/data1/dfs/data,/home/sparker/cluster/data2/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>hjrCluster</value>
    <description>Logical name for this new nameservice</description>
  </property>

  <property>
    <name>dfs.ha.namenodes.hjrCluster</name>
    <value>nn1,nn2</value>
    <description>Unique identifiers for each NameNode in the nameservice</description>
  </property>

  <property>
    <name>dfs.namenode.rpc-address.hjrCluster.nn1</name>
    <value>sparker001:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hjrCluster.nn2</name>
    <value>sparker002:8020</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.hjrCluster.nn1</name>
    <value>sparker001:53310</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.hjrCluster.nn2</name>
    <value>sparker002:53310</value>
  </property>

  <property>
    <name>dfs.namenode.http-address.hjrCluster.nn1</name>
    <value>sparker001:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hjrCluster.nn2</name>
    <value>sparker002:50070</value>
  </property>

  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://sparker003:8485;sparker004:8485;sparker005:8485;sparker006:8485;sparker007:8485/hjrCluster</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.hjrCluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>

  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/sparker/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>

  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/sparker/data/hadoop/journaldata</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
    <value>60000</value>
  </property>
</configuration>


vi /home/sparker/cluster/hadoop/hadoop-2.6.5/etc/hadoop/mapred-site.xml
<configuration>
  <!-- execution framework -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- job history server address -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>sparker001:10020</value>
  </property>
  <!-- map/reduce task container memory and heap sizes -->
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1536</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1024M</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>3072</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx2560M</value>
  </property>
  <!-- job history web UI address -->
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>sparker001:19888</value>
  </property>
</configuration>


yarn-site.xml:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>sparker001:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>sparker001:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>sparker001:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>sparker001:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>sparker001:8088</value>
  </property>
</configuration>


vi slaves

List the worker/DataNode hostnames, one per line (sparker003 through sparker008).

vi hadoop-env.sh

Set JAVA_HOME explicitly, e.g. export JAVA_HOME=/home/sparker/cluster/java/jdk1.7.0_80.

vi /etc/profile

Add HADOOP_HOME and put $HADOOP_HOME/bin and $HADOOP_HOME/sbin on the PATH, then apply:

source /etc/profile

7.3 Copy to the other nodes

scp -r /home/sparker/cluster/hadoop/hadoop-2.6.5 192.168.1.102:/home/sparker/cluster/hadoop
scp -r /home/sparker/cluster/hadoop/hadoop-2.6.5 192.168.1.103:/home/sparker/cluster/hadoop
scp -r /home/sparker/cluster/hadoop/hadoop-2.6.5 192.168.1.104:/home/sparker/cluster/hadoop
scp -r /home/sparker/cluster/hadoop/hadoop-2.6.5 192.168.1.105:/home/sparker/cluster/hadoop
scp -r /home/sparker/cluster/hadoop/hadoop-2.6.5 192.168.1.106:/home/sparker/cluster/hadoop
scp -r /home/sparker/cluster/hadoop/hadoop-2.6.5 192.168.1.107:/home/sparker/cluster/hadoop
scp -r /home/sparker/cluster/hadoop/hadoop-2.6.5 192.168.1.108:/home/sparker/cluster/hadoop

7.4 Create the directories referenced by the configuration

Under /home/sparker/ create the tmp, dfs/data, and dfs/name directories (-p creates the intermediate dfs directory too):

mkdir -p /home/sparker/tmp
mkdir -p /home/sparker/dfs/data
mkdir -p /home/sparker/dfs/name

7.5 Startup

(1) Format the ZooKeeper HA state (first run only):

hdfs zkfc -formatZK


(2) Start ZKFC on the active and standby master nodes:

hadoop-daemon.sh start zkfc
(stop with: hadoop-daemon.sh stop zkfc)


(3) Start the JournalNodes

The JournalNodes form the shared storage that syncs edit-log metadata between the active and standby NameNodes. Start one on each host listed in dfs.namenode.shared.edits.dir (sparker003 through sparker007):

hadoop-daemon.sh start journalnode
(stop with: hadoop-daemon.sh stop journalnode)


(4) Format the NameNode (first run only)

Note: if you ever need to re-format the NameNode, first delete the directories referenced by the configs (values from this guide's files; verify against your own):

hdfs-site.xml: dfs.namenode.name.dir, dfs.datanode.data.dir, dfs.journalnode.edits.dir
core-site.xml: hadoop.tmp.dir
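The cleanup can be scripted as a dry run that only prints what it would delete; the directory values below are taken from this guide's hdfs-site.xml and core-site.xml and are assumptions to check against your own configs:

```shell
# Dry-run cleanup before re-formatting the NameNode.
# Paths come from this guide's configs; swap echo for 'rm -rf' only after checking.
for d in \
    /home/sparker/dfs/name \
    /home/sparker/cluster/data0/dfs/data \
    /home/sparker/cluster/data1/dfs/data \
    /home/sparker/cluster/data2/dfs/data \
    /home/sparker/data/hadoop/journaldata \
    /home/sparker/tmp
do
  echo "would remove: $d"
done
```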


hadoop namenode -format


(5) Start the active NameNode (run on sparker001):

hadoop-daemon.sh start namenode
(stop with: hadoop-daemon.sh stop namenode)

(6) Sync the active NameNode's metadata to the standby (run on sparker002):

hdfs namenode -bootstrapStandby

(7) Start the standby NameNode (run on sparker002):

hadoop-daemon.sh start namenode

Then, on the active NameNode, make nn1 active:

hdfs haadmin -transitionToActive nn1

(With automatic failover enabled, some Hadoop versions refuse a manual transition unless you add --forcemanual.)


(8) Start the DataNodes from the active NameNode (run on sparker001; hadoop-daemons.sh, plural, starts one on every host listed in slaves):

hadoop-daemons.sh start datanode


7.6 Verification

Open the web UIs at sparker001:50070 and sparker002:50070; one NameNode should report active and the other standby.


8 Install Spark

8.1 Install Scala

(1) Download and unpack (omitted).

(2) Edit /etc/profile:

export SCALA_HOME=/home/sparker/cluster/scala/scala-2.10.5

export PATH=$PATH:$SCALA_HOME/bin


(3) Distribute Scala to the other machines

scp -r /home/sparker/cluster/scala/scala-2.10.5 192.168.1.102:/home/sparker/cluster/scala/
scp -r /home/sparker/cluster/scala/scala-2.10.5 192.168.1.103:/home/sparker/cluster/scala/
scp -r /home/sparker/cluster/scala/scala-2.10.5 192.168.1.104:/home/sparker/cluster/scala/
scp -r /home/sparker/cluster/scala/scala-2.10.5 192.168.1.105:/home/sparker/cluster/scala/
scp -r /home/sparker/cluster/scala/scala-2.10.5 192.168.1.106:/home/sparker/cluster/scala/
scp -r /home/sparker/cluster/scala/scala-2.10.5 192.168.1.107:/home/sparker/cluster/scala/
scp -r /home/sparker/cluster/scala/scala-2.10.5 192.168.1.108:/home/sparker/cluster/scala/

8.2 Install Spark

Master: sparker001
Standby Master: sparker002
Workers: sparker003, sparker004, sparker005, sparker006, sparker007, sparker008

(0) Download and unpack (omitted).

(1) Configure the Spark environment variables (add SPARK_HOME to /etc/profile and $SPARK_HOME/bin to the PATH).

(2) Edit the slaves file under spark-1.6.3-bin-hadoop2.6/conf:

cp slaves.template slaves
vi slaves
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# A Spark Worker will be started on each of the machines listed below.

sparker003
sparker004
sparker005
sparker006
sparker007
sparker008

(3) Edit spark-env.sh under spark-1.6.3-bin-hadoop2.6/conf:
Add the Java, Hadoop, and Scala environment variables, configure each worker's memory and core count, and add the paths of any extra jars (e.g. JDBC/ODBC drivers).
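The original file is only shown as a screenshot; the following is a hedged sketch of what a spark-env.sh for this layout might look like. The paths and sizes follow the install paths and 1-core/1 GB worker hardware stated earlier, SPARK_CLASSPATH is a hypothetical extra-jars location, and SPARK_DAEMON_JAVA_OPTS enables ZooKeeper-backed Master HA (the standard Spark 1.x standalone mechanism); verify every value against your own cluster:

```shell
# conf/spark-env.sh -- illustrative values, not the author's original file
export JAVA_HOME=/home/sparker/cluster/java/jdk1.7.0_80
export SCALA_HOME=/home/sparker/cluster/scala/scala-2.10.5
export HADOOP_HOME=/home/sparker/cluster/hadoop/hadoop-2.6.5
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

# Each worker has 1 core and 1 GB RAM per the hardware plan above
export SPARK_WORKER_CORES=1
export SPARK_WORKER_MEMORY=512m

# Extra jars (e.g. JDBC/ODBC drivers) -- hypothetical path
export SPARK_CLASSPATH=/home/sparker/cluster/spark/extlib/*

# ZooKeeper-backed Master HA using the three ZK nodes configured earlier
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
 -Dspark.deploy.zookeeper.url=sparker001:2181,sparker002:2181,sparker009:2181 \
 -Dspark.deploy.zookeeper.dir=/spark"
```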

(4) Distribute Spark to the other machines

  scp -r /home/sparker/cluster/spark/spark-1.6.3-bin-hadoop2.6 192.168.1.102:/home/sparker/cluster/spark
  scp -r /home/sparker/cluster/spark/spark-1.6.3-bin-hadoop2.6 192.168.1.103:/home/sparker/cluster/spark
  scp -r /home/sparker/cluster/spark/spark-1.6.3-bin-hadoop2.6 192.168.1.104:/home/sparker/cluster/spark
  scp -r /home/sparker/cluster/spark/spark-1.6.3-bin-hadoop2.6 192.168.1.105:/home/sparker/cluster/spark
  scp -r /home/sparker/cluster/spark/spark-1.6.3-bin-hadoop2.6 192.168.1.106:/home/sparker/cluster/spark
  scp -r /home/sparker/cluster/spark/spark-1.6.3-bin-hadoop2.6 192.168.1.107:/home/sparker/cluster/spark
  scp -r /home/sparker/cluster/spark/spark-1.6.3-bin-hadoop2.6 192.168.1.108:/home/sparker/cluster/spark

8.3 Start Spark

On the master (sparker001):
cd /home/sparker/cluster/spark/spark-1.6.3-bin-hadoop2.6/sbin
./start-all.sh


On the standby master (sparker002):
cd /home/sparker/cluster/spark/spark-1.6.3-bin-hadoop2.6/sbin
./start-master.sh


Open the Master web UI at sparker001:8080.


Launch a shell against the cluster:

./spark-shell --master spark://sparker001:7077

