Contents
1. Introduction to HA
2. Environment Preparation
2.1 Machines and node layout
2.2 Configuring /etc/hosts
2.3 Configuring passwordless SSH login
3. Installation
3.1 Installing ZooKeeper
3.2 Preparing the installation package
3.3 Creating the /opt/hadoop directory
3.4 Uploading the installation package
3.5 Extracting and creating directories
3.6 Configuring environment variables
3.7 Cluster configuration
3.7.1 Configuring hadoop-env.sh
3.7.2 Configuring core-site.xml
3.7.3 Configuring hdfs-site.xml
3.7.4 Creating and editing mapred-site.xml
3.7.5 Editing yarn-site.xml
3.7.6 Editing the slaves file
3.8 Cluster initialization and first start
3.9 Routine startup and shutdown
3.9.1 Normal startup order
3.9.2 Normal shutdown order
4. Verifying high availability
1. Introduction to HA
Hadoop HA covers both HDFS HA and YARN HA. For the architecture and the available designs, see: https://www.jianshu.com/p/7c697f146674
2. Environment Preparation
2.1 Machines and node layout
host | ip | os | HDFS roles | YARN roles | ZooKeeper roles |
---|---|---|---|---|---|
hadoop-1 | 192.168.90.131 | Ubuntu 18.04.2 LTS | NameNode (active), DFSZKFailoverController | ResourceManager (standby) | zookeeper |
hadoop-2 | 192.168.90.132 | Ubuntu 18.04.2 LTS | NameNode (standby), DFSZKFailoverController | ResourceManager (active) | zookeeper |
hadoop-3 | 192.168.90.133 | Ubuntu 18.04.2 LTS | DataNode, JournalNode | NodeManager | zookeeper |
hadoop-4 | 192.168.90.134 | Ubuntu 18.04.2 LTS | DataNode, JournalNode | NodeManager | zookeeper (observer) |
hadoop-5 | 192.168.90.135 | Ubuntu 18.04.2 LTS | DataNode, JournalNode | NodeManager | - |
2.2 Configuring /etc/hosts
On every machine, edit /etc/hosts and append the following entries:

```
192.168.90.131 hadoop-1
192.168.90.132 hadoop-2
192.168.90.133 hadoop-3
192.168.90.134 hadoop-4
192.168.90.135 hadoop-5
```
2.3 Configuring passwordless SSH login
Generate a key pair on every machine. Log in as root and create a key with an empty passphrase:

```shell
ssh-keygen -t rsa -P ''
```

When prompted with `Enter file in which to save the key (/root/.ssh/id_rsa):`, press Enter to accept the default file /root/.ssh/id_rsa. Afterwards /root/.ssh contains three files: authorized_keys, id_rsa, and id_rsa.pub. (If authorized_keys does not exist, create it with `touch authorized_keys`.) Next, merge the contents of id_rsa.pub from all machines into authorized_keys; "merge" here means that the authorized_keys file on every machine ends up containing the id_rsa.pub contents of all machines.
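The merge step can be scripted. The sketch below only simulates it locally, with temporary directories standing in for the machines; on a real cluster you would fetch each id_rsa.pub with scp and push the merged file back (or simply run ssh-copy-id from each host):

```shell
#!/bin/sh
# Simulate merging id_rsa.pub from several "hosts" into one authorized_keys.
# Local directories stand in for the machines; replace the loop bodies with
# scp/ssh on a real cluster.
set -e
work=$(mktemp -d)
for h in hadoop-1 hadoop-2 hadoop-3; do
    mkdir -p "$work/$h/.ssh"
    echo "ssh-rsa FAKEKEY-$h root@$h" > "$work/$h/.ssh/id_rsa.pub"
done
# Concatenate every host's public key into a single authorized_keys file...
merged="$work/authorized_keys"
cat "$work"/*/.ssh/id_rsa.pub > "$merged"
# ...then distribute the merged file back to every host, with 600 permissions.
for h in hadoop-1 hadoop-2 hadoop-3; do
    cp "$merged" "$work/$h/.ssh/authorized_keys"
    chmod 600 "$work/$h/.ssh/authorized_keys"
done
```

After this runs, each simulated host trusts all three keys, which is exactly the state the real cluster needs.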
Then test ssh from every machine to the others. If you see an authentication prompt such as:

```
The authenticity of host 'hadoop-2 (192.168.90.132)' can't be established.
ECDSA key fingerprint is SHA256:XEhSC0caRxdbv0eHNBo8c7VULr7vhj5pM2bt3frOEAA.
Are you sure you want to continue connecting (yes/no)?
```

type yes and press Enter; subsequent ssh logins will go straight through. For a non-root user (for example the hadoop user created later in this guide), passwordless login additionally requires restricting the permissions of authorized_keys. In the .ssh directory, run:

```shell
chmod 600 authorized_keys
```
3. Installation
3.1 Installing ZooKeeper
This guide installs the ZooKeeper ensemble on hadoop-1 through hadoop-4; for the procedure, see: https://www.jianshu.com/p/e9becafcbaa7
3.2 Preparing the installation package
Download the Hadoop release from: https://archive.apache.org/dist/hadoop/common/hadoop-2.8.3/
3.3 Creating the /opt/hadoop directory

```shell
mkdir /opt/hadoop
```

3.4 Uploading the installation package
Upload hadoop-2.8.3.tar.gz to the /opt/hadoop directory.
3.5 Extracting and creating directories

```shell
cd /opt/hadoop
tar -zxvf hadoop-2.8.3.tar.gz
mkdir hdfs
cd hdfs
mkdir data name tmp pid journalnode logs
cd ../
mkdir yarn
cd yarn
mkdir logs local staging
```

The directories created under hdfs are referenced by the configuration that follows: data for DataNode block storage, name for NameNode metadata, tmp for Hadoop's temporary files, pid for daemon pid files, journalnode for JournalNode storage, and logs for log output. The directories under yarn hold the corresponding YARN runtime data.
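The same layout can be created in a single pass with `mkdir -p`. The sketch below uses a scratch `ROOT` variable so it can be tried anywhere; on the real machines you would set it to /opt/hadoop:

```shell
#!/bin/sh
# Create the full directory layout in one pass. ROOT is a scratch stand-in
# for illustration; the guide itself uses /opt/hadoop.
ROOT=${ROOT:-$PWD/opt-hadoop-demo}
for d in data name tmp pid journalnode logs; do
    mkdir -p "$ROOT/hdfs/$d"
done
for d in logs local staging; do
    mkdir -p "$ROOT/yarn/$d"
done
ls "$ROOT/hdfs"   # data journalnode logs name pid tmp
```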
3.6 Configuring environment variables
Append the following to /etc/profile (or the hadoop user's shell profile) on every machine, then source the file:

```shell
export HADOOP_HOME=/opt/hadoop/hadoop-2.8.3
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
```
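After sourcing the profile, the `hadoop` command should resolve from any directory. The PATH wiring can be sanity-checked without a real install by pointing HADOOP_HOME at a scratch directory containing a stub script (everything below is a stand-in, not a real Hadoop binary):

```shell
#!/bin/sh
# Sanity-check the PATH ordering with a stub `hadoop` script standing in
# for the real binary. Scratch directories only; not a real install.
HADOOP_HOME=$(mktemp -d)
mkdir -p "$HADOOP_HOME/bin" "$HADOOP_HOME/sbin"
printf '#!/bin/sh\necho Hadoop 2.8.3\n' > "$HADOOP_HOME/bin/hadoop"
chmod +x "$HADOOP_HOME/bin/hadoop"
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
command -v hadoop   # resolves inside $HADOOP_HOME/bin
hadoop              # prints "Hadoop 2.8.3"
```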
3.7 Cluster configuration
The files involved are listed below:
File | Purpose |
---|---|
hadoop-env.sh | Hadoop runtime environment settings |
core-site.xml | Common component: system-level parameters such as the HDFS URL |
hdfs-site.xml | HDFS component |
mapred-site.xml | MapReduce component |
yarn-site.xml | YARN component |
slaves | slave (worker) nodes |
3.7.1 Configuring hadoop-env.sh
Edit the file and set JAVA_HOME near the top (matching the value in /etc/profile), for example:

```shell
JAVA_HOME=/opt/jdk/jdk1.8.0_231
```

A complete configuration for reference:
```shell
# The java implementation to use.
JAVA_HOME=/opt/jdk/jdk1.8.0_231
#export JSVC_HOME=${JSVC_HOME}
export HADOOP_LOGFILE=${USER}-hadoop.log
export HADOOP_ROOT_LOGGER=INFO,DRFA,console
export HADOOP_MAPRED_ROOT_LOGGER=INFO,DRFA,console
export HDFS_AUDIT_LOGGER=WARN,DRFA,console
export HADOOP_SECURITY_LOGGER=INFO,DRFA,console
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
export HADOOP_HOME=/opt/hadoop/hadoop-2.8.3
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME

# Extra Java CLASSPATH elements. Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done

# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=3072
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""

# Enable extra debugging of Hadoop's JAAS binding, used to set up
# Kerberos security.
# export HADOOP_JAAS_DEBUG=true

# Extra Java runtime options. Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:GCPauseIntervalMillis=100 -Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-WARN,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-WARN,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:GCPauseIntervalMillis=100 -Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-WARN,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-WARN,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx3072m $HADOOP_PORTMAP_OPTS"

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS"
# set heap args when HADOOP_HEAPSIZE is empty
if [ "$HADOOP_HEAPSIZE" = "" ]; then
  export HADOOP_CLIENT_OPTS="-Xmx3072m $HADOOP_CLIENT_OPTS"
fi
#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"

# On secure datanodes, user to run the datanode as after dropping privileges.
# This **MUST** be uncommented to enable secure HDFS if using privileged ports
# to provide authentication of data transfer protocol. This **MUST NOT** be
# defined if SASL is configured for authentication of data transfer protocol
# using non-privileged ports.
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}

# Where log files are stored. $HADOOP_HOME/logs by default.
export HADOOP_LOG_DIR=/opt/hadoop/hdfs/logs
# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}

###
# HDFS Mover specific parameters
###
# Specify the JVM options to be used when starting the HDFS Mover.
# These options will be appended to the options specified as HADOOP_OPTS
# and therefore may override any similar flags set in HADOOP_OPTS
#
# export HADOOP_MOVER_OPTS=""

###
# Advanced Users Only!
###
# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
# the user that will run the hadoop daemons. Otherwise there is the
# potential for a symlink attack.
export HADOOP_PID_DIR=/opt/hadoop/hdfs/pid
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}

# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_HOME/share/hadoop/tools/lib/*
#for f in $HADOOP_HOME/share/hadoop/tools/lib/*.jar; do
#  if [ "$HADOOP_CLASSPATH" ]; then
#    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
#  else
#    export HADOOP_CLASSPATH=$f
#  fi
#done
```
3.7.2 Configuring core-site.xml
A configuration for reference:

```xml
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://ns</value></property>
  <property><name>hadoop.tmp.dir</name><value>/opt/hadoop/hdfs/tmp</value></property>
  <property><name>io.file.buffer.size</name><value>4096</value></property>
  <property><name>fs.trash.checkpoint.interval</name><value>0</value></property>
  <property><name>fs.trash.interval</name><value>1440</value></property>
  <property><name>ha.zookeeper.quorum</name><value>hadoop-1:2181,hadoop-2:2181,hadoop-3:2181,hadoop-4:2181</value></property>
  <property><name>ha.zookeeper.session-timeout.ms</name><value>2000</value></property>
  <property><name>hadoop.proxyuser.hadoop.hosts</name><value>*</value></property>
  <property><name>hadoop.proxyuser.hadoop.groups</name><value>*</value></property>
  <property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value></property>
</configuration>
```
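Once the file is in place, individual values can be spot-checked from the shell. The `get_prop` helper below is a sed sketch that assumes the one-property-per-line layout shown above; for arbitrary XML layouts an XML-aware tool such as `xmllint --xpath` is more robust:

```shell
#!/bin/sh
# get_prop: extract a property's value from a *-site.xml that keeps each
# <property> on one line, as in the listing above. Illustrative helper only.
get_prop() {
    sed -n "s/.*<name>$1<\/name><value>\(.*\)<\/value>.*/\1/p" "$2"
}

# Demo against a minimal stand-in file (not the real core-site.xml):
demo=$(mktemp)
cat > "$demo" <<'EOF'
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://ns</value></property>
  <property><name>fs.trash.interval</name><value>1440</value></property>
</configuration>
EOF
get_prop fs.defaultFS "$demo"   # hdfs://ns
```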
3.7.3 Configuring hdfs-site.xml
A configuration for reference (the logical nameservice is named ns, matching fs.defaultFS in core-site.xml):

```xml
<configuration>
  <property><name>dfs.nameservices</name><value>ns</value></property>
  <property><name>dfs.ha.namenodes.ns</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.ns.nn1</name><value>hadoop-1:9000</value></property>
  <property><name>dfs.namenode.http-address.ns.nn1</name><value>hadoop-1:50070</value></property>
  <property><name>dfs.namenode.rpc-address.ns.nn2</name><value>hadoop-2:9000</value></property>
  <property><name>dfs.namenode.http-address.ns.nn2</name><value>hadoop-2:50070</value></property>
  <property><name>dfs.namenode.shared.edits.dir</name><value>qjournal://hadoop-3:8485;hadoop-4:8485;hadoop-5:8485/ns</value></property>
  <property><name>dfs.journalnode.edits.dir</name><value>/opt/hadoop/hdfs/journalnode</value></property>
  <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
  <property><name>dfs.client.failover.proxy.provider.ns</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence
shell(/bin/true)</value>
  </property>
  <property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/root/.ssh/id_rsa</value></property>
  <property><name>dfs.namenode.name.dir</name><value>/opt/hadoop/hdfs/name</value></property>
  <property><name>dfs.datanode.data.dir</name><value>/opt/hadoop/hdfs/data</value></property>
  <property><name>dfs.replication</name><value>2</value></property>
  <property><name>dfs.webhdfs.enabled</name><value>true</value></property>
  <property><name>dfs.client.slow.io.warning.threshold.ms</name><value>90000</value></property>
  <property><name>dfs.heartbeat.interval</name><value>8</value></property>
  <property><name>dfs.namenode.heartbeat.recheck-interval</name><value>90000</value></property>
  <property><name>dfs.namenode.checkpoint.period</name><value>3600</value></property>
  <property><name>dfs.namenode.checkpoint.txns</name><value>1000000</value></property>
  <property><name>dfs.blockreport.intervalMsec</name><value>1800000</value></property>
  <property><name>dfs.datanode.directoryscan.interval</name><value>1800</value></property>
  <property><name>dfs.datanode.max.xcievers</name><value>8000</value></property>
  <property><name>dfs.hosts</name><value>/opt/hadoop/hadoop-2.8.3/etc/hadoop/slaves</value></property>
  <property><name>dfs.balance.bandwidthPerSec</name><value>10485760</value></property>
  <property><name>dfs.blocksize</name><value>67108864</value></property>
  <property><name>dfs.namenode.handler.count</name><value>64</value></property>
  <property><name>dfs.datanode.max.transfer.threads</name><value>36867</value></property>
  <property><name>dfs.datanode.directoryscan.threads</name><value>18</value></property>
  <property><name>dfs.datanode.handler.count</name><value>128</value></property>
  <property><name>dfs.datanode.slow.io.warning.threshold.ms</name><value>1000</value></property>
</configuration>
```
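Two of the values above interact: the NameNode declares a DataNode dead after 2 × dfs.namenode.heartbeat.recheck-interval (in ms) + 10 × dfs.heartbeat.interval (in seconds). Working that out for the settings above:

```shell
#!/bin/sh
# DataNode dead-node timeout implied by the hdfs-site.xml values above:
# 2 * recheck-interval(ms) + 10 * heartbeat-interval(s) * 1000
recheck_ms=90000
heartbeat_s=8
timeout_ms=$((2 * recheck_ms + 10 * heartbeat_s * 1000))
echo "$((timeout_ms / 1000)) seconds"   # 260 seconds
```

So with these settings a DataNode that stops heartbeating is marked dead after 260 seconds, rather than the default ten-plus minutes.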
3.7.4 Creating and editing mapred-site.xml
First create mapred-site.xml from the bundled template:

```shell
cp mapred-site.xml.template mapred-site.xml
```

Then edit it; a configuration for reference:

```xml
<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
  <property><name>mapred.local.dir</name><value>/opt/hadoop/yarn/local</value></property>
  <property><name>mapreduce.map.java.opts</name><value>-Xmx4096m</value></property>
  <property><name>mapreduce.map.memory.mb</name><value>4096</value></property>
  <property><name>mapreduce.reduce.java.opts</name><value>-Xmx4096m</value></property>
  <property><name>mapreduce.reduce.memory.mb</name><value>4096</value></property>
  <property><name>mapreduce.jobhistory.cleaner.interval-ms</name><value>604800000</value></property>
  <property><name>mapreduce.jobhistory.joblist.cache.size</name><value>20000</value></property>
  <property><name>mapreduce.jobhistory.datestring.cache.size</name><value>200000</value></property>
  <property><name>mapreduce.jobhistory.cleaner.enable</name><value>true</value></property>
  <property><name>mapreduce.jobhistory.max-age-ms</name><value>604800000</value></property>
</configuration>
```
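One caveat on the values above: mapreduce.map.memory.mb is the container size YARN enforces, while the -Xmx in mapreduce.map.java.opts is only the JVM heap. Setting them equal, as here, leaves no headroom for off-heap and native memory, so containers can be killed for exceeding their physical limit. A common rule of thumb (not from this guide) caps the heap at roughly 80% of the container:

```shell
#!/bin/sh
# Rule-of-thumb heap sizing: -Xmx at ~80% of the YARN container size,
# leaving the remainder for off-heap/native memory. Illustrative only.
container_mb=4096
heap_mb=$((container_mb * 80 / 100))
echo "-Xmx${heap_mb}m"   # -Xmx3276m
```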
3.7.5 Editing yarn-site.xml
A configuration for reference:

```xml
<configuration>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.cluster-id</name><value>hdcluster</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>hadoop-1</value></property>
  <property><name>yarn.resourcemanager.hostname.rm2</name><value>hadoop-2</value></property>
  <property><name>yarn.resourcemanager.webapp.address.rm1</name><value>hadoop-1:8088</value></property>
  <property><name>yarn.resourcemanager.webapp.address.rm2</name><value>hadoop-2:8088</value></property>
  <property><name>yarn.resourcemanager.webapp.https.address.rm1</name><value>hadoop-1:5005</value></property>
  <property><name>yarn.resourcemanager.webapp.https.address.rm2</name><value>hadoop-2:5005</value></property>
  <property><name>yarn.resourcemanager.scheduler.address.rm1</name><value>hadoop-1:5001</value></property>
  <property><name>yarn.resourcemanager.scheduler.address.rm2</name><value>hadoop-2:5001</value></property>
  <property><name>yarn.resourcemanager.admin.address.rm1</name><value>hadoop-1:5003</value></property>
  <property><name>yarn.resourcemanager.admin.address.rm2</name><value>hadoop-2:5003</value></property>
  <property><name>yarn.resourcemanager.zk-address</name><value>hadoop-1:2181,hadoop-2:2181,hadoop-3:2181,hadoop-4:2181</value></property>
  <property><name>yarn.nm.liveness-monitor.expiry-interval-ms</name><value>100000</value></property>
  <property><name>yarn.scheduler.fair.user-as-default-queue</name><value>false</value></property>
  <property><name>yarn.nodemanager.container-executor.class</name><value>org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor</value></property>
  <property><name>yarn.nodemanager.pmem-check-enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.ha.automatic-failover.enabled</name><value>true</value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>4096</value></property>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>12288</value></property>
  <property><name>yarn.scheduler.maximum-allocation-vcores</name><value>32</value></property>
  <property><name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name><value>5000</value></property>
  <property><name>yarn.resourcemanager.recovery.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.store.class</name><value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value></property>
  <property><name>yarn.resourcemanager.connect.retry-interval.ms</name><value>2000</value></property>
  <property><name>yarn.resourcemanager.resource-tracker.address.rm1</name><value>hadoop-1:5002</value></property>
  <property><name>yarn.resourcemanager.resource-tracker.address.rm2</name><value>hadoop-2:5002</value></property>
  <property><name>yarn.resourcemanager.zk.state-store.address</name><value>hadoop-1:2181,hadoop-2:2181,hadoop-3:2181,hadoop-4:2181</value></property>
  <property><name>yarn.nodemanager.vmem-pmem-ratio</name><value>8</value></property>
  <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
  <property><name>yarn.log.server.url</name><value>http://hadoop-1:19888/jobhistory/logs</value></property>
  <property><name>yarn.log-aggregation.retain-seconds</name><value>604800</value></property>
  <property><name>yarn.nodemanager.log-dirs</name><value>/opt/hadoop/yarn/logs</value></property>
  <property><name>yarn.nodemanager.delete.debug-delay-sec</name><value>600</value></property>
  <property><name>yarn.nodemanager.remote-app-log-dir</name><value>/yarn-logs</value></property>
  <property><name>yarn.nodemanager.remote-app-log-dir-suffix</name><value>logs</value></property>
  <property><name>yarn.nodemanager.local-dirs</name><value>/opt/hadoop/yarn/local</value></property>
  <property><name>yarn.app.mapreduce.am.staging-dir</name><value>/opt/hadoop/yarn/staging</value></property>
  <property><name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name><value>org.apache.hadoop.mapred.ShuffleHandler</value></property>
  <property><name>yarn.resourcemanager.scheduler.class</name><value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value></property>
  <property><name>yarn.nodemanager.resource.cpu-vcores</name><value>8</value></property>
  <property><name>yarn.resourcemanager.am.max-attempts</name><value>5</value></property>
  <property><name>yarn.scheduler.minimum-allocation-mb</name><value>128</value></property>
  <property><name>yarn.resourcemanager.ha.automatic-failover.embedded</name><value>true</value></property>
  <property><name>yarn.resourcemanager.nodemanagers.heartbeat-interval-ms</name><value>1000</value></property>
  <property><name>yarn.nodemanager.linux-container-executor.group</name><value>hadoop</value></property>
  <property><name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name><value>100</value></property>
  <property><name>yarn.scheduler.minimum-allocation-vcores</name><value>1</value></property>
  <property><name>yarn.nodemanager.log.retain-seconds</name><value>604800</value></property>
  <property><name>yarn.nodemanager.vmem-check-enabled</name><value>false</value></property>
  <property><name>yarn.resourcemanager.max-completed-applications</name><value>150</value></property>
  <property><name>yarn.log-aggregation.retain-check-interval-seconds</name><value>604800</value></property>
  <property><name>yarn.nodemanager.linux-container-executor.resources-handler.class</name><value>org.apache.hadoop.yarn.server.nodemanager.util.DefaultLCEResourcesHandler</value></property>
</configuration>
```
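The NodeManager resource settings above bound how many containers each worker can run concurrently: roughly min(resource.memory-mb / container size, cpu-vcores / vcores per container). With the 4096 MB map/reduce containers configured in mapred-site.xml, each NodeManager here fits exactly one such container at a time:

```shell
#!/bin/sh
# Back-of-envelope concurrent-container capacity per NodeManager,
# using the yarn-site.xml and mapred-site.xml values above.
nm_mem_mb=4096          # yarn.nodemanager.resource.memory-mb
nm_vcores=8             # yarn.nodemanager.resource.cpu-vcores
container_mem_mb=4096   # mapreduce.map.memory.mb
container_vcores=1      # yarn.scheduler.minimum-allocation-vcores
by_mem=$((nm_mem_mb / container_mem_mb))
by_cpu=$((nm_vcores / container_vcores))
if [ "$by_mem" -lt "$by_cpu" ]; then containers=$by_mem; else containers=$by_cpu; fi
echo "$containers container(s) per NodeManager"   # 1 container(s) per NodeManager
```

Memory, not CPU, is the binding constraint here; raising yarn.nodemanager.resource.memory-mb (or shrinking the task containers) would allow more parallelism per node.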
3.7.6 Editing the slaves file
This file lists the HDFS data storage nodes (DataNodes):

```
hadoop-3
hadoop-4
hadoop-5
```
3.8 Cluster initialization and first start

```shell
# 1. Start the zookeeper ensemble (hadoop-1, hadoop-2, hadoop-3, hadoop-4)
zkServer.sh start
# jps on those machines now additionally shows a QuorumPeerMain process

# 2. Start the journalnodes (hadoop-3, hadoop-4, hadoop-5)
hadoop-daemon.sh start journalnode
# jps now shows a JournalNode process
# note: hadoop-daemons.sh (plural), run on any one machine, starts the daemon on all slave nodes

# 3. Format the namenode (hadoop-1)
hdfs namenode -format

# 4. Format ZKFC, i.e. initialize the HA state in zookeeper (hadoop-1)
hdfs zkfc -formatZK

# 5. Start namenode 1 (hadoop-1)
hadoop-daemon.sh start namenode
# jps now shows a NameNode process

# 6. Sync the namenode metadata to the standby (hadoop-2)
hdfs namenode -bootstrapStandby

# 7. Start namenode 2 (hadoop-2)
hadoop-daemon.sh start namenode
# jps now shows a NameNode process

# 8. Start the ZKFailoverControllers (hadoop-1, hadoop-2)
hadoop-daemon.sh start zkfc
# jps now shows a DFSZKFailoverController process
# whichever machine starts zkfc first becomes the active namenode

# 9. Start the datanodes (hadoop-3, hadoop-4, hadoop-5)
hadoop-daemon.sh start datanode
# jps now shows a DataNode process

# 10. Start the resourcemanagers (hadoop-2, then hadoop-1)
yarn-daemon.sh start resourcemanager
# start the RM on hadoop-2 first so that it becomes active
# (the roles can also be switched manually later)
# jps now shows a ResourceManager process

# 11. Start the nodemanagers (hadoop-3, hadoop-4, hadoop-5)
yarn-daemon.sh start nodemanager
# jps now shows a NodeManager process

# 12. Start the historyserver (hadoop-1, hadoop-2)
mr-jobhistory-daemon.sh start historyserver
# jps now shows a JobHistoryServer process
```
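After the first start it is easy to lose track of which JVM belongs on which host. The sketch below encodes the expected jps process names per host from the layout table in section 2.1; `expected_procs` is a hypothetical helper, not part of Hadoop, and on a live cluster you would compare its output against `ssh <host> jps`:

```shell
#!/bin/sh
# Expected JVM process names per host, per the node layout in section 2.1.
# expected_procs is a hypothetical helper for eyeballing against `jps`.
expected_procs() {
    case "$1" in
        hadoop-1|hadoop-2)
            echo "NameNode DFSZKFailoverController ResourceManager JobHistoryServer QuorumPeerMain" ;;
        hadoop-3|hadoop-4)
            echo "DataNode JournalNode NodeManager QuorumPeerMain" ;;
        hadoop-5)
            echo "DataNode JournalNode NodeManager" ;;
        *)
            echo "" ;;
    esac
}
expected_procs hadoop-5   # DataNode JournalNode NodeManager
```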
3.9 Routine startup and shutdown
The flow above is only for a freshly initialized cluster. Day-to-day startup must not format the NameNode again, or the cluster's data will be lost.
3.9.1 Normal startup order

```shell
# 1. Start zookeeper (hadoop-1 ~ hadoop-4)
zkServer.sh start
# 2. Start the journalnodes (hadoop-3 ~ hadoop-5)
hadoop-daemons.sh start journalnode
# 3. Start the namenodes (hadoop-1, hadoop-2)
hadoop-daemon.sh start namenode
# 4. Start the ZKFailoverControllers (hadoop-1, hadoop-2)
hadoop-daemon.sh start zkfc
# note: whichever machine starts zkfc first becomes the active namenode
# 5. Start the datanodes (hadoop-3 ~ hadoop-5)
hadoop-daemons.sh start datanode
# 6. Start the resourcemanagers (hadoop-2, then hadoop-1)
yarn-daemon.sh start resourcemanager
# note: the one started first becomes active
# 7. Start the nodemanagers (hadoop-3 ~ hadoop-5)
yarn-daemon.sh start nodemanager
# 8. Start the historyservers (hadoop-1, hadoop-2)
mr-jobhistory-daemon.sh start historyserver
```

Steps 6 and 7 can be replaced by running start-yarn.sh on one RM node (which starts the active RM and all NodeManagers), then yarn-daemon.sh start resourcemanager on the other RM node (which starts the standby RM).
3.9.2 Normal shutdown order
On the machines running each service, execute:

```shell
# 1. Stop the historyservers (hadoop-1, hadoop-2)
mr-jobhistory-daemon.sh stop historyserver
# 2. Stop the nodemanagers (hadoop-3 ~ hadoop-5)
yarn-daemon.sh stop nodemanager
# 3. Stop the resourcemanagers (hadoop-2, hadoop-1)
yarn-daemon.sh stop resourcemanager
# 4. Stop the datanodes (hadoop-3 ~ hadoop-5)
hadoop-daemons.sh stop datanode
# 5. Stop the ZKFailoverControllers (hadoop-1, hadoop-2)
hadoop-daemon.sh stop zkfc
# 6. Stop the namenodes (hadoop-1, hadoop-2)
hadoop-daemon.sh stop namenode
# 7. Stop the journalnodes (hadoop-3 ~ hadoop-5)
hadoop-daemons.sh stop journalnode
# 8. Stop zookeeper (hadoop-1 ~ hadoop-4)
zkServer.sh stop
```

Steps 2 and 3 can be replaced by first running yarn-daemon.sh stop resourcemanager on the standby RM (stopping the standby RM), then stop-yarn.sh on the active RM (stopping the active RM and all NodeManagers).
4. Verifying high availability
Relevant commands:

```shell
# check a namenode's HA state
hdfs haadmin -getServiceState nn1
# force a namenode to active
hdfs haadmin -transitionToActive --forcemanual nn1
# force a namenode to standby
hdfs haadmin -transitionToStandby --forcemanual nn2
# check a resourcemanager's HA state
yarn rmadmin -getServiceState rm1
# force a resourcemanager to standby
yarn rmadmin -transitionToStandby --forcemanual rm2
# force a resourcemanager to active
yarn rmadmin -transitionToActive --forcemanual rm1
```

To verify failover: kill the NameNode process on the active NameNode host and check that the standby NameNode becomes active; likewise, kill the ResourceManager process on the active ResourceManager host and check that the standby ResourceManager becomes active.
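The check can be automated with a small polling loop around `hdfs haadmin`. The `wait_until_active` helper below is hypothetical, not a Hadoop command, and the stubbed `hdfs` function exists only so the loop can be demonstrated without a live cluster; remove the stub on a real deployment so the genuine `hdfs` binary is used:

```shell
#!/bin/sh
# Poll a namenode's HA state until it reports "active" or the attempts
# run out. wait_until_active is a hypothetical helper, not part of Hadoop.
wait_until_active() {
    nn=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        state=$(hdfs haadmin -getServiceState "$nn" 2>/dev/null)
        [ "$state" = "active" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Stub standing in for the real client, for demonstration only:
hdfs() { echo "active"; }

wait_until_active nn2 && echo "nn2 is active"
```

On a real cluster you would kill the active NameNode, then run `wait_until_active nn2` to confirm the standby took over within the expected window.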