Installing Distributed Hadoop 2.6.0 + HBase + Hive on CentOS 7.5 (offline CDH 5.14.2 tar package install)

Tags: Hadoop


 

  • Installing Distributed Hadoop 2.6.0 + HBase + Hive on CentOS 7.5 (offline CDH 5.14.2 tar package install)
      • Host environment
      • Software environment
      • Host planning
      • Pre-installation preparation
      • Installing JDK 1.8
      • Installing ZooKeeper
      • Installing Hadoop
        • Configuring HDFS
        • Configuring YARN
        • Cluster initialization
        • Starting HDFS
        • Starting YARN
        • Cluster start/stop order
          • Start
          • Stop
      • Installing HBase
      • Installing Hive

 

Host environment

Basic configuration:

Nodes: 5
OS: CentOS Linux release 7.5.1804 (Core)
Memory: 8 GB

Production configuration:

Nodes: 5
OS: CentOS Linux release 7.5.1804 (Core)
Memory: 16 GB

Note: in real production, size the memory to your workload; if you are only building VMs in VMware, 1-2 GB per host is enough.

Software environment

| Software | Version | Download |
|---|---|---|
| jdk | jdk-8u172-linux-x64 | (see archive link below) |
| hadoop | hadoop-2.6.0-cdh5.14.2 | (see archive link below) |
| zookeeper | zookeeper-3.4.5-cdh5.14.2 | (see archive link below) |
| hbase | hbase-1.2.0-cdh5.14.2 | (see archive link below) |
| hive | hive-1.1.0-cdh5.14.2 | (see archive link below) |

Note: all CDH5 packages can be downloaded from http://archive.cloudera.com/cdh5/cdh/5/
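The per-package download URLs under that archive follow a single pattern, so fetching can be scripted. A small sketch (the `cdh_url` helper is a convenience of this guide, not a Cloudera tool; check the exact filenames against the archive listing):

```shell
# Hypothetical helper: build the archive URL for a CDH package name.
cdh_url() {
  echo "http://archive.cloudera.com/cdh5/cdh/5/$1.tar.gz"
}

# Print the URL of every tarball used in this guide.
for pkg in hadoop-2.6.0-cdh5.14.2 zookeeper-3.4.5-cdh5.14.2 \
           hbase-1.2.0-cdh5.14.2 hive-1.1.0-cdh5.14.2; do
  cdh_url "$pkg"
done
```

To actually download, pipe a URL to wget, e.g. `cdh_url hadoop-2.6.0-cdh5.14.2 | xargs wget -c` (`-c` resumes interrupted downloads).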

Host planning

Roles of the 5 nodes:

| Hostname | CDHNode1 | CDHNode2 | CDHNode3 | CDHNode4 | CDHNode5 |
|---|---|---|---|---|---|
| IP | 192.168.223.201 | 192.168.223.202 | 192.168.223.203 | 192.168.223.204 | 192.168.223.205 |
| namenode | yes | yes | no | no | no |
| datanode | no | no | yes | yes | yes |
| resourcemanager | yes | yes | no | no | no |
| journalnode | yes | yes | yes | yes | yes |
| zookeeper | yes | yes | yes | no | no |
| hmaster (hbase) | yes | yes | no | no | no |
| regionserver (hbase) | no | no | yes | yes | yes |
| hive (hiveserver2) | no | no | yes | yes | yes |

Note: keep an odd number of JournalNodes and ZooKeeper nodes, and no fewer than 3 of each if high availability is required. The reasons will be covered in a later post.
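The CDHNodeN hostnames are used throughout this guide, so every node must be able to resolve them. If you are not running DNS, one way is to append the planning table's IP/hostname pairs to /etc/hosts on every node (run as root; a sketch, assuming the IPs above):

```shell
# Map the planning table's IPs to hostnames on every node (as root).
cat >> /etc/hosts <<'EOF'
192.168.223.201 CDHNode1
192.168.223.202 CDHNode2
192.168.223.203 CDHNode3
192.168.223.204 CDHNode4
192.168.223.205 CDHNode5
EOF
```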

Pre-installation preparation

  1. Disable SELinux on all nodes
sed -i 's/^SELINUX=.*$/SELINUX=disabled/g' /etc/selinux/config
setenforce 0
  2. Disable the firewall (firewalld or iptables) on all nodes
systemctl disable firewalld
systemctl stop firewalld
systemctl disable iptables
systemctl stop iptables
  3. Enable time synchronization on all nodes with ntpdate
echo "*/5 * * * * /usr/sbin/ntpdate asia.pool.ntp.org | logger -t NTP" >> /var/spool/cron/root
  4. Set the language encoding and time zone on all nodes
echo 'export TZ=Asia/Shanghai' >> /etc/profile
echo 'export LANG=en_US.UTF-8' >> /etc/profile
. /etc/profile
  5. Add the hadoop user on all nodes
useradd -m hadoop
echo '123456' | passwd --stdin hadoop
# set PS1
su - hadoop
echo 'export PS1="\u@\h:\$PWD>"' >> ~/.bash_profile
echo "alias mv='mv -i'" >> ~/.bash_profile
echo "alias rm='rm -i'" >> ~/.bash_profile
. ~/.bash_profile
  6. Set up passwordless SSH login between the hadoop users. First generate a key pair on CDHNode1:
su - hadoop
ssh-keygen -t rsa	# press Enter at every prompt to generate the hadoop user's key pair
cd .ssh
vi id_rsa.pub	# delete the trailing hostname "hadoop@CDHNode1" from the public key
cat id_rsa.pub > authorized_keys
chmod 600 authorized_keys

Zip up the .ssh directory

su - hadoop
zip -r ssh.zip .ssh

Then distribute ssh.zip to the hadoop user's home directory on CDHNode2-5 and unzip it there; passwordless login is then complete.
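The distribution step can be sketched as a loop from CDHNode1 (assumes the hostnames resolve and unzip is installed on the targets; each scp/ssh still prompts for the hadoop password, since passwordless login is not in place yet):

```shell
# Copy ssh.zip to CDHNode2-5 and unpack it in the hadoop home directory.
for n in 2 3 4 5; do
  scp ssh.zip "CDHNode$n":~/
  ssh "CDHNode$n" 'unzip -o ssh.zip && rm -f ssh.zip'
done
```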

  7. Tune kernel parameters, the maximum number of open files, the maximum number of processes, and so on. The right values vary per host, so no specific settings are given here; but if the Hadoop environment will serve production, tuning is mandatory, as the Linux defaults can severely degrade cluster performance.
  8. Mount a 15 GB data disk at /chunk1 on the datanode nodes (CDHNode3-5), then grant the hadoop user ownership of the mounted directory.
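The data-disk preparation on each of CDHNode3-5 might look like the following (run as root; the device name /dev/sdb is an assumption, substitute your actual data disk):

```shell
# Format the 15 GB data disk, mount it at /chunk1, and hand it to hadoop (as root).
mkfs.xfs /dev/sdb                                        # assumed device name
mkdir -p /chunk1
echo '/dev/sdb /chunk1 xfs defaults 0 0' >> /etc/fstab   # remount on boot
mount /chunk1
chown -R hadoop:hadoop /chunk1
```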

Note: all of the steps above must be run as root. The operating system environment is now ready; the installation proper begins below, and unless stated otherwise every later command is run as the hadoop user.

Installing JDK 1.8

All nodes need the JDK, and the procedure is identical on each. Unpack jdk-8u172-linux-x64.tar.gz:

tar zxvf jdk-8u172-linux-x64.tar.gz
mkdir -p /home/hadoop/app
mv jdk1.8.0_172 /home/hadoop/app/jdk	# the tarball unpacks to a jdk1.8.0_172 directory
rm -f jdk-8u172-linux-x64.tar.gz

Configure the environment variables: vi ~/.bash_profile and add:

#java
export JAVA_HOME=/home/hadoop/app/jdk
export CLASSPATH=.:$JAVA_HOME/lib:$CLASSPATH
export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin

Load the environment variables

. ~/.bash_profile

Verify the installation: java -version

java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)

If you see output like the above, the installation succeeded.

Installing ZooKeeper

Install on CDHNode1 first.

Unpack zookeeper-3.4.5-cdh5.14.2.tar.gz:

tar zxvf zookeeper-3.4.5-cdh5.14.2.tar.gz
mv zookeeper-3.4.5-cdh5.14.2 /home/hadoop/app/zookeeper
rm -f zookeeper-3.4.5-cdh5.14.2.tar.gz

Set the environment variables: vi ~/.bash_profile and add:

#zk
export ZOOKEEPER_HOME=/home/hadoop/app/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin 

Load the environment variables

. ~/.bash_profile

Add the configuration file: vi /home/hadoop/app/zookeeper/conf/zoo.cfg with the following content:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
# data file directory and transaction log directory
dataDir=/home/hadoop/data/zookeeper/zkdata
dataLogDir=/home/hadoop/data/zookeeper/zkdatalog
# the port at which the clients will connect
clientPort=2181
# server.<id>=<hostname>:<peer communication port>:<leader election port>
server.1=CDHNode1:2888:3888
server.2=CDHNode2:2888:3888
server.3=CDHNode3:2888:3888
# when the membership changes, just add or remove the matching server.N lines
# here (on every node), then start the added nodes or stop the removed ones
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

Create the required directories

mkdir -p /home/hadoop/data/zookeeper/zkdata
mkdir -p /home/hadoop/data/zookeeper/zkdatalog
mkdir -p /home/hadoop/app/zookeeper/logs 

Add the myid file: vim /home/hadoop/data/zookeeper/zkdata/myid, containing:

1

Note: this number is the 1 after "server." in the zoo.cfg line server.1=CDHNode1:2888:3888; CDHNode2 therefore gets 2 and CDHNode3 gets 3.
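Since the id here is just the N in the CDHNodeN hostname, a tiny helper can derive it (a convenience sketch of this guide, not a ZooKeeper tool):

```shell
# Derive the myid value from a CDHNodeN hostname (matches the server.N lines).
myid_for() {
  echo "${1#CDHNode}"   # strip the "CDHNode" prefix, leaving the number
}

# On each ZooKeeper node one could then write:
#   myid_for "$(hostname)" > /home/hadoop/data/zookeeper/zkdata/myid
myid_for CDHNode1   # prints 1
```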

Configure the log directory: vim /home/hadoop/app/zookeeper/libexec/zkEnv.sh and change these parameters to:

ZOO_LOG_DIR="$ZOOKEEPER_HOME/logs"
ZOO_LOG4J_PROP="INFO,ROLLINGFILE"

Note: /home/hadoop/app/zookeeper/libexec/zkEnv.sh and /home/hadoop/app/zookeeper/bin/zkEnv.sh have identical contents. The startup script /home/hadoop/app/zookeeper/bin/zkServer.sh reads /home/hadoop/app/zookeeper/libexec/zkEnv.sh first, and falls back to /home/hadoop/app/zookeeper/bin/zkEnv.sh only when the former does not exist.

Then vim /home/hadoop/app/zookeeper/conf/log4j.properties and change these parameters to:

zookeeper.root.logger=INFO, ROLLINGFILE
zookeeper.log.dir=/home/hadoop/app/zookeeper/logs
log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender 

Copy ZooKeeper to CDHNode2-3

scp ~/.bash_profile CDHNode2:/home/hadoop
scp ~/.bash_profile CDHNode3:/home/hadoop
scp -pr /home/hadoop/app/zookeeper CDHNode2:/home/hadoop/app
scp -pr /home/hadoop/app/zookeeper CDHNode3:/home/hadoop/app
ssh CDHNode2 "mkdir -p /home/hadoop/data/zookeeper/zkdata; mkdir -p /home/hadoop/data/zookeeper/zkdatalog; mkdir -p /home/hadoop/app/zookeeper/logs"
ssh CDHNode2 "echo 2 > /home/hadoop/data/zookeeper/zkdata/myid"
ssh CDHNode3 "mkdir -p /home/hadoop/data/zookeeper/zkdata; mkdir -p /home/hadoop/data/zookeeper/zkdatalog; mkdir -p /home/hadoop/app/zookeeper/logs"
ssh CDHNode3 "echo 3 > /home/hadoop/data/zookeeper/zkdata/myid"

Start ZooKeeper on all 3 nodes

/home/hadoop/app/zookeeper/bin/zkServer.sh start

Check each node's status

/home/hadoop/app/zookeeper/bin/zkServer.sh status

If one node is the leader and the other two are followers, ZooKeeper is installed and working correctly.
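Each node can also be probed over port 2181 with ZooKeeper's four-letter-word commands (this assumes nc is installed, e.g. from the nmap-ncat package; a healthy server answers "ruok" with "imok"):

```shell
# Liveness probe for all three ZooKeeper nodes.
for host in CDHNode1 CDHNode2 CDHNode3; do
  echo "$host: $(echo ruok | nc "$host" 2181)"
done
```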

Check the processes

jps

The QuorumPeerMain process is ZooKeeper.

Stop ZooKeeper

/home/hadoop/app/zookeeper/bin/zkServer.sh stop

Installing Hadoop

Install on CDHNode1 first, then copy to the other nodes. Unpack hadoop-2.6.0-cdh5.14.2.tar.gz:

tar zxvf hadoop-2.6.0-cdh5.14.2.tar.gz
mv hadoop-2.6.0-cdh5.14.2 /home/hadoop/app/hadoop
rm -f hadoop-2.6.0-cdh5.14.2.tar.gz

Set the environment variables: vi ~/.bash_profile and add:

#hadoop
HADOOP_HOME=/home/hadoop/app/hadoop
PATH=$HADOOP_HOME/bin:$PATH
export HADOOP_HOME PATH 

Load the environment variables

. ~/.bash_profile

Configuring HDFS

Edit /home/hadoop/app/hadoop/etc/hadoop/hadoop-env.sh and change the following line:

export JAVA_HOME=/home/hadoop/app/jdk

Configure /home/hadoop/app/hadoop/etc/hadoop/core-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cluster1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/data/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>CDHNode1:2181,CDHNode2:2181,CDHNode3:2181</value>
  </property>
</configuration>

Configure /home/hadoop/app/hadoop/etc/hadoop/hdfs-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/data/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/chunk1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>cluster1</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.cluster1</name>
    <value>CDHNode1,CDHNode2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster1.CDHNode1</name>
    <value>CDHNode1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.CDHNode1</name>
    <value>CDHNode1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster1.CDHNode2</name>
    <value>CDHNode2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.CDHNode2</name>
    <value>CDHNode2:50070</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://CDHNode1:8485;CDHNode2:8485;CDHNode3:8485;CDHNode4:8485;CDHNode5:8485/cluster1</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.cluster1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/data/journaldata/jn</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>shell(/bin/true)</value>
  </property>
</configuration>

Reposted from: https://www.cnblogs.com/leffss/p/9184171.html
