Installing Hadoop 3.x on CentOS 7: Fully Distributed Deployment

  • 1. Minimal installation of CentOS 7
    • 1.1 Install net-tools to enable ifconfig
    • 1.2 Update the system
    • 1.3 Configure a static IP
      • 1.3.1 List the network interfaces (files named ifcfg-enp* are interface configs)
      • 1.3.2 Configure the interface (assign a VirtualBox host-only adapter with a static IP)
      • 1.3.3 Restart the network service
      • 1.3.4 Change the hostname (can also be set during installation)
    • 1.4 Configure /etc/hosts so nodes can be reached by name
    • 1.5 Set up passwordless SSH login and generate the key files
  • 2. Install the JDK
    • 2.1 Download the JDK
    • 2.2 Extract the JDK under /opt
    • 2.3 Add the JDK to the environment variables
    • 2.4 Verify the JDK installation
  • 3. Install Hadoop
    • 3.1 Download Hadoop
    • 3.2 Place the downloaded Hadoop under /opt
    • 3.3 Install ZooKeeper
      • 3.3.1 Download ZooKeeper
      • 3.3.2 Copy ZooKeeper to the target machines
      • 3.3.3 Extract ZooKeeper
      • 3.3.4 Create a symlink
      • 3.3.5 Configure environment variables
      • 3.3.6 Configure the ZooKeeper cluster (edit the config file)
      • 3.3.7 Copy the config file to the other nodes
      • 3.3.8 Create the node ID: add a myid file in the configured dataDir
      • 3.3.9 Start ZooKeeper (already on the PATH)
      • 3.3.10 Verify that it started
      • 3.3.11 (Optional) Configure ZooKeeper to start on boot on CentOS 7
    • 3.4 Edit the Hadoop configuration (fully distributed)
      • 3.4.1 Configure Hadoop environment variables
      • 3.4.2 Hadoop role assignment across nodes
      • 3.4.3 Edit the Hadoop environment file hadoop-env.sh
      • 3.4.4 Configure highly available HDFS per the official documentation
    • 3.5 Start HDFS
      • 3.5.1 Start ZooKeeper first
      • 3.5.2 Format the ZooKeeper state on one of the NameNodes
      • 3.5.3 Start the JournalNodes (on every JournalNode host)
      • 3.5.4 Format the NameNode
      • 3.5.5 Start the NameNode so the others can sync from it
      • 3.5.6 Sync the other NameNodes
      • 3.5.7 Configure the DataNodes
      • 3.5.8 Start HDFS
  • 4. Configure log aggregation and the JobHistory Server
    • 4.1 yarn-site.xml: configure the ResourceManager web address
    • 4.2 mapred-site.xml: configure the JobHistory Server
    • 4.3 yarn-site.xml: enable log aggregation
    • Troubleshooting
      • 1. zkfc format error
      • 2. NameNode format error: keeps retrying the connection
      • 3. HDFS startup error
      • 4. YARN startup error
      • 5. NodeManager startup error
      • 6. NodeManager starts and then immediately exits
      • 7. HDFS safe mode is on

1. Minimal installation of CentOS 7

1.1 Install net-tools to enable ifconfig

  yum install net-tools vim

1.2 Update the system

    yum update

1.3 Configure a static IP

1.3.1 List the network interfaces (files named ifcfg-enp* are interface configs)

ls /etc/sysconfig/network-scripts/

1.3.2 Configure the interface (assign a VirtualBox host-only adapter with a static IP)

vi /etc/sysconfig/network-scripts/ifcfg-enp*
# Enable the host-only interface
cd /etc/sysconfig/network-scripts/
cp ifcfg-enp0s3  ifcfg-enp0s8

Change the interface to a static IP:
1. Set BOOTPROTO to static
2. Set NAME to enp0s8
3. Change the UUID (any value works, as long as it differs from the original)
4. Add IPADDR with an address of your choice; the host uses it to reach the VM.
5. Add NETMASK=255.255.255.0 (the subnet mask for a /24 network)
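
The edits above can be sketched as a complete ifcfg-enp0s8 file. This is only an illustration: the IPADDR and UUID values here are examples and must be adapted to your own host-only network.

```
TYPE=Ethernet
BOOTPROTO=static
NAME=enp0s8
DEVICE=enp0s8
# Example UUID; any value different from the original interface works
UUID=2c2371f1-ef29-4514-a568-c4904bd11c97
ONBOOT=yes
# Example address on the VirtualBox host-only network
IPADDR=192.168.56.11
NETMASK=255.255.255.0
```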

1.3.3 Restart the network service

service network restart

1.3.4 Change the hostname (can also be set during installation)

vim /etc/hostname

1.4 Configure /etc/hosts so nodes can be reached by name

vim /etc/hosts
# Copy to the other machines
scp /etc/hosts  root@192.168.56.12:/etc/hosts

Then add an entry for each node.
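
For reference, the /etc/hosts entries assumed throughout this guide (node1 through node4 on the 192.168.56.0/24 host-only network) would look like the following; adjust the IPs to your own network:

```
192.168.56.11 node1
192.168.56.12 node2
192.168.56.13 node3
192.168.56.14 node4
```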

1.5 Set up passwordless SSH login and generate the key files

    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 0600 ~/.ssh/authorized_keys
    # Copy the public key to the remote server
    cat ~/.ssh/id_rsa.pub | ssh root@192.168.56.101  "cat - >> ~/.ssh/authorized_keys"

    # For mutual passwordless login, copy the combined key file to each node
    # (note: this overwrites the remote authorized_keys)
    scp ~/.ssh/authorized_keys  root@192.168.56.14:~/.ssh/authorized_keys

2. Install the JDK

2.1 Download the JDK

2.2 Extract the JDK under /opt

    cd /opt/
    tar -xzvf server-jre-8u161-linux-x64.tar.gz
    # Create a symlink
    ln -sf jdk1.8.0_161/ jdk

2.3 Add the JDK to the environment variables

    vim /etc/profile
    # Add the following
    export JAVA_HOME=/opt/jdk
    export PATH=.:$PATH:$JAVA_HOME/bin
    # Apply the changes
    source /etc/profile

2.4 Verify the JDK installation

    java -version

3. Install Hadoop

3.1 Download Hadoop

3.2 Place the downloaded Hadoop under /opt

    # 1. Extract Hadoop
    tar -xzvf hadoop-3.0.0.tar.gz
    # 2. Create a symlink
    ln -sf hadoop-3.0.0 hadoop

3.3 Install ZooKeeper

3.3.1 Download ZooKeeper

3.3.2 Copy ZooKeeper to the target machines

    scp /opt/zookeeper-3.4.11.tar.gz node2:/opt/

3.3.3 Extract ZooKeeper

    tar -xzvf zookeeper-3.4.11.tar.gz

3.3.4 Create a symlink

    ln -sf zookeeper-3.4.11 zookeeper

3.3.5 Configure environment variables

    vim /etc/profile
        # Add the following (no spaces around the equals signs)
        export ZOOKEEPER_HOME=/opt/zookeeper
        export PATH=$PATH:$ZOOKEEPER_HOME/bin

3.3.6 Configure the ZooKeeper cluster (edit the config file)

    cp /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg
        # Append the following to zoo.cfg (node2/node3/node4 are the server hostnames)
        # For details see: http://zookeeper.apache.org/doc/r3.4.11/zookeeperStarted.html#sc_RunningReplicatedZooKeeper
        tickTime=2000
        # Data directory (keep comments on their own lines;
        # zoo.cfg does not support inline comments after a value)
        dataDir=/opt/data/zookeeper
        clientPort=2181
        initLimit=5
        syncLimit=2
        server.1=node2:2888:3888
        server.2=node3:2888:3888
        server.3=node4:2888:3888

3.3.7 Copy the config file to the other nodes

    scp /opt/zookeeper/conf/zoo.cfg node2:/opt/zookeeper/conf/

3.3.8 Create the node ID: add a myid file in the configured dataDir

    # The number must match this host's server.N entry in zoo.cfg
    # (e.g. "1" on node2, "2" on node3, "3" on node4)
    echo "1" > /opt/data/zookeeper/myid

3.3.9 Start ZooKeeper (already on the PATH)

    zkServer.sh start

3.3.10 Verify that it started

    jps

If a QuorumPeerMain process appears in the jps output, ZooKeeper started successfully. Running `zkServer.sh status` additionally shows whether the node is the leader or a follower.

3.3.11 (Optional) Configure ZooKeeper to start on boot on CentOS 7

  1. Create a unit file zookeeper.service in /etc/systemd/system/ with the
    following content:
[Unit]
Description=zookeeper
After=syslog.target network.target

[Service]
Type=forking
# ZooKeeper log directory; can also be set in zkServer.sh
Environment=ZOO_LOG_DIR=/opt/data/zookeeper/logs
# JDK path; can also be set in zkServer.sh
Environment=JAVA_HOME=/opt/jdk
ExecStart=/opt/zookeeper/bin/zkServer.sh start
ExecStop=/opt/zookeeper/bin/zkServer.sh stop
Restart=always
User=root
Group=root

[Install]
WantedBy=multi-user.target
  2. Reload the unit files
systemctl daemon-reload
  3. Start ZooKeeper
systemctl start zookeeper
  4. Enable start on boot
systemctl enable zookeeper
  5. Check the service status
systemctl status zookeeper

Problem:

nohup: failed to run command `java': No such file or directory

Solution:

ZooKeeper cannot find the java binary. Set JAVA_HOME, either by adding the following to zkServer.sh:

    JAVA_HOME=/opt/jdk

or by specifying it in zookeeper.service:

    Environment=JAVA_HOME=/opt/jdk

3.4 Edit the Hadoop configuration (fully distributed)

References:
1. HDFS high availability: http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
2. YARN ResourceManager high availability: http://hadoop.apache.org/docs/r3.0.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html

3.4.1 Configure Hadoop environment variables

    # Add the Hadoop environment variables to /etc/profile
    export HADOOP_HOME=/opt/hadoop
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    # Apply the changes
    source /etc/profile

3.4.2 Hadoop role assignment across nodes:

Node   NN  DN  ZK  ZKFC  JN  RM  NM
node1  x           x         x
node2  x   x   x   x     x   x   x
node3      x   x         x       x
node4      x   x         x       x

ZooKeeper was already configured above, so no further ZooKeeper setup is needed here.

3.4.3 Edit the Hadoop environment file hadoop-env.sh

    # Set the Java environment variable
    export JAVA_HOME=/opt/jdk
    export HADOOP_HOME=/opt/hadoop

3.4.4 Configure highly available HDFS per the official documentation

  1. Configure hdfs-site.xml as follows:
<configuration>
    <property>
        <!-- Logical name of the nameservice -->
        <name>dfs.nameservices</name>
        <value>hbzx</value>
    </property>
    <property>
        <!-- Disable permission checking -->
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <property>
        <!-- NameNode IDs within the hbzx nameservice -->
        <name>dfs.ha.namenodes.hbzx</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <!-- RPC address of NameNode nn1 -->
        <name>dfs.namenode.rpc-address.hbzx.nn1</name>
        <value>node1:9820</value>
    </property>
    <property>
        <!-- RPC address of NameNode nn2 -->
        <name>dfs.namenode.rpc-address.hbzx.nn2</name>
        <value>node2:9820</value>
    </property>
    <property>
        <!-- Web UI address of NameNode nn1 -->
        <name>dfs.namenode.http-address.hbzx.nn1</name>
        <value>node1:9870</value>
    </property>
    <property>
        <!-- Web UI address of NameNode nn2 -->
        <name>dfs.namenode.http-address.hbzx.nn2</name>
        <value>node2:9870</value>
    </property>

    <property>
        <!-- JournalNode quorum that stores the shared edit log -->
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://node2:8485;node3:8485;node4:8485/hbzx</value>
    </property>

    <property>
        <!-- Proxy provider clients use to locate the active NameNode -->
        <name>dfs.client.failover.proxy.provider.hbzx</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <property>
        <!-- Fencing method used during failover -->
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>

    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>

    <property>
        <!-- Local storage path for JournalNode edits -->
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/data/journal/node/local/data</value>
    </property>

    <property>
        <!-- Enable automatic failover via ZKFC -->
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

</configuration>
  2. Configure core-site.xml
<configuration>
    <property>
        <!-- Default file system: the HA nameservice -->
        <name>fs.defaultFS</name>
        <value>hdfs://hbzx</value>
    </property>
    <property>
        <!-- Base directory for Hadoop's local data -->
        <name>hadoop.tmp.dir</name>
        <value>/opt/data/hadoop/</value>
    </property>

    <property>
        <!-- ZooKeeper quorum used for automatic failover -->
        <name>ha.zookeeper.quorum</name>
        <value>node2:2181,node3:2181,node4:2181</value>
    </property>

</configuration>
  3. Configure yarn-site.xml (single-node defaults plus ResourceManager HA; for multi-node settings see the official documentation):
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>

    <property>
        <!-- Enable ResourceManager HA -->
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <!-- Cluster ID -->
        <name>yarn.resourcemanager.cluster-id</name>
        <value>hbzx</value>
    </property>
    <property>
        <!-- ResourceManager IDs -->
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <!-- Host of ResourceManager rm1 -->
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>node1</value>
    </property>
    <property>
        <!-- Host of ResourceManager rm2 -->
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>node2</value>
    </property>
    <property>
        <!-- Web UI address of ResourceManager rm1 -->
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>node1:8088</value>
    </property>
    <property>
        <!-- Web UI address of ResourceManager rm2 -->
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>node2:8088</value>
    </property>
    <property>
        <!-- ZooKeeper quorum for ResourceManager state -->
        <name>yarn.resourcemanager.zk-address</name>
        <value>node2:2181,node3:2181,node4:2181</value>
    </property>

    <property>
        <!-- Let the NodeManager detect memory and CPU automatically -->
        <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
        <value>true</value>
    </property>
</configuration>
  4. Configure mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
  5. Copy the configuration files to the other machines (run from /opt/hadoop/etc/hadoop)
scp ./* node4:/opt/hadoop/etc/hadoop/

3.5 Start HDFS

3.5.1 Start ZooKeeper first

    zkServer.sh start

3.5.2 Format the ZooKeeper state on one of the NameNodes

hdfs zkfc -formatZK

The format succeeded if the log reports that the HA state znode was created in ZooKeeper without errors.

3.5.3 Start the JournalNodes (on every JournalNode host)

hdfs --daemon start journalnode

Use jps to check that a JournalNode process is running on each host.

3.5.4 Format the NameNode

hdfs namenode -format
# If multiple NameNode services are configured, one can be specified: hdfs namenode -format xxx

If the log contains no Error lines, the format succeeded.

3.5.5 Start the NameNode so the others can sync from it

hdfs --daemon start namenode

Use jps to check that the NameNode process started.

3.5.6 Sync the other NameNodes

  1. For NameNodes configured for high availability, run the following on each NameNode to be synced:

hdfs namenode -bootstrapStandby

  2. For NameNodes not configured for high availability, sync with:

hdfs namenode -initializeSharedEdits

3.5.7 Configure the DataNodes

Edit the workers file and add the DataNode hosts:

node2
node3
node4

3.5.8 Start HDFS

start-dfs.sh

Check the result with jps: each node should show the processes assigned to it in the role table above.

Then open the NameNode web UI in a browser:
http://192.168.56.11:9870

4. Configure log aggregation and the JobHistory Server

4.1 yarn-site.xml: configure the ResourceManager web address

<property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>rmhost:8088</value>
</property>

4.2 mapred-site.xml: configure the JobHistory Server

<property>
    <name>mapreduce.jobhistory.address</name>
    <value>rmhost:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>rmhost:19888</value>
</property>
<property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>/mr-history/tmp</value>
</property>
<property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/mr-history/done</value>
</property>

Note: the JobHistory Server must be started separately:

mapred --daemon start historyserver

4.3 yarn-site.xml: enable log aggregation

<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>

<property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/user/container/logs</value>
</property>

Troubleshooting

1. zkfc format error

java.net.NoRouteToHostException: No route to host
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
2018-02-06 11:34:01,218 ERROR ha.ActiveStandbyElector: Connection timed out: couldn't connect to ZooKeeper in 5000 milliseconds
2018-02-06 11:34:01,461 INFO zookeeper.ClientCnxn: Opening socket connection to server node2/192.168.56.12:2181. Will not attempt to authenticate using SASL (unknown error)

Solution:

Stop the firewall and disable it at boot:

systemctl stop firewalld.service     # stop firewalld
systemctl disable firewalld.service  # disable firewalld at boot

2. NameNode format error: keeps retrying the connection

Log output:

2018-02-06 11:43:58,061 INFO ipc.Client: Retrying connect to server: node2/192.168.56.12:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-02-06 11:43:58,062 INFO ipc.Client: Retrying connect to server: node4/192.168.56.14:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-02-06 11:43:58,062 INFO ipc.Client: Retrying connect to server: node3/192.168.56.13:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

Solution:

Start the JournalNode daemon on every JournalNode host first:

hdfs --daemon start journalnode

Use jps to check that a JournalNode process is running on each host.

3. HDFS startup error

Starting namenodes on [node1 node2]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting journal nodes [node2 node3 node4]
ERROR: Attempting to operate on hdfs journalnode as root
ERROR: but there is no HDFS_JOURNALNODE_USER defined. Aborting operation.
Starting ZK Failover Controllers on NN hosts [node1 node2]
ERROR: Attempting to operate on hdfs zkfc as root
ERROR: but there is no HDFS_ZKFC_USER defined. Aborting operation.

Solution:
Add the following at the top of start-dfs.sh and stop-dfs.sh:

# Note: no spaces around the equals signs
HDFS_NAMENODE_USER=root
HDFS_DATANODE_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_ZKFC_USER=root

4. YARN startup error

Starting resourcemanager
ERROR: Attempting to operate on yarn resourcemanager as root
ERROR: but there is no YARN_RESOURCEMANAGER_USER defined. Aborting operation.
Starting nodemanagers
ERROR: Attempting to operate on yarn nodemanager as root
ERROR: but there is no YARN_NODEMANAGER_USER defined. Aborting operation.

Solution:

Add the following at the top of start-yarn.sh:

# Note: no spaces around the equals signs
YARN_RESOURCEMANAGER_USER=root
YARN_NODEMANAGER_USER=root

5. NodeManager startup error

2018-02-06 15:22:36,169 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Your endpoint configuration is wrong; For more details see:  http://wiki.apache.org/hadoop/UnsetHostnameOrPort
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:259)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:451)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:834)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:894)

Solution:
Let the NodeManager detect memory and CPU automatically by adding the following to yarn-site.xml:

    <property>
        <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
        <value>true</value>
    </property>

6. NodeManager starts and then immediately exits

2018-02-06 16:50:31,210 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Received SHUTDOWN signal from Resourcemanager, Registration of NodeManager failed, Message from ResourceManager: NodeManager from  node4 doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the NodeManager. Node capabilities are 256, vCores:1>; minimums are 1024mb and 1 vcores
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:259)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:451)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:834)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:894)
Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Received SHUTDOWN signal from Resourcemanager, Registration of NodeManager failed, Message from ResourceManager: NodeManager from  node4 doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the NodeManager. Node capabilities are 256, vCores:1>; minimums are 1024mb and 1 vcores
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:375)
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:253)
    ... 6 more

Solution: give the node more resources; the NodeManager requires at least 1024 MB of memory and 1 vCPU.
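
Alternatively, if the VM cannot be given more memory, the error can be worked around by lowering YARN's minimums in yarn-site.xml. This is a sketch with illustrative values, not a recommended production setting; very small allocations may leave too little room for real jobs.

```xml
<!-- Illustrative values: allow a small node to register with the RM -->
<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>256</value>
</property>
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>512</value>
</property>
```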

7. HDFS safe mode is on

Solution:

hdfs dfsadmin -safemode leave
