--Notes & Lessons 8: Building an HA Environment (Hands-On)

1. At first I followed the official docs and tried to build HA without ZooKeeper and without automatic failover, assuming it could just be configured and used. But hdfs dfs -mkdir /lcc kept creating the directory on the local filesystem. Only later did I learn that an HA setup must have ZooKeeper.
2. After ZooKeeper was in place and HA was set up, the same problem remained: hdfs dfs -mkdir /lcc created a local directory, and only hdfs dfs -mkdir hdfs://mycluster/lcc created the remote one. So I guessed the default filesystem setting had never taken effect.
Checking the documentation finally revealed the correct form:

<property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
</property>

whereas I had written it as

<property>
        <name>dfs.defaultFS</name>
        <value>hdfs://mycluster</value>
</property>

which is why it was wrong.
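
A quick way to check which default filesystem the client actually resolves (hdfs getconf is a standard HDFS tool; the expected value assumes the corrected property above):

hdfs getconf -confKey fs.defaultFS
# should print: hdfs://mycluster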

3. Now let's rebuild a complete HA cluster from scratch.
First, make sure you have three machines, each set up as shown in the figure below:
(Figure 1: setup and roles of the three machines)
4. Set up the IP-to-hostname mappings, make sure each machine can reach the network, and pick IPs to suit your own environment.

Modify the hostname:
[root@biluos2 ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=biluos2.com

Modify the hostname mappings:
[root@biluos2 ~]# vim /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.10.173         biluos.com      biluos
192.168.10.174         biluos1.com     biluos1
192.168.10.175         biluos2.com     biluos2
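
A quick sanity check that the mappings work (run from any node; repeat for each hostname):

ping -c 1 biluos.com
ping -c 1 biluos1.com
ping -c 1 biluos2.com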

5. Configure passwordless SSH login

Run on biluos.com:
ssh-keygen -t rsa
ssh-copy-id -i .ssh/id_rsa.pub root@biluos.com 
ssh-copy-id -i .ssh/id_rsa.pub root@biluos1.com 
ssh-copy-id -i .ssh/id_rsa.pub root@biluos2.com 

Run on biluos1.com:
ssh-keygen -t rsa
ssh-copy-id -i .ssh/id_rsa.pub root@biluos.com 
ssh-copy-id -i .ssh/id_rsa.pub root@biluos1.com 
ssh-copy-id -i .ssh/id_rsa.pub root@biluos2.com 

Run on biluos2.com:
ssh-keygen -t rsa
ssh-copy-id -i .ssh/id_rsa.pub root@biluos.com 
ssh-copy-id -i .ssh/id_rsa.pub root@biluos1.com 
ssh-copy-id -i .ssh/id_rsa.pub root@biluos2.com 

Be sure to use the hostnames here rather than IPs, or inexplicable problems will follow.
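
To confirm passwordless login really works in every direction, a loop like this (a quick sketch; run it from each of the three nodes) should print the remote hostnames without any password prompt:

for h in biluos.com biluos1.com biluos2.com; do
    ssh root@$h hostname
done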

6. Download ZooKeeper and Hadoop and extract them into /opt/moudles/. The Hadoop used here is 2.7.3; a newer release is recommended, since newer builds don't have the native-library problem. Installing the JDK isn't walked through here, but it is required.

[root@biluos2 ~]# ll /opt/moudles/ 
total 28
drwxr-xr-x. 10 root root 4096 Jul 31 03:09 hadoop-2.7.3
drwxr-xr-x.  3 root root 4096 Jul 30 03:24 hadoop-2.7.3.data
drwxr-xr-x.  8 root root 4096 Jul 26 08:37 jdk1.8.0_121
drwxr-xr-x.  6 root root 4096 Jul 26 08:46 myhadoopdata
drwxr-xr-x.  4 root root 4096 Jul 30 05:04 myzookeeperdata
drwxr-xr-x. 10 root root 4096 Jul 30 05:27 zookeeper-3.4.6
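
Related to the native-library remark above: Hadoop ships a checknative tool that reports whether the native libraries were loaded, so you can verify your build directly:

/opt/moudles/hadoop-2.7.3/bin/hadoop checknative -a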

7. Configure environment variables

export JAVA_HOME=/opt/moudles/jdk1.8.0_121
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar

export HADOOP_HOME=/opt/moudles/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin

export ZOOKEEPER_HOME=/opt/moudles/zookeeper-3.4.6
export PATH=$PATH:$ZOOKEEPER_HOME/bin:$ZOOKEEPER_HOME/conf
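
Assuming these exports were added to /etc/profile (an assumption; use your own shell profile if different), reload and spot-check them:

source /etc/profile
java -version
hadoop version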

8. Configure ZooKeeper

[root@biluos zookeeper-3.4.6]# vim conf/zoo.cfg 
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/opt/moudles/myzookeeperdata/data
dataLogDir=/opt/moudles/myzookeeperdata/logs
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
autopurge.snapRetainCount=30
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
autopurge.purgeInterval=24
server.1=biluos.com:2888:3888
server.2=biluos1.com:2888:3888
server.3=biluos2.com:2888:3888

9. The two paths configured above, dataDir=/opt/moudles/myzookeeperdata/data and dataLogDir=/opt/moudles/myzookeeperdata/logs, must be created by hand. Then, inside the dataDir path /opt/moudles/myzookeeperdata/data, create a file named myid containing 1 on biluos.com, 2 on biluos1.com, and 3 on biluos2.com. Get this exactly right (not one character more or less), or ZooKeeper will not start.
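
For example, a minimal sketch of the commands on biluos.com (write 2 or 3 into myid on biluos1.com and biluos2.com respectively):

mkdir -p /opt/moudles/myzookeeperdata/data /opt/moudles/myzookeeperdata/logs
echo 1 > /opt/moudles/myzookeeperdata/data/myid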
10. Start ZooKeeper and verify that it came up properly: on each of the three machines run bin/zkServer.sh start, then check with bin/zkServer.sh status:

[root@biluos zookeeper-3.4.6]# bin/zkServer.sh status
JMX enabled by default
Using config: /opt/moudles/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower

[root@biluos1 zookeeper-3.4.6]# bin/zkServer.sh status
JMX enabled by default
Using config: /opt/moudles/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader

[root@biluos2 zookeeper-3.4.6]# bin/zkServer.sh status
JMX enabled by default
Using config: /opt/moudles/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower

jps on each node shows:

[root@biluos ~]# jps
3113 QuorumPeerMain
4862 Jps

[root@biluos1 ~]# jps
3113 QuorumPeerMain
4862 Jps

[root@biluos2 ~]# jps
3113 QuorumPeerMain
4862 Jps

One leader plus two followers means success. That is all we need to do with ZooKeeper for now.

11. Configure the Hadoop environment


In hadoop-env.sh:
export JAVA_HOME=/opt/moudles/jdk1.8.0_121

Configure hdfs-site.xml



<configuration>

    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>

    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>myNameNode1,myNameNode2</value>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.mycluster.myNameNode1</name>
        <value>biluos.com:8020</value>
    </property>
    <property>
        <name>dfs.namenode.servicerpc-address.mycluster.myNameNode1</name>
        <value>biluos.com:8022</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.myNameNode1</name>
        <value>biluos.com:50070</value>
    </property>
    <property>
        <name>dfs.namenode.https-address.mycluster.myNameNode1</name>
        <value>biluos.com:50470</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address.mycluster.myNameNode1</name>
        <value>biluos.com:50090</value>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.mycluster.myNameNode2</name>
        <value>biluos1.com:8020</value>
    </property>
    <property>
        <name>dfs.namenode.servicerpc-address.mycluster.myNameNode2</name>
        <value>biluos1.com:8022</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.myNameNode2</name>
        <value>biluos1.com:50070</value>
    </property>
    <property>
        <name>dfs.namenode.https-address.mycluster.myNameNode2</name>
        <value>biluos1.com:50470</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address.mycluster.myNameNode2</name>
        <value>biluos1.com:50090</value>
    </property>

    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/opt/moudles/hadoop-2.7.3.data/ha/data/dfs/namenode/name</value>
    </property>
    <property>
        <name>dfs.namenode.edits.dir</name>
        <value>/opt/moudles/hadoop-2.7.3.data/ha/data/dfs/namenode/edits</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/opt/moudles/hadoop-2.7.3.data/ha/data/dfs/dn</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>/opt/moudles/hadoop-2.7.3.data/ha/data/dfs/secondarynamenode/name</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.edits.dir</name>
        <value>/opt/moudles/hadoop-2.7.3.data/ha/data/dfs/secondarynamenode/edits</value>
    </property>

    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://biluos.com:8485;biluos1.com:8485;biluos2.com:8485/mycluster</value>
    </property>

    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/moudles/hadoop-2.7.3.data/ha/data/dfs/jn</value>
    </property>

    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>

    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>

    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

</configuration>
Note: some of the directories above must be created by hand, on every node that uses them:
mkdir -p /opt/moudles/hadoop-2.7.3.data/ha/data/dfs/namenode/name
mkdir -p /opt/moudles/hadoop-2.7.3.data/ha/data/dfs/namenode/edits
mkdir -p /opt/moudles/hadoop-2.7.3.data/ha/data/dfs/dn
mkdir -p /opt/moudles/hadoop-2.7.3.data/ha/data/dfs/secondarynamenode/name
mkdir -p /opt/moudles/hadoop-2.7.3.data/ha/data/dfs/secondarynamenode/edits
mkdir -p /opt/moudles/hadoop-2.7.3.data/ha/data/dfs/jn

The property is dfs.ha.fencing.methods; do not write it as dfs.ha.fencing.method. I made exactly that mistake, and the cluster looked perfectly healthy but never switched a NameNode to active automatically; both stayed in standby.

Configure core-site.xml


<configuration>

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/moudles/hadoop-2.7.3.data/ha/data/tmp</value>
    </property>

    <property>
        <name>fs.trash.interval</name>
        <value>10080</value>
    </property>

    <property>
        <name>ha.zookeeper.quorum</name>
        <value>biluos.com:2181,biluos1.com:2181,biluos2.com:2181</value>
    </property>

</configuration>

Note: fs.defaultFS must not be written as dfs.defaultFS. I made that mistake, which is why hdfs dfs -mkdir /lcc kept creating the directory locally: the default filesystem was misconfigured.

Configure the slaves file:
biluos.com
biluos1.com
biluos2.com
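
The same configuration must be present on all three machines. One way to push it out (a sketch, assuming identical install paths on every host):

for h in biluos1.com biluos2.com; do
    scp /opt/moudles/hadoop-2.7.3/etc/hadoop/{hadoop-env.sh,core-site.xml,hdfs-site.xml,slaves} root@$h:/opt/moudles/hadoop-2.7.3/etc/hadoop/
done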

12. Start everything, step by step

    (1) First start ZooKeeper on every node (all three in my case):
        /opt/moudles/zookeeper-3.4.6/bin/zkServer.sh start
        jps shows:
        [root@biluos1 ~]# jps
        1509 QuorumPeerMain
    (2) Start the JournalNode on every node (all three in my case):
        /opt/moudles/hadoop-2.7.3/sbin/hadoop-daemon.sh start journalnode
        jps shows:
        [root@biluos1 ~]# jps
        1509 QuorumPeerMain
        5851 JournalNode
    (3) Format the NameNode on the first NameNode machine, biluos.com in my case:
        /opt/moudles/hadoop-2.7.3/bin/hadoop namenode -format
        Make sure there are no errors here. If you later see errors like
        java.io.IOException: Incompatible clusterIDs in /opt/moudles/hadoop-
            c31a265c5e6e; datanode clusterID = CID-a2b73025-f5cc-4bf2-8793-
                    c1ff1a3628bd at org.apache.hadoop.hdfs.server.datanode.DataStorage.
                    doTransition(DataStorage.java:775)
        java.io.IOException: All specified directories are failed to load.
                at org.apache.hadoop.hdfs.server.datanode.DataStorage.
                recoverTransitionRead(DataStorage.java:574)
        it means the datanode clusterID no longer matches; delete the data directories and re-format, or fix the configuration files:
        rm -rf /opt/moudles/hadoop-2.7.3.data

        mkdir -p /opt/moudles/hadoop-2.7.3.data/ha/data/dfs/namenode/name
        mkdir -p /opt/moudles/hadoop-2.7.3.data/ha/data/dfs/namenode/edits
        mkdir -p /opt/moudles/hadoop-2.7.3.data/ha/data/dfs/dn
        mkdir -p /opt/moudles/hadoop-2.7.3.data/ha/data/dfs/secondarynamenode/name
        mkdir -p /opt/moudles/hadoop-2.7.3.data/ha/data/dfs/secondarynamenode/edits
        mkdir -p /opt/moudles/hadoop-2.7.3.data/ha/data/dfs/jn

    (4) Start the NameNode on the first machine, biluos.com in my case:
          /opt/moudles/hadoop-2.7.3/sbin/hadoop-daemon.sh start namenode
    (5) On the second machine, biluos1.com in my case, sync the metadata over:
         /opt/moudles/hadoop-2.7.3/bin/hdfs namenode -bootstrapStandby
         then start the NameNode there:
         /opt/moudles/hadoop-2.7.3/sbin/hadoop-daemon.sh start namenode
    (6) Start the DataNode on all nodes:
        /opt/moudles/hadoop-2.7.3/sbin/hadoop-daemon.sh start datanode
    (7) At this point both NameNodes are running, but both remain in standby state.
    (8) Format ZK for the failover controller; running this on one machine is enough:
        hdfs zkfc -formatZK 
    (9) Start the DFSZKFailoverController on the nodes that run a NameNode:
        /opt/moudles/hadoop-2.7.3/sbin/hadoop-daemon.sh start zkfc
        Once it starts, you will find one of the NameNodes has automatically switched to active (see the state check after the jps output below).
    (10) The final jps output:
        [root@biluos moudles]# jps
        7536 NameNode
        8018 Jps
        2024 QuorumPeerMain
        7849 DFSZKFailoverController
        7673 DataNode
        7451 JournalNode

        [root@biluos1 ~]# jps
        6240 DFSZKFailoverController
        1509 QuorumPeerMain
        5975 NameNode
        6535 Jps
        5851 JournalNode
        6109 DataNode
        [root@biluos1 ~]# 

        [root@biluos2 ~]# jps
        3684 JournalNode
        4006 Jps
        3770 DataNode
        1501 QuorumPeerMain
        [root@biluos2 ~]# 
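
To confirm the active/standby split from the command line, hdfs haadmin can query each NameNode by the service IDs defined in hdfs-site.xml (myNameNode1 and myNameNode2):

/opt/moudles/hadoop-2.7.3/bin/hdfs haadmin -getServiceState myNameNode1
/opt/moudles/hadoop-2.7.3/bin/hdfs haadmin -getServiceState myNameNode2

A crude failover test (a sketch, not from my original run): on the host whose NameNode reports active, kill the NameNode pid shown by jps, then re-run the queries above; the other NameNode should now report active.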

You may hit a lot of problems along the way, but they almost all come down to configuration mistakes. Also remember to turn off the firewall, or the cluster will fail.
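
For reference, on a CentOS-6-style system (which the /etc/sysconfig/network usage above suggests; adjust for your distro), turning the firewall off on every node looks like:

service iptables stop
chkconfig iptables off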
