Hadoop NameNode formatting: a roundup of problems

(continuously updated)


0 Hadoop cluster environment

Three RHEL 6.4 machines: 2 NameNodes + 2 ZKFCs, plus 3 JournalNodes + zookeeper-server, forming the simplest possible HA cluster.

1) hdfs-site.xml is configured as follows:

<?xml version="1.0" encoding="UTF-8"?>

<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>hacl</value>
        <description>unique identifiers for each NameNode in the nameservice.</description>
    </property>

    <property>
        <name>dfs.ha.namenodes.hacl</name>
        <value>hn1,hn2</value>
        <description>Configure with a list of comma-separated NameNode IDs.</description>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.hacl.hn1</name>
        <value>hacl-node1.pepstack.com:8020</value>
        <description>the fully-qualified RPC address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.hacl.hn2</name>
        <value>hacl-node2.pepstack.com:8020</value>
        <description>the fully-qualified RPC address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.http-address.hacl.hn1</name>
        <value>hacl-node1.pepstack.com:50070</value>
        <description>the fully-qualified HTTP address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.http-address.hacl.hn2</name>
        <value>hacl-node2.pepstack.com:50070</value>
        <description>the fully-qualified HTTP address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hacl-node1.pepstack.com:8485;hacl-node2.pepstack.com:8485;hacl-node3.pepstack.com:8485/hacl</value>
        <description>the URI which identifies the group of JNs where the NameNodes will write or read edits.</description>
    </property>

    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/hacl/data/dfs/jn</value>
        <description>the path where the JournalNode daemon will store its local state.</description>
    </property>

    <property>
        <name>dfs.client.failover.proxy.provider.hacl</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        <description>the Java class that HDFS clients use to contact the Active NameNode.</description>
    </property>

    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
        <description>a list of scripts or Java classes which will be used to fence the Active NameNode during a failover.</description>
    </property>

    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/var/lib/hadoop-hdfs/.ssh/id_dsa</value>
        <description>The sshfence option SSHes to the target node and uses fuser to kill the process
          listening on the service's TCP port. In order for this fencing option to work, it must be
          able to SSH to the target node without providing a passphrase. Thus, one must also configure the
          dfs.ha.fencing.ssh.private-key-files option, which is a comma-separated list of SSH private key files.
             logon namenode machine:
             cd /var/lib/hadoop-hdfs
             su hdfs
             ssh-keygen -t dsa
        </description>
    </property>

    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

    <property>
        <name>dfs.ha.automatic-failover.enabled.hacl</name>
        <value>true</value>
    </property>

    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/hacl/data/dfs/nn</value>
        <description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
    </property>

    <property>
        <name>dfs.blocksize</name>
        <value>268435456</value>
        <description>HDFS blocksize of 256MB for large file-systems.</description>
    </property>

    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>

    <property>
        <name>dfs.namenode.handler.count</name>
        <value>100</value>
        <description>More NameNode server threads to handle RPCs from large number of DataNodes.</description>
    </property>

    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/hacl/data/dfs/dn</value>
        <description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
    </property>
</configuration>
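The dfs.ha.fencing.ssh.private-key-files description above outlines the key setup in prose. As a minimal sketch (assuming the hdfs account and the hostnames used in this config; adjust to your environment), the key pair can be generated and distributed like this:

```shell
# Sketch only: generate the passphrase-less DSA key referenced by
# dfs.ha.fencing.ssh.private-key-files as the hdfs user ...
su - hdfs -s /bin/bash -c "ssh-keygen -t dsa -N '' -f /var/lib/hadoop-hdfs/.ssh/id_dsa"
# ... then push the public half to the peer NameNode so sshfence can log in
# without a passphrase (repeat in the other direction, from hn2 toward hn1).
su - hdfs -s /bin/bash -c "ssh-copy-id -i /var/lib/hadoop-hdfs/.ssh/id_dsa.pub hdfs@hacl-node2.pepstack.com"
```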

2) core-site.xml is configured as follows:

<?xml version="1.0" encoding="UTF-8"?>

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hacl</value>
    </property>

    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/hdfs/data/tmp</value>
        <description>chown -R hdfs:hdfs hadoop_tmp_dir</description>
    </property>

    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hacl-node1.pepstack.com:2181,hacl-node2.pepstack.com:2181,hacl-node3.pepstack.com:2181</value>
        <description>This lists the host-port pairs running the ZooKeeper service.</description>
    </property>
</configuration>
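Before formatting anything, it is worth confirming that the ZooKeeper quorum listed in ha.zookeeper.quorum is actually serving. One quick probe (a sketch, assuming nc/netcat is installed) is ZooKeeper's four-letter ruok command, which a healthy server answers with imok:

```shell
# Probe each ZooKeeper server from ha.zookeeper.quorum (assumes nc is available).
for h in hacl-node1 hacl-node2 hacl-node3; do
    echo -n "$h: "
    echo ruok | nc ${h}.pepstack.com 2181    # a healthy server replies "imok"
done
```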

1. The NameNode formatting procedure is as follows:

1) Start all JournalNodes; the JN on all 3 nodes must come up correctly. Stop all NameNodes:

# service hadoop-hdfs-journalnode start
# service hadoop-hdfs-namenode stop
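Since the JN must be running on all three nodes, the two service commands above have to be executed on every machine. A sketch of doing that from one host (assuming passwordless root SSH to the node names from the config; adjust to your access method):

```shell
# On each of the three nodes: start the JournalNode, stop any running NameNode.
for h in hacl-node1 hacl-node2 hacl-node3; do
    ssh root@${h}.pepstack.com \
        "service hadoop-hdfs-journalnode start; service hadoop-hdfs-namenode stop"
done
```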

2) Format the NameNode. hacl-pepstack-com is just the name I gave the cluster and can be ignored. su - hdfs -c "..." runs the format as the hdfs user.

Every directory specified in hdfs-site.xml and core-site.xml must be given the correct ownership first:

# chown -R hdfs:hdfs /hacl/data/dfs
Then format on either NameNode; for example, on hn1 run:

########## hn1
# su - hdfs -c "hdfs namenode -format -clusterid hacl-pepstack-com -force"
# service hadoop-hdfs-namenode start   ##### hn1

The freshly formatted hn1 must be started first; then on the other NameNode (hn2) run:

########## hn2
# su - hdfs -c "hdfs namenode -bootstrapStandby -force"
# service hadoop-hdfs-namenode start   ##### hn2

At this point, both NameNodes are formatted and running.
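Because dfs.ha.automatic-failover.enabled is set to true in hdfs-site.xml, the failover-controller state in ZooKeeper also has to be initialized once before automatic failover can work. A sketch of that final step and a quick health check (commands assume the same service/user layout as above):

```shell
# Initialize the ZKFC znode in ZooKeeper (one-time, on either NameNode) ...
su - hdfs -c "hdfs zkfc -formatZK -force"
# ... start the failover controller on both NameNode machines ...
service hadoop-hdfs-zkfc start
# ... then verify that one NameNode reports active and the other standby.
su - hdfs -c "hdfs haadmin -getServiceState hn1"
su - hdfs -c "hdfs haadmin -getServiceState hn2"
```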







