(Continuously updated)
Three RHEL 6.4 machines (2 NameNodes + 2 ZKFCs, and 3 JournalNodes + zookeeper-server) make up a minimal HA cluster. First, hdfs-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Quorum Journal Manager HA:
     http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html -->
<configuration>
  <!-- Quorum Journal Manager HA -->
  <property>
    <name>dfs.nameservices</name>
    <value>hacl</value>
    <description>The logical name for this nameservice.</description>
  </property>
  <property>
    <name>dfs.ha.namenodes.hacl</name>
    <value>hn1,hn2</value>
    <description>A comma-separated list of unique identifiers for each NameNode in the nameservice.</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hacl.hn1</name>
    <value>hacl-node1.pepstack.com:8020</value>
    <description>The fully-qualified RPC address for each NameNode to listen on.</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hacl.hn2</name>
    <value>hacl-node2.pepstack.com:8020</value>
    <description>The fully-qualified RPC address for each NameNode to listen on.</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.hacl.hn1</name>
    <value>hacl-node1.pepstack.com:50070</value>
    <description>The fully-qualified HTTP address for each NameNode to listen on.</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.hacl.hn2</name>
    <value>hacl-node2.pepstack.com:50070</value>
    <description>The fully-qualified HTTP address for each NameNode to listen on.</description>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hacl-node1.pepstack.com:8485;hacl-node2.pepstack.com:8485;hacl-node3.pepstack.com:8485/hacl</value>
    <description>The URI which identifies the group of JournalNodes where the NameNodes will write/read edits.</description>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/hacl/data/dfs/jn</value>
    <description>The path where the JournalNode daemon will store its local state.</description>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.hacl</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    <description>The Java class that HDFS clients use to contact the Active NameNode.</description>
  </property>
  <!-- Automatic failover adds two new components to an HDFS deployment:
       a ZooKeeper quorum, and the ZKFailoverController process (abbreviated as ZKFC).
       Configuring automatic failover: -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
    <description>A list of scripts or Java classes which will be used to fence the Active NameNode during a failover.</description>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/var/lib/hadoop-hdfs/.ssh/id_dsa</value>
    <description>The sshfence option SSHes to the target node and uses fuser to kill the process
      listening on the service's TCP port. For this fencing option to work, it must be able to
      SSH to the target node without providing a passphrase, so dfs.ha.fencing.ssh.private-key-files
      must be set to a comma-separated list of SSH private key files. To generate the key, log on
      to the NameNode machine and run:
        cd /var/lib/hadoop-hdfs
        su hdfs
        ssh-keygen -t dsa
    </description>
  </property>
  <!-- Optionally, one may configure a non-standard username or port to perform the SSH,
       as well as a timeout (in milliseconds) for the SSH, after which this fencing method
       will be considered to have failed. It may be configured like so:
       <property>
         <name>dfs.ha.fencing.methods</name>
         <value>sshfence([[username][:port]])</value>
       </property>
       <property>
         <name>dfs.ha.fencing.ssh.connect-timeout</name>
         <value>30000</value>
       </property>
  -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled.hacl</name>
    <value>true</value>
  </property>
  <!-- Configurations for NameNode: -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/hacl/data/dfs/nn</value>
    <description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>268435456</value>
    <description>HDFS block size of 256 MB for large filesystems.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
    <description>Default block replication factor.</description>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>100</value>
    <description>More NameNode server threads to handle RPCs from a large number of DataNodes.</description>
  </property>
  <!-- Configurations for DataNode: -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/hacl/data/dfs/dn</value>
    <description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
  </property>
</configuration>
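The dfs.blocksize value above is given in bytes, not in a human-readable unit; a one-line check that 268435456 really is the 256 MB the description claims:

```shell
# dfs.blocksize is specified in bytes; 256 MB = 256 * 1024 * 1024 bytes.
blocksize=$((256 * 1024 * 1024))
echo "$blocksize"   # 268435456, the value used in hdfs-site.xml
```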
Next, core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hacl</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hdfs/data/tmp</value>
    <description>Remember to: chown -R hdfs:hdfs hadoop_tmp_dir</description>
  </property>
  <!-- Configuring automatic failover -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hacl-node1.pepstack.com:2181,hacl-node2.pepstack.com:2181,hacl-node3.pepstack.com:2181</value>
    <description>This lists the host-port pairs running the ZooKeeper service.</description>
  </property>
  <!-- Securing access to ZooKeeper -->
</configuration>
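Before going further it is worth confirming that every ZooKeeper server in ha.zookeeper.quorum is reachable. A minimal sketch (the hostnames are the ones from this cluster; nc is assumed to be installed; a healthy ZooKeeper server answers "imok" to the "ruok" four-letter command):

```shell
# Split the comma-separated quorum string and show a probe command for each server.
quorum="hacl-node1.pepstack.com:2181,hacl-node2.pepstack.com:2181,hacl-node3.pepstack.com:2181"
for hp in $(echo "$quorum" | tr ',' ' '); do
    host=${hp%:*}   # part before the colon
    port=${hp#*:}   # part after the colon
    echo "probe $host:$port with: echo ruok | nc $host $port"
done
```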
1) Start all JournalNodes; the JNs on all three nodes must come up correctly. Stop all NameNodes:
# service hadoop-hdfs-journalnode start
# service hadoop-hdfs-namenode stop
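The NameNodes can only format once every JournalNode listed in dfs.namenode.shared.edits.dir is up. A small sketch that extracts each JN endpoint from the qjournal:// URI so you can probe them one by one (e.g. with nc -z host port):

```shell
# Pull the semicolon-separated host:port list out of the qjournal URI from hdfs-site.xml.
edits="qjournal://hacl-node1.pepstack.com:8485;hacl-node2.pepstack.com:8485;hacl-node3.pepstack.com:8485/hacl"
hosts=${edits#qjournal://}   # strip the scheme
hosts=${hosts%/*}            # strip the journal id suffix (/hacl)
for hp in $(echo "$hosts" | tr ';' ' '); do
    echo "$hp"               # one JournalNode RPC endpoint per line
done
```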
2) Format the NameNode. hacl-pepstack-com is just the name I gave the cluster; you can pick any name. su - hdfs -c "..." means the format runs as the hdfs user.
All directories specified in hdfs-site.xml and core-site.xml must be given the correct ownership:
# chown -R hdfs:hdfs /hacl/data/dfs
Then run the format on either NameNode, for example on hn1:
########## hn1 ##########
# su - hdfs -c "hdfs namenode -format -clusterid hacl-pepstack-com -force"
# service hadoop-hdfs-namenode start
The freshly formatted hn1 must be started first; then run on the other NameNode (hn2):
########## hn2 ##########
# su - hdfs -c "hdfs namenode -bootstrapStandby -force"
# service hadoop-hdfs-namenode start
At this point both NameNodes are formatted and running.
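The configuration above enables automatic failover (dfs.ha.automatic-failover.enabled), but the HA state in ZooKeeper still has to be initialized and a ZKFC started next to each NameNode before an Active can be elected. A sketch of those remaining steps, assuming the CDH5 package service names used elsewhere in this post:

```shell
# 1) ZooKeeper must already be running on all three nodes:
#      service zookeeper-server start
# 2) Initialize the HA state in ZooKeeper (creates the /hadoop-ha/hacl znode).
#    Run once, from either NameNode, as the hdfs user:
su - hdfs -c "hdfs zkfc -formatZK"
# 3) Start a ZKFC on each NameNode host (hn1 and hn2):
service hadoop-hdfs-zkfc start
# 4) Optionally check which NameNode became Active:
su - hdfs -c "hdfs haadmin -getServiceState hn1"
```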