Hadoop 2.2.0 cluster install guide

Installing a Hadoop 2.2.0 cluster with 3 nodes (one runs the namenode/resourcemanager and secondary namenode, while the other two nodes run a datanode/nodemanager each)

1. IP assignments

    192.168.122.1        namenode

    192.168.122.2        datanode

    192.168.122.3        datanode

2. download the latest stable hadoop tarball (2.2.0) and untar it to /home/xxx/hadoop/hadoop-2.2.0
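    for example (a minimal sketch; the URL points at the Apache archive, and /home/xxx is the placeholder path from above, so adjust both to your environment):

    $ wget http://archive.apache.org/dist/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
    $ mkdir -p /home/xxx/hadoop
    $ tar -xzf hadoop-2.2.0.tar.gz -C /home/xxx/hadoop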

 

3. prepare the runtime environments

   a. java

      install Oracle Java 1.7.0 and set JAVA_HOME
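      a minimal sketch of the environment setup, assuming the JDK is installed under /usr/lib/jvm/java-7-oracle (adjust the path to your actual install):

      $ export JAVA_HOME=/usr/lib/jvm/java-7-oracle
      $ export PATH=$JAVA_HOME/bin:$PATH
      $ java -version        # should report java version "1.7.0_xx"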

   b. ssh without passphrase

       b1. make sure the namenode has an ssh client and server, checking with the following commands

             #which ssh
             #which sshd
             #which ssh-keygen

       b2. generate ssh key pair

             #ssh-keygen -t rsa

            the above command will produce the key pair (id_rsa and id_rsa.pub) in the ~/.ssh dir

       b3. distribute the public key and validate logins

            #scp ~/.ssh/id_rsa.pub   [email protected]:~/authorized_keys

            #scp ~/.ssh/id_rsa.pub   [email protected]:~/authorized_keys

            ---

            log in to 192.168.122.2 and 192.168.122.3 and run the following commands

            #mkdir ~/.ssh

            #chmod 700 ~/.ssh

            #mv ~/authorized_keys  ~/.ssh/

            #chmod 600 ~/.ssh/authorized_keys
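            then validate the logins from the namenode; both should succeed without a password prompt:

            #ssh [email protected]
            #ssh [email protected]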

 If ssh still prompts you for a password on login, execute the following commands

$ chmod go-w $HOME $HOME/.ssh
$ chmod 600 $HOME/.ssh/authorized_keys
$ chown `whoami` $HOME/.ssh/authorized_keys

 

4. edit the core config files for the hadoop cluster (non-secure mode); minimal sketches of each file follow the hosts.txt note below

    core-site.xml

    hdfs-site.xml    (dfs.namenode.hosts is important)

    yarn-site.xml

    mapred-site.xml

   -----

   dfs.namenode.hosts  -> hosts.txt

   the content of hosts.txt looks like the following (the IPs of every datanode in the cluster):

   192.168.122.2

   192.168.122.3
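   minimal sketches of the four files (the port, the local storage dirs, and the hosts.txt path are assumptions for this 3-node layout, so adjust them to your machines; note that 2.2.0 ships only mapred-site.xml.template, which must be copied to mapred-site.xml first):

   core-site.xml

      <configuration>
        <!-- namenode RPC address; port 9000 is a conventional choice -->
        <property>
          <name>fs.defaultFS</name>
          <value>hdfs://192.168.122.1:9000</value>
        </property>
      </configuration>

   hdfs-site.xml

      <configuration>
        <!-- the local dirs are examples; point them at real disks -->
        <property>
          <name>dfs.namenode.name.dir</name>
          <value>/home/xxx/hadoop/dfs/name</value>
        </property>
        <property>
          <name>dfs.datanode.data.dir</name>
          <value>/home/xxx/hadoop/dfs/data</value>
        </property>
        <!-- only two datanodes, so keep replication at 2 -->
        <property>
          <name>dfs.replication</name>
          <value>2</value>
        </property>
        <!-- the datanode include file described above -->
        <property>
          <name>dfs.namenode.hosts</name>
          <value>/home/xxx/hadoop/hadoop-2.2.0/etc/hadoop/hosts.txt</value>
        </property>
      </configuration>

   yarn-site.xml

      <configuration>
        <property>
          <name>yarn.resourcemanager.hostname</name>
          <value>192.168.122.1</value>
        </property>
        <!-- required for MapReduce on YARN -->
        <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
        </property>
      </configuration>

   mapred-site.xml

      <configuration>
        <property>
          <name>mapreduce.framework.name</name>
          <value>yarn</value>
        </property>
      </configuration>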

 

5. edit /etc/hosts on 192.168.122.1 (no DNS in this setup)

        192.168.122.1   host.dataminer

        192.168.122.2   f1.zhj

        192.168.122.3   f2.zhj

    meanwhile edit the /etc/hosts on 192.168.122.2 and 192.168.122.3

         127.0.0.1  f1.zhj    (on 192.168.122.2; use f2.zhj on 192.168.122.3)

 

6. edit ~/.bashrc to set HADOOP_HOME and HADOOP_CONF_DIR, and append the bin and sbin dirs to PATH
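  a sketch of the lines to append (HADOOP_HOME matches the untar location from step 2; adjust if yours differs):

  export HADOOP_HOME=/home/xxx/hadoop/hadoop-2.2.0
  export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
  export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH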

  run the following command to make it take effect: #source ~/.bashrc

 

NOTE: the sample hadoop cluster runs on my notebook: Ubuntu 13.10 with KVM hosting the other two datanodes as Fedora 20 guests.

 

 

References:

http://allthingshadoop.com/2010/04/20/hadoop-cluster-setup-ssh-key-authentication/

 

 
