Reposted from: http://blog.csdn.net/jiushuai/article/details/16817311

Set Up the Newest Hadoop 2.x (2.2.0) on Ubuntu

This tutorial walks you through setting up a Hadoop 2.2.0 environment on Ubuntu.

Prerequisites

    $ sudo apt-get install openjdk-7-jdk
    $ java -version
    java version "1.7.0_25"
    OpenJDK Runtime Environment (IcedTea 2.3.12) (7u25-2.3.12-4ubuntu3)
    OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
    $ cd /usr/lib/jvm
    $ sudo ln -s java-7-openjdk-amd64 jdk
    $ sudo apt-get install openssh-server

Add Hadoop Group and User

    $ sudo addgroup hadoop
    $ sudo adduser --ingroup hadoop hduser
    $ sudo adduser hduser sudo

After the user is created, log back in to Ubuntu as hduser.

Set Up SSH Keys

    $ ssh-keygen -t rsa -P ''
    ...
    Your identification has been saved in /home/hduser/.ssh/id_rsa.
    Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
    ...
    $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    $ ssh localhost
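
If `ssh localhost` still asks for a password after the key has been appended, overly permissive file modes are one common cause, because sshd checks them when its StrictModes option is on (the default). The sketch below tightens those modes. The function name `secure_ssh_dir` is hypothetical and only for illustration.

```shell
# Hypothetical helper (not part of OpenSSH): tighten permissions on an
# SSH directory so that sshd, with StrictModes enabled, accepts the keys.
secure_ssh_dir() {
    ssh_dir=$1
    # Only the owner may read, write, or enter the directory.
    chmod 700 "$ssh_dir"
    # Only the owner may read or write the list of authorized keys.
    if [ -f "$ssh_dir/authorized_keys" ]; then
        chmod 600 "$ssh_dir/authorized_keys"
    fi
}
```

For the setup in this tutorial it would be called as `secure_ssh_dir ~/.ssh` while logged in as hduser.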

Download Hadoop 2.2.0

    $ cd ~
    $ wget http://www.trieuvan.com/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
    $ sudo tar vxzf hadoop-2.2.0.tar.gz -C /usr/local
    $ cd /usr/local
    $ sudo mv hadoop-2.2.0 hadoop
    $ sudo chown -R hduser:hadoop hadoop

Set Up Hadoop Environment Variables

    $ cd ~
    $ vi .bashrc
    # Paste the following at the end of the file:
    # Hadoop variables
    export JAVA_HOME=/usr/lib/jvm/jdk/
    export HADOOP_INSTALL=/usr/local/hadoop
    export PATH=$PATH:$HADOOP_INSTALL/bin
    export PATH=$PATH:$HADOOP_INSTALL/sbin
    export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
    export HADOOP_COMMON_HOME=$HADOOP_INSTALL
    export HADOOP_HDFS_HOME=$HADOOP_INSTALL
    export YARN_HOME=$HADOOP_INSTALL
    # End of paste
    $ cd /usr/local/hadoop/etc/hadoop
    $ vi hadoop-env.sh
    # Modify JAVA_HOME:
    export JAVA_HOME=/usr/lib/jvm/jdk/
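
Both .bashrc and hadoop-env.sh depend on JAVA_HOME pointing at a real JDK, so a quick check can catch a broken symlink before Hadoop reports it. This is a minimal sketch. The function name `check_java_home` is hypothetical and not part of Hadoop.

```shell
# Hypothetical helper (not part of Hadoop): succeeds only if the given
# directory contains an executable bin/java.
check_java_home() {
    dir=${1%/}    # drop one trailing slash, if present
    if [ -x "$dir/bin/java" ]; then
        echo "OK: $dir/bin/java found"
    else
        echo "ERROR: no executable bin/java under $dir" >&2
        return 1
    fi
}
```

After logging back in, running `check_java_home "$JAVA_HOME"` should report the java binary under /usr/lib/jvm/jdk if the symlink from the prerequisites step is in place.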

Log back in to Ubuntu as hduser and check the Hadoop version:

    $ hadoop version
    Hadoop 2.2.0
    Subversion https://svn.apache.org/repos/asf/hadoop/common -r 1529768
    Compiled by hortonmu on 2013-10-07T06:28Z
    Compiled with protoc 2.5.0
    From source with checksum 79e53ce7994d1628b240f09af91e1af4
    This command was run using /usr/local/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar

At this point, Hadoop is installed.

Configure Hadoop

    $ cd /usr/local/hadoop/etc/hadoop
    $ vi core-site.xml
    # Paste the following between the <configuration> tags:
    <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:9000</value>
    </property>
    $ vi yarn-site.xml
    # Paste the following between the <configuration> tags:
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    $ mv mapred-site.xml.template mapred-site.xml
    $ vi mapred-site.xml
    # Paste the following between the <configuration> tags:
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
    $ cd ~
    $ mkdir -p mydata/hdfs/namenode
    $ mkdir -p mydata/hdfs/datanode
    $ cd /usr/local/hadoop/etc/hadoop
    $ vi hdfs-site.xml
    # Paste the following between the <configuration> tags:
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:/home/hduser/mydata/hdfs/namenode</value>
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:/home/hduser/mydata/hdfs/datanode</value>
    </property>
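
Each Hadoop site file wraps its settings in a single `<configuration>` element, with one `<property>` element per name/value pair. As an alternative to editing by hand, the sketch below generates a complete one-property file. The helper `write_site_xml` is hypothetical, it overwrites the target file, and it does not escape XML special characters in the value. The example writes to a scratch directory rather than the live configuration directory.

```shell
# Hypothetical helper (not part of Hadoop): writes a one-property site file.
# Usage: write_site_xml <file> <property-name> <property-value>
write_site_xml() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# Example: generate core-site.xml in a scratch directory and show it.
conf_dir=$(mktemp -d)
write_site_xml "$conf_dir/core-site.xml" fs.default.name hdfs://localhost:9000
cat "$conf_dir/core-site.xml"
```

Files that need several properties, such as yarn-site.xml and hdfs-site.xml above, would need a variant of this helper or manual editing.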

Format Namenode

    hduser@ubuntu40:~$ hdfs namenode -format

Start Hadoop Service

    $ start-dfs.sh
    ....
    $ start-yarn.sh
    ....
    hduser@ubuntu40:~$ jps
    # If everything is successful, you should see the following services running:
    2583 DataNode
    2970 ResourceManager
    3461 Jps
    3177 NodeManager
    2361 NameNode
    2840 SecondaryNameNode
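
Checking the jps listing by eye is easy to get wrong, since NameNode and SecondaryNameNode look alike. The following sketch reads jps output and prints any of the five expected daemons that are absent. The function name `missing_daemons` is hypothetical and not part of Hadoop.

```shell
# Hypothetical helper (not part of Hadoop): reads `jps` output on stdin and
# prints the name of every expected daemon that is NOT listed.
# Prints nothing when all five daemons are present.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <name>"; compare the whole second field so that
        # SecondaryNameNode is not mistaken for NameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

It would typically be used as `jps | missing_daemons`, where empty output means all five services are running.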

Run Hadoop Example

    hduser@ubuntu:~$ cd /usr/local/hadoop
    hduser@ubuntu:/usr/local/hadoop$ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5
    Number of Maps  = 2
    Samples per Map = 5
    13/10/21 18:41:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Wrote input for Map #0
    Wrote input for Map #1
    Starting Job
    13/10/21 18:41:04 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
    13/10/21 18:41:04 INFO input.FileInputFormat: Total input paths to process : 2
    13/10/21 18:41:04 INFO mapreduce.JobSubmitter: number of splits:2
    13/10/21 18:41:04 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
    ...
