How to Set Up a Hadoop Cluster

Preparation

1. Check whether the machine has the ssh service:

[linuxidc @ www.codesky.net Desktop]$ ssh -V
OpenSSH_5.3p1, OpenSSL 1.0.0-fips 29 Mar 2010

It ships with my system, so there is nothing to install.
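Having sshd installed is only half the story: Hadoop's start-all.sh/stop-all.sh log into every slave over ssh, so the Hadoop user normally needs password-less key login to the slaves. That step is not shown in the listings below; a minimal sketch (the /tmp key path is purely for demonstration):

```shell
# Sketch: create a passphrase-less RSA key pair for the hadoop user.
# The /tmp path below is only for demonstration -- on a real cluster
# use the OpenSSH default ~/.ssh/id_rsa instead.
keyfile=/tmp/demo_hadoop_key
rm -f "$keyfile" "$keyfile.pub"
ssh-keygen -t rsa -N "" -f "$keyfile" -q
ls "$keyfile" "$keyfile.pub"
# Then install the public key on every slave, e.g.:
#   ssh-copy-id hadoop2
```

ssh-copy-id appends the public key to the remote authorized_keys; after that, `ssh hadoop2` should log in without a password prompt.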

2. Check whether the machine has a JDK:

[linuxidc @ www.codesky.net Desktop]$ java -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.1) (rhel-1.45.1.11.1.el6-i386)
OpenJDK Server VM (build 20.0-b12, mixed mode)
[linuxidc @ www.codesky.net Desktop]$ javac -version
javac 1.6.0_24

If it is the OpenJDK bundled with the system, it is best to reinstall a proper JDK first: http://www.codesky.net/Linux/2012-08/67185.htm

Getting Started

1. Download and install Hadoop. I downloaded hadoop-0.20.2.tar.gz.

Extract the archive: [root@ www.codesky.net Downloads]# tar -zxvf hadoop-0.20.2.tar.gz

Move it into place: [root@ www.codesky.net Downloads]# mv hadoop-0.20.2 /usr/local/

Create a symlink (run this inside /usr/local, where the directory now lives): [root@ www.codesky.net local]# ln -s hadoop-0.20.2 hadoop

2. Set the environment variables

[root@ www.codesky.net local]# vi /etc/profile

Append the following at the end of the file, not at the top:

export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin

[root@ www.codesky.net local]# . /etc/profile
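A quick sanity check that the variables took effect in the current shell (shown with HADOOP_HOME, the conventional all-caps spelling of the variable, and the paths used in this guide):

```shell
# Re-create the two exports from /etc/profile and verify them
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
echo "$HADOOP_HOME"
# PATH should now contain the hadoop bin directory
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "PATH ok" ;;
  *)                      echo "PATH missing $HADOOP_HOME/bin" ;;
esac
```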

[root@ www.codesky.net local]# vi /usr/local/hadoop/conf/hadoop-env.sh   (set JAVA_HOME here)

[root@ www.codesky.net Desktop]# hadoop version
Hadoop 0.20.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707
Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010

The Main Event

1. NameNode configuration

[root@hadoop1 ~]# vi /etc/hosts
192.168.127.145  hadoop1
192.168.127.146  hadoop2
192.168.127.147  hadoop3
192.168.127.148  hadoop4

[root@ www.codesky.net conf]# vi core-site.xml

<configuration>
   <property>
     <name>fs.default.name</name>
     <value>hdfs://hadoop1:9000</value>
   </property>
</configuration>

[root@ www.codesky.net conf]# vi hdfs-site.xml 

<configuration>
    <property>
       <name>dfs.replication</name>
       <value>3</value>
    </property>

    <property>
       <name>dfs.name.dir</name>
       <value>/usr/local/hadoop/namenode/</value>
    </property>

    <property>
       <name>hadoop.tmp.dir</name>
       <value>/usr/local/hadoop/tmp/</value>
    </property>
</configuration>

[root@ www.codesky.net conf]# vi mapred-site.xml

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>hadoop1:9001</value>
    </property>

    <property>
       <name>mapred.tasktracker.map.tasks.maximum</name>
       <value>4</value>
    </property>

    <property>
       <name>mapred.tasktracker.reduce.tasks.maximum</name>
       <value>4</value>
    </property>
</configuration>

2. DataNode configuration (only hdfs-site.xml needs changes; mapred-site.xml and core-site.xml are the same as on the NameNode)

[hadoop@hadoop2 ~]$ vi hdfs-site.xml

<configuration>
    <property>
       <name>dfs.replication</name>
       <value>3</value>
    </property>

    <property>
       <name>dfs.data.dir</name>
       <value>/home/hadoop/data</value>
    </property>

    <property>
       <name>hadoop.tmp.dir</name>
       <value>/usr/local/hadoop/tmp/</value>
    </property>
</configuration>
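Hand-editing the same files on every DataNode is error-prone; one option is to generate the file once and copy it out to the slaves. A sketch that writes the DataNode hdfs-site.xml shown above (written to /tmp here purely for illustration; the real target is $HADOOP_HOME/conf):

```shell
# Generate the DataNode hdfs-site.xml from this guide's values.
# /tmp is used as a demo target; copy to /usr/local/hadoop/conf in practice.
cat > /tmp/hdfs-site.xml <<'EOF'
<configuration>
    <property>
       <name>dfs.replication</name>
       <value>3</value>
    </property>
    <property>
       <name>dfs.data.dir</name>
       <value>/home/hadoop/data</value>
    </property>
    <property>
       <name>hadoop.tmp.dir</name>
       <value>/usr/local/hadoop/tmp/</value>
    </property>
</configuration>
EOF
# Sanity check: the file should declare exactly three properties
grep -c '<property>' /tmp/hdfs-site.xml
```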

[hadoop@hadoop1 conf]$ vi masters
hadoop1
[hadoop@hadoop1 conf]$ vi slaves
hadoop2
hadoop3
hadoop4
[hadoop@hadoop1 ~]$ start-all.sh
[hadoop@hadoop1 ~]$ stop-all.sh
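Two details the listing above glosses over: the conf directory must reach every slave, and HDFS has to be formatted once on the NameNode before the very first start-all.sh. A dry-run sketch (each command is echoed instead of executed, so it is safe to paste; drop the echos on the real cluster):

```shell
# Push the NameNode's conf directory to each slave (dry run via echo)
for host in hadoop2 hadoop3 hadoop4; do
  echo scp -r /usr/local/hadoop/conf "$host:/usr/local/hadoop/"
done
# Format HDFS once on the NameNode before the first start (dry run via echo)
echo hadoop namenode -format
echo start-all.sh
# After a real start, 'jps' should show NameNode/JobTracker on the master
# and DataNode/TaskTracker on each slave.
```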
