Hadoop Cluster Installation and Configuration

System environment
RedHat 5.5 x64
Required software
hadoop-0.21.0.tar.gz
Download: http://www.apache.org/dyn/closer.cgi/hadoop/common/
jdk-6u21-linux-x64.bin

Deployment layout:
namenode:192.168.10.20(hadoop1)
datanode:192.168.10.21(hadoop2)
         192.168.10.22(hadoop3)
         192.168.10.23(hadoop4)
I. Installation
1. On the namenode (perform the same steps on each datanode):
[root@hadoop1 ~]# vi /etc/hosts
192.168.10.20  hadoop1
192.168.10.21  hadoop2
192.168.10.22  hadoop3
192.168.10.23  hadoop4
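
To confirm that these names resolve on every node, a quick check (a minimal sketch; run it on each host) is:
[root@hadoop1 ~]# for h in hadoop1 hadoop2 hadoop3 hadoop4; do ping -c 1 $h >/dev/null && echo "$h ok" || echo "$h FAILED"; done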

2. Log in as root and create the hadoop user
[root@hadoop1 ~]# useradd hadoop
[root@hadoop1 ~]# passwd hadoop
Enter ****** as the password.

3. Switch to the hadoop user and set up SSH keys
[root@hadoop1 ~]# su - hadoop
[hadoop@hadoop1 ~]$ ssh-keygen -t rsa # creates the .ssh directory; press Enter at every prompt
[hadoop@hadoop1 ~]$ cd .ssh/
[hadoop@hadoop1 .ssh]$ ll
total 20
-rw------- 1 hadoop hadoop 1675 Sep 23 16:19 id_rsa
-rw-r--r-- 1 hadoop hadoop  403 Sep 23 16:19 id_rsa.pub
-rw-r--r-- 1 hadoop hadoop 3136 Sep 24 15:23 known_hosts
[hadoop@hadoop1 .ssh]$ scp id_rsa.pub hadoop@hadoop2:/home/hadoop/.ssh/
[hadoop@hadoop1 .ssh]$ ssh hadoop2
[hadoop@hadoop2 ~]$ cd .ssh/
[hadoop@hadoop2 .ssh]$ cat id_rsa.pub >>authorized_keys
[hadoop@hadoop2 .ssh]$ chmod 644 authorized_keys
Repeat the same steps on the other datanodes (hadoop3, hadoop4). If login still prompts for a password, make sure ~/.ssh is mode 700 on the remote host.
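
Once the key is in place on every datanode, passwordless login can be verified from the namenode; a minimal sketch, assuming the hosts above:
[hadoop@hadoop1 ~]$ for h in hadoop2 hadoop3 hadoop4; do ssh $h hostname; done
Each hostname should print without a password prompt.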

4. Install jdk-6u21-linux-x64.bin
[root@hadoop1 src]# ./jdk-6u21-linux-x64.bin
[root@hadoop1 src]# mv jdk1.6.0_21 /usr/local/
[root@hadoop1 local]# ln -s jdk1.6.0_21 java
5. Install hadoop-0.21.0.tar.gz
[root@hadoop1 src]# tar -zxvf hadoop-0.21.0.tar.gz
[root@hadoop1 src]# mv hadoop-0.21.0 /usr/local/
[root@hadoop1 local]# ln -s hadoop-0.21.0 hadoop
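Since the archive was unpacked as root but the daemons will run as the hadoop user, hand the tree over to that user (a suggested step, not shown in the original commands):
[root@hadoop1 local]# chown -R hadoop:hadoop /usr/local/hadoop-0.21.0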
6. Set environment variables
[root@hadoop1 src]# vi /etc/profile
export JAVA_HOME=/usr/local/java
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
[root@hadoop1 src]# source /etc/profile
[root@hadoop1 src]# java -version
java version "1.6.0_21"
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)
[root@hadoop1 src]# hadoop version
Hadoop 0.21.0
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.21 -r 985326
Compiled by tomwhite on Tue Aug 17 01:02:28 EDT 2010
From source with checksum a1aeb15b4854808d152989ba76f90fac

II. Configuration
Configuration files
1. hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, masters, slaves
Namenode configuration
[hadoop@hadoop1 ~]$ cd /usr/local/hadoop/conf/
[hadoop@hadoop1 conf]$ vi hadoop-env.sh (set the Java environment variable)
export JAVA_HOME=/usr/local/java
[hadoop@hadoop1 conf]$ vi core-site.xml (common I/O settings shared by HDFS and MapReduce)
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
   <property>
     <name>fs.default.name</name>
     <value>hdfs://hadoop1:9000</value>
   </property>
</configuration>
[hadoop@hadoop1 conf]$ vi hdfs-site.xml (settings for the HDFS daemons: the namenode, secondary namenode, and datanodes)
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
       <name>dfs.replication</name>
       <value>3</value>
    </property>

    <property>
       <name>dfs.name.dir</name>
       <value>/usr/local/hadoop/namenode/</value>
    </property>

    <property>
       <name>hadoop.tmp.dir</name>
       <value>/usr/local/hadoop/tmp/</value>
    </property>

</configuration>
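The dfs.name.dir and hadoop.tmp.dir directories referenced above must exist and be writable by the hadoop user; a minimal sketch:
[root@hadoop1 ~]# mkdir -p /usr/local/hadoop/namenode /usr/local/hadoop/tmp
[root@hadoop1 ~]# chown -R hadoop:hadoop /usr/local/hadoop/namenode /usr/local/hadoop/tmp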
[hadoop@hadoop1 conf]$ vi mapred-site.xml (JobTracker address and per-TaskTracker task slots)
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>hadoop1:9001</value>
    </property>

    <property>
       <name>mapred.tasktracker.map.tasks.maximum</name>
       <value>4</value>
    </property>

    <property>
       <name>mapred.tasktracker.reduce.tasks.maximum</name>
       <value>4</value>
    </property>

</configuration>
Datanode configuration (only hdfs-site.xml differs; see the distribution sketch after the file below)
[hadoop@hadoop2 conf]$ vi hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
       <name>dfs.replication</name>
       <value>3</value>
    </property>

    <property>
       <name>dfs.data.dir</name>
       <value>/home/hadoop/data</value>
    </property>

    <property>
       <name>hadoop.tmp.dir</name>
       <value>/usr/local/hadoop/tmp/</value>
    </property>

</configuration>
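
Rather than editing every datanode by hand, the namenode's conf directory can be pushed out and only hdfs-site.xml adjusted locally; a sketch, assuming the passwordless SSH from step 3 and that dfs.data.dir still needs to be created on each node:
[hadoop@hadoop1 ~]$ for h in hadoop2 hadoop3 hadoop4; do scp /usr/local/hadoop/conf/* $h:/usr/local/hadoop/conf/; ssh $h mkdir -p /home/hadoop/data; done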

[hadoop@hadoop1 conf]$ vi masters (despite the name, this file lists the secondary namenode host)
hadoop1
[hadoop@hadoop1 conf]$ vi slaves
hadoop2
hadoop3
hadoop4
Format HDFS before the first start, then start and stop the cluster:
[hadoop@hadoop1 ~]$ hadoop namenode -format
[hadoop@hadoop1 ~]$ start-all.sh
[hadoop@hadoop1 ~]$ stop-all.sh
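
To verify that the cluster is up, check the Java daemons and the HDFS report; for example:
[hadoop@hadoop1 ~]$ jps                      # expect NameNode, SecondaryNameNode, JobTracker here; DataNode and TaskTracker on the datanodes
[hadoop@hadoop1 ~]$ hadoop dfsadmin -report  # live datanode count and capacity
By default the namenode web UI listens on http://hadoop1:50070/ and the JobTracker on http://hadoop1:50030/.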
