Setting up a Hadoop 2.0.4 test environment

1: Planning
Build a Hadoop 2.0 environment on Oracle Linux 6.4
192.168.100.171 linux1 (namenode)
192.168.100.172 linux2 (reserved as a future namenode)
192.168.100.173 linux3 (datanode)
192.168.100.174 linux4 (datanode)
192.168.100.175 linux5 (datanode)

2: Create the VMware Workstation template VM
a: Install an Oracle Linux 6.4 virtual machine named linux1, enable the ssh service, and disable the iptables service
[root@linux1 ~]# chkconfig sshd on
[root@linux1 ~]# chkconfig iptables off
[root@linux1 ~]# chkconfig ip6tables off
[root@linux1 ~]# chkconfig postfix off

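b: Shut down linux1 and add a new disk, stored in the shared directory, to serve as a shared disk (on the SCSI1:0 interface),
then edit linux1.vmx, adding or changing these parameters: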
disk.locking = "FALSE"
diskLib.dataCacheMaxSize = "0"
disk.EnableUUID = "TRUE"
scsi1.present = "TRUE"
scsi1.sharedBus = "Virtual"
scsi1.virtualDev = "lsilogic"

c: Restart linux1, download Java to the shared disk, install it, and append the following to the environment file /etc/profile:
JAVA_HOME=/usr/java/jdk1.7.0_21; export JAVA_HOME
JRE_HOME=/usr/java/jdk1.7.0_21/jre; export JRE_HOME
CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar; export CLASSPATH
PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH; export PATH
************************************************************************
For convenience, also add the Hadoop environment variables to /etc/profile or to the hadoop user's ~/.bashrc
export HADOOP_PREFIX=/app/hadoop204
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
************************************************************************
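
After logging in again (or sourcing /etc/profile), the variables can be sanity-checked. A minimal sketch: check_env_dir is a hypothetical helper, not part of Hadoop or the JDK, and it only tests that a variable is set and points at an existing directory.

```shell
# check_env_dir NAME: succeed if the variable called NAME is set and
# names an existing directory. (Hypothetical helper for illustration.)
check_env_dir() {
    name="$1"
    # Indirect lookup of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$name=$value is not a directory"
        return 1
    fi
    echo "$name=$value OK"
}
```

Typical use would be: for v in JAVA_HOME HADOOP_PREFIX; do check_env_dir "$v"; done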

d: Edit /etc/hosts and add:
192.168.100.171 linux1
192.168.100.172 linux2
192.168.100.173 linux3
192.168.100.174 linux4
192.168.100.175 linux5
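
Because the addresses and host names follow a simple pattern, the five lines can also be generated. A small sketch; gen_hosts is a made-up name, and the numbering scheme is taken from the plan in step 1.

```shell
# Print the /etc/hosts lines for linux1..linux5 (192.168.100.171-175).
gen_hosts() {
    i=1
    while [ "$i" -le 5 ]; do
        echo "192.168.100.$((170 + i)) linux$i"
        i=$((i + 1))
    done
}

gen_hosts
```

As root, the output could be appended with: gen_hosts >> /etc/hosts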

e: Edit /etc/sysconfig/selinux and set:
SELINUX=disabled
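
The setting in this file only takes effect after a reboot; until then, getenforce shows the running mode and setenforce 0 (as root) switches to permissive. To confirm what the file itself says, something like the following can be used. A sketch only; selinux_configured_mode is a hypothetical helper name.

```shell
# selinux_configured_mode FILE: print the value of the active SELINUX= line.
# Comment lines and the SELINUXTYPE= line are not matched.
selinux_configured_mode() {
    sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "$1" | tail -n 1
}
```

For example: selinux_configured_mode /etc/sysconfig/selinux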

f: Add the hadoop user and install the Hadoop files:
[root@linux1 ~]# useradd hadoop -g root
[root@linux1 ~]# passwd hadoop
[root@linux1 ~]# cd /
[root@linux1 /]# mkdir /app
[root@linux1 /]# cd /app
[root@linux1 app]# tar -zxf /mnt/mysoft/LinuxSoft/hadoop-2.0.4-alpha.tar.gz
[root@linux1 app]# mv hadoop-2.0.4-alpha hadoop204
[root@linux1 app]# chown hadoop:root -R /app/hadoop204
[root@linux1 app]# su - hadoop
[hadoop@linux1 ~]$ cd /app/hadoop204
[hadoop@linux1 hadoop204]$ mkdir tmp

g: Edit the Hadoop configuration files:

[hadoop@linux1 hadoop204]$ cd etc/hadoop
[hadoop@linux1 hadoop]$ vi core-site.xml
******************************************************************************
<configuration>
<property>
<name>io.native.lib.available</name>
<value>true</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://linux1:9000</value>
<description>The name of the default file system. Either the literal string "local" or a host:port for NDFS. </description>
<final>true</final>
</property>
</configuration>
******************************************************************************

[hadoop@linux1 hadoop]$ vi hdfs-site.xml
******************************************************************************
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/app/hadoop204/dfs/name</value>
<description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy. </description>
<final>true</final>
</property>

<property>
<name>dfs.datanode.data.dir</name>
<value>file:/app/hadoop204/dfs/data</value>
<description>Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
<final>true</final>
</property>

<property>
<name>dfs.replication</name>
<value>1</value>
</property>

<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
******************************************************************************

[hadoop@linux1 hadoop]$ vi mapred-site.xml
******************************************************************************
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

<property>
<name>mapreduce.job.tracker</name>
<value>hdfs://linux1:9001</value>
<final>true</final>
</property>

<property>
<name>mapred.system.dir</name>
<value>file:/app/hadoop204/mapred/system</value>
<final>true</final>
</property>

<property>
<name>mapred.local.dir</name>
<value>file:/app/hadoop204/mapred/local</value>
<final>true</final>
</property>
</configuration>
******************************************************************************

[hadoop@linux1 hadoop]$ vi yarn-site.xml
******************************************************************************
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>linux1:8080</value>
</property>

<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>linux1:8081</value>
</property>

<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>linux1:8082</value>
</property>

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>

<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
******************************************************************************
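
With four site files to edit, it helps to read a value back and compare it with what was intended. A rough sketch under stated assumptions: get_prop is a hypothetical helper, and it relies on each tag being on its own line as in the listings above, so it is not a general XML parser.

```shell
# get_prop FILE NAME: print the <value> that follows <name>NAME</name>.
# Assumes one tag per line; uses a literal (non-regex) match on the name.
get_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}
```

For example: get_prop /app/hadoop204/etc/hadoop/core-site.xml fs.default.name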

[hadoop@linux1 hadoop]$ vi hadoop-env.sh
******************************************************************************
export JAVA_HOME=/usr/java/jdk1.7.0_21
export HADOOP_PREFIX=/app/hadoop204
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
export HADOOP_CONF_HOME=${HADOOP_PREFIX}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
******************************************************************************

[hadoop@linux1 hadoop]$ vi yarn-env.sh
******************************************************************************
export JAVA_HOME=/usr/java/jdk1.7.0_21
export HADOOP_PREFIX=/app/hadoop204
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
export HADOOP_CONF_HOME=${HADOOP_PREFIX}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
******************************************************************************

h: Configure sshd for public-key authentication. In /etc/ssh/sshd_config, uncomment:
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys


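3: Configure ssh
a: Shut down the template VM and clone it as linux2, linux3, linux4 and linux5:
change the displayname in each clone's VMware Workstation configuration file;
update the host-specific settings in the following files on each clone: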
/etc/udev/rules.d/70-persistent-net.rules
/etc/sysconfig/network
/etc/sysconfig/network-scripts/ifcfg-eth0
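
The simple KEY=value settings in these files (for example HOSTNAME in /etc/sysconfig/network and IPADDR in ifcfg-eth0) can be changed with sed instead of an editor. A sketch assuming GNU sed; set_kv is a hypothetical helper, and the MAC address and the udev rule still have to be handled by hand.

```shell
# set_kv FILE KEY VALUE: rewrite the line "KEY=..." in FILE to "KEY=VALUE".
# Lines for other keys are left untouched. Requires GNU sed for -i.
set_kv() {
    sed -i "s|^$2=.*|$2=$3|" "$1"
}
```

On linux2, for instance: set_kv /etc/sysconfig/network HOSTNAME linux2 and set_kv /etc/sysconfig/network-scripts/ifcfg-eth0 IPADDR 192.168.100.172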

b: Start linux1, linux2, linux3, linux4 and linux5 and make sure they can all ping each other.

c: Configure ssh so that linux1 can log in to the other nodes without a password
[root@linux1 tmp]# su - hadoop
[hadoop@linux1 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
17:37:98:fa:7e:5c:e4:8b:b4:7e:bb:59:28:8f:45:bd hadoop@linux1
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|           o     |
|          + o    |
|         . o ... |
|        S .  o. .|
|         o  ..o..|
|          .o.+oE.|
|         .  ==oo |
|          .oo.=o |
+-----------------+
[hadoop@linux1 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@linux1
[hadoop@linux1 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@linux2
[hadoop@linux1 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@linux3
[hadoop@linux1 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@linux4
[hadoop@linux1 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@linux5
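
The five ssh-copy-id calls differ only in the host name, so they can be written as a loop. A sketch: NODES, copy_keys and the DRY_RUN switch are names invented here, and by default the loop only prints the commands it would run.

```shell
# Print (DRY_RUN=1, the default here) or run ssh-copy-id for every node.
NODES="linux1 linux2 linux3 linux4 linux5"

copy_keys() {
    for node in $NODES; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@$node"
        else
            ssh-copy-id -i ~/.ssh/id_rsa.pub "hadoop@$node"
        fi
    done
}

copy_keys
```

Running it with DRY_RUN=0 set executes the real commands; each host still asks once for the hadoop user's password.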

Verify that password-free access works:
[hadoop@linux1 ~]$ ssh linux1 date
[hadoop@linux1 ~]$ ssh linux2 date
[hadoop@linux1 ~]$ ssh linux3 date
[hadoop@linux1 ~]$ ssh linux4 date
[hadoop@linux1 ~]$ ssh linux5 date
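
The same check can be scripted so that a node that still wants a password fails instead of prompting. A minimal sketch: check_nodes and the SSH_CMD override are hypothetical names; BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
# check_nodes NODE...: run "date" on each node without allowing a password
# prompt. SSH_CMD can be overridden for testing (defaults to ssh).
check_nodes() {
    failed=0
    for node in "$@"; do
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$node" date >/dev/null 2>&1; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
            failed=1
        fi
    done
    return "$failed"
}
```

For example: check_nodes linux1 linux2 linux3 linux4 linux5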


4: Initialize Hadoop (format the namenode)
[hadoop@linux1 hadoop204]$ /app/hadoop204/bin/hdfs namenode -format
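
A successful format leaves a current/VERSION file under the directory configured as dfs.namenode.name.dir, which gives a quick way to confirm the step worked. A sketch; namenode_formatted is a hypothetical helper name.

```shell
# namenode_formatted DIR: succeed if DIR looks like a formatted name directory,
# i.e. it contains current/VERSION.
namenode_formatted() {
    [ -f "$1/current/VERSION" ]
}
```

With the hdfs-site.xml above: namenode_formatted /app/hadoop204/dfs/name && echo formatted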

5: Configure the slaves file on linux1
[hadoop@linux1 hadoop204]$ vi etc/hadoop/slaves
192.168.100.173
192.168.100.174
192.168.100.175
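
The file is simply one worker address per line, so it can also be written without an editor. A small sketch; write_slaves is a made-up helper. Host names such as linux3 should work equally well here, since /etc/hosts resolves them.

```shell
# write_slaves FILE HOST...: write one worker address per line to FILE,
# replacing any previous content.
write_slaves() {
    file="$1"
    shift
    printf '%s\n' "$@" > "$file"
}
```

For example: write_slaves /app/hadoop204/etc/hadoop/slaves 192.168.100.173 192.168.100.174 192.168.100.175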

6: Start Hadoop
[hadoop@linux1 hadoop204]$ /app/hadoop204/sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
13/06/11 10:08:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [linux1]
linux1: starting namenode, logging to /app/hadoop204/logs/hadoop-hadoop-namenode-linux1.out
192.168.100.174: starting datanode, logging to /app/hadoop204/logs/hadoop-hadoop-datanode-linux4.out
192.168.100.175: starting datanode, logging to /app/hadoop204/logs/hadoop-hadoop-datanode-linux5.out
192.168.100.173: starting datanode, logging to /app/hadoop204/logs/hadoop-hadoop-datanode-linux3.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /app/hadoop204/logs/hadoop-hadoop-secondarynamenode-linux1.out
13/06/11 10:08:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /app/hadoop204/logs/yarn-hadoop-resourcemanager-linux1.out
192.168.100.174: starting nodemanager, logging to /app/hadoop204/logs/yarn-hadoop-nodemanager-linux4.out
192.168.100.175: starting nodemanager, logging to /app/hadoop204/logs/yarn-hadoop-nodemanager-linux5.out
192.168.100.173: starting nodemanager, logging to /app/hadoop204/logs/yarn-hadoop-nodemanager-linux3.out
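
As the first line of the output says, start-all.sh is deprecated; running sbin/start-dfs.sh followed by sbin/start-yarn.sh is the suggested replacement. Either way, jps shows which daemons actually came up. The sketch below compares jps output against the daemons expected on a node; missing_daemons is a hypothetical helper, and the expected names are taken from the log above.

```shell
# missing_daemons NAME...: read `jps` output ("PID ClassName" per line) on
# stdin and print each expected daemon name that does not appear in it.
missing_daemons() {
    jps_out=$(cat)
    for name in "$@"; do
        echo "$jps_out" | awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }' \
            || echo "$name"
    done
}
```

On linux1: jps | missing_daemons NameNode SecondaryNameNode ResourceManager
For a datanode: ssh linux3 jps | missing_daemons DataNode NodeManager (this assumes jps is on the PATH of the remote non-interactive shell; otherwise use its full path under $JAVA_HOME/bin). No output means nothing is missing.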

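7: Change Linux to boot to the console (text mode). The above is only a preliminary test; http://192.168.100.171:8088 is reachable, but many problems remain:
The "Unable to load native-hadoop library for your platform" warning during startup; see:
"Resolving errors caused by a mismatch between the Hadoop native library and the system version"
"An introduction to Hadoop's native libraries"
Configuring 192.168.100.172 as a namenode; see:
"A first look at Hadoop 0.23.0"
Testing also showed that JAVA_HOME in /app/hadoop204/etc/hadoop/hadoop-env.sh and /app/hadoop204/etc/hadoop/yarn-env.sh must be set to an absolute path; otherwise an error is reported even though JAVA_HOME is already configured in /etc/profile.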
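
Since a non-absolute JAVA_HOME in the two env files is easy to overlook, a quick check can catch it before starting the cluster. A sketch; java_home_is_absolute is a hypothetical helper that only looks for a JAVA_HOME assignment whose value starts with "/".

```shell
# java_home_is_absolute FILE: succeed if FILE assigns JAVA_HOME a value that
# starts with "/" (with or without a leading "export").
java_home_is_absolute() {
    grep -Eq '^[[:space:]]*(export[[:space:]]+)?JAVA_HOME=/' "$1"
}
```

For example: for f in hadoop-env.sh yarn-env.sh; do java_home_is_absolute "/app/hadoop204/etc/hadoop/$f" || echo "$f: check JAVA_HOME"; done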

There is still a long way to go...
