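Environment: three CentOS 6.6 virtual machines with the hostnames hadoop4, hadoop5, and hadoop6, connected through bridged networking and able to ping one another. Each machine has JDK 1.7 installed and the iptables firewall turned off, with a new user created and passwordless SSH configured.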
Edit the configuration file hadoop-env.sh to set the Java path
[grid@hadoop4 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_72
Edit the configuration file yarn-env.sh to set the Java path
[grid@hadoop4 hadoop]$ vi yarn-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_72
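Before moving on, it can help to confirm that the JAVA_HOME value written into both files really points at a JDK. The following is a minimal sketch; the function name check_java_home is hypothetical and is not part of Hadoop.

```shell
#!/bin/sh
# check_java_home DIR: succeed only if DIR/bin/java exists and is executable.
# (Hypothetical helper for illustration, not a Hadoop command.)
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "OK: $1/bin/java"
    else
        echo "MISSING: $1/bin/java" >&2
        return 1
    fi
}
```

For the path used in this article the call would be: check_java_home /usr/java/jdk1.7.0_72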
Edit the configuration file slaves to list the slave nodes
[grid@hadoop4 hadoop]$ vi slaves
hadoop5
hadoop6
Edit the configuration file core-site.xml to set the file system access point
[grid@hadoop4 hadoop]$ vi core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop4:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/grid/hadoop-2.5.2/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>hadoop.proxyuser.hduser.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hduser.groups</name>
<value>*</value>
</property>
</configuration>
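A typo in a property name or value is easy to miss in these files, so reading a value back after editing can be a useful check. The sketch below is line-based and assumes one tag per line, as in the listing above; it is not a real XML parser, and the function name get_prop is hypothetical.

```shell
#!/bin/sh
# get_prop FILE NAME: print the <value> that follows <name>NAME</name>.
# Assumes one tag per line, as in the listings in this article.
get_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/^.*<value>/, "")
            sub(/<\/value>.*$/, "")
            print
            exit
        }
    ' "$1"
}
```

For example, get_prop core-site.xml fs.defaultFS should print the hdfs:// address set above. Once the Hadoop commands are on the PATH, hdfs getconf -confKey fs.defaultFS reports the value that Hadoop itself resolves.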
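Edit the configuration file hdfs-site.xml to set the SecondaryNameNode address, the NameNode and DataNode storage directories, and the replication factor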
[grid@hadoop4 hadoop]$ vi hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop4:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/grid/hadoop-2.5.2/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/grid/hadoop-2.5.2/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
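Here dfs.replication is 2 and the slaves file lists two DataNodes, so every block can reach its target number of replicas. If the replication factor were larger than the number of DataNodes, blocks would remain under-replicated. The following sketch compares the two numbers; the function names are hypothetical.

```shell
#!/bin/sh
# count_workers FILE: count the hosts in a slaves file, ignoring blank
# lines and lines that start with '#'.
count_workers() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | wc -l | tr -d ' '
}

# check_replication REPLICATION FILE: fail if FILE lists fewer DataNodes
# than the requested number of replicas.
check_replication() {
    workers=$(count_workers "$2")
    if [ "$1" -le "$workers" ]; then
        echo "OK: replication $1 <= $workers DataNodes"
    else
        echo "WARN: replication $1 > $workers DataNodes" >&2
        return 1
    fi
}
```

With the files from this article, check_replication 2 slaves reports OK.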
Edit the configuration file mapred-site.xml to run MapReduce on YARN and set the JobHistory server addresses
[grid@hadoop4 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[grid@hadoop4 hadoop]$ vi mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop4:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop4:19888</value>
</property>
</configuration>
Edit the configuration file yarn-site.xml to set the shuffle auxiliary service and the ResourceManager addresses
[grid@hadoop4 hadoop]$ vi yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop4:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop4:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop4:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoop4:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop4:8088</value>
</property>
</configuration>
Distribute Hadoop to the other nodes
[grid@hadoop4 ~]$ scp -r hadoop-2.5.2 hadoop5:~
[grid@hadoop4 ~]$ scp -r hadoop-2.5.2 hadoop6:~
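With more than a couple of nodes, the copy commands can be generated from the slaves file instead of being typed by hand. The sketch below only prints the scp commands so they can be reviewed first; the function name distribute_cmds is hypothetical.

```shell
#!/bin/sh
# distribute_cmds DIR FILE: print one scp command per host listed in FILE.
# Comments ('#' to end of line) and blank lines are skipped. Nothing is
# copied here; the commands are only printed.
distribute_cmds() {
    sed -e 's/#.*$//' -e 's/[[:space:]]*$//' -e '/^$/d' "$2" |
    while read -r host; do
        echo "scp -r $1 $host:~"
    done
}
```

Run from the home directory, distribute_cmds hadoop-2.5.2 hadoop-2.5.2/etc/hadoop/slaves prints the two commands shown above; piping the output to sh executes them.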
Configure the Hadoop environment variables (shown here on hadoop6; repeat on the other nodes)
[grid@hadoop6 ~]$ vi .bash_profile
## Hadoop
export HADOOP_PREFIX=/home/grid/hadoop-2.5.2
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
[grid@hadoop6 ~]$ source .bash_profile
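After sourcing the profile, a quick check that the Hadoop directories actually landed on the PATH can save confusion later. This is a minimal sketch; path_has is a hypothetical helper name.

```shell
#!/bin/sh
# path_has DIR: succeed if DIR is one of the colon-separated PATH entries.
path_has() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: path_has "$HADOOP_PREFIX/bin" && hadoop version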
Format the NameNode
[grid@hadoop4 hadoop-2.5.2]$ ./bin/hdfs namenode -format
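The format succeeded if the output contains a line reporting that the storage directory "has been successfully formatted".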
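Formatting a NameNode that already holds data discards the existing HDFS metadata, so a guard before the format step can be worthwhile. A formatted NameNode directory contains a current/VERSION file; the sketch below checks for it. Both function names are hypothetical, and the directory is the dfs.namenode.name.dir value from hdfs-site.xml.

```shell
#!/bin/sh
# is_formatted NAME_DIR: succeed if NAME_DIR already holds NameNode
# metadata, detected by the presence of NAME_DIR/current/VERSION.
is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# safe_format_cmd NAME_DIR: print the format command only when NAME_DIR
# has not been formatted yet; otherwise refuse with an error.
safe_format_cmd() {
    if is_formatted "$1"; then
        echo "refusing: $1 is already formatted" >&2
        return 1
    fi
    echo "./bin/hdfs namenode -format"
}
```

For this cluster the call would be: safe_format_cmd /home/grid/hadoop-2.5.2/name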
Start HDFS
[grid@hadoop4 hadoop-2.5.2]$ ./sbin/start-dfs.sh
At this point hadoop4 runs the NameNode and SecondaryNameNode processes; hadoop5 and hadoop6 each run a DataNode process.
Start YARN
[grid@hadoop4 hadoop-2.5.2]$ ./sbin/start-yarn.sh
At this point hadoop4 runs the NameNode, SecondaryNameNode, and ResourceManager processes; hadoop5 and hadoop6 each run the DataNode and NodeManager processes.
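The process lists above can be checked with the JDK's jps tool, which prints one "PID ClassName" line per running Java process. The sketch below reads jps output on standard input and reports any expected daemon that is absent; the function name missing_daemons is hypothetical.

```shell
#!/bin/sh
# missing_daemons NAME...: read jps output ("PID ClassName" per line) on
# stdin and print each expected NAME that is not running. Returns non-zero
# if at least one daemon is missing.
missing_daemons() {
    running=$(awk '{ print $2 }')
    status=0
    for name in "$@"; do
        if ! printf '%s\n' "$running" | grep -qxF "$name"; then
            echo "$name"
            status=1
        fi
    done
    return $status
}
```

On hadoop4 this could be run as jps | missing_daemons NameNode SecondaryNameNode ResourceManager, and on hadoop5 and hadoop6 as jps | missing_daemons DataNode NodeManager. No output means every expected daemon is up.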