I. Prepare the software environment:
hadoop-2.6.0.tar.gz
CentOS-5.11-i386
jdk-6u24-linux-i586
Master: hadoop02 192.168.20.129
Slave01: hadoop03 192.168.20.130
Slave02: hadoop04 192.168.20.131
II. Install the JDK, SSH, and Hadoop [on hadoop02 first]
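For the JDK:
chmod u+x jdk-6u24-linux-i586.bin
./jdk-6u24-linux-i586.bin
mv jdk1.6.0_24 /home/jdk
Note: to verify that the JDK is installed (this works once JAVA_HOME and PATH have been set in /etc/profile, see the Hadoop step):
#java -version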
For SSH:
ssh-keygen -t rsa
cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
Note: to verify that passwordless SSH login works:
#ssh localhost
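If `ssh localhost` still asks for a password, one common cause is that sshd (with its StrictModes option enabled) rejects key files whose permissions are too open. The sketch below tightens them; `fix_ssh_perms` is a hypothetical helper name, not part of OpenSSH, and the 700/600 modes are the usual convention rather than something the original notes specify.

```shell
#!/bin/sh
# fix_ssh_perms [DIR]
# Restrict DIR (default: ~/.ssh) to its owner so sshd accepts the key files.
fix_ssh_perms() {
    dir="${1:-$HOME/.ssh}"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # Only the owner may enter or list the directory.
    chmod 700 "$dir"
    # authorized_keys may not exist yet; only touch it if it does.
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys"
    fi
}
```

Run `fix_ssh_perms` with no argument as the user who owns the key (root in these notes), then retry `ssh localhost`.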
For Hadoop:
tar -zxvf hadoop-2.6.0.tar.gz
mv hadoop-2.6.0 /home/hadoop
#vim /etc/profile
export JAVA_HOME=/home/jdk
export HADOOP_HOME=/home/hadoop
export PATH=.:$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
#source /etc/profile
#vim /etc/hosts
192.168.20.129 hadoop02
192.168.20.130 hadoop03
192.168.20.131 hadoop04
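Every node must be able to resolve all three hostnames, so it can help to confirm that the entries really made it into /etc/hosts. The following is a minimal sketch; `check_hosts` is a hypothetical helper that only inspects the file's text and does not test actual name resolution.

```shell
#!/bin/sh
# check_hosts FILE NAME...
# Prints "missing: NAME" for each NAME with no entry in FILE.
# Returns 1 if any name is missing, 0 otherwise.
check_hosts() {
    file="$1"; shift
    missing=0
    for name in "$@"; do
        # Field 1 is the IP address; hostnames and aliases follow it.
        # Full-line comments are skipped, and a trailing "# ..." ends the line.
        if ! awk -v n="$name" '
            /^[ \t]*#/ { next }
            {
                for (i = 2; i <= NF; i++) {
                    if ($i ~ /^#/) break
                    if ($i == n) found = 1
                }
            }
            END { exit found ? 0 : 1 }' "$file"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return $missing
}
```

For example, `check_hosts /etc/hosts hadoop02 hadoop03 hadoop04` prints nothing when all three entries are present.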
III. Configure Hadoop [on hadoop02 first]
All of the files below are in /home/hadoop/etc/hadoop.
1) Config file 1: hadoop-env.sh
export JAVA_HOME=/home/jdk
2) Config file 2: yarn-env.sh
export JAVA_HOME=/home/jdk
3) Config file 3: slaves
hadoop03
hadoop04
4) Config file 4: core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop-${user.name}</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop02:9000</value>
  </property>
</configuration>
Note: fs.default.name is the deprecated name of fs.defaultFS; Hadoop 2.6.0 still accepts it.
5) Config file 5: hdfs-site.xml
<configuration>
  <property>
    <name>dfs.http.address</name>
    <value>hadoop02:50070</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop02:50090</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
6) Config file 6: mapred-site.xml (not present by default; copy it from mapred-site.xml.template)
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop02:9001</value>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>20</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>4</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop02:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop02:19888</value>
  </property>
</configuration>
Note: mapred.job.tracker is an MRv1 (JobTracker) setting and has no effect when mapreduce.framework.name is yarn.
7) Config file 7: yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop02:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop02:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop02:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop02:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop02:8033</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
Note: the class property name must contain the service name exactly as it is written in yarn.nodemanager.aux-services, hence mapreduce_shuffle.class rather than mapreduce.shuffle.class.
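After editing this many XML files by hand, it is easy to leave a typo in a property name. As a quick sanity check, the sketch below pulls one value out of a site file with plain awk. `get_prop` is a hypothetical helper and deliberately naive: it assumes the `<name>` tag contains no extra whitespace, that the matching `<value>` follows it, and that no XML comments get in the way. On a working installation, `hdfs getconf -confKey <key>` reports the value Hadoop itself resolves.

```shell
#!/bin/sh
# get_prop FILE KEY
# Prints the <value> that follows <name>KEY</name> in a Hadoop *-site.xml.
# Returns 1 if the key (or its value) is not found.
get_prop() {
    awk -v key="$2" '
        # Join the whole file into one string so the layout does not matter.
        { doc = doc " " $0 }
        END {
            needle = "<name>" key "</name>"
            pos = index(doc, needle)
            if (pos == 0) exit 1
            rest = substr(doc, pos + length(needle))
            start = index(rest, "<value>")
            stop = index(rest, "</value>")
            if (start == 0 || stop == 0 || stop < start) exit 1
            # 7 is the length of the string "<value>".
            print substr(rest, start + 7, stop - start - 7)
        }' "$1"
}
```

For example, `get_prop /home/hadoop/etc/hadoop/core-site.xml fs.default.name` should print the NameNode URI configured above.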
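IV. Configure hadoop03 and hadoop04
They are configured the same way as hadoop02, so copy everything over from hadoop02. The destination is always the parent directory, because scp -r into a directory that already exists nests the source inside it (copying /home/ to /home/ would produce /home/home):
scp -r /root/.ssh root@hadoop03:/root/
scp -r /root/.ssh root@hadoop04:/root/
scp /etc/profile root@hadoop03:/etc/
scp /etc/profile root@hadoop04:/etc/
scp /etc/hosts root@hadoop03:/etc/
scp /etc/hosts root@hadoop04:/etc/
scp -r /home/jdk /home/hadoop root@hadoop03:/home/
scp -r /home/jdk /home/hadoop root@hadoop04:/home/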
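With more slaves the copy commands get repetitive, so they can be wrapped in a loop. This is a sketch under a few assumptions: `SLAVES`, `DRY_RUN`, `run`, and `sync_to_slaves` are names made up for illustration, only /home/jdk and /home/hadoop are copied rather than all of /home, and the script defaults to a dry run that just prints each command.

```shell
#!/bin/sh
# Hosts to copy to; override by setting SLAVES before sourcing this file.
SLAVES="${SLAVES:-hadoop03 hadoop04}"
# DRY_RUN=1 prints the commands, DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD ARGS...: print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy SSH keys, system files, and the JDK/Hadoop trees to every slave.
sync_to_slaves() {
    for host in $SLAVES; do
        run scp -r /root/.ssh "root@$host:/root/"
        run scp /etc/profile /etc/hosts "root@$host:/etc/"
        run scp -r /home/jdk /home/hadoop "root@$host:/home/"
    done
}
```

Call `sync_to_slaves` once to review the printed commands, then set `DRY_RUN=0` and call it again to perform the copy.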
V. Start the Hadoop cluster
1) Format the NameNode:
/home/hadoop/bin/hdfs namenode -format
2) Start HDFS:
/home/hadoop/sbin/start-dfs.sh
Processes now running on the Master: NameNode, SecondaryNameNode
Processes now running on Slave01 and Slave02: DataNode
3) Start YARN:
/home/hadoop/sbin/start-yarn.sh
Processes now running on the Master: NameNode, SecondaryNameNode, ResourceManager
Processes now running on Slave01 and Slave02: DataNode, NodeManager
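The process lists above can be checked with the JDK's `jps` tool, which prints one `<pid> <main class>` line per running JVM. Below is a minimal sketch that compares such output with the daemons expected for each role; `check_daemons` is a hypothetical helper, and the expected lists simply mirror the ones given in these notes.

```shell
#!/bin/sh
# check_daemons master|slave
# Reads `jps` output on stdin and prints "not running: NAME" for each
# expected daemon that is absent. Returns 1 if any daemon is missing.
check_daemons() {
    case "$1" in
        master) expected="NameNode SecondaryNameNode ResourceManager" ;;
        slave)  expected="DataNode NodeManager" ;;
        *) echo "usage: check_daemons master|slave" >&2; return 2 ;;
    esac
    listing=$(cat)
    status=0
    for daemon in $expected; do
        # Compare the second column exactly: a substring match would let
        # "SecondaryNameNode" pass for "NameNode".
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit ok ? 0 : 1 }'; then
            echo "not running: $daemon"
            status=1
        fi
    done
    return $status
}
```

On the master this would be used as `jps | check_daemons master`, and on each slave as `jps | check_daemons slave`.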
4) Check the result
Check the cluster status:
hdfs dfsadmin -report
Browse HDFS:
http://192.168.20.129:50070
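VI. Summary: errors encountered
15/05/11 13:41:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
This warning appears because /home/hadoop/lib/native/libhadoop.so.1.0.0 is a 64-bit build, while the system I am using is 32-bit. It does not affect the cluster.
To check the library's architecture:
#file libhadoop.so.1.0.0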
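The comparison can also be scripted. The sketch below extracts the word size from the output of `file`; `elf_bits` is a hypothetical helper, and it assumes the usual "ELF 32-bit" / "ELF 64-bit" wording that `file` prints for Linux shared objects.

```shell
#!/bin/sh
# elf_bits
# Reads `file` output on stdin and prints 32 or 64 for an ELF object.
# Prints nothing if the input does not describe an ELF file.
elf_bits() {
    sed -n -e 's/.*ELF 32-bit.*/32/p' -e 's/.*ELF 64-bit.*/64/p'
}

# Example (commented out so that sourcing this file has no side effects):
# lib_bits=$(file /home/hadoop/lib/native/libhadoop.so.1.0.0 | elf_bits)
# os_bits=$(getconf LONG_BIT)
# [ "$lib_bits" = "$os_bits" ] ||
#     echo "native library is ${lib_bits}-bit, system is ${os_bits}-bit"
```

If the two numbers differ, Hadoop falls back to its built-in Java classes and logs the warning shown above.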