[root@master ~]# uname -a
Linux master.hadoop 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@master ~]# cat /etc/issue
CentOS release 6.5 (Final)
Three hosts:
Hostname | Short name | IP
master.hadoop | master | 192.168.56.102
slave1.hadoop | slave1 | 192.168.56.103
slave2.hadoop | slave2 | 192.168.56.104
Install the JDK:
JDK download URL:
http://download.oracle.com/otn-pub/java/jdk/8u45-b14/jdk-8u45-linux-x64.tar.gz
Extract it to the target directory:
tar -zxvf jdk-8u45-linux-x64.tar.gz -C /opt
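A quick sanity check that the JDK is usable before going further (path matches the extraction above):
/opt/jdk1.8.0_45/bin/java -version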
Build a 64-bit version from the Hadoop source (see the previous article for how to compile it); the compiled package is available on Baidu Cloud:
http://pan.baidu.com/s/1qWFSigk
Extract it to the target directory:
tar -zxvf hadoop-2.6.0-x64.tar.gz -C /opt
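A quick way to confirm the build really is 64-bit is to inspect the native library (assuming the standard lib/native layout of a Hadoop distribution):
file /opt/hadoop-2.6.0/lib/native/libhadoop.so.1.0.0
# expect: ELF 64-bit LSB shared object, x86-64 ...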
2. Passwordless SSH between the hosts
Generate an RSA key pair with ssh-keygen -t rsa; the keys are written to ~/.ssh:
Private key file: id_rsa
Public key file: id_rsa.pub
Append the id_rsa.pub contents from all three hosts to ~/.ssh/authorized_keys; this can all be driven from one machine:
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
# Generate keys on the slaves, then pull their public keys into the master's file
ssh 192.168.56.103 ssh-keygen -t rsa
ssh 192.168.56.103 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh 192.168.56.104 ssh-keygen -t rsa
ssh 192.168.56.104 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# Push the combined file back out to both slaves
scp ~/.ssh/authorized_keys 192.168.56.103:~/.ssh
scp ~/.ssh/authorized_keys 192.168.56.104:~/.ssh
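If key-based login still prompts for a password, it is usually a permissions problem: with sshd's default StrictModes setting, a group- or world-writable authorized_keys file is ignored. Tightening the permissions on every host is a safe precaution:
# Run on each of the three hosts
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys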
Run the following commands on every machine:
ssh 192.168.56.102 date
ssh 192.168.56.103 date
ssh 192.168.56.104 date
If each command returns the date without prompting for a password, mutual trust is established.
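The three checks can also be folded into one loop (a minimal sketch; run it on each machine):
for h in 192.168.56.102 192.168.56.103 192.168.56.104; do
    ssh "$h" date
done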
3. Hadoop configuration
Configure the environment variables:
Edit /etc/profile (vi /etc/profile) and append the following:
export JAVA_HOME=/opt/jdk1.8.0_45
export HADOOP_HOME=/opt/hadoop-2.6.0
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Make it take effect:
source /etc/profile
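Verify that the variables took effect:
echo $JAVA_HOME    # should print /opt/jdk1.8.0_45
java -version
hadoop version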
Edit /etc/hosts (vi /etc/hosts) and append the following at the end (canonical FQDN first, then the short alias):
192.168.56.102 master.hadoop master
192.168.56.103 slave1.hadoop slave1
192.168.56.104 slave2.hadoop slave2
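Verify that the names resolve (a quick check against the hosts database):
getent hosts master.hadoop slave1.hadoop slave2.hadoop
ping -c 1 slave1.hadoop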
Enter the Hadoop configuration directory:
cd /opt/hadoop-2.6.0/etc/hadoop/
Add the following environment variable near the top of both hadoop-env.sh and yarn-env.sh (do not skip this):
export JAVA_HOME=/opt/jdk1.8.0_45
vi core-site.xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master.hadoop:9000</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>4096</value>
    </property>
</configuration>
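hadoop.tmp.dir must exist and be writable on every node; creating it up front avoids startup surprises (assuming the passwordless SSH from section 2 is in place):
for h in master.hadoop slave1.hadoop slave2.hadoop; do
    ssh "$h" mkdir -p /opt/hadoop/tmp
done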
vi hdfs-site.xml
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///opt/hadoop/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///opt/hadoop/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>hadoop-cluster1</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master.hadoop:50090</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
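Likewise, the NameNode and DataNode directories configured above should exist before the first start; a minimal sketch (name dir on the master, data dirs on the slaves):
mkdir -p /opt/hadoop/dfs/name                      # on master (NameNode)
ssh slave1.hadoop mkdir -p /opt/hadoop/dfs/data    # DataNode storage
ssh slave2.hadoop mkdir -p /opt/hadoop/dfs/data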
cp mapred-site.xml.template mapred-site.xml
vi mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <final>true</final>
    </property>
    <property>
        <name>mapreduce.jobtracker.http.address</name>
        <value>master.hadoop:50030</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master.hadoop:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master.hadoop:19888</value>
    </property>
    <!-- mapred.job.tracker is a Hadoop 1.x property; it is ignored once
         mapreduce.framework.name is set to yarn, and is kept here only for reference -->
    <property>
        <name>mapred.job.tracker</name>
        <value>http://master.hadoop:9001</value>
    </property>
</configuration>
vi yarn-site.xml
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master.hadoop</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master.hadoop:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master.hadoop:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master.hadoop:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master.hadoop:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master.hadoop:8088</value>
    </property>
</configuration>
vi slaves
slave1.hadoop
slave2.hadoop
Apply the same settings on the other hosts; a quick way to copy everything over is shown below.
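Assuming the JDK and Hadoop were extracted to the same paths on the slaves, the configuration files can simply be pushed out from the master:
scp -r /opt/hadoop-2.6.0/etc/hadoop slave1.hadoop:/opt/hadoop-2.6.0/etc/
scp -r /opt/hadoop-2.6.0/etc/hadoop slave2.hadoop:/opt/hadoop-2.6.0/etc/
scp /etc/profile /etc/hosts slave1.hadoop:/etc/
scp /etc/profile /etc/hosts slave2.hadoop:/etc/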
4. Disable the firewall on all three machines:
service iptables stop
chkconfig iptables off
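To confirm the firewall is stopped now and stays off after a reboot:
service iptables status      # should report that the firewall is not running
chkconfig --list iptables    # all runlevels should show "off"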
At this point the Hadoop cluster is fully configured. On master.hadoop, format the NameNode and start the cluster:
hdfs namenode -format
/opt/hadoop-2.6.0/sbin/start-all.sh
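Note that start-all.sh starts HDFS and YARN but not the JobHistory server configured in mapred-site.xml; that daemon has its own script:
/opt/hadoop-2.6.0/sbin/mr-jobhistory-daemon.sh start historyserver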
5. Verification
On the master, jps should show the NameNode, SecondaryNameNode, and ResourceManager daemons:
[root@master ~]# jps
On slave1 and slave2, jps should show the DataNode and NodeManager daemons:
[root@slave1 ~]# jps
Finally, open the ResourceManager web UI and check that both slave nodes show up as active:
http://192.168.56.102:8088/cluster/nodes
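The same can be verified from the command line on the master; the report should list two live DataNodes:
hdfs dfsadmin -report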