Environment:
OS: CentOS 6.5 x64
Hadoop Version: 2.6.0
Host:
IP | HostName |
---|---|
172.22.35.167 | Master |
172.22.35.159 | Slave1 |
172.22.35.158 | Slave2 |
172.22.35.147 | Slave3 |
172.22.35.160 | Second |
The setup process is as follows:
Step 1: Create a hadoop user on all hosts
#useradd hadoop
#passwd hadoop
Use the same username and password on all hosts.
Step 2:
(1) Disable SELinux on all of the hosts above
vi /etc/sysconfig/selinux
Change
#SELINUX=enforcing
to
SELINUX=disabled
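This change only takes effect after the reboot at the end of this step. As an optional extra (not part of the original steps), SELinux can also be switched off for the current session and then verified:
setenforce 0
getenforce    # prints Permissive now, Disabled after the reboot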
(2) Stop the firewall on all hosts
service iptables stop
Disable the firewall from starting at boot on all hosts
chkconfig iptables off
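To double-check the firewall state afterwards (optional):
service iptables status
chkconfig --list iptables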
(3) Carry out all of the above operations on every host, then reboot all hosts.
Step 3: Switch to the hadoop user
su hadoop
Then set up passwordless SSH login between all hosts.
See:
Setting up passwordless SSH login between hosts on CentOS 6.5
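In case the linked post is not handy, a minimal sketch of the usual approach (run as the hadoop user, repeated on every host, using the hostnames from the table above):
ssh-keygen -t rsa                # accept the defaults, empty passphrase
ssh-copy-id hadoop@Master
ssh-copy-id hadoop@Slave1
ssh-copy-id hadoop@Slave2
ssh-copy-id hadoop@Slave3
ssh-copy-id hadoop@Second
Afterwards, ssh from any host to any other should no longer prompt for a password.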
Step 4:
Download Hadoop 2.6.0: http://hadoop.apache.org/releases.html
Then extract it
sudo tar -xvzf hadoop-2.6.0.tar.gz
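The configuration and the .bashrc below assume the Hadoop directory is /home/hadoop/hadoop, so the extracted folder is assumed to be moved and re-owned accordingly:
mv hadoop-2.6.0 /home/hadoop/hadoop
chown -R hadoop:hadoop /home/hadoop/hadoop   # needed because the tarball was extracted with sudo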
Step 5:
cd hadoop/etc/hadoop/
Configure the following files:
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop/var/tmp_hadoop</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hadoop/hadoop/data/dfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/hadoop/hadoop/data/dfs/datanode</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:///home/hadoop/hadoop/data/dfs/namesecondary</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>Second:50090</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
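The directories referenced in core-site.xml and hdfs-site.xml are normally created by Hadoop itself, but creating them up front on the hosts that use them avoids permission surprises (an optional extra):
mkdir -p /home/hadoop/hadoop/var/tmp_hadoop
mkdir -p /home/hadoop/hadoop/data/dfs/namenode
mkdir -p /home/hadoop/hadoop/data/dfs/datanode
mkdir -p /home/hadoop/hadoop/data/dfs/namesecondary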
mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>Execution framework.</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>Master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>Master:19888</value>
  </property>
</configuration>
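Note that the 2.6.0 tarball ships only mapred-site.xml.template in etc/hadoop; if mapred-site.xml does not exist yet, it can be created from the template first:
cp mapred-site.xml.template mapred-site.xml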
yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>Master</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>Master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>Master:8031</value>
  </property>
</configuration>
slaves: lists all the DataNode hostnames
Slave1
Slave2
Slave3
masters: lists the SecondaryNameNode hostname
Second
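A quick way to write these two files (run inside hadoop/etc/hadoop/), just a convenience sketch of the same content:
printf 'Slave1\nSlave2\nSlave3\n' > slaves
echo 'Second' > masters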
Step 6: Configure the JDK
Download jdk1.7.0_60.tar.gz, then extract it
sudo tar -xvzf jdk1.7.0_60.tar.gz
Then configure the environment variables
[hadoop@Slave2 ~]$ cd
vi .bashrc, and append at the end:
# add by zt
export JAVA_HOME=/home/hadoop/jdk1.7.0_60
export PATH=$JAVA_HOME/bin:$PATH
# configure hadoop environment
export HADOOP_COMMON_HOME=$HOME/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_COMMON_HOME
export HADOOP_CONF_DIR=$HADOOP_COMMON_HOME/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_COMMON_HOME
export YARN_HOME=$HADOOP_COMMON_HOME
export HIVE_HOME=/home/hadoop/hive
export HBASE_HOME=/home/hadoop/hbase
export ZOOKEEPER_HOME=/home/hadoop/zookeeper
export PATH=$PATH:$HADOOP_COMMON_HOME/bin
export PATH=$PATH:$HADOOP_COMMON_HOME/sbin
export PATH=$PATH:$HIVE_HOME/bin:$HBASE_HOME/bin:$ZOOKEEPER_HOME/bin
Then save and quit
:wq!
Then run source so the changes take effect immediately
source .bashrc
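A quick check that the environment is picked up (optional):
java -version       # should report 1.7.0_60
hadoop version      # should report Hadoop 2.6.0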
Step 7: Copy hadoop (and .bashrc) to the other hosts one by one
scp .bashrc hadoop@Slave1:/home/hadoop/
scp -r hadoop hadoop@Slave1:/home/hadoop/
scp .bashrc hadoop@Slave2:/home/hadoop/
scp -r hadoop hadoop@Slave2:/home/hadoop/
scp .bashrc hadoop@Slave3:/home/hadoop/
scp -r hadoop hadoop@Slave3:/home/hadoop/
scp .bashrc hadoop@Second:/home/hadoop/
scp -r hadoop hadoop@Second:/home/hadoop/
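The same copies can also be written as a small loop, an equivalent form of the commands above:
for h in Slave1 Slave2 Slave3 Second; do
  scp ~/.bashrc hadoop@$h:/home/hadoop/
  scp -r ~/hadoop hadoop@$h:/home/hadoop/
done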
Then run the following on each of those hosts:
source .bashrc
Step 8: Format the NameNode
[hadoop@Master hadoop]$ hdfs namenode -format
Step 9: Start DFS
[hadoop@Master logs]$ start-dfs.sh
Start YARN
[hadoop@Master logs]$ start-yarn.sh
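start-yarn.sh does not start the JobHistory server configured in mapred-site.xml; if the history web UI at Master:19888 is wanted, it can be started separately (optional):
mr-jobhistory-daemon.sh start historyserver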
At this point, Hadoop startup is complete.
Step 10: Run jps on Master and you will see the started processes
[hadoop@Master ~]$ jps
2700 NameNode
2975 ResourceManager
15850 Jps
Running jps on the three DataNodes shows:
[hadoop@Slave2 ~]$ jps
2065 NodeManager
11653 Jps
1916 DataNode
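On Second, jps should additionally show SecondaryNameNode. A few more ways to confirm the cluster is healthy (using the default web UI ports for 2.6.0):
hdfs dfsadmin -report        # should list 3 live datanodes
NameNode web UI:        http://Master:50070
ResourceManager web UI: http://Master:8088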
Done! After all this back and forth, this serves as a small write-up.
The Markdown editor works quite nicely!
welcome 2016!