1. Downloading Hadoop 2.2.0
Download URL:
http://archive.apache.org/dist/hadoop/core/hadoop-2.2.0/
2. Setting up the cluster environment
Steps to set a static IP address on CentOS:
$ sudo vim /etc/sysconfig/network-scripts/ifcfg-eth0
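Add the following lines (IPADDR must be the node's own address; the values below are for compute-0-0):
IPADDR=192.168.61.1
NETMASK=255.255.255.0
NETWORK=192.168.61.0
Then restart the network service so that the new address takes effect: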
$ sudo service network restart
$ ifconfig (check that the setting took effect)
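The NETWORK value is the bitwise AND of IPADDR and NETMASK, so it can be derived rather than typed by hand. A minimal sketch, assuming dotted-decimal IPv4 input; the function name network_address is made up for illustration:

```shell
#!/bin/bash
# Print the network address for an IPv4 address and netmask,
# computed as the octet-by-octet bitwise AND of the two.
network_address() {
    local ip mask i1 i2 i3 i4 m1 m2 m3 m4
    ip=$1
    mask=$2
    # Split both dotted-decimal strings into their four octets.
    IFS=. read -r i1 i2 i3 i4 <<EOF
$ip
EOF
    IFS=. read -r m1 m2 m3 m4 <<EOF
$mask
EOF
    echo "$((i1 & m1)).$((i2 & m2)).$((i3 & m3)).$((i4 & m4))"
}

network_address 192.168.61.1 255.255.255.0
```

With a 255.255.255.0 netmask, every node whose address starts with 192.168.61 ends up in the same network.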
To set the hostname:
$ sudo vim /etc/sysconfig/network (change HOSTNAME to your hostname)
$ hostname (show the current hostname)
1. Edit the /etc/hosts file
Add the IP-to-hostname mapping of every node in the cluster, with the IP address first:
192.168.61.1 compute-0-0
192.168.61.2 compute-0-1
192.168.61.3 compute-0-2
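Each /etc/hosts line has the IP address in the first column and the hostname after it. For larger clusters the entries can be generated instead of typed. A minimal sketch, assuming the nodes are numbered consecutively as in this guide; the function name hosts_entries is hypothetical:

```shell
#!/bin/bash
# Print one /etc/hosts line per node in the form "<IP> <hostname>".
# Assumes hostnames compute-0-0, compute-0-1, ... and addresses that
# start at 192.168.61.1, as used in this guide.
hosts_entries() {
    local count i
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "192.168.61.$((i + 1)) compute-0-$i"
        i=$((i + 1))
    done
}

hosts_entries 3
```

The output still has to be appended to /etc/hosts on every node, for example by piping it to sudo tee -a /etc/hosts.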
2. Create the same user, grid, on every machine
$ useradd grid
$ passwd grid
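Give the account sudo rights by adding this line to /etc/sudoers:
grid ALL=(ALL) ALL
3. Set up SSH
This step lets the machines run commands on each other without a password prompt. On every machine, create a .ssh directory and run the following: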
$ mkdir .ssh
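$ ssh-keygen -t rsa (generate the key pair)
Keep pressing Enter to accept the defaults. The key pair is saved as .ssh/id_rsa (private key) and .ssh/id_rsa.pub (public key). Then run:
$ cd .ssh
$ cat id_rsa.pub >> authorized_keys (append the public key, not the private key, to the authorized keys)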
$ scp authorized_keys compute-0-1:/home/grid/.ssh
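Finally, in the .ssh directory on every machine, restrict the permissions of authorized_keys:
$ chmod 644 authorized_keys (the file must not be more permissive than 644; 600 also works)
Restart the SSH service so that the configuration takes effect: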
$ service sshd restart
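Wrong permissions on authorized_keys are a common reason why key-based login still asks for a password. A minimal sketch of a check, assuming GNU stat as shipped with CentOS; the function name keys_mode_ok is made up for illustration:

```shell
#!/bin/bash
# Succeed if the given file has mode 600 or 644, the two modes this
# guide names as acceptable for authorized_keys.
# Assumes GNU stat (the -c option), as found on CentOS.
keys_mode_ok() {
    local mode
    mode=$(stat -c '%a' "$1") || return 1
    [ "$mode" = "600" ] || [ "$mode" = "644" ]
}
```

Running keys_mode_ok ~/.ssh/authorized_keys on each node, followed by ssh -o BatchMode=yes compute-0-1 hostname from the master, shows whether passwordless login works; BatchMode makes ssh fail instead of prompting for a password.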
4. Install the JDK (keep the Java version the same across the cluster where possible)
Download URL:
http://www.oracle.com/technetwork/java/javase/archive-139210.html
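Use /usr/ as the Java installation location: create a java directory there, move jdk-7u40-linux-i586.tar.gz into /usr/java, and unpack it:
$ cd /usr
$ mkdir java
$ cd java (after moving the archive into this directory)
$ tar -zxvf jdk-7u40-linux-i586.tar.gz
$ rm -rf jdk-7u40-linux-i586.tar.gz (to save space)
The JDK is now installed. Next, set the environment variables by adding the following lines to /etc/profile: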
$ vim /etc/profile
JAVA_HOME=/usr/java/jdk1.7.0_40
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME PATH CLASSPATH
Then run source /etc/profile to apply the changes.
Verify the installation: java -version
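Setting PATH to $JAVA_HOME/bin alone would hide system commands such as ls, so the Java directory is normally prepended to the existing PATH. Because /etc/profile can be sourced more than once, it also helps to avoid adding the same entry twice. A minimal sketch; the helper name prepend_path is hypothetical:

```shell
#!/bin/bash
# Print a PATH-style string with the directory prepended, unless the
# directory is already one of its entries. The environment itself is
# not modified by this function.
prepend_path() {
    local dir path
    dir=$1
    path=$2
    case ":$path:" in
        *":$dir:"*) echo "$path" ;;
        *) echo "$dir:$path" ;;
    esac
}

JAVA_HOME=/usr/java/jdk1.7.0_40
PATH=$(prepend_path "$JAVA_HOME/bin" "$PATH")
export JAVA_HOME PATH
```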
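5. Disable the firewall on every machine
/etc/init.d/iptables stop //stop the firewall
chkconfig iptables off //do not start it at boot
3. Installing Hadoop 2.2.0
1. Unpack hadoop-2.2.0.tar.gz.
Unpack the hadoop-2.2.0.tar.gz archive downloaded in step 1 into /home/grid.
Note: the installation path must be the same on every machine!
2. Configuring Hadoop
Before configuring, create the following directories on the file system of every node in the cluster: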
~/dfs/name
~/dfs/data
~/tmp
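The three directories can be created in one step on each node. A minimal sketch; the function name make_hadoop_dirs is made up for illustration, and the base directory is assumed to be the grid user's home directory:

```shell
#!/bin/bash
# Create the directories that the configuration files below refer to,
# under the given base directory. mkdir -p also creates the missing
# parent directory dfs and does not fail if a directory already exists.
make_hadoop_dirs() {
    local base
    base=$1
    mkdir -p "$base/dfs/name" "$base/dfs/data" "$base/tmp"
}
```

Call it as make_hadoop_dirs "$HOME" while logged in as grid.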
Seven configuration files are involved:
~/hadoop-2.2.0/etc/hadoop/hadoop-env.sh
~/hadoop-2.2.0/etc/hadoop/yarn-env.sh
~/hadoop-2.2.0/etc/hadoop/slaves
~/hadoop-2.2.0/etc/hadoop/core-site.xml
~/hadoop-2.2.0/etc/hadoop/hdfs-site.xml
~/hadoop-2.2.0/etc/hadoop/mapred-site.xml
~/hadoop-2.2.0/etc/hadoop/yarn-site.xml
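Some of these files do not exist by default; create them by copying the corresponding template file.
Configuration file 1: hadoop-env.sh
Set JAVA_HOME (export JAVA_HOME=/usr/java/jdk1.7.0_40)
Configuration file 2: yarn-env.sh
Set JAVA_HOME (export JAVA_HOME=/usr/java/jdk1.7.0_40)
Configuration file 3: slaves (this file lists all slave nodes)
Write in every slave node, one per line.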
compute-0-1
compute-0-2
Configuration file 4: core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://compute-0-0:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/grid/tmp</value>
</property>
</configuration>
Configuration file 5: hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>compute-0-0:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/grid/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/grid/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
Configuration file 6: mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>compute-0-0:19888</value>
</property>
</configuration>
Configuration file 7: yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>compute-0-0:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>compute-0-0:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>compute-0-0:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>compute-0-0:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>compute-0-0:8088</value>
</property>
</configuration>
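A stray space or a wrong hostname in one of these values is easy to miss, so it can help to print a property before starting the daemons. A minimal sketch, assuming each <name> and <value> sits on its own line as in the files above; the function name get_property is hypothetical, and this is plain text matching rather than a real XML parser:

```shell
#!/bin/bash
# Print the value of one property from a Hadoop *-site.xml file.
# Assumes <name>...</name> and <value>...</value> each fit on a single
# line; comments and multi-line values are not handled.
get_property() {
    local file name
    file=$1
    name=$2
    awk -v want="$name" '
        /<name>/ {
            line = $0
            sub(/.*<name>[ \t]*/, "", line)
            sub(/[ \t]*<\/name>.*/, "", line)
            found = (line == want)
        }
        /<value>/ && found {
            line = $0
            sub(/.*<value>[ \t]*/, "", line)
            sub(/[ \t]*<\/value>.*/, "", line)
            print line
            exit
        }
    ' "$file"
}
```

For example, get_property ~/hadoop-2.2.0/etc/hadoop/core-site.xml fs.defaultFS prints the NameNode URI, and an empty result means the property was not found.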
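3. Copy to the other nodes
A shell script is convenient here (especially with many nodes):
#!/bin/bash
scp -r /home/grid/hadoop-2.2.0 grid@compute-0-1:~/
scp -r /home/grid/hadoop-2.2.0 grid@compute-0-2:~/
Note: in our cluster compute-0-0 is 64-bit while compute-0-1 and compute-0-2 are 32-bit, so the directory cannot be copied directly. Instead, Hadoop is installed separately on each node and only the configuration files are copied over: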
#!/bin/bash
scp /home/grid/hadoop-2.2.0/etc/hadoop/hadoop-env.sh grid@compute-0-1:~/hadoop-2.2.0/etc/hadoop/hadoop-env.sh
scp /home/grid/hadoop-2.2.0/etc/hadoop/hadoop-env.sh grid@compute-0-2:~/hadoop-2.2.0/etc/hadoop/hadoop-env.sh
scp /home/grid/hadoop-2.2.0/etc/hadoop/yarn-env.sh grid@compute-0-1:~/hadoop-2.2.0/etc/hadoop/yarn-env.sh
scp /home/grid/hadoop-2.2.0/etc/hadoop/yarn-env.sh grid@compute-0-2:~/hadoop-2.2.0/etc/hadoop/yarn-env.sh
scp /home/grid/hadoop-2.2.0/etc/hadoop/slaves grid@compute-0-1:~/hadoop-2.2.0/etc/hadoop/slaves
scp /home/grid/hadoop-2.2.0/etc/hadoop/slaves grid@compute-0-2:~/hadoop-2.2.0/etc/hadoop/slaves
scp /home/grid/hadoop-2.2.0/etc/hadoop/core-site.xml grid@compute-0-1:~/hadoop-2.2.0/etc/hadoop/core-site.xml
scp /home/grid/hadoop-2.2.0/etc/hadoop/core-site.xml grid@compute-0-2:~/hadoop-2.2.0/etc/hadoop/core-site.xml
scp /home/grid/hadoop-2.2.0/etc/hadoop/hdfs-site.xml grid@compute-0-1:~/hadoop-2.2.0/etc/hadoop/hdfs-site.xml
scp /home/grid/hadoop-2.2.0/etc/hadoop/hdfs-site.xml grid@compute-0-2:~/hadoop-2.2.0/etc/hadoop/hdfs-site.xml
scp /home/grid/hadoop-2.2.0/etc/hadoop/mapred-site.xml grid@compute-0-1:~/hadoop-2.2.0/etc/hadoop/mapred-site.xml
scp /home/grid/hadoop-2.2.0/etc/hadoop/mapred-site.xml grid@compute-0-2:~/hadoop-2.2.0/etc/hadoop/mapred-site.xml
scp /home/grid/hadoop-2.2.0/etc/hadoop/yarn-site.xml grid@compute-0-1:~/hadoop-2.2.0/etc/hadoop/yarn-site.xml
scp /home/grid/hadoop-2.2.0/etc/hadoop/yarn-site.xml grid@compute-0-2:~/hadoop-2.2.0/etc/hadoop/yarn-site.xml
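The same copy can be written as two nested loops, which scales better as nodes are added. A minimal sketch that only prints the scp commands so that they can be reviewed first; the variable and function names are made up for illustration. Note that scp needs the remote path to follow the colon with no space in between:

```shell
#!/bin/bash
# Print one scp command per (file, node) pair instead of running it.
# Pipe the output to "sh" to actually perform the copy.
CONF_DIR=/home/grid/hadoop-2.2.0/etc/hadoop
FILES="hadoop-env.sh yarn-env.sh slaves core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml"
NODES="compute-0-1 compute-0-2"

print_copy_commands() {
    local f n
    for f in $FILES; do
        for n in $NODES; do
            echo "scp $CONF_DIR/$f grid@${n}:~/hadoop-2.2.0/etc/hadoop/$f"
        done
    done
}

print_copy_commands
```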
4. Starting Hadoop
Change to the installation directory:
cd ~/hadoop-2.2.0/
Format the NameNode:
./bin/hdfs namenode -format
Start HDFS:
./sbin/start-dfs.sh
The processes now running on compute-0-0 are:
NameNode SecondaryNameNode
The processes running on compute-0-1 and compute-0-2 are:
DataNode
Start YARN:
./sbin/start-yarn.sh
The processes now running on compute-0-0 are:
NameNode SecondaryNameNode ResourceManager
The processes running on compute-0-1 and compute-0-2 are:
DataNode NodeManager
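The running Java processes can be listed with the JDK's jps tool, which prints one "<pid> <ClassName>" pair per line. A minimal sketch of a check against the lists above; the function name missing_daemons is hypothetical:

```shell
#!/bin/bash
# Read jps-style output ("<pid> <ClassName>" per line) on stdin and
# print every expected daemon name that does not appear in it.
missing_daemons() {
    local running name
    # Keep only the second column, the class name.
    running=$(awk '{print $2}')
    for name in "$@"; do
        # -x matches whole lines only, so NameNode does not match
        # SecondaryNameNode; -F treats the name as a fixed string.
        printf '%s\n' "$running" | grep -qxF -- "$name" || echo "$name"
    done
}
```

On compute-0-0, jps | missing_daemons NameNode SecondaryNameNode ResourceManager prints nothing when all three daemons are up. On the slave nodes the expected names are DataNode and NodeManager.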
Check the cluster status:
./bin/hdfs dfsadmin -report
Inspect the block composition of the files:
./bin/hdfs fsck / -files -blocks
HDFS web UI:
http://localhost:50070
ResourceManager web UI:
http://localhost:8088
5. Verifying the installation
Run an example program. First create a directory on HDFS:
./bin/hdfs dfs -mkdir /input
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar randomwriter input
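What to do when a DataNode fails to start:
Check the log on the DataNode. In most cases the DataNode fails to start because the NameNode has been formatted more than once. The log gives the detailed reason, and the fix is simple. There are two main problems:
1. Mismatched clusterID: the clusterID of the NameNode differs from that of the DataNode. Formatting the NameNode does not format the DataNodes, so each DataNode keeps the clusterID the NameNode had before the format. To fix this, delete the directory that dfs.datanode.data.dir points to and the tmp directory on the DataNode, then run start-dfs.sh again.
2. Even after iptables has been turned off, the error "Datanode denied communication with namenode: DatanodeRegistration" can still appear. According to the discussion at http://stackoverflow.com/questions/17082789/cdh4-3exception-from-the-logs-after-start-dfs-sh-datanode-and-namenode-star , writing the IP address of every host in the cluster into /etc/hosts solves the problem.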
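For the first problem, the clusterID can be read directly from the VERSION file that HDFS keeps in the current subdirectory of each storage directory. A minimal sketch; the function name cluster_id is made up for illustration:

```shell
#!/bin/bash
# Print the clusterID recorded in <storage dir>/current/VERSION.
# The file holds key=value lines, one of which is clusterID=...
cluster_id() {
    sed -n 's/^clusterID=//p' "$1/current/VERSION"
}
```

Run cluster_id ~/dfs/name on the NameNode and cluster_id ~/dfs/data on each DataNode. If the two values differ, the DataNode still holds data from before the last format.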