Building Hadoop-2.2.0 for x64
1. Download the Hadoop source archive hadoop-2.2.0-src.tar.gz and extract it
2. Install the required tools: yum install maven ncurses-devel openssl-devel cmake
3. Build:
3.1 cd hadoop-2.2.0-src
3.2 mvn package -Pdist,native -DskipTests -Dtar
The resulting distribution package is hadoop-dist/target/hadoop-2.2.0.tar.gz
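Before starting the fairly long Maven build, it can save time to confirm that the build tools are actually on the PATH. The sketch below is not part of the original steps: missing_tools is a hypothetical helper, and the tool list is an assumption (mvn and cmake come from the yum line above, java and gcc are guesses at what the native profile also needs).

```shell
#!/usr/bin/env bash
# Print every tool from the argument list that is not on PATH, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
    return 0
}

# Assumed tool list: mvn and cmake are installed by the yum command above,
# java and gcc are assumptions about what the native build needs.
missing=$(missing_tools mvn cmake java gcc)
if [ -n "$missing" ]; then
    echo "missing build tools:" $missing
else
    echo "all build tools found"
fi
```

If anything is reported missing, install it first and rerun the check before calling mvn package.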
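hadoop-2.2.0 cluster configuration
1. Extract the distribution package hadoop-2.2.0.tar.gz built above. All of the following steps are performed on the master machine, Fedora01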
tar -zxvf hadoop-2.2.0.tar.gz -C /usr/local
cd /usr/local
ln -snf hadoop-2.2.0 hadoop2
cd hadoop2/etc/hadoop
2. Configure the following files; for reference see http://blog.csdn.net/licongcong_0224/article/details/12972889
2.1 slaves
Fedora01
Fedora02
Fedora03
Fedora04
2.2 core-site.xml
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Fedora01:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop2/temp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.groups</name>
    <value>*</value>
  </property>
</configuration>
2.3 hdfs-site.xml
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>Fedora01:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/hadoop2/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/hadoop2/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
2.4 mapred-site.xml
cp mapred-site.xml.template mapred-site.xml
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>Fedora01:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>Fedora01:19888</value>
  </property>
</configuration>
2.5 yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>Fedora01:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>Fedora01:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>Fedora01:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>Fedora01:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>Fedora01:8080</value>
  </property>
</configuration>
3. Set the environment variables and sync everything to the other machines
3.1 Add the following two lines to ~/.bashrc, then run source ~/.bashrc
export HADOOP_HOME="/usr/local/hadoop2"
export PATH="$HADOOP_HOME/bin:$PATH"
3.2 Sync
cd /usr/local
~/sync-cluster.sh hadoop2 hadoop-2.2.0 ~/.bashrc
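The contents of ~/sync-cluster.sh are not shown in this post. As a rough idea of what such a helper could do, here is a hypothetical stand-in that reads host names from a slaves-style file and prints one rsync command per remote host and path. The function name sync_commands, the use of rsync, and the requirement that paths be absolute are all assumptions; it only prints the commands, so nothing is copied until the output is reviewed and piped to a shell.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for ~/sync-cluster.sh (the original script is not shown).
# $1: file with one host name per line, $2: name of the local host to skip,
# remaining arguments: absolute paths to copy to the same location remotely.
sync_commands() {
    local slaves_file=$1 self=$2
    shift 2
    local host path
    # "|| [ -n ... ]" keeps a last line that has no trailing newline.
    while IFS= read -r host || [ -n "$host" ]; do
        # Skip blank lines and the local machine.
        [ -z "$host" ] && continue
        [ "$host" = "$self" ] && continue
        for path in "$@"; do
            printf 'rsync -a %s %s:%s\n' "$path" "$host" "$(dirname "$path")/"
        done
    done < "$slaves_file"
    return 0
}

# Dry run against a scratch host list instead of the real slaves file.
demo_slaves=$(mktemp)
printf 'Fedora01\nFedora02\n' > "$demo_slaves"
sync_commands "$demo_slaves" Fedora01 /usr/local/hadoop2
rm -f "$demo_slaves"
```

On the cluster described here, a call such as sync_commands /usr/local/hadoop2/etc/hadoop/slaves Fedora01 /usr/local/hadoop-2.2.0 /usr/local/hadoop2 ~/.bashrc would list the copies to Fedora02 through Fedora04. Because rsync -a preserves symbolic links, hadoop2 would arrive as a link rather than as a second copy of the tree.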
4. Start the cluster
cd /usr/local/hadoop2
./bin/hdfs namenode -format
./sbin/start-dfs.sh
jps # Fedora01 should show 4 processes: DataNode NameNode Jps SecondaryNameNode; the other machines should show 2 processes: DataNode Jps
./sbin/start-yarn.sh
jps # Fedora01 should show 6 processes: DataNode NameNode Jps SecondaryNameNode ResourceManager NodeManager; the other machines should show 3 processes: DataNode Jps NodeManager
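Comparing the jps output with the expected process list by eye gets tedious across four machines. The sketch below shows one way to automate it; missing_daemons is a hypothetical helper, and it assumes jps prints one "pid ClassName" pair per line. The sample text stands in for real jps output so the snippet runs anywhere.

```shell
#!/usr/bin/env bash
# List the expected daemons that do not appear in jps-style output on stdin.
# jps prints "<pid> <ClassName>" per line, so only the second field is compared.
missing_daemons() {
    local seen daemon
    seen=$(awk '{print $2}')
    for daemon in "$@"; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$seen" | grep -qx "$daemon" || printf '%s\n' "$daemon"
    done
    return 0
}

# Sample output as it might look on Fedora01 before start-yarn.sh has run.
sample='2101 NameNode
2230 DataNode
2411 SecondaryNameNode
2790 Jps'
printf '%s\n' "$sample" |
    missing_daemons NameNode DataNode SecondaryNameNode ResourceManager NodeManager
```

On a real node the input would come from jps itself, for example jps | missing_daemons DataNode NodeManager on a slave; empty output means every expected daemon is running.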
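5. Configure SELinux and firewalld
5.1 setenforce 0
5.2 Edit /etc/selinux/config and set SELINUX=disabled (this takes effect after a reboot)
5.3 systemctl stop firewalld.service; systemctl disable firewalld.service The trap here is that Fedora 20 no longer uses iptables (it is disabled by default) and has switched to firewalld.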
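Since this change has to be repeated on every node, the edit to the SELinux config file can be scripted. The following is a minimal sketch, assuming the file contains a line starting with SELINUX= as on a stock Fedora install; set_selinux_mode is a hypothetical helper, and the demo works on a scratch file rather than on the real config.

```shell
#!/usr/bin/env bash
# Rewrite the SELINUX= line of an SELinux config file.
# $1: path to the file, $2: mode (enforcing, permissive or disabled).
set_selinux_mode() {
    local file=$1 mode=$2 tmp
    tmp=$(mktemp)
    # "^SELINUX=" does not match the SELINUXTYPE= line.
    sed "s/^SELINUX=.*/SELINUX=${mode}/" "$file" > "$tmp"
    # Write back with cat so the original file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Try it on a scratch copy rather than on the real config file.
scratch=$(mktemp)
printf '# comment\nSELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$scratch"
set_selinux_mode "$scratch" disabled
grep '^SELINUX=' "$scratch"
rm -f "$scratch"
```

Run as root against the real file, the call would be set_selinux_mode /etc/selinux/config disabled. Keeping a backup copy first is a sensible precaution.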
6. Test; no errors
[root@Fedora01 ~]# cd
-rw-r--r-- 3 root supergroup 763 2014-02-07 21:42 /data/test/sync-cluster.sh
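Monitoring pages: for convenience, the /etc/hosts file on the host machine Fedora maps Fedora01 to its IP address. If you have not set this up, replace Fedora01 with the corresponding IP.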
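The links to the monitoring pages are not preserved in this text. As a hedged reconstruction from the configuration above, the snippet below prints the addresses one would expect. Port 50070 is the Hadoop 2.x default for the NameNode web UI and is an assumption here, since hdfs-site.xml does not set it; 8080 and 19888 are taken from yarn-site.xml and mapred-site.xml. monitor_urls is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the expected web UI addresses for a given master host name or IP.
monitor_urls() {
    local host=$1
    # NameNode: Hadoop 2.x default HTTP port (assumed, not set in hdfs-site.xml).
    printf 'http://%s:50070/\n' "$host"
    # ResourceManager: yarn.resourcemanager.webapp.address from yarn-site.xml.
    printf 'http://%s:8080/\n' "$host"
    # JobHistory: mapreduce.jobhistory.webapp.address from mapred-site.xml;
    # only reachable if the JobHistory server has been started.
    printf 'http://%s:19888/\n' "$host"
}

monitor_urls Fedora01
```

Passing an IP address instead of Fedora01 gives the form to use when /etc/hosts has no entry for the master.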