spark-0.8.1 installation: hadoop-2.2.0 x64

Building Hadoop-2.2.0 x64

1. Download and unpack the Hadoop source archive hadoop-2.2.0-src.tar.gz

2. Install the required tools: yum install maven ncurses-devel openssl-devel cmake

3. Build:

  3.1 cd hadoop-2.2.0-src

  3.2 mvn package -Pdist,native -DskipTests -Dtar

The resulting distribution package is hadoop-dist/target/hadoop-2.2.0.tar.gz
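Before starting the Maven build above, it can save time to confirm that the build tools are actually on the PATH. The following is only a minimal sketch: the function name missing_tools is made up for this example, and the list of tools to check is up to you (mvn and cmake come from the yum line above; anything else you pass in is your own assumption about what the build needs).

```shell
# Print the name of every listed command that cannot be found on PATH.
# Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example call (left commented out so that sourcing this file has no side effects):
# missing_tools mvn cmake
```

If the call prints nothing, the listed commands were all found; any name it prints still needs to be installed before running mvn package.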

hadoop-2.2.0 cluster configuration

1. Unpack the distribution package hadoop-2.2.0.tar.gz built above. All of the following steps are run on the master machine, Fedora01.

  tar -zxvf hadoop-2.2.0.tar.gz -C /usr/local

  cd /usr/local

  ln -snf hadoop-2.2.0 hadoop2

  cd hadoop2/etc/hadoop

2. Configure the following files (see http://blog.csdn.net/licongcong_0224/article/details/12972889 for reference)

  2.1 slaves
Fedora01
Fedora02
Fedora03
Fedora04
  2.2 core-site.xml
<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://Fedora01:9000</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/hadoop2/temp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>hadoop.proxyuser.hduser.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hduser.groups</name>
        <value>*</value>
    </property>
</configuration>
  2.3 hdfs-site.xml
<!-- Put site-specific property overrides in this file. -->

<configuration>
	<property>
		<name>dfs.namenode.secondary.http-address</name>
		<value>Fedora01:9001</value>
	</property>
	<property>
		<name>dfs.namenode.name.dir</name>
		<value>/data/hadoop2/dfs/name</value>
	</property>
	<property>
		<name>dfs.datanode.data.dir</name>
		<value>/data/hadoop2/dfs/data</value>
	</property>
	<property>
		<name>dfs.replication</name>
		<value>3</value>
	</property>
	<property>
		<name>dfs.webhdfs.enabled</name>
		<value>true</value>
	</property>
</configuration>
  2.4 mapred-site.xml

 cp mapred-site.xml.template mapred-site.xml

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>
	<property>
		<name>mapreduce.jobhistory.address</name>
		<value>Fedora01:10020</value>
	</property>
	<property>
		<name>mapreduce.jobhistory.webapp.address</name>
		<value>Fedora01:19888</value>
	</property>
</configuration>
  2.5 yarn-site.xml
<configuration>
	<!-- Site specific YARN configuration properties -->
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
	<property>
		<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
		<value>org.apache.hadoop.mapred.ShuffleHandler</value>
	</property>
	<property>
		<name>yarn.resourcemanager.address</name>
		<value>Fedora01:8032</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.address</name>
		<value>Fedora01:8030</value>
	</property>
	<property>
		<name>yarn.resourcemanager.resource-tracker.address</name>
		<value>Fedora01:8031</value>
	</property>
	<property>
		<name>yarn.resourcemanager.admin.address</name>
		<value>Fedora01:8033</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address</name>
		<value>Fedora01:8080</value>
	</property> 
</configuration>
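The configuration above points Hadoop at three local directories under /data/hadoop2 (temp, dfs/name and dfs/data). Hadoop can usually create these itself when the user has write permission on the parent directory, so the following is an optional sketch rather than a required step. The function name make_hadoop_dirs is hypothetical; the sub-paths are the ones from core-site.xml and hdfs-site.xml.

```shell
# Create the local directories named in core-site.xml and hdfs-site.xml.
# $1 is the base directory (/data/hadoop2 in the configuration above).
make_hadoop_dirs() {
    base=$1
    mkdir -p "$base/temp" "$base/dfs/name" "$base/dfs/data"
}

# Example call (commented out; run it on every node that stores data):
# make_hadoop_dirs /data/hadoop2
```

Creating the directories up front mainly helps surface permission problems before the first start rather than in the daemon logs.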
3. Configure environment variables and sync them to the other machines

  3.1 Add the following two lines to ~/.bashrc, then run source ~/.bashrc

export HADOOP_HOME="/usr/local/hadoop2"
export PATH="$HADOOP_HOME/bin:$PATH"
  3.2 Sync

cd /usr/local

~/sync-cluster.sh hadoop2 hadoop-2.2.0 ~/.bashrc
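The contents of sync-cluster.sh are not shown in this post, so the following is not that script. It is a hypothetical sketch of one way such a sync could work, assuming passwordless SSH between the nodes, rsync installed everywhere, and absolute paths as arguments. It only prints the rsync commands (a dry run), so you can inspect them before running anything.

```shell
# Print one rsync command per remote host and path; nothing is copied.
# $1: file listing one host per line (for example etc/hadoop/slaves)
# $2: name of the local host, which is skipped
# remaining arguments: absolute paths to copy to the same location on each host
print_sync_commands() {
    hosts_file=$1
    self=$2
    shift 2
    while read -r host; do
        [ -z "$host" ] && continue
        [ "$host" = "$self" ] && continue
        for path in "$@"; do
            echo "rsync -a $path $host:$(dirname "$path")/"
        done
    done < "$hosts_file"
}

# Example call (commented out):
# print_sync_commands /usr/local/hadoop2/etc/hadoop/slaves Fedora01 \
#     /usr/local/hadoop-2.2.0 /usr/local/hadoop2 /root/.bashrc
```

Since rsync -a preserves symlinks, the hadoop2 link would be copied as a link, which is why the real directory hadoop-2.2.0 is listed alongside it in the example call.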

4. Start the cluster

cd /usr/local/hadoop2

./bin/hdfs namenode -format

./sbin/start-dfs.sh

jps  # Fedora01 should show 4 processes: DataNode, NameNode, Jps, SecondaryNameNode. The other machines should show 2: DataNode, Jps

./sbin/start-yarn.sh

jps  # Fedora01 should show 6 processes: DataNode, NameNode, Jps, SecondaryNameNode, ResourceManager, NodeManager. The other machines should show 3: DataNode, Jps, NodeManager
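Reading the jps listing by eye on every node gets tedious, so the check above can be sketched as a small filter. This is just one possible approach: missing_daemons is a made-up helper name, and it assumes the usual jps output of one "pid name" pair per line.

```shell
# Report each expected daemon that does not appear in jps-style output on stdin.
# Usage: jps | missing_daemons NameNode DataNode ...
missing_daemons() {
    listing=$(cat)
    for daemon in "$@"; do
        # Compare against the second field exactly, so that NameNode
        # is not satisfied by a line that says SecondaryNameNode.
        echo "$listing" | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' \
            || echo "missing: $daemon"
    done
}

# Example calls (commented out):
# jps | missing_daemons NameNode SecondaryNameNode DataNode ResourceManager NodeManager
# ssh Fedora02 jps | missing_daemons DataNode NodeManager
```

No output means every expected daemon was found on that node.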

5. Configure SELinux and firewalld

  5.1 setenforce 0

  5.2 Edit /etc/selinux/config and set SELINUX=disabled

  5.3 systemctl stop firewalld.service; systemctl disable firewalld.service
  A real trap here: Fedora 20 no longer uses iptables (it is disabled by default) and uses firewalld instead.
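Since the SELinux change has to be repeated on every node, the manual edit of /etc/selinux/config can be scripted. The following is a small sketch under the assumption that the file contains a line starting with SELINUX=, as the stock file does; disable_selinux_in is a hypothetical helper name. It takes the file path as an argument so you can try it on a copy first.

```shell
# Rewrite the SELINUX= line of the given config file to "disabled".
# Other lines, including SELINUXTYPE=, are left untouched.
disable_selinux_in() {
    tmp=$(mktemp) || return 1
    # Write through a temp file and copy back, keeping the original file's permissions.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Example call (commented out; needs root):
# disable_selinux_in /etc/selinux/config
```

Afterwards, getenforce should report Permissive right after setenforce 0 and Disabled after the next reboot, and systemctl is-active firewalld can be used to confirm that the firewall is stopped.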

6. Test; no errors

[root@Fedora01 ~]# cd

[root@Fedora01 ~]# hadoop fs -mkdir -p /data/test
[root@Fedora01 ~]# hadoop fs -put sync-cluster.sh /data/test
[root@Fedora01 ~]# hadoop fs -ls /data/test
Found 1 items
-rw-r--r--   3 root supergroup        763 2014-02-07 21:42 /data/test/sync-cluster.sh

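Monitoring page: for convenience, the /etc/hosts file on the host machine Fedora maps Fedora01 to its IP. If you have not set that up, replace Fedora01 with the corresponding IP.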
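The post does not say which address the monitoring page lives at, so here is a sketch that prints the likely candidates for a given master host. The ResourceManager port 8080 and the JobHistory port 19888 come from the yarn-site.xml and mapred-site.xml above. The NameNode port 50070 is an assumption: it is the stock Hadoop 2.x default, since hdfs-site.xml above does not set it. print_ui_urls is a made-up name.

```shell
# Print the web UI addresses for a given master host name or IP.
# 50070 is assumed (Hadoop 2.x default); 8080 and 19888 are the values
# configured in yarn-site.xml and mapred-site.xml above.
print_ui_urls() {
    host=$1
    echo "NameNode:        http://$host:50070/"
    echo "ResourceManager: http://$host:8080/"
    echo "JobHistory:      http://$host:19888/"
}

# Example calls (commented out):
# print_ui_urls Fedora01
# print_ui_urls 192.168.1.101   # hypothetical IP, for when /etc/hosts has no Fedora01 entry
```

Note that the JobHistory page only responds if the history server is running, and the steps above do not start it.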
