Spark on YARN Installation

1. Install hadoop-2.2.0

Download the Hadoop 2.2.0 release from: http://apache.dataguru.cn/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz

Run tar zxf hadoop-2.2.0.tar.gz to extract the archive into the current directory (/home/hduser), then move it under ~/app:

mv hadoop-2.2.0 ~/app/


2. Configure Hadoop

(1) JAVA_HOME

echo $JAVA_HOME

/usr/lib/jvm/java-7-oracle
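
If echo $JAVA_HOME prints nothing, you can locate the installed JDK instead (the java-7-oracle path above is this author's setup; yours may differ):

readlink -f $(which java)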

Edit ~/app/hadoop-2.2.0/etc/hadoop/hadoop-env.sh

Replace the line export JAVA_HOME=${JAVA_HOME} with:

export JAVA_HOME=/usr/lib/jvm/java-7-oracle
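
To make the same edit non-interactively, a sed one-liner works (assuming the stock export JAVA_HOME=${JAVA_HOME} line is present):

sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/lib/jvm/java-7-oracle|' ~/app/hadoop-2.2.0/etc/hadoop/hadoop-env.sh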


(2) Configure core-site.xml

Edit ~/app/hadoop-2.2.0/etc/hadoop/core-site.xml

Add the following inside the <configuration> element:

<property>
  <name>hadoop.tmp.dir</name>
 <value>/home/hduser/app/hadoop-2.2.0/tmp</value>
 <description>A base for other temporary directories.</description>
</property>
<property>
 <name>fs.default.name</name>
 <value>hdfs://localhost:8010</value>
 <description>The name of the default file system.  A URI whose
 scheme and authority determine the FileSystem implementation.  The
 uri's scheme determines the config property (fs.SCHEME.impl) naming
 the FileSystem implementation class. The uri's authority is used to
 determine the host, port, etc. for a filesystem.</description>
</property>
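
Note that Hadoop does not expand ~ in configuration values, so the hadoop.tmp.dir value above uses an absolute path. Create the directory before formatting HDFS:

mkdir -p /home/hduser/app/hadoop-2.2.0/tmp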


(3) Configure mapred-site.xml

mapred-site.xml does not exist by default; create it from the template first:

~/app/hadoop-2.2.0/etc/hadoop# cp mapred-site.xml.template mapred-site.xml

Then edit it and add the following inside the <configuration> element:

<property>
 <name>mapred.job.tracker</name>
 <value>localhost:54311</value>
 <description>The host and port that the MapReduce job tracker runs
 at.  If "local", then jobs are run in-process as a single map
 and reduce task.
  </description>
</property>
<property>
 <name>mapred.map.tasks</name>
 <value>10</value>
 <description>As a rule of thumb, use 10x the number of slaves (i.e., number of tasktrackers).
  </description>
</property>
<property>
 <name>mapred.reduce.tasks</name>
 <value>2</value>
 <description>As a rule of thumb, use 2x the number of slave processors (i.e., number of tasktrackers).
  </description>
</property>
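
Note that mapred.job.tracker is an MRv1 property. If you intend to run MapReduce jobs (not just Spark) on YARN, Hadoop 2.x also needs the framework name set; a commonly added property:

<property>
 <name>mapreduce.framework.name</name>
 <value>yarn</value>
</property>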


(4) Configure hdfs-site.xml

Edit ~/app/hadoop-2.2.0/etc/hadoop/hdfs-site.xml

Add the following inside the <configuration> element:

<property>
 <name>dfs.replication</name>
 <value>1</value>
 <description>Default block replication.
 The actual number of replications can be specified when the file is created.
 The default is used if replication is not specified in create time.
  </description>
</property>
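
As a quick sanity check that the edited files are still well-formed XML (assuming xmllint from libxml2 is installed):

cd ~/app/hadoop-2.2.0/etc/hadoop
for f in core-site.xml mapred-site.xml hdfs-site.xml; do xmllint --noout $f && echo "$f OK"; done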


3. Run Hadoop

(1) Initialize (format the NameNode)

~/app/hadoop-2.2.0/bin# ./hdfs namenode -format

If it succeeds, the last few lines of the log will contain a message like this (the exact path depends on your hadoop.tmp.dir setting):

common.Storage: Storage directory /home/hduser/hadoop/tmp/hadoop-hduser/dfs/name has been successfully formatted.


(2) Set up passwordless SSH

ssh-keygen -t rsa

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

ssh localhost
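
If ssh localhost still prompts for a password, file permissions are the usual culprit:

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys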


(3) Configure /etc/profile

Append the following:
export HADOOP_HOME=/root/app/hadoop-2.2.0
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
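
Reload the profile so the variables take effect in the current shell, then verify:

source /etc/profile
echo $HADOOP_HOME
hadoop version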


(4) Start DFS

~/app/hadoop-2.2.0/sbin# ./start-dfs.sh

Run jps to check whether the daemons started:

15021 NameNode

15767 SecondaryNameNode

15123 DataNode
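
Besides jps, you can confirm HDFS is up and has a live DataNode with:

~/app/hadoop-2.2.0/bin# ./hdfs dfsadmin -report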


(5) Start YARN

~/app/hadoop-2.2.0/sbin# ./start-yarn.sh

jps

15021 NameNode

16052 NodeManager

15767 SecondaryNameNode

15123 DataNode

15952 ResourceManager
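
You can also confirm that the NodeManager has registered with the ResourceManager:

~/app/hadoop-2.2.0/bin# ./yarn node -list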


(6) View the web UIs

ResourceManager: http://192.168.2.215:8088/cluster

NameNode: http://192.168.2.215:50070


(7) Test

wget http://www.gutenberg.org/cache/epub/20417/pg20417.txt

bin/hdfs dfs -mkdir /test

~/app/hadoop-2.2.0/bin# ./hdfs dfs -copyFromLocal ~/app/hadoop-2.2.0/pg20417.txt /test

Browse the uploaded file at http://192.168.2.215:50075 (the DataNode web UI).
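
As a further end-to-end check, you can run the bundled wordcount example against the uploaded file (the jar path is as shipped in the 2.2.0 tarball; with the configuration above the job runs locally unless mapreduce.framework.name is set to yarn):

~/app/hadoop-2.2.0# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /test/pg20417.txt /test/output
~/app/hadoop-2.2.0# bin/hdfs dfs -cat /test/output/part-r-00000 | head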


4. Stop Hadoop

To stop Hadoop, run the following commands in order:

$./stop-yarn.sh

$./stop-dfs.sh


5. Run Spark on YARN

(1) Launch Spark in yarn-cluster mode

~/app/spark-1.0.0-bin-hadoop2/bin# ./spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3 --driver-memory 1g --executor-memory 1g --executor-cores 1 ~/app/spark-1.0.0-bin-hadoop2/lib/spark-examples*.jar 10
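
In yarn-cluster mode the driver's output (including the "Pi is roughly ..." line) goes to the application logs rather than your terminal. You can list applications and, if log aggregation is enabled, fetch the logs:

~/app/hadoop-2.2.0/bin# ./yarn application -list
~/app/hadoop-2.2.0/bin# ./yarn logs -applicationId <appId>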

(2) yarn-client mode

 ./bin/spark-shell --master yarn-client
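
Once the shell comes up, a minimal sanity check that the YARN executors are working:

scala> sc.parallelize(1 to 1000).sum()

This should return 500500.0.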

(3) View the results

http://192.168.2.215:8042/node/application/application_1405736830832_0001



References

http://blog.csdn.net/gobitan/article/details/13020211

http://spark.apache.org/docs/1.0.0/running-on-yarn.html
