Spark learning: running a Spark on YARN example and viewing the logs

To view job logs through the web UI, two services need to be running:
Hadoop's JobHistoryServer and Spark's History Server.

Relevant configuration files:

etc/hadoop/mapred-site.xml

   
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>spark-master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>spark-master:19888</value>
    </property>

yarn-site.xml


     <property>
         <name>yarn.log-aggregation-enable</name>
         <value>true</value>
     </property>
     <property>
         <name>yarn.log.server.url</name>
         <value>http://spark-master:19888/jobhistory/logs/</value>
     </property>
     <property>
         <name>yarn.log-aggregation.retain-seconds</name>
         <value>86400</value>
     </property>
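
After changing yarn-site.xml, YARN has to be restarted for log aggregation to take effect. A minimal sketch, assuming the standard Hadoop sbin scripts under this install path:

[root@spark-master hadoop-2.7.6]# sbin/stop-yarn.sh
[root@spark-master hadoop-2.7.6]# sbin/start-yarn.sh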

spark-defaults.conf

spark.eventLog.enabled=true
spark.eventLog.compress=true
# store the event logs locally (note the file:/// prefix for a local path)
#spark.eventLog.dir=file:///usr/local/hadoop-2.7.6/logs/userlogs
#spark.history.fs.logDirectory=file:///usr/local/hadoop-2.7.6/logs/userlogs

# store the event logs on HDFS
spark.eventLog.dir=hdfs://spark-master:9000/tmp/logs/root/logs
spark.history.fs.logDirectory=hdfs://spark-master:9000/tmp/logs/root/logs
spark.yarn.historyServer.address=spark-master:18080
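
The HDFS directory used for the event logs has to exist before jobs and the history server can write to it. A minimal sketch, creating the path configured above:

[root@spark-master hadoop-2.7.6]# bin/hdfs dfs -mkdir -p /tmp/logs/root/logs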

Startup

1. First, start Hadoop's JobHistoryServer:

[root@spark-master hadoop-2.7.6]# sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /usr/local/hadoop-2.7.6/logs/mapred-root-historyserver-spark-master.out

2. Start Spark's History Server:

[root@spark-master spark-2.3.0]# sbin/start-history-server.sh 
starting org.apache.spark.deploy.history.HistoryServer, logging to /usr/local/spark-2.3.0/logs/spark-root-org.apache.spark.deploy.history.HistoryServer-1-spark-master.out
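
A quick sanity check is jps, which should list both daemons (Hadoop's JobHistoryServer and Spark's HistoryServer) if they came up correctly:

[root@spark-master hadoop-2.7.6]# jps | grep -i historyserver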

If everything is configured correctly, once both services are up you can access port 18080 (Spark History Server) and port 19888 (MapReduce JobHistory Server).

Screenshots of the two web UIs (images not reproduced here).

Running a test example

Spark jobs can be run in several modes: local, standalone, and YARN.

The spark-submit commands differ slightly between the three modes.

local

./bin/spark-submit --class org.apache.spark.examples.SparkPi --master local[4] --driver-memory 4g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples_2.11-2.3.0.jar 1

standalone

./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://spark-master:6066 --deploy-mode cluster --driver-memory 4g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples_2.11-2.3.0.jar 1

yarn

bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 1g --executor-memory 1g examples/jars/spark-examples_2.11-2.3.0.jar 1

What we are focusing on here is the workflow for viewing logs when running Spark on YARN.
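
With log aggregation enabled, the aggregated container logs can also be fetched from the command line instead of the web UI. The application ID below is a placeholder; use the one printed by spark-submit or shown in the YARN ResourceManager UI:

[root@spark-master hadoop-2.7.6]# bin/yarn logs -applicationId <applicationId>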

Reference: https://www.cnblogs.com/sorco/p/7070922.html
