Configuring the Spark History Server

While spark-shell is still running, we can see the logs of the executing job in the web UI. Once the job finishes, however, the web UI can no longer show it; to view completed jobs, a history server must be configured separately.

1. Edit the spark-defaults.conf file to enable event logging

# Create the event log directory on HDFS
hadoop fs -mkdir hdfs://hadoop1:9000/spark-job-log
[hadoop@hadoop1 apps]$ cd spark-2.2.0/conf/
[hadoop@hadoop1 conf]$ cp spark-defaults.conf.template spark-defaults.conf
[hadoop@hadoop1 conf]$ vim spark-defaults.conf

spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://hadoop1:9000/spark-job-log

Note: hdfs://hadoop1:9000/spark-job-log must be created in advance (as done with the mkdir above).
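An optional addition to the same file: event logs for long-running jobs can get large, and Spark's standard spark.eventLog.compress property compresses them before writing to HDFS. A sketch of the full spark-defaults.conf block with this setting added (the compress line is an optional assumption, not part of the original setup):

```
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://hadoop1:9000/spark-job-log
spark.eventLog.compress          true
```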

2. Edit the spark-env.sh file

[hadoop@hadoop1 conf]$ vim spark-env.sh 

# history server
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=30 -Dspark.history.fs.logDirectory=hdfs://hadoop1:9000/spark-job-log"
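Of the three options above, spark.history.ui.port sets the web UI port, spark.history.fs.logDirectory must match spark.eventLog.dir from step 1, and spark.history.retainedApplications limits how many applications the server keeps in memory (not how many log files are kept on HDFS). If old event logs should be cleaned up automatically, Spark's standard history-server cleaner properties can be appended in the same file; the interval and max-age values below are illustrative assumptions:

```shell
# Optional: periodically delete old event logs from the log directory
export SPARK_HISTORY_OPTS="$SPARK_HISTORY_OPTS \
  -Dspark.history.fs.cleaner.enabled=true \
  -Dspark.history.fs.cleaner.interval=1d \
  -Dspark.history.fs.cleaner.maxAge=7d"
```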

3. Distribute the configuration to the other nodes

[hadoop@hadoop1 apps]$ sh xscp.sh /home/hadoop/apps/spark-2.2.0/conf/spark-env.sh 
[hadoop@hadoop1 apps]$ sh xscp.sh /home/hadoop/apps/spark-2.2.0/conf/spark-defaults.conf

4. Start the history server

[hadoop@hadoop1 apps]$ ./spark-2.2.0/sbin/start-history-server.sh 
starting org.apache.spark.deploy.history.HistoryServer, logging to /home/hadoop/apps/spark-2.2.0/logs/spark-hadoop-org.apache.spark.deploy.history.HistoryServer-1-hadoop1.out

[hadoop@hadoop1 apps]$ sh xcall.sh jps
============= hadoop@192.168.131.137 jps =============
2017 NodeManager
6546 HRegionServer
1685 DataNode
7431 HistoryServer      # the history server process
1912 ResourceManager
7483 Jps
4907 SparkSubmit
6414 HMaster
2399 QuorumPeerMain
1583 NameNode

Web UI address: http://192.168.131.137:18080/

Note: HDFS must be started first, since both the event log writer and the history server depend on it.
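Putting the startup order together, a sketch of a typical session (the Hadoop sbin path is an assumption; substitute your own install location -- the Spark path matches the layout used above):

```shell
# 1. Start HDFS first -- the history server reads event logs from it
# (path to the Hadoop install is assumed; adjust for your cluster)
~/apps/hadoop/sbin/start-dfs.sh

# 2. Then start the Spark history server
/home/hadoop/apps/spark-2.2.0/sbin/start-history-server.sh

# To stop the history server later:
/home/hadoop/apps/spark-2.2.0/sbin/stop-history-server.sh
```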
