This content is compiled from DT大数据梦工厂 (DT Big Data Dream Factory).
Run the following from the bin directory:
./spark-submit --class cn.tan.spark.dt.WordCount --master spark://node11:7077 /home/word.jar
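Spark job history configuration. With the default settings the history server fails on startup because the log directory does not exist: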
Caused by: java.lang.IllegalArgumentException: Log directory specified does not exist: file:/tmp/spark-events. Did you configure the correct one through spark.history.fs.logDirectory?
at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$startPolling(FsHistoryProvider.scala:168)
at org.apache.spark.deploy.history.FsHistoryProvider.initialize(FsHistoryProvider.scala:120)
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:116)
at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:49)
Solution:
Spark HistoryServer configuration:
The defaults are set in spark-defaults.conf:
spark.eventLog.enabled true
spark.eventLog.dir hdfs://node11:9000/historyserverforSpark
spark.history.ui.port 18080
spark.history.fs.logDirectory hdfs://node11:9000/historyserverforSpark
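The error above means the event log directory has to exist before the HistoryServer starts. The following is a minimal sketch of that step, not a definitive procedure. It assumes SPARK_HOME points at the Spark install, the hdfs client is on the PATH, and spark-defaults.conf uses whitespace-separated key/value pairs as shown. The function names conf_value and prepare_history_server are hypothetical helpers introduced for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: conf_value and prepare_history_server are hypothetical helper names.

# Print the value of a key from a whitespace-separated properties file.
conf_value() {
  awk -v key="$1" '$1 == key { print $2 }' "$2"
}

# Create the configured log directory in HDFS, then start the HistoryServer.
prepare_history_server() {
  # Assumption: the configuration lives under $SPARK_HOME/conf.
  local conf="${SPARK_HOME:?SPARK_HOME must be set}/conf/spark-defaults.conf"
  local log_dir
  log_dir="$(conf_value spark.history.fs.logDirectory "$conf")"

  # Create the directory that the IllegalArgumentException reported as missing.
  hdfs dfs -mkdir -p "$log_dir"

  # Launch the HistoryServer with the script shipped in Spark's sbin directory.
  "${SPARK_HOME}/sbin/start-history-server.sh"
}
```

Running prepare_history_server on the node that hosts the HistoryServer should create the directory and launch the process.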
jps
The process list shows: 6269 HistoryServer
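The jps check can also be scripted. This is a small sketch, assuming the default jps output format of a process id followed by the class name. The name has_history_server is a hypothetical helper, not part of Spark.

```shell
#!/usr/bin/env bash
# Sketch only: has_history_server is a hypothetical helper name.

# Read jps output on stdin and succeed if a HistoryServer process is listed.
has_history_server() {
  grep -qE '^[0-9]+ HistoryServer$'
}
```

It could be used as: jps | has_history_server && echo "HistoryServer is running". Once the process is up, the web UI should be reachable on port 18080, the value set by spark.history.ui.port.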
Write the ad-click ranking program in Eclipse and test it.
Further reading in the SparkContext source:
SparkEnv, the scheduler (DAGScheduler, TaskScheduler), SchedulerBackend