Trying to run a PySpark program on Oozie:
First, configure yarn-env.sh to fix the "pyspark module not found" class of errors:
export SPARK_HOME=/usr/share/spark
$ hdfs dfs -copyFromLocal py4j.zip /user/oozie/share/lib/spark
$ hdfs dfs -copyFromLocal pyspark.zip /user/oozie/share/lib/spark
【Problem not solved】
For now, get the job running with spark-submit alone first; running it through Oozie comes after.
Running with spark-submit alone, with no extra options, succeeds.
Adding --master yarn-cluster makes it fail, and the 8088 UI reports this error:
Application application_1486993422162_0016 failed 2 times due to AM Container for appattempt_1486993422162_0016_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://bigdata-master:8088/cluster/app/application_1486993422162_0016Then, click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://bigdata/user/hadoop/.sparkStaging/application_1486993422162_0016/spark1.py
java.io.FileNotFoundException: File does not exist: hdfs://bigdata/user/hadoop/.sparkStaging/application_1486993422162_0016/spark1.py
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.
【Attempt 1】In the .py, comment out
# conf = conf.setMaster("local[*]")
so Spark picks the master to run on by itself, then run:
spark-submit --master yarn-cluster pythonApp/lib/spark1.py
【Success: 8088 no longer reports the error!】
【Failure: with local[*] removed, a plain spark-submit now dies with 17/02/15 16:18:11 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.】
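For reference, a minimal sketch of what spark1.py plausibly looks like after this change. The Pi-estimation body and the app name are my assumptions, inferred from the "Pi is roughly ..." output later in these notes; the only point being made is the absence of setMaster:

import random
from pyspark import SparkConf, SparkContext

# No setMaster() call: the master now comes entirely from spark-submit
# (--master yarn-cluster here, or the default when run plain).
conf = SparkConf().setAppName("spark1")
sc = SparkContext(conf=conf)

# Monte Carlo Pi estimate, matching the "Pi is roughly ..." output below.
n = 100000
def inside(_):
    x, y = random.random(), random.random()
    return x * x + y * y < 1
count = sc.parallelize(range(n)).filter(inside).count()
print("Pi is roughly %f" % (4.0 * count / n))
sc.stop()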
【Attempt 2 (not tried)】On top of Attempt 1:
add the file path on the SparkContext (note: addFile is a SparkContext method, not a SparkConf one; a sketch follows):
sc.addFile("hdfs:/optimize-spark.py")
【Moving back to Oozie, it still errors: py4j.zip and pyspark.zip not found】
【Attempt 1】Change the job .properties,
switching master from local[*] to yarn-cluster.
【Neither value solves the missing-pyspark problem】
【Attempt 2】Add <file> tags to see whether the files get shipped automatically.
Files involved: py4j.zip pyspark.zip spark1.py
<file>${nameNode}/user/oozie/${examplesRoot}/apps/pythonApp/lib/pyspark.zip</file>
<file>${nameNode}/user/oozie/${examplesRoot}/apps/pythonApp/lib/py4j.zip</file>
【After adding them, Attempt 2 errors out; evidently the spark-action:0.1 schema does not allow a <file> element at this position:】
Error: E0701 : E0701: XML schema error, cvc-complex-type.2.4.a: Invalid content was found starting with element 'file'. One of '{"uri:oozie:spark-action:0.1":spark-opts, "uri:oozie:spark-action:0.1":arg}' is expected.
【Attempt 3: add them inside the .py】
conf.addFile("hdfs://bigdata/user/oozie/examples/apps/pythonApp/lib/pyspark.zip")
conf.addFile("hdfs://bigdata/user/oozie/examples/apps/pythonApp/lib/py4j.zip")
【Failed (no surprise in hindsight: SparkConf has no addFile method; see the sketch below)】
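A sketch of the working form of this idea, for the record: addFile and addPyFile live on SparkContext, and addPyFile is the one that also puts the archive on the executors' sys.path. Note the chicken-and-egg problem, though: the driver must already be able to import pyspark before any of this can run, so it cannot fix the driver-side import failure Oozie is hitting.

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("spark1")
sc = SparkContext(conf=conf)
# Ships each zip to the executors and adds it to their sys.path.
sc.addPyFile("hdfs://bigdata/user/oozie/examples/apps/pythonApp/lib/pyspark.zip")
sc.addPyFile("hdfs://bigdata/user/oozie/examples/apps/pythonApp/lib/py4j.zip")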
Whether the zips go into the local lib directory,
or into /user/oozie/share/lib/spark on HDFS,
or into the local $OOZIE_HOME/share/lib/,
or get unzipped inside lib with the extracted files added: nothing works, everything fails.
【Attempt 4】
Add /usr/share/spark/python/lib to sys.path:
import sys
import random
del sys.path[9]  # drops entry 9 (presumably an interpreter default path; this turns out badly, see below)
sys.path.append("/usr/share/spark/python/lib")
【Problem】Submitting directly with spark-submit now fails:
$ spark-submit spark1.py
Traceback (most recent call last):
File "/home/hadoop/oozie/oozie-4.3.0/examples/apps/pythonApp/lib/spark1.py", line 8, in
from pyspark import SparkConf, SparkContext
File "/home/hadoop/oozie/oozie-4.3.0/examples/apps/pythonApp/lib/pyspark/__init__.py", line 41, in
from pyspark.context import SparkContext
File "/home/hadoop/oozie/oozie-4.3.0/examples/apps/pythonApp/lib/pyspark/context.py", line 21, in
import shutil
File "/usr/share/anaconda2/lib/python2.7/shutil.py", line 12, in
import collections
File "/usr/share/anaconda2/lib/python2.7/collections.py", line 8, in
from _collections import deque, defaultdict
ImportError: No module named _collections
【But submitting with --master yarn-cluster succeeds】
【Failure: even in this state, it still fails through Oozie】
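For comparison, a sketch of a less destructive way to get the Spark Python libs onto the path: prepend instead of deleting by index, since blindly removing sys.path[9] is presumably what broke Anaconda's own stdlib lookup (_collections) in the traceback above. The glob pattern is an assumption based on the py4j-0.9-src.zip name that turns up later in these notes.

import glob
import os
import sys

spark_python = "/usr/share/spark/python"
# Prepend so Spark's modules win, without disturbing the interpreter's own entries.
sys.path.insert(0, spark_python)
for zip_path in glob.glob(os.path.join(spark_python, "lib", "py4j-*-src.zip")):
    sys.path.insert(0, zip_path)

from pyspark import SparkConf, SparkContext  # should now import cleanly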
【Attempt 5】Launch the job via a shell action that runs spark-submit, with these parameters:
org.lzl.MainClass
hdfs://bigdata/user/oozie/examples/apps/sparkHello/lib/OozieHelowod.jar
hdfs://bigdata/user/oozie/examples/input-data/text/data.txt
hdfs://bigdata/user/oozie/examples/output-data/spark/new/
hdfs://bigdata
【Failed, and so thoroughly that I can't even find a log showing how it failed】
/home/hadoop/oozie/oozie-4.3.0/examples/apps/pythonApp/lib/spark1.py
【Attempt 6】http://blog.csdn.net/xyf123/article/details/50853578
Following the hint at the end of that post, add SPARK_HOME to the profile and the Spark config files.
Configure every YARN node this way:
$ ssh bigdata-6
$ cd $HADOOP_HOME/etc/hadoop
$ cp yarn-env.sh yarn-env_backup2017_2_15_1800.sh
append to yarn-env.sh: export SPARK_HOME=/usr/share/spark
$ echo $SPARK_HOME
/usr/share/spark
【Moving on to the master】
Configure environment variables (in /usr/share/oozie/oozie-4.3.0):
export SPARK_HOME=/usr/share/spark
export OOZIE_HOME=/usr/share/oozie/oozie-4.3.0
export CATALINA_HOME=/usr/share/oozie/oozie-4.3.0/oozie-server
export OOZIE_URL=http://bigdata-master:11000
export OOZIE_CONFIG=/usr/share/oozie/oozie-4.3.0/conf
After starting it up, submitting the py job still fails, complaining the zip files are missing.
Continuing with the operations on taurus.
【Attempt 7】Copy the zips into the Oozie webapp's lib directory inside Tomcat:
oozie-server/webapps/oozie/WEB-INF/lib
【Still doesn't solve it】
And, bizarrely, today:
$ spark-submit spark1.py --py-files py4j.zip,pyspark.zip
Pi is roughly 3.328000
$ spark-submit spark1.py --py-files py4j.zip,pyspark.zip --master yarn-cluster
Pi is roughly 3.232000
$ spark-submit spark1.py --master yarn-cluster
Pi is roughly 3.296000
$ spark-submit spark1.py
Pi is roughly 2.848000
$
All four standalone runs pass now. (Note that spark-submit treats everything after the primary .py file as arguments to the application itself, so the --py-files and --master flags placed after spark1.py above were silently ignored: all four runs were local, and the SPARK_HOME setup from Attempt 6 is presumably what made them pass.)
【Attempt 8】Following https://oozie.apache.org/docs/4.3.0/AG_Install.html#Oozie_Share_Lib,
move the zip files from
/user/oozie/share/lib/spark to
/user/oozie/share/lib/spark/lib.
Running then fails with file-not-found:
ActionExecutorException: JA008: File does not exist: hdfs://bigdata/user/oozie/share/lib/spark/py4j.zip#py4j.zip
ActionExecutorException: JA008: File does not exist: hdfs://bigdata/user/oozie/share/lib/spark/pyspark.zip#pyspark.zip
【My guess: this comes from the extra --py-files py4j.zip,pyspark.zip option, whose paths still resolve against the old location; the #py4j.zip suffix is just the distributed-cache symlink name】
【Attempt 9】
Do NOT rename py4j.zip, damn it: the Oozie source (under sharelib/spark/src/main/) looks the file up with hard-coded name patterns!
………………
import java.io.File;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class SparkMain extends LauncherMain {
private static final String MASTER_OPTION = "--master";
private static final String MODE_OPTION = "--deploy-mode";
private static final String JOB_NAME_OPTION = "--name";
private static final String CLASS_NAME_OPTION = "--class";
private static final String VERBOSE_OPTION = "--verbose";
private static final String EXECUTOR_CLASSPATH = "spark.executor.extraClassPath=";
private static final String DRIVER_CLASSPATH = "spark.driver.extraClassPath=";
private static final String DIST_FILES = "spark.yarn.dist.files=";
private static final String JARS_OPTION = "--jars";
private static final String PY_FILES = "--py-files";
private static final Pattern[] PYSPARK_DEP_FILE_PATTERN = { Pattern.compile("py4\\S*src.zip"),
Pattern.compile("pyspark.zip") };
private String sparkJars = null;
………………
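A quick way to see what those two patterns actually accept (Python's re stands in here for the Java regexes; the filenames are the ones from this saga):

import re

patterns = [re.compile(r"py4\S*src.zip"), re.compile(r"pyspark.zip")]
for name in ["py4j.zip", "py4j-0.9-src.zip", "pyspark.zip"]:
    print("%-18s %s" % (name, [bool(p.match(name)) for p in patterns]))
# py4j.zip           [False, False]  <- a renamed py4j zip matches nothing
# py4j-0.9-src.zip   [True, False]
# pyspark.zip        [False, True]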
And!! It does not correct you: even with pyspark.zip present, it still reports BOTH files missing!!!! Absolutely infuriating!!!!!!!
Renamed the file back to py4j-0.9-src.zip.
【OK, that fixes the missing py4j.zip and pyspark.zip errors】
New problem; it no longer complains that those two files are missing:
Went into the Oozie Spark sharelib on HDFS to delete the assorted py folders and zips, and found the py4j folder (the FOLDER! the folder! important things three times!) cannot be deleted: it is still a required dependency.
The pyspark folder (the one extracted from the zip) cannot be deleted either.
【And not only that】
【the zips under pythonApp/lib cannot be dropped either; they must be present】
【the run returns an exit with 1 error】
The run seemed to hit a package conflict. Remembering that I had uploaded the py4j and pyspark zips all over the place while chasing the missing-package errors, I deleted the lib folder I had created myself inside /user/oozie/share/lib/spark on HDFS.
After deleting it, it actually 【errors】: cannot find /user/oozie/share/lib/spark/lib/py4j.zip ??!?!?!?
That makes no sense; py4j.zip is a name I made up, not a fixed dependency.
I decided to restart Oozie, figuring the sharelib had changed underneath it and Oozie had not noticed (in hindsight, oozie admin -sharelibupdate would presumably have refreshed it without a full restart).
【The missing py4j.zip and pyspark.zip problem is solved for good; after one restart, the dependency errors were indeed gone】