Spark exception: Compression codec com.hadoop.compression.lzo.LzoCodec not found

1. Scenario:
Run the following in spark-shell:
scala> val lines=sc.textFile("/user/dev_yx/dpi/input/rule/keyWord.txt")
scala> lines.count()

Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
        ... 61 more
Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
        at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:135)
        at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:175)
        at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
        ... 66 more
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1980)
        at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:128)
        ... 68 more

Running the job with spark-submit:
spark-submit --class com.spark --master local --driver-memory 2g --executor-memory 2g --num-executors 10 --executor-cores 4 --queue test3 test-0.0.1-SNAPSHOT.jar
raises the same exception.
Changing the command to:
spark-submit --class com.spark --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 1g --num-executors 10 --executor-cores 4 --queue test3 original-test-0.0.1-SNAPSHOT.jar
runs normally.
2. Cause:
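  This happens because compression is enabled in Hadoop's core-site.xml and mapred-site.xml, and the configured codec is LZO. As a result, files written or uploaded to HDFS are automatically compressed as LZO. When sc.textFile then reads a file, TextInputFormat loads every codec class listed in the Hadoop configuration (as the stack trace shows), and the read fails with a series of LZO-not-found exceptions if com.hadoop.compression.lzo.LzoCodec is not on the classpath of the Spark process.
Also note that in Spark on YARN mode, the Hadoop YARN setting yarn.nodemanager.local-dirs overrides Spark's spark.local.dir.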
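
To check the codec explanation on a given machine, here is a minimal sketch that reports which class names in a comma-separated codec list cannot be loaded from the current classpath. The object name CodecCheck and the method missingCodecs are hypothetical helpers made up for illustration; they are not part of Spark or Hadoop.

```scala
// Hypothetical helper (not a Spark or Hadoop API): find codec classes
// that are named in a comma-separated list but missing from the classpath.
object CodecCheck {

  /** Returns the class names in `codecList` that cannot be loaded. */
  def missingCodecs(codecList: String): Seq[String] = {
    val loader = Thread.currentThread().getContextClassLoader
    codecList
      .split(",")
      .map(_.trim)
      .filter(_.nonEmpty)
      .filter { name =>
        try {
          // initialize = false: only locate the class, do not run its
          // static initializers (which might need native libraries).
          Class.forName(name, false, loader)
          false
        } catch {
          case _: ClassNotFoundException => true
        }
      }
      .toSeq
  }
}
```

In spark-shell, after pasting the object above, the list can be taken from the Hadoop configuration with CodecCheck.missingCodecs(sc.hadoopConfiguration.get("io.compression.codecs", "")). If com.hadoop.compression.lzo.LzoCodec appears in the result, the LZO codec jar is not visible to that Spark process, which matches the ClassNotFoundException at the bottom of the stack trace.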
