Fixing ClassNotFoundException in Oozie

References: http://jyd.me/nosql/oozie-classnotfoundexception-solution/

    http://shiyanjun.cn/archives/684.html


If you run into any of the following errors:

java.lang.ClassNotFoundException: org.apache.hadoop.tools.DistCp
java.lang.NoClassDefFoundError: org/apache/pig/Main
java.lang.ClassNotFoundException: org.apache.pig.Main
java.lang.NoClassDefFoundError: org/apache/sqoop/Sqoop
java.lang.ClassNotFoundException: org.apache.sqoop.Sqoop
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/cli/CliDriver
java.lang.ClassNotFoundException: org.apache.hadoop.hive.cli.CliDriver
java.lang.ClassNotFoundException: Class org.apache.hadoop.streaming.PipeMapRunner not found

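How do you fix them? By adding the matching JARs, of course.

When Oozie runs a job, how does it find the classes it needs?

There are two ways to add JARs.

First, take a look at the directory of an Oozie workflow.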

sudo -u oozie hadoop fs -ls examples/apps/demo/
Found 4 items
-rw-r--r--   1 oozie supergroup 930  2012-12-14 13:23 examples/apps/demo/id.pig
-rw-r--r--   1 oozie supergroup 1020 2012-12-14 13:23 examples/apps/demo/job.properties
drwxr-xr-x   - oozie supergroup 0    2012-12-14 13:23 examples/apps/demo/lib
-rw-r--r--   1 oozie supergroup 6136 2012-12-14 13:23 examples/apps/demo/workflow.xml

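Of these, workflow.xml is required, and job.properties is needed to submit the job (the oozie command-line client reads it from the local file system, so the copy in HDFS is only there for convenience). id.pig is a Pig script specific to this job and can be ignored here.

The key piece is the lib directory. It is optional, but if it exists, Oozie automatically adds the JARs under lib to the workflow's classpath.

That way, tasks run through Oozie can find the JARs they need.

This is the simplest approach.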
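As a rough illustration of the lib approach, the sketch below uploads JARs into the workflow's lib directory. It assumes the hadoop CLI is on the PATH and that the JAR files are passed as arguments; the HDFS path comes from the listing above, and the helper name upload_jars is made up for this example. The listing above ran hadoop as the oozie user via sudo, so you may need to do the same.

```shell
#!/bin/sh
# Sketch only: copy local JARs into the workflow's lib directory on HDFS.
# WF_LIB is taken from the listing above; adjust it for your own workflow.
WF_LIB=examples/apps/demo/lib

# upload_jars is a hypothetical helper: it uploads every file given as an argument.
upload_jars() {
  for jar in "$@"; do
    echo "uploading $jar -> $WF_LIB/"
    hadoop fs -put "$jar" "$WF_LIB/"
  done
}

if command -v hadoop >/dev/null 2>&1; then
  # The directory may already exist, so errors from mkdir are ignored.
  hadoop fs -mkdir "$WF_LIB" 2>/dev/null
  upload_jars "$@"
else
  echo "hadoop CLI not found; skipping upload"
fi
```

Saved under a name of your choosing, the script would be called with the JAR paths as its arguments.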

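Alternatively, you can set oozie.libpath in job.properties to point at other HDFS directories (several directories can be listed, separated by commas).

The advantage of oozie.libpath is that multiple workflows can share the same JARs.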
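A minimal sketch of what such a job.properties might look like, written out with a shell heredoc. The host names and the two shared JAR directories are hypothetical placeholders, not values from the original setup; only the oozie.libpath property itself comes from the text above.

```shell
#!/bin/sh
# Sketch only: generate a job.properties that lists two shared JAR directories.
# Host names and HDFS paths are hypothetical; replace them with your own.
# The quoted 'EOF' keeps ${...} literal so that Oozie, not the shell, resolves it.
cat > job.properties <<'EOF'
nameNode=hdfs://namenode.example.com:8020
jobTracker=jobtracker.example.com:8021
oozie.wf.application.path=${nameNode}/user/${user.name}/examples/apps/demo
# Comma-separated HDFS directories whose JARs are added to the classpath
oozie.libpath=${nameNode}/user/oozie/shared-jars,${nameNode}/user/oozie/udf-jars
EOF
```

Any workflow whose job.properties points at the same directories picks up the same JARs, which is the sharing benefit described above.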

 

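Now for the second approach: the ShareLib.

Actions such as DistCp, Streaming, Pig, Sqoop, and Hive need additional JARs in order to run.

Using the ShareLib is much like using oozie.libpath. The difference is that it exists specifically for the special actions listed above and the JARs they need.

The ShareLib directory in CDH 4.1.2 looks like this: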

drwxr-xr-x share
drwxr-xr-x share/lib
drwxr-xr-x share/lib/distcp
-rw-r--r-- share/lib/distcp/hadoop-tools-2.0.0-mr1-cdh4.1.2.jar
drwxr-xr-x share/lib/hive
-rw-r--r-- share/lib/hive/JavaEWAH-0.3.2.jar
-rw-r--r-- share/lib/hive/antlr-2.7.7.jar
-rw-r--r-- share/lib/hive/antlr-3.0.1.jar
-rw-r--r-- share/lib/hive/antlr-runtime-3.0.1.jar
-rw-r--r-- share/lib/hive/avro-ipc-1.7.1.cloudera.2.jar
-rw-r--r-- share/lib/hive/avro-mapred-1.7.1.cloudera.2.jar
-rw-r--r-- share/lib/hive/commons-beanutils-1.7.0.jar
-rw-r--r-- share/lib/hive/commons-beanutils-core-1.8.0.jar
-rw-r--r-- share/lib/hive/commons-collections-3.2.1.jar
-rw-r--r-- share/lib/hive/commons-compress-1.4.1.jar
-rw-r--r-- share/lib/hive/commons-configuration-1.6.jar
-rw-r--r-- share/lib/hive/commons-dbcp-1.4.jar
-rw-r--r-- share/lib/hive/commons-digester-1.8.jar
-rw-r--r-- share/lib/hive/commons-pool-1.5.4.jar
-rw-r--r-- share/lib/hive/datanucleus-connectionpool-2.0.3.jar
-rw-r--r-- share/lib/hive/datanucleus-core-2.0.3.jar
-rw-r--r-- share/lib/hive/datanucleus-enhancer-2.0.3.jar
-rw-r--r-- share/lib/hive/datanucleus-rdbms-2.0.3.jar
-rw-r--r-- share/lib/hive/derby-10.6.1.0.jar
-rw-r--r-- share/lib/hive/guava-11.0.2.jar
-rw-r--r-- share/lib/hive/haivvreo-1.0.7-cdh-4.jar
-rw-r--r-- share/lib/hive/hive-builtins-0.9.0-cdh4.1.2.jar
-rw-r--r-- share/lib/hive/hive-cli-0.9.0-cdh4.1.2.jar
-rw-r--r-- share/lib/hive/hive-common-0.9.0-cdh4.1.2.jar
-rw-r--r-- share/lib/hive/hive-contrib-0.9.0-cdh4.1.2.jar
-rw-r--r-- share/lib/hive/hive-exec-0.9.0-cdh4.1.2.jar
-rw-r--r-- share/lib/hive/hive-metastore-0.9.0-cdh4.1.2.jar
-rw-r--r-- share/lib/hive/hive-serde-0.9.0-cdh4.1.2.jar
-rw-r--r-- share/lib/hive/hive-service-0.9.0-cdh4.1.2.jar
-rw-r--r-- share/lib/hive/hive-shims-0.9.0-cdh4.1.2.jar
-rw-r--r-- share/lib/hive/httpclient-4.0.1.jar
-rw-r--r-- share/lib/hive/httpcore-4.0.1.jar
-rw-r--r-- share/lib/hive/jackson-core-asl-1.8.8.jar
-rw-r--r-- share/lib/hive/jackson-mapper-asl-1.8.8.jar
-rw-r--r-- share/lib/hive/jdo2-api-2.3-ec.jar
-rw-r--r-- share/lib/hive/jetty-util-6.1.26.cloudera.2.jar
-rw-r--r-- share/lib/hive/jline-0.9.94.jar
-rw-r--r-- share/lib/hive/json-20090211.jar
-rw-r--r-- share/lib/hive/jsr305-1.3.9.jar
-rw-r--r-- share/lib/hive/jta-1.1.jar
-rw-r--r-- share/lib/hive/libfb303-0.7.0.jar
-rw-r--r-- share/lib/hive/libthrift-0.7.0.jar
-rw-r--r-- share/lib/hive/netty-3.4.0.Final.jar
-rw-r--r-- share/lib/hive/servlet-api-2.5-20081211.jar
-rw-r--r-- share/lib/hive/stringtemplate-3.1-b1.jar
-rw-r--r-- share/lib/hive/xz-1.0.jar
drwxr-xr-x share/lib/mapreduce-streaming
-rw-r--r-- share/lib/mapreduce-streaming/commons-cli-1.2.jar
-rw-r--r-- share/lib/mapreduce-streaming/commons-codec-1.4.jar
-rw-r--r-- share/lib/mapreduce-streaming/commons-el-1.0.jar
-rw-r--r-- share/lib/mapreduce-streaming/commons-httpclient-3.1.jar
-rw-r--r-- share/lib/mapreduce-streaming/commons-logging-1.1.jar
-rw-r--r-- share/lib/mapreduce-streaming/commons-net-3.1.jar
-rw-r--r-- share/lib/mapreduce-streaming/core-3.1.1.jar
-rw-r--r-- share/lib/mapreduce-streaming/hadoop-core-2.0.0-mr1-cdh4.1.2.jar
-rw-r--r-- share/lib/mapreduce-streaming/hadoop-streaming-2.0.0-mr1-cdh4.1.2.jar
-rw-r--r-- share/lib/mapreduce-streaming/hsqldb-1.8.0.7.jar
-rw-r--r-- share/lib/mapreduce-streaming/jackson-core-asl-1.8.8.jar
-rw-r--r-- share/lib/mapreduce-streaming/jackson-mapper-asl-1.8.8.jar
-rw-r--r-- share/lib/mapreduce-streaming/jasper-compiler-5.5.23.jar
-rw-r--r-- share/lib/mapreduce-streaming/jasper-runtime-5.5.23.jar
-rw-r--r-- share/lib/mapreduce-streaming/jets3t-0.6.1.jar
-rw-r--r-- share/lib/mapreduce-streaming/jetty-6.1.14.jar
-rw-r--r-- share/lib/mapreduce-streaming/jetty-util-6.1.26.cloudera.2.jar
-rw-r--r-- share/lib/mapreduce-streaming/jsp-api-2.1.jar
-rw-r--r-- share/lib/mapreduce-streaming/log4j-1.2.16.jar
-rw-r--r-- share/lib/mapreduce-streaming/oro-2.0.8.jar
-rw-r--r-- share/lib/mapreduce-streaming/servlet-api-2.5-6.1.14.jar
-rw-r--r-- share/lib/mapreduce-streaming/servlet-api-2.5.jar
-rw-r--r-- share/lib/mapreduce-streaming/xmlenc-0.52.jar
drwxr-xr-x share/lib/oozie
-rw-r--r-- share/lib/oozie/json-simple-1.1.jar
drwxr-xr-x share/lib/pig
-rw-r--r-- share/lib/pig/activation-1.1.jar
-rw-r--r-- share/lib/pig/antlr-2.7.7.jar
-rw-r--r-- share/lib/pig/antlr-runtime-3.4.jar
-rw-r--r-- share/lib/pig/commons-beanutils-1.7.0.jar
-rw-r--r-- share/lib/pig/commons-beanutils-core-1.8.0.jar
-rw-r--r-- share/lib/pig/commons-collections-3.2.1.jar
-rw-r--r-- share/lib/pig/commons-configuration-1.6.jar
-rw-r--r-- share/lib/pig/commons-digester-1.8.jar
-rw-r--r-- share/lib/pig/commons-io-2.1.jar
-rw-r--r-- share/lib/pig/guava-11.0.2.jar
-rw-r--r-- share/lib/pig/hbase-0.92.1-cdh4.1.2.jar
-rw-r--r-- share/lib/pig/high-scale-lib-1.1.1.jar
-rw-r--r-- share/lib/pig/hsqldb-1.8.0.7.jar
-rw-r--r-- share/lib/pig/httpclient-4.0.1.jar
-rw-r--r-- share/lib/pig/httpcore-4.0.1.jar
-rw-r--r-- share/lib/pig/jaxb-api-2.2.2.jar
-rw-r--r-- share/lib/pig/jline-0.9.94.jar
-rw-r--r-- share/lib/pig/joda-time-1.6.jar
-rw-r--r-- share/lib/pig/jruby-complete-1.6.5.jar
-rw-r--r-- share/lib/pig/jsch-0.1.42.jar
-rw-r--r-- share/lib/pig/jsr305-1.3.9.jar
-rw-r--r-- share/lib/pig/jython-2.5.0.jar
-rw-r--r-- share/lib/pig/libthrift-0.7.0.jar
-rw-r--r-- share/lib/pig/metrics-core-2.1.2.jar
-rw-r--r-- share/lib/pig/pig-0.10.0-cdh4.1.2.jar
-rw-r--r-- share/lib/pig/protobuf-java-2.4.0a.jar
-rw-r--r-- share/lib/pig/stax-api-1.0.1.jar
-rw-r--r-- share/lib/pig/stringtemplate-3.2.1.jar
-rw-r--r-- share/lib/sharelib.properties
drwxr-xr-x share/lib/sqoop
-rw-r--r-- share/lib/sqoop/avro-ipc-1.7.1.cloudera.2.jar
-rw-r--r-- share/lib/sqoop/avro-mapred-1.7.1.cloudera.2.jar
-rw-r--r-- share/lib/sqoop/commons-beanutils-1.7.0.jar
-rw-r--r-- share/lib/sqoop/commons-beanutils-core-1.8.0.jar
-rw-r--r-- share/lib/sqoop/commons-configuration-1.6.jar
-rw-r--r-- share/lib/sqoop/commons-digester-1.8.jar
-rw-r--r-- share/lib/sqoop/commons-io-2.1.jar
-rw-r--r-- share/lib/sqoop/guava-11.0.2.jar
-rw-r--r-- share/lib/sqoop/hbase-0.92.1-cdh4.1.2.jar
-rw-r--r-- share/lib/sqoop/high-scale-lib-1.1.1.jar
-rw-r--r-- share/lib/sqoop/hsqldb-1.8.0.7.jar
-rw-r--r-- share/lib/sqoop/httpclient-4.0.1.jar
-rw-r--r-- share/lib/sqoop/httpcore-4.0.1.jar
-rw-r--r-- share/lib/sqoop/jsr305-1.3.9.jar
-rw-r--r-- share/lib/sqoop/libthrift-0.7.0.jar
-rw-r--r-- share/lib/sqoop/metrics-core-2.1.2.jar
-rw-r--r-- share/lib/sqoop/netty-3.4.0.Final.jar
-rw-r--r-- share/lib/sqoop/servlet-api-2.5-20081211.jar
-rw-r--r-- share/lib/sqoop/sqoop-1.4.1-cdh4.1.2.jar

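As you can see, each of those actions depends on many JARs, and each action has its own directory. This lets Oozie load only the JARs a given action needs instead of pulling in all of them.

In fact, this separation is necessary, because not all actions use the same, or even compatible, JARs. For example, the Hive action uses antlr-runtime-3.0.1.jar and fails when given antlr-runtime-3.4.jar, yet the latter is exactly what the Pig action needs.

By default, the ShareLib must live in the HDFS home directory of the user that runs the Oozie web server. That user does not have to be the one who submits Oozie jobs. By the way, the ShareLib tarball ships in the Oozie installation directory.

The oozie.service.WorkflowAppService.system.libpath property in oozie-site.xml sets the location of the ShareLib. The default is /user/${user.name}/share/lib, where ${user.name} is the user running the Oozie service. For configuration details, see http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_17_6.html

Also, because Cloudera's CDH4 supports both MRv1 and YARN, you must install the ShareLib package that matches the one you are running.

To make a workflow use the ShareLib, just add oozie.use.system.libpath=true to job.properties.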
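A small sketch of that last step: it appends the property to a job.properties file in the current directory, but only if the key is not already present. The file location is an assumption; the property name and value come from the text above.

```shell
#!/bin/sh
# Sketch only: enable the ShareLib for a job by adding one line to job.properties.
# PROPS is assumed to be in the current directory; point it at your own file.
PROPS=job.properties
touch "$PROPS"

# Append the property only when the key is missing, so reruns do not duplicate it.
grep -q '^oozie.use.system.libpath=' "$PROPS" \
  || echo 'oozie.use.system.libpath=true' >> "$PROPS"
```

Note that this leaves an existing oozie.use.system.libpath=false line untouched, so it is worth checking the file by hand if the key was already there.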

Overriding the ShareLib

I have not needed this yet, so the original English text is quoted as-is below.

In CDH 4.1.0 and later (or Oozie 3.3.0 and later), you can override the ShareLib location at the action, job, and server levels. This allows users or admins to support multiple versions or a patched version of an action at the same time. The property is called oozie.action.sharelib.for.actiontype, where actiontype is the name of the action type (e.g. Pig, Sqoop); you would set its value to the name of a subfolder in the ShareLib. To set it at the action level you would put the property in that action’s <configuration>; to set it at the job level, you would put the property in that job’s job.properties; and to set it at the server level, you would put the property in oozie-site.xml.

For example, Oozie currently ships ready for Pig 0.10.x, but suppose you also want to be able to use Pig 0.9.x in the same workflow. The share/lib/pig folder is for Pig 0.10.x, but if you add a new folder with the Pig 0.9.x JARs, say share/lib/pig-9, you can put the following in the <configuration> element for the Pig 0.9.x action:

<property>
    <name>oozie.action.sharelib.for.pig</name>
    <value>pig-9</value>
</property>

Oozie will continue to use share/lib/pig for the Pig 0.10.x action but will use share/lib/pig-9 for the Pig 0.9.x action.
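The quoted text also mentions a job-level override through job.properties. A rough sketch of that variant follows, assuming every Pig action in the job should use the pig-9 folder. The set_prop helper and the file location are made up for this example.

```shell
#!/bin/sh
# Sketch only: set the job-level ShareLib override for Pig in job.properties.
# set_prop is a hypothetical helper: it replaces any existing line for the key
# and then appends key=value. Dots in the key act as regex wildcards in grep,
# which is harmless for this key.
set_prop() {
  key=$1
  value=$2
  file=$3
  touch "$file"
  grep -v "^$key=" "$file" > "$file.tmp"
  echo "$key=$value" >> "$file.tmp"
  mv "$file.tmp" "$file"
}

# pig-9 is the example subfolder name from the text above (share/lib/pig-9).
set_prop oozie.action.sharelib.for.pig pig-9 job.properties
```

Unlike the action-level property, this setting applies to the whole job, so it does not by itself let Pig 0.9.x and 0.10.x actions coexist in one workflow.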

Reference: http://blog.cloudera.com/blog/2012/12/how-to-use-the-sharelib-in-apache-oozie/

If you repost this article, please credit the source: http://jyd.me/
