1. Scenario
When building Hadoop, Spark, and similar projects from source with Maven, compilation errors are common. Below are some ways to resolve them.
2. Errors, analysis, and troubleshooting
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (default) on project spark-yarn_2.11: Failed to resolve dependencies for one or more projects in the reactor. Reason: Unable to get dependency information for com.google.inject.extensions:guice-servlet:jar:3.0: Failed to retrieve POM for com.google.inject.extensions:guice-servlet:jar:3.0: Failure to transfer com.google.inject.extensions:guice-servlet:pom:3.0 from https://repo1.maven.org/maven2 was cached in the local repository, resolution will not be reattempted until the update interval of central has elapsed or updates are forced. Original error: Could not transfer artifact com.google.inject.extensions:guice-servlet:pom:3.0 from/to central (https://repo1.maven.org/maven2): Connection reset
[ERROR] com.google.inject.extensions:guice-servlet:jar:3.0
[ERROR]
[ERROR] from the specified remote repositories:
[ERROR] central (https://repo1.maven.org/maven2, releases=true, snapshots=false),
[ERROR] cloudera (https://repository.cloudera.com/artifactory/cloudera-repos/, releases=true, snapshots=true),
[ERROR] apache.snapshots (http://repository.apache.org/snapshots, releases=false, snapshots=true),
[ERROR] m2.java.net (http://download.java.net/maven/2, releases=true, snapshots=true),
[ERROR] repository.jboss.org (http://repository.jboss.org/nexus/content/groups/public/, releases=true, snapshots=true),
[ERROR] glassfish-repository (http://maven.glassfish.org/content/groups/glassfish, releases=true, snapshots=true),
[ERROR] jvnet-nexus-snapshots (https://maven.java.net/content/repositories/snapshots, releases=false, snapshots=true)
[ERROR] Path to dependency:
[ERROR] 1) org.apache.spark:spark-yarn_2.11:jar:2.2.0
[ERROR] 2) com.sun.jersey.contribs:jersey-guice:jar:1.9
[ERROR] -> [Help 1]
[ERROR] Failed to execute goal on project spark-sql-kafka-0-10_2.11: Could not resolve dependencies for project org.apache.spark:spark-sql-kafka-0-10_2.11:jar:2.2.0: Failed to collect dependencies at org.apache.kafka:kafka_2.11:jar:0.10.0.1 -> net.sf.jopt-simple:jopt-simple:jar:4.9: Failed to read artifact descriptor for net.sf.jopt-simple:jopt-simple:jar:4.9: Could not transfer artifact net.sf.jopt-simple:jopt-simple:pom:4.9 from/to central (https://repo1.maven.org/maven2): repo1.maven.org: Name or service not known: Unknown host repo1.maven.org: Name or service not known -> [Help 1]
1. Notice this class of messages in the error output:
Reason:
Unable to get dependency information for com.google.inject.extensions:guice-servlet:jar:3.0:
Failed to retrieve POM for com.google.inject.extensions:guice-servlet:jar:3.0:
Failure to transfer com.google.inject.extensions:guice-servlet:pom:3.0
Could not resolve dependencies for project org.apache.spark:spark-sql-kafka-0-10_2.11:jar:2.2.0:
Failed to collect dependencies at org.apache.kafka:kafka_2.11:jar:0.10.0.1
All of these point to the same class of failure.
Evidently, Maven hit a problem while downloading and resolving artifacts from the remote repositories configured in Spark's pom.xml, so the files could not be saved into the local repository.
So first, check whether these files exist in the local repository:
(1) Open apache-maven-3.3.9/conf/settings.xml and find the configured local repository, e.g.:
<localRepository>/home/hadoop/maven_repo</localRepository>
(2) Enter that directory and, following the error message, drill down level by level.
For Failure to transfer com.google.inject.extensions:guice-servlet:pom:3.0, that means:
cd /home/hadoop/maven_repo/com/google/inject/extensions/guice-servlet/3.0 — the trailing "pom" indicates it is the POM file that is missing.
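The directory to drill into can be derived mechanically from the Maven coordinates (group:artifact:version): dots in the group id become path separators, followed by the artifact id and version. A small sketch of that mapping (variable names are mine, not part of Maven):

```shell
# Maven coordinates of the artifact from the error above.
coord="com.google.inject.extensions:guice-servlet:3.0"
# Split group:artifact:version on the colons.
group=${coord%%:*}
rest=${coord#*:}
artifact=${rest%%:*}
version=${rest#*:}
# Standard repository layout, used both by the local repo and by repo1.maven.org.
path="$(echo "$group" | tr . /)/$artifact/$version"
echo "$path"   # com/google/inject/extensions/guice-servlet/3.0
```

Append this path either to the local repository root or to https://repo1.maven.org/maven2/ to reach the same artifact directory on both sides.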
(3) Open https://repo1.maven.org/maven2 in a browser, follow the same path level by level, and check whether the files there differ from the local ones.
There are two possible situations:
① The paths match but files are missing locally: manually download the jar, jar.sha1, pom, and pom.sha1 into the corresponding local directory, then add a _remote.repositories file, modeling its content on that of an artifact that downloaded successfully.
Alternatively, method ② below also works.
② The paths do not match (the path does not exist on the site): find out which project the missing artifact belongs to, then add an appropriate remote repository.
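For case ①, the _remote.repositories file sits next to the downloaded artifacts and records which remote repository each file came from. A minimal sketch of its format (the /tmp path here is for demonstration only; in practice write the file into the real local repository directory, and model the content on a neighbouring artifact as described above):

```shell
# Demo location only; the real file lives next to the artifact in the local repo.
DIR=/tmp/maven_repo_demo/com/google/inject/extensions/guice-servlet/3.0
mkdir -p "$DIR"
# Each line maps "filename>repositoryId=" for a file fetched from that repository.
cat > "$DIR/_remote.repositories" <<'EOF'
#NOTE: This is a Maven Resolver internal implementation file, its format can be changed without prior notice.
guice-servlet-3.0.jar>central=
guice-servlet-3.0.pom>central=
EOF
cat "$DIR/_remote.repositories"
```

Without this file, some Maven versions treat the manually placed artifacts as untracked and may try to download them again.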
At that point, configure the corresponding remote repository in Spark's pom.xml. A typical configuration:
Find this block:
  <repository>
    <id>central</id>
    <name>Maven Repository</name>
    <url>https://repo1.maven.org/maven2</url>
    <releases>
      <enabled>true</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
Immediately below it, add the cloudera and aliyun repositories; these usually resolve the problem:
  <repository>
    <id>cloudera</id>
    <name>cloudera Repository</name>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
  </repository>
  <repository>
    <id>alimaven</id>
    <name>aliyun maven</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
  </repository>
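Instead of editing each project's pom.xml, a similar effect can be achieved globally with a mirror entry in apache-maven-3.3.9/conf/settings.xml. A sketch (note this mirrors only central, so project-specific repositories such as cloudera are still needed for CDH artifacts):

```xml
<mirrors>
  <mirror>
    <id>alimaven</id>
    <name>aliyun maven</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
    <mirrorOf>central</mirrorOf>
  </mirror>
</mirrors>
```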
2. Once the configuration is in place, run the mvn command again. In most cases the problem is resolved.
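If the error persists even though the repository configuration is now correct, note the hint in the first error above: the failure "was cached in the local repository" and "resolution will not be reattempted until the update interval of central has elapsed or updates are forced". Failed downloads leave *.lastUpdated marker files behind; removing them lets Maven retry immediately. A sketch against a throwaway directory (substitute your real localRepository path):

```shell
# Throwaway demo repo; in practice this is the localRepository from settings.xml.
REPO=/tmp/m2_demo_repo
mkdir -p "$REPO/net/sf/jopt-simple/jopt-simple/4.9"
# Simulate the marker file a failed download leaves behind.
touch "$REPO/net/sf/jopt-simple/jopt-simple/4.9/jopt-simple-4.9.pom.lastUpdated"
# Delete all cached-failure markers so the next build retries the downloads.
find "$REPO" -name '*.lastUpdated' -delete
```

Afterwards rerun the build with -U (e.g. mvn -U -DskipTests clean package), which forces Maven to re-check the remote repositories instead of honoring the cached failure.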
Note: it is best to have network access to Google for searching down any remaining errors.