Integrating Zeppelin with CDH


Binary installation of zeppelin-0.8.0 on CDH:
Download zeppelin-0.8.0-bin-all.tgz from the download page, upload it to the server, and extract it: tar -zxvf zeppelin-0.8.0-bin-all.tgz

cd zeppelin-0.8.0-bin-all/conf/

cp  zeppelin-env.sh.template zeppelin-env.sh

cp zeppelin-site.xml.template zeppelin-site.xml

Note: the big data cluster here is managed by Cloudera Manager (CM).

Configure the Zeppelin environment: vim zeppelin-env.sh

export HIVE_HOME=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/hive

export JAVA_HOME=/usr/java/jdk1.8.0_161

export MASTER=local[*]

export ZEPPELIN_JAVA_OPTS="-Dmaster=yarn-client -Dspark.yarn.jar=/home/zepplin/zeppelin-0.8.0-bin-all/interpreter/spark/spark-interpreter-0.8.0.jar"

export DEFAULT_HADOOP_HOME=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/hadoop

export SPARK_HOME=/opt/cloudera/parcels/SPARK2/lib/spark2

export HADOOP_HOME=${HADOOP_HOME:-$DEFAULT_HADOOP_HOME}

if [ -n "$HADOOP_HOME" ]; then

  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${HADOOP_HOME}/lib/native

fi

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop/conf}
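The HADOOP_HOME and HADOOP_CONF_DIR lines rely on the shell's `${VAR:-default}` expansion: a value already present in the environment wins, and the default applies only when the variable is unset or empty. A small sketch of that behaviour follows; `resolve_hadoop_home` is a name invented for illustration and is not part of zeppelin-env.sh.

```shell
#!/bin/sh
# Illustrative only: the default mirrors the parcel path used in this post.
DEFAULT_HADOOP_HOME=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/hadoop

# resolve_hadoop_home prints the value that the export line would end up with:
# the caller's HADOOP_HOME if it is set and non-empty, otherwise the default.
resolve_hadoop_home() {
  printf '%s\n' "${HADOOP_HOME:-$DEFAULT_HADOOP_HOME}"
}
```

Running `HADOOP_HOME=/custom/hadoop resolve_hadoop_home` in the same shell should print the custom path, which is why an existing HADOOP_HOME on the host silently overrides the parcel path.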

 

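Configure the Zeppelin port: vim zeppelin-site.xml

<property>
  <name>zeppelin.server.port</name>
  <value>38099</value>
  <description>Server port.</description>
</property>

The value is the port number; set it to whatever suits your environment.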
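Since the port lives in zeppelin-site.xml, a script that needs it (a health check, for instance) can read it back instead of hard-coding it. This is a minimal sketch that assumes the usual Hadoop-style layout of `<name>` followed by `<value>`; `get_site_property` is a helper name made up here, not something Zeppelin ships, and it is not a full XML parser.

```shell
#!/bin/sh
# get_site_property FILE NAME
# Prints the first <value> found on or after the line that contains
# <name>NAME</name>. Assumes each <value> is written on a single line.
get_site_property() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}
```

For example, `get_site_property conf/zeppelin-site.xml zeppelin.server.port` would be expected to print the port configured in this step.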

Configuration is complete. cd zeppelin-0.8.0-bin-all/bin/

./zeppelin-daemon.sh start

 

Access: http://YourIp:38099

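Installing Zeppelin from source on CDH:

At first I downloaded the binary package (zeppelin-0.7.3-bin-all.tgz) and installed it directly. That part is simple: extract it and run ./bin/zeppelin-daemon.sh start. Running the official example, however, produced the following error: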

java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
at org.apache.spark.repl.SparkILoop.<init>(SparkILoop.scala:936)
at org.apache.spark.repl.SparkILoop.<init>(SparkILoop.scala:70)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:790)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
.......

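The error says a Scala method cannot be found. I checked the Scala 2.11 source and it has no such method, while the Scala bundled with our CDH 5.12.1 is version 2.10, which points to a Scala version mismatch between the prebuilt package and the cluster. So we decided to compile and install Zeppelin ourselves.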
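Before choosing a build profile it helps to confirm which Scala line the cluster actually ships. One rough way is to look at the scala-library jar name. The sketch below assumes such a jar sits directly in the directory you pass in (on a parcel install that might be something like /opt/cloudera/parcels/CDH/jars, but that location is an assumption), and `scala_binary_version` is a hypothetical helper.

```shell
#!/bin/sh
# scala_binary_version DIR
# Prints the Scala binary version (major.minor, e.g. 2.10) taken from the
# first scala-library-<version>.jar found directly in DIR.
# Returns 1 and prints nothing if no such jar exists.
scala_binary_version() {
  for jar in "$1"/scala-library-*.jar; do
    [ -e "$jar" ] || continue
    basename "$jar" .jar | sed -E 's/^scala-library-([0-9]+\.[0-9]+).*/\1/'
    return 0
  done
  return 1
}
```

If this prints 2.10, the build would need the scala-2.10 profile, as in the command used in the next step.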

Compilation brought its own series of errors. Source archive: zeppelin-0.7.3.tgz

Build:

[C:\Users\yiming\Desktop\zeppelin-0.7.3]$ mvn clean package -Pbuild-distr -Pyarn -Dspark.version=1.6.0 -Dhadoop.version=2.6.0-cdh5.12.1 -Pscala-2.10 -Ppyspark -Psparkr -Pvendor-repo -DskipTests

Error:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (zip-pyspark-files) on project zeppelin-spark-dependencies_2.10: An Ant BuildException has occured: Warning: Could not find file C:\Users\yiming\Desktop\zeppelin2\zeppelin-0.7.3\zeppelin-0.7.3\spark-dependencies\target\spark-1.6.0\python\lib\py4j-0.8.2.1-src.zip to copy.
[ERROR] around Ant part ...... @ 5:188 in C:\Users\yiming\Desktop\zeppelin2\zeppelin-0.7.3\zeppelin-0.7.3\spark-dependencies\target\antrun\build-main.xml
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command

[ERROR]   mvn -rf :zeppelin-spark-dependencies_2.10

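The build cannot find the py4j package it expects. Looking in that directory shows that it contains py4j-0.9-src.zip instead, so find the corresponding version of the package in the Maven repository and copy it over.

The next build failed again: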
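To see the mismatch quickly, compare the file name in the Ant error with what is really in the Spark python/lib directory. The helper below is a small sketch; `py4j_zip_name` is an illustrative name, and the directory you pass in is whatever path the error message reports.

```shell
#!/bin/sh
# py4j_zip_name DIR
# Prints the file name of the first py4j-*-src.zip found in DIR
# (for example <spark>/python/lib). Returns 1 if there is none.
py4j_zip_name() {
  for zip in "$1"/py4j-*-src.zip; do
    [ -e "$zip" ] || continue
    basename "$zip"
    return 0
  done
  return 1
}
```

If the printed name differs from the one in the error (py4j-0.8.2.1-src.zip in my case), the build and the Spark distribution disagree about the py4j version.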

[INFO] BUILD FAILURE

[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:24 min
[INFO] Finished at: 2018-04-20T14:39:52+08:00
[INFO] Final Memory: 135M/1506M
[INFO] ------------------------------------------------------------------------
in-spark-dependencies_2.10:jar:0.7.3: Could not find artifact org.apache.hadoop:hadoop-client:jar:2.6.0-cdh5.7.0-SNAPSHOT in nexus (http://192.168.30.112:8081/nexus/content/groups/public) -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

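I never found the cause of this one. Even after reading the source I could not work out why, although I had specified the cdh5.12.1 package, Maven tried to download a cdh5.7.0 snapshot package instead.

Next, I removed -Dspark.version=1.6.0 from the build command and compiled again: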

[C:\Users\yiming\Desktop\zeppelin-0.7.3]$mvn clean package -Pbuild-distr -Pyarn -Pspark-1.6 -Ppyspark -Dhadoop.version=2.6.0-cdh5.12.1 -Phadoop-2.6   -DskipTests

This produced the following error:

[WARNING] warning [email protected]: Deprecated
[WARNING] warning [email protected]: ??  Thanks for using Babel: we recommend using babel-preset-env now: please read babeljs.io/env to update! 
[WARNING] warning grunt > [email protected]: CoffeeScript on NPM has moved to "coffeescript" (no hyphen)
[WARNING] warning grunt > [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
[WARNING] warning grunt > glob > [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
[WARNING] warning grunt > findup-sync > glob > [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
[WARNING] warning grunt > glob > [email protected]: please upgrade to graceful-fs 4 for compatibility with current and future versions of Node.js
[WARNING] warning load-grunt-tasks > multimatch > [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
[WARNING] warning grunt-wiredep > wiredep > glob > [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
[WARNING] warning grunt-google-fonts > cssparser > [email protected]: Package no longer supported. Contact [email protected] for more info.
[WARNING] warning grunt-htmlhint > htmlhint > jshint > [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
[WARNING] warning grunt-replace > applause > cson-parser > [email protected]: CoffeeScript on NPM has moved to "coffeescript" (no hyphen)
[WARNING] warning grunt-wiredep > wiredep > bower-config > [email protected]: please upgrade to graceful-fs 4 for compatibility with current and future versions of Node.js
[ERROR] error An unexpected error occurred: "https://registry.yarnpkg.com/autoprefixer: connect ETIMEDOUT 104.16.63.173:443".
[INFO] info If you think this is a bug, please open a bug report with the information provided in "C:\\Users\\yiming\\Desktop\\zeppelin2\\zeppelin-0.7.3\\zeppelin-0.7.3\\zeppelin-web\\yarn-error.log".
[INFO] info Visit https://yarnpkg.com/en/docs/cli/install for documentation about this command.
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Zeppelin ........................................... SUCCESS [  4.689 s]
[INFO] Zeppelin: Interpreter .............................. SUCCESS [ 20.595 s]
[INFO] Zeppelin: Zengine .................................. SUCCESS [ 16.617 s]
[INFO] Zeppelin: Display system apis ...................... SUCCESS [ 15.265 s]
[INFO] Zeppelin: Spark dependencies ....................... SUCCESS [03:20 min]
[INFO] Zeppelin: Spark .................................... SUCCESS [ 26.079 s]
[INFO] Zeppelin: Markdown interpreter ..................... SUCCESS [  2.110 s]
[INFO] Zeppelin: Angular interpreter ...................... SUCCESS [  1.747 s]
[INFO] Zeppelin: Shell interpreter ........................ SUCCESS [  1.081 s]
[INFO] Zeppelin: Livy interpreter ......................... SUCCESS [ 15.909 s]
[INFO] Zeppelin: HBase interpreter ........................ SUCCESS [  9.540 s]
[INFO] Zeppelin: Apache Pig Interpreter ................... SUCCESS [ 12.706 s]
[INFO] Zeppelin: PostgreSQL interpreter ................... SUCCESS [  2.039 s]
[INFO] Zeppelin: JDBC interpreter ......................... SUCCESS [  2.582 s]
[INFO] Zeppelin: File System Interpreters ................. SUCCESS [  2.553 s]
[INFO] Zeppelin: Flink .................................... SUCCESS [ 12.948 s]
[INFO] Zeppelin: Apache Ignite interpreter ................ SUCCESS [  4.426 s]
[INFO] Zeppelin: Kylin interpreter ........................ SUCCESS [  1.072 s]
[INFO] Zeppelin: Python interpreter ....................... SUCCESS [  8.722 s]
[INFO] Zeppelin: Lens interpreter ......................... SUCCESS [  8.945 s]
[INFO] Zeppelin: Apache Cassandra interpreter ............. SUCCESS [ 48.842 s]
[INFO] Zeppelin: Elasticsearch interpreter ................ SUCCESS [  5.995 s]
[INFO] Zeppelin: BigQuery interpreter ..................... SUCCESS [  2.445 s]
[INFO] Zeppelin: Alluxio interpreter ...................... SUCCESS [  5.870 s]
[INFO] Zeppelin: Scio ..................................... SUCCESS [ 42.271 s]
[INFO] Zeppelin: web Application .......................... FAILURE [01:14 min]
[INFO] Zeppelin: Server ................................... SKIPPED
[INFO] Zeppelin: Packaging distribution ................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 09:10 min
[INFO] Finished at: 2018-04-20T15:55:23+08:00
[INFO] Final Memory: 452M/1686M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.3:yarn (yarn install) on project zeppelin-web: Failed to run task: 'yarn install --no-lockfile' failed. (error code 1) -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn -rf :zeppelin-web

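The log is full of messages about outdated versions, so we opened the pom file in the web Application source directory, found that the configured yarn version was too low, and changed v0.18.1 to v0.28.1. The build finally passed. It took many attempts to resolve these errors. All I want to say here is that this is the downside of open-source software: integration is far too painful. In big data work, most of the time goes into version compatibility problems, which is exhausting.

The next step is to install the compiled package. Take zeppelin-0.7.3.tar.gz from the zeppelin-0.7.3\zeppelin-distribution directory, upload it to the server, extract it, and add the following to Zeppelin's zeppelin-env.sh: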

export JAVA_HOME=/opt/java
export HADOOP_CONF_DIR=/etc/hadoop/conf:/etc/hive/conf
export HADOOP_HOME=/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop
export SPARK_HOME=/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/spark
export MASTER=yarn-client
export ZEPPELIN_LOG_DIR=/var/log/zeppelin
export ZEPPELIN_PID_DIR=/var/run/zeppelin
export ZEPPELIN_WAR_TEMPDIR=/var/tmp/zeppelin

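I compiled on Windows and installed on a Linux server.

Actually using it brought yet more problems.

Open the Zeppelin web UI at http://192.168.xxx.xxx:8080/, create a new notebook, and run one of the official examples that uses the Spark interpreter: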

import org.apache.commons.io.IOUtils
import java.net.URL
import java.nio.charset.Charset

// Zeppelin creates and injects sc (SparkContext) and sqlContext (HiveContext or SqlContext)
// So you don't need create them manually
// load bank data
// val bankText = sc.parallelize(
//     IOUtils.toString(
//         new URL("https://s3.amazonaws.com/apache-zeppelin/tutorial/bank/bank.csv"),
//         Charset.forName("utf8")).split("\n"))
val bankText = sc.textFile("/tmp/bank.csv")
case class Bank(age: Integer, job: String, marital: String, education: String, balance: Integer)

val bank = bankText.map(s => s.split(";")).filter(s => s(0) != "\"age\"").map(
    s => Bank(s(0).toInt, 
            s(1).replaceAll("\"", ""),
            s(2).replaceAll("\"", ""),
            s(3).replaceAll("\"", ""),
            s(5).replaceAll("\"", "").toInt
        )
).toDF()
bank.registerTempTable("bank")
bank.show(10)
It reports the following error:
java.lang.NoSuchMethodError: org.apache.hadoop.ipc.Client.getRpcTimeout(Lorg/apache/hadoop/conf/Configuration;)I
at org.apache.hadoop.hdfs.DFSClient$Conf.<init>(DFSClient.java:355)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:690)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:673)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:155)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1688)
at org.apache.spark.scheduler.EventLoggingListener.<init>(EventLoggingListener.scala:66)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:555)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_1(SparkInterpreter.java:499)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:389)
at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:843)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
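It turns out this code path uses the hadoop-common-2.6.0.jar in Zeppelin's lib directory. Checking the source confirms that it has no getRpcTimeout(configuration) method, only getRpcTimeout(). getRpcTimeout(configuration) is a method added in CDH's packaging; with it there is no need to specify the Hadoop NameNode address, because the YARN and NameNode addresses are read directly from yarn-site.xml and core-site.xml. The fix is to back up the original hadoop-*-2.6.0.jar files in Zeppelin's lib directory and create symbolic links with the same names that point to the CDH jars: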

[root@xxx-7 lib]#mv /opt/zeppelin/lib/hadoop-common-2.6.0.jar /opt/zeppelin/lib/hadoop-common-2.6.0.jar.bak
[root@xxx-7 lib]#mv /opt/zeppelin/lib/hadoop-auth-2.6.0.jar /opt/zeppelin/lib/hadoop-auth-2.6.0.jar.bak
[root@xxx-7 lib]#mv /opt/zeppelin/lib/hadoop-annotations-2.6.0.jar /opt/zeppelin/lib/hadoop-annotations-2.6.0.jar.bak
[root@xxx-7 lib]#ln -s  /opt/cloudera/parcels/CDH/jars/hadoop-common-2.6.0-cdh5.12.1.jar /opt/zeppelin/lib/hadoop-common-2.6.0.jar
[root@xxx-7 lib]#ln -s  /opt/cloudera/parcels/CDH/jars/hadoop-auth-2.6.0-cdh5.12.1.jar /opt/zeppelin/lib/hadoop-auth-2.6.0.jar
[root@xxx-7 lib]#ln -s  /opt/cloudera/parcels/CDH/jars/hadoop-annotations-2.6.0-cdh5.12.1.jar /opt/zeppelin/lib/hadoop-annotations-2.6.0.jar
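The same back-up-then-link step repeats for every jar, so it can be wrapped in a small function. This is a sketch rather than a tested deployment script; `relink_jar` is a name I made up, and the paths in the usage note are the ones from the commands above.

```shell
#!/bin/sh
# relink_jar LIB_DIR JAR_NAME TARGET_JAR
# Backs up LIB_DIR/JAR_NAME to JAR_NAME.bak (only when it is a regular file
# and no backup exists yet), then makes JAR_NAME a symlink to TARGET_JAR.
# Fails without touching anything if TARGET_JAR does not exist.
relink_jar() {
  lib_dir=$1
  jar_name=$2
  target=$3
  [ -f "$target" ] || { echo "missing target: $target" >&2; return 1; }
  if [ -f "$lib_dir/$jar_name" ] && [ ! -L "$lib_dir/$jar_name" ] \
     && [ ! -e "$lib_dir/$jar_name.bak" ]; then
    mv "$lib_dir/$jar_name" "$lib_dir/$jar_name.bak"
  fi
  ln -sfn "$target" "$lib_dir/$jar_name"
}
```

A call such as `relink_jar /opt/zeppelin/lib hadoop-common-2.6.0.jar /opt/cloudera/parcels/CDH/jars/hadoop-common-2.6.0-cdh5.12.1.jar` would then stand in for one mv/ln pair. Checking that the target exists first avoids leaving a dangling link if the parcel path differs on your cluster.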
Running it again produced the following error:

com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope)
at [Source: {"id":"0","name":"textFile"}; line: 1, column: 1]
at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:843)
at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.addBeanProps(BeanDeserializerFactory.java:533)
at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.buildBeanDeserializer(BeanDeserializerFactory.java:220)
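Same kind of problem: a jar conflict again. Replace the corresponding jars under Zeppelin's lib with the ones shipped with CDH, backing up the originals first. This article is a useful reference: https://www.iteblog.com/archives/1570.html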

[root@xxx-7 lib]#ln -s  /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/jars/jackson-annotations-2.3.1.jar ../lib/jackson-annotations-2.3.1.jar
[root@xxx-7 lib]#ln -s  /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/jars/jackson-core-2.3.1.jar ../lib/jackson-core-2.3.1.jar
[root@xxx-7 lib]#ln -s  /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/jars/jackson-databind-2.3.1.jar ../lib/jackson-databind-2.3.1.jar
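When chasing this kind of conflict it is handy to list which versions of a given artifact sit in each directory before swapping anything. The function below is a minimal sketch; `jar_versions` is a hypothetical helper and it only understands the plain `<artifact>-<version>.jar` naming.

```shell
#!/bin/sh
# jar_versions DIR ARTIFACT
# Prints one version per line for every DIR/ARTIFACT-<version>.jar whose
# version starts with a digit. Prints nothing if there is no match.
jar_versions() {
  for jar in "$1"/"$2"-[0-9]*.jar; do
    [ -e "$jar" ] || continue
    name=$(basename "$jar" .jar)
    version=${name#"$2"-}
    printf '%s\n' "$version"
  done
}
```

Comparing `jar_versions /opt/zeppelin/lib jackson-databind` with the same call on the CDH jars directory shows at a glance whether the two sides disagree.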
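At last, Zeppelin runs successfully.

Pitfalls encountered when building Zeppelin from source

A recent project required installing Zeppelin from source, so I gave it a try.

Official guide: http://zeppelin.apache.org/docs/0.7.3/install/build.html

Step 1. Determine the system environment and the required software versions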

 

Centos6.7
Oracle JDK 1.8
Hadoop 2.6.0-cdh5.10.2
hbase 1.2.0-cdh5.10.2
spark 2.2.0

These are the versions we use. Because Spark 2.2 uses Scala 2.11 by default, change the Scala version to 2.11:

 

# update all pom.xml to use scala 2.11
./dev/change_scala_version.sh 2.11

Step 2. Compile and package

 
mvn clean package -Pbuild-distr,helium-dev,hadoop-2.6,r,spark-2.2,vendor-repo -Dhadoop.version=2.6.0-cdh5.10.2 -Dhbase.hbase.version=1.2.0-cdh5.10.2 -Dhbase.hadoop.version=2.6.0-cdh5.10.2 -DskipTests
 

The first problem

Failed to execute goal org.scala-tools:maven-scala-plugin:2.15.2:compile (default) on project zeppelin-display_2.11: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 1(Exit value: 1) -> [Help 1]

This message misled me at first. The real cause is in the output above it:

Compiling 6 source files to /tmp/zeppelin/zeppelin-display/target/classes at 1516876315586
[ERROR] /tmp/zeppelin/zeppelin-display/src/main/scala/org/apache/zeppelin/display/angular/AbstractAngularElem.scala:25: error: object xml is not a member of package scala
[INFO] import scala.xml._

Scala 2.10 includes the xml package directly, but in 2.11 it was removed from the core library and moved into the separate scala-xml module. So add the scala-2.11 profile to the build:

 

mvn clean package -Pbuild-distr,helium-dev,hadoop-2.6,r,spark-2.2,vendor-repo -Dhadoop.version=2.6.0-cdh5.10.2 -Dhbase.hbase.version=1.2.0-cdh5.10.2 -Dhbase.hadoop.version=2.6.0-cdh5.10.2 -Pscala-2.11 -DskipTests


The second problem: the zeppelin-zengine and zeppelin-server modules both hit the same dependency version conflict.

Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:1.3.1:enforce (enforce) on project zeppelin-zengine: org.apache.maven.plugins.enforcer.DependencyConvergence failed with message:
[ERROR] Failed while enforcing releasability the error(s) are [
[ERROR] Dependency convergence error for com.amazonaws:aws-java-sdk-core:1.10.62 paths to dependency are:
[ERROR] +-org.apache.zeppelin:zeppelin-zengine:0.8.0-SNAPSHOT
[ERROR] +-com.amazonaws:aws-java-sdk-s3:1.10.62
[ERROR] +-com.amazonaws:aws-java-sdk-kms:1.10.62
[ERROR] +-com.amazonaws:aws-java-sdk-core:1.10.62
[ERROR] and
[ERROR] +-org.apache.zeppelin:zeppelin-zengine:0.8.0-SNAPSHOT
[ERROR] +-com.amazonaws:aws-java-sdk-s3:1.10.62
[ERROR] +-com.amazonaws:aws-java-sdk-core:1.10.62
[ERROR] and
[ERROR] +-org.apache.zeppelin:zeppelin-zengine:0.8.0-SNAPSHOT
[ERROR] +-org.apache.hadoop:hadoop-client:2.6.0-cdh5.10.2
[ERROR] +-org.apache.hadoop:hadoop-aws:2.6.0-cdh5.10.2
[ERROR] +-com.amazonaws:aws-java-sdk-sts:1.10.6
[ERROR] +-com.amazonaws:aws-java-sdk-core:1.10.6

Based on the conflicting versions reported, add the following to the project's pom.xml:

 

 
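<dependency>
  <groupId>org.codehaus.jackson</groupId>
  <artifactId>jackson-mapper-asl</artifactId>
  <version>1.9.13</version>
</dependency>
<dependency>
  <groupId>org.codehaus.jackson</groupId>
  <artifactId>jackson-core-asl</artifactId>
  <version>1.9.13</version>
</dependency>
<dependency>
  <groupId>org.apache.zookeeper</groupId>
  <artifactId>zookeeper</artifactId>
  <version>3.4.6</version>
</dependency>
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-s3</artifactId>
  <version>1.10.62</version>
</dependency>
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-core</artifactId>
  <version>1.10.62</version>
</dependency>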
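The enforcer output is long, and what matters is which versions of one artifact are being pulled in. A rough way to pull that out of a saved build log is sketched below; `conflicting_versions` is an illustrative helper that reads the log on standard input and relies on `grep -o`, which GNU, BSD and BusyBox grep all provide.

```shell
#!/bin/sh
# conflicting_versions GROUP:ARTIFACT
# Reads Maven enforcer output on stdin and prints the distinct versions of
# GROUP:ARTIFACT that appear in "+-group:artifact:version" tree lines.
conflicting_versions() {
  grep -o -- "+-$1:[^ ]*" | sed "s|^+-$1:||" | sort -u
}
```

For the log above, `conflicting_versions com.amazonaws:aws-java-sdk-core < build.log` would be expected to list both 1.10.6 and 1.10.62, which is the pair that has to be pinned to a single version.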

 

 

The third problem: an error while zeppelin-web installs phantomjs.

Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.3:npm (npm install) on project zeppelin-web: Failed to run task: 'npm install --no-lockfile' failed. (error code 1) -> [Help 1]

Running 'npm install --no-lockfile' in /tmp/zeppelin/zeppelin-web
[INFO]
[INFO] > [email protected] install /tmp/zeppelin/zeppelin-web/node_modules/phantomjs-prebuilt
[INFO] > node install.js
[INFO]
[INFO] PhantomJS not found on PATH
[INFO] Downloading https://github.com/Medium/phantomjs/releases/download/v2.1.1/phantomjs-2.1.1-linux-x86_64.tar.bz2
[INFO] Saving to /tmp/phantomjs/phantomjs-2.1.1-linux-x86_64.tar.bz2
[INFO] Receiving...
[INFO]
[INFO] Install exited unexpectedly
[WARNING] npm WARN optional SKIPPING OPTIONAL DEPENDENCY: [email protected] (node_modules/fsevents):
[WARNING] npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for [email protected]: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"x64"})
[ERROR]
[ERROR] npm ERR! code ELIFECYCLE
[ERROR] npm ERR! errno 1
[ERROR] npm ERR! [email protected] install: `node install.js`
[ERROR] npm ERR! Exit status 1
[ERROR] npm ERR!
[ERROR] npm ERR! Failed at the [email protected] install script.
[ERROR] npm ERR! This is probably not a problem with npm. There is likely additional logging output above.


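I could not find the cause of this error at first. The build was originally run under /tmp; I later noticed that the disk there was full, moved the build to /opt, and it passed without this error.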
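A full disk is easy to rule out before starting a long build. The sketch below uses the portable `df -Pk` output format; `free_kb` and `has_space` are names invented for this example, and the threshold you pass is your own estimate, not a figure from the Zeppelin docs.

```shell
#!/bin/sh
# free_kb DIR
# Prints the available space, in KiB, on the filesystem that holds DIR.
free_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_space DIR MIN_KB
# Succeeds when the filesystem holding DIR has at least MIN_KB KiB free.
has_space() {
  avail=$(free_kb "$1") || return 1
  [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}
```

Something like `has_space /tmp 5000000 || echo "not enough room under /tmp"` before running mvn would have saved me the detour described above.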

[INFO] --- frontend-maven-plugin:1.4:npm (npm build) @ zeppelin-web ---
[DEBUG] Configuring mojo com.github.eirslett:frontend-maven-plugin:1.4:npm from plugin realm ClassRealm[plugin>com.github.eirslett:frontend-maven-plugin:1.4, parent: sun.misc.Launcher$AppClassLoader@7852e922]
[DEBUG] Configuring mojo 'com.github.eirslett:frontend-maven-plugin:1.4:npm' with basic configurator -->
[DEBUG]   (f) arguments = run build:dist
[DEBUG]   (f) npmInheritsProxyConfigFromMaven = true
[DEBUG]   (f) project = MavenProject: org.apache.zeppelin:zeppelin-web:0.8.1 @ /opt/soft/zeppelin-0.8.1/zeppelin-web/pom.xml
[DEBUG]   (f) repositorySystemSession = org.eclipse.aether.DefaultRepositorySystemSession@1b8834c1
[DEBUG]   (f) session = org.apache.maven.execution.MavenSession@2330e3e0
[DEBUG]   (f) skip = false
[DEBUG]   (f) skipTests = true
[DEBUG]   (f) testFailureIgnore = false
[DEBUG]   (f) workingDirectory = /opt/soft/zeppelin-0.8.1/zeppelin-web
[DEBUG]   (f) execution = com.github.eirslett:frontend-maven-plugin:1.4:npm {execution: npm build}
[DEBUG] -- end configuration --
[INFO] Running 'npm run build:dist' in /opt/soft/zeppelin-0.8.1/zeppelin-web
[DEBUG] Executing command line [/opt/soft/zeppelin-0.8.1/zeppelin-web/node/node, /opt/soft/zeppelin-0.8.1/zeppelin-web/node/node_modules/npm/bin/npm-cli.js, run, build:dist]
[INFO] 
[INFO] > [email protected] build:dist /opt/soft/zeppelin-0.8.1/zeppelin-web
[INFO] > npm-run-all prebuild && grunt pre-webpack-dist && webpack && grunt post-webpack-dist
[INFO] 
[INFO] 
[INFO] > [email protected] prebuild /opt/soft/zeppelin-0.8.1/zeppelin-web
[INFO] > npm-run-all clean lint:once
[INFO] 
[INFO] 
[INFO] > [email protected] clean /opt/soft/zeppelin-0.8.1/zeppelin-web
[INFO] > rimraf dist && rimraf .tmp
[INFO] 
[INFO] 
[INFO] > [email protected] lint:once /opt/soft/zeppelin-0.8.1/zeppelin-web
[INFO] > eslint src
[INFO] 
[INFO] 
[INFO] /opt/soft/zeppelin-0.8.1/zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js
[INFO]    981:22  warning  Unexpected 'this'  no-invalid-this
[INFO]    982:9   warning  Unexpected 'this'  no-invalid-this
[INFO]    983:17  warning  Unexpected 'this'  no-invalid-this
[INFO]    985:17  warning  Unexpected 'this'  no-invalid-this
[INFO]    988:5   warning  Unexpected 'this'  no-invalid-this
[INFO]    989:15  warning  Unexpected 'this'  no-invalid-this
[INFO]    989:47  warning  Unexpected 'this'  no-invalid-this
[INFO]   1003:5   warning  Unexpected 'this'  no-invalid-this
[INFO]   1008:22  warning  Unexpected 'this'  no-invalid-this
[INFO]   1009:9   warning  Unexpected 'this'  no-invalid-this
[INFO]   1010:17  warning  Unexpected 'this'  no-invalid-this
[INFO]   1012:17  warning  Unexpected 'this'  no-invalid-this
[INFO]   1014:5   warning  Unexpected 'this'  no-invalid-this
[INFO]   1015:15  warning  Unexpected 'this'  no-invalid-this
[INFO]   1015:47  warning  Unexpected 'this'  no-invalid-this
[INFO]   1034:5   warning  Unexpected 'this'  no-invalid-this
[INFO] 
[INFO] /opt/soft/zeppelin-0.8.1/zeppelin-web/src/app/visualization/builtins/visualization-d3network.js
[INFO]   172:19  warning  Unexpected 'this'  no-invalid-this
[INFO]   185:19  warning  Unexpected 'this'  no-invalid-this
[INFO] 
[INFO] /opt/soft/zeppelin-0.8.1/zeppelin-web/src/components/note-action/note-action.service.js
[INFO]    20:3  warning  Unexpected 'this'  no-invalid-this
[INFO]    36:3  warning  Unexpected 'this'  no-invalid-this
[INFO]    49:3  warning  Unexpected 'this'  no-invalid-this
[INFO]    66:3  warning  Unexpected 'this'  no-invalid-this
[INFO]    80:3  warning  Unexpected 'this'  no-invalid-this
[INFO]    94:3  warning  Unexpected 'this'  no-invalid-this
[INFO]   108:3  warning  Unexpected 'this'  no-invalid-this
[INFO]   121:3  warning  Unexpected 'this'  no-invalid-this
[INFO]   131:3  warning  Unexpected 'this'  no-invalid-this
[INFO] 
[INFO] ✖ 27 problems (0 errors, 27 warnings)
[INFO] 
[INFO] Running "htmlhint:src" (htmlhint) task
[INFO] >> 40 files lint free.
[INFO] 
[INFO] Running "wiredep:test" (wiredep) task
[INFO] Warning: Error: Cannot find where you keep your Bower packages. Use --force to continue.
[INFO] 
[INFO] Aborted due to warnings.
[INFO] 
[INFO] 
[INFO] Execution Time (2019-02-21 03:03:19 UTC)
[INFO] loading tasks  209ms  ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 40%
[INFO] htmlhint:src   174ms  ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 33%
[INFO] wiredep:test   136ms  ▇▇▇▇▇▇▇▇▇▇▇▇▇ 26%
[INFO] Total 520ms
[INFO] 
[ERROR] npm ERR! code ELIFECYCLE
[ERROR] npm ERR! errno 3
[ERROR] npm ERR! [email protected] build:dist: `npm-run-all prebuild && grunt pre-webpack-dist && webpack && grunt post-webpack-dist`
[ERROR] npm ERR! Exit status 3
[ERROR] npm ERR! 
[ERROR] npm ERR! Failed at the [email protected] build:dist script.
[ERROR] npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
[ERROR] 
[ERROR] npm ERR! A complete log of this run can be found in:
[ERROR] npm ERR!     /root/.npm/_logs/2019-02-21T03_03_20_490Z-debug.log
[INFO] Zeppelin: web Application .......................... FAILURE [ 54.502 s]
[INFO] Zeppelin: Packaging distribution ................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 14:07 min
[INFO] Finished at: 2019-02-21T11:03:21+08:00
[INFO] Final Memory: 413M/6870M
[INFO] ------------------------------------------------------------------------
[WARNING] The requested profile "hadoop-2.6" could not be activated because it does not exist.
[WARNING] The requested profile "yarn" could not be activated because it does not exist.
[ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.4:npm (npm build) on project zeppelin-web: Failed to run task: 'npm run build:dist' failed. org.apache.commons.exec.ExecuteException: Process exited with an error: 3 (Exit value: 3) -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.4:npm (npm build) on project zeppelin-web: Failed to run task
 

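I am building Apache Zeppelin 0.8.0 with Maven because I need the advanced features Zeppelin provides, such as notebook authorization that allows "Runners". I have tried different versions of node and npm, but mvn clean package -DskipTests still fails with the following error while building Zeppelin: web Application.
Here is the error from the debug log: /root/.npm/_logs/2018-03-22T10_38_10_265Z-debug.log

When will this new version (0.8.0) be released?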

1 verbose cli [ '/root/zeppelin/zeppelin-web/node/node',
1 verbose cli   '/root/zeppelin/zeppelin-web/node/node_modules/npm/bin/npm-cli.js',
1 verbose cli   'run',
1 verbose cli   'build:dist' ]
2 info using [email protected]
3 info using [email protected]
4 verbose run-script [ 'prebuild:dist', 'build:dist', 'postbuild:dist' ]
5 info lifecycle [email protected]~prebuild:dist: [email protected]
6 info lifecycle [email protected]~build:dist: [email protected]
7 verbose lifecycle [email protected]~build:dist: unsafe-perm in lifecycle true
8 verbose lifecycle [email protected]~build:dist: PATH: /root/zeppelin/zeppelin-web/node/node_modules/npm/bin/node-gyp-bin:/root/zeppelin/zeppelin-web/node_modules/.bin:/root/zeppelin/zeppel$
9 verbose lifecycle [email protected]~build:dist: CWD: /root/zeppelin/zeppelin-web
10 silly lifecycle [email protected]~build:dist: Args: [ '-c',
10 silly lifecycle   'npm-run-all prebuild && grunt pre-webpack-dist && webpack && grunt post-webpack-dist' ]
11 silly lifecycle [email protected]~build:dist: Returned: code: 3  signal: null
12 info lifecycle [email protected]~build:dist: Failed to exec build:dist script
13 verbose stack Error: [email protected] build:dist: `npm-run-all prebuild && grunt pre-webpack-dist && webpack && grunt post-webpack-dist`
13 verbose stack Exit status 3
13 verbose stack     at EventEmitter.<anonymous> (/root/zeppelin/zeppelin-web/node/node_modules/npm/node_modules/npm-lifecycle/index.js:280:16)
13 verbose stack     at emitTwo (events.js:126:13)
13 verbose stack     at EventEmitter.emit (events.js:214:7)
13 verbose stack     at ChildProcess.<anonymous> (/root/zeppelin/zeppelin-web/node/node_modules/npm/node_modules/npm-lifecycle/lib/spawn.js:55:14)
13 verbose stack     at emitTwo (events.js:126:13)
13 verbose stack     at ChildProcess.emit (events.js:214:7)
13 verbose stack     at maybeClose (internal/child_process.js:925:16)
13 verbose stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:209:5)
14 verbose pkgid [email protected]
15 verbose cwd /root/zeppelin/zeppelin-web
16 verbose Linux 4.4.0-87-generic
17 verbose argv "/root/zeppelin/zeppelin-web/node/node" "/root/zeppelin/zeppelin-web/node/node_modules/npm/bin/npm-cli.js" "run" "build:dist"
18 verbose node v8.9.3
19 verbose npm  v5.5.1
20 error code ELIFECYCLE
21 error errno 3
22 error [email protected] build:dist: `npm-run-all prebuild && grunt pre-webpack-dist && webpack && grunt post-webpack-dist`
22 error Exit status 3
23 error Failed at the [email protected] build:dist script.

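You need to run the Maven build as a non-root user. If it is run as root, Bower will break the installation.

Use root only for administrative tasks (the prerequisites); put the git repo in user space and build it there.

The following should work as a regular user: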

#Prerequisites
sudo yum update -y
sudo yum install -y java-1.8.0-openjdk-devel git gcc-c++ make
#Using Node.js version 8 (version 10 / the current release also works).
curl -sL https://rpm.nodesource.com/setup_8.x | sudo -E bash -
sudo yum install -y nodejs fontconfig
curl -sL https://dl.yarnpkg.com/rpm/yarn.repo | sudo tee /etc/yum.repos.d/yarn.repo
sudo yum install -y yarn
npm config set strict-ssl false
npm install -g bower

#Maven Environment
mkdir /usr/local/maven
cd /usr/local/maven
wget http://apache.rediris.es/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz
tar xzvf apache-maven-3.5.4-bin.tar.gz --strip-components=1
sudo ln -s /usr/local/maven/bin/mvn /usr/local/bin/mvn
#Configure Maven to use more resources
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=1024m"

#Proxy Configs
#git config --global http.proxy http://your.company.proxy:port git config --global
#npm config set proxy http://your.company.proxy:8080
#npm config set https-proxy http://your.company.proxy:8080
#nano ~/.bowerrc
#{
#"proxy":"http://<PROXY_HOST>:<PROXY_PORT>",
#"https-proxy":"http://<PROXY_HOST>:<PROXY_PORT>"
#}

#Zeppelin Install
sudo useradd zeppelin
sudo su zeppelin
cd /home/zeppelin
git clone https://github.com/apache/zeppelin.git
cd zeppelin
mvn clean package -Dmaven.test.skip=true

Hope this helps.
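Because the root cause here is simply running the build as root, a wrapper script can refuse to start in that case. This is a minimal sketch; `is_root_uid` and `build_guard` are made-up names, and the uid is passed in explicitly so the check is easy to try out.

```shell
#!/bin/sh
# is_root_uid UID
# Succeeds when UID is 0, i.e. the root user.
is_root_uid() {
  [ "$1" -eq 0 ]
}

# build_guard UID
# Prints a short confirmation for a regular user; for root it prints a
# warning on stderr and returns 1 so the caller can stop before building.
build_guard() {
  if is_root_uid "$1"; then
    echo "refusing to build as root; switch to a regular user first" >&2
    return 1
  fi
  echo "ok to build as uid $1"
}
```

In a build script this could be chained as `build_guard "$(id -u)" && mvn clean package -Dmaven.test.skip=true`, so Maven never starts under root.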

Failed to execute goal com.github.eirslett:frontend-maven-plugin:0.0.23:npm (npm install) on project zeppelin-web: Failed to run task: 'npm install --color=false' failed. (error code 126) -> [Help 1]

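A web search suggested modifying the pom.xml under zeppelin-web:

<execution>
  <id>npm install</id>
  <goals>
    <goal>npm</goal>
  </goals>
</execution>

<execution>
  <id>bower install</id>
  <goals>
    <goal>bower</goal>
  </goals>
  <configuration>
    <arguments>--allow-root install</arguments>
  </configuration>
</execution>

<execution>
  <id>grunt build</id>
  <goals>
    <goal>grunt</goal>
  </goals>
  <configuration>
    <arguments>--no-color --force</arguments>
  </configuration>
</execution>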

[root@host-172-16-1-80 zeppelin-web]# npm install
[root@host-172-16-1-80 zeppelin-web]# bower --allow-root install
[root@host-172-16-1-80 zeppelin-web]# grunt --force
[root@host-172-16-1-80 zeppelin-web]# mvn install -DskipTests

The fourth problem:

 

[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:exec (default) on project zeppelin-zrinterpreter_2.11: Command execution failed. Process exited with an error: 127 (Exit value: 127) -> [Help 1]

[INFO] --- exec-maven-plugin:1.2.1:exec (default) @ zeppelin-zrinterpreter_2.11 ---
+++ dirname R/install-dev.sh
++ cd R
++ pwd
+ FWDIR=/opt/zeppelin/r/R
+ LIB_DIR=/opt/zeppelin/r/R/../../R/lib
+ mkdir -p /opt/zeppelin/r/R/../../R/lib
+ pushd /opt/zeppelin/r/R
+ R CMD INSTALL --library=/opt/zeppelin/r/R/../../R/lib /opt/zeppelin/r/R/rzeppelin/
R/install-dev.sh: line 38: R: command not found

The build needs the R command-line environment. The fix:

yum install -y epel-release
yum install -y R

The -y flag makes the R installation non-interactive.
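
Exit status 127 in the log above is the shell's conventional "command not found", and 126 (seen in the earlier npm failure) conventionally means a command was found but could not be executed. A small preflight check, sketched here under the assumption that the build needs the tools named in this write-up, can surface a missing command before a long Maven run. The function name missing_tools is made up for illustration.

```shell
# Print the name of every command that cannot be found on PATH.
# The list of tools to check is supplied by the caller (an assumption:
# which tools matter depends on the Zeppelin modules being built).
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}
```

For example, running missing_tools mvn git npm R before the build prints nothing when all four commands are available, and otherwise prints one missing name per line.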

After installing R, the next problem appeared:

ERROR: dependency ‘evaluate’ is not available for package ‘rzeppelin’
* removing ‘/opt/zeppelin/R/lib/rzeppelin’

The evaluate dependency is missing, so install it from within R:

 

R
>install.packages("evaluate")
 
>q()

 

References:

http://blog.csdn.net/zhanhong39/article/details/47749023

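Deployment environment

Name            Value                                                             Notes
Oracle JDK      1.7 (set JAVA_HOME)
OS              Mac OSX, Ubuntu 14.X, CentOS 6.X, RedHat 5.X, Windows 7 Pro SP1
Hadoop cluster  Spark-1.6.0, Hive1.1.0, CDH 5.13.3-1                              All related components are deployed and running normally (business-unit big data test server with internet access)
Zeppelin        0.7.3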

 

Deployment steps
Download the installation media
Zeppelin package: wget http://archive.apache.org/dist/zeppelin/zeppelin-0.7.3/zeppelin-0.7.3-bin-all.tgz

Extract and configure
1) In this environment all installation media are kept under /opt. After the download finishes:

tar -zxvf zeppelin-0.7.3-bin-all.tgz

2) The extracted files may not be owned by root, so change the owner to root recursively:

chown -R root zeppelin-0.7.3-bin-all

3) Interpreter dependency jars

All of them go in /opt/zeppelin-0.7.3-bin-all/lib.

Hive-related dependencies:

wget http://central.maven.org/maven2/org/apache/hive/hive-jdbc/0.14.0/hive-jdbc-0.14.0.jar

wget http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/2.6.0/hadoop-common-2.6.0.jar

wget http://central.maven.org/maven2/mysql/mysql-connector-java/5.1.38/mysql-connector-java-5.1.38.jar

The CDH build of hadoop-common, hadoop-common-2.6.0-cdh5.13.3.jar, is copied in as well:

cp hadoop-common-2.6.0-cdh5.13.3.jar /opt/zeppelin-0.7.3-bin-all/lib

For the Spark-related dependencies, see the configuration in the Spark verification section.
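
The wget URLs in step 3 all follow the standard Maven repository layout: the group ID with dots turned into slashes, then the artifact ID, the version, and a file named artifact-version.jar. The helper below is a minimal sketch of that mapping; the function name maven_jar_path is an illustration and not part of Zeppelin or Maven.

```shell
# Map Maven coordinates (groupId, artifactId, version) to the jar's
# relative path in a standard Maven repository layout.
maven_jar_path() {
  group_path=$(printf '%s' "$1" | tr '.' '/')
  printf '%s/%s/%s/%s-%s.jar\n' "$group_path" "$2" "$3" "$2" "$3"
}
```

Prefixing the result with a repository base URL gives the download link, for instance wget "http://central.maven.org/maven2/$(maven_jar_path org.apache.hive hive-jdbc 0.14.0)". The same coordinates are what the Hive interpreter's Dependencies setting expects in group:artifact:version form.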

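4) Customize conf/zeppelin-site.xml

The service starts fine without this file, but it is worth configuring to make sure the port does not conflict with anything else.

Run cp zeppelin-site.xml.template zeppelin-site.xml, then edit the file with vi and change this block.

The default web port is 8080; it is generally better to change it to a port that is unlikely to conflict and easy to remember. Here I changed it to 8383:

<property>
  <name>zeppelin.server.port</name>
  <value>8383</value>
  <description>Server port.</description>
</property>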
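
The port edit can also be scripted instead of done by hand in vi. The sketch below assumes the layout of the shipped template, where the <value> line directly follows the <name>zeppelin.server.port</name> line; the function name set_zeppelin_port is made up for illustration.

```shell
# Rewrite the <value> that follows <name>zeppelin.server.port</name>.
# Assumes name and value sit on consecutive lines, as in the template.
set_zeppelin_port() {
  site_xml=$1
  port=$2
  tmp=$(mktemp) || return 1
  sed "/<name>zeppelin.server.port<\/name>/{
n
s#<value>[^<]*</value>#<value>${port}</value>#
}" "$site_xml" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back through cat so the file keeps its owner and permissions.
  cat "$tmp" > "$site_xml"
  rm -f "$tmp"
}
```

A call such as set_zeppelin_port conf/zeppelin-site.xml 8383 would then reproduce the change described above. If the file has been reformatted so that name and value share a line, the pattern no longer matches and the file is left unchanged.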

 

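To allow anonymous access to the web UI, change false to true, although this is generally not recommended in production.

<property>
  <name>zeppelin.anonymous.allowed</name>
  <value>false</value>
  <description>Anonymous user allowed by default</description>
</property>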

5) Add login accounts

Copy conf/shiro.ini.template to shiro.ini.

Edit the user names and passwords in it.

 

6) Customize conf/zeppelin-env.sh

The service also starts without this file, but here we need to bring in the external Hadoop components.

Run cp zeppelin-env.sh.template zeppelin-env.sh, then edit the file with vi and append:

#add env darren 1808028
export JAVA_HOME=/usr/java/jdk1.7.0_79
export JRE_HOME=$JAVA_HOME/jre
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HADOOP_HOME=/opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/lib/hadoop
export SPARK_HOME=/opt/cdh5/spark-1.6.0
export HIVE_HOME=/opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/lib/hive
export HBASE_HOME=/opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/lib/hbase
export MASTER=spark://172.17.XX.XXX:7099
export ZEPPELIN_HOME=/opt/zeppelin-0.7.3-bin-all
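
Most of these variables point at directories that differ between CDH parcels, so a typo only shows up later as an interpreter error. As a rough sanity check, assuming the variables have already been exported in the current shell, something like the following lists the ones that do not resolve to an existing directory. The function name missing_dirs is hypothetical.

```shell
# Print the name of each listed variable whose value is empty or is not
# an existing directory. Variable names are passed as arguments.
missing_dirs() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ] || [ ! -d "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
}
```

After sourcing conf/zeppelin-env.sh, a call like missing_dirs JAVA_HOME HADOOP_CONF_DIR HADOOP_HOME SPARK_HOME HIVE_HOME ZEPPELIN_HOME should print nothing on a correctly configured host. MASTER is a URL rather than a path, so it is deliberately left out.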

 

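7) Accessing the Hive warehouse depends on the Hive service's configuration file, hive-site.xml. After confirming that the Hive server is running normally, copy the file from the node that runs the service. In this environment that node is XXX-bigdata-2.novalocal, where I ran:

scp /opt/cloudera/parcels/CDH/lib/hive/conf/hive-site.xml [email protected]:/opt/zeppelin-0.7.3-bin-all/conf

Also remember to add the password of the Hive metastore database account near the end of that file, inside the <configuration> element:

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>XXX123!@#</value>
  <description>password to use against metastore database</description>
</property>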

 

8) Starting and stopping the Zeppelin service

Restart (stop, then start):

/opt/zeppelin-0.7.3-bin-all/bin/zeppelin-daemon.sh restart

Start:

/opt/zeppelin-0.7.3-bin-all/bin/zeppelin-daemon.sh start

Stop:

/opt/zeppelin-0.7.3-bin-all/bin/zeppelin-daemon.sh stop
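
The daemon script returns before the web UI is necessarily reachable, so scripted deployments often poll for readiness. The helper below is a generic retry loop, not something shipped with Zeppelin; the name wait_until and its argument order are assumptions made for this sketch.

```shell
# Run a command up to max_tries times, sleeping delay seconds between
# attempts. Returns 0 on the first success, 1 if every attempt fails.
wait_until() {
  max_tries=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    if [ "$attempt" -le "$max_tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

With the port configured earlier, a readiness check could look like wait_until 30 2 curl -fsS -o /dev/null http://172.17.XX.XXX:8383/ run right after the start command, which gives the server up to about a minute to answer.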

 

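9) By default the Zeppelin web UI allows anonymous login, and http://172.17.XX.XXX:8383/# can be opened directly. The environment deployed here has login accounts configured, so anonymous access is not possible.

Log in with account admin and password adimin1234.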

The key settings and dependencies of the Hive interpreter are listed below:

Properties
Name              Value
default.driver    org.apache.hive.jdbc.HiveDriver
default.url       jdbc:hive2://localhost:10000
default.user      hive_user
default.password  hive_password

Dependencies
Artifact                               Excludes
org.apache.hive:hive-jdbc:0.14.0
org.apache.hadoop:hadoop-common:2.6.0

Maven Repository : org.apache.hive:hive-jdbc

Note: because this deployment runs CDH-5.13.3, the dependencies listed here have to be replaced with jars of the versions actually in use.

 

 

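I previously wrote up how to build Spark against an existing CDH version; in the same way, this records how to build Zeppelin.

  • CentOS7 system
  • Zeppelin0.7.2 (the latest version at the time)
  • Maven environment

Download Zeppelin >> https://zeppelin.apache.org/download.html

Extract it to any directory and run the Maven build. There is no need to change pom.xml; the Hadoop and Spark versions used for the build can be passed in as parameters.
Because this is CDH's Hadoop, the build has to download artifacts from the CDH repo. pom.xml seems to contain this repo already, but it did not appear to take effect during the build, so it is best to add the repo to Maven's settings.xml so that jars the central repository cannot provide are fetched from the CDH repository instead.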

 


  

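<repository>
  <id>cloudera</id>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
  <releases>
    <enabled>true</enabled>
  </releases>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
</repository>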

  

  

With that in place, run the Maven build command:

mvn clean package -Pspark-2.1 -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.11.1 -Pscala-2.11 -DskipTests -X
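
Since the Spark, Hadoop and Scala versions are all passed as parameters, the command line can be assembled from four values. The sketch below only prints the command and does not run it; the function name zeppelin_mvn_cmd is made up, and it leaves out the -X debug switch used above. Whether a given profile name exists depends on the Zeppelin version being built, so treat any values other than the ones in this write-up as untested.

```shell
# Print (not execute) the Maven command for the given Spark profile,
# Hadoop profile, exact Hadoop version and Scala profile.
zeppelin_mvn_cmd() {
  printf 'mvn clean package -Pspark-%s -Phadoop-%s -Dhadoop.version=%s -Pscala-%s -DskipTests\n' \
    "$1" "$2" "$3" "$4"
}
```

Calling zeppelin_mvn_cmd 2.1 2.6 2.6.0-cdh5.11.1 2.11 prints the command used in this build, which can be reviewed and then pasted into the shell from the source directory.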

The build went smoothly at first, but it could not get past the final web Application module:

 


[INFO] Zeppelin: Scio ..................................... SUCCESS [03:55 min]
[INFO] Zeppelin: web Application .......................... FAILURE [02:34 min]
[INFO] Zeppelin: Server ................................... SKIPPED
[INFO] Zeppelin: Packaging distribution ................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 23:43 min
[INFO] Finished at: 2017-08-15T11:14:03+08:00
[INFO] Final Memory: 206M/948M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.3:install-node-and-yarn (install node and yarn) on project zeppelin-web: Could not download Yarn: Could not download https://github.com/yarnpkg/yarn/releases/download/v0.18.1/yarn-v0.18.1.tar.gz: Connect to github-production-release-asset-2e65be.s3.amazonaws.com:443 [github-production-release-asset-2e65be.s3.amazonaws.com/54.231.72.19] failed: Connection timed out -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn -rf :zeppelin-web

I ran the build command again with the -X switch to print debug logs: mvn clean package -Pspark-2.1 -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.11.1 -Pscala-2.11 -DskipTests -X

It revealed the following problem:

 


Caused by: org.apache.maven.plugin.MojoFailureException: Failed to run task
at com.github.eirslett.maven.plugins.frontend.mojo.AbstractFrontendMojo.execute(AbstractFrontendMojo.java:95)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
... 20 more
Caused by: com.github.eirslett.maven.plugins.frontend.lib.TaskRunnerException: 'yarn install --no-lockfile' failed. (error code 1)

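Fix >> https://stackoverflow.com/questions/41646832/build-zeppelin-0-7-0-master-branch-with-spark-2-0-failed-with-yarn-install-no
The likely cause is that building the Zeppelin: web Application module needs npm, the package manager that comes with Node.js; presumably Zeppelin's front end is built with Node.js tooling. So the only option was to try installing npm.
Installing npm with Yum then failed because of a version mismatch in the local repositories: on CentOS7, npm requires http-parser 2.7, which had to be downloaded as an rpm and installed by hand.
I found the package at https://centos.pkgs.org/7/puias-unsupported-x86_64/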

 


[root@hadoop-slave-1 ~]# wget https://centos.pkgs.org/7/puias-unsupported-x86_64/http-parser-2.7.1-3.sdl7.x86_64.rpm.html
[root@hadoop-slave-1 ~]# rpm -ivh http-parser-2.7.1-3.sdl7.x86_64.rpm
warning: http-parser-2.7.1-3.sdl7.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID 41a40948: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:http-parser-2.7.1-3.sdl7         ################################# [100%]

After the installation, run the npm install command mentioned on the page above; my guess is that it installs some npm packages.

sudo npm install -g bower grunt grunt-cli npm-run-all rimraf webpack

With the npm environment ready I tried again and hit the following exception: 'yarn install --no-lockfile'

[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.3:yarn (yarn install) on project zeppelin-web: Failed to run task: 'yarn install --no-lockfile' failed. (error code 1) -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.3:yarn (yarn install) on project zeppelin-web: Failed to run task

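Fix >> https://issues.apache.org/jira/browse/ZEPPELIN-1019
An inconspicuous comment there describes the solution:
[Figure 1: screenshot of the comment describing the fix]
After trying it, the build passed!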

 


[DEBUG]   (f) siteDirectory = /root/_/zeppelin-0.7.2/zeppelin-distribution/src/site
[DEBUG] -- end configuration --
[DEBUG] Mapped url: /root/_/zeppelin-0.7.2/zeppelin-distribution/src/site to relative path: src/site
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Zeppelin ........................................... SUCCESS [  2.239 s]
[INFO] Zeppelin: Interpreter .............................. SUCCESS [  5.265 s]
[INFO] Zeppelin: Zengine .................................. SUCCESS [  3.924 s]
[INFO] Zeppelin: Display system apis ...................... SUCCESS [  8.710 s]
[INFO] Zeppelin: Spark dependencies ....................... SUCCESS [ 30.947 s]
[INFO] Zeppelin: Spark .................................... SUCCESS [ 12.656 s]
[INFO] Zeppelin: Markdown interpreter ..................... SUCCESS [  0.570 s]
[INFO] Zeppelin: Angular interpreter ...................... SUCCESS [  0.225 s]
[INFO] Zeppelin: Shell interpreter ........................ SUCCESS [  0.333 s]
[INFO] Zeppelin: Livy interpreter ......................... SUCCESS [  4.667 s]
[INFO] Zeppelin: HBase interpreter ........................ SUCCESS [  2.517 s]
[INFO] Zeppelin: Apache Pig Interpreter ................... SUCCESS [  2.564 s]
[INFO] Zeppelin: PostgreSQL interpreter ................... SUCCESS [  0.419 s]
[INFO] Zeppelin: JDBC interpreter ......................... SUCCESS [  0.725 s]
[INFO] Zeppelin: File System Interpreters ................. SUCCESS [  0.687 s]
[INFO] Zeppelin: Flink .................................... SUCCESS [  3.592 s]
[INFO] Zeppelin: Apache Ignite interpreter ................ SUCCESS [  0.541 s]
[INFO] Zeppelin: Kylin interpreter ........................ SUCCESS [  0.311 s]
[INFO] Zeppelin: Python interpreter ....................... SUCCESS [ 14.463 s]
[INFO] Zeppelin: Lens interpreter ......................... SUCCESS [  2.057 s]
[INFO] Zeppelin: Apache Cassandra interpreter ............. SUCCESS [ 27.274 s]
[INFO] Zeppelin: Elasticsearch interpreter ................ SUCCESS [  1.974 s]
[INFO] Zeppelin: BigQuery interpreter ..................... SUCCESS [  0.474 s]
[INFO] Zeppelin: Alluxio interpreter ...................... SUCCESS [  1.516 s]
[INFO] Zeppelin: Scio ..................................... SUCCESS [ 25.773 s]
[INFO] Zeppelin: web Application .......................... SUCCESS [04:54 min]
[INFO] Zeppelin: Server ................................... SUCCESS [02:24 min]
[INFO] Zeppelin: Packaging distribution ................... SUCCESS [  1.718 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 09:55 min
[INFO] Finished at: 2017-08-16T09:58:18+08:00
[INFO] Final Memory: 217M/1134M
[INFO] ------------------------------------------------------------------------

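After the build completes, package the source directory; this yields a 1.2G archive that can be extracted anywhere and used.
For the subsequent startup configuration of Zeppelin see >> https://zeppelin.apache.org/docs/0.7.2/install/install.html
The startup environment can be changed in conf/zeppelin-env.sh and the startup port (default 8080) in conf/zeppelin-site.xml.template. Note that a modified configuration file must have its .template suffix removed, otherwise it does not take effect.
My environment uses Spark, so the Spark and Hadoop environment variables need to be configured, as described at >> https://zeppelin.apache.org/docs/0.7.2/interpreter/spark.html
Zeppelin can be configured to run in On YARN mode, where YARN allocates the container resources for Spark.

With that, using Spark starts a Spark application named Zeppelin on YARN.

That completes the build and deployment of Zeppelin; the official site already has plenty of examples in its usage guide.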


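An aside:
This deployment was in a production environment. When customizing an interpreter, Zeppelin needs Maven dependencies configured so that it can fetch the jars. Configuring Hive, for example, required adding the dependencies below. This differs from the official documentation, which mentions only two jars, hadoop-common and hive-jdbc; with just those configured, using Hive raised missing-jar exceptions, possibly because of the versions I supplied.

[Figures 2 and 3: screenshots of the Hive interpreter dependency list]
Because the production environment has no network access, I used a kind of "trick". After the dependencies are configured, Save raises a warning because the jars cannot be downloaded; this can be ignored. Then copy all the jars into the directory specified by zeppelin.interpreter.localRepo, after which Hive can be used without the missing-jar problem. Note that once this is set up you must not restart the interpreter, otherwise the directory is deleted; restarting Zeppelin deletes it as well. I do not yet know whether Zeppelin can be pointed at a localRepo without downloading from the network. Since a production setup is rarely restarted once configured, I am keeping this approach for now, and I wrote a copy.sh script to do the copying.
[Figure 4: screenshot of the copy.sh script]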
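
The original copy.sh survives only as a screenshot, so its contents are not known. The following is a hypothetical stand-in for what such a script might do, namely copying every jar from a staging directory into the interpreter's local repository directory. The function name copy_jars and both directory arguments are placeholders.

```shell
# Copy every *.jar from src_dir into dest_dir, creating dest_dir if
# needed, and print how many jars were copied.
copy_jars() {
  src_dir=$1
  dest_dir=$2
  mkdir -p "$dest_dir" || return 1
  count=0
  for jar in "$src_dir"/*.jar; do
    # Skip the unexpanded pattern when src_dir contains no jars.
    [ -f "$jar" ] || continue
    cp "$jar" "$dest_dir"/ || return 1
    count=$((count + 1))
  done
  printf '%s\n' "$count"
}
```

It would be called with the staging directory and the directory that zeppelin.interpreter.localRepo resolves to for the Hive interpreter. Because that directory is wiped on an interpreter or Zeppelin restart, as noted above, the copy has to be repeated after every restart.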

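Although the build passes, some of the problems were solved only by looking up other people's fixes, without my truly "knowing why".
In future study I need to settle down and dig deeper: knowing is easy, doing is hard!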

Reposted from: https://my.oschina.net/hblt147/blog/3012313
