Hostname | Running processes
nbidc-agent-03 | Hadoop NameNode, Spark Master
nbidc-agent-04 | Hadoop SecondaryNameNode
nbidc-agent-11 | Hadoop ResourceManager, Hadoop DataNode, Hadoop NodeManager, Spark Worker
nbidc-agent-12 | Hadoop DataNode, Hadoop NodeManager, Spark Worker
nbidc-agent-13 | Hadoop DataNode, Hadoop NodeManager, Spark Worker
nbidc-agent-14 | Hadoop DataNode, Hadoop NodeManager, Spark Worker
nbidc-agent-15 | Hadoop DataNode, Hadoop NodeManager, Spark Worker
nbidc-agent-18 | Hadoop DataNode, Hadoop NodeManager, Spark Worker
nbidc-agent-19 | Hadoop DataNode, Hadoop NodeManager, Spark Worker
nbidc-agent-20 | Hadoop DataNode, Hadoop NodeManager, Spark Worker
nbidc-agent-21 | Hadoop DataNode, Hadoop NodeManager, Spark Worker
nbidc-agent-22 | Hadoop DataNode, Hadoop NodeManager, Spark Worker
Table 1
yum install curl-devel expat-devel gettext-devel openssl-devel zlib-devel
yum install gcc perl-ExtUtils-MakeMaker
yum remove git
cd /home/work/tools/
wget https://github.com/git/git/archive/v2.8.1.tar.gz
tar -zxvf v2.8.1.tar.gz
cd git-2.8.1
make prefix=/home/work/tools/git all
make prefix=/home/work/tools/git install
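As a quick sanity check on the build, run the freshly installed binary; with prefix=/home/work/tools/git, git's Makefile installs the executable under /home/work/tools/git/bin:
/home/work/tools/git/bin/git --version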
2. Install Java
scp -r jdk1.7.0_75 nbidc-agent-04:/home/work/tools/
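Since the JDK's bin directory is only added to PATH in step 9 below, you can verify the copied JDK on nbidc-agent-04 by its full path:
/home/work/tools/jdk1.7.0_75/bin/java -version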
3. Install Apache Maven
cd /home/work/tools/
wget ftp://mirror.reverse.net/pub/apache/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
tar -zxvf apache-maven-3.3.9-bin.tar.gz
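The unpacked Maven can likewise be checked by full path (apache-maven-3.3.9 is the directory the tarball extracts to; mvn needs a JDK, so run this once JAVA_HOME is set in step 9 or with java already on the PATH):
/home/work/tools/apache-maven-3.3.9/bin/mvn -version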
4. Install the Hadoop client
scp -r hadoop nbidc-agent-04:/home/work/tools/
5. Install the Spark client
scp -r spark nbidc-agent-04:/home/work/tools/
6. Install the Hive client
scp -r hive nbidc-agent-04:/home/work/tools/
7. Install phantomjs
cd /home/work/tools/
tar -jxvf phantomjs-2.1.1-linux-x86_64.tar.bz2
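The extracted directory name matches the phantomjs bin path added to PATH in step 9; a version check confirms the binary runs on this machine:
/home/work/tools/phantomjs-2.1.1-linux-x86_64/bin/phantomjs --version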
8. Download the latest Zeppelin source code
cd /home/work/tools/
git clone https://github.com/apache/incubator-zeppelin.git
9. Set environment variables
vi /home/work/.bashrc
# Add the following lines
export PATH=.:$PATH:/home/work/tools/jdk1.7.0_75/bin:/home/work/tools/hadoop/bin:/home/work/tools/spark/bin:/home/work/tools/hive/bin:/home/work/tools/phantomjs-2.1.1-linux-x86_64/bin:/home/work/tools/incubator-zeppelin/bin
export JAVA_HOME=/home/work/tools/jdk1.7.0_75
export HADOOP_HOME=/home/work/tools/hadoop
export SPARK_HOME=/home/work/tools/spark
export HIVE_HOME=/home/work/tools/hive
export ZEPPELIN_HOME=/home/work/tools/incubator-zeppelin
# Save the file and apply the settings
source /home/work/.bashrc
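With the new variables sourced, a few standard version commands confirm that each client now resolves through the updated PATH:
java -version
hadoop version
spark-submit --version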
10. Build the Zeppelin source code
cd /home/work/tools/incubator-zeppelin
mvn clean package -Pspark-1.6 -Dspark.version=1.6.0 -Dhadoop.version=2.7.0 -Phadoop-2.6 -Pyarn -DskipTests
1. Configure zeppelin-env.sh
cp /home/work/tools/incubator-zeppelin/conf/zeppelin-env.sh.template /home/work/tools/incubator-zeppelin/conf/zeppelin-env.sh
vi /home/work/tools/incubator-zeppelin/conf/zeppelin-env.sh
# Add the following lines
export JAVA_HOME=/home/work/tools/jdk1.7.0_75
export HADOOP_CONF_DIR=/home/work/tools/hadoop/etc/hadoop
export MASTER=spark://nbidc-agent-03:7077
2. Configure zeppelin-site.xml
cp /home/work/tools/incubator-zeppelin/conf/zeppelin-site.xml.template /home/work/tools/incubator-zeppelin/conf/zeppelin-site.xml
vi /home/work/tools/incubator-zeppelin/conf/zeppelin-site.xml
# Modify the value in the block below to set Zeppelin's port to 9090
<property>
  <name>zeppelin.server.port</name>
  <value>9090</value>
  <description>Server port.</description>
</property>
3. Copy hive-site.xml into Zeppelin's configuration directory
cd /home/work/tools/incubator-zeppelin/conf
cp /home/work/tools/hive/conf/hive-site.xml .
zeppelin-daemon.sh start
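As a quick check that the daemon came up, zeppelin-daemon.sh also supports a status subcommand, and with the port configured to 9090 above the web UI should then be reachable from a browser (assuming Zeppelin runs on nbidc-agent-04, where everything above was installed):
zeppelin-daemon.sh status
# web UI: http://nbidc-agent-04:9090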
Figure 1
Click the 'Interpreter' menu, then configure and save the spark and hive interpreters, as shown in Figure 2 and Figure 3 respectively.
Figure 2
Figure 3
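For the spark interpreter, the key property tying Zeppelin to this cluster is master, which should mirror the MASTER value set in zeppelin-env.sh earlier; a minimal sketch of that one setting on the Interpreter page (not the full property list):
master = spark://nbidc-agent-03:7077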
Click the 'Notebook' -> 'Create new note' menu item, create a new query, and run it; the result is shown in Figure 4.
Figure 4
Note:
%sql
select * from wxy.t1 where rate > ${r}
The first line sets the interpreter to SparkSQL; the second line uses ${r} to declare a runtime parameter. When the paragraph runs, a text input box appears on the page; enter a value and press Enter, and the query executes with that parameter. In the figure, records with rate > 100 are returned.
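Zeppelin's dynamic forms also accept a default value using the ${name=default} syntax, so the input box comes pre-filled; for example, the same query defaulting to 100:
%sql
select * from wxy.t1 where rate > ${r=100}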