Hue is an open-source Apache Hadoop UI system. It evolved from Cloudera Desktop and was contributed to the open-source community by Cloudera, and it is built on the Python web framework Django.
With Hue you can interact with a Hadoop cluster from a web console in the browser to analyze and process data, for example browsing data on HDFS, running MapReduce jobs, querying with Hive, and so on.
- SQL query editors for Hive, Impala, MySQL, PostgreSQL, SQLite, and Oracle
- Dynamic search dashboards for Solr
- Spark editor and dashboard
- Browsers for YARN, HDFS, the Hive Metastore, HBase, and ZooKeeper
- Pig editor, Sqoop2, and Oozie workflow editor and dashboard
- Importing data into HDFS
Environment: a CentOS 6.7 virtual machine with 1 core and 2 GB of RAM.
Component | Version
---|---
Hadoop | Hadoop-2.6.0-cdh5.7.0-src.tar.gz
JDK | jdk-8u45-linux-x64.gz
Hive | hive-1.1.0-cdh5.7.0.tar.gz
Hue | hue-3.9.0-cdh5.7.0
All components are based on CDH 5.7.0 and can be downloaded from the CDH component repository on the Cloudera website.
[root@hadoop001 ~]# yum install -y ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-plain gcc \
gcc-c++ krb5-devel libffi-devel libxml2-devel libxslt-devel make mysql mysql-devel openldap-devel python-devel sqlite-devel gmp-devel
[hadoop@hadoop001 ~]$ vim ./.bash_profile
export HUE_HOME=/home/hadoop/app/hue-3.9.0-cdh5.7.0
export PATH=$HUE_HOME/bin:$PATH
[hadoop@hadoop001 ~]$ source ./.bash_profile
#Unpack; check that the owner and group of the extracted files are correct
[hadoop@hadoop001 ~]$ tar -zxvf ~/soft/hue-3.9.0-cdh5.7.0.tar.gz -C ~/app/
#Build; `make apps` downloads many packages, so build time depends on network speed
[hadoop@hadoop001 ~]$ cd ~/app/hue-3.9.0-cdh5.7.0/
[hadoop@hadoop001 hue-3.9.0-cdh5.7.0]$ make apps
If the output ends with something like "XXXX post-processed", the Hue build succeeded:
1190 static files copied to '/home/hadoop/app/hue-3.9.0-cdh5.7.0/build/static', 1190 post-processed.
make[1]: Leaving directory `/home/hadoop/app/hue-3.9.0-cdh5.7.0/apps'
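As a quick sanity check after the build, you can verify that the key artifacts exist under the build directory. This is just a sketch (the `check` helper is illustrative; paths follow this tutorial's install location):

```shell
# Sketch: confirm `make apps` produced the Hue virtualenv and launchers.
# HUE_HOME follows this tutorial's install path; adjust for your machine.
HUE_HOME=/home/hadoop/app/hue-3.9.0-cdh5.7.0

check() {
  # Print "ok: <path>" if the file exists and is executable, else "missing: <path>".
  if [ -x "$1" ]; then echo "ok: $1"; else echo "missing: $1"; fi
}

check "$HUE_HOME/build/env/bin/hue"
check "$HUE_HOME/build/env/bin/supervisor"
```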
Next configure Hue in $HUE_HOME/desktop/conf/hue.ini. First set a random secret_key under [desktop]:
[desktop]
secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn
#hdfs-site.xml
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
#core-site.xml
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
#httpfs-site.xml
<property>
  <name>httpfs.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>httpfs.proxyuser.hue.groups</name>
  <value>*</value>
</property>
In hue.ini, uncomment webhdfs_url:
[hadoop]
# Configuration for HDFS NameNode
# ------------------------------------------------------------------------
[[hdfs_clusters]]
# HA support by using HttpFs
[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://hadoop001:8020
# NameNode logical name.
## logical_name=
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
webhdfs_url=http://hadoop001:50070/webhdfs/v1
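With webhdfs_url set, it is worth confirming that WebHDFS actually responds before starting Hue. A minimal sketch, assuming the NameNode HTTP port 50070 on hadoop001 and the hadoop user (both as in this tutorial):

```shell
# Build the WebHDFS URL that lists the HDFS root directory.
NN_HTTP=hadoop001:50070
WEBHDFS_URL="http://${NN_HTTP}/webhdfs/v1/?op=LISTSTATUS&user.name=hadoop"
echo "checking ${WEBHDFS_URL}"
# A JSON "FileStatuses" response means WebHDFS is enabled; failures are
# ignored so this is safe to run even before the cluster is up.
curl -s --connect-timeout 5 "${WEBHDFS_URL}" || true
```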
Configure the HiveServer2 properties in hive-site.xml:
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>
</property>
<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>hadoop001</value>
</property>
<property>
  <name>hive.server2.long.polling.timeout</name>
  <value>5000</value>
</property>
<property>
  <name>hive.server2.authentication</name>
  <value>NONE</value>
</property>
Uncomment the following in hue.ini and change the hostname:
[beeswax]
# Host where HiveServer2 is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
hive_server_host=hadoop001
# Port where HiveServer2 Thrift server runs on.
hive_server_port=10000
# Hive configuration directory, where hive-site.xml is located
hive_conf_dir=/home/hadoop/app/hive-1.1.0-cdh5.7.0/conf/
#Start Hadoop
[hadoop@hadoop001 ~]$ start-all.sh
#Start HiveServer2
[hadoop@hadoop001 ~]$ nohup hive --service hiveserver2 >~/app/hive-1.1.0-cdh5.7.0/console.log 2>&1 &
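HiveServer2 can take a while to come up, so a simple readiness check is to look for the Thrift service banner in the log written by the nohup redirection above. A sketch; the matched message text is an assumption about this Hive version's log format:

```shell
# Sketch: report whether the HiveServer2 log shows the Thrift service starting.
hs2_ready() {
  # Print "yes" once the log mentions ThriftBinaryCLIService, "no" otherwise
  # (also "no" when the log file does not exist yet).
  if grep -q "ThriftBinaryCLIService" "$1" 2>/dev/null; then echo yes; else echo no; fi
}

hs2_ready ~/app/hive-1.1.0-cdh5.7.0/console.log
```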
Connect to HiveServer2 with beeline; in `-n hadoop`, hadoop is the user that started beeline:
[hadoop@hadoop001 ~]$ beeline -u jdbc:hive2://hadoop001:10000/default -n hadoop
An exception occurred on connecting: Error: Could not open client transport with JDBC Uri: jdbc:hive2://hadoop001:10000: null
Solution: this turned out to be caused by enabling user authentication for Hive; change hive.server2.authentication from NOSASL back to NONE.
#Start Hue
[hadoop@hadoop001 ~]$ ~/app/hue-3.9.0-cdh5.7.0/build/env/bin/supervisor
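Once the supervisor is running, you can confirm the web server is listening before opening a browser. A sketch using bash's /dev/tcp (port 8888 is Hue's default http_port, configurable in hue.ini):

```shell
# Sketch: try a TCP connect to the Hue web server and report the result.
check_port() {
  # bash-only: redirecting to /dev/tcp/<host>/<port> attempts a TCP connection.
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then echo open; else echo closed; fi
}

check_port hadoop001 8888
```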
There is no need to require user authentication to access Hive in this setup.
Hue is then available at http://ip:8888
I created a hadoop account in Hue; it must be the hadoop user, otherwise the file browser can only operate under the /user/XXX/ directory, and Hive cannot be used either.