Hue: an open-source web tool for big data, from Cloudera.
It provides the other frameworks in the Hadoop-based ecosystem with a unified,
user-friendly web management interface.
Site: http://archive.cloudera.com/cdh5/cdh/5/hue-3.7.0-cdh5.3.6/
Hue use cases
Hive: (on the command line: bin/hive -e "<hql>" or bin/hive -f <script.hql>)
1. Submit and run HQL statements through the Hue web UI
2. View the execution plan of an HQL statement
3. Browse the metadata of Hive tables
4. Render query results as charts and reports
5. Under the hood this uses HiveServer2 and JDBC, as sketched below
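For reference, the Hue SQL editor behaves like any JDBC client talking to HiveServer2. A minimal sketch of the same path using beeline (host and port match the configuration later in this document; the user and table name are placeholders):
$ bin/beeline -u jdbc:hive2://192.168.134.101:10000 -n wanglu
EXPLAIN SELECT COUNT(*) FROM demo_table;    -- view the execution plan, as in use case 2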
HDFS:
1. Create, delete, modify, and browse files and directories on HDFS through the Hue web UI
2. Under the hood this calls the HDFS API (WebHDFS REST, as sketched below)
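A quick sketch of the kind of WebHDFS call Hue makes (host and port taken from the hue.ini configuration later in this document):
$ curl -s "http://192.168.134.101:50070/webhdfs/v1/?op=LISTSTATUS&user.name=hue"    // list the HDFS root directory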
YARN:
1. Monitor all MR jobs and view each job's details online through the Hue web UI
2. Under the hood this uses Hadoop's historyserver and the 8088 web port; see the example below
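The job information Hue shows can also be pulled straight from the ResourceManager REST API, e.g.:
$ curl -s http://192.168.134.101:8088/ws/v1/cluster/apps    // list all applications, as on the 8088 web UI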
Oozie:
Edit and submit Oozie scheduled jobs online through the Hue web UI
HBase:
Create, read, update, and delete HBase table data through the Hue web UI
RDBMS (traditional relational databases):
Create, read, update, and delete RDBMS tables through the Hue web UI
ZooKeeper:
Sqoop:
Hue + CM (Cloudera Manager)
Integrating Hue with a framework really means Hue acting as a client that calls that framework's API, then displaying the results and handling interaction in the web UI.
一、Hue architecture:
Hue UI --- Hue's web interface
Hue Server -- the Hue server process, responsible for communicating with the other frameworks
DB -- Hue needs a database to store metadata and other information
二、Installation and deployment
Hue is installed by compiling from source, which requires internet access.
1. First install the build dependencies (as root or with sudo):
# sudo yum -y install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi gcc gcc-c++ krb5-devel libtidy libxml2-devel libxslt-devel openldap-devel python-devel sqlite-devel openssl-devel mysql-devel gmp-devel
"Complete!" indicates the packages installed successfully.
After installation, switch back to the regular user!!
2. Upload and extract the Hue source tarball, then compile and install it
$ tar zxvf hue-3.7.0-cdh5.3.6.tar.gz
Enter the main directory of the source package:
$ cd /opt/modules/cdh/hue-3.7.0-cdh5.3.6    // Hue source main directory
Compile and install:
$ make apps
make[1]: Leaving directory `/opt/modules/hue-3.7.0-cdh5.3.6/apps' --- this line indicates the build succeeded!!
After compiling Hue, the original JDK can end up shadowed by OpenJDK (pulled in by the build). Check and remove it as root:
$ java -version
$ rpm -qa | grep java
# sudo rpm -e --nodeps java-1.7.0-openjdk-devel-1.7.0.131-2.6.9.0.el6_8.x86_64 java_cup-0.10k-5.el6.x86_64 java-1.7.0-openjdk-1.7.0.131-2.6.9.0.el6_8.x86_64 java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64 tzdata-java-2017a-1.el6.noarch
# source /etc/profile
# java -version
3. Edit Hue's main configuration file
$ sudo vi desktop/conf/hue.ini
Modify the settings under the [desktop] section.
Configure Hue's session secret key:
http://archive.cloudera.com/cdh5/cdh/5/hue-3.7.0-cdh5.3.6/manual.html#_web_server_configuration
# Set this to a random string, the longer the better.
# This is used for secure hashing in the session store.
secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn
# Webserver listens on this address and port
http_host=192.168.134.101
http_port=8888
# Time zone name
time_zone=Asia/Shanghai
4. Start Hue
By default this runs as a foreground process:
$ build/env/bin/supervisor
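supervisor stays attached to the terminal; to keep Hue running after logout, one option is to background it with nohup (a sketch; supervisor also takes a -d flag to daemonize, but check your version):
$ nohup build/env/bin/supervisor > supervisor.log 2>&1 &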
5. Access the Hue web UI
http://com.bigdata:8888/about/
Remember the username and password entered on first login!! (here: wanglu / 123456)
$ ps -ef | grep hue
$ netstat -antp | grep 8888
$ kill -9 61772    // stop the Hue server process (PID from the ps/netstat output above)
三、Integrating Hue with Hadoop
1) Edit the relevant Hadoop configuration files
Add the following to hdfs-site.xml:
<property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
</property>
<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>
Add the following to core-site.xml:
This configures a proxy user named hue, which lets Hue obtain access rights on HDFS:
<property>
    <name>hadoop.proxyuser.hue.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.hue.groups</name>
    <value>*</value>
</property>
Restart the Hadoop services!!!
$ sbin/stop-all.sh
$ sbin/start-all.sh
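Before configuring Hue, it is worth confirming that WebHDFS answers and that the hue proxy user can act on behalf of a regular user (the doas parameter triggers the proxy-user check; wanglu is the regular user from this document):
$ curl -s "http://192.168.134.101:50070/webhdfs/v1/?op=LISTSTATUS&user.name=hue&doas=wanglu"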
2) Edit Hue's configuration file
hue.ini:
[[hdfs_clusters]]
# HA support by using HttpFs
[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://192.168.134.101:8020
# NameNode logical name.
## logical_name=
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
webhdfs_url=http://192.168.134.101:50070/webhdfs/v1
# Change this if your HDFS cluster is Kerberos-secured
## security_enabled=false
hadoop_hdfs_home=/opt/modules/hadoop-2.5.0-cdh5.3.6
hadoop_bin=/opt/modules/hadoop-2.5.0-cdh5.3.6/bin
hadoop_conf_dir=/opt/modules/hadoop-2.5.0-cdh5.3.6/etc/hadoop
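Note the comments above: plain WebHDFS talks to a single NameNode, while an HA cluster must go through HttpFs, whose default port is 14000. In that case the URL would look like:
webhdfs_url=http://192.168.134.101:14000/webhdfs/v1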
[[yarn_clusters]]
[[[default]]]
# Enter the host on which you are running the ResourceManager
resourcemanager_host=192.168.134.101
# The port where the ResourceManager IPC listens on
resourcemanager_port=8032
# Whether to submit jobs to this cluster
submit_to=True
# Resource Manager logical name (required for HA)
## logical_name=
# Change this if your YARN cluster is Kerberos-secured
## security_enabled=false
# URL of the ResourceManager API
resourcemanager_api_url=http://192.168.134.101:8088
# URL of the ProxyServer API
proxy_api_url=http://192.168.134.101:8088
# URL of the HistoryServer API
history_server_api_url=http://192.168.134.101:19888
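A quick check that the HistoryServer REST API Hue depends on is reachable (endpoint per the Hadoop MapReduce HistoryServer REST docs):
$ curl -s http://192.168.134.101:19888/ws/v1/history/info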
3) Restart Hue and check the Hadoop integration in the web UI
Manage HDFS:
create, delete, modify, and browse files and directories
Manage jobs:
Note: switch the job list's username filter from admin to the regular user
四、Integrating Hue with Hive
1) Start the HiveServer2 service
Hue relies on the HiveServer2 process to fetch Hive table data.
$ bin/hiveserver2 &
// the trailing & runs it in the background
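To confirm HiveServer2 is up and listening on its default Thrift port (10000, matching hive_server_port below):
$ netstat -antp | grep 10000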
2) Configure and start Hive's metastore service
The metastore service:
Hue goes through the metastore process to fetch Hive table metadata from the remote MySQL.
In most companies MySQL lives on a remote host, so local clients rely on the metastore service to fetch metadata.
Configuration: add the following to hive-site.xml:
<property>
    <name>hive.metastore.uris</name>
    <value>thrift://192.168.134.101:9083</value>
</property>
Start it:
$ bin/hive --service metastore &
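Likewise, confirm the metastore service is listening on the Thrift port configured above:
$ netstat -antp | grep 9083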
3) Configure hue.ini
[beeswax]
# Host where HiveServer2 is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
hive_server_host=192.168.134.101
# Port where HiveServer2 Thrift server runs on.
hive_server_port=10000
# Hive configuration directory, where hive-site.xml is located
hive_conf_dir=/opt/modules/hive-0.13.1-cdh5.3.6/conf
Restart Hue!!!!!
4) Check the integration status in the web UI
五、Integrating Hue with MySQL
Edit the hue.ini configuration (these settings sit under the [librdbms] section):
[[databases]]
# sqlite configuration.
## [[[sqlite]]]
# Name to show in the UI.
## nice_name=SQLite
# For SQLite, name defines the path to the database.
## name=/tmp/sqlite.db
# Database backend to use.
## engine=sqlite
# Database options to send to the server when connecting.
# https://docs.djangoproject.com/en/1.4/ref/databases/
## options={}
# mysql, oracle, or postgresql configuration.
[[[mysql]]]
# Name to show in the UI.
nice_name="My SQL DB"
# For MySQL and PostgreSQL, name is the name of the database.
# For Oracle, Name is instance of the Oracle server. For express edition
# this is 'xe' by default.
## name=mysqldb
# Database backend to use. This can be:
# 1. mysql
# 2. postgresql
# 3. oracle
engine=mysql
# IP or hostname of the database to connect to.
host=192.168.134.101
# Port the database server is listening to. Defaults are:
# 1. MySQL: 3306
# 2. PostgreSQL: 5432
# 3. Oracle Express Edition: 1521
port=3306
# Username to authenticate with when connecting to the database.
user=root
# Password matching the username to authenticate with when
# connecting to the database.
password=root123
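Note that for MySQL the commented-out name option above normally needs to be uncommented and set to a real database. The credentials can be verified outside Hue first (assuming the mysql client is installed on this machine):
$ mysql -h 192.168.134.101 -P 3306 -u root -proot123 -e "SHOW DATABASES;"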
Restart Hue
六、Integrating Hue with Oozie
Edit the [liboozie] section of hue.ini:
[liboozie]
# The URL where the Oozie service runs on. This is required in order for
# users to submit jobs. Empty value disables the config check.
oozie_url=http://192.168.134.101:11000/oozie
# Requires FQDN in oozie_url if enabled
## security_enabled=false
# Location on HDFS where the workflows/coordinator are deployed when submitted.
remote_deployement_dir=/examples    // root directory on HDFS for user-defined applications (note: the key really is spelled "deployement" in hue.ini)
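After restarting Hue, the Oozie server itself can be sanity-checked through its REST API; a healthy server reports NORMAL system mode:
$ curl -s http://192.168.134.101:11000/oozie/v1/admin/status    // expect {"systemMode":"NORMAL"}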