Hue

Hue: a big-data web tool, open-sourced by Cloudera
       Provides a unified, user-friendly web management interface for the other frameworks in the Hadoop-based ecosystem
        Official site: http://archive.cloudera.com/cdh5/cdh/5/hue-3.7.0-cdh5.3.6/
Hue's applications
      hive:  bin/hive    --hql
           1. Submit and run HQL statements through Hue's web UI
           2. View the execution plan of an HQL statement
           3. Browse the metadata of Hive tables
           4. Render query results as charts and reports
           5. Under the hood this uses HiveServer2 and JDBC
          hdfs:
            1. Create, delete, modify, and browse files and directories on HDFS through Hue's web UI
            2. Under the hood this calls the HDFS API
         yarn:
            1. Monitor all MapReduce jobs and view each job's details online through Hue's web UI
            2. Under the hood this uses Hadoop's HistoryServer and the ResourceManager's 8088 web port
         oozie:
            Edit and submit an Oozie scheduled job online through Hue's web UI
         hbase:
            Create, read, update, and delete data in HBase tables through Hue's web UI
         RDBMS (traditional relational databases):
            Create, read, update, and delete rows in RDBMS tables through Hue's web UI
         zookeeper:
         sqoop:

        hue + CM (Cloudera Manager)
    Hue's integration with each framework boils down to Hue acting as a client that calls the framework's API and then displays the results and handles user interaction on the web page
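To make the client pattern above concrete, here is a minimal Python sketch (illustrative only, not Hue's actual code) that builds the kind of WebHDFS REST URL Hue's file browser issues; the host and port are the example values used later in these notes:

```python
# Illustrative sketch: build the WebHDFS v1 REST URL a client such as
# Hue would call to list an HDFS directory. Host/port are example values.
from urllib.parse import urlencode

def webhdfs_url(host, path, user, op="LISTSTATUS", port=50070):
    """Build a WebHDFS v1 request URL for the given HDFS path."""
    query = urlencode({"op": op, "user.name": user})
    return f"http://{host}:{port}/webhdfs/v1{path}?{query}"

url = webhdfs_url("192.168.134.101", "/user/hue", user="hue")
print(url)
# http://192.168.134.101:50070/webhdfs/v1/user/hue?op=LISTSTATUS&user.name=hue
```

Issuing an HTTP GET against such a URL returns the directory listing as JSON, which the web UI then renders.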
        
I. Hue's architecture:
        Hue UI  --- Hue's web interface
        Hue Server -- Hue's server process, responsible for communicating with the other frameworks
        DB -- Hue needs a database to store metadata and other information
        
II. Installation and deployment
    Hue is installed by compiling from source, which requires internet access
1. Install the dependencies needed to build Hue (as root, or with sudo)
# sudo yum -y install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi gcc gcc-c++ krb5-devel libtidy libxml2-devel libxslt-devel openldap-devel python-devel sqlite-devel openssl-devel mysql-devel gmp-devel
"Complete!" indicates the installation succeeded
After installation, switch back to the regular user!!
2. Upload and extract Hue's source tarball, then build and install Hue
tar zxvf hue-3.7.0-cdh5.3.6.tar.gz
    Change into the root of the extracted source tree:
cd /opt/modules/cdh/hue-3.7.0-cdh5.3.6
    Build and install:
$ make apps
    make[1]: Leaving directory `/opt/modules/hue-3.7.0-cdh5.3.6/apps'  --- indicates the build succeeded!!
    
    Building Hue pulls in OpenJDK, which can shadow the previously installed JDK -- check and remove it as root:
$ java -version
$ rpm -qa |grep java      
# sudo rpm -e --nodeps java-1.7.0-openjdk-devel-1.7.0.131-2.6.9.0.el6_8.x86_64 java_cup-0.10k-5.el6.x86_64 java-1.7.0-openjdk-1.7.0.131-2.6.9.0.el6_8.x86_64 java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64 tzdata-java-2017a-1.el6.noarc
  # source /etc/profile 
 # java -version      
3. Edit Hue's main configuration file
$ sudo vi desktop/conf/hue.ini
    Edit the settings in the [desktop] section
     Set Hue's session secret key (see the manual):
http://archive.cloudera.com/cdh5/cdh/5/hue-3.7.0-cdh5.3.6/manual.html#_web_server_configuration
# Set this to a random string, the longer the better.
  # This is used for secure hashing in the session store.
  secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn
  # Webserver listens on this address and port
  http_host=192.168.134.101
  http_port=8888
  # Time zone name
  time_zone=Asia/Shanghai      
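The manual only says the key should be a long random string; one quick way to generate such a value (a sketch, independent of Hue itself) is Python's secrets module:

```python
# Generate a random value for hue.ini's secret_key.
# Any sufficiently long random string works; this is just one way.
import secrets
import string

# Restrict to characters that are safe to paste into an ini file.
ALPHABET = string.ascii_letters + string.digits + "!@#$%^&*-_=+"

def make_secret_key(length=50):
    """Return a random string of the given length for secret_key."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print("secret_key=" + make_secret_key())
```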

4. Start Hue
     By default it runs as a foreground process
    $ build/env/bin/supervisor
       
5. Access Hue's web UI
http://com.bigdata:8888/about/
    The username and password entered on the first login become the admin account -- remember them!! (here: wanglu / 123456)
  
   $ ps -ef | grep hue  
  $ netstat -antp | grep 8888  
  $ kill -9 61772   # stop the Hue server process (61772 is the PID found above)
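The netstat check can also be scripted; the following sketch (generic, not part of Hue) reports whether anything is accepting TCP connections on a given host and port:

```python
# Check whether a TCP port (e.g. Hue's 8888) is accepting connections.
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: is Hue listening locally?
print(port_open("127.0.0.1", 8888))
```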
    
III. Integrating Hue with Hadoop
1) Edit the relevant Hadoop configuration files
    hdfs-site.xml -- add the following properties:
    <property>
      <name>dfs.webhdfs.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>dfs.permissions.enabled</name>
      <value>false</value>
    </property>
    core-site.xml -- add the following properties
    (configures a proxy user named hue so that Hue can access HDFS on behalf of its users):
    <property>
      <name>hadoop.proxyuser.hue.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.hue.groups</name>
      <value>*</value>
    </property>
    Restart the Hadoop services!!!
        $ sbin/stop-all.sh
        $ sbin/start-all.sh
        
2) Edit Hue's configuration file
        hue.ini:
[[hdfs_clusters]]
    # HA support by using HttpFs
    [[[default]]]
      # Enter the filesystem uri
      fs_defaultfs=hdfs://192.168.134.101:8020
      # NameNode logical name.
      ## logical_name=
      # Use WebHdfs/HttpFs as the communication mechanism.
      # Domain should be the NameNode or HttpFs host.
      # Default port is 14000 for HttpFs.
      webhdfs_url=http://192.168.134.101:50070/webhdfs/v1
      # Change this if your HDFS cluster is Kerberos-secured
      ## security_enabled=false
        
       hadoop_hdfs_home=/opt/modules/hadoop-2.5.0-cdh5.3.6
       hadoop_bin=/opt/modules/hadoop-2.5.0-cdh5.3.6/bin
       hadoop_conf_dir=/opt/modules/hadoop-2.5.0-cdh5.3.6/etc/hadoop            
  [[yarn_clusters]]
    [[[default]]]
      # Enter the host on which you are running the ResourceManager
      resourcemanager_host=192.168.134.101
      # The port where the ResourceManager IPC listens on
      resourcemanager_port=8032
      # Whether to submit jobs to this cluster
      submit_to=True
      # Resource Manager logical name (required for HA)
      ## logical_name=
      # Change this if your YARN cluster is Kerberos-secured
      ## security_enabled=false
      # URL of the ResourceManager API
      resourcemanager_api_url=http://192.168.134.101:8088
      # URL of the ProxyServer API
      proxy_api_url=http://192.168.134.101:8088
      # URL of the HistoryServer API
      history_server_api_url=http://192.168.134.101:19888
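The resourcemanager_api_url above is the ResourceManager's REST endpoint; for example, GET http://<rm-host>:8088/ws/v1/cluster/info returns cluster status as JSON. A small parsing sketch follows (the sample response body is abbreviated and purely illustrative):

```python
# Sketch: parse the JSON that the ResourceManager REST API returns
# (GET http://<rm-host>:8088/ws/v1/cluster/info). The sample body
# below is abbreviated for illustration.
import json

SAMPLE = '{"clusterInfo": {"state": "STARTED", "haState": "ACTIVE"}}'

def cluster_state(body):
    """Extract the cluster state from a /ws/v1/cluster/info response."""
    return json.loads(body)["clusterInfo"]["state"]

print(cluster_state(SAMPLE))
# STARTED
```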
    
3) Restart Hue and verify the Hadoop integration in the web UI
    Manage HDFS
        create, delete, modify, and browse files
    Manage jobs
        note: change the job filter's username from admin to the regular user
                 
IV. Integrating Hue with Hive
1) Start the HiveServer2 service
    Hue relies on the HiveServer2 process to query Hive table data

$ bin/hiveserver2 &
# the trailing & runs the process in the background
   
2) Configure and start Hive's metastore service
    The metastore service:
        Hue reads Hive table metadata from the remote MySQL instance through the metastore process
        In production, MySQL is usually deployed on a remote host, so local clients fetch metadata through the metastore
    Configuration: add the following to hive-site.xml:
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://192.168.134.101:9083</value>
  </property>
    Start it:
$ bin/hive --service metastore &
   
3) Configure hue.ini
    
[beeswax]
  # Host where HiveServer2 is running.
  # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
  hive_server_host=192.168.134.101
  # Port where HiveServer2 Thrift server runs on.
  hive_server_port=10000
  # Hive configuration directory, where hive-site.xml is located
  hive_conf_dir=/opt/modules/hive-0.13.1-cdh5.3.6/conf   
 
    Restart Hue!!!!!
    
4) Verify the integration in the web UI
            
V. Integrating Hue with a MySQL database
    Edit the hue.ini configuration:
  
[[databases]]
    # sqlite configuration.
    ## [[[sqlite]]]
      # Name to show in the UI.
      ## nice_name=SQLite
      # For SQLite, name defines the path to the database.
      ## name=/tmp/sqlite.db
      # Database backend to use.
      ## engine=sqlite
      # Database options to send to the server when connecting.
      # https://docs.djangoproject.com/en/1.4/ref/databases/
      ## options={}
    # mysql, oracle, or postgresql configuration.
    [[[mysql]]]
      # Name to show in the UI.
      nice_name="My SQL DB"
      # For MySQL and PostgreSQL, name is the name of the database.
      # For Oracle, Name is instance of the Oracle server. For express edition
      # this is 'xe' by default.
      ## name=mysqldb
      # Database backend to use. This can be:
      # 1. mysql
      # 2. postgresql
      # 3. oracle
      engine=mysql
      # IP or hostname of the database to connect to.
      host=192.168.134.101
      # Port the database server is listening to. Defaults are:
      # 1. MySQL: 3306
      # 2. PostgreSQL: 5432
      # 3. Oracle Express Edition: 1521
      port=3306
      # Username to authenticate with when connecting to the database.
      user=root
      # Password matching the username to authenticate with when
      # connecting to the database.
      password=root123    
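For reference, the values in the [[[mysql]]] section above map onto a standard database connection URL; the sketch below simply assembles such a URL from those values (illustrative only -- Hue itself reads the individual keys from hue.ini):

```python
# Assemble an engine://user:password@host:port/name style URL from the
# hue.ini values above. Purely illustrative.
def db_url(engine, host, port, user, password, name=""):
    """Build a database connection URL from its parts."""
    return f"{engine}://{user}:{password}@{host}:{port}/{name}"

print(db_url("mysql", "192.168.134.101", 3306, "root", "root123"))
# mysql://root:root123@192.168.134.101:3306/
```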

    Restart Hue
    
VI. Integrating Hue with Oozie
[liboozie]
  # The URL where the Oozie service runs on. This is required in order for
  # users to submit jobs. Empty value disables the config check.
  oozie_url=http://192.168.134.101:11000/oozie
  # Requires FQDN in oozie_url if enabled
  ## security_enabled=false
  # Location on HDFS where the workflows/coordinator are deployed when submitted.
  remote_deployement_dir=/examples  # root directory on HDFS for user-defined workflow applications
  

