A Simple Introduction to Hive

Introduction

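Hive is a client-side tool (you can also think of it simply as a piece of software) that translates HQL statements, which are similar to SQL, into MapReduce jobs and runs them to produce the result you need.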

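It works by storing, in MySQL (or another database), the metadata that describes how to parse files of a given format in the Hadoop file system. With that metadata in place, files on the distributed file system can be queried as if they were database tables.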

Environment preparation

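1. Three CentOS 6.5 machines

Turn off the firewall
Install the JDK
Configure the hosts file (zk1, zk2, zk3)
Configure passwordless SSH (including from each machine to itself)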

2. One MySQL machine (hostname mysql)

Allow remote connections
Grant the database privileges
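
The two MySQL steps above can be sketched as a pair of SQL statements. This is a minimal sketch, assuming a MySQL 5.x server (the `GRANT ... IDENTIFIED BY` form was removed in MySQL 8) and reusing the `root` / `123123` credentials that appear later in hive-site.xml. The file path `/tmp/metastore_grant.sql` is a hypothetical name, and granting on `*.*` is broader than a production setup would want.

```shell
#!/bin/sh
# Hypothetical output path for the generated SQL; override with SQL_FILE=...
SQL_FILE="${SQL_FILE:-/tmp/metastore_grant.sql}"

# Assumption: MySQL 5.x syntax. Lets root connect from any host ('%')
# with the password that hive-site.xml uses later in this guide.
cat > "$SQL_FILE" <<'EOF'
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123123' WITH GRANT OPTION;
FLUSH PRIVILEGES;
EOF

# Show what was written; nothing is executed against MySQL here.
cat "$SQL_FILE"
```

To apply it, run `mysql -uroot -p < /tmp/metastore_grant.sql` on the mysql host.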

Installing and running Hadoop

1. Configure Hadoop

Unpack

mkdir -p /opt/modules/cdh/
tar -zxvf hadoop-2.5.0-cdh5.3.6.tar.gz -C /opt/modules/cdh
cd /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/etc/hadoop

Edit the configuration files

  • Remove the .template suffix from any of core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml, hadoop-env.sh, yarn-env.sh and mapred-env.sh that ship with one
  • Set the JAVA_HOME variable in hadoop-env.sh, yarn-env.sh and mapred-env.sh
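  • core-site.xml (inside the <configuration> element)

        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://zk1:8020</value>
        </property>

        <property>
                <name>hadoop.tmp.dir</name>
                <value>/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/data</value>
        </property>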

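  • hdfs-site.xml

        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>

        <property>
                <name>dfs.permissions.enabled</name>
                <value>false</value>
        </property>

        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>zk3:50090</value>
        </property>

        <property>
                <name>dfs.namenode.http-address</name>
                <value>zk1:50070</value>
        </property>

        <property>
                <name>dfs.webhdfs.enabled</name>
                <value>true</value>
        </property>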

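  • yarn-site.xml

        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>

        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>zk2</value>
        </property>

        <property>
                <name>yarn.log-aggregation-enable</name>
                <value>true</value>
        </property>

        <property>
                <name>yarn.log-aggregation.retain-seconds</name>
                <value>86400</value>
        </property>

        <property>
                <name>yarn.log.server.url</name>
                <value>http://zk1:19888/jobhistory/logs/</value>
        </property>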

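  • mapred-site.xml

        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>zk1:10020</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>zk1:19888</value>
        </property>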

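Every setting in the four files above has the same shape: a `<property>` element holding a `<name>` and a `<value>`, nested inside the file's `<configuration>` root. As a rough sketch of that layout, the snippet below prints a minimal core-site.xml from the two values used in this guide. `make_property` and `make_core_site` are hypothetical helper names invented for this illustration; they are not part of Hadoop.

```shell
#!/bin/sh
# Hypothetical helper: print one Hadoop <property> element.
#   $1 = property name, $2 = property value
make_property() {
    printf '    <property>\n'
    printf '        <name>%s</name>\n' "$1"
    printf '        <value>%s</value>\n' "$2"
    printf '    </property>\n'
}

# Hypothetical helper: print a minimal core-site.xml using the
# two settings from this guide, wrapped in the <configuration> root.
make_core_site() {
    printf '<?xml version="1.0"?>\n<configuration>\n'
    make_property fs.defaultFS hdfs://zk1:8020
    make_property hadoop.tmp.dir /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/data
    printf '</configuration>\n'
}

make_core_site
```

Redirecting the output (for example `make_core_site > core-site.xml`) would produce a file in the form Hadoop expects; editing the files by hand, as described above, gives the same result.
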
Add the slaves file (in the etc/hadoop/ directory)

vi slaves

Add

zk1
zk2
zk3
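
If you prefer not to open an editor, the same file can be written non-interactively. A minimal sketch, assuming the file is named `slaves` as in Hadoop 2.x; `write_slaves` is a hypothetical function name used only here.

```shell
#!/bin/sh
# Hypothetical helper: write the three worker hostnames, one per line,
# to the slaves file in the given Hadoop configuration directory.
#   $1 = configuration directory (e.g. .../etc/hadoop)
write_slaves() {
    printf '%s\n' zk1 zk2 zk3 > "$1/slaves"
}

# Example call for the layout used in this guide (left commented out so
# that nothing is overwritten by accident):
# write_slaves /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/etc/hadoop
```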

Once the configuration is complete, copy it to the other two hosts with scp

scp -r /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/etc/hadoop root@zk2:/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/etc/
scp -r /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/etc/hadoop root@zk3:/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/etc/
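
The two commands differ only in the host name, so they can be folded into a loop. The sketch below is a dry run by default: it prints each scp command and only executes it when `RUN=1` is set. `sync_conf`, `HADOOP_DIR` and the `RUN` switch are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Install directory used throughout this guide.
HADOOP_DIR=/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6

# Hypothetical helper: for each host given as an argument, print the scp
# command that copies etc/hadoop to it; execute it only when RUN=1.
sync_conf() {
    for host in "$@"; do
        cmd="scp -r $HADOOP_DIR/etc/hadoop root@$host:$HADOOP_DIR/etc/"
        if [ "${RUN:-0}" = "1" ]; then
            # Word splitting is safe here because the paths contain no spaces.
            $cmd
        else
            echo "$cmd"
        fi
    done
}

# Dry run: shows the two commands without copying anything.
sync_conf zk2 zk3
```

Running it as `RUN=1 sh sync_conf.sh` (the script name is up to you) performs the actual copy.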

Format the NameNode on the NameNode machine (zk1)

/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs namenode -format

2. Start Hadoop

Start the NameNode (zk1)

/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/sbin/hadoop-daemon.sh start namenode

Start the SecondaryNameNode (zk3)

/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/sbin/hadoop-daemon.sh start secondarynamenode

Start the DataNode (zk1, zk2, zk3)

/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/sbin/hadoop-daemon.sh start datanode

Start the ResourceManager (zk2)

/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/sbin/yarn-daemon.sh start resourcemanager

Start the NodeManager (zk1, zk2, zk3)

/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/sbin/yarn-daemon.sh start nodemanager

Start the JobHistoryServer (zk1)

/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/sbin/mr-jobhistory-daemon.sh start historyserver
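
Before opening the browser, it can help to confirm on each host that the expected daemons are actually running. The JDK's `jps` tool lists Java processes as `<pid> <name>` lines; the sketch below reads that output on standard input and reports any daemon that is missing. `check_daemons` is a hypothetical helper name, and the per-host daemon lists in the comments simply restate the start commands above.

```shell
#!/bin/sh
# Hypothetical helper: read `jps` output on stdin and check that every
# daemon name passed as an argument appears as a process name.
# Prints "missing: <name>" for each absent daemon; returns 1 if any is missing.
check_daemons() {
    # Second column of jps output is the process (main class) name.
    running=$(awk '{print $2}')
    missing=0
    for daemon in "$@"; do
        # -x matches whole lines, so SecondaryNameNode does not count as NameNode.
        if ! printf '%s\n' "$running" | grep -qx "$daemon"; then
            echo "missing: $daemon"
            missing=1
        fi
    done
    return "$missing"
}

# Expected daemons per host in this guide:
#   zk1: jps | check_daemons NameNode DataNode NodeManager JobHistoryServer
#   zk2: jps | check_daemons DataNode ResourceManager NodeManager
#   zk3: jps | check_daemons SecondaryNameNode DataNode NodeManager
```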

To verify that startup completed, open http://zk1:50070 in a browser

If the page loads, the cluster started successfully

Installing and running Hive

Install Hive

Unpack the tarball

tar -zxvf hive-0.13.1-cdh5.3.6.tar.gz -C /opt/modules/cdh/

Edit the configuration files

  • Rename the configuration files
mv hive-default.xml.template hive-site.xml
mv hive-env.sh.template hive-env.sh
mv hive-log4j.properties.template hive-log4j.properties
  • hive-env.sh
JAVA_HOME=/usr/local/jdk
HADOOP_HOME=/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/
export HIVE_CONF_DIR=/opt/modules/cdh/hive-0.13.1-cdh5.3.6/conf
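  • hive-site.xml (modify the existing entries rather than adding new ones)
        <property>
                <name>javax.jdo.option.ConnectionURL</name>
                <value>jdbc:mysql://mysql:3306/metastore?createDatabaseIfNotExist=true</value>
                <description>JDBC connect string for a JDBC metastore</description>
        </property>

        <property>
                <name>javax.jdo.option.ConnectionDriverName</name>
                <value>com.mysql.jdbc.Driver</value>
                <description>Driver class name for a JDBC metastore</description>
        </property>

        <property>
                <name>javax.jdo.option.ConnectionUserName</name>
                <value>root</value>
                <description>username to use against metastore database</description>
        </property>

        <property>
                <name>javax.jdo.option.ConnectionPassword</name>
                <value>123123</value>
                <description>password to use against metastore database</description>
        </property>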
  • hive-log4j.properties
hive.log.dir=/opt/modules/cdh/hive-0.13.1-cdh5.3.6/logs

Copy the JDBC driver into the lib directory

cp -a mysql-connector-java-5.1.27-bin.jar /opt/modules/cdh/hive-0.13.1-cdh5.3.6/lib/

Run Hive

/opt/modules/cdh/hive-0.13.1-cdh5.3.6/bin/hive

Test
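
One possible smoke test is to create, inspect, and drop a small table, which exercises both the MySQL metastore connection and HDFS. The sketch below writes a few HQL statements to a file and runs them with `hive -f` if the Hive binary from this guide is present. The table name `smoke_test`, its columns, and the path `/tmp/hive_smoke_test.hql` are hypothetical choices made for this example.

```shell
#!/bin/sh
# Hypothetical path for the generated HQL script; override with HQL_FILE=...
HQL_FILE="${HQL_FILE:-/tmp/hive_smoke_test.hql}"

# smoke_test is a throwaway table: tab-delimited text with two columns.
cat > "$HQL_FILE" <<'EOF'
CREATE TABLE IF NOT EXISTS smoke_test (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
SHOW TABLES;
DESCRIBE smoke_test;
DROP TABLE smoke_test;
EOF

# Hive launcher installed earlier in this guide.
HIVE_BIN=/opt/modules/cdh/hive-0.13.1-cdh5.3.6/bin/hive

if [ -x "$HIVE_BIN" ]; then
    # -f runs the statements from a file instead of the interactive prompt.
    "$HIVE_BIN" -f "$HQL_FILE"
else
    echo "hive not found at $HIVE_BIN; statements saved to $HQL_FILE"
fi
```

If the statements complete without errors, the metastore connection and the JDBC driver are working; a failure on the first statement usually points at the MySQL settings in hive-site.xml.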

