Setting Up a Hive Environment (Part 1)


I. Like Hadoop, Hive has three run modes:

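1. Embedded mode

Metadata is stored in an embedded Derby database on the local machine. This is the simplest way to use Hive, but it has an obvious drawback: an embedded Derby database can be opened by only one process at a time, so it does not support multiple concurrent sessions.

2. Local mode

Metadata is stored in a standalone database on the local machine (usually MySQL), which allows multiple sessions and multiple users to connect.

3. Remote mode

This mode suits deployments with many Hive clients. MySQL is split out and the metadata is kept in a standalone remote MySQL service, so that MySQL does not have to be installed redundantly on every client.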

II. Download and install Hive

http://hive.apache.org/downloads.html

III. Configure the system environment variables:

Edit /etc/profile or ~/.bashrc (with vim /etc/profile or vim ~/.bashrc) and add the following:
# Hive environment
export HIVE_HOME=/usr/local/hadoop/hive
export PATH=$HIVE_HOME/bin:$HIVE_HOME/conf:$PATH

IV. Embedded mode

(1) Edit the Hive configuration files hive-env.sh and hive-site.xml

hive-env.sh

export HADOOP_HEAPSIZE=1024

# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=/usr/local/hadoop

# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/usr/local/hadoop/hive/conf

# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/usr/local/hadoop/hive/lib

hive-site.xml

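hive.metastore.warehouse.dir
This parameter sets the directory where Hive stores its data. The default location is /user/hive/warehouse on HDFS.

hive.exec.scratchdir
This parameter sets the directory for Hive's temporary data files. The default location is /tmp/hive on HDFS.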

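(2) Create the required directories

As shown above, hive-site.xml refers to two important paths. Switch to the hadoop user and check whether they exist on HDFS:

$ hdfs dfs -ls /

If the paths mentioned above are not there, create them yourself and grant write (w) permission on them. The -p flag creates any missing parent directories.

$ hdfs dfs -mkdir -p /user/hive/warehouse
$ hdfs dfs -mkdir -p /tmp/hive
$ hdfs dfs -chmod 777 /user/hive/warehouse
$ hdfs dfs -chmod 777 /tmp/hive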
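The directory setup in step (2) can be wrapped in a small script. This is only a sketch: the helper names run and ensure_hdfs_dir and the DRY_RUN switch are made up for illustration, and by default the script only prints the hdfs commands instead of running them, so it can be tried on a machine without Hadoop.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: DRY_RUN=1 (the default here) prints commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print or execute a command depending on DRY_RUN.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: create an HDFS directory (with parents) and open its permissions.
ensure_hdfs_dir() {
  local dir="$1"
  run hdfs dfs -mkdir -p "$dir"
  run hdfs dfs -chmod 777 "$dir"
}

# The two paths referenced by hive-site.xml.
for dir in /user/hive/warehouse /tmp/hive; do
  ensure_hdfs_dir "$dir"
done
```

Run it once as is to review the commands, then again with DRY_RUN=0 as the hadoop user to apply them.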

(3) Change the io.tmpdir path

In addition, every value in hive-site.xml that contains ${system:java.io.tmpdir} must be changed to a real path. You can create a directory of your own to replace it, for example:

mkdir -p /home/hadoop/cloud/apache-hive-2.1.1-bin/iotmp
chmod 777 /home/hadoop/cloud/apache-hive-2.1.1-bin/iotmp

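Finally, take the property

  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/home/lch/software/Hive/apache-hive-2.1.1-bin/tmp/${system:user.name}</value>
    <description>Local scratch space for Hive jobs</description>
  </property>

and change ${system:user.name} to ${user.name}:

  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/home/lch/software/Hive/apache-hive-2.1.1-bin/tmp/${user.name}</value>
    <description>Local scratch space for Hive jobs</description>
  </property>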
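The manual edits in step (3) can also be done with sed. The sketch below is one possible way, not the only one: fix_tmpdir is a hypothetical helper name, and it assumes the replacement directory contains no | or & characters. The demo runs on a throwaway file, so no real hive-site.xml is touched.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: rewrite the placeholders in a hive-site.xml file.
#   $1 = path to hive-site.xml
#   $2 = directory that replaces ${system:java.io.tmpdir}
# A backup of the original file is kept next to it with a .bak suffix.
fix_tmpdir() {
  local site_xml="$1" tmp_dir="$2"
  # '|' is the sed delimiter because the replacement is a path containing '/'.
  sed -i.bak \
    -e 's|\${system:java\.io\.tmpdir}|'"$tmp_dir"'|g' \
    -e 's|\${system:user\.name}|${user.name}|g' \
    "$site_xml"
}

# Demo on a temporary file.
demo_dir="$(mktemp -d)"
cat > "$demo_dir/hive-site.xml" <<'EOF'
<value>${system:java.io.tmpdir}/${system:user.name}</value>
EOF

fix_tmpdir "$demo_dir/hive-site.xml" "/home/hive/iotmp"
cat "$demo_dir/hive-site.xml"
```

On a real installation, call fix_tmpdir with the path to $HIVE_HOME/conf/hive-site.xml and the directory you created.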

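(4) Initialize

Initialize the metastore schema in Derby (run from $HIVE_HOME/bin):

./schematool -initSchema -dbType derby

To start the metastore and HiveServer2 services in the background:

nohup ./hive --service metastore &
nohup ./hive --service hiveserver2 &

Run Hive

As mentioned earlier, embedded mode uses the default configuration and the Derby database, so no other changes are needed. Start Hadoop first with ./start-all.sh, then run hive directly:

./hive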

V. Thrift service

Start the Thrift service through hiveServer/hiveServer2. Clients (JDBC, Java and so on) then connect to the Thrift service to access the Hive database.


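  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
    <description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>
  </property>

  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>127.0.0.1</value>
    <description>Bind host on which to run the HiveServer2 Thrift service.</description>
  </property>

  <property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
    <description>
      Setting this property to true will have HiveServer2 execute
      Hive operations as the user making the calls to it.
      If true, HiveServer2 runs statements as the user who submitted them.
      If false, statements run as the user that runs the HiveServer2 daemon.
    </description>
  </property>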

Start the Thrift service: hive --service hiveserver2

Test the Thrift service:

Open a new terminal window and run beeline:

./beeline
1. Embedded mode:
    If hive.server2.authentication is "NONE" in HIVE_HOME/conf/hive-site.xml, connect beeline with the URL below.
    Connection URL:
        !connect jdbc:hive2://
2. Remote mode:
    i.) SASL authentication:
        If the "hive.server2.authentication" property in HIVE_HOME/conf/hive-site.xml is set to "SASL", connect beeline with the URL below.
        Beeline URL:
            !connect jdbc:hive2://<host>:<port>/<db>
    ii.) NOSASL authentication:
        If "hive.server2.authentication" is "NOSASL", connect beeline as below.
        Beeline URL:
            !connect jdbc:hive2://<host>:<port>/<db>;auth=noSasl
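The URL patterns above can be captured in a small shell function. This is a minimal sketch: hive2_url is a hypothetical helper, and localhost, 10000 and default are example values rather than settings taken from this installation.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print a HiveServer2 JDBC URL.
#   hive2_url                      -> embedded mode URL
#   hive2_url HOST PORT DB [AUTH]  -> remote mode URL; AUTH=noSasl appends ;auth=noSasl
hive2_url() {
  if [ "$#" -eq 0 ]; then
    printf 'jdbc:hive2://\n'
    return
  fi
  local host="$1" port="$2" db="$3" auth="${4:-}"
  local url="jdbc:hive2://${host}:${port}/${db}"
  if [ "$auth" = "noSasl" ]; then
    url="${url};auth=noSasl"
  fi
  printf '%s\n' "$url"
}

# Print the three variants described in the text.
hive2_url
hive2_url localhost 10000 default
hive2_url localhost 10000 default noSasl

# With HiveServer2 running, the URL can be passed to beeline on the command line:
#   beeline -u "$(hive2_url localhost 10000 default)"
```

Passing the URL with beeline -u connects directly, without typing !connect at the beeline prompt.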

VI. Local mode (MySQL as the metastore database, with MySQL and Hive both installed on the master server)

First, install MySQL as the root user:

yum install mysql-server

Once the installation finishes:

service mysqld start 
mysql
If no error is reported, run the following commands:
use mysql; 
update user set password=PASSWORD('hadoop') where user='root';
flush privileges; 
quit 
service mysqld restart 
mysql -u root -p 
password:hadoop

MySQL is now installed successfully. Next, create the Hive metastore database and a user for it:

mysql>create database hive;
mysql>alter database hive character set latin1;
mysql>grant all privileges on hive.* to 'hive'@'%' identified by 'hive';
mysql>flush privileges;

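VII. Install Hive

Hive requires a working Hadoop installation. You also need to create the /user/hive/warehouse and /tmp/hive directories on HDFS and give them 777 permissions (without -p, mkdir can only create one level at a time):

hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -mkdir -p /tmp/hive
hdfs dfs -chmod 777 /user/hive/warehouse
hdfs dfs -chmod 777 /tmp/hive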

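Download Hive (apache-hive-2.2.0-bin.tar.gz) into /home/connect/install, then extract it to /home/connect/software and create a symlink named hive:

cd /home/connect/install
tar -zxvf apache-hive-2.2.0-bin.tar.gz -C /home/connect/software
cd /home/connect/software
ln -s apache-hive-2.2.0-bin hive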

Then set the environment variables (add the two export lines to ~/.bashrc, then reload it):

vim ~/.bashrc
export HIVE_HOME=/home/connect/software/hive
export PATH=$HIVE_HOME/bin:$HIVE_HOME/conf:$PATH
source ~/.bashrc

Edit the configuration files. Create them from the templates in the conf directory:

cd $HIVE_HOME/conf
cp hive-default.xml.template hive-site.xml
cp hive-env.sh.template hive-env.sh

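a. Edit hive-site.xml

  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://10.37.167.203:3306/hive?characterEncoding=UTF-8</value>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hadoop</value>
  </property>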
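As an alternative to editing the copied template by hand, the four metastore connection properties can be generated into a small file of overrides. The sketch below is an illustration only: property is a hypothetical helper, the host, database name and credentials are the example values used in this post, the output goes to a temporary file rather than the real conf directory, and values are written as is without XML escaping.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print one Hadoop-style <property> element.
# Values are not XML-escaped, so avoid characters such as & and < in them.
property() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Assumed connection details; replace them with your own.
db_host="10.37.167.203"
db_name="hive"   # the database created with "create database hive"
out_file="$(mktemp)"

{
  printf '<?xml version="1.0"?>\n<configuration>\n'
  property javax.jdo.option.ConnectionURL "jdbc:mysql://${db_host}:3306/${db_name}?characterEncoding=UTF-8"
  property javax.jdo.option.ConnectionDriverName "com.mysql.jdbc.Driver"
  property javax.jdo.option.ConnectionUserName "root"
  property javax.jdo.option.ConnectionPassword "hadoop"
  printf '</configuration>\n'
} > "$out_file"

# Show the generated configuration for review.
cat "$out_file"
```

Review the printed file before merging these properties into $HIVE_HOME/conf/hive-site.xml.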

Change the io.tmpdir path: every value in hive-site.xml that contains ${system:java.io.tmpdir} must be changed to a real path. You can create a directory of your own to replace it, for example:

mkdir -p /home/connect/hive/bin/iotmp
chmod 777 /home/connect/hive/bin/iotmp

Then replace every occurrence of ${system:java.io.tmpdir} in hive-site.xml with /home/connect/hive/bin/iotmp.

Copy the JDBC driver

Copy the MySQL JDBC driver jar into Hive's lib directory:

cp mysql-connector-java.bin.jar /home/connect/hive/lib

6. Distribute Hive to slave1, slave2 and slave3

scp -r /usr/local/hadoop/apache-hive-1.2.1-bin slave1:/usr/local/hadoop/ 
scp -r /usr/local/hadoop/apache-hive-1.2.1-bin slave2:/usr/local/hadoop/ 
scp -r /usr/local/hadoop/apache-hive-1.2.1-bin slave3:/usr/local/hadoop/

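Set the environment variables and so on the same way as on master.

7. Test Hive

Go to the bin directory of the Hive installation and start the CLI:

cd /usr/local/hadoop/apache-hive-1.2.1-bin/bin
hive
hive>show tables;

If the command returns normally, Hive is installed and configured correctly.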

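Initialize the metastore schema (on Hive 2.1 and later, do this before starting Hive for the first time):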

schematool -dbType mysql -initSchema
