Hive is a data warehouse analysis system built on top of Hadoop. It provides a rich SQL-style query interface for analyzing data stored in the Hadoop Distributed File System. Within the Hadoop stack it plays the role of a SQL parser: it accepts user statements through its external interfaces, analyzes them, compiles an execution plan made up of MapReduce stages, generates the corresponding MapReduce jobs according to that plan, submits them to the Hadoop cluster, and collects the final results. Metadata, such as table schemas, is stored in a database called the metastore.
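You can watch this SQL-to-MapReduce translation from the Hive shell itself. A minimal sketch (the logs table here is hypothetical): EXPLAIN prints the compiled stage plan, and on the MapReduce engine an aggregation query is split into a map stage (partial aggregation) followed by a reduce stage (final aggregation).
hive> EXPLAIN SELECT ip, SUM(bytes) FROM logs GROUP BY ip;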
1. Install MySQL
① Remove the MySQL packages that ship with the OS (only packages whose names begin with mysql)
[hadoop@masternode1 hadoop]# rpm -qa|grep mysql
mysql-libs-5.1.73-3.el6_5.x86_64
[hadoop@masternode1 hadoop]# rpm -e --nodeps mysql-libs-5.1.73-3.el6_5.x86_64
② Install dependency packages
[hadoop@masternode1 hadoop]#yum install gcc gcc-c++ ncurses-devel -y
③ Install cmake
Download the cmake source package:
[hadoop@masternode1 hadoop]$ wget http://www.cmake.org/files/v2.8/cmake-2.8.12.tar.gz
[hadoop@masternode1 hadoop]$ tar zxvf cmake-2.8.12.tar.gz
[hadoop@masternode1 hadoop]$ cd cmake-2.8.12
[hadoop@masternode1 cmake-2.8.12]$ ./bootstrap
[hadoop@masternode1 cmake-2.8.12]# make && make install
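As a quick sanity check that the freshly built cmake is usable (not part of the original transcript):
[hadoop@masternode1 cmake-2.8.12]$ cmake --version
cmake version 2.8.12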
④ Create the mysql user and the required directories
[hadoop@masternode1 ~]# groupadd mysql ; useradd -g mysql mysql
[hadoop@masternode1 ~]# su - hadoop
[hadoop@masternode1 ~]$ mkdir -p /opt/hadoop/mysql-5.6.25
[hadoop@masternode1 ~]$ mkdir -p /opt/hadoop/mysql-5.6.25/data
[hadoop@masternode1 ~]$ mkdir -p /opt/hadoop/mysql-5.6.25/log
⑤ Adjust directory permissions
[hadoop@masternode1 ~]$ chmod +w /opt/hadoop/mysql-5.6.25
[hadoop@masternode1 ~]$ chmod 755 /opt/hadoop/mysql-5.6.25/
⑥ Download the MySQL source package and install it
wget http://dev.mysql.com/get/downloads/mysql/mysql-5.6.25.tar.gz
[hadoop@masternode1 hadoop]$ tar zxvf mysql-5.6.25.tar.gz
[hadoop@masternode1 hadoop]$ cd mysql-5.6.25
[hadoop@masternode1 mysql-5.6.25]$ mkdir data
[hadoop@masternode1 mysql-5.6.25]$ cmake -DCMAKE_INSTALL_PREFIX=/opt/hadoop/mysql-5.6.25 \
  -DMYSQL_UNIX_ADDR=/opt/hadoop/mysql-5.6.25/mysql.sock \
  -DDEFAULT_CHARSET=utf8 -DDEFAULT_COLLATION=utf8_general_ci \
  -DWITH_INNOBASE_STORAGE_ENGINE=1 -DWITH_ARCHIVE_STORAGE_ENGINE=1 \
  -DWITH_BLACKHOLE_STORAGE_ENGINE=1 \
  -DMYSQL_DATADIR=/opt/hadoop/mysql-5.6.25/data/ \
  -DMYSQL_TCP_PORT=3306 -DENABLE_DOWNLOADS=1
If the cmake command cannot be found, put it on the PATH first and rerun it:
export PATH=/opt/hadoop/cmake-2.8.12/bin/:$PATH
[hadoop@masternode1 mysql-5.6.25]# make && make install
Note:
The directory holding the socket set by -DMYSQL_UNIX_ADDR=/opt/hadoop/mysql-5.6.25/mysql.sock must be writable: chmod 755 /opt/hadoop/mysql-5.6.25/
Alternatively, use -DMYSQL_UNIX_ADDR=/tmp/mysql.sock, in which case only /tmp needs to be writable: chmod 755 /tmp
and chown mysql:mysql /tmp. For this reason it is best to point -DMYSQL_UNIX_ADDR at /tmp/mysql.sock.
⑦ Create symbolic links
[hadoop@masternode1 ~]# ln -s /opt/hadoop/mysql-5.6.25/libmysql/libmysqlclient.so.18 /usr/lib/libmysqlclient.so.18
[hadoop@masternode1 ~]# chown hadoop:hadoop /usr/lib/libmysqlclient.so.18
⑧ Edit the configuration file
[hadoop@masternode1 mysql-5.6.25]# cp support-files/my-default.cnf /etc/my.cnf
[hadoop@masternode1 mysql-5.6.25]# chown hadoop:hadoop /etc/my.cnf
[hadoop@masternode1 mysql-5.6.25]$ vi /etc/my.cnf
These settings go under the [mysqld] section:
[mysqld]
datadir=/opt/hadoop/mysql-5.6.25/data
log-error=/opt/hadoop/mysql-5.6.25/mysql_error.log
pid-file=/opt/hadoop/mysql-5.6.25/data/mysql.pid
socket=/opt/hadoop/mysql-5.6.25/mysql.sock
user=mysql
⑨ Initialize the database
[hadoop@masternode1 ~]$ chmod 755 /opt/hadoop/mysql-5.6.25/scripts/mysql_install_db
[hadoop@masternode1 ~]$ /opt/hadoop/mysql-5.6.25/scripts/mysql_install_db --user=mysql --basedir=/opt/hadoop/mysql-5.6.25 --datadir=/opt/hadoop/mysql-5.6.25/data/ --defaults-file=/etc/my.cnf
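As a sanity check (not part of the original transcript), the data directory should now contain the system databases created by mysql_install_db:
[hadoop@masternode1 ~]$ ls /opt/hadoop/mysql-5.6.25/data
mysql  performance_schema  test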
⑩ Create the startup script
[hadoop@masternode1 ~]# cp /opt/hadoop/mysql-5.6.25/support-files/mysql.server /etc/init.d/mysqld
[hadoop@masternode1 ~]# chown hadoop:hadoop /etc/init.d/mysqld ; chmod 755 /etc/init.d/mysqld
[hadoop@slavenode1 mysql-5.6.25]$ mysql -uroot -h127.0.0.1 -p
bash: mysql: command not found
Fix:
ln -s /opt/hadoop/mysql-5.6.25/bin/mysql /usr/bin/mysql
⑪ Start the MySQL service
[hadoop@masternode1 mysql-5.6.25]# service mysqld start
Starting MySQL. SUCCESS!
Enable the service at boot:
[hadoop@masternode1 mysql-5.6.25]# chkconfig --level 2345 mysqld on
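To verify the service state and the boot-time registration (a verification step, not from the original transcript):
[hadoop@masternode1 mysql-5.6.25]# service mysqld status
 SUCCESS! MySQL running (<pid>)
[hadoop@masternode1 mysql-5.6.25]# chkconfig --list mysqld
mysqld         0:off  1:off  2:on  3:on  4:on  5:on  6:off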
⑫ Set the initial root password
[hadoop@masternode1 ~]$ mysql -uroot -h127.0.0.1 -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.6.25 Source distribution
Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> set PASSWORD=PASSWORD('123456');
Query OK, 0 rows affected (0.00 sec)
⑬ Create the hive user
mysql> CREATE USER 'hive' IDENTIFIED BY 'hive';
Query OK, 0 rows affected (0.00 sec)
mysql> GRANT ALL PRIVILEGES on *.* TO 'hive'@'masternode1' WITH GRANT OPTION;
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
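To double-check the account and its privileges (a verification query, not in the original session):
mysql> SHOW GRANTS FOR 'hive'@'masternode1';
-- expected to list: GRANT ALL PRIVILEGES ON *.* TO 'hive'@'masternode1' WITH GRANT OPTION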
⑭ Log in as the hive user
[hadoop@masternode1 ~]$ mysql -h masternode1 -uhive
mysql> set PASSWORD = PASSWORD('hive');
⑮ Create the hive database
mysql> create database hive;
Query OK, 1 row affected (0.00 sec)
2. Unpack the Hive package
[hadoop@masternode1 hadoop]# tar -zxf apache-hive-2.0.0-bin.tar.gz
[hadoop@masternode1 hadoop]# mv apache-hive-2.0.0-bin hive-2.0.0
① Set the Hive environment variables
[hadoop@masternode1 hadoop]# cat /etc/profile|tail
#set hive
export HIVE_HOME=/opt/hadoop/hive-2.0.0
export PATH=$PATH:$HIVE_HOME/bin
vi ~/.bash_profile
#set hive
export HIVE_HOME=/opt/hadoop/hive-2.0.0
export PATH=$PATH:$HIVE_HOME/bin
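The new variables only take effect in fresh shells; to apply them to the current session and verify (a sanity step, not in the original):
[hadoop@masternode1 hadoop]$ source /etc/profile ; source ~/.bash_profile
[hadoop@masternode1 hadoop]$ echo $HIVE_HOME
/opt/hadoop/hive-2.0.0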
② Configure the parameter files
[hadoop@masternode1 hadoop]$ cd hive-2.0.0/conf/
[hadoop@masternode1 conf]$ cp hive-env.sh.template hive-env.sh
[hadoop@masternode1 conf]$ vi hive-env.sh
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.2
export HIVE_HOME=/opt/hadoop/hive-2.0.0
[hadoop@masternode1 conf]$ cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
[hadoop@masternode1 conf]$ cp hive-log4j2.properties.template hive-log4j2.properties
[hadoop@masternode1 conf]$ cp hive-default.xml.template hive-site.xml
[hadoop@masternode1 conf]# vi hive-site.xml
Set the location of the Hive runtime structured log file (the hive.querylog.location property), and point the other local scratch paths, which default to ${system:java.io.tmpdir}, at local directories.
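A minimal sketch of the corresponding hive-site.xml entries, using the directories created in step ③ below (the property names are standard Hive 2.0 keys; treat the values as this installation's choices):
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/opt/hadoop/hive-2.0.0/local/hive</value>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/opt/hadoop/hive-2.0.0/local/${hive.session.id}_resources</value>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>/opt/hadoop/hive-2.0.0/logs</value>
</property>
<property>
  <name>hive.server2.logging.operation.log.location</name>
  <value>/opt/hadoop/hive-2.0.0/local/logs/operation_logs</value>
</property>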
Hive 2.0 uses Log4j2, so the log directory is set in the two files copied above:
[hadoop@masternode1 conf]$ vi hive-log4j2.properties
property.hive.log.dir = /opt/hadoop/hive-2.0.0/logs
[hadoop@masternode1 conf]$ vi hive-exec-log4j2.properties
property.hive.log.dir = /opt/hadoop/hive-2.0.0/logs
③ Create the local working directories
[hadoop@masternode1 hive-2.0.0]$ mkdir -p /opt/hadoop/hive-2.0.0/local/hive
[hadoop@masternode1 hive-2.0.0]$ mkdir -p /opt/hadoop/hive-2.0.0/logs
[hadoop@masternode1 hive-2.0.0]$ mkdir -p /opt/hadoop/hive-2.0.0/local/logs/operation_logs
④ Copy the MySQL JDBC driver
Download address:
wget http://ftp.nchu.edu.tw/Unix/Database/MySQL/Downloads/Connector-J/mysql-connector-java-5.1.36.tar.gz
[hadoop@masternode1 conf]# cp /home/centos/mysql-connector-java-5.1.36.tar.gz /opt/hadoop/
[hadoop@masternode1 conf]# cd /opt/hadoop/
[hadoop@masternode1 hadoop]# tar zxvf mysql-connector-java-5.1.36.tar.gz
[hadoop@masternode1 hadoop]# cd mysql-connector-java-5.1.36
[hadoop@masternode1 mysql-connector-java-5.1.36]# ls
build.xml CHANGES COPYING docs mysql-connector-java-5.1.36-bin.jar README README.txt src
[hadoop@masternode1 mysql-connector-java-5.1.36]# cp mysql-connector-java-5.1.36-bin.jar $HIVE_HOME/lib/
[hadoop@masternode1 mysql-connector-java-5.1.36]# cd $HIVE_HOME/conf
⑤ Copy the Hive directory to the other machines (replace <ip-prefix> below with your nodes' subnet prefix)
[hadoop@masternode1 hadoop]$ for i in {31,32,33,34,35,36}; do scp -r hive-2.0.0/ hadoop@<ip-prefix>.$i:/opt/hadoop/ ; done
⑥ Configure hive-site.xml on the 7 client machines
[hadoop@slavenode2 conf]# cat hive-site.xml
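The listing itself is omitted from the original; what follows is a minimal sketch of the metastore-related entries, with the JDBC values taken from the schematool output in step ⑦ below (the password is an assumption matching the account created in step ⑬):
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://slavenode1:3306/hive_local_meta?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://masternode1:9083</value>
  </property>
</configuration>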
⑦ Initialize the metastore database
[hadoop@masternode1 bin]# schematool -initSchema -dbType mysql
Metastore connection URL: jdbc:mysql://slavenode1:3306/hive_local_meta?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
Starting metastore schema initialization to 2.0.0
Initialization script hive-schema-2.0.0.mysql.sql
Initialization script completed
schemaTool completed
Check the firewall: disable it if you do not need it, or open the required ports (e.g. 9083, 9000, 9001, 50070). Then test: switch to the Hive home directory and run the following command.
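On a CentOS 6-era system that would look roughly like this (a hedged sketch, not from the original text; adjust to your environment):
[hadoop@masternode1 ~]# service iptables stop ; chkconfig iptables off
or, to open individual ports instead:
[hadoop@masternode1 ~]# iptables -I INPUT -p tcp --dport 9083 -j ACCEPT
[hadoop@masternode1 ~]# service iptables save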
[hadoop@masternode1 hive-2.0.0]# bin/hive --service metastore &
Starting Hive Metastore Server
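To confirm the metastore is listening on its port (a verification step, not in the original transcript):
[hadoop@masternode1 hive-2.0.0]$ netstat -lnt | grep 9083
tcp        0      0 0.0.0.0:9083        0.0.0.0:*        LISTEN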
⑧ Start Hive
$HIVE_HOME/bin/hive
[hadoop@masternode1 hadoop-2.7.2]# $HIVE_HOME/bin/hive
SLF4J: Class path contains multiple SLF4J bindings.
Logging initialized using configuration in file:/opt/hadoop/hive-2.0.0/conf/hive-log4j2.properties
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
hive> show databases;
OK
hive> create table studentinfo (id int,name string, age int,tel string)
> row format delimited fields terminated by '\t'
> stored as textfile;
OK
Time taken: 0.439 seconds
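The load in the next step expects a tab-separated file, one row per student. The file was presumably created along these lines from another shell (a sketch; the contents are inferred from the select output further down):
[hadoop@masternode1 ~]$ printf '1\ta\t26\t110\n2\tb\t29\t120\n' > /tmp/stdentInfo.txt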
hive> load data local inpath '/tmp/stdentInfo.txt' into table studentinfo;
Loading data to table default.studentinfo
OK
Time taken: 1.313 seconds
hive> select * from studentinfo;
OK
1 a 26 110
2 b 29 120
hive> show tables;
OK
studentinfo
Time taken: 1.008 seconds, Fetched: 1 row(s)
hive> create table tb(id int,name string);
OK
Time taken: 0.427 seconds
[hadoop@slavenode1 bin]$ hive --service hwi
This fails because the Hive 2.0.0 binary distribution does not ship the HWI war file in lib/. Fix: build the war file manually from the source distribution, as follows.
wget http://mirror.bit.edu.cn/apache/hive/hive-2.0.0/apache-hive-2.0.0-src.tar.gz
[hadoop@masternode1 hive-2.0.0]$ tar xf apache-hive-2.0.0-src.tar.gz
[hadoop@masternode1 hive-2.0.0]$ cd apache-hive-2.0.0-src
Build the war file:
[hadoop@masternode1 apache-hive-2.0.0-src]$ cd hwi
[hadoop@masternode1 hwi]$ jar cvfM0 hive-hwi-2.0.0.war -C web/ .
[hadoop@masternode1 hwi]$ cp hive-hwi-2.0.0.war /opt/hadoop/hive-2.0.0/lib/hive-hwi-2.0.0.war
[hadoop@masternode1 hive-2.0.0]$ scp /opt/hadoop/hive-2.0.0/lib/hive-hwi-2.0.0.war hadoop@masternode2:/opt/hadoop/hive-2.0.0/lib/hive-hwi-2.0.0.war
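HWI also has to be able to find the war file; a minimal sketch of the corresponding hive-site.xml entries (standard HWI property names; the host and port shown are the usual defaults, treat them as assumptions):
<property>
  <name>hive.hwi.war.file</name>
  <value>lib/hive-hwi-2.0.0.war</value>
</property>
<property>
  <name>hive.hwi.listen.host</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>hive.hwi.listen.port</name>
  <value>9999</value>
</property>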
[hadoop@slavenode1 bin]$ hive --service hwi
If you then see the following error:
Problem accessing /hwi/. Reason:
Unable to find a javac compiler;
com.sun.tools.javac.Main is not on the classpath.
Perhaps JAVA_HOME does not point to the JDK.
It is currently set to "/usr/java/jdk1.7.0_65/jre"
Fix:
cp /usr/java/jdk1.7.0_65/lib/tools.jar /opt/hadoop/hive-2.0.0/lib/
then restart with hive --service hwi.