参考: https://www.2cto.com/net/201803/731209.html
Hive是一个数据仓库基础工具在Hadoop中用来处理结构化数据。它架构在Hadoop之上,总归为大数据,并使得查询和分析方便
它存储架构在一个数据库中并处理数据到HDFS。
• 它是专为OLAP设计。
• 它提供SQL类型语言查询叫HiveQL或HQL。
• 它是熟知,快速,可扩展和可扩展的
相关文档资料: http://hive.apache.org/
官网下载: http://hive.apache.org/downloads.html
https://mirrors.tuna.tsinghua.edu.cn/apache/hive/
选择稳定版:apache-hive-2.3.4-bin.tar.gz
主namenode节点hdp-01: mkdir -p /opt/hive
通过FileZilla上传tgz安装包至hdp-01的hive目录
解压:cd /opt/hive && tar -zxvf apache-hive-2.3.4-bin.tar.gz
vim ~/.bash_profile
#末尾添加环境变量
#hive
export HIVE_HOME=/opt/hive/apache-hive-2.3.4-bin
export PATH=$HIVE_HOME/bin:$PATH
#立即生效
source ~/.bash_profile
cd $HIVE_HOME/conf
cp hive-default.xml.template hive-site.xml
vim hive-site.xml
#修改 属性:hive.metastore.warehouse.dir 为 /data/hive/warehouse
#修改 属性:hive.exec.scratchdir 为 /data/hive/tmp
3.2中配置的目录都是hdfs上的目录,需要单独创建
#创建warehouse目录
hdfs dfs -mkdir -p /data/hive/warehouse
hdfs dfs -chmod 777 /data/hive/warehouse
#创建tmp目录
hdfs dfs -mkdir -p /data/hive/tmp
hdfs dfs -chmod 777 /data/hive/tmp
#验证
hdfs dfs -ls /data/hive/
sed -i 's/\${system:java.io.tmpdir}/\/opt\/hive\/apache-hive-2.3.4-bin\/tmp/g' hive-site.xml
sed -i 's/\${system:user.name}/hadoop/g' hive-site.xml
mkdir -p /opt/hive/apache-hive-2.3.4-bin/tmp
sudo yum install -y mysql
下载一个mysql-connector-java驱动包 mysql-connector-java-5.1.44.tar.gz,并上传解压
http://ftp.ntu.edu.tw/MySQL/Downloads/Connector-J/
tar -zxvf mysql-connector-java-5.1.44.tar.gz
移动jar包
cp mysql-connector-java-5.1.44-bin.jar $HIVE_HOME/lib
修改属性值:
javax.jdo.option.ConnectionURL
javax.jdo.option.ConnectionDriverName
javax.jdo.option.ConnectionUserName
javax.jdo.option.ConnectionPassword
hive.metastore.schema.verification
javax.jdo.option.ConnectionURL
jdbc:mysql://hdp-01:3306/hive?createDatabaseIfNotExist=true
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
javax.jdo.option.ConnectionDriverName
com.mysql.jdbc.Driver
Driver class name for a JDBC metastore
javax.jdo.option.ConnectionUserName
hive
Username to use against metastore database
javax.jdo.option.ConnectionPassword
hive
password to use against metastore database
hive.metastore.schema.verification
false
Enforce metastore schema version consistency.
True: Verify that version information stored in is compatible with one from Hive jars. Also disable automatic
schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
proper metastore schema migration. (Default)
False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
cp $HIVE_HOME/conf/hive-env.sh.template $HIVE_HOME/conf/hive-env.sh
vim $HIVE_HOME/conf/hive-env.sh
增加如下内容
export HADOOP_HOME=/opt/hadoop/hadoop-2.9.0
export HIVE_CONF_DIR=/opt/hive/apache-hive-2.3.4-bin/conf
export HIVE_AUX_JARS_PATH=/opt/hive/apache-hive-2.3.4-bin/lib
sudo service mysqld start
sudo chkconfig mysqld on
sudo chkconfig --list | grep mysql
mysql
create user hive@'hdp-01' identified by 'hive';
grant all privileges on *.* to 'hive'@'hdp-01' with grant option;
flush privileges;
此处使用本地独立模式安装的mysql,若使用远程模式授权语句可改为:grant all privileges on . to ‘hive’@‘hdp-01’ with grant option;
schematool -initSchema -dbType mysql
登陆数据库查看
nohup hive --service metastore &
nohup hive --service hiveserver2 &
hive
问题:Caused by: java.lang.OutOfMemoryError: unable to create new native thread
原因:mysql启动了过多的线程
解决:ulimit -u 10240
https://www.cnblogs.com/zengkefu/p/5649407.html
hive创建一张表
create table tmp_hjl_20190113_1 (id decimal(20,10),name string);
Mysql:存储Hive表的元数据
参考: https://blog.csdn.net/SunnyYoona/article/details/51648871
19/01/13 17:55:09 [main]: WARN jdbc.HiveConnection: Failed to connect to hdp-01:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://hdp-01:10000: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hadoop is not allowed to impersonate hive (state=08S01,code=0)
解决方案:
修改Hadoop配置文件
vim /opt/hadoop/hadoop-2.9.0/etc/hadoop/core-site.xml
hadoop.proxyuser.hadoop.hosts
*
hadoop.proxyuser.hadoop.groups
*
说明:
hadoop.proxyuser.XXX.hosts 与 hadoop.proxyuser.XXX.groups 中XXX为异常信息中User:* 中的用户名部分
hadoop.proxyuser.xiaosi.hosts
*
The superuser can connect only from host1 and host2 to impersonate a user
hadoop.proxyuser.xiaosi.groups
*
Allow the superuser oozie to impersonate any members of the group group1 and group2
重启hadoop后,重新beeline