Official documentation
There are three ways to set up Hive:
1. Local/Embedded Metastore Database (Derby)
2. Remote Metastore Database
3. Remote Metastore Server
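While learning, we usually start the server with hive --service metastore and then log in directly with the hive command as the client. Hive also provides HiveServer2 as a server-side startup mode, with support for beeline and JDBC. The official site notes that HiveServer2 is the more common choice in production environments.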
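HiveServer2 has the following advantages:
1. The application side does not need the Hadoop or Hive clients deployed.
2. HiveServer2 does not expose HDFS and the metastore directly to users.
3. It has an HA mechanism, which addresses concurrency and load problems on the application side.
4. It is accessed over JDBC, so applications written in different languages can exchange data with it easily.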
Setting up Hive HA
HA is implemented with ZooKeeper:
ZooKeeper-based service discovery introduced in Hive 0.14.0 (HIVE-7935) enables high availability and rolling upgrade for HiveServer2. A JDBC URL that specifies
With further changes in Hive 2.0.0 and 1.3.0 (unreleased, HIVE-11581), none of the additional configuration parameters such as authentication mode, transport mode, or SSL parameters need to be specified, as they are retrieved from the ZooKeeper entries along with the hostname.
The JDBC connection URL: jdbc:hive2://
.
The
Additional runtime parameters needed for querying can be provided within the URL as follows, by appending it as a ?
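To make the URL format concrete, here is a minimal sketch that assembles a ZooKeeper discovery URL of the same shape as the one used in the beeline and JDBC examples in this post. The class and method names (ZkDiscoveryUrl, build) are made up for illustration; the quorum hosts and the hiveserver2_zk namespace are the values used in this post's cluster.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds a HiveServer2 JDBC URL that uses ZooKeeper service discovery.
class ZkDiscoveryUrl {

    // quorum: ZooKeeper hosts (optionally host:port), database: Hive database name,
    // namespace: must match hive.server2.zookeeper.namespace on the servers.
    static String build(List<String> quorum, String database, String namespace) {
        return "jdbc:hive2://" + String.join(",", quorum) + "/" + database
                + ";serviceDiscoveryMode=zooKeeper"
                + ";zooKeeperNamespace=" + namespace;
    }

    public static void main(String[] args) {
        // Values taken from the cluster described in this post.
        List<String> quorum = Arrays.asList("node02", "node03", "node04");
        System.out.println(build(quorum, "default", "hiveserver2_zk"));
    }
}
```

The client never names a HiveServer2 host here: it lists only the ZooKeeper quorum, and the driver looks up a registered server under the namespace.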
1. Environment preparation
The cluster has four nodes, Node01 to Node04. The number of nodes running each role:

Role            | Nodes running it
Namenode        | 2
Journalnode     | 3
Datanode        | 3
Zkfc            | 2
zookeeper       | 3
resourcemanager | 2
nodemanager     | 3
Hiveserver2     | 2
beeline         | 1

In the rest of this post, ZooKeeper runs on node02, node03 and node04, HiveServer2 on node02 and node03, and beeline on node04.
2. hive-site.xml on node02 (HiveServer2 bound to node02:10001):
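<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://node01:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>node02</value>
    <description>Bind host on which to run the HiveServer2 Thrift service.</description>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10001</value>
    <description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>
  </property>
  <property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.users.in.admin.role</name>
    <value>root</value>
  </property>
  <property>
    <name>hive.security.authorization.manager</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
  </property>
  <property>
    <name>hive.security.authenticator.manager</name>
    <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
  </property>
  <property>
    <name>hive.server2.support.dynamic.service.discovery</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.zookeeper.namespace</name>
    <value>hiveserver2_zk</value>
  </property>
  <property>
    <name>hive.zookeeper.quorum</name>
    <value>node04:2181,node02:2181,node03:2181</value>
  </property>
  <property>
    <name>hive.zookeeper.client.port</name>
    <value>2181</value>
  </property>
</configuration>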
3. hive-site.xml on node03:
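<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://node01:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>node03</value>
    <description>Bind host on which to run the HiveServer2 Thrift service.</description>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
    <description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>
  </property>
  <property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.users.in.admin.role</name>
    <value>root</value>
  </property>
  <property>
    <name>hive.security.authorization.manager</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
  </property>
  <property>
    <name>hive.security.authenticator.manager</name>
    <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
  </property>
  <property>
    <name>hive.server2.support.dynamic.service.discovery</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.zookeeper.namespace</name>
    <value>hiveserver2_zk</value>
  </property>
  <property>
    <name>hive.zookeeper.quorum</name>
    <value>node04:2181,node02:2181,node03:2181</value>
  </property>
  <property>
    <name>hive.zookeeper.client.port</name>
    <value>2181</value>
  </property>
</configuration>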
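The two files differ only in the bind host and Thrift port. For both HiveServer2 instances to register under the same ZooKeeper path, dynamic service discovery must be enabled on both, and the namespace and quorum must match. The following sketch checks exactly that with the JDK's built-in XML parser. The class HiveSiteCheck and its methods are hypothetical helpers written for this illustration and are not part of Hive; the sample XML is a trimmed-down stand-in for the real files.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: compares the discovery settings of two hive-site.xml files.
class HiveSiteCheck {
    static final String DISCOVERY = "hive.server2.support.dynamic.service.discovery";
    static final String NAMESPACE = "hive.server2.zookeeper.namespace";
    static final String QUORUM = "hive.zookeeper.quorum";

    // Reads every <property><name>/<value> pair into a map.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<String, String>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                props.put(text(p, "name"), text(p, "value"));
            }
            return props;
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse hive-site.xml content", e);
        }
    }

    static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    // Quorum as a set, so the order of hosts does not matter.
    static Set<String> quorum(Map<String, String> props) {
        String q = props.get(QUORUM);
        if (q == null) {
            return new HashSet<String>();
        }
        return new HashSet<String>(Arrays.asList(q.split(",")));
    }

    // True when both configs enable discovery and share namespace and quorum.
    static boolean sameDiscoveryGroup(Map<String, String> a, Map<String, String> b) {
        return "true".equals(a.get(DISCOVERY)) && "true".equals(b.get(DISCOVERY))
                && a.get(NAMESPACE) != null && a.get(NAMESPACE).equals(b.get(NAMESPACE))
                && !quorum(a).isEmpty() && quorum(a).equals(quorum(b));
    }

    static String prop(String name, String value) {
        return "<property><name>" + name + "</name><value>" + value + "</value></property>";
    }

    // Trimmed-down stand-in for one node's hive-site.xml.
    static String sample(String host, String port, String namespace) {
        return "<configuration>"
                + prop("hive.server2.thrift.bind.host", host)
                + prop("hive.server2.thrift.port", port)
                + prop(DISCOVERY, "true")
                + prop(NAMESPACE, namespace)
                + prop(QUORUM, "node04:2181,node02:2181,node03:2181")
                + "</configuration>";
    }

    public static void main(String[] args) {
        Map<String, String> node02 = parse(sample("node02", "10001", "hiveserver2_zk"));
        Map<String, String> node03 = parse(sample("node03", "10000", "hiveserver2_zk"));
        System.out.println("same discovery group: " + sameDiscoveryGroup(node02, node03));
    }
}
```

In practice the XML strings would be read from each node's actual hive-site.xml rather than generated by sample.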
4. Access through beeline or JDBC
1. Connect with beeline
[root@node04 conf]# beeline
Beeline version 2.3.9 by Apache Hive
beeline> !connect jdbc:hive2://node02,node03,node04/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2_zk root
Connecting to jdbc:hive2://node02,node03,node04/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2_zk
Enter password for jdbc:hive2://node02,node03,node04/: ****
21/07/04 09:53:08 [main]: INFO jdbc.HiveConnection: Connected to node03:10000
Connected to: Apache Hive (version 2.3.9)
Driver: Hive JDBC (version 2.3.9)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://node02,node03,node04/> show tables;
+----------------------------+
| tab_name |
+----------------------------+
| apachelog |
| bucket_sample |
| bucket_test_tbl |
| cdr_summ_info |
| hive_index_table |
| student |
| student_as |
| student_contain_struct |
| student_dml |
| student_dml2 |
| student_dynamic_partition |
| student_ex |
| student_for_dp |
| student_like |
| student_static_partion1 |
| student_static_partition2 |
| v_student |
| wordcount |
| wordcount_result |
+----------------------------+
19 rows selected (1.387 seconds)
2. Inspect the Hive HA registration info in ZooKeeper
[root@node04 ~]# zkCli.sh
[zk: localhost:2181(CONNECTED) 3] ls /
[hadoop-ha, hiveserver2_zk, yarn-leader-election, zookeeper]
[zk: localhost:2181(CONNECTED) 4] ls /hiveserver2_zk
[serverUri=node02:10001;version=2.3.9;sequence=0000000000, serverUri=node03:10000;version=2.3.9;sequence=0000000001]
[zk: localhost:2181(CONNECTED) 5]
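Each child znode under /hiveserver2_zk is named as a semicolon-separated list of key=value pairs. As an illustration of how a client could read such a name, here is a small sketch that splits it into fields. The class ZnodeEntry is hypothetical and only does string parsing; it does not talk to ZooKeeper, and the Hive JDBC driver's own lookup logic may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits a HiveServer2 znode name into its key=value fields.
class ZnodeEntry {

    static Map<String, String> parse(String znodeName) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        for (String part : znodeName.split(";")) {
            int eq = part.indexOf('=');
            if (eq > 0) {
                fields.put(part.substring(0, eq), part.substring(eq + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // The two children listed by zkCli.sh in this post.
        String[] children = {
            "serverUri=node02:10001;version=2.3.9;sequence=0000000000",
            "serverUri=node03:10000;version=2.3.9;sequence=0000000001"
        };
        for (String child : children) {
            Map<String, String> fields = parse(child);
            System.out.println(fields.get("serverUri") + " (Hive " + fields.get("version") + ")");
        }
    }
}
```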
3. Connect with JDBC
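import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJdbcClient2 {
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    // The host list is the ZooKeeper quorum, not the HiveServer2 hosts.
    private static String url = "jdbc:hive2://node02,node03,node04/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2_zk";

    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
        String sql = "select * from student";
        // try-with-resources closes the result set, statement and connection.
        try (Connection conn = DriverManager.getConnection(url, "root", "");
             Statement stmt = conn.createStatement();
             ResultSet res = stmt.executeQuery(sql)) {
            while (res.next()) {
                // Print the first column of each row.
                System.out.println(res.getString(1));
            }
        }
    }
}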