Remote mode
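The JDBC URL, driver, username, password, and related settings in hive-site.xml are as follows (this first example uses the embedded Derby driver):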
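<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>daxin</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>root</value>
  <description>password to use against metastore database</description>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://node:9000/user/hive/</value>
  <description>location of default database for the warehouse</description>
</property>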
[2] Local mode
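<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://node:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>daxin</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>root</value>
  <description>password to use against metastore database</description>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://node:9000/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>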
Once the configuration is in place, start the metastore service, then run hive to enter the shell for interactive queries.
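As a rough sketch of that sequence, the commands below start the metastore in the background and then point to the Hive CLI. `hive --service metastore` is the command this post discusses; the log file name `metastore.log` and the `command -v` guard are illustrative assumptions, not part of the original setup.

```shell
# Log file name is an assumption made for illustration.
METASTORE_LOG="metastore.log"

if command -v hive >/dev/null 2>&1; then
  # Start the metastore service in the background and keep its output in a log file.
  nohup hive --service metastore > "$METASTORE_LOG" 2>&1 &
  echo "metastore started with PID $!"
  # Once it is up, open the interactive shell in the foreground by running:
  #   hive
else
  echo "hive not found on PATH; nothing started"
fi
```

Running the metastore under `nohup` keeps it alive after the terminal closes; for a quick experiment, running it in a separate terminal works just as well.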
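<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://node:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>root</value>
</property>
<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://node:9083</value>
</property>
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<property>
  <name>spark.home</name>
  <value>/home/daxin/bigdata/spark</value>
</property>
<property>
  <name>spark.eventLog.enabled</name>
  <value>true</value>
</property>
<property>
  <name>spark.executor.memory</name>
  <value>1g</value>
</property>
<property>
  <name>spark.driver.memory</name>
  <value>1g</value>
</property>
<property>
  <name>hive.server2.transport.mode</name>
  <value>binary</value>
</property>
<property>
  <name>hive.server2.thrift.http.port</name>
  <value>10001</value>
</property>
<property>
  <name>hive.server2.thrift.http.max.worker.threads</name>
  <value>5</value>
</property>
<property>
  <name>hive.server2.thrift.http.min.worker.threads</name>
  <value>2</value>
</property>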
Start hiveserver2.
Finally, start beeline for interactive queries.
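These two steps might look like the sketch below. Several values are assumptions: `node` is the host name used throughout this post, 10000 is HiveServer2's default port for the binary transport mode configured above, `daxin` stands in for whatever user you connect as, and the 30-second wait and log file name are arbitrary.

```shell
# Assumed connection details; adjust to your environment.
HS2_HOST="node"
HS2_PORT="10000"   # HiveServer2 default port in binary transport mode
JDBC_URL="jdbc:hive2://${HS2_HOST}:${HS2_PORT}/default"

if command -v hive >/dev/null 2>&1 && command -v beeline >/dev/null 2>&1; then
  # Start HiveServer2 in the background and log its output.
  nohup hive --service hiveserver2 > hiveserver2.log 2>&1 &
  # HiveServer2 needs some time before it accepts connections; 30 s is a guess.
  sleep 30
  # Connect with beeline and run one statement as a smoke test.
  beeline -u "$JDBC_URL" -n daxin -e 'SHOW DATABASES;'
else
  echo "hive/beeline not found on PATH; would connect to $JDBC_URL"
fi
```

Dropping the `-e` option leaves beeline in its interactive prompt, which is what the step above describes.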
Addendum:
We know that hive --service metastore starts the metastore service; that much is easy to understand.
But what is hiveserver2?
The official explanation:
HiveServer2 (HS2) is a server interface that enables remote clients to execute queries against Hive and retrieve the results (a more detailed intro here). The current implementation, based on Thrift RPC, is an improved version of HiveServer and supports multi-client concurrency and authentication. It is designed to provide better support for open API clients like JDBC and ODBC.
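In short, HiveServer2 is a service interface through which remote clients run queries and retrieve results. It is currently implemented on Thrift RPC, improves on the original HiveServer, and supports concurrent clients and authentication. It exists to better serve open API clients such as JDBC and ODBC (in other words, a remote Hive client that runs queries has to connect to HiveServer2).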
HiveServer2 (HS2) is a service that enables clients to execute queries against Hive. HiveServer2 is the successor to HiveServer1 which has been deprecated. HS2 supports multi-client concurrency and authentication. It is designed to provide better support for open API clients like JDBC and ODBC.
Put differently, HiveServer2 is the service through which clients run queries against Hive. It is the successor to HiveServer1, which is now deprecated.
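Summary of the differences between the three deployment modes (reposted from http://www.jianshu.com/p/6108e0aed204):
Embedded mode stores the metadata in the embedded Derby database and needs no separate metastore service. It is the default and is simple to configure, but only one client can connect at a time, so it suits experiments and not production.
Local metastore and remote metastore both store the metadata in an external database. The supported databases are MySQL, PostgreSQL, Oracle, and MS SQL Server; here we use MySQL.
The difference between the two is this: a local metastore does not need a separately started metastore service, because the metastore runs in the same process as Hive. A remote metastore requires starting the metastore service on its own, and every client then points to that service in its configuration file. In remote mode the metastore service and Hive run in different processes.
Finally, the default address of the Hive web UI is http://node:10002/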
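A quick way to check that the web UI is answering, assuming `curl` is installed and the host name `node` resolves from your machine (the 5-second timeout is an arbitrary choice):

```shell
# URL taken from the post; "node" must resolve from where you run this.
WEBUI_URL="http://node:10002/"

if command -v curl >/dev/null 2>&1; then
  # Print only the HTTP status code; curl reports 000 when it cannot connect.
  status=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$WEBUI_URL" || true)
  echo "HTTP status from $WEBUI_URL: $status"
else
  echo "curl not found on PATH"
fi
```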