Connecting to Hive remotely from Eclipse


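Base environment:
namenode 192.168.1.187  kafka3
datanode 192.168.1.188  kafka4
datanode 192.168.1.189  kafka1

This cluster was installed service by service from hadoop-*.tar.gz packages that I downloaded myself, so every configuration file has to be edited by hand, which is more work than with Cloudera Manager.

hadoop 2.6.2
hive 2.0.1   -- installed only on 187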

1. Start Hadoop
./start-all.sh

2. Configure Hive
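[root@kafka3 conf]# cat hive-site.xml
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/hadoop/hive/log</value>
    <description>Location of Hive run time structured log file</description>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>http://192.168.1.187:9001</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>192.168.1.187</value>
  </property>
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.hwi.listen.port</name>
    <value>9999</value>
    <description>This is the port the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>datanucleus.autoCreateSchema</name>
    <value>false</value>
  </property>
  <property>
    <name>datanucleus.fixedDatastore</name>
    <value>true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.1.189:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>kafka1,kafka4,kafka3</value>
  </property>
</configuration>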



3. Start the HiveServer2 service
[root@kafka3 bin]# ./hiveserver2
Command-line mode:
hive --service hiveserver2

Service mode:
./hiveserver2

4. Test the connection:
There is no need to write a JDBC program for this; just run bin/beeline

[root@kafka3 bin]# ./beeline
ls: cannot access /opt/apache-hive-2.0.1-bin//lib/hive-jdbc-*-standalone.jar: No such file or directory
Beeline version 2.0.1 by Apache Hive
beeline> !connect jdbc:hive://192.168.1.187:10000 root root         
scan complete in 1ms
scan complete in 7577ms
No known driver to handle "jdbc:hive://192.168.1.187:10000"  --- use hive2 in the URL, not hive
beeline>

I found the missing jar.
Note: all the jars under Hive's lib directory must also be added to the Eclipse project.

[root@kafka3 bin]# cp /opt/apache-hive-2.0.1-bin/jdbc/hive-jdbc-2.0.1-standalone.jar  /opt/apache-hive-2.0.1-bin/lib/

beeline> !connect jdbc:hive2://192.168.1.187:10000
Connecting to jdbc:hive2://192.168.1.187:10000
Enter username for jdbc:hive2://192.168.1.187:10000: root
Enter password for jdbc:hive2://192.168.1.187:10000: root

beeline>  !connect jdbc:hive2://192.168.1.187:10000
Connecting to jdbc:hive2://192.168.1.187:10000
Enter username for jdbc:hive2://192.168.1.187:10000: root
Enter password for jdbc:hive2://192.168.1.187:10000:                                                   
Enter password for jdbc:hive2://192.168.1.187:10000: Error: Failed to open new session: java.lang.RuntimeException:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
User: root is not allowed to impersonate root (state=,code=0)        

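After restarting Hadoop it still failed, but the error message changed.
Add the following to Hadoop's core-site.xml:
<property>
        <!-- I had this wrong at first and did not notice for a long time! The user is not hadoop; I am running as root. -->
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <value>*</value>
</property>

<property>
        <name>hadoop.proxyuser.hadoop.groups</name>
        <value>root</value>
</property>


The correct version:
Add the following to Hadoop's core-site.xml:
<property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
</property>

<property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>root</value>
</property>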
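
The user name is part of the property key itself, which is why the hadoop variant above had no effect on a HiveServer2 started as root. A minimal sketch of how the two keys are formed; the class and method names here are made up for illustration and are not part of Hadoop:

```java
// Illustrative helper (hypothetical names): builds the core-site.xml keys
// that allow a given superuser to impersonate other users.
final class ProxyUserKeys {
    static String hostsKey(String superUser) {
        return "hadoop.proxyuser." + superUser + ".hosts";
    }

    static String groupsKey(String superUser) {
        return "hadoop.proxyuser." + superUser + ".groups";
    }

    public static void main(String[] args) {
        // HiveServer2 runs as root in this setup, so the keys must name root.
        System.out.println(hostsKey("root"));
        System.out.println(groupsKey("root"));
    }
}
```

Passing "hadoop" instead of "root" yields exactly the keys from the first, ineffective attempt.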
   


beeline>  !connect jdbc:hive2://192.168.1.187:10000
Connecting to jdbc:hive2://192.168.1.187:10000
Enter username for jdbc:hive2://192.168.1.187:10000: root
Enter password for jdbc:hive2://192.168.1.187:10000:                                                   
16/06/02 11:22:00 [main]: INFO jdbc.HiveConnection: Transport Used for JDBC connection: null
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.1.187:10000: java.net.ConnectException: Connection refused (state=08S01,code=0)
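
The ConnectException above means nothing accepted the TCP connection on port 10000, so it can save time to probe the port before touching any JDBC settings. A rough sketch using only the JDK; the class name PortProbe and the 2-second timeout are arbitrary choices for this example:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Illustrative helper (hypothetical name): checks whether anything is
// listening on host:port, independent of Hive or JDBC.
final class PortProbe {
    /** Returns true if a TCP connection succeeds within timeoutMs. */
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, unreachable, or timed out: treat all as "not listening".
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port are the HiveServer2 address used in this post.
        System.out.println("HiveServer2 port open: "
                + isListening("192.168.1.187", 10000, 2000));
    }
}
```

If this reports false, HiveServer2 is probably not running or not bound to that address, and beeline cannot succeed regardless of the driver.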


Copy over all the jars under $HIVE_HOME/lib whose names start with hive
[root@kafka3 bin]# ./beeline
Beeline version 2.0.1 by Apache Hive   -- the error is gone
beeline>

Enable Hive logging
cd /opt/apache-hive-2.0.1-bin/conf
cp hive-log4j2.properties.template hive-log4j2.properties
vi hive-log4j2.properties
property.hive.log.dir = /hadoop/hive/log
property.hive.log.file = hive.log
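
To double-check where the log will end up, the two settings above can be read back and joined. This is only a sketch: it treats the file as a plain java.util.Properties file and does not expand log4j2 placeholders such as ${sys:...}, so it only fits literal values like the ones set here. The class name HiveLogLocation is made up for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Illustrative helper (hypothetical name): derives the Hive log file path
// from the two properties edited in hive-log4j2.properties.
final class HiveLogLocation {
    static String logPath(Reader log4j2Properties) throws IOException {
        Properties props = new Properties();
        props.load(log4j2Properties);
        String dir = props.getProperty("property.hive.log.dir");
        String file = props.getProperty("property.hive.log.file");
        if (dir == null || file == null) {
            throw new IllegalArgumentException("hive log dir or file is not set");
        }
        return dir + "/" + file;
    }

    public static void main(String[] args) throws IOException {
        // Same two lines as in the edited configuration file.
        String sample = "property.hive.log.dir = /hadoop/hive/log\n"
                      + "property.hive.log.file = hive.log\n";
        System.out.println(logPath(new StringReader(sample)));
    }
}
```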

[root@kafka3 log]# more hive.log

2016-06-03T10:20:16,883 INFO  [main]: service.AbstractService (AbstractService.java:init(89)) - Service:OperationManager is inited.
2016-06-03T10:20:16,884 INFO  [main]: service.AbstractService (AbstractService.java:init(89)) - Service:SessionManager is inited.
2016-06-03T10:20:16,884 INFO  [main]: service.AbstractService (AbstractService.java:init(89)) - Service:CLIService is inited.
2016-06-03T10:20:16,884 INFO  [main]: service.AbstractService (AbstractService.java:init(89)) - Service:ThriftBinaryCLIService is inited.
2016-06-03T10:20:16,884 INFO  [main]: service.AbstractService (AbstractService.java:init(89)) - Service:HiveServer2 is inited.
2016-06-03T10:20:17,022 INFO  [main]: service.AbstractService (AbstractService.java:start(104)) - Service:OperationManager is started.
2016-06-03T10:20:17,022 INFO  [main]: service.AbstractService (AbstractService.java:start(104)) - Service:SessionManager is started.
2016-06-03T10:20:17,023 INFO  [main]: service.AbstractService (AbstractService.java:start(104)) - Service:CLIService is started.
2016-06-03T10:20:17,023 INFO  [main]: service.AbstractService (AbstractService.java:start(104)) - Service:ThriftBinaryCLIService is started.
2016-06-03T10:20:17,023 INFO  [main]: service.AbstractService (AbstractService.java:start(104)) - Service:HiveServer2 is started.
2016-06-03T10:20:17,038 INFO  [main]: server.Server (Server.java:doStart(252)) - jetty-7.6.0.v20120127
2016-06-03T10:20:17,064 INFO  [main]: webapp.WebInfConfiguration (WebInfConfiguration.java:unpack(455)) - Extract jar:file:/opt/apache-hive-2.0.1-bin/lib/hive-jdbc-2.0.1-standalone.jar!/hive-webapps/hiveserver2/ to /tmp/jetty-0.0.0.0-10002-hiveserver2-_-any-/webapp
2016-06-03T10:20:17,582 INFO  [Thread-10]: thrift.ThriftCLIService (ThriftBinaryCLIService.java:run(100)) - Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads
2016-06-03T10:20:17,781 INFO  [main]: handler.ContextHandler (ContextHandler.java:startContext(737)) - started o.e.j.w.WebAppContext{/,file:/tmp/jetty-0.0.0.0-10002-hiveserver2-_-any-/webapp/},jar:file:/opt/apache-hive-2.0.1-bin/lib/hive-jdbc-2.0.1-standalone.jar!/hive-webapps/hiveserver2
2016-06-03T10:20:17,827 INFO  [main]: handler.ContextHandler (ContextHandler.java:startContext(737)) - started o.e.j.s.ServletContextHandler{/static,jar:file:/opt/apache-hive-2.0.1-bin/lib/hive-jdbc-2.0.1-standalone.jar!/hive-webapps/static}
2016-06-03T10:20:17,827 INFO  [main]: handler.ContextHandler (ContextHandler.java:startContext(737)) - started o.e.j.s.ServletContextHandler{/logs,file:/hadoop/hive/log/}
2016-06-03T10:20:17,841 INFO  [main]: server.AbstractConnector (AbstractConnector.java:doStart(333)) - Started [email protected]:10002
2016-06-03T10:20:17,841 INFO  [main]: server.HiveServer2 (HiveServer2.java:start(438)) - Web UI has started on port 10002


The web page opens and shows HiveServer2:
http://192.168.1.187:10002/hiveserver2.jsp


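1. The log shows that HiveServer2 started normally, yet the same error kept appearing: User: root is not allowed to impersonate root

Hadoop's core-site.xml is set as follows:
<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/hadoop/tmp</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.1.187:9000</value>
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/hadoop/name</value>
</property>
<property>
  <name>hadoop.proxyuser.root.hosts</name>
  <value>192.168.1.187</value>
</property>
<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>root</value>
</property>
<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value>
  <description>The number of seconds between two periodic checkpoints.</description>
</property>
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value>
</property>
<property>
  <name>fs.checkpoint.dir</name>
  <value>/hadoop/namesecondary</value>
</property>
</configuration>

It took me a long time to realize that the changes I had made to core-site.xml on 187 had never been copied to the other two nodes.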
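
A mismatch like this can be caught by comparing the proxy-user settings in each node's copy of core-site.xml. The sketch below assumes the files have already been fetched to one machine (for example with scp) and uses only the JDK's XML parser; the class and method names are hypothetical, and the inline XML strings stand in for the real files.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative helper (hypothetical names): reads Hadoop-style <property>
// entries and compares the proxy-user keys of two core-site.xml copies.
final class CoreSiteCheck {
    static Map<String, String> readProperties(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        NodeList props = doc.getElementsByTagName("property");
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = text(p, "name");
            if (name != null) {
                result.put(name, text(p, "value"));
            }
        }
        return result;
    }

    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    /** True only if the reference defines both keys and the other copy has the same values. */
    static boolean sameProxyUserSettings(Map<String, String> reference,
                                         Map<String, String> other, String user) {
        String hosts = "hadoop.proxyuser." + user + ".hosts";
        String groups = "hadoop.proxyuser." + user + ".groups";
        return reference.containsKey(hosts) && reference.containsKey(groups)
                && Objects.equals(reference.get(hosts), other.get(hosts))
                && Objects.equals(reference.get(groups), other.get(groups));
    }

    public static void main(String[] args) throws Exception {
        // Stand-ins for the copy on 187 (edited) and a copy from another node (stale).
        String edited = "<configuration>"
                + "<property><name>hadoop.proxyuser.root.hosts</name><value>192.168.1.187</value></property>"
                + "<property><name>hadoop.proxyuser.root.groups</name><value>root</value></property>"
                + "</configuration>";
        String stale = "<configuration>"
                + "<property><name>hadoop.tmp.dir</name><value>/hadoop/tmp</value></property>"
                + "</configuration>";
        Map<String, String> a = readProperties(new ByteArrayInputStream(edited.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> b = readProperties(new ByteArrayInputStream(stale.getBytes(StandardCharsets.UTF_8)));
        System.out.println("Nodes agree on root proxy settings: " + sameProxyUserSettings(a, b, "root"));
    }
}
```

With real files, replace the ByteArrayInputStream with a FileInputStream for each node's copy.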


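2. Enable impersonation so that HiveServer2 executes statements as the user who submitted them. If this is set to false, statements run as the admin user that started the HiveServer2 daemon.
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>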
 

3. The JDBC approach
The driver class name for HiveServer1 is org.apache.hadoop.hive.jdbc.HiveDriver, while for HiveServer2 it is org.apache.hive.jdbc.HiveDriver. The two are easy to mix up.
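
Since the URL prefix and the driver class have to match, one way to avoid mixing them up is to derive the class name from the URL. A small sketch; HiveDriverChooser is a made-up name, and the two class names and prefixes are the ones mentioned in this post:

```java
// Illustrative helper (hypothetical name): picks the driver class that
// matches the scheme of a Hive JDBC URL.
final class HiveDriverChooser {
    static String driverClassFor(String jdbcUrl) {
        if (jdbcUrl.startsWith("jdbc:hive2://")) {
            return "org.apache.hive.jdbc.HiveDriver";        // HiveServer2
        }
        if (jdbcUrl.startsWith("jdbc:hive://")) {
            return "org.apache.hadoop.hive.jdbc.HiveDriver"; // HiveServer1
        }
        throw new IllegalArgumentException("Not a Hive JDBC URL: " + jdbcUrl);
    }

    public static void main(String[] args) {
        System.out.println(driverClassFor("jdbc:hive2://192.168.1.187:10000"));
    }
}
```

The returned name can then be passed to Class.forName before calling DriverManager.getConnection.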

[root@kafka3 bin]# hiveserver2   -- finally works!!
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.1-bin/lib/hive-jdbc-2.0.1-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.6.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
OK


[root@kafka3 hadoop]# cd /opt/apache-hive-2.0.1-bin/bin
[root@kafka3 bin]# ./beeline
Beeline version 2.0.1 by Apache Hive
beeline>  !connect jdbc:hive2://192.168.1.187:10000
Connecting to jdbc:hive2://192.168.1.187:10000
Enter username for jdbc:hive2://192.168.1.187:10000: root
Enter password for jdbc:hive2://192.168.1.187:10000:                                                   
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.1-bin/lib/hive-jdbc-2.0.1-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.6.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connected to: Apache Hive (version 2.0.1)
Driver: Hive JDBC (version 2.0.1)
16/06/03 15:44:19 [main]: WARN jdbc.HiveConnection: Request to set autoCommit to false; Hive does not support autoCommit=false.
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://192.168.1.187:10000>
0: jdbc:hive2://192.168.1.187:10000> show tables;
INFO  : Compiling command(queryId=root_20160603154642_dd611020-8d3f-4abe-9bd5-7f2fda519007): show tables
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
INFO  : Completed compiling command(queryId=root_20160603154642_dd611020-8d3f-4abe-9bd5-7f2fda519007); Time taken: 0.291 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing command(queryId=root_20160603154642_dd611020-8d3f-4abe-9bd5-7f2fda519007): show tables
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing command(queryId=root_20160603154642_dd611020-8d3f-4abe-9bd5-7f2fda519007); Time taken: 0.199 seconds
INFO  : OK
+---------------------------+--+
|         tab_name          |
+---------------------------+--+
| c2                        |
| hbase_runningrecord_temp  |
| rc_file                   |
| rc_file1                  |
| runningrecord_old         |
| sequence_file             |
| studentinfo               |
| t2                        |
| test_table                |
| test_table1               |
| tina                      |
+---------------------------+--+
11 rows selected (1.194 seconds)
0: jdbc:hive2://192.168.1.187:10000>



Create project: hivecon
New package: hivecon
New class: testhive
package hivecon;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class testhive {
    public static void main(String[] args) throws Exception {
        // Load the HiveServer2 JDBC driver
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection("jdbc:hive2://192.168.1.187:10000", "root", "");
        System.out.println("Connection: " + conn);
        Statement stmt = conn.createStatement();
        String query_sql = "select systemno from runningrecord_old limit 1";
        ResultSet rs = stmt.executeQuery(query_sql);
        // rs.next() is true only if the query returned at least one row
        System.out.println("Has data: " + rs.next());
        conn.close();
    }
}
It runs directly:
ERROR StatusLogger Unrecognized format specifier [msg]
ERROR StatusLogger Unrecognized conversion specifier [msg] starting at position 54 in conversion pattern.
ERROR StatusLogger Unrecognized format specifier [n]
ERROR StatusLogger Unrecognized conversion specifier [n] starting at position 56 in conversion pattern. -- ignore these logging errors for now
Connection: org.apache.hive.jdbc.HiveConnection@64485a47
Has data: false


--- Adding a few more operations:
package hivecon;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class testhive {
    private static String sql = "";
    private static ResultSet res;

    public static void main(String[] args) throws Exception {
        // Load the HiveServer2 JDBC driver
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection("jdbc:hive2://192.168.1.187:10000", "root", "");
        System.out.println("Connection: " + conn);
        Statement stmt = conn.createStatement();
        String query_sql = "select systemno from runningrecord_old limit 1";
        ResultSet rs = stmt.executeQuery(query_sql);
        System.out.println("Has data: " + rs.next());

        // Name of the table to create
        String tableName = "tinatest";

        /** Step 1: drop the table first if it already exists **/
        sql = "drop table if exists " + tableName;
        stmt.execute(sql);

        /** Step 2: create the table **/
        sql = "create table " + tableName + " (key int, value string)  row format delimited fields terminated by ','";
        stmt.execute(sql);

        // Run "show tables"
        sql = "show tables '" + tableName + "'";
        System.out.println("Running:" + sql);
        res = stmt.executeQuery(sql);
        System.out.println("Result of 'show tables':");
        if (res.next()) {
            System.out.println(res.getString(1));
        }

        // Run "describe table"
        sql = "describe " + tableName;
        System.out.println("Running:" + sql);
        res = stmt.executeQuery(sql);
        System.out.println("Result of 'describe table':");
        while (res.next()) {
            System.out.println(res.getString(1) + "\t" + res.getString(2));
        }

        // Run "load data into table"; with "local", the path is read on the HiveServer2 host
        String filepath = "/tmp/test2.txt";
        sql = "load data local inpath '" + filepath + "' into table " + tableName;
        System.out.println("Running:" + sql);
        stmt.executeUpdate(sql);

        // Run "select * query"
        sql = "select * from " + tableName;
        System.out.println("Running:" + sql);
        res = stmt.executeQuery(sql);
        System.out.println("Result of 'select * query':");
        while (res.next()) {
            System.out.println(res.getInt(1) + "\t" + res.getString(2));
        }

        conn.close();
        conn = null;
    }
}

-- Output:
Connection: org.apache.hive.jdbc.HiveConnection@64485a47
Has data: true
Running:show tables 'tinatest'
Result of 'show tables':
tinatest
Running:describe tinatest
Result of 'describe table':
key int
value string
Running:load data local inpath '/tmp/test2.txt' into table tinatest
Running:select * from tinatest
Result of 'select * query':
1 a
2 b
3 tina

Verify from the Hive CLI:
hive> show tables;
OK
c2
hbase_runningrecord_temp
rc_file
rc_file1
runningrecord_old
sequence_file
studentinfo
t2
test_table
test_table1
tina
tinatest
Time taken: 0.065 seconds, Fetched: 12 row(s)
hive> select * from tinatest;
OK
1 a
2 b
3 tina
Time taken: 3.065 seconds, Fetched: 3 row(s)
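
The post does not show the contents of /tmp/test2.txt. Judging from the table definition (fields terminated by ',') and the three rows returned, it presumably held comma-separated key,value lines. A sketch that writes such a file, under that assumption; the class name WriteTestData is made up, and the file has to exist on the HiveServer2 host because the load statement uses "local":

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Illustrative helper (hypothetical name): writes sample rows in the
// comma-delimited layout that the tinatest table expects.
final class WriteTestData {
    static Path write(Path target) throws IOException {
        // Assumed contents, inferred from the query results shown above.
        List<String> rows = Arrays.asList("1,a", "2,b", "3,tina");
        return Files.write(target, rows, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path target = Paths.get(args.length > 0 ? args[0] : "/tmp/test2.txt");
        System.out.println("Wrote " + write(target));
    }
}
```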
