Using the HBase API to access and operate a remote HBase database from a local machine

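Our lab project recently required me to start looking into HBase. I wanted to write a large amount of data to the cluster in one go and store it in HBase. In the hbase shell each put writes a single value, which is not practical for bulk inserts, so I turned to the HBase Java API in order to fill HBase with enough data to test query performance. That is where this story starts.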
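To make the bulk-insert idea concrete, here is a minimal sketch of splitting a large set of rows into fixed-size batches, using only the Java standard library. The class name BatchChunks, the method partition, and the batch size of 1000 are my own illustrative choices, not something from the project. In a real loader, each batch would presumably be turned into a list of Put objects and written with a single Table.put(List<Put>) call instead of one put per row.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits a large list of rows into fixed-size batches. */
final class BatchChunks {
    private BatchChunks() {}

    static <T> List<List<T>> partition(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(rows.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> rowKeys = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            rowKeys.add("row-" + i);
        }
        for (List<String> batch : partition(rowKeys, 1000)) {
            // Assumption: a real loader would build one Put per row key here
            // and send the whole batch to HBase in one call.
            System.out.println("would write " + batch.size() + " rows");
        }
    }
}
```

Keeping the batch size as a parameter makes it easy to tune when measuring write throughput.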

First I wrote the following Java code in Eclipse:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class testCreate {
    public static void main(String[] args){
        Configuration conf = HBaseConfiguration.create();
        // In my Eclipse setup I had to add this line, otherwise the cluster could not be located
        conf.set("hbase.rootdir", "hdfs://xxx-a.test.com:8020/apps/hbase/data");
        conf.set("hbase.zookeeper.quorum", "xxx-a1.test.com,xxx-a2.test.com,xxx-a3.test.com");
        conf.set("hbase.zookeeper.property.clientPort", "2181");
        // This line was also added later
        conf.set("zookeeper.znode.parent", "/hbase-unsecure");
        // try-with-resources closes the Admin and the Connection when done
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin hAdmin = conn.getAdmin()) {
            System.out.println("StartConnect...");
            // Describe table "Customer2" with three column families
            HTableDescriptor hTableDesc = new HTableDescriptor(TableName.valueOf("Customer2"));
            hTableDesc.addFamily(new HColumnDescriptor("name"));
            hTableDesc.addFamily(new HColumnDescriptor("contactinfo"));
            hTableDesc.addFamily(new HColumnDescriptor("address"));
            hAdmin.createTable(hTableDesc);
            System.out.println("Table created Successfully...");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
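With the code written, Eclipse showed a pile of errors because the dependency jars were missing. The screenshot below shows the jars I added.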

[Image: screenshot of the jars added to the project]

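I added these jars bit by bit as the program reported errors at run time, and I am sharing the list so you can avoid the same detours. (Debugging has its own charm. There are moments when you want to give up, but if you keep looking, the problem does get solved.)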
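Instead of discovering missing jars one error at a time, a small check can list which key classes are not yet on the classpath. The sketch below uses only the standard library. The class name ClasspathCheck is hypothetical, and the list of class names is taken from the imports in the code above plus the ZooKeeper client class that shows up in HBase stack traces; it is an illustrative starting point, not a complete dependency list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: reports which required classes cannot be loaded. */
final class ClasspathCheck {
    private ClasspathCheck() {}

    /** Returns true if the class can be located; the class is not initialized. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the subset of class names that could not be loaded. */
    static List<String> missing(List<String> classNames) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            if (!isOnClasspath(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: these classes typically come from the hadoop-common,
        // hbase-common, hbase-client and zookeeper jars respectively.
        List<String> required = Arrays.asList(
                "org.apache.hadoop.conf.Configuration",
                "org.apache.hadoop.hbase.HBaseConfiguration",
                "org.apache.hadoop.hbase.client.ConnectionFactory",
                "org.apache.zookeeper.ZooKeeper");
        for (String name : missing(required)) {
            System.out.println("missing: " + name);
        }
    }
}
```

Run it inside the same Eclipse project so that it sees the same build path as the HBase program.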

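You also need to configure the hosts file on your machine. On Windows it is the file named hosts in C:\Windows\System32\drivers\etc\. Add a line that maps xxx-a.test.com to its actual IP address so that the program reaches the right machine. The IP address comes first and the host name second, for example X.X.X.X      xxx-a.test.com. Do the same for every other host name the client has to resolve, such as the ZooKeeper quorum hosts. With that in place the program runs, and the Java run log includes the two messages printed by the code, StartConnect... and Table created Successfully...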


This shows that the connection to HBase succeeded and the table was created. Everyone is happy.
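If the run does not get this far, it can help to check the hosts entries and the ZooKeeper port before involving HBase at all. The following is a minimal sketch with the standard library only: it tests whether each host name resolves and whether a TCP connection to the client port can be opened. The class name HostPreflight and the two-second timeout are my own assumptions; the host names and port 2181 are the placeholders used in this post.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

/** Hypothetical helper: checks name resolution and TCP reachability. */
final class HostPreflight {
    private HostPreflight() {}

    /** Returns true if the host name resolves (via the hosts file or DNS). */
    static boolean resolves(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    /** Returns true if a TCP connection can be opened within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder host names from this post; replace with the real ones.
        String[] quorum = {"xxx-a1.test.com", "xxx-a2.test.com", "xxx-a3.test.com"};
        for (String host : quorum) {
            System.out.println(host
                    + " resolves=" + resolves(host)
                    + " port2181=" + canConnect(host, 2181, 2000));
        }
    }
}
```

A host that does not resolve points at the hosts file, while a host that resolves but cannot be connected to points at the network or the ZooKeeper service.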

Here are a few points to watch out for:

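1. conf.set("hbase.rootdir", "hdfs://xxx-a.test.com:8020/apps/hbase/data"); // needed when running from Eclipse, otherwise the cluster cannot be located
In my setup this line had to be there. It gives the HDFS location where HBase actually stores its data, so xxx-a.test.com can also be replaced with the actual IP address. (hbase.rootdir is mainly a server-side setting, and a client normally finds the cluster through ZooKeeper, so other setups may not need it.)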
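2. conf.set("hbase.zookeeper.quorum", "xxx-a1.test.com,xxx-a2.test.com,xxx-a3.test.com");
The comma-separated list of hosts in the ZooKeeper ensemble, for example "xxx-a1.test.com,xxx-a2.test.com,xxx-a3.test.com". The default is localhost, which is meant for pseudo-distributed mode, so it must be changed for a fully distributed cluster. If HBASE_MANAGES_ZK is set in hbase-env.sh, these ZooKeeper nodes are started together with HBase.
3. conf.set("hbase.zookeeper.property.clientPort", "2181");
This sets the port on which ZooKeeper accepts client connections. 2181 is the default.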
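4. conf.set("zookeeper.znode.parent","/hbase-unsecure"); // this line was also added later
The root znode for HBase in ZooKeeper. All of HBase's ZooKeeper paths that are configured as relative paths are resolved under this node, and by default all of them are relative, so everything ends up under it. The default value is /hbase, while my cluster uses /hbase-unsecure, so the client has to be told explicitly.
5. Before I added the line above, I kept getting this error: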
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
Wed Aug 10 21:29:19 CST 2016, RpcRetryingCaller{globalStartTime=1470835740311, pause=100, retries=35}, org.apache.hadoop.hbase.MasterNotRunningException: org.apache.hadoop.hbase.MasterNotRunningException: Can't get connection to ZooKeeper: KeeperErrorCode = ConnectionLoss for /hbase
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:157)
	at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4212)
	at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsyncV2(HBaseAdmin.java:748)
	at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:669)
	at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:602)
	at hbase.testCreate.main(testCreate.java:24)
Caused by: org.apache.hadoop.hbase.MasterNotRunningException: org.apache.hadoop.hbase.MasterNotRunningException: Can't get connection to ZooKeeper: KeeperErrorCode = ConnectionLoss for /hbase
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1535)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1555)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1706)
	at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
	... 5 more
Caused by: org.apache.hadoop.hbase.MasterNotRunningException: Can't get connection to ZooKeeper: KeeperErrorCode = ConnectionLoss for /hbase
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(ConnectionManager.java:907)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.access$400(ConnectionManager.java:546)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1485)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1526)
	... 9 more
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:221)
	at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(ConnectionManager.java:896)
	... 12 more
After adding conf.set("zookeeper.znode.parent","/hbase-unsecure"); the program could access and operate on the HBase database successfully.
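
How the ZooKeeper settings in notes 2 to 4 fit together can be sketched with a little string handling: the quorum and the client port give the addresses the client connects to, and the znode parent gives the path it looks under. This is only an illustration with the standard library, not HBase's own code. The class name ZkSettings and both method names are hypothetical, and the child name master is used only as an example of a node under the parent.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: shows how the ZooKeeper-related settings combine. */
final class ZkSettings {
    private ZkSettings() {}

    /** Builds "host1:port,host2:port" from a comma-separated quorum list. */
    static String connectString(String quorum, int clientPort) {
        List<String> parts = new ArrayList<>();
        for (String host : quorum.split(",")) {
            String trimmed = host.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed + ":" + clientPort);
            }
        }
        return String.join(",", parts);
    }

    /** Joins the znode parent and a child name into an absolute znode path. */
    static String znodePath(String parent, String child) {
        String base = parent.endsWith("/")
                ? parent.substring(0, parent.length() - 1)
                : parent;
        return base + "/" + child;
    }

    public static void main(String[] args) {
        String quorum = "xxx-a1.test.com,xxx-a2.test.com,xxx-a3.test.com";
        System.out.println("connects to: " + connectString(quorum, 2181));
        // With the default parent the client looks under /hbase, which is the
        // path named in the error above; this cluster uses /hbase-unsecure.
        System.out.println("default:     " + znodePath("/hbase", "master"));
        System.out.println("this setup:  " + znodePath("/hbase-unsecure", "master"));
    }
}
```

If the parent configured on the client does not match the one the cluster registers under, the client looks in the wrong place even though the quorum addresses are correct.
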
These are some lessons from my own learning process, recorded here for reference. Everyone is welcome to learn and discuss HBase together!

