I. Starting HBase
As the previous post showed, HBase is built on top of Hadoop HDFS, so make sure Hadoop is running before you start HBase. Hadoop can be started with start-all.sh, though on Hadoop 2.x the recommended approach is to run start-dfs.sh and start-yarn.sh instead. Once the Hadoop cluster is up, start HBase with start-hbase.sh.
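Putting the pieces together, the full startup sequence on Hadoop 2.x looks like this:
start-dfs.sh    # HDFS: NameNode and DataNodes
start-yarn.sh   # YARN: ResourceManager and NodeManagers
start-hbase.sh  # HBase: HMaster and HRegionServers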
I have configured fully distributed mode here: one machine acts as the NameNode and HMaster, and the other two act as DataNodes and HRegionServers. After Hadoop and HBase have started, jps on each node shows the following:
Machine 1:
Machine 2:
Machine 3:
You can check the status of the HBase cluster by visiting http://192.168.2.120:60010/master-status.
Problems you may run into
Sometimes, after starting Hadoop and HBase, the Hadoop cluster's web page loads fine but the HBase web page does not. Running jps again then shows that the HMaster process has exited on its own. Check the HBase logs to pinpoint the cause; in my case, the physical clocks of the three virtual machines had drifted apart. Synchronizing them by running ntpdate time.nist.gov on all three machines, then running stop-hbase.sh followed by start-hbase.sh, fixed the problem.
II. HBase Shell Operations
1. Entering the HBase shell
Enter it by running the command hbase shell:
Note first that the backspace key on its own does not work in the HBase shell; to erase a typo, hold down Ctrl while pressing backspace.
2. Listing all tables
Run the list command:
You can see a table named "users" already exists here, which I created earlier; on a fresh HBase installation, list will of course come back empty.
3. Deleting a table
Deleting a table in HBase takes two steps: disable it first, then drop it:
disable 'users'
drop 'users'
4. Creating a table
The following creates a table users with three column families: user_id, address, and info:
create 'users','user_id','address','info'
5. Describing a table
Run describe 'users'.
6. CRUD operations
Adding a record: put
put 'users','xiaoming','info:age','24'
This command writes the value 24 to table users at row xiaoming, column info:age. Run the following statements in the same way:
put 'users','xiaoming','info:birthday','1987-06-17'
put 'users','xiaoming','info:company','alibaba'
put 'users','xiaoming','address:contry','china'
put 'users','xiaoming','address:province','zhejiang'
put 'users','xiaoming','address:city','hangzhou'
Scanning every row of the users table: scan
Run scan 'users'.
Fetching a single record: get
① Get all data for a row key:
get 'users','xiaoming'
② Get all data in one column family of a row:
get 'users','xiaoming','info'
③ Get the data of a single column within a column family:
get 'users','xiaoming','info:age'
Updating a record: put
Update xiaoming's age in the users table to 29:
put 'users','xiaoming','info:age','29'
Deleting data: delete and deleteall
① Delete xiaoming's info:age cell:
delete 'users','xiaoming','info:age'
② Delete xiaoming's entire row:
deleteall 'users','xiaoming'
Other useful commands
count: count the rows in a table
count 'users'
At this point the users table holds only the single row xiaoming.
truncate: empty a table
truncate 'users'
This operation actually deletes the table first and then creates an identical empty one.
III. The HBase Java API in Detail
1. How the Java API maps to the HBase data model
Roughly speaking, HBaseConfiguration and HBaseAdmin operate at the database level, HTableDescriptor and HTable at the table level, HColumnDescriptor at the column-family level, and Put, Get, Scan, and Result at the row level; the sections below go through each class.
2. HBaseConfiguration
This class holds the client-side configuration for HBase.
Example:
Configuration conf = HBaseConfiguration.create();
This method creates a Configuration from HBase's default resources; it loads hbase-site.xml from the classpath to initialize the Configuration.
3. HBaseAdmin
This class provides the interface for administering HBase tables: creating, deleting, enabling, and disabling them, among other operations.
Example:
HBaseAdmin admin = new HBaseAdmin(conf);
admin.disableTable(tableName);
admin.deleteTable(tableName);
4. HTableDescriptor
This class holds a table's name together with its column families.
Example:
HTableDescriptor tableDesc = new HTableDescriptor(tableName);
for (int i = 0; i < familys.length; i++) {
tableDesc.addFamily(new HColumnDescriptor(familys[i]));
}
5. HColumnDescriptor
This class maintains the settings of a column family, such as the number of versions kept and the compression options; it is typically used when creating a table or adding a column family to one.
Example:
HTableDescriptor tableDesc = new HTableDescriptor(tableName);
for (int i = 0; i < familys.length; i++) {
tableDesc.addFamily(new HColumnDescriptor(familys[i]));
}
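The example above accepts the default column-family settings; a minimal sketch of tuning a family before adding it (the family name "info" and the values chosen here are illustrative):
HColumnDescriptor colDesc = new HColumnDescriptor("info");
colDesc.setMaxVersions(3);  // keep up to three versions of each cell
colDesc.setInMemory(true);  // prefer keeping this family in the block cache
tableDesc.addFamily(colDesc);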
6. HTable
This class is the client's channel for communicating with an HBase table. It is not thread-safe for updates; in multithreaded code, manage table handles through HTablePool instead (see the sketch after the example).
Example:
HTable table = new HTable(conf, tableName);
Scan s = new Scan();
ResultScanner ss = table.getScanner(s);
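For multithreaded clients, a minimal sketch of going through HTablePool (the pool size of 10 is an arbitrary choice; in older HBase releases the handle is returned with pool.putTable(table) rather than table.close()):
HTablePool pool = new HTablePool(conf, 10);
HTableInterface table = pool.getTable(tableName);
try {
    // use the handle like a normal HTable
    Result r = table.get(new Get(Bytes.toBytes("xiaoming")));
} finally {
    table.close();  // returns the handle to the pool
}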
7. Put
This class adds or updates data in a single row.
Example:
HTable table = new HTable(conf, tableName);
Put put = new Put(Bytes.toBytes(rowKey));
put.add(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));
table.put(put);
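Several columns of the same row can also be batched into a single Put before one table.put() call, saving a round trip per column; the row, family, and qualifier names below are taken from the shell example in section II:
Put put = new Put(Bytes.toBytes("xiaoming"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("age"), Bytes.toBytes("24"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("company"), Bytes.toBytes("alibaba"));
table.put(put);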
8. Get
This class retrieves the data of a single row.
Example:
HTable table = new HTable(conf, tableName);
Get get = new Get(rowKey.getBytes());
Result rs = table.get(get);
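Just as in the shell (section II), a Get can be narrowed to one column family or one column before it is executed:
Get get = new Get(Bytes.toBytes("xiaoming"));
get.addFamily(Bytes.toBytes("info"));  // only the info family
// or, to fetch a single column instead:
// get.addColumn(Bytes.toBytes("info"), Bytes.toBytes("age"));
Result rs = table.get(get);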
9. Result
This class holds the single-row result returned by a Get or a Scan.
Example:
HTable table = new HTable(conf, tableName);
Get get = new Get(rowKey.getBytes());
Result rs = table.get(get);
for (KeyValue kv : rs.raw()) {
System.out.print(new String(kv.getRow()) + " ");
System.out.print(new String(kv.getFamily()) + ":");
System.out.print(new String(kv.getQualifier()) + " ");
System.out.print(kv.getTimestamp() + " ");
System.out.println(new String(kv.getValue()));
}
10. ResultScanner
This class is the client-side interface for iterating over the rows a scan returns; call close() on it when you are done so the server-side scanner resources are released.
Example:
HTable table = new HTable(conf, tableName);
Scan s = new Scan();
ResultScanner ss = table.getScanner(s);
for (Result r : ss) {
for (KeyValue kv : r.raw()) {
System.out.print(new String(kv.getRow()) + " ");
System.out.print(new String(kv.getFamily()) + ":");
System.out.print(new String(kv.getQualifier()) + " ");
System.out.print(kv.getTimestamp() + " ");
System.out.println(new String(kv.getValue()));
}
}
IV. HBase Java API in Practice
1. Importing the required jars
Running HBase programs from Eclipse requires importing:
all of the Hadoop jars
some of the HBase jars
The set of HBase jars has to be exact: extra ones cause conflicts, and missing ones produce class-not-found errors. The HBase jars I ended up with are shown in the figure below:
2. Getting the configuration
// Get the configuration
static {
conf = HBaseConfiguration.create();
// Location of the ZooKeeper quorum
conf.set("hbase.zookeeper.quorum", "sparkproject1,sparkproject2,sparkproject3");
System.out.println(conf.get("hbase.zookeeper.quorum"));
}
sparkproject1, sparkproject2, and sparkproject3 are the hostnames of the Hadoop cluster machines; make sure the hosts file on your local machine maps these three hostnames to their IP addresses, for example:
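(The .121 and .122 addresses here are illustrative; only 192.168.2.120 is known from section I to be the master.)
192.168.2.120 sparkproject1
192.168.2.121 sparkproject2
192.168.2.122 sparkproject3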
3. Creating and deleting tables
/**
 * Create a table
 */
public static void creatTable(String tableName, String[] familys) throws Exception {
HBaseAdmin admin = new HBaseAdmin(conf);
if (admin.tableExists(tableName)) {
System.out.println("table already exists!");
} else {
HTableDescriptor tableDesc = new HTableDescriptor(tableName);
for (int i = 0; i < familys.length; i++) {
tableDesc.addFamily(new HColumnDescriptor(familys[i]));
}
admin.createTable(tableDesc);
System.out.println("create table " + tableName + " ok.");
}
}
/**
 * Delete a table
 */
public static void deleteTable(String tableName) throws Exception {
try {
HBaseAdmin admin = new HBaseAdmin(conf);
admin.disableTable(tableName);
admin.deleteTable(tableName);
System.out.println("delete table " + tableName + " ok.");
} catch (MasterNotRunningException e) {
e.printStackTrace();
} catch (ZooKeeperConnectionException e) {
e.printStackTrace();
}
}
4. Inserting, deleting, updating, and querying table data
/**
 * Insert one row
 */
public static void addRecord(String tableName, String rowKey, String family, String qualifier, String value)
throws Exception {
try {
HTable table = new HTable(conf, tableName);
Put put = new Put(Bytes.toBytes(rowKey));
put.add(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));
table.put(put);
System.out.println("insert recored " + rowKey + " to table " + tableName + " ok.");
} catch (IOException e) {
e.printStackTrace();
}
}
/**
 * Delete one row
 */
public static void delRecord(String tableName, String rowKey) throws IOException {
HTable table = new HTable(conf, tableName);
List<Delete> list = new ArrayList<Delete>();
Delete del = new Delete(rowKey.getBytes());
list.add(del);
table.delete(list);
System.out.println("del recored " + rowKey + " ok.");
}
/**
 * Fetch one row
 */
public static void getOneRecord(String tableName, String rowKey) throws IOException {
HTable table = new HTable(conf, tableName);
Get get = new Get(rowKey.getBytes());
Result rs = table.get(get);
for (KeyValue kv : rs.raw()) {
System.out.print(new String(kv.getRow()) + " ");
System.out.print(new String(kv.getFamily()) + ":");
System.out.print(new String(kv.getQualifier()) + " ");
System.out.print(kv.getTimestamp() + " ");
System.out.println(new String(kv.getValue()));
}
}
/**
 * Show all rows
 */
public static void getAllRecord(String tableName) {
try {
HTable table = new HTable(conf, tableName);
Scan s = new Scan();
ResultScanner ss = table.getScanner(s);
for (Result r : ss) {
for (KeyValue kv : r.raw()) {
System.out.print(new String(kv.getRow()) + " ");
System.out.print(new String(kv.getFamily()) + ":");
System.out.print(new String(kv.getQualifier()) + " ");
System.out.print(kv.getTimestamp() + " ");
System.out.println(new String(kv.getValue()));
}
}
} catch (IOException e) {
e.printStackTrace();
}
}
5. End-to-end test
The full source:
package com.kang.hbase;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;
public class HBaseTest {
private static final String TABLE_NAME = "demo_table";
public static Configuration conf = null;
public HTable table = null;
public HBaseAdmin admin = null;
// Get the configuration
static {
conf = HBaseConfiguration.create();
// Location of the ZooKeeper quorum
conf.set("hbase.zookeeper.quorum", "sparkproject1,sparkproject2,sparkproject3");
System.out.println(conf.get("hbase.zookeeper.quorum"));
}
/**
 * Create a table
 */
public static void creatTable(String tableName, String[] familys) throws Exception {
HBaseAdmin admin = new HBaseAdmin(conf);
if (admin.tableExists(tableName)) {
System.out.println("table already exists!");
} else {
HTableDescriptor tableDesc = new HTableDescriptor(tableName);
for (int i = 0; i < familys.length; i++) {
tableDesc.addFamily(new HColumnDescriptor(familys[i]));
}
admin.createTable(tableDesc);
System.out.println("create table " + tableName + " ok.");
}
}
/**
 * Delete a table
 */
public static void deleteTable(String tableName) throws Exception {
try {
HBaseAdmin admin = new HBaseAdmin(conf);
admin.disableTable(tableName);
admin.deleteTable(tableName);
System.out.println("delete table " + tableName + " ok.");
} catch (MasterNotRunningException e) {
e.printStackTrace();
} catch (ZooKeeperConnectionException e) {
e.printStackTrace();
}
}
/**
 * Insert one row
 */
public static void addRecord(String tableName, String rowKey, String family, String qualifier, String value)
throws Exception {
try {
HTable table = new HTable(conf, tableName);
Put put = new Put(Bytes.toBytes(rowKey));
put.add(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));
table.put(put);
System.out.println("insert recored " + rowKey + " to table " + tableName + " ok.");
} catch (IOException e) {
e.printStackTrace();
}
}
/**
 * Delete one row
 */
public static void delRecord(String tableName, String rowKey) throws IOException {
HTable table = new HTable(conf, tableName);
List<Delete> list = new ArrayList<Delete>();
Delete del = new Delete(rowKey.getBytes());
list.add(del);
table.delete(list);
System.out.println("del recored " + rowKey + " ok.");
}
/**
 * Fetch one row
 */
public static void getOneRecord(String tableName, String rowKey) throws IOException {
HTable table = new HTable(conf, tableName);
Get get = new Get(rowKey.getBytes());
Result rs = table.get(get);
for (KeyValue kv : rs.raw()) {
System.out.print(new String(kv.getRow()) + " ");
System.out.print(new String(kv.getFamily()) + ":");
System.out.print(new String(kv.getQualifier()) + " ");
System.out.print(kv.getTimestamp() + " ");
System.out.println(new String(kv.getValue()));
}
}
/**
 * Show all rows
 */
public static void getAllRecord(String tableName) {
try {
HTable table = new HTable(conf, tableName);
Scan s = new Scan();
ResultScanner ss = table.getScanner(s);
for (Result r : ss) {
for (KeyValue kv : r.raw()) {
System.out.print(new String(kv.getRow()) + " ");
System.out.print(new String(kv.getFamily()) + ":");
System.out.print(new String(kv.getQualifier()) + " ");
System.out.print(kv.getTimestamp() + " ");
System.out.println(new String(kv.getValue()));
}
}
} catch (IOException e) {
e.printStackTrace();
}
}
public static void main(String[] args) {
try {
String tablename = "scores";
String[] familys = { "grade", "course" };
HBaseTest.creatTable(tablename, familys);
// add record zkb
HBaseTest.addRecord(tablename, "zkb", "grade", "", "5");
HBaseTest.addRecord(tablename, "zkb", "course", "", "90");
HBaseTest.addRecord(tablename, "zkb", "course", "math", "97");
HBaseTest.addRecord(tablename, "zkb", "course", "art", "87");
// add record baoniu
HBaseTest.addRecord(tablename, "baoniu", "grade", "", "4");
HBaseTest.addRecord(tablename, "baoniu", "course", "math", "89");
System.out.println("===========get one record========");
HBaseTest.getOneRecord(tablename, "zkb");
System.out.println("===========show all record========");
HBaseTest.getAllRecord(tablename);
System.out.println("===========del one record========");
HBaseTest.delRecord(tablename, "baoniu");
HBaseTest.getAllRecord(tablename);
System.out.println("===========show all record========");
HBaseTest.getAllRecord(tablename);
} catch (Exception e) {
e.printStackTrace();
}
}
}
Choose Run As -> Run on Hadoop; the console output is as follows:
Checking in the HBase shell:
V. HBase with MapReduce
Using the mobile internet access logs from my earlier post 《走向云计算之MapReduce的代码辅助优化和改善》 as the data source, the goal here is to load those logs into HBase through a MapReduce job.
1. Create the table in HBase
Create a table named wlan_log through the shell; to keep things simple, it has just one column family, cf:
create 'wlan_log','cf'
2. Writing the data into HBase
Create a new class in Eclipse with the code below. The mapper assumes each log line is tab-separated, with the access timestamp in the first field and the phone number (msisdn) in the second; the rowkey it emits has the form msisdn:timestamp.
package com.kang.hbase;
import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
public class MRHbase {
static class BatchImportMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
SimpleDateFormat dateformat1 = new SimpleDateFormat("yyyyMMddHHmmss");
Text v2 = new Text();
protected void map(LongWritable key, Text value, Context context)
throws java.io.IOException, InterruptedException {
final String[] splited = value.toString().split("\t");
try {
final Date date = new Date(Long.parseLong(splited[0].trim()));
final String dateFormat = dateformat1.format(date);
String rowKey = splited[1] + ":" + dateFormat;
v2.set(rowKey + "\t" + value.toString());
context.write(key, v2);
} catch (NumberFormatException e) {
final Counter counter = context.getCounter("BatchImportJob", "ErrorFormat");
counter.increment(1L);
System.out.println("出错了" + splited[0] + " " + e.getMessage());
}
};
}
static class BatchImportReducer extends TableReducer<LongWritable, Text, NullWritable> {
protected void reduce(LongWritable key, java.lang.Iterable<Text> values, Context context)
throws java.io.IOException, InterruptedException {
for (Text text : values) {
final String[] splited = text.toString().split("\t");
final Put put = new Put(Bytes.toBytes(splited[0]));
put.add(Bytes.toBytes("cf"), Bytes.toBytes("date"), Bytes.toBytes(splited[1]));
put.add(Bytes.toBytes("cf"), Bytes.toBytes("msisdn"), Bytes.toBytes(splited[2]));
// The remaining fields are omitted; write them with further put.add(...) calls
context.write(NullWritable.get(), put);
}
};
}
public static void main(String[] args) throws Exception {
final Configuration configuration = new Configuration();
// Point the client at the ZooKeeper quorum
configuration.set("hbase.zookeeper.quorum", "sparkproject1,sparkproject2,sparkproject3");
// Name of the target HBase table
configuration.set(TableOutputFormat.OUTPUT_TABLE, "wlan_log");
// Raise this value to keep HBase from timing out and aborting
configuration.set("dfs.socket.timeout", "180000");
final Job job = new Job(configuration, "HBaseBatchImportJob");
job.setMapperClass(BatchImportMapper.class);
job.setReducerClass(BatchImportReducer.class);
// Set the map output types; the reduce output types are handled by TableOutputFormat
job.setMapOutputKeyClass(LongWritable.class);
job.setMapOutputValueClass(Text.class);
job.setInputFormatClass(TextInputFormat.class);
// No output path is set; instead, set the output format class
job.setOutputFormatClass(TableOutputFormat.class);
FileInputFormat.setInputPaths(job, "hdfs://sparkproject1:9000/root/input/");
boolean success = job.waitForCompletion(true);
if (success) {
System.out.println("Bath import to HBase success!");
System.exit(0);
} else {
System.out.println("Batch import to HBase failed!");
System.exit(1);
}
}
}
After the code has run, check the import result in the HBase shell with the list command:
3. Reading the data back through Java code
package com.kang.hbase;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;
public class WlanLogApp {
private static final String TABLE_NAME = "wlan_log";
private static final String FAMILY_NAME = "cf";
/**
 * A basic usage example of the HBase Java API: read back the imported data
*
* @throws Exception
*/
public static void main(String[] args) throws Exception {
System.out.println("手机13600217502的所有上网记录如下:");
scan(TABLE_NAME,"13600217502");
System.out.println("134号段的所有上网记录如下:");
scanPeriod(TABLE_NAME, "136");
}
/*
 * Query all access records for phone 13600217502
 */
public static void scan(String tableName, String mobileNum)
throws IOException {
HTable table = new HTable(getConfiguration(), tableName);
Scan scan = new Scan();
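// Rowkeys have the form msisdn:yyyyMMddHHmmss. In ASCII, '/' sorts just before
// '0' and ':' sorts just after '9', so the range [num + ":/", num + "::")
// covers every rowkey beginning with num + ":" followed by digits.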
scan.setStartRow(Bytes.toBytes(mobileNum + ":/"));
scan.setStopRow(Bytes.toBytes(mobileNum + "::"));
ResultScanner scanner = table.getScanner(scan);
int i = 0;
for (Result result : scanner) {
System.out.println("Scan: " + i + " " + result);
i++;
}
}
/*
 * Query all access records for numbers with the 136 prefix
 */
public static void scanPeriod(String tableName, String period)
throws IOException {
HTable table = new HTable(getConfiguration(), tableName);
Scan scan = new Scan();
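// The same ASCII trick as in scan(): '/' < '0' and ':' > '9', so this range
// covers every msisdn that starts with the given digit prefix.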
scan.setStartRow(Bytes.toBytes(period + "/"));
scan.setStopRow(Bytes.toBytes(period + ":"));
scan.setMaxVersions(1);
ResultScanner scanner = table.getScanner(scan);
int i = 0;
for (Result result : scanner) {
System.out.println("Scan: " + i + " " + result);
i++;
}
}
/*
 * Build the HBase configuration
 */
private static Configuration getConfiguration() {
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "sparkproject1,sparkproject2,sparkproject3");
return conf;
}
}
The output is as follows: