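Introduction
An earlier article covered HBase filters in detail; this one fills in the basic table and data operations.
This document was written with reference to the latest official Ref Guide and Developer API (as of July 16, 2014).
All code targets "hbase 0.96.2-hadoop2" and has been tested.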
http://blog.csdn.net/u010967382/article/details/37878701
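Overview
For table creation, HBase has a namespace concept similar to that of an RDBMS: a table can be created in a specified namespace, or created directly, in which case it goes into the default namespace.
For data, HBase supports four main operations:
Put: add a row or modify a row;
Delete: delete a row, delete a column family, delete multiple versions of a column, delete a specific version of a column, and so on;
Get: fetch a single row;
Scan: fetch multiple rows.
All four classes live in the org.apache.hadoop.hbase.client package; see the official API documentation for full details. This article only summarizes the commonly used methods, aiming to let readers master 80% of the everyday functionality in 20% of the time.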
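To make the Put and Delete semantics concrete, here is a minimal sketch of a single versioned column. It is a toy in-memory model, not the HBase client API: the class VersionedCell and all of its methods are hypothetical names introduced only for illustration. It assumes the behaviour described in the Delete section of this article, namely that a delete only places a tombstone and that the data is physically removed later, when files are compacted.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model of one column in one row (hypothetical, NOT the HBase API).
class VersionedCell {
    // timestamp -> value, iterated newest first
    private final TreeMap<Long, String> versions = new TreeMap<>(Comparator.reverseOrder());
    // timestamps that carry a delete marker ("tombstone")
    private final Set<Long> tombstones = new HashSet<>();

    // Put: a new timestamp adds a version, an existing one overwrites it
    void put(long ts, String value) {
        versions.put(ts, value);
    }

    // Delete one version: only a marker is recorded, the data stays stored
    void deleteVersion(long ts) {
        tombstones.add(ts);
    }

    // Get: newest version that is not hidden by a tombstone, or null
    String get() {
        for (Map.Entry<Long, String> e : versions.entrySet()) {
            if (!tombstones.contains(e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }

    // Compaction: tombstoned versions are physically dropped
    void compact() {
        versions.keySet().removeAll(tombstones);
        tombstones.clear();
    }

    int storedVersions() {
        return versions.size();
    }

    public static void main(String[] args) {
        VersionedCell address = new VersionedCell();
        address.put(1L, "caofang");
        address.put(2L, "caofang1");
        System.out.println(address.get());            // caofang1
        address.deleteVersion(2L);
        System.out.println(address.get());            // caofang
        System.out.println(address.storedVersions()); // 2
        address.compact();
        System.out.println(address.storedVersions()); // 1
    }
}
```

The model deliberately ignores column families, row keys, and version limits; it only shows why deleting the newest version makes the previous one visible again.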
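1. Namespaces
2. Creating a table
3. Deleting a table
4. Modifying a table
5. Adding and updating data: Put
6. Deleting data: Delete
7. Fetching a single row: Get
8. Fetching multiple rows: Scan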
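1. Namespaces
A namespace is a logical grouping of tables, analogous to a database in a relational database system; tables in the same group serve similar purposes. The concept lays the groundwork for upcoming multi-tenancy features.
1.1. Namespace management
A namespace can be created, removed, or altered.
A table's namespace membership is determined when the table is created, using the format <table namespace>:<table qualifier>.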
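As a rough illustration of how a qualified table name splits into its two parts, the sketch below parses names such as 'foo:bar' and 'bar'. QualifiedName is a hypothetical helper written for this article, not an HBase class (HBase has its own TableName type); the only behaviour it assumes is the one stated here, that a name without a namespace prefix belongs to the default namespace.

```java
// Hypothetical helper for illustration only; not part of the HBase API.
class QualifiedName {
    final String namespace;
    final String qualifier;

    QualifiedName(String namespace, String qualifier) {
        this.namespace = namespace;
        this.qualifier = qualifier;
    }

    // Split "<namespace>:<qualifier>"; a bare name falls into "default".
    static QualifiedName parse(String name) {
        int idx = name.indexOf(':');
        if (idx < 0) {
            return new QualifiedName("default", name);
        }
        return new QualifiedName(name.substring(0, idx), name.substring(idx + 1));
    }

    public static void main(String[] args) {
        QualifiedName a = QualifiedName.parse("foo:bar");
        QualifiedName b = QualifiedName.parse("bar");
        System.out.println(a.namespace + " / " + a.qualifier); // foo / bar
        System.out.println(b.namespace + " / " + b.qualifier); // default / bar
    }
}
```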
Example: creating a namespace, creating a table in a namespace, dropping a namespace, and altering a namespace in the hbase shell

#Create a namespace
create_namespace 'my_ns'
#create my_table in my_ns namespace
create 'my_ns:my_table', 'fam'
#drop namespace
drop_namespace 'my_ns'
#alter namespace
alter_namespace 'my_ns', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}

1.2. Predefined namespaces
There are two predefined namespaces built into the system:
hbase: the system namespace, which holds HBase's internal tables;
default: the namespace for tables created without an explicit namespace.

Example: specified namespace and default namespace

#namespace=foo and table qualifier=bar
create 'foo:bar', 'fam'
#namespace=default and table qualifier=bar
create 'bar', 'fam'

2. Creating a table
Here is the sample code. All Java snippets in this article are method bodies; they use Configuration from org.apache.hadoop.conf, and the remaining classes come from org.apache.hadoop.hbase and its client, filter, and util subpackages.

Configuration conf = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(conf);
// create namespace named "my_ns"
admin.createNamespace(NamespaceDescriptor.create("my_ns").build());
// create tableDesc, with namespace name "my_ns" and table name "mytable"
HTableDescriptor tableDesc = new HTableDescriptor(TableName.valueOf("my_ns:mytable"));
tableDesc.setDurability(Durability.SYNC_WAL);
// add a column family "mycf"
HColumnDescriptor hcd = new HColumnDescriptor("mycf");
tableDesc.addFamily(hcd);
admin.createTable(tableDesc);
admin.close();

3. Deleting a table
Deleting a table is less involved than creating one; here is the code:

Configuration conf = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(conf);
String tablename = "my_ns:mytable";
if (admin.tableExists(tablename)) {
    try {
        admin.disableTable(tablename);
        admin.deleteTable(tablename);
    } catch (Exception e) {
        // TODO: handle exception
        e.printStackTrace();
    }
}
admin.close();

Note: a table must be disabled before it can be deleted.

4. Modifying a table
4.1. Sample code
(1) Removing and adding column families
Before the change, the table has four column families:

hbase(main):014:0> describe 'rd_ns:itable'
DESCRIPTION ENABLED
'rd_ns:itable', true
{NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '10', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
{NAME => 'newcf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '10', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
{NAME => 'note', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '10', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
{NAME => 'sysinfo', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '10', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'true', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.0450 seconds

The following code modifies the table, removing three column families and adding one:

Configuration conf = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(conf);
String tablename = "rd_ns:itable";
if (admin.tableExists(tablename)) {
    try {
        admin.disableTable(tablename);
        // get the TableDescriptor of target table
        HTableDescriptor newtd = admin.getTableDescriptor(Bytes.toBytes("rd_ns:itable"));
        // remove 3 useless column families
        newtd.removeFamily(Bytes.toBytes("note"));
        newtd.removeFamily(Bytes.toBytes("newcf"));
        newtd.removeFamily(Bytes.toBytes("sysinfo"));
        // create HColumnDescriptor for new column family
        HColumnDescriptor newhcd = new HColumnDescriptor("action_log");
        newhcd.setMaxVersions(10);
        newhcd.setKeepDeletedCells(true);
        // add the new column family (HColumnDescriptor) to HTableDescriptor
        newtd.addFamily(newhcd);
        // modify target table structure
        admin.modifyTable(Bytes.toBytes("rd_ns:itable"), newtd);
        admin.enableTable(tablename);
    } catch (Exception e) {
        // TODO: handle exception
        e.printStackTrace();
    }
}
admin.close();

After the change:

hbase(main):015:0> describe 'rd_ns:itable'
DESCRIPTION ENABLED
'rd_ns:itable', true
{NAME => 'action_log', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '10', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'true', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
{NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '10', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.0400 seconds

The logic is simple: disable the table, fetch its HTableDescriptor, remove or add column families on the descriptor, submit it with modifyTable, and enable the table again.

(2) Modifying an attribute of an existing column family (setMaxVersions)

Configuration conf = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(conf);
String tablename = "rd_ns:itable";
if (admin.tableExists(tablename)) {
    try {
        admin.disableTable(tablename);
        // get the TableDescriptor of target table
        HTableDescriptor htd = admin.getTableDescriptor(Bytes.toBytes("rd_ns:itable"));
        HColumnDescriptor infocf = htd.getFamily(Bytes.toBytes("info"));
        infocf.setMaxVersions(100);
        // modify target table structure
        admin.modifyTable(Bytes.toBytes("rd_ns:itable"), htd);
        admin.enableTable(tablename);
    } catch (Exception e) {
        // TODO: handle exception
        e.printStackTrace();
    }
}
admin.close();

5. Adding and updating data: Put
5.1. Common constructors
(1) Specify the row key
public Put(byte[] row)
Parameter: row, the row key
(2) Specify the row key and a timestamp
public Put(byte[] row, long ts)
Parameters: row, the row key; ts, the timestamp
(3) Take a slice of a byte array as the row key
Put(byte[] rowArray, int rowOffset, int rowLength)
(4) Take a slice of a byte array as the row key, and add a timestamp
Put(byte[] rowArray, int rowOffset, int rowLength, long ts)

5.2. Common methods
(1) Add a value for the given column family and qualifier
add(byte[] family, byte[] qualifier, byte[] value)
(2) Add a value for the given column family, qualifier, and timestamp
add(byte[] family, byte[] qualifier, long ts, byte[] value)
(3) Set the WAL (Write-Ahead-Log) durability level
public void setDurability(Durability d)
The parameter is an enum value; the options include:
ASYNC_WAL: write the WAL asynchronously when data changes
SYNC_WAL: write the WAL synchronously when data changes
FSYNC_WAL: write the WAL synchronously when data changes, and force the data to disk
SKIP_WAL: do not write the WAL

5.3. Sample code
(1) Inserting a row

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Put put = new Put(Bytes.toBytes("100001"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("lion"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("address"), Bytes.toBytes("shangdi"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("age"), Bytes.toBytes("30"));
put.setDurability(Durability.SYNC_WAL);
table.put(put);
table.close();

(2) Updating a row

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Put put = new Put(Bytes.toBytes("100001"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("lee"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("address"), Bytes.toBytes("longze"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("age"), Bytes.toBytes("31"));
put.setDurability(Durability.SYNC_WAL);
table.put(put);
table.close();

Every Put constructor requires a row key. If the row key is new, a row is added; if the row key already exists, the existing row is updated.

(3) Building a Put whose row key is a slice of a byte array

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Put put = new Put(Bytes.toBytes("100001_100002"), 7, 6);
put.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("show"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("address"), Bytes.toBytes("caofang"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("age"), Bytes.toBytes("30"));
table.put(put);
table.close();

Note on Put put = new Put(Bytes.toBytes("100001_100002"), 7, 6): the second argument is the offset into the byte array and the third is the number of bytes to take, so the row key here is the 6 bytes starting at offset 7, that is, "100002".

6. Deleting data: Delete
The Delete class removes data from a row of a table; the operation is executed through HTable.delete.
When a Delete is executed, HBase does not remove the data immediately. It places a "tombstone" marker on the data to be deleted, and the tombstoned data is only purged later, when the StoreFiles are compacted.
To delete a whole row, initialize a Delete object with the row key. To narrow down what is deleted, use the methods of the Delete object, such as deleteColumn, deleteColumns, and deleteFamilyVersion.
The constructors and common methods are described below.

6.1. Constructors
(1) Specify the row key to delete
Delete(byte[] row)
Deletes the data of the row identified by the row key.
With no further calls, this constructor deletes all versions of all columns in all column families of that row!
(2) Specify the row key and a timestamp
Delete(byte[] row, long timestamp)
Deletes the data identified by the row key together with the timestamp.
With no further calls, this constructor deletes, for all columns in all column families of that row, every version whose timestamp is less than or equal to the given timestamp.
Note: this timestamp only applies to deleting the row. If you go on to specify a column family or a column, you must give each of them its own timestamp.
(3) Given a byte array, the offset of the row key within it, and the length to take
Delete(byte[] rowArray, int rowOffset, int rowLength)
(4) Given a byte array, the offset of the row key within it, the length to take, and a timestamp
Delete(byte[] rowArray, int rowOffset, int rowLength, long ts)

6.2. Common methods
The methods used in this article are deleteColumn (delete the latest version of a column), deleteColumns (delete all versions of a column), and deleteFamilyVersion (delete, within a column family, all columns whose timestamp equals the given one). Each is demonstrated in 6.3.

6.3. Sample code
(1) Deleting a whole row: all column families, all columns, all versions

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Delete delete = new Delete(Bytes.toBytes("000"));
table.delete(delete);
table.close();

(2) Deleting the latest version of a column
Below is the data before the delete. Look at info:address in row 100003: the latest version of this column has the value caofang1, and the version before it has the value caofang.

hbase(main):007:0> scan 'rd_ns:leetable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405304843114, value=longze
100001 column=info:age, timestamp=1405304843114, value=31
100001 column=info:name, timestamp=1405304843114, value=leon
100002 column=info:address, timestamp=1405305471343, value=caofang
100002 column=info:age, timestamp=1405305471343, value=30
100002 column=info:name, timestamp=1405305471343, value=show
100003 column=info:address, timestamp=1405390959464, value=caofang1
100003 column=info:age, timestamp=1405390959464, value=301
100003 column=info:name, timestamp=1405390959464, value=show1
3 row(s) in 0.0270 seconds

Run the following code:

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Delete delete = new Delete(Bytes.toBytes("100003"));
delete.deleteColumn(Bytes.toBytes("info"), Bytes.toBytes("address"));
table.delete(delete);
table.close();

Looking at the data again, info:address in row 100003 now shows the previous version, caofang. All other values are unchanged:

hbase(main):008:0> scan 'rd_ns:leetable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405304843114, value=longze
100001 column=info:age, timestamp=1405304843114, value=31
100001 column=info:name, timestamp=1405304843114, value=leon
100002 column=info:address, timestamp=1405305471343, value=caofang
100002 column=info:age, timestamp=1405305471343, value=30
100002 column=info:name, timestamp=1405305471343, value=show
100003 column=info:address, timestamp=1405390728175, value=caofang
100003 column=info:age, timestamp=1405390959464, value=301
100003 column=info:name, timestamp=1405390959464, value=show1
3 row(s) in 0.0560 seconds

(3) Deleting all versions of a column
Continuing from the scenario above, run the following code:

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Delete delete = new Delete(Bytes.toBytes("100003"));
delete.deleteColumns(Bytes.toBytes("info"), Bytes.toBytes("address"));
table.delete(delete);
table.close();

The whole info:address column of row 100003 is now gone:

hbase(main):009:0> scan 'rd_ns:leetable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405304843114, value=longze
100001 column=info:age, timestamp=1405304843114, value=31
100001 column=info:name, timestamp=1405304843114, value=leon
100002 column=info:address, timestamp=1405305471343, value=caofang
100002 column=info:age, timestamp=1405305471343, value=30
100002 column=info:name, timestamp=1405305471343, value=show
100003 column=info:age, timestamp=1405390959464, value=301
100003 column=info:name, timestamp=1405390959464, value=show1
3 row(s) in 0.0240 seconds

(4) Deleting, within a column family, the versions of all columns whose timestamp equals a given timestamp
To demonstrate the effect, a new value has first been inserted into info:address of row 100003:

hbase(main):010:0> scan 'rd_ns:leetable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405304843114, value=longze
100001 column=info:age, timestamp=1405304843114, value=31
100001 column=info:name, timestamp=1405304843114, value=leon
100002 column=info:address, timestamp=1405305471343, value=caofang
100002 column=info:age, timestamp=1405305471343, value=30
100002 column=info:name, timestamp=1405305471343, value=show
100003 column=info:address, timestamp=1405391883886, value=shangdi
100003 column=info:age, timestamp=1405390959464, value=301
100003 column=info:name, timestamp=1405390959464, value=show1
3 row(s) in 0.0250 seconds

The goal now is to delete every column value in the info column family whose timestamp is 1405390959464:

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Delete delete = new Delete(Bytes.toBytes("100003"));
delete.deleteFamilyVersion(Bytes.toBytes("info"), 1405390959464L);
table.delete(delete);
table.close();

hbase(main):011:0> scan 'rd_ns:leetable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405304843114, value=longze
100001 column=info:age, timestamp=1405304843114, value=31
100001 column=info:name, timestamp=1405304843114, value=leon
100002 column=info:address, timestamp=1405305471343, value=caofang
100002 column=info:age, timestamp=1405305471343, value=30
100002 column=info:name, timestamp=1405305471343, value=show
100003 column=info:address, timestamp=1405391883886, value=shangdi
100003 column=info:age, timestamp=1405390728175, value=30
100003 column=info:name, timestamp=1405390728175, value=show
3 row(s) in 0.0250 seconds

As the output shows, the info column family of row 100003 no longer contains any data with timestamp 1405390959464, so the next older versions are returned instead. The address column, whose timestamp is not 1405390959464, is unaffected by this delete.

7. Fetching a single row: Get
To fetch a whole row, initialize a Get object with the row key. To narrow down the data returned, use the methods of the Get object, such as addColumn, setTimeStamp, and setMaxVersions.
The constructor and common methods are described below.

7.1. Constructor
Get has a single, simple constructor:
Get(byte[] row)
The parameter is the row key.

7.2. Common methods
The methods used in this article are addColumn (restrict the result to one column), setTimeStamp (return only data with the given timestamp), and setMaxVersions (return all versions). Each is demonstrated in 7.3.

7.3. Tested code
All the data in the test table:

hbase(main):016:0> scan 'rd_ns:leetable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405304843114, value=longze
100001 column=info:age, timestamp=1405304843114, value=31
100001 column=info:name, timestamp=1405304843114, value=leon
100002 column=info:address, timestamp=1405305471343, value=caofang
100002 column=info:age, timestamp=1405305471343, value=30
100002 column=info:name, timestamp=1405305471343, value=show
100003 column=info:address, timestamp=1405407883218, value=qinghe
100003 column=info:age, timestamp=1405407883218, value=28
100003 column=info:name, timestamp=1405407883218, value=shichao
3 row(s) in 0.0250 seconds

(1) Fetching the latest version of all columns in all column families of the row identified by the row key

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Get get = new Get(Bytes.toBytes("100003"));
Result r = table.get(get);
for (Cell cell : r.rawCells()) {
    System.out.println(
        "Rowkey : " + Bytes.toString(r.getRow()) +
        " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
        " Value : " + Bytes.toString(CellUtil.cloneValue(cell)));
}
table.close();

Output:
Rowkey : 100003 Familiy:Quilifier : address Value : qinghe
Rowkey : 100003 Familiy:Quilifier : age Value : 28
Rowkey : 100003 Familiy:Quilifier : name Value : shichao

(2) Fetching the latest version of a specific column of the row

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Get get = new Get(Bytes.toBytes("100003"));
get.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"));
Result r = table.get(get);
for (Cell cell : r.rawCells()) {
    System.out.println(
        "Rowkey : " + Bytes.toString(r.getRow()) +
        " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
        " Value : " + Bytes.toString(CellUtil.cloneValue(cell)));
}
table.close();

Output:
Rowkey : 100003 Familiy:Quilifier : name Value : shichao

(3) Fetching the data of the row at a specific timestamp

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Get get = new Get(Bytes.toBytes("100003"));
get.setTimeStamp(1405407854374L);
Result r = table.get(get);
for (Cell cell : r.rawCells()) {
    System.out.println(
        "Rowkey : " + Bytes.toString(r.getRow()) +
        " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
        " Value : " + Bytes.toString(CellUtil.cloneValue(cell)));
}
table.close();

The code prints historical data that the scan output above does not show:
Rowkey : 100003 Familiy:Quilifier : address Value : huangzhuang
Rowkey : 100003 Familiy:Quilifier : age Value : 32
Rowkey : 100003 Familiy:Quilifier : name Value : lily

(4) Fetching all versions of the data in the row

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:itable");
Get get = new Get(Bytes.toBytes("100003"));
get.setMaxVersions();
Result r = table.get(get);
for (Cell cell : r.rawCells()) {
    System.out.println(
        "Rowkey : " + Bytes.toString(r.getRow()) +
        " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
        " Value : " + Bytes.toString(CellUtil.cloneValue(cell)) +
        " Time : " + cell.getTimestamp());
}
table.close();

Output:
Rowkey : 100003 Familiy:Quilifier : address Value : xierqi Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : address Value : shangdi Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : address Value : longze Time : 1405417448414
Rowkey : 100003 Familiy:Quilifier : age Value : 29 Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : age Value : 30 Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : age Value : 31 Time : 1405417448414
Rowkey : 100003 Familiy:Quilifier : name Value : leon Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : name Value : lee Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : name Value : lion Time : 1405417448414

Note: multiple versions can only be returned if the column family is able to store multiple versions. The number of versions a column family keeps is set with HColumnDescriptor's setMaxVersions(int) method.

8. Fetching multiple rows: Scan
A Scan object returns the multiple rows that satisfy the given conditions.
To fetch all rows, simply initialize a Scan object.
To restrict the range of rows scanned, use methods such as setStartRow and setStopRow.
Below is an introductory example from the official documentation. Suppose a table has rows with keys "row1", "row2", "row3", and other rows with keys "abc1", "abc2", and "abc3"; the goal is to return the rows whose keys start with "row":

HTable htable = ... // instantiate HTable
Scan scan = new Scan();
scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("attr"));
scan.setStartRow(Bytes.toBytes("row")); // start key is inclusive
scan.setStopRow(Bytes.toBytes("row" + (char) 0)); // stop key is exclusive
ResultScanner rs = htable.getScanner(scan);
try {
    for (Result r = rs.next(); r != null; r = rs.next()) {
        // process result...
    }
} finally {
    rs.close(); // always close the ResultScanner!
}

Be aware that row keys are compared byte by byte, so "row1" sorts after the stop key "row" + (char) 0 and falls outside this range as written. To cover every key that starts with "row", the stop key has to sort after all of them, for example "rox".

8.1. Common constructors
(1) Create a Scan over all rows
Scan()
(2) Create a Scan that starts at a given row
Scan(byte[] startRow)
Parameter: startRow, the row key
Note: if the given row does not exist, the scan starts at the next closest row.
(3) Create a Scan with start and stop rows
Scan(byte[] startRow, byte[] stopRow)
Parameters: startRow, the start row; stopRow, the stop row
Note: startRow <= result set < stopRow
(4) Create a Scan with a start row and a filter
Scan(byte[] startRow, Filter filter)
Parameters: startRow, the start row; filter, the filter
Note: for what filters do and how to build them, see http://blog.csdn.net/u010967382/article/details/37653177

8.2. Common methods
void setRaw(boolean raw)
Enables or disables raw mode for this scan. With raw mode enabled, the Scan returns all delete markers and all deleted data that has not yet been collected. According to the Javadoc, this is mostly useful for scans on column families that have KEEP_DELETED_ROWS enabled, that is, families configured with hcd.setKeepDeletedCells(true). It is an error to specify any column when raw mode is set.

8.3. Tested code
(1) Scanning the latest version of all rows in the table

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:itable");
Scan s = new Scan();
ResultScanner rs = table.getScanner(s);
for (Result r : rs) {
    for (Cell cell : r.rawCells()) {
        System.out.println(
            "Rowkey : " + Bytes.toString(r.getRow()) +
            " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
            " Value : " + Bytes.toString(CellUtil.cloneValue(cell)) +
            " Time : " + cell.getTimestamp());
    }
}
table.close();

Output:
Rowkey : 100001 Familiy:Quilifier : address Value : anywhere Time : 1405417403438
Rowkey : 100001 Familiy:Quilifier : age Value : 24 Time : 1405417403438
Rowkey : 100001 Familiy:Quilifier : name Value : zhangtao Time : 1405417403438
Rowkey : 100002 Familiy:Quilifier : address Value : shangdi Time : 1405417426693
Rowkey : 100002 Familiy:Quilifier : age Value : 28 Time : 1405417426693
Rowkey : 100002 Familiy:Quilifier : name Value : shichao Time : 1405417426693
Rowkey : 100003 Familiy:Quilifier : address Value : xierqi Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : age Value : 29 Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : name Value : leon Time : 1405417500485

(2) Scanning a row key range, appending "0" to the stop key so that the result set includes the stop row

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:itable");
Scan s = new Scan();
s.setStartRow(Bytes.toBytes("100001"));
s.setStopRow(Bytes.toBytes("1000020"));
ResultScanner rs = table.getScanner(s);
for (Result r : rs) {
    for (Cell cell : r.rawCells()) {
        System.out.println(
            "Rowkey : " + Bytes.toString(r.getRow()) +
            " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
            " Value : " + Bytes.toString(CellUtil.cloneValue(cell)) +
            " Time : " + cell.getTimestamp());
    }
}
table.close();

Output:
Rowkey : 100001 Familiy:Quilifier : address Value : anywhere Time : 1405417403438
Rowkey : 100001 Familiy:Quilifier : age Value : 24 Time : 1405417403438
Rowkey : 100001 Familiy:Quilifier : name Value : zhangtao Time : 1405417403438
Rowkey : 100002 Familiy:Quilifier : address Value : shangdi Time : 1405417426693
Rowkey : 100002 Familiy:Quilifier : age Value : 28 Time : 1405417426693
Rowkey : 100002 Familiy:Quilifier : name Value : shichao Time : 1405417426693

(3) Returning all data that has been marked as deleted but not yet physically removed
This test targets row 100003 of the rd_ns:itable table.
A Get combined with setMaxVersions() returns all data that has not been deleted. Its output is:

Rowkey : 100003 Familiy:Quilifier : address Value : huilongguan Time : 1405494141522
Rowkey : 100003 Familiy:Quilifier : address Value : shangdi Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : age Value : new29 Time : 1405494141522
Rowkey : 100003 Familiy:Quilifier : name Value : liyang Time : 1405494141522

With Scan's s.setRaw(true), however, you also get all the data that has been marked as deleted but not yet physically removed. The code is as follows:

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:itable");
Scan s = new Scan();
s.setStartRow(Bytes.toBytes("100003"));
s.setRaw(true);
s.setMaxVersions();
ResultScanner rs = table.getScanner(s);
for (Result r : rs) {
    for (Cell cell : r.rawCells()) {
        System.out.println(
            "Rowkey : " + Bytes.toString(r.getRow()) +
            " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
            " Value : " + Bytes.toString(CellUtil.cloneValue(cell)) +
            " Time : " + cell.getTimestamp());
    }
}
table.close();

The output is as follows (the lines with an empty Value are delete markers):
Rowkey : 100003 Familiy:Quilifier : address Value : huilongguan Time : 1405494141522
Rowkey : 100003 Familiy:Quilifier : address Value : Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : address Value : xierqi Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : address Value : shangdi Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : address Value : Time : 1405417448414
Rowkey : 100003 Familiy:Quilifier : address Value : longze Time : 1405417448414
Rowkey : 100003 Familiy:Quilifier : age Value : new29 Time : 1405494141522
Rowkey : 100003 Familiy:Quilifier : age Value : Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : age Value : Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : age Value : 29 Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : age Value : 30 Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : age Value : 31 Time : 1405417448414
Rowkey : 100003 Familiy:Quilifier : name Value : liyang Time : 1405494141522
Rowkey : 100003 Familiy:Quilifier : name Value : Time : 1405493879419
Rowkey : 100003 Familiy:Quilifier : name Value : leon Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : name Value : lee Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : name Value : lion Time : 1405417448414

(4) Combining filters to fetch all rows whose age is between 25 and 30
The current data:

hbase(main):049:0> scan 'rd_ns:itable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405417403438, value=anywhere
100001 column=info:age, timestamp=1405417403438, value=24
100001 column=info:name, timestamp=1405417403438, value=zhangtao
100002 column=info:address, timestamp=1405417426693, value=shangdi
100002 column=info:age, timestamp=1405417426693, value=28
100002 column=info:name, timestamp=1405417426693, value=shichao
100003 column=info:address, timestamp=1405494141522, value=huilongguan
100003 column=info:age, timestamp=1405494999631, value=29
100003 column=info:name, timestamp=1405494141522, value=liyang
3 row(s) in 0.0240 seconds

Code:

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:itable");
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
SingleColumnValueFilter filter1 = new SingleColumnValueFilter(
    Bytes.toBytes("info"),
    Bytes.toBytes("age"),
    CompareOp.GREATER_OR_EQUAL,
    Bytes.toBytes("25"));
SingleColumnValueFilter filter2 = new SingleColumnValueFilter(
    Bytes.toBytes("info"),
    Bytes.toBytes("age"),
    CompareOp.LESS_OR_EQUAL,
    Bytes.toBytes("30"));
filterList.addFilter(filter1);
filterList.addFilter(filter2);
Scan scan = new Scan();
scan.setFilter(filterList);
ResultScanner rs = table.getScanner(scan);
for (Result r : rs) {
    for (Cell cell : r.rawCells()) {
        System.out.println(
            "Rowkey : " + Bytes.toString(r.getRow()) +
            " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
            " Value : " + Bytes.toString(CellUtil.cloneValue(cell)) +
            " Time : " + cell.getTimestamp());
    }
}
table.close();

Output:
Rowkey : 100002 Familiy:Quilifier : address Value : shangdi Time : 1405417426693
Rowkey : 100002 Familiy:Quilifier : age Value : 28 Time : 1405417426693
Rowkey : 100002 Familiy:Quilifier : name Value : shichao Time : 1405417426693
Rowkey : 100003 Familiy:Quilifier : address Value : huilongguan Time : 1405494141522
Rowkey : 100003 Familiy:Quilifier : age Value : 29 Time : 1405494999631
Rowkey : 100003 Familiy:Quilifier : name Value : liyang Time : 1405494141522
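Both the Scan stop-row trick (appending "0" so that the stop row is included) and the age filter above depend on the fact that row keys and values are compared as raw bytes, in lexicographic order. The sketch below reproduces that ordering with plain JDK code so the edge cases can be checked without a cluster. RowRange and its methods are hypothetical names used only here, and the comparison is an assumption-level model of unsigned byte-wise ordering, not a call into HBase.

```java
import java.nio.charset.StandardCharsets;

// Illustrative model of byte-wise ordering; not part of the HBase API.
class RowRange {
    // Unsigned lexicographic comparison; a shorter array that is a prefix sorts first.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Scan range semantics: start is inclusive, stop is exclusive.
    static boolean inRange(String row, String start, String stop) {
        return compare(bytes(row), bytes(start)) >= 0
            && compare(bytes(row), bytes(stop)) < 0;
    }

    public static void main(String[] args) {
        // The stop row itself is excluded...
        System.out.println(inRange("100002", "100001", "100002"));  // false
        // ...but appending "0" to the stop key pulls it in, and nothing beyond it.
        System.out.println(inRange("100002", "100001", "1000020")); // true
        System.out.println(inRange("100003", "100001", "1000020")); // false
        // Prefix scan: "row" + (char) 0 is too small a stop key for "row1".
        System.out.println(inRange("row1", "row", "row\u0000"));    // false
        System.out.println(inRange("row1", "row", "rox"));          // true
        // String-encoded numbers compare as text, not as integers.
        System.out.println(compare(bytes("28"), bytes("25")) >= 0); // true
        System.out.println(compare(bytes("100"), bytes("25")) >= 0); // false
    }
}
```

The last line is the reason the age filter works on this data set only because every age has two digits; with mixed-width values, a fixed-width or binary encoding would be needed for the comparison to match numeric order.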