HBase Shell: calling Java classes to filter scans with comparators

Contents

  • 1, Columns + ValueFilter / RowFilter: select columns && filter values
  • 2, Columns + SingleColumnValueFilter: select columns && filter values

  • HBase filters (API documentation): http://hbase.apache.org/1.4/apidocs/index.html

1, Columns + ValueFilter / RowFilter: select columns && filter values

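  • RowFilter and ValueFilter take the same arguments (a comparison operator and a comparator); RowFilter tests the row key, ValueFilter tests the cell value
# Value filter: content match (substring / prefix)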
scan 'BASIC_INFORMATION', FILTER=>"ValueFilter(=,'substring:2009')"
scan 'BASIC_INFORMATION', FILTER=>"ValueFilter(=,'binaryprefix:2005-07-04')"

# Value filter: equality and range comparison
scan 'BASIC_INFORMATION', FILTER=>"ValueFilter(  =,'binary:2005-07-04 00:00:00.0')"
scan 'BASIC_INFORMATION', {FILTER=>"ValueFilter(>=,'binary:2005-07-04 00:00:00.0')", LIMIT=>2 }
scan 'BASIC_INFORMATION', {FILTER=>"ValueFilter(>=,'binary:2005-07-04 00:00:00.0')", LIMIT=>2 ,COLUMNS=>['f:Visit_Date']}

hbase(main):015:0> scan 'BASIC_INFORMATION', {FILTER=>"ValueFilter(>=,'binary:2005-07-04 00:00:00.0')", LIMIT=>2 ,COLUMNS=>['f:Visit_Date']}
ROW                                          COLUMN+CELL                                                                                                                     
 r1                   column=f:Visit_Date, timestamp=1567637460098, value=2010-12-04 00:00:00.0                                                       
 r2                   column=f:Visit_Date, timestamp=1567593570811, value=2005-07-04 00:00:00.0                                                       
2 row(s) in 0.0150 seconds
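
To make the three comparator prefixes used above (substring, binaryprefix, binary) easier to tell apart, here is a minimal plain-Ruby sketch that emulates them on a few Visit_Date values taken from the scan outputs in this post. The helper names and the sample hash are hypothetical and are not part of the HBase API; the sketch only mirrors the documented behavior (byte-wise comparison for binary, prefix match for binaryprefix, case-insensitive containment for substring).

```ruby
# Plain-Ruby emulation of the comparator types used in the filter strings above.
# All helper names are made up for illustration; none of this is HBase code.

# binary: compare the whole value byte by byte; returns -1, 0 or 1.
def binary_compare(cell_value, operand)
  cell_value.b <=> operand.b
end

# binaryprefix: true when the value starts with the operand bytes.
def binary_prefix_match?(cell_value, operand)
  cell_value.b.start_with?(operand.b)
end

# substring: true when the operand occurs anywhere in the value (case-insensitive).
def substring_match?(cell_value, operand)
  cell_value.downcase.include?(operand.downcase)
end

# Sample values copied from the scan outputs in this post (assumed data set).
SAMPLE_VISIT_DATES = {
  'r1' => '2010-12-04 00:00:00.0',
  'r2' => '2005-07-04 00:00:00.0',
  'r3' => '2009-09-21 00:00:00.0'
}.freeze

# Rows whose value satisfies the given block, in insertion order.
def matching_rows(values)
  values.select { |_row, value| yield(value) }.keys
end

# ValueFilter(=,'substring:2009')
p matching_rows(SAMPLE_VISIT_DATES) { |v| substring_match?(v, '2009') }
# ValueFilter(=,'binaryprefix:2005-07-04')
p matching_rows(SAMPLE_VISIT_DATES) { |v| binary_prefix_match?(v, '2005-07-04') }
# ValueFilter(>=,'binary:2005-07-04 00:00:00.0')
p matching_rows(SAMPLE_VISIT_DATES) { |v| binary_compare(v, '2005-07-04 00:00:00.0') >= 0 }
```

Note that a real ValueFilter tests every cell returned by the scan, which is why the last shell command above adds COLUMNS=>['f:Visit_Date'] to restrict the comparison to one column.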

2, Columns + SingleColumnValueFilter: select columns && filter values

  • Column value filter + comparison operator: SingleColumnValueFilter + CompareFilter.CompareOp
  • http://hbase.apache.org/1.4/apidocs/org/apache/hadoop/hbase/filter/CompareFilter.CompareOp.html
  • CompareOp values: EQUAL, NOT_EQUAL, GREATER, GREATER_OR_EQUAL, LESS, LESS_OR_EQUAL
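
As a rough guide to what each CompareOp value means, the plain-Ruby sketch below maps every name to the operator symbol used in shell filter strings and to a predicate over the sign of "cell value compared with operand". The table and the keep_cell? helper are hypothetical illustrations, not HBase code, and they assume a byte-wise (binary) comparison.

```ruby
# Hypothetical lookup table: CompareOp name => [filter-string symbol, predicate].
# The predicate receives the result of (cell value <=> operand): -1, 0 or 1.
COMPARE_OPS = {
  'EQUAL'            => ['=',  ->(c) { c == 0 }],
  'NOT_EQUAL'        => ['!=', ->(c) { c != 0 }],
  'GREATER'          => ['>',  ->(c) { c > 0 }],
  'GREATER_OR_EQUAL' => ['>=', ->(c) { c >= 0 }],
  'LESS'             => ['<',  ->(c) { c < 0 }],
  'LESS_OR_EQUAL'    => ['<=', ->(c) { c <= 0 }]
}.freeze

# Decide whether a cell passes, assuming a byte-wise comparison of the two strings.
def keep_cell?(op_name, cell_value, operand)
  _symbol, predicate = COMPARE_OPS.fetch(op_name)
  predicate.call(cell_value.b <=> operand.b)
end

COMPARE_OPS.each do |name, (symbol, _predicate)|
  puts format('%-17s %-2s  456 vs 123 => %s', name, symbol, keep_cell?(name, '456', '123'))
end
```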
####0, Import the Java classes to be used
hbase(main):026:0> import org.apache.hadoop.hbase.filter.CompareFilter
 import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
 import org.apache.hadoop.hbase.filter.SubstringComparator
 import org.apache.hadoop.hbase.util.Bytes
 
####1, Compare a numeric field: print rows whose value is greater than a given number
hbase(main):026:0> scan 't'
ROW                                          COLUMN+CELL                                                                                                                     
 1                                           column=f:age, timestamp=1587466381584, value=25                                                                                 
 1                                           column=f:name, timestamp=1587466381584, value=a                                                                                 
 1                                           column=f:pm, timestamp=1587466381584, value=123                                                                                 
 2                                           column=f:name, timestamp=1587466381584, value=                                                                                  
 2                                           column=f:pm, timestamp=1587466381584, value=456                                                                                 
 3                                           column=f:name, timestamp=1587466381584, value=c                                                                                 
3 row(s) in 0.0300 seconds

hbase(main):060:0> scan 't', { COLUMNS => ['f:pm'], FILTER =>SingleColumnValueFilter.new(Bytes.toBytes('f'),Bytes.toBytes('pm'),CompareFilter::CompareOp.valueOf('GREATER'),Bytes.toBytes("123"))  }
ROW                                          COLUMN+CELL                                                                                                                     
 2                                           column=f:pm, timestamp=1587466381584, value=456                                                                                 
1 row(s) in 0.0210 seconds
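
One caveat about the scan above: when SingleColumnValueFilter is given a byte array such as Bytes.toBytes("123"), the comparison is byte-wise (lexicographic), not numeric. It gives the expected answer here because "123" and "456" have the same number of digits. The plain-Ruby sketch below shows where this breaks and one possible workaround, storing zero-padded strings. The helper names and the width of 5 are assumptions made for illustration only.

```ruby
# Byte-wise "greater than", the same ordering a binary comparison would use.
def greater_bytes?(cell_value, operand)
  (cell_value.b <=> operand.b) > 0
end

# Hypothetical workaround: left-pad numbers with zeros to a fixed width before storing them.
def zero_pad(number, width)
  number.to_s.rjust(width, '0')
end

puts greater_bytes?('456', '123')   # same width: agrees with the numeric order
puts greater_bytes?('9', '123')     # "9" sorts after "1", although 9 < 123
puts greater_bytes?('1000', '123')  # "10" sorts before "12", although 1000 > 123

# With zero padding the byte order and the numeric order agree again.
puts greater_bytes?(zero_pad(9, 5), zero_pad(123, 5))
puts greater_bytes?(zero_pad(1000, 5), zero_pad(123, 5))
```

Zero padding is only one option; the right choice depends on how the values are written to the table in the first place.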


####2, Compare a time field: print rows whose value is later than a given time
hbase(main):059:0> scan 'BASIC_INFORMATION',{COLUMNS=>['f:Visit_Date'], LIMIT=> 3}
ROW                                          COLUMN+CELL                                                                                                                     
 r1                   column=f:Visit_Date, timestamp=1567637460098, value=2010-12-04 00:00:00.0
 r2                   column=f:Visit_Date, timestamp=1567593570811, value=2005-07-04 00:00:00.0
 r3                   column=f:Visit_Date, timestamp=1567607471125, value=2009-09-21 00:00:00.0
3 row(s) in 0.0100 seconds

hbase(main):039:0> scan 'BASIC_INFORMATION', \
{ COLUMNS=>['f:Visit_Date'], LIMIT=> 3, \
FILTER =>SingleColumnValueFilter.new(Bytes.toBytes('f'),Bytes.toBytes('Visit_Date'), CompareFilter::CompareOp.valueOf('GREATER'),Bytes.toBytes("2005-07-04 00:00:00.0"))  }
ROW                                          COLUMN+CELL                                                                                                                     
 r1                   column=f:Visit_Date, timestamp=1567637460098, value=2010-12-04 00:00:00.0                                                       
 r3                   column=f:Visit_Date, timestamp=1567607471125, value=2009-09-21 00:00:00.0                                                       
 r4                   column=f:Visit_Date, timestamp=1567599833678, value=2007-12-20 00:00:00.0                                                       
3 row(s) in 0.0130 seconds
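
The time comparison above works on plain strings because the values are fixed-width and zero-padded ("yyyy-MM-dd HH:mm:ss.S"), so byte order and chronological order coincide. The plain-Ruby sketch below checks this on the four values that appear in this post and shows a non-padded value breaking the assumption. The helper names are hypothetical, and the data set is limited to the rows visible in the outputs.

```ruby
# Values copied from the scan outputs in this post (assumed to be the whole data set).
VISIT_DATE_CELLS = {
  'r1' => '2010-12-04 00:00:00.0',
  'r2' => '2005-07-04 00:00:00.0',
  'r3' => '2009-09-21 00:00:00.0',
  'r4' => '2007-12-20 00:00:00.0'
}.freeze
VISIT_THRESHOLD = '2005-07-04 00:00:00.0'

# Split a timestamp string into integer fields: [year, month, day, hour, min, sec, fraction].
def time_fields(text)
  text.scan(/\d+/).map(&:to_i)
end

# Byte-wise comparison, the ordering a binary comparison would apply.
def later_by_bytes?(value, threshold)
  (value.b <=> threshold.b) > 0
end

# Field-by-field chronological comparison, used here only as a reference.
def later_by_fields?(value, threshold)
  (time_fields(value) <=> time_fields(threshold)) > 0
end

by_bytes  = VISIT_DATE_CELLS.select { |_, v| later_by_bytes?(v, VISIT_THRESHOLD) }.keys
by_fields = VISIT_DATE_CELLS.select { |_, v| later_by_fields?(v, VISIT_THRESHOLD) }.keys
p by_bytes
p by_fields

# A value without zero padding denotes the same instant but sorts differently by bytes.
puts later_by_bytes?('2005-7-4 00:00:00.0', VISIT_THRESHOLD)
puts later_by_fields?('2005-7-4 00:00:00.0', VISIT_THRESHOLD)
```

In the shell output above, the rows that pass the filter are the ones counted towards LIMIT=>3, which is why r2 (equal to the threshold, so not GREATER) is skipped and r4 appears instead.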
