HBase query commands: filtering by condition
To make testing easier, create a table first
hbase(main):001:0> create 'student','c1'
If no namespace is specified, the table is created in the default namespace
List the existing namespaces
hbase(main):001:0> list_namespace
View all the data in a table
hbase(main):002:0> scan 'default:student'
Insert some test data
put 'student','1001','c1:id','1001'
put 'student','1002','c1:id','1002'
put 'student','1003','c1:id','1003'
put 'student','1004','c1:id','1004'
put 'student','1005','c1:id','1005'
Scan only one row
hbase(main):025:0> scan 'student',LIMIT=>1
ROW COLUMN+CELL
1001 column=c1:id, timestamp=1658911986336, value=1001
Count the total number of rows in the table
count 'student'
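Query data by write timestamp (TIMERANGE takes epoch milliseconds; the lower bound is inclusive and the upper bound is exclusive)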
scan 'student', {COLUMN => 'c1', TIMERANGE => [1658827317000,1658913717000]}
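The two TIMERANGE bounds are easier to read once converted from calendar times. The sketch below is plain Python, not HBase code: `to_millis` and `in_timerange` are hypothetical helper names, and it assumes the bounds are meant as UTC times. It reproduces the two bounds used above and checks two cell timestamps taken from the scan output in this post against the half-open range.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_millis(dt):
    """Convert a timezone-aware datetime to epoch milliseconds (integer arithmetic only)."""
    return (dt - EPOCH) // timedelta(milliseconds=1)

def in_timerange(ts, min_ts, max_ts):
    """Model of TIMERANGE => [min_ts, max_ts]: min_ts is included, max_ts is excluded."""
    return min_ts <= ts < max_ts

# Assumption: the bounds in the scan above correspond to these UTC times.
start = to_millis(datetime(2022, 7, 26, 9, 21, 57, tzinfo=timezone.utc))
end = to_millis(datetime(2022, 7, 27, 9, 21, 57, tzinfo=timezone.utc))
print(start, end)                               # 1658827317000 1658913717000

# Timestamp of the c1:id cell of row 1001: inside the range.
print(in_timerange(1658911986336, start, end))  # True
# Timestamp of the c1:sex cell of row 1001: written after the upper bound.
print(in_timerange(1658914149713, start, end))  # False
```

A cell written exactly at the upper bound would not be returned, so widen the upper bound by one millisecond if that cell is wanted.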
Query the cells whose value is 1002
hbase(main):004:0> scan 'student',FILTER=>"ValueFilter(=,'binary:1002')"
ROW COLUMN+CELL
1002 column=c1:id, timestamp=1658911989184, value=1002
1 row(s) in 0.1060 seconds
Query the c1:id column for cells whose value is 1002
hbase(main):006:0> scan 'student',COLUMNS => 'c1:id',FILTER=>"ValueFilter(=,'binary:1002')"
ROW COLUMN+CELL
1002 column=c1:id, timestamp=1658911989184, value=1002
1 row(s) in 0.0340 seconds
Query the cells whose value contains 100, similar to a fuzzy match with LIKE in SQL
hbase(main):007:0> scan 'student',FILTER=>"ValueFilter(=,'substring:100')"
ROW COLUMN+CELL
1001 column=c1:id, timestamp=1658911986336, value=1001
1002 column=c1:id, timestamp=1658911989184, value=1002
1003 column=c1:id, timestamp=1658911989217, value=1003
1004 column=c1:id, timestamp=1658911989243, value=1004
1005 column=c1:id, timestamp=1658911989788, value=1005
5 row(s) in 0.0470 seconds
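The difference between the binary and substring comparators can be modelled in a few lines of Python. This is only an in-memory sketch of the matching rules, not HBase code; `binary_equal` and `substring_equal` are hypothetical names. It assumes the values are UTF-8 strings and models the substring comparison as case-insensitive, which is how the HBase substring comparator is documented.

```python
# Values of c1:id as written by the put commands: rowkey -> stored bytes.
cells = {
    "1001": b"1001",
    "1002": b"1002",
    "1003": b"1003",
    "1004": b"1004",
    "1005": b"1005",
}

def binary_equal(value: bytes, expected: str) -> bool:
    """Model of ValueFilter(=,'binary:...'): the whole value must match byte for byte."""
    return value == expected.encode("utf-8")

def substring_equal(value: bytes, needle: str) -> bool:
    """Model of ValueFilter(=,'substring:...'): case-insensitive containment."""
    text = value.decode("utf-8", errors="replace")
    return needle.lower() in text.lower()

exact = [row for row, value in cells.items() if binary_equal(value, "1002")]
fuzzy = [row for row, value in cells.items() if substring_equal(value, "100")]
print(exact)  # ['1002']
print(fuzzy)  # ['1001', '1002', '1003', '1004', '1005']
```

With the binary comparator, '100' would match nothing here, because no stored value is exactly those three bytes.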
To make the column-based queries below easier to try, add one more column
put 'student','1001','c1:sex','1'
put 'student','1002','c1:sex','2'
put 'student','1003','c1:sex','1'
put 'student','1004','c1:sex','2'
put 'student','1005','c1:sex','1'
hbase(main):015:0* scan 'student'
ROW COLUMN+CELL
1001 column=c1:id, timestamp=1658911986336, value=1001
1001 column=c1:sex, timestamp=1658914149713, value=1
1002 column=c1:id, timestamp=1658911989184, value=1002
1002 column=c1:sex, timestamp=1658914152500, value=2
1003 column=c1:id, timestamp=1658911989217, value=1003
1003 column=c1:sex, timestamp=1658914152535, value=1
1004 column=c1:id, timestamp=1658911989243, value=1004
1004 column=c1:sex, timestamp=1658914152563, value=2
1005 column=c1:id, timestamp=1658911989788, value=1005
1005 column=c1:sex, timestamp=1658914153242, value=1
5 row(s) in 0.0390 seconds
Query the cells whose column qualifier starts with id
hbase(main):019:0> scan 'student',FILTER=>"ColumnPrefixFilter('id')"
ROW COLUMN+CELL
1001 column=c1:id, timestamp=1658911986336, value=1001
1002 column=c1:id, timestamp=1658911989184, value=1002
1003 column=c1:id, timestamp=1658911989217, value=1003
1004 column=c1:id, timestamp=1658911989243, value=1004
1005 column=c1:id, timestamp=1658911989788, value=1005
5 row(s) in 0.0270 seconds
Filter conditions can be combined, as in the example below
Query the cells whose column qualifier starts with id and whose value is 1003
hbase(main):020:0> scan 'student',FILTER=>"ColumnPrefixFilter('id') AND ValueFilter(=,'binary:1003')"
ROW COLUMN+CELL
1003 column=c1:id, timestamp=1658911989217, value=1003
1 row(s) in 0.0550 seconds
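Combining filters can be pictured as composing predicates that are applied to each cell. The sketch below is a simplified in-memory model under that assumption, not the HBase implementation; `Cell`, `column_prefix`, `value_equals`, `both`, `either` and `scan` are hypothetical names. `both` corresponds to AND in the filter string, and `either` to OR, which the filter language also accepts.

```python
from typing import Callable, List, NamedTuple

class Cell(NamedTuple):
    row: str
    qualifier: str
    value: str

# The ten cells of the 'student' table after both rounds of put commands.
CELLS = [
    Cell("1001", "id", "1001"), Cell("1001", "sex", "1"),
    Cell("1002", "id", "1002"), Cell("1002", "sex", "2"),
    Cell("1003", "id", "1003"), Cell("1003", "sex", "1"),
    Cell("1004", "id", "1004"), Cell("1004", "sex", "2"),
    Cell("1005", "id", "1005"), Cell("1005", "sex", "1"),
]

Predicate = Callable[[Cell], bool]

def column_prefix(prefix: str) -> Predicate:
    """Model of ColumnPrefixFilter: keep cells whose qualifier starts with prefix."""
    return lambda cell: cell.qualifier.startswith(prefix)

def value_equals(expected: str) -> Predicate:
    """Model of ValueFilter(=,'binary:...'): keep cells whose value equals expected."""
    return lambda cell: cell.value == expected

def both(a: Predicate, b: Predicate) -> Predicate:
    """Model of 'A AND B': a cell must pass both filters."""
    return lambda cell: a(cell) and b(cell)

def either(a: Predicate, b: Predicate) -> Predicate:
    """Model of 'A OR B': a cell must pass at least one filter."""
    return lambda cell: a(cell) or b(cell)

def scan(cells: List[Cell], keep: Predicate) -> List[Cell]:
    return [cell for cell in cells if keep(cell)]

and_result = scan(CELLS, both(column_prefix("id"), value_equals("1003")))
or_result = scan(CELLS, either(value_equals("2"), value_equals("1003")))
print(and_result)                      # [Cell(row='1003', qualifier='id', value='1003')]
print([c.row for c in or_result])      # ['1002', '1003', '1004']
```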
Query the rows whose rowkey starts with 100
hbase(main):021:0> scan 'student',FILTER=>"PrefixFilter('100')"
ROW COLUMN+CELL
1001 column=c1:id, timestamp=1658911986336, value=1001
1001 column=c1:sex, timestamp=1658914149713, value=1
1002 column=c1:id, timestamp=1658911989184, value=1002
1002 column=c1:sex, timestamp=1658914152500, value=2
1003 column=c1:id, timestamp=1658911989217, value=1003
1003 column=c1:sex, timestamp=1658914152535, value=1
1004 column=c1:id, timestamp=1658911989243, value=1004
1004 column=c1:sex, timestamp=1658914152563, value=2
1005 column=c1:id, timestamp=1658911989788, value=1005
1005 column=c1:sex, timestamp=1658914153242, value=1
Query the rows whose rowkey starts with 100 without returning the cell values
hbase(main):022:0> scan 'student',FILTER=>"PrefixFilter('100') AND KeyOnlyFilter()"
ROW COLUMN+CELL
1001 column=c1:id, timestamp=1658911986336, value=
1001 column=c1:sex, timestamp=1658914149713, value=
1002 column=c1:id, timestamp=1658911989184, value=
1002 column=c1:sex, timestamp=1658914152500, value=
1003 column=c1:id, timestamp=1658911989217, value=
1003 column=c1:sex, timestamp=1658914152535, value=
1004 column=c1:id, timestamp=1658911989243, value=
1004 column=c1:sex, timestamp=1658914152563, value=
1005 column=c1:id, timestamp=1658911989788, value=
1005 column=c1:sex, timestamp=1658914153242, value=
5 row(s) in 0.0670 seconds
Scan three rows starting from a specific row
hbase(main):006:0> scan 'student',{STARTROW=>'1002',LIMIT=>3}
ROW COLUMN+CELL
1002 column=c1:id, timestamp=1658911989184, value=1002
1002 column=c1:sex, timestamp=1658914152500, value=2
1003 column=c1:id, timestamp=1658911989217, value=1003
1003 column=c1:sex, timestamp=1658914152535, value=1
1004 column=c1:id, timestamp=1658911989243, value=1004
1004 column=c1:sex, timestamp=1658914152563, value=2
3 row(s) in 0.0300 seconds
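Two details in the output above are worth spelling out: LIMIT counts rows rather than cells (three rows produce six output lines), and rowkeys are ordered as byte strings, not as numbers. The following in-memory Python model illustrates both; `TABLE` and `scan` are hypothetical names, and the model assumes UTF-8 rowkeys.

```python
# Rowkey -> {column: value}, mirroring the 'student' table in this post.
TABLE = {
    "1001": {"c1:id": "1001", "c1:sex": "1"},
    "1002": {"c1:id": "1002", "c1:sex": "2"},
    "1003": {"c1:id": "1003", "c1:sex": "1"},
    "1004": {"c1:id": "1004", "c1:sex": "2"},
    "1005": {"c1:id": "1005", "c1:sex": "1"},
}

def scan(table, startrow="", limit=None):
    """Model of scan with STARTROW and LIMIT: returns (row, column, value) tuples."""
    start = startrow.encode("utf-8")
    # Rows are visited in byte order of the rowkey; STARTROW is inclusive.
    rows = sorted(
        (key for key in table if key.encode("utf-8") >= start),
        key=lambda key: key.encode("utf-8"),
    )
    if limit is not None:
        rows = rows[:limit]  # the limit applies to rows, not to cells
    return [(row, col, val) for row in rows for col, val in sorted(table[row].items())]

result = scan(TABLE, startrow="1002", limit=3)
rows_seen = sorted({row for row, _, _ in result})
print(len(result))   # 6
print(rows_seen)     # ['1002', '1003', '1004']

# Byte ordering differs from numeric ordering: '2' sorts after '1001'.
byte_order = sorted(["2", "10", "1001"], key=lambda key: key.encode("utf-8"))
print(byte_order)    # ['10', '1001', '2']
```

Because of the byte ordering, numeric rowkeys are often zero-padded to a fixed width so that scan order matches numeric order.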
Get a specific row
hbase(main):007:0> get 'student','1001'
COLUMN CELL
c1:id timestamp=1658911986336, value=1001
c1:sex timestamp=1658914149713, value=1
2 row(s) in 0.0170 seconds
Scans return rows in ascending rowkey order by default; for descending order, use REVERSED => TRUE
scan 'student',{REVERSED => TRUE,LIMIT=>1}
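With the sample data, the reversed scan above should return row 1005, the largest rowkey. The sketch below models the direction switch in plain Python; `scan_rowkeys` is a hypothetical name, not an HBase API. It also assumes that in a reversed scan the start row is the largest key to visit and the scan then moves toward smaller keys.

```python
ROWKEYS = ["1001", "1002", "1003", "1004", "1005"]

def scan_rowkeys(rowkeys, reversed_scan=False, startrow=None, limit=None):
    """Model of scan order: ascending by default, descending when reversed_scan is set."""
    ordered = sorted(rowkeys, key=lambda k: k.encode("utf-8"), reverse=reversed_scan)
    if startrow is not None:
        start = startrow.encode("utf-8")
        if reversed_scan:
            # Reversed: begin at startrow and continue toward smaller keys.
            ordered = [k for k in ordered if k.encode("utf-8") <= start]
        else:
            # Forward: begin at startrow and continue toward larger keys.
            ordered = [k for k in ordered if k.encode("utf-8") >= start]
    return ordered if limit is None else ordered[:limit]

last_row = scan_rowkeys(ROWKEYS, reversed_scan=True, limit=1)
from_1003_down = scan_rowkeys(ROWKEYS, reversed_scan=True, startrow="1003", limit=2)
print(last_row)        # ['1005']
print(from_1003_down)  # ['1003', '1002']
```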
The commands above cover most everyday query needs