HBase Query Commands: Filtering by Condition

To make testing easier, first create a table:

hbase(main):001:0> create 'student','c1'

If you do not specify a namespace, the table is created in the default namespace.

List the existing namespaces

hbase(main):001:0> list_namespace

View all data in the table

hbase(main):002:0> scan 'default:student'

Insert some test data

put 'student','1001','c1:id','1001'
put 'student','1002','c1:id','1002'
put 'student','1003','c1:id','1003'
put 'student','1004','c1:id','1004'
put 'student','1005','c1:id','1005'

Return only one row

hbase(main):025:0> scan 'student',LIMIT=>1
ROW                        COLUMN+CELL
 1001                      column=c1:id, timestamp=1658911986336, value=1001

Count the total number of rows in the table

count 'student'

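Query data by write timestamp (TIMERANGE takes epoch milliseconds; the start is inclusive and the end is exclusive)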

scan 'student', {COLUMN => 'c1', TIMERANGE => [1658827317000,1658913717000]}
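
The two TIMERANGE bounds are epoch timestamps in milliseconds, and the range is half-open: a cell matches when start <= timestamp < end. The Python sketch below shows one way such bounds could be computed from UTC datetimes; the helper names to_epoch_millis, in_time_range, and build_scan are made up for illustration and are not part of HBase. The two datetimes were chosen because they reproduce the bounds used in the command above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_millis(dt):
    """Convert a timezone-aware datetime to integer epoch milliseconds."""
    return (dt - EPOCH) // timedelta(milliseconds=1)

def in_time_range(ts, start, end):
    """Model the half-open [start, end) check on a cell timestamp."""
    return start <= ts < end

def build_scan(table, family, start, end):
    """Format an HBase shell scan command (hypothetical helper)."""
    return f"scan '{table}', {{COLUMN => '{family}', TIMERANGE => [{start}, {end}]}}"

start = to_epoch_millis(datetime(2022, 7, 26, 9, 21, 57, tzinfo=timezone.utc))
end = to_epoch_millis(datetime(2022, 7, 27, 9, 21, 57, tzinfo=timezone.utc))
print(build_scan('student', 'c1', start, end))
```

Using timezone-aware datetimes avoids depending on the local time zone of the machine that builds the command.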

Query cells whose value is 1002

hbase(main):004:0> scan 'student',FILTER=>"ValueFilter(=,'binary:1002')"
ROW                        COLUMN+CELL
 1002                      column=c1:id, timestamp=1658911989184, value=1002
1 row(s) in 0.1060 seconds

Query cells in column c1:id whose value is 1002

hbase(main):006:0> scan 'student',COLUMNS => 'c1:id',FILTER=>"ValueFilter(=,'binary:1002')"
ROW                        COLUMN+CELL
 1002                      column=c1:id, timestamp=1658911989184, value=1002
1 row(s) in 0.0340 seconds

Query cells whose value contains 100, similar to a fuzzy match with LIKE in SQL

hbase(main):007:0> scan 'student',FILTER=>"ValueFilter(=,'substring:100')"
ROW                        COLUMN+CELL
 1001                      column=c1:id, timestamp=1658911986336, value=1001
 1002                      column=c1:id, timestamp=1658911989184, value=1002
 1003                      column=c1:id, timestamp=1658911989217, value=1003
 1004                      column=c1:id, timestamp=1658911989243, value=1004
 1005                      column=c1:id, timestamp=1658911989788, value=1005
5 row(s) in 0.0470 seconds
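
The binary: and substring: prefixes select different comparators: binary compares the whole value byte for byte, while substring checks whether the value contains the given text (HBase does this case-insensitively). The Python sketch below only models that behavior on in-memory copies of the test cells; binary_equals, substring_matches, and value_filter are hypothetical names, not HBase APIs.

```python
# In-memory copies of the cells written by the put commands: (rowkey, column, value)
CELLS = [(str(k), 'c1:id', str(k)) for k in range(1001, 1006)]

def binary_equals(value, operand):
    """Exact match on the full value, compared as bytes."""
    return value.encode() == operand.encode()

def substring_matches(value, operand):
    """Case-insensitive containment check."""
    return operand.lower() in value.lower()

def value_filter(cells, comparator, operand):
    """Keep only the cells whose value satisfies the comparator."""
    return [cell for cell in cells if comparator(cell[2], operand)]

print(value_filter(CELLS, binary_equals, '1002'))
print(len(value_filter(CELLS, substring_matches, '100')))
```

Note that an exact match on '100' would return nothing here, because no cell holds exactly that value.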

To make the remaining column queries easier to demonstrate, add one more column

put 'student','1001','c1:sex','1'
put 'student','1002','c1:sex','2'
put 'student','1003','c1:sex','1'
put 'student','1004','c1:sex','2'
put 'student','1005','c1:sex','1'

hbase(main):015:0* scan 'student'
ROW                        COLUMN+CELL
 1001                      column=c1:id, timestamp=1658911986336, value=1001
 1001                      column=c1:sex, timestamp=1658914149713, value=1
 1002                      column=c1:id, timestamp=1658911989184, value=1002
 1002                      column=c1:sex, timestamp=1658914152500, value=2
 1003                      column=c1:id, timestamp=1658911989217, value=1003
 1003                      column=c1:sex, timestamp=1658914152535, value=1
 1004                      column=c1:id, timestamp=1658911989243, value=1004
 1004                      column=c1:sex, timestamp=1658914152563, value=2
 1005                      column=c1:id, timestamp=1658911989788, value=1005
 1005                      column=c1:sex, timestamp=1658914153242, value=1
5 row(s) in 0.0390 seconds

Query columns whose qualifier starts with id

hbase(main):019:0> scan 'student',FILTER=>"ColumnPrefixFilter('id')"
ROW                        COLUMN+CELL
 1001                      column=c1:id, timestamp=1658911986336, value=1001
 1002                      column=c1:id, timestamp=1658911989184, value=1002
 1003                      column=c1:id, timestamp=1658911989217, value=1003
 1004                      column=c1:id, timestamp=1658911989243, value=1004
 1005                      column=c1:id, timestamp=1658911989788, value=1005
5 row(s) in 0.0270 seconds

Query conditions can be combined, as in the following example.

Query cells whose column qualifier starts with id and whose value is 1003

hbase(main):020:0> scan 'student',FILTER=>"ColumnPrefixFilter('id') AND ValueFilter(=,'binary:1003')"
ROW                        COLUMN+CELL
 1003                      column=c1:id, timestamp=1658911989217, value=1003
1 row(s) in 0.0550 seconds
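
For cell-level filters such as these two, AND means a cell is returned only if it passes every filter. A rough Python model of the combined query over the ten test cells follows; column_prefix, value_equals, and scan are illustrative stand-ins, not HBase classes.

```python
# (rowkey, family, qualifier, value) for the ten cells in the table
SEX = {'1001': '1', '1002': '2', '1003': '1', '1004': '2', '1005': '1'}
CELLS = []
for row in sorted(SEX):
    CELLS.append((row, 'c1', 'id', row))
    CELLS.append((row, 'c1', 'sex', SEX[row]))

def column_prefix(prefix):
    """Accept cells whose column qualifier starts with the prefix."""
    return lambda cell: cell[2].startswith(prefix)

def value_equals(operand):
    """Accept cells whose value equals the operand exactly."""
    return lambda cell: cell[3] == operand

def scan(cells, *filters):
    """Keep the cells accepted by all filters (AND semantics)."""
    return [c for c in cells if all(f(c) for f in filters)]

print(scan(CELLS, column_prefix('id'), value_equals('1003')))
```

In this model, value_equals('1') on its own matches the three sex cells with value 1, which shows why a value condition is often paired with a column condition.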

Query rows whose rowkey starts with 100

hbase(main):021:0> scan 'student',FILTER=>"PrefixFilter('100')"
ROW                        COLUMN+CELL
 1001                      column=c1:id, timestamp=1658911986336, value=1001
 1001                      column=c1:sex, timestamp=1658914149713, value=1
 1002                      column=c1:id, timestamp=1658911989184, value=1002
 1002                      column=c1:sex, timestamp=1658914152500, value=2
 1003                      column=c1:id, timestamp=1658911989217, value=1003
 1003                      column=c1:sex, timestamp=1658914152535, value=1
 1004                      column=c1:id, timestamp=1658911989243, value=1004
 1004                      column=c1:sex, timestamp=1658914152563, value=2
 1005                      column=c1:id, timestamp=1658911989788, value=1005
 1005                      column=c1:sex, timestamp=1658914153242, value=1

Query rows whose rowkey starts with 100, returning only the keys without the cell values

hbase(main):022:0> scan 'student',FILTER=>"PrefixFilter('100') AND KeyOnlyFilter()"
ROW                        COLUMN+CELL
 1001                      column=c1:id, timestamp=1658911986336, value=
 1001                      column=c1:sex, timestamp=1658914149713, value=
 1002                      column=c1:id, timestamp=1658911989184, value=
 1002                      column=c1:sex, timestamp=1658914152500, value=
 1003                      column=c1:id, timestamp=1658911989217, value=
 1003                      column=c1:sex, timestamp=1658914152535, value=
 1004                      column=c1:id, timestamp=1658911989243, value=
 1004                      column=c1:sex, timestamp=1658914152563, value=
 1005                      column=c1:id, timestamp=1658911989788, value=
 1005                      column=c1:sex, timestamp=1658914153242, value=
5 row(s) in 0.0670 seconds

Scan three rows starting from a specific row

hbase(main):006:0> scan 'student',{STARTROW=>'1002',LIMIT=>3}
ROW                        COLUMN+CELL
 1002                      column=c1:id, timestamp=1658911989184, value=1002
 1002                      column=c1:sex, timestamp=1658914152500, value=2
 1003                      column=c1:id, timestamp=1658911989217, value=1003
 1003                      column=c1:sex, timestamp=1658914152535, value=1
 1004                      column=c1:id, timestamp=1658911989243, value=1004
 1004                      column=c1:sex, timestamp=1658914152563, value=2
3 row(s) in 0.0300 seconds
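
Row keys are stored in byte-wise lexicographic order, STARTROW is inclusive, and LIMIT counts rows rather than cells, which is why the output above lists six cells but reports 3 rows. A small Python sketch of that behavior, using plain strings as stand-ins for row keys (scan_rows is a hypothetical helper, not an HBase API):

```python
from bisect import bisect_left

def scan_rows(rowkeys, startrow='', limit=None):
    """Return row keys >= startrow in sorted order, at most `limit` of them."""
    keys = sorted(rowkeys)               # ASCII strings sort like their bytes
    begin = bisect_left(keys, startrow)  # STARTROW is inclusive
    selected = keys[begin:]
    return selected if limit is None else selected[:limit]

rows = ['1001', '1002', '1003', '1004', '1005']
print(scan_rows(rows, startrow='1002', limit=3))

# Lexicographic order is not numeric order: '10' sorts before '9'.
print(scan_rows(['9', '10', '100']))
```

Because of this ordering, numeric row keys are commonly zero-padded to a fixed width so that the scan order matches the numeric order.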

Get a specific row

hbase(main):007:0> get 'student','1001'
COLUMN                     CELL
 c1:id                     timestamp=1658911986336, value=1001
 c1:sex                    timestamp=1658914149713, value=1
2 row(s) in 0.0170 seconds

By default a scan returns rows in ascending rowkey order; for descending order, use REVERSED => true

scan 'student',{REVERSED => true,LIMIT=>1}

The commands above cover most everyday query needs.
