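A namespace is a logical grouping of tables, analogous to a database in a relational database system. This abstraction lays the foundation for upcoming multi-tenancy features:
Quota management (HBASE-8410) - limits the amount of resources (that is, regions and tables) a namespace can consume.
Namespace security administration (HBASE-9206) - gives tenants another level of security administration.
Region server groups (HBASE-6721) - a namespace or table can be pinned to a subset of RegionServers, which guarantees a coarse level of isolation.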
Create a namespace
hbase(main):039:0> create_namespace 'my_ns'
Drop a namespace
hbase(main):039:0> drop_namespace 'my_ns'
Alter a namespace
hbase(main):049:0> alter_namespace 'my_ns', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
Describe a namespace
hbase(main):049:0> describe_namespace 'my_ns'
List all namespaces
hbase(main):049:0> list_namespace
Create a table in a namespace
hbase(main):049:0> create 'my_ns:table1','info'
Drop a table in a namespace (the table must be disabled first)
hbase(main):052:0> disable 'my_ns:table1'
0 row(s) in 2.2520 seconds
hbase(main):053:0> drop 'my_ns:table1'
0 row(s) in 1.2360 seconds
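List the tables in a namespace
hbase(main):053:0> list_namespace_tables 'my_ns'
There are two predefined special namespaces:
hbase - the system namespace, which holds HBase internal tables
default - tables created without an explicit namespace automatically fall into this namespace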
hbase(main):055:0> list_namespace
NAMESPACE
default
hbase
my_ns
3 row(s) in 0.0130 seconds
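The naming rule above (a `namespace:table` name, with `default` used when no namespace is given) can be sketched in plain Ruby. This is only an illustrative model: `split_table_name` is a hypothetical helper, not an HBase shell command or API.

```ruby
# Hypothetical helper (not an HBase API): split a shell table name into
# [namespace, table], assuming the 'namespace:table' convention.
def split_table_name(name)
  if name.include?(':')
    name.split(':', 2)
  else
    # No explicit namespace: the table belongs to 'default'.
    ['default', name]
  end
end

p split_table_name('my_ns:table1')
p split_table_name('my_test')
```

Under this model, `my_test` created in the following transcript lives in the `default` namespace.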
hbase(main):058:0> create 'my_test','f1','f2'
0 row(s) in 1.2390 seconds
=> Hbase::Table - my_test
hbase(main):060:0> put 'my_test','u1_td1','f1:a1','abc1'
0 row(s) in 0.0620 seconds
hbase(main):061:0> put 'my_test','u1_td2','f1:a1','abc199'
0 row(s) in 0.0050 seconds
hbase(main):062:0> put 'my_test','u1_td3','f1:b1','abc123'
0 row(s) in 0.0080 seconds
hbase(main):063:0> put 'my_test','u2_td4','f1:a1','abc2'
0 row(s) in 0.0040 seconds
hbase(main):064:0> put 'my_test','u2_td5','f1:a2','abc299'
0 row(s) in 0.0050 seconds
hbase(main):065:0> put 'my_test','u2_td6','f1:s2','abc222'
0 row(s) in 0.0050 seconds
Save HBase query results to a file
echo "scan 'tablename', {LIMIT=>1}" | hbase shell > hbaseout1.txt
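Specify which column families or columns to query. The example queries columns a1 and b1 in column family f1; writing only f1 queries every column in that family.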
hbase(main):019:0> scan 'my_test',{COLUMNS => ['f1:a1','f1:b1']}
ROW COLUMN+CELL
u1_td1 column=f1:a1, timestamp=1548136934397, value=ab1
u1_td2 column=f1:a1, timestamp=1548128525390, value=abc199
u1_td3 column=f1:b1, timestamp=1548128540271, value=abc123
u2_err column=f1:b1, timestamp=1548139413694, value=abc888
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
5 row(s) in 0.0120 seconds
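Query data within a given time range. The interval is half-open: the start timestamp is included and the end timestamp is excluded.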
hbase(main):021:0> scan 'my_test',{TIMERANGE=>[1548128540271,1548128614427]}
ROW COLUMN+CELL
u1_td3 column=f1:b1, timestamp=1548128540271, value=abc123
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
u2_td5 column=f1:a2, timestamp=1548128587695, value=abc299
3 row(s) in 0.0070 seconds
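The half-open interval can be checked against the timestamps shown in this document with a small plain-Ruby model. It does not call HBase; `in_time_range?` and `rows_in_time_range` are hypothetical names used only for illustration.

```ruby
# Timestamps copied from the transcripts in this document.
TIME_RANGE_CELLS = [
  { row: 'u1_td2', ts: 1548128525390 },
  { row: 'u1_td3', ts: 1548128540271 },
  { row: 'u2_td4', ts: 1548128558845 },
  { row: 'u2_td5', ts: 1548128587695 },
  { row: 'u2_td6', ts: 1548128614427 }
].freeze

# Half-open check: min_ts is included, max_ts is excluded.
def in_time_range?(ts, min_ts, max_ts)
  ts >= min_ts && ts < max_ts
end

def rows_in_time_range(cells, min_ts, max_ts)
  cells.select { |c| in_time_range?(c[:ts], min_ts, max_ts) }.map { |c| c[:row] }
end

p rows_in_time_range(TIME_RANGE_CELLS, 1548128540271, 1548128614427)
```

The row u1_td3 sits exactly on the lower bound and is kept, while u2_td6 sits exactly on the upper bound and is dropped, which agrees with the scan output above.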
Look up data by rowkey range. STARTROW is inclusive and STOPROW is exclusive.
hbase(main):095:0> scan 'my_test',{STARTROW=>'u1_td2',STOPROW=>'u2_td5'}
ROW COLUMN+CELL
u1_td2 column=f1:a1, timestamp=1548128525390, value=abc199
u1_td3 column=f1:b1, timestamp=1548128540271, value=abc123
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
3 row(s) in 0.0050 seconds
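Rowkeys are compared byte by byte, so a row range behaves like a lexicographic slice of the sorted keys. The sketch below models that in plain Ruby with the six keys inserted earlier; `rows_in_range` is a hypothetical helper, not part of HBase.

```ruby
# The six rowkeys inserted by the put commands earlier in this document.
ROW_RANGE_KEYS = %w[u1_td1 u1_td2 u1_td3 u2_td4 u2_td5 u2_td6].freeze

# Keep keys with start_row <= key < stop_row, in sorted order.
def rows_in_range(keys, start_row, stop_row)
  keys.sort.select { |k| k >= start_row && k < stop_row }
end

p rows_in_range(ROW_RANGE_KEYS, 'u1_td2', 'u2_td5')
```

The start key u1_td2 is returned and the stop key u2_td5 is not, matching the three rows in the scan output.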
Return the results in reverse order.
hbase(main):022:0> scan 'my_test',{TIMERANGE=>[1548128540271,1548128614427],REVERSED => true}
ROW COLUMN+CELL
u2_td5 column=f1:a2, timestamp=1548128587695, value=abc299
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
u1_td3 column=f1:b1, timestamp=1548128540271, value=abc123
3 row(s) in 0.0120 seconds
View metrics about scan execution. Setting ALL_METRICS to true returns all metrics, while METRICS returns only the metrics you name.
hbase(main):023:0> scan 'my_test',{ALL_METRICS => true}
ROW COLUMN+CELL
u1_td1 column=f1:a1, timestamp=1548136934397, value=ab1
u1_td2 column=f1:a1, timestamp=1548128525390, value=abc199
u1_td3 column=f1:b1, timestamp=1548128540271, value=abc123
u2_err column=f1:b1, timestamp=1548139413694, value=abc888
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
u2_td5 column=f1:a2, timestamp=1548128587695, value=abc299
u2_td6 column=f1:s2, timestamp=1548128614427, value=abc222
7 row(s) in 0.0390 seconds
METRIC VALUE
BYTES_IN_REMOTE_RESULTS 275
BYTES_IN_RESULTS 275
MILLIS_BETWEEN_NEXTS 11
NOT_SERVING_REGION_EXCEPTION 0
REGIONS_SCANNED 1
REMOTE_RPC_CALLS 3
REMOTE_RPC_RETRIES 0
ROWS_FILTERED 0
ROWS_SCANNED 7
RPC_CALLS 3
RPC_RETRIES 0
hbase(main):024:0> scan 'my_test',{METRICS => ['ROWS_SCANNED','RPC_CALLS']}
ROW COLUMN+CELL
u1_td1 column=f1:a1, timestamp=1548136934397, value=ab1
u1_td2 column=f1:a1, timestamp=1548128525390, value=abc199
u1_td3 column=f1:b1, timestamp=1548128540271, value=abc123
u2_err column=f1:b1, timestamp=1548139413694, value=abc888
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
u2_td5 column=f1:a2, timestamp=1548128587695, value=abc299
u2_td6 column=f1:s2, timestamp=1548128614427, value=abc222
7 row(s) in 0.0100 seconds
METRIC VALUE
ROWS_SCANNED 7
RPC_CALLS 3
Query rows whose rowkey starts with a given prefix.
hbase(main):040:0> scan 'my_test',{ROWPREFIXFILTER => 'u2'}
ROW COLUMN+CELL
u2_err column=f1:b1, timestamp=1548139413694, value=abc888
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
u2_td5 column=f1:a2, timestamp=1548128587695, value=abc299
u2_td6 column=f1:s2, timestamp=1548128614427, value=abc222
4 row(s) in 0.0060 seconds
hbase(main):029:0> scan 'my_test',{FILTER => "PrefixFilter('u2')"}
ROW COLUMN+CELL
u2_err column=f1:b1, timestamp=1548139413694, value=abc888
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
u2_td5 column=f1:a2, timestamp=1548128587695, value=abc299
u2_td6 column=f1:s2, timestamp=1548128614427, value=abc222
4 row(s) in 0.0070 seconds
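One way to think about a prefix scan is as a row range that starts at the prefix and stops at the smallest key that no longer shares it. The plain-Ruby sketch below illustrates that idea only; `prefix_stop_key` and `rows_with_prefix` are hypothetical helpers, and nothing here claims to be how HBase implements either option.

```ruby
PREFIX_KEYS = %w[u1_td1 u1_td2 u1_td3 u2_err u2_td4 u2_td5 u2_td6].freeze

# Smallest key that sorts after every key starting with `prefix`:
# drop trailing 0xFF bytes, then increment the last remaining byte.
# Returns nil when no upper bound exists (empty or all-0xFF prefix).
def prefix_stop_key(prefix)
  bytes = prefix.bytes
  bytes.pop while bytes.last == 0xFF
  return nil if bytes.empty?
  bytes[-1] += 1
  bytes.pack('C*').force_encoding(prefix.encoding)
end

# Select keys in the range [prefix, stop_key).
def rows_with_prefix(keys, prefix)
  stop = prefix_stop_key(prefix)
  keys.select { |k| k >= prefix && (stop.nil? || k < stop) }
end

p prefix_stop_key('u2')
p rows_with_prefix(PREFIX_KEYS, 'u2')
```

For the prefix u2 the computed stop key is u3, and the range selects the same four rows that both scans above return.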
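Look up by column qualifier; you can specify one exact column or a range of columns. The binary comparator compares against the exact value given, while the substring comparator matches any qualifier that contains the given string.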
hbase(main):081:0> scan 'my_test',{FILTER => "(QualifierFilter (<,'binary:b1')) AND (QualifierFilter (=,'substring:1'))"}
ROW COLUMN+CELL
u1_td1 column=f1:a1, timestamp=1548136934397, value=ab1
u1_td2 column=f1:a1, timestamp=1548128525390, value=abc199
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
3 row(s) in 0.0060 seconds
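The difference between the two comparators can be modeled in plain Ruby. This is a sketch under assumptions: `binary_less_than?` and `substring_equal?` are hypothetical names, and the case-insensitive substring match reflects my understanding of the comparator rather than anything stated in this document.

```ruby
QUALIFIER_CELLS = [
  { row: 'u1_td1', qualifier: 'a1' },
  { row: 'u1_td2', qualifier: 'a1' },
  { row: 'u1_td3', qualifier: 'b1' },
  { row: 'u2_td4', qualifier: 'a1' },
  { row: 'u2_td5', qualifier: 'a2' },
  { row: 'u2_td6', qualifier: 's2' }
].freeze

# binary: byte-wise comparison against the whole operand.
def binary_less_than?(qualifier, operand)
  (qualifier.b <=> operand.b) < 0
end

# substring: true when the operand occurs anywhere in the qualifier
# (modeled here as case-insensitive, which is an assumption).
def substring_equal?(qualifier, operand)
  qualifier.downcase.include?(operand.downcase)
end

matching = QUALIFIER_CELLS.select do |c|
  binary_less_than?(c[:qualifier], 'b1') && substring_equal?(c[:qualifier], '1')
end
p matching.map { |c| c[:row] }
```

Only the a1 cells satisfy both conditions: b1 is not strictly less than b1, and a2 does not contain 1.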
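Look up data by column qualifier prefix. The example also combines the prefix filter with value filters using AND and OR.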
hbase(main):073:0> scan 'my_test',{FILTER=>"ColumnPrefixFilter('a') AND (ValueFilter(=,'substring:9') OR ValueFilter(=,'substring:2'))"}
ROW COLUMN+CELL
u1_td2 column=f1:a1, timestamp=1548128525390, value=abc199
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
u2_td5 column=f1:a2, timestamp=1548128587695, value=abc299
3 row(s) in 0.0060 seconds
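The AND/OR combination in the filter string above reads like an ordinary boolean expression. The plain-Ruby sketch below mirrors it over the six cells as originally inserted; `combined_match?` is a hypothetical name and the code is a model, not an HBase call.

```ruby
# Cells as written by the put commands earlier in this document.
COMBINED_CELLS = [
  { row: 'u1_td1', qualifier: 'a1', value: 'abc1' },
  { row: 'u1_td2', qualifier: 'a1', value: 'abc199' },
  { row: 'u1_td3', qualifier: 'b1', value: 'abc123' },
  { row: 'u2_td4', qualifier: 'a1', value: 'abc2' },
  { row: 'u2_td5', qualifier: 'a2', value: 'abc299' },
  { row: 'u2_td6', qualifier: 's2', value: 'abc222' }
].freeze

# Qualifier starts with 'a' AND (value contains '9' OR value contains '2').
def combined_match?(cell)
  cell[:qualifier].start_with?('a') &&
    (cell[:value].include?('9') || cell[:value].include?('2'))
end

p COMBINED_CELLS.select { |c| combined_match?(c) }.map { |c| c[:row] }
```

The cell abc123 contains a 2 but is rejected because its qualifier b1 fails the prefix test, which is why u1_td3 is absent from the scan output.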
Look up by value; you can specify an exact value or a range of values.
hbase(main):066:0> scan 'my_test',{FILTER=>"ValueFilter(=,'binary:abc1')"}
ROW COLUMN+CELL
u1_td1 column=f1:a1, timestamp=1548128506440, value=abc1
1 row(s) in 0.0420 seconds
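Look up by exact timestamps. TimestampsFilter takes a list of timestamps and returns only the cells whose timestamp equals one of them; it does not define a range (use TIMERANGE for that), which is why only two rows come back here.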
hbase(main):071:0> scan 'my_test',{FILTER => "TimestampsFilter(1548128525390,1548128614427)"}
ROW COLUMN+CELL
u1_td2 column=f1:a1, timestamp=1548128525390, value=abc199
u2_td6 column=f1:s2, timestamp=1548128614427, value=abc222
2 row(s) in 0.0070 seconds
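A short plain-Ruby model makes the set-membership behavior concrete. `rows_with_timestamps` is a hypothetical helper and the data is copied from the transcripts in this document.

```ruby
require 'set'

TIMESTAMP_CELLS = [
  { row: 'u1_td2', ts: 1548128525390 },
  { row: 'u1_td3', ts: 1548128540271 },
  { row: 'u2_td4', ts: 1548128558845 },
  { row: 'u2_td5', ts: 1548128587695 },
  { row: 'u2_td6', ts: 1548128614427 }
].freeze

# Keep only cells whose timestamp is one of the listed values.
def rows_with_timestamps(cells, timestamps)
  wanted = timestamps.to_set
  cells.select { |c| wanted.include?(c[:ts]) }.map { |c| c[:row] }
end

p rows_with_timestamps(TIMESTAMP_CELLS, [1548128525390, 1548128614427])
```

The three cells whose timestamps fall between the two listed values are not returned, because nothing in the list equals them.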
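RAW instructs the scanner to return all cells, including delete markers and deleted cells that have not yet been collected. This option cannot be combined with a request for specific columns. It is disabled by default.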
hbase(main):023:0> scan 'my_test'
ROW COLUMN+CELL
u1_td2 column=f1:a1, timestamp=1548128525390, value=abc199
u1_td3 column=f1:b1, timestamp=1548128540271, value=abc123
u2_err column=f1:b1, timestamp=1548139413694, value=abc888
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
u2_td5 column=f1:a2, timestamp=1548128587695, value=abc299
u2_td6 column=f1:s2, timestamp=1548128614427, value=abc222
6 row(s) in 0.0090 seconds
hbase(main):024:0> scan 'my_test',{RAW => true,VERSIONS => 2}
ROW COLUMN+CELL
u1_td1 column=f1:a1, timestamp=1548226315249, type=DeleteColumn
u1_td1 column=f1:a1, timestamp=1548136934397, value=ab1
u1_td1 column=f1:a1, timestamp=1548128506440, value=abc1
u1_td2 column=f1:a1, timestamp=1548128525390, value=abc199
u1_td3 column=f1:b1, timestamp=1548128540271, value=abc123
u2_err column=f1:b1, timestamp=1548139413694, value=abc888
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
u2_td5 column=f1:a2, timestamp=1548128587695, value=abc299
u2_td6 column=f1:s2, timestamp=1548128614427, value=abc222
7 row(s) in 0.0120 seconds
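The difference between the normal scan and the RAW scan can be modeled in plain Ruby. This is a deliberately simplified sketch: it handles only DeleteColumn markers (which mask every version of that column at or below the marker's timestamp), ignores the VERSIONS limit, and uses the hypothetical name `visible_cells`.

```ruby
RAW_CELLS = [
  { row: 'u1_td1', column: 'f1:a1', ts: 1548226315249, type: :delete_column },
  { row: 'u1_td1', column: 'f1:a1', ts: 1548136934397, type: :put, value: 'ab1' },
  { row: 'u1_td1', column: 'f1:a1', ts: 1548128506440, type: :put, value: 'abc1' },
  { row: 'u1_td2', column: 'f1:a1', ts: 1548128525390, type: :put, value: 'abc199' }
].freeze

# raw: true returns everything, markers included.
# Otherwise drop markers and any put masked by a DeleteColumn marker
# on the same row and column with an equal or newer timestamp.
def visible_cells(cells, raw: false)
  return cells if raw
  cells.select do |cell|
    next false unless cell[:type] == :put
    cells.none? do |marker|
      marker[:type] == :delete_column &&
        marker[:row] == cell[:row] &&
        marker[:column] == cell[:column] &&
        marker[:ts] >= cell[:ts]
    end
  end
end

p visible_cells(RAW_CELLS).map { |c| c[:row] }
p visible_cells(RAW_CELLS, raw: true).length
```

In this model the normal view hides row u1_td1 entirely, as in the first scan above, while the raw view still exposes the marker and both older versions.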
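FirstKeyOnlyFilter: a rowkey can have multiple versions, and the same column of the same rowkey can hold multiple values; this filter returns only the first version of the first column in each row.
KeyOnlyFilter: returns only the key and omits the value.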
hbase(main):081:0> scan 'my_test',FILTER => "FirstKeyOnlyFilter() AND ValueFilter(=,'binary:abc199') AND KeyOnlyFilter()"
ROW COLUMN+CELL
u1_td2 column=f1:a1, timestamp=1548128525390, value=
1 row(s) in 0.0140 seconds
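Limit the number of columns returned
ColumnPaginationFilter(5, 1) limits the columns returned per row: the first argument is the maximum number of columns to return, and the second is the zero-based offset of the column to start from.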
hbase(main):012:0> scan 'test001', {LIMIT => 1}
ROW COLUMN+CELL
36.56.0.0_10000120 column=f1:bts, timestamp=1545897321394, value=10704
36.56.0.0_10000120 column=f1:dip, timestamp=1545897321394, value=36.56.0.211
36.56.0.0_10000120 column=f1:dport, timestamp=1545897321394, value=81085
36.56.0.0_10000120 column=f1:pk, timestamp=1545897321394, value=2
36.56.0.0_10000120 column=f1:sip, timestamp=1545897321394, value=36.56.0.0
36.56.0.0_10000120 column=f1:sport, timestamp=1545897321394, value=12790
36.56.0.0_10000120 column=f1:ts, timestamp=1545897321394, value=1545896770661
1 row(s) in 0.0180 seconds
hbase(main):014:0> import org.apache.hadoop.hbase.filter.ColumnPaginationFilter
=> Java::OrgApacheHadoopHbaseFilter::ColumnPaginationFilter
hbase(main):015:0> scan 'test001', {FILTER =>ColumnPaginationFilter.new(5, 1),LIMIT => 1}
ROW COLUMN+CELL
36.56.0.0_10000120 column=f1:dip, timestamp=1545897321394, value=36.56.0.211
36.56.0.0_10000120 column=f1:dport, timestamp=1545897321394, value=81085
36.56.0.0_10000120 column=f1:pk, timestamp=1545897321394, value=2
36.56.0.0_10000120 column=f1:sip, timestamp=1545897321394, value=36.56.0.0
36.56.0.0_10000120 column=f1:sport, timestamp=1545897321394, value=12790
1 row(s) in 0.0110 seconds
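The limit and offset arguments behave like slicing a row's sorted column list. A minimal plain-Ruby sketch, using the column names from the row above and the hypothetical helper `paginate_columns`:

```ruby
# Column qualifiers of the sample row, in the order the scan returns them.
PAGINATION_COLUMNS = %w[bts dip dport pk sip sport ts].freeze

# Return at most `limit` columns, starting at zero-based `offset`.
def paginate_columns(columns, limit, offset)
  columns[offset, limit] || []
end

p paginate_columns(PAGINATION_COLUMNS, 5, 1)
```

With limit 5 and offset 1, the first column bts is skipped and the next five are returned, matching the filtered scan output.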
Find rows whose rowkey contains td3
hbase(main):097:0> import org.apache.hadoop.hbase.filter.CompareFilter
=> Java::OrgApacheHadoopHbaseFilter::CompareFilter
hbase(main):098:0> import org.apache.hadoop.hbase.filter.SubstringComparator
=> Java::OrgApacheHadoopHbaseFilter::SubstringComparator
hbase(main):099:0> import org.apache.hadoop.hbase.filter.RowFilter
=> Java::OrgApacheHadoopHbaseFilter::RowFilter
hbase(main):101:0> scan 'my_test',{FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),SubstringComparator.new('td3'))}
ROW COLUMN+CELL
u1_td3 column=f1:b1, timestamp=1548128540271, value=abc123
1 row(s) in 0.0100 seconds
Regular expressions
Add a row of test data
hbase(main):001:0> put 'my_test','u2_err','f1:b1','abc888'
0 row(s) in 0.2470 seconds
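Query rowkeys of the form u, digits, _td, digits (the pattern ^u\d+\_td\d+$). The newly added test row u2_err does not match the regular expression, so it is not returned.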
hbase(main):003:0> import org.apache.hadoop.hbase.filter.RegexStringComparator
=> Java::OrgApacheHadoopHbaseFilter::RegexStringComparator
hbase(main):004:0> import org.apache.hadoop.hbase.filter.CompareFilter
=> Java::OrgApacheHadoopHbaseFilter::CompareFilter
hbase(main):006:0> import org.apache.hadoop.hbase.filter.SubstringComparator
=> Java::OrgApacheHadoopHbaseFilter::SubstringComparator
hbase(main):008:0> import org.apache.hadoop.hbase.filter.RowFilter
=> Java::OrgApacheHadoopHbaseFilter::RowFilter
hbase(main):010:0> scan 'my_test', {FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),RegexStringComparator.new('^u\d+\_td\d+$'))}
ROW COLUMN+CELL
u1_td1 column=f1:a1, timestamp=1548136934397, value=ab1
u1_td2 column=f1:a1, timestamp=1548128525390, value=abc199
u1_td3 column=f1:b1, timestamp=1548128540271, value=abc123
u2_td4 column=f1:a1, timestamp=1548128558845, value=abc2
u2_td5 column=f1:a2, timestamp=1548128587695, value=abc299
u2_td6 column=f1:s2, timestamp=1548128614427, value=abc222
6 row(s) in 0.0110 seconds
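The pattern can be tried outside HBase before it is used in a filter. The plain-Ruby sketch below applies an equivalent Ruby regular expression to the seven rowkeys; `rows_matching` is a hypothetical helper, and Ruby's regex engine is assumed to agree with Java's for this simple pattern.

```ruby
REGEX_KEYS = %w[u1_td1 u1_td2 u1_td3 u2_err u2_td4 u2_td5 u2_td6].freeze

# 'u', one or more digits, '_td', one or more digits, nothing else.
ROWKEY_PATTERN = /^u\d+_td\d+$/

def rows_matching(keys, pattern)
  keys.select { |k| pattern.match?(k) }
end

p rows_matching(REGEX_KEYS, ROWKEY_PATTERN)
p ROWKEY_PATTERN.match?('u2_err')
```

Six keys match, and u2_err is rejected because err is not td followed by digits.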