HBase Data Structure Model

Contents

  • 1. Data structure
  • 2. Table data organization: the meta table
    • a. Structure
    • b. Example
  • 3. Storage directory layout

1. Data structure

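  • An HBase table is a multi-dimensional map (nested key-value pairs).
  • row: one data record.
    Rows are stored in sorted order, by the lexicographic (byte) order of the rowkey (a -> z). Each row has this shape:
    ( rowkey1: { column family: { column name: "value" } } )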

Reference: https://dzone.com/articles/understanding-hbase-and-bigtab
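
The lexicographic row ordering described above can be illustrated with a small Python sketch. It only mimics the ordering with Python's own byte-string comparison and is not HBase code; the row keys other than a1 and r1 are made up for the illustration.

```python
# Row keys are byte strings; rows are kept sorted by comparing those bytes.
# b"r10" and b"r2" are hypothetical keys added to show the effect on numbers.
row_keys = [b"r1", b"a1", b"r10", b"r2"]

# Python compares bytes objects lexicographically, byte by byte,
# which is the same kind of ordering described for rowkeys.
sorted_keys = sorted(row_keys)

for key in sorted_keys:
    print(key.decode())
```

Note that r10 sorts before r2, because the comparison is by bytes and not by numeric value. Zero-padding numeric parts of a rowkey (r02, r10) is a common way to keep numeric and lexicographic order aligned.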

hbase(main):014:0> scan 'test'
ROW          COLUMN+CELL                                                                                                                     
 a1          column=f:name, timestamp=1586779007598, value=name1                                                                             
 r1          column=f:name, timestamp=1586422233658, value=test                                                                              
2 row(s) in 0.0150 seconds


## Data model
  "rowkey1" : {
    "cf1" : {
      "name" : {
        1123123123112 : "lisi",
        1231243123312 : "lisi2"
      },
      "job" : {
        1512343252351 : "java-engineer"
      }
    },

    "cf2" : {
      "city" : {
        123461231 : "Beijing"
      }
    }
  }
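
As a rough sketch, the nested model above maps directly onto nested Python dictionaries: row key, then column family, then column name, then timestamp. The helper latest_cell is a hypothetical name used only for this illustration, and the table is trimmed to the cf1:name column.

```python
# Nested map: row key -> column family -> column name -> {timestamp: value}
table = {
    "rowkey1": {
        "cf1": {
            "name": {
                1123123123112: "lisi",
                1231243123312: "lisi2",
            },
        },
    },
}


def latest_cell(table, row, family, qualifier):
    """Return (timestamp, value) of the newest version, or None if absent.

    Hypothetical helper: it mirrors the idea that a read without an explicit
    timestamp returns the version with the largest timestamp.
    """
    versions = table.get(row, {}).get(family, {}).get(qualifier)
    if not versions:
        return None
    newest = max(versions)
    return newest, versions[newest]


print(latest_cell(table, "rowkey1", "cf1", "name"))
```

A cell is therefore addressed by four coordinates (row key, column family, column name, timestamp), and the same column can hold several versions side by side.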

2. Table data organization: the meta table

a. Structure

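HBase has two system tables, hbase:namespace and hbase:meta. The namespace table stores all namespaces; the meta table stores the region layout of every table (which regions exist, their key ranges, and which server hosts each one).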
Reference: http://hbase.apache.org/book.html#arch.catalog

## 1. Namespaces (comparable to a database)
hbase(main):034:0> list_namespace
NAMESPACE                                                                                                                                                                    
default                                                                                                                                                                      
hbase                                                                                                                                                                        
2 row(s) in 0.0130 seconds

## 2. Tables in a given namespace (comparable to database.table)
hbase(main):037:0> list_namespace_tables 'hbase'
TABLE                                                                                                                                                                        
meta                                                                                                                                                                         
namespace                                                                                                                                                                    
2 row(s) in 0.0340 seconds

hbase(main):038:0>  scan 'hbase:namespace'
ROW                                          COLUMN+CELL                                                                                                                     
 default                                     column=info:d, timestamp=1586414716734, value=\x0A\x07default                                                                   
 hbase                                       column=info:d, timestamp=1586414716750, value=\x0A\x05hbase                                                         
2 row(s) in 0.0230 seconds

hbase(main):036:0>  scan 'hbase:meta'
ROW                                          COLUMN+CELL                                                                                                                     
 hbase:namespace,,1586414715365.6faa8977c7b0 column=info:regioninfo, timestamp=1586744606619, value={ENCODED => 6faa8977c7b0bc9c7021864a0e131544, NAME => 'hbase:namespace,,1
 bc9c7021864a0e131544.                       586414715365.6faa8977c7b0bc9c7021864a0e131544.', STARTKEY => '', ENDKEY => ''}                                                  
 hbase:namespace,,1586414715365.6faa8977c7b0 column=info:seqnumDuringOpen, timestamp=1586744606619, value=\x00\x00\x00\x00\x00\x00\x00\x0D                                   
 bc9c7021864a0e131544.                                                                                                                                                       
 hbase:namespace,,1586414715365.6faa8977c7b0 column=info:server, timestamp=1586744606619, value=cdh-node1:60020                                                              
 bc9c7021864a0e131544.                                                                                                                                                       
 hbase:namespace,,1586414715365.6faa8977c7b0 column=info:serverstartcode, timestamp=1586744606619, value=1586744536803                                                       
 bc9c7021864a0e131544.   
                                                                                                                                                     
 test,,1581074147974.6d9cc122b356452eb1a7f5f column=info:regioninfo, timestamp=1586776597280, value={ENCODED => 6d9cc1fc906d, NAME => 'test,,158107414797
 834fc906d.                                  4.6d9cc1fc906d.', STARTKEY => '', ENDKEY => ''}                                                             
 test,,1581074147974.6d9cc122b356452eb1a7f5f column=info:seqnumDuringOpen, timestamp=1586776597280, value=\x00\x00\x00\x00\x00\x00\x00+                                      
 834fc906d.                                                                                                                                                                  
 test,,1581074147974.6d9cc122b356452eb1a7f5f column=info:server, timestamp=1586776597280, value=cdh-node1:60020                                                              
 834fc906d.                                                                                                                                                                  
 test,,1581074147974.6d9cc122b356452eb1a7f5f column=info:serverstartcode, timestamp=1586776597280, value=1586744536803                                                       
 834fc906d.                                                                                                                                                                  
2 row(s) in 0.0230 seconds
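
Each row key in hbase:meta is a region name. Judging from the scan output above, it has the shape <table>,<start key>,<region id>.<encoded name>. with a trailing dot. The following Python sketch splits such a name into its parts. parse_region_name is a hypothetical helper, not part of any HBase client, and it assumes the start key is printable text and the name follows exactly the shape seen in this output.

```python
def parse_region_name(region_name: str) -> dict:
    """Split '<table>,<start key>,<region id>.<encoded name>.' into its parts.

    Hypothetical helper for illustration only.
    """
    # The table name (optionally 'namespace:table') ends at the first comma.
    table, rest = region_name.split(",", 1)
    # Drop the trailing dot, then cut off the encoded name after the last dot.
    start_and_id, encoded = rest.rstrip(".").rsplit(".", 1)
    # The region id (a creation timestamp in ms) follows the last comma.
    start_key, region_id = start_and_id.rsplit(",", 1)
    return {
        "table": table,
        "start_key": start_key,  # empty string means "from the beginning"
        "region_id": int(region_id),
        "encoded": encoded,
    }


info = parse_region_name(
    "hbase:namespace,,1586414715365.6faa8977c7b0bc9c7021864a0e131544."
)
print(info)
```

The empty start key shows that this single region covers the table from its first possible rowkey.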

b. Example

## Create table t1, pre-split into 2 regions: ['', '10') and ['10', ''), where an empty key means unbounded
hbase(main):050:0> create 't1', 'f1', SPLITS => ['10']
0 row(s) in 1.2570 seconds
=> Hbase::Table - t1


hbase(main):062:0> scan 'hbase:meta',{ROWPREFIXFILTER => 't1'}
ROW                                          COLUMN+CELL                                                                                                                     
 t1,,1586782576057.3a09a80a565194a112e74fb58 column=info:regioninfo, timestamp=1586782576688, value={ENCODED => 3a09a80a565194a112e74fb58be40476, NAME => 't1,,1586782576057.
 be40476.                                    3a09a80a565194a112e74fb58be40476.', STARTKEY => '', ENDKEY => '10'}                                                             
 t1,,1586782576057.3a09a80a565194a112e74fb58 column=info:seqnumDuringOpen, timestamp=1586782576688, value=\x00\x00\x00\x00\x00\x00\x00\x02                                   
 be40476.                                                                                                                                                                    
 t1,,1586782576057.3a09a80a565194a112e74fb58 column=info:server, timestamp=1586782576688, value=cdh-node1:60020                                                              
 be40476.                                                                                                                                                                    
 t1,,1586782576057.3a09a80a565194a112e74fb58 column=info:serverstartcode, timestamp=1586782576688, value=1586744536803                                                       
 be40476.    
                                                                                                                                                                 
 t1,10,1586782576057.c05ee6ae537b3a8a8372f68 column=info:regioninfo, timestamp=1586782576683, value={ENCODED => c05ee6ae537b3a8a8372f6854dba0bb9, NAME => 't1,10,158678257605
 54dba0bb9.                                  7.c05ee6ae537b3a8a8372f6854dba0bb9.', STARTKEY => '10', ENDKEY => ''}                                                           
 t1,10,1586782576057.c05ee6ae537b3a8a8372f68 column=info:seqnumDuringOpen, timestamp=1586782576683, value=\x00\x00\x00\x00\x00\x00\x00\x02                                   
 54dba0bb9.                                                                                                                                                                  
 t1,10,1586782576057.c05ee6ae537b3a8a8372f68 column=info:server, timestamp=1586782576683, value=cdh-node1:60020                                                              
 54dba0bb9.                                                                                                                                                                  
 t1,10,1586782576057.c05ee6ae537b3a8a8372f68 column=info:serverstartcode, timestamp=1586782576683, value=1586744536803                                                       
 54dba0bb9.                                                                                                                                                                  
2 row(s) in 0.0090 seconds
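
The meta rows above show that the split point '10' yields two regions, with STARTKEY inclusive and ENDKEY exclusive. As a minimal sketch of how a rowkey maps to one of these ranges, the lookup can be modelled as a binary search over the sorted split points. find_region is a hypothetical name; a real client resolves regions by reading hbase:meta and does not run this code.

```python
from bisect import bisect_right


def find_region(split_keys, row_key):
    """Return the index of the region whose [start, end) range holds row_key.

    split_keys must be sorted. Region 0 is ['', first split), and region i is
    [split_keys[i-1], split_keys[i]), with the last region unbounded above.
    """
    # bisect_right counts the split points that are <= row_key,
    # so a key equal to a split point lands in the region that starts there.
    return bisect_right(split_keys, row_key)


# Same split point as: create 't1', 'f1', SPLITS => ['10']
splits = [b"10"]

for key in [b"1", b"099", b"10", b"2"]:
    print(key.decode(), "-> region", find_region(splits, key))
```

Because keys compare as bytes, rowkey 2 falls into the second region (it sorts after 10), while 1 and 099 fall into the first.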

3. Storage directory layout

## 0. Storage layout of an HBase table
[root@cdh-node1 ~]# hdfs dfs -ls /hbase/data/default/test
Found 3 items
drwxr-xr-x   - hbase hbase   0 2020-04-13 11:16 /hbase/data/default/test/.tabledesc # table descriptor (what desc prints)
drwxr-xr-x   - hbase hbase   0 2020-04-13 11:16 /hbase/data/default/test/.tmp # normally empty, except during compaction or split
drwxr-xr-x   - hbase hbase   0 2020-04-13 11:16 /hbase/data/default/test/6d9cc1fc906d # region directory: one subdirectory per column family (store), holding its store files

[root@cdh-node1 ~]# hdfs dfs -ls /hbase/data/default/test/6d9cc1fc906d
Found 4 items
-rw-r--r--   1 hbase hbase   39 2020-04-09 08:51 /hbase/data/default/test/6d9cc1fc906d/.regioninfo
drwxr-xr-x   - hbase hbase    0 2020-04-13 11:16 /hbase/data/default/test/6d9cc1fc906d/.tmp # normally empty
drwxr-xr-x   - hbase hbase    0 2020-04-13 11:16 /hbase/data/default/test/6d9cc1fc906d/f
drwxr-xr-x   - hbase hbase    0 2020-04-13 11:16 /hbase/data/default/test/6d9cc1fc906d/recovered.edits
[root@cdh-node1 ~]# hdfs dfs -ls /hbase/data/default/test/6d9cc1fc906d/.regioninfo
-rw-r--r--   1 hbase hbase         39 2020-04-09 08:51 /hbase/data/default/test/6d9cc1fc906d/.regioninfo
[root@cdh-node1 ~]# hdfs dfs -cat /hbase/data/default/test/6d9cc1fc906d/.regioninfo
PBUF�����.
defaulttest"(08

[root@cdh-node1 ~]# hdfs dfs -ls /hbase/data/default/test/6d9cc1fc906d/f
Found 1 items
-rw-r--r--   1 hbase hbase   994 2020-04-13 11:16 /hbase/data/default/test/6d9cc1fc906d/f/bd8f4d9313db49529bac5a6086871063
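
The listings above follow the pattern <root>/data/<namespace>/<table>/<encoded region>/<column family>/<store file>. A small Python sketch can make that mapping explicit. parse_store_file_path is a hypothetical helper that only inspects the path string; it does not talk to HDFS.

```python
from pathlib import PurePosixPath


def parse_store_file_path(path: str) -> dict:
    """Name the components of a store file path (hypothetical helper).

    Assumes the layout seen in the listings:
    <root>/data/<namespace>/<table>/<region>/<family>/<store file>
    """
    parts = PurePosixPath(path).parts
    # Everything of interest follows the 'data' directory under the root.
    i = parts.index("data")
    namespace, table, region, family, store_file = parts[i + 1 : i + 6]
    return {
        "namespace": namespace,
        "table": table,
        "region": region,
        "family": family,
        "store_file": store_file,
    }


layout = parse_store_file_path(
    "/hbase/data/default/test/6d9cc1fc906d/f/bd8f4d9313db49529bac5a6086871063"
)
print(layout)
```

Read this way, the path encodes the same hierarchy as the data model: namespace, table, region, column family, and finally the file that holds the cells.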



## 1. View the table's default attributes
## hbase(main):005:0> desc 'ttl1'
## Table ttl1 is ENABLED                                                                                                                                                        
## ttl1                                                                                                                                                                         
## COLUMN FAMILIES DESCRIPTION                                                                                                                                                  
## {NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'N
## ONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}                                                                             
## 1 row(s) in 0.0220 seconds

[root@cdh-node1 ~]# hdfs dfs -cat /hbase/data/default/test/.tabledesc/.tableinfo.0000000001
PBUF
defaulttest
IS_METAfalse�
f
BLOOMFILTERROW
VERSIONS1
  IN_MEMORYfalse
KEEP_DELETED_CELLSFALSE
DATA_BLOCK_ENCODINGNONE
TTL
2147483647
COMPRESSIONNONE
MIN_VERSIONS0
BLOCKCACHEtrue
  BLOCKSIZE65536
REPLICATION_SCOPE0
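
The descriptor file stores TTL as 2147483647, while desc prints TTL => 'FOREVER'. Both denote the same setting: 2147483647 is the largest signed 32-bit integer, which stands for "never expire". The sketch below only illustrates that equivalence; describe_ttl is a hypothetical helper, and its output format for finite values is an assumption, not what the HBase shell prints.

```python
# Largest signed 32-bit integer; a TTL of this many seconds means "never expire".
FOREVER = 2**31 - 1


def describe_ttl(ttl_seconds: int) -> str:
    """Render a TTL value in readable form (hypothetical helper)."""
    if ttl_seconds == FOREVER:
        return "FOREVER"
    return f"{ttl_seconds} seconds"


print(FOREVER)                   # 2147483647
print(describe_ttl(2147483647))  # FOREVER
```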



## 2. Alter table attributes: compression + data block encoding
## hbase(main):006:0> alter 'test',{NAME=>'f',COMPRESSION=>"LZ4",DATA_BLOCK_ENCODING=>"FAST_DIFF"}
## Updating all regions with the new schema...
## 1/1 regions updated.
## Done.
## 0 row(s) in 2.0640 seconds
## 
## hbase(main):007:0> compact
## compact       compact_mob   compact_rs
## hbase(main):007:0> compact 'test'
## 0 row(s) in 0.0690 seconds
## 
## hbase(main):008:0> major_compact
## major_compact       major_compact_mob
## hbase(main):008:0> major_compact 'test'
## 0 row(s) in 0.0510 seconds

[root@cdh-node1 ~]# hdfs dfs -ls /hbase/data/default/test/.tabledesc
Found 1 items
-rw-r--r--   1 hbase hbase   285 2020-04-13 11:16 /hbase/data/default/test/.tabledesc/.tableinfo.0000000002

[root@cdh-node1 ~]# hdfs dfs -cat /hbase/data/default/test/.tabledesc/.tableinfo.0000000002
PBUF
defaulttest
IS_METAfalse�
f
BLOOMFILTERROW
VERSIONS1
  IN_MEMORYfalse
KEEP_DELETED_CELLSFALSE 
DATA_BLOCK_ENCODING FAST_DIFF
TTL
2147483647
COMPRESSIONLZ4
MIN_VERSIONS0
BLOCKCACHEtrue
  BLOCKSIZE65536
REPLICATION_SCOPE0
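
Comparing .tableinfo.0000000001 with .tableinfo.0000000002 by eye is tedious because the files are binary. As a minimal sketch, the attribute values read from the two dumps above can be put into dictionaries and diffed. The dictionaries are transcribed by hand from the output; nothing here parses the protobuf files.

```python
# Column family 'f' attributes, transcribed from the two .tableinfo dumps.
before = {
    "BLOOMFILTER": "ROW",
    "VERSIONS": "1",
    "IN_MEMORY": "false",
    "KEEP_DELETED_CELLS": "FALSE",
    "DATA_BLOCK_ENCODING": "NONE",
    "TTL": "2147483647",
    "COMPRESSION": "NONE",
    "MIN_VERSIONS": "0",
    "BLOCKCACHE": "true",
    "BLOCKSIZE": "65536",
    "REPLICATION_SCOPE": "0",
}

# The alter command changed exactly two attributes.
after = dict(before, DATA_BLOCK_ENCODING="FAST_DIFF", COMPRESSION="LZ4")

# Keep only the attributes whose value differs: name -> (old, new).
changed = {
    name: (before[name], after[name])
    for name in before
    if before[name] != after[name]
}

for name, (old, new) in changed.items():
    print(f"{name}: {old} -> {new}")
```

Only DATA_BLOCK_ENCODING and COMPRESSION differ, which matches the alter command; the descriptor file's sequence number also moved from 0000000001 to 0000000002.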
