HBase table creation: attributes, avoiding hotspots, region split

1. View the help for creating a table

help 'create'
Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily 
including NAME attribute. 
Examples:

Create a table with namespace=ns1 and table qualifier=t1
  hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}

Table configuration options can be put at the end.

Examples:

  hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
  hbase> # Optionally pre-split the table into NUMREGIONS, using
  hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}

You can also keep around a reference to the created table:

  hbase> t1 = create 't1', 'f1'

Which gives you a reference to the table named 't1', on which you can then
call methods.

2. Avoiding hotspots: pre-split the table manually

create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
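
To make the effect of NUMREGIONS => 15 concrete, here is a minimal sketch in plain Ruby (the HBase shell itself is JRuby). It is not HBase's own code. It assumes that HexStringSplit divides an 8-digit hexadecimal key space evenly, and that row keys start with hex characters, for example a hash prefix. The helper names `hex_split_points` and `region_index` and the row key 'device-0001' are made up for illustration.

```ruby
require 'digest'

# Evenly divide an 8-hex-digit key space ("00000000".."ffffffff") into
# num_regions ranges and return the num_regions - 1 boundary keys.
# Assumption: this approximates what SPLITALGO => 'HexStringSplit' produces.
def hex_split_points(num_regions, width = 8)
  raise ArgumentError, 'need at least 2 regions' if num_regions < 2
  step = (16**width) / num_regions
  (1...num_regions).map { |i| format("%0#{width}x", step * i) }
end

# A split key is the start key of the region to its right, so a row key
# falls into region N when exactly N split keys are <= the row key.
def region_index(row_key, split_points)
  split_points.count { |point| point <= row_key }
end

points = hex_split_points(15)
puts points.join(' ')

# Hypothetical row key: the first 8 hex digits of an MD5 hash of a device id.
row_key = Digest::MD5.hexdigest('device-0001')[0, 8]
puts "#{row_key} -> region #{region_index(row_key, points)}"
```

Fifteen regions need fourteen split keys. Pre-splitting only spreads the load if the row keys themselves are spread across the key space: monotonically increasing keys such as timestamps would still all land in one region.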

3. View the table description:

describe 'terminal_data_file_jn'

'terminal_data_file_jn', {NAME => 'cf', BLOOMFILTER => 'ROW', VERSIONS => '1',
 IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'false', DATA_BLOCK_ENCODING => 'NONE',
 TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
 BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
ENABLED: true
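  • IN_MEMORY: can improve read performance, because blocks of this column family get higher priority in the block cache
  • TTL: Time To Live, i.e. how long a cell is kept. FOREVER means cells are never deleted automatically; a finite value (in seconds) can be set instead, for example to delete data after three months
  • COMPRESSION: whether the data is compressed. The default is no compression; other options are 'gz', 'lzo', 'snappy', or 'none'.
  • MIN_VERSIONS: the minimum number of versions to keep; used together with TTL
  • BLOCKCACHE: whether blocks of this column family are kept in HBase's block cache
  • BLOCKSIZE: the block size, in bytes
  • REPLICATION_SCOPE: replication; 0 means the column family is not replicated to other clusters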
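
As an illustration of how these attributes fit together, the following Ruby sketch builds a `create` statement with a three-month TTL. The table name 't1', the chosen values, and the `render_create` helper are all made up for illustration; 'SNAPPY' only works on a cluster where the Snappy libraries are installed.

```ruby
# TTL is given in seconds, so "three months" is approximated here as 90 days.
seconds_per_day = 24 * 60 * 60

# Hypothetical column family settings, using the attribute names listed above.
cf_attrs = {
  'NAME'         => 'cf',
  'VERSIONS'     => 1,
  'TTL'          => 90 * seconds_per_day,
  'MIN_VERSIONS' => 1,          # keep at least one version even after the TTL expires
  'COMPRESSION'  => 'SNAPPY',   # assumes Snappy is available on the cluster
  'IN_MEMORY'    => 'true',
  'BLOCKSIZE'    => 65536       # bytes
}

# Quote string values the way the shell examples do; leave numbers bare.
def shell_literal(value)
  value.is_a?(String) ? "'#{value}'" : value.to_s
end

# Render the attributes as a single HBase shell `create` command.
def render_create(table, attrs)
  body = attrs.map { |key, value| "#{key} => #{shell_literal(value)}" }.join(', ')
  "create '#{table}', {#{body}}"
end

puts render_create('t1', cf_attrs)
```

The printed line can be pasted into the HBase shell; running `describe 't1'` afterwards shows whether the attributes were applied.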

