HBase (Docker Edition): Simple Deployment and Hands-On HBase Shell Practice

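Contents

  • Notes
  • Deploying HBase
  • Accessing the HBase Shell
  • Common Commands
    • Data Definition Language (DDL)
    • Data Manipulation Language (DML)
    • General Operations
  • Accessing the HBase WebUI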

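Notes

  • This article is for HBase beginners who want to set up an HBase environment quickly and practice common shell commands
  • References
    • "Principles and Applications of Big Data Technology" (《大数据技术原理和应用》, edited by Lin Ziyu, 3rd edition)
    • zhoupengbo's big data practice project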

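Deploying HBase

  1. Install Docker: one quick route is to install 1Panel, which sets up Docker for you, and then configure a registry mirror in its admin panel
  2. Then pull the harisekhon/hbase image to the local machine from the panel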
  3. Run the container
    docker run -d -h docker-hbase \
            -p 2181:2181 \
            -p 8080:8080 \
            -p 8085:8085 \
            -p 9090:9090 \
            -p 9000:9000 \
            -p 9095:9095 \
            -p 16000:16000 \
            -p 16010:16010 \
            -p 16201:16201 \
            -p 16301:16301 \
            -p 16020:16020 \
            --name hbase \
            harisekhon/hbase
    
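Host port   Container port   Purpose
2181        2181             ZooKeeper client port, used for coordination within the HBase cluster
8080        8080             HBase REST server port, for access to HBase through the REST API
8085        8085             HBase REST server web UI
9090        9090             HBase Thrift server port
9000        9000             Conventionally the HDFS NameNode RPC port (may be unused by this standalone image)
9095        9095             HBase Thrift server web UI
16000       16000            HBase Master RPC port (the status output later shows the active master here)
16010       16010            HBase Master web UI (the address used in the WebUI section)
16201       16201            Alternate RegionServer RPC port (assumption: used only in some configurations)
16301       16301            Alternate RegionServer web UI port (assumption: used only in some configurations)
16020       16020            HBase RegionServer RPC port (the status output later shows the live server here)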
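
The long run of -p flags in the docker run command can also be generated from a list of ports. The following is only a minimal sketch: the helper name port_flags is hypothetical (it is not part of Docker or of the image), and it assumes every host port maps to the identical container port, as in the table above.

```shell
# port_flags: hypothetical helper that prints "-p PORT:PORT" for each argument,
# assuming the host port and the container port are the same number.
port_flags() {
  flags=""
  for port in "$@"; do
    # Append a space before every flag except the first one.
    flags="${flags:+$flags }-p ${port}:${port}"
  done
  printf '%s\n' "$flags"
}

# Print (without executing) a docker run command equivalent to the one above.
echo "docker run -d -h docker-hbase $(port_flags 2181 8080 8085 9090 9000 9095 16000 16010 16201 16301 16020) --name hbase harisekhon/hbase"
```

Printing the command first makes it easy to review the mapping before actually starting the container.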

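Accessing the HBase Shell

# List the running containers
docker ps
# Find the container ID, then enter the container
docker exec -it <container ID prefix> bash
# Once inside the container, start the HBase shell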
hbase shell
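
For scripted use, the steps above can be collapsed into a single command run from the host. This is a sketch under assumptions: the container is named hbase (as in the docker run command earlier), the hbase launcher is on the container's PATH, and the shell's -n (non-interactive) flag is available in the HBase version shipped in the image. The function name hbase_cmd is hypothetical.

```shell
# hbase_cmd: hypothetical helper that pipes one HBase shell command into the
# container. "-i" keeps stdin open for docker exec; "-n" asks the HBase shell
# to run non-interactively and exit when its input ends.
hbase_cmd() {
  printf '%s\n' "$1" | docker exec -i hbase hbase shell -n
}

# Usage (requires the running container):
#   hbase_cmd "list"
#   hbase_cmd "status 'summary'"
```

The interactive session shown above remains the easier choice while exploring; the one-shot form is mainly useful in scripts.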

Common Commands

  • Enter "hbase shell" in a terminal to start the shell environment, then type "help" to list all the shell commands that HBase supports

    HBase Shell, version 2.1.3, rda5ec9e4c06c537213883cca8f3cc9a7c19daf67, Mon Feb 11 15:45:33 CST 2019
    Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
    Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
    
    COMMAND GROUPS:
      Group name: general
      Commands: processlist, status, table_help, version, whoami
    
      Group name: ddl
      Commands: alter, alter_async, alter_status, clone_table_schema, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, list_regions, locate_region, show_filters
    
      Group name: namespace
      Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables
    
      Group name: dml
      Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve
    
      Group name: tools
      Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, cleaner_chore_enabled, cleaner_chore_run, cleaner_chore_switch, clear_block_cache, clear_compaction_queues, clear_deadservers, close_region, compact, compact_rs, compaction_state, flush, is_in_maintenance_mode, list_deadservers, major_compact, merge_region, move, normalize, normalizer_enabled, normalizer_switch, split, splitormerge_enabled, splitormerge_switch, stop_master, stop_regionserver, trace, unassign, wal_roll, zk_dump
    
      Group name: replication
      Commands: add_peer, append_peer_exclude_namespaces, append_peer_exclude_tableCFs, append_peer_namespaces, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, remove_peer_exclude_namespaces, remove_peer_exclude_tableCFs, remove_peer_namespaces, remove_peer_tableCFs, set_peer_bandwidth, set_peer_exclude_namespaces, set_peer_exclude_tableCFs, set_peer_namespaces, set_peer_replicate_all, set_peer_serial, set_peer_tableCFs, show_peer_tableCFs, update_peer_config
    
      Group name: snapshots
      Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, delete_table_snapshots, list_snapshots, list_table_snapshots, restore_snapshot, snapshot
    
      Group name: configuration
      Commands: update_all_config, update_config
    
      Group name: quotas
      Commands: list_quota_snapshots, list_quota_table_sizes, list_quotas, list_snapshot_sizes, set_quota
    
      Group name: security
      Commands: grant, list_security_capabilities, revoke, user_permission
    
      Group name: procedures
      Commands: list_locks, list_procedures
    
      Group name: visibility labels
      Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility
    
      Group name: rsgroup
      Commands: add_rsgroup, balance_rsgroup, get_rsgroup, get_server_rsgroup, get_table_rsgroup, list_rsgroups, move_namespaces_rsgroup, move_servers_namespaces_rsgroup, move_servers_rsgroup, move_servers_tables_rsgroup, move_tables_rsgroup, remove_rsgroup, remove_servers_rsgroup
    
    SHELL USAGE:
    Quote all names in HBase Shell such as table and column names.  Commas delimit
    command parameters.  Type <RETURN> after entering a command to run it.
    Dictionaries of configuration used in the creation and alteration of tables are
    Ruby Hashes. They look like this:
    
      {'key1' => 'value1', 'key2' => 'value2', ...}
    
    and are opened and closed with curley-braces.  Key/values are delimited by the
    '=>' character combination.  Usually keys are predefined constants such as
    NAME, VERSIONS, COMPRESSION, etc.  Constants do not need to be quoted.  Type
    'Object.constants' to see a (messy) list of all constants in the environment.
    
    If you are using binary keys or values and need to enter them in the shell, use
    double-quote'd hexadecimal representation. For example:
    
      hbase> get 't1', "key\x03\x3f\xcd"
      hbase> get 't1', "key\003\023\011"
      hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"
    
    The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
    For more on the HBase Shell, see http://hbase.apache.org/book.html
    

Data Definition Language (DDL)

  1. create: create a table
    # Create table t1 with column family f1, keeping up to 5 versions of each cell
    create 't1', {NAME => 'f1', VERSIONS => 5}
    # Create table t2 with three column families: f1, f2, and f3
    create 't2', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
    # Equivalent shorthand
    create 't2', 'f1', 'f2', 'f3'
    # Create table t3, pre-split into 15 regions using the HexStringSplit split algorithm
    create 't3', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
    # Create table t5 with explicit split points
    create 't5', 'f1', {SPLITS => ['10', '20', '30', '40']}
    
  2. list: list the tables
    hbase(main):035:0> list
    TABLE                                                                                                      
    t1                                                                                                         
    1 row(s)
    Took 0.0260 seconds                                                                                        
    => ["t1"]
    
  3. alter: modify the column family schema
    # Add column family f4 to table t2
    hbase(main):025:0> alter 't2', NAME => 'f4' 
    Updating all regions with the new schema...
    1/1 regions updated.
    Done.
    Took 1.9711 seconds                                                                                        
    # Delete column family f4 from table t2
    hbase(main):026:0> alter 't2', NAME => 'f4', METHOD => 'delete'
    Updating all regions with the new schema...
    1/1 regions updated.
    Done.
    Took 1.7745 seconds  
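    # Set the maximum region size of table t2 to 128 MB (MAX_FILESIZE is a table attribute given in bytes, not a column family setting)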
    hbase(main):027:0> alter 't2', METHOD => 'table_att', MAX_FILESIZE => '134217728'
    Updating all regions with the new schema...
    1/1 regions updated.
    Done.
    Took 1.6691 seconds
    
  4. describe: show information about a table
    hbase(main):039:0> describe 't2'
    Table t2 is ENABLED                                                                                        
    t2, {TABLE_ATTRIBUTES => {MAX_FILESIZE => '134217728'}                                                     
    COLUMN FAMILIES DESCRIPTION                                                                                
    {NAME => 'f1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}                                                                
    {NAME => 'f2', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}                                                                
    {NAME => 'f3', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}                                                                
    3 row(s)
    Took 0.0345 seconds  
    
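  5. Disabling and dropping a table
    • In the HBase shell, a table must first be disabled (disable) before it can be dropped (drop). The table name must also be quoted, because it is a string.
    # Disable the table first
    hbase(main):003:0> disable 't1'
    # Then drop it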
    hbase(main):004:0> drop 't1'
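
The MAX_FILESIZE value in the alter example above is given in bytes, which is why 128 MB appears as 134217728. A small sketch of that conversion, taking 1 MB as 1024 × 1024 bytes; the function name mb_to_bytes is hypothetical:

```shell
# mb_to_bytes: hypothetical helper that converts megabytes to bytes,
# using 1 MB = 1024 * 1024 bytes.
mb_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

mb_to_bytes 128   # prints 134217728, the value passed to MAX_FILESIZE
```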
    

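Data Manipulation Language (DML)

  1. put: write data to the cell identified by a table, row, and column
    # Write value1 to the cell at row row1, column f1:c1 of table t2, with timestamp 1421822284898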
    hbase(main):011:0> put 't2', 'row1', 'f1:c1', 'value1', 1421822284898
    Took 0.0933 seconds    
    
  2. get: read cell data
    # Get up to 4 versions of the data in table t2, row row1, column family f1
    hbase(main):018:0> get 't2', 'row1', {COLUMN => 'f1', VERSIONS => 4}
    COLUMN                      CELL                                                                           
     f1:c1                      timestamp=1421822284898, value=value1                                          
    1 row(s)
    Took 0.0059 seconds
    # Get every column of column family f1 in row row1 of table t1
    hbase(main):021:0> get 't1','row1','f1'
    COLUMN                      CELL                                                                           
     f1:c1                      timestamp=1421822284898, value=value1                                          
    1 row(s)
    Took 0.0071 seconds     
    
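  3. scan: browse the contents of a table
    # The catalog table used to be named .META.; as the error below shows, it is now hbase:meta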
    hbase(main):022:0> scan '.META.', {COLUMNS => 'info:regioninfo'}
    ERROR: .META. no longer exists. The table has been renamed to hbase:meta
    For usage try 'help "scan"'
    Took 0.0020 seconds                                                                                        
    hbase(main):023:0> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}
    ROW                         COLUMN+CELL                                                                    
    hbase:namespace,,1706607974908.8df5ef60ba6316c3e7dd2e8cad870dc1. column=info:regioninfo, timestamp=1706607975642, value={ENCODED => 8df5ef60ba6316c3e7dd2e8cad870dc1, NAME => 'hbase:namespace,,1706607974908.8df5ef60ba6316c3e7dd2e8cad870dc1.', STARTKEY => '', ENDKEY => ''}
    t1,,1706610172446.41e2dba025fd0c022b04007fc93e380a. column=info:regioninfo, timestamp=1706610172938, value={ENCODED => 41e2dba025fd0c022b04007fc93e380a, NAME => 't1,,1706610172446.41e2dba025fd0c022b04007fc93e380a.', STARTKEY => '', ENDKEY => ''}
    2 row(s)
    Took 0.0389 seconds 
    
  4. count: count the rows in a table
    hbase(main):028:0> count 't1'
    1 row(s)
    Took 0.0440 seconds                                                                                        
    => 1
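
The timestamp passed to put above (1421822284898) is a count of milliseconds since the Unix epoch, which is hard to read at a glance. The following sketch converts such a value to a UTC date. It assumes a date implementation that accepts the -d @SECONDS form (GNU date does; some other platforms do not), and the function name hbase_ts_to_utc is hypothetical.

```shell
# hbase_ts_to_utc: hypothetical helper that turns a millisecond timestamp
# into a human-readable UTC date. Assumes a date command supporting "-d @SECONDS".
hbase_ts_to_utc() {
  # Integer division drops the millisecond part and leaves whole seconds.
  date -u -d "@$(( $1 / 1000 ))" '+%Y-%m-%d %H:%M:%S'
}

hbase_ts_to_utc 1421822284898   # prints 2015-01-21 06:38:04
```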
    

General Operations

  1. status: print status information about the HBase cluster
    • The level of detail can be set with summary, simple, or detailed
    hbase(main):031:0> status 'summary'
    1 active master, 0 backup masters, 1 servers, 0 dead, 3.0000 average load
    Took 0.0143 seconds                                                                                        
    hbase(main):032:0> status 'simple'
    active master:  docker-hbase:16000 1706607959227
    0 backup masters
    1 live servers
        docker-hbase:16020 1706607960371
            requestsPerSecond=0.0, numberOfOnlineRegions=3, usedHeapMB=52, maxHeapMB=1958, numberOfStores=7, numberOfStorefiles=7, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeKB=0, readRequestsCount=79, filteredReadRequestsCount=7, writeRequestsCount=36, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=24, currentCompactedKVs=24, compactionProgressPct=1.0, coprocessors=[MultiRowMutationEndpoint]
    0 dead servers
    Aggregate load: 0, regions: 3
    Took 0.0077 seconds   
    
  2. version: show the HBase version
    hbase(main):040:0> version
    2.1.3, rda5ec9e4c06c537213883cca8f3cc9a7c19daf67, Mon Feb 11 15:45:33 CST 2019
    Took 0.0004 seconds

Accessing the HBase WebUI

http://<server IP>:16010/master-status
  • Feel free to explore the details of the page on your own
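
Before opening the page in a browser, it can help to confirm from a terminal that the Master web UI answers. A minimal sketch, assuming curl is installed and port 16010 is published as in the docker run command earlier; the function name master_status_url is hypothetical:

```shell
# master_status_url: hypothetical helper that builds the Master status URL
# for a given host name or IP address.
master_status_url() {
  printf 'http://%s:16010/master-status\n' "$1"
}

# Usage (requires the running container and curl):
#   curl -sf -o /dev/null "$(master_status_url 127.0.0.1)" && echo "Master UI is up"
```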
