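I. Introduction

HBase Shell exposes most HBase commands. With it you can create, delete, and alter tables, write data into them, and list information about them. This section covers the commonly used commands and walks through building a "student grades" table from the command line.

The shell has many commands and they resist a short summary. I find the best approach is the help command: once you understand HBase's column-oriented storage model, trying each command a few times is enough to learn it.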

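II. HBase Internal Storage Structure

The figure below is taken from: https://blog.csdn.net/u010416101/article/details/89186320

In HBase, data is stored in the form <row key><column family 1: column 1-1, column 1-2><column family 2: column 2-1, column 2-2>. Two points matter here. First, rows are sorted by row key in lexicographic order, which is very important for lookups. Second, the value of a column within a given row and column family can change over time, and its versions are sorted by timestamp. (Some versions are removed when data is compacted.)

The relevant concepts are described below: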

Row Key

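The row key is the primary key used to retrieve records. There are three main ways to access rows in an HBase table: lookup by a single row key, matching on row keys with a regular expression, or a full table scan. A row key can be any string (maximum length 64 KB; in practice usually 10 to 100 bytes). Rows are sorted by row key in lexicographic order, which is very important for lookups. (Note on lexicographic order: the keys 1, 6, 7, 9, 10, 11, 12 sort as 1, 10, 11, 12, 6, 7, 9, so 11 comes before 9.)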
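The ordering described above can be checked with plain Ruby, outside HBase. This is only a sketch of the comparison rule: the method names are made up for illustration, and zero-padding keys to a fixed width is a common convention assumed here, not something the text above prescribes.

```ruby
# Row keys compare as byte strings, not as numbers.
def lexicographic_order(keys)
  keys.sort
end

# The order a human usually expects from numeric-looking keys.
def numeric_order(keys)
  keys.sort_by { |k| k.to_i }
end

# Assumption: left-pad keys with zeros so both orders agree.
def padded_order(keys, width)
  keys.map { |k| k.rjust(width, '0') }.sort
end

keys = %w[1 6 7 9 10 11 12]
puts lexicographic_order(keys).inspect
puts numeric_order(keys).inspect
puts padded_order(keys, 3).inspect
```

If row keys are numeric IDs and range scans should follow numeric order, padding (or a binary encoding) is one way to get there.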

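Column Family

Every column in HBase belongs to a column family. Column families are part of the schema (the table design) and must be defined before use. Columns are not part of the schema and can be added dynamically when data is inserted. Every column name is prefixed with its column family, for example course:name and course:age.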

Cell

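A cell is the unit of data storage, uniquely identified by row key, column (family:qualifier), and version. Data in a cell has no type; everything is stored as raw bytes.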
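Because cells hold untyped bytes, a value typed into the shell as '18' is stored as the bytes of that string, and byte-wise comparisons do not follow numeric order. A small plain-Ruby sketch (the helper names are hypothetical):

```ruby
# The bytes that represent a string value.
def cell_bytes(value)
  value.to_s.bytes
end

# Byte-by-byte comparison of two values: -1, 0, or 1.
def byte_compare(a, b)
  a.bytes <=> b.bytes
end

puts cell_bytes('18').inspect
puts byte_compare('100', '89')
```

Here '100' compares lower than '89' because the first byte '1' is smaller than '8'. This is worth keeping in mind when filtering on numeric-looking values stored as strings.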

Time Stamp

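Each cell stores multiple versions of the same data. Versions are indexed by timestamp (millisecond precision), which is a 64-bit integer. Versions are sorted by timestamp in descending order, so the newest comes first.

Version cleanup: either keep the last n versions of the data, or keep the versions from a recent period (for example the last seven days).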
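The two cleanup policies can be illustrated with a toy in-memory model of one cell. This is a sketch only: the hash of timestamp to value and the method names are assumptions for illustration, not how HBase stores or purges versions internally.

```ruby
# versions: { timestamp_ms => value } for a single cell (hypothetical model).

# Newest version first, as versions are ordered by descending timestamp.
def newest_first(versions)
  versions.sort_by { |ts, _| -ts }
end

# Policy 1: keep only the last n versions.
def keep_last_n(versions, n)
  newest_first(versions).first(n).to_h
end

# Policy 2: keep versions no older than max_age_ms relative to now_ms.
def keep_newer_than(versions, now_ms, max_age_ms)
  versions.select { |ts, _| now_ms - ts <= max_age_ms }
end

versions = { 1000 => 'a', 3000 => 'c', 2000 => 'b' }
puts newest_first(versions).inspect
puts keep_last_n(versions, 2).inspect
puts keep_newer_than(versions, 3500, 1000).inspect
```

In the shell these policies correspond roughly to the VERSIONS and TTL column family attributes that appear in the create examples later on.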

III. HBase Shell Command List

1. Viewing the command list (when you forget, use help)

The help command lists all commands.

Usage 1: help

Usage 2: help "COMMAND"

Usage 3: help "COMMAND_GROUP"

Examples:

help

help "get"

help "ddl"

Common shell command groups and their commands:

Group name: general

Commands: processlist, status, table_help, version, whoami

Group name: ddl

Commands: alter, alter_async, alter_status, clone_table_schema, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, list_regions, locate_region, show_filters

Group name: namespace

Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables

Group name: dml

Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve

Group name: tools

Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, cleaner_chore_enabled, cleaner_chore_run, cleaner_chore_switch, clear_block_cache, clear_compaction_queues, clear_deadservers, clear_slowlog_responses, close_region, compact, compact_rs, compaction_state, compaction_switch, decommission_regionservers, flush, get_largelog_responses, get_slowlog_responses, hbck_chore_run, is_in_maintenance_mode, list_deadservers, list_decommissioned_regionservers, major_compact, merge_region, move, normalize, normalizer_enabled, normalizer_switch, recommission_regionserver, regioninfo, rit, snapshot_cleanup_enabled, snapshot_cleanup_switch, split, splitormerge_enabled, splitormerge_switch, stop_master, stop_regionserver, trace, unassign, wal_roll, zk_dump

Group name: replication

Commands: add_peer, append_peer_exclude_namespaces, append_peer_exclude_tableCFs, append_peer_namespaces, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, remove_peer_exclude_namespaces, remove_peer_exclude_tableCFs, remove_peer_namespaces, remove_peer_tableCFs, set_peer_bandwidth, set_peer_exclude_namespaces, set_peer_exclude_tableCFs, set_peer_namespaces, set_peer_replicate_all, set_peer_serial, set_peer_tableCFs, show_peer_tableCFs, update_peer_config

Group name: snapshots

Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, delete_table_snapshots, list_snapshots, list_table_snapshots, restore_snapshot, snapshot

Group name: configuration

Commands: update_all_config, update_config

Group name: quotas

Commands: disable_exceed_throttle_quota, disable_rpc_throttle, enable_exceed_throttle_quota, enable_rpc_throttle, list_quota_snapshots, list_quota_table_sizes, list_quotas, list_snapshot_sizes, set_quota

Group name: security

Commands: grant, list_security_capabilities, revoke, user_permission

Group name: procedures

Commands: list_locks, list_procedures

Group name: visibility labels

Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility

Group name: rsgroup

Commands: add_rsgroup, balance_rsgroup, get_rsgroup, get_server_rsgroup, get_table_rsgroup, list_rsgroups, move_namespaces_rsgroup, move_servers_namespaces_rsgroup, move_servers_rsgroup, move_servers_tables_rsgroup, move_tables_rsgroup, remove_rsgroup, remove_servers_rsgroup, rename_rsgroup

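2. Usage

Quote all names (tables, rows, columns) with single or double quotes, and separate arguments with commas.

Press Enter to run a command.

When creating or altering a table, use Ruby hash syntax: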

{'key1' => 'value1', 'key2' => 'value2', ...}

When a key is one of the keywords such as NAME, VERSIONS, or COMPRESSION, it does not need quotes.
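One way to see why the quotes are optional: in Ruby an unquoted capitalized word is a constant, and the shell predefines constants for these keywords. The sketch below is plain Ruby run outside the shell, so it defines stand-in constants itself; that they evaluate to strings of the same name is an assumption made for illustration.

```ruby
# Stand-ins for constants the HBase shell already defines (assumed values).
NAME     = 'NAME'
VERSIONS = 'VERSIONS'

# Build a column family spec using the unquoted keyword form.
def family_spec(name, versions)
  { NAME => name, VERSIONS => versions }
end

quoted = { 'NAME' => 'f1', 'VERSIONS' => 5 }
puts family_spec('f1', 5) == quoted
```

Under that assumption both spellings produce the same hash, which is why a form like {'NAME'=>'info'} is also accepted.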

To enter binary (non-printable) bytes, use double-quoted strings with hexadecimal or octal escapes, as follows:

hbase> get 't1', "key\x03\x3f\xcd"

hbase> get 't1', "key\003\023\011"

hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"
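The double quotes matter because Ruby only interprets backslash escapes inside double-quoted strings. A plain-Ruby sketch showing the bytes each literal produces (byte_values is a hypothetical helper):

```ruby
# Return the raw byte values of a string.
def byte_values(str)
  str.bytes
end

puts byte_values("key\x03\x3f\xcd").inspect  # hexadecimal escapes
puts byte_values("key\003\023\011").inspect  # octal escapes
puts byte_values('key\x03').length           # single quotes keep the backslash literally
```

With single quotes the text \x03 stays as four separate characters, so the row key would not contain the intended byte.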

IV. Sample Data Model

The exercises use the following table structure, column families, and columns.

Table structure

create 'student','info','score'

put 'student','1','info:name','zhang'

put 'student','1','info:age','18'

put 'student','1','info:sex','male'

put 'student','1','score:math','89'

put 'student','1','score:eng','91'

put 'student','1','score:phy','88'

put 'student','1','score:chem','99'
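The put commands above describe one logical row: a row key that maps column families to qualifiers to values. A plain-Ruby sketch that models the row as a nested hash and regenerates the same commands (STUDENT_1 and put_commands are names invented for this illustration):

```ruby
# Row '1' of the student table: family => { qualifier => value }.
STUDENT_1 = {
  'info'  => { 'name' => 'zhang', 'age' => '18', 'sex' => 'male' },
  'score' => { 'math' => '89', 'eng' => '91', 'phy' => '88', 'chem' => '99' }
}

# Produce one shell put command per cell.
def put_commands(table, row_key, families)
  families.flat_map do |family, columns|
    columns.map do |qualifier, value|
      "put '#{table}','#{row_key}','#{family}:#{qualifier}','#{value}'"
    end
  end
end

puts put_commands('student', '1', STUDENT_1)
```

Each cell needs its own put, which is why a single row with seven columns takes seven commands.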

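V. Common HBase Shell Commands

1. general

processlist: show the task list

processlist

status: show the cluster status

status

version: show the HBase version

version

whoami: show the current user

whoami

table_help: help for working with table references

table_help

Try it: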

hbase> t = create 't', 'cf'

hbase> t = get_table 't'

hbase> t.put 'r', 'cf:q', 'v'

hbase> t.scan

hbase> t.help 'scan'

hbase> t.enable

hbase> t.flush

hbase> t.disable

hbase> t.drop

2. ddl

create: create a table

help 'create' (the examples marked with '***' below are worth trying; drop the '***' marker when typing them)
Create a table with namespace=ns1 and table qualifier=t1

*** hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1

*** hbase> create 't2', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}

# The above in shorthand would be the following:

*** hbase> create 't3', 'f1', 'f2', 'f3'

hbase> create 't4', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}

Table configuration options can be put at the end.

Examples:

*** hbase> create 'ns1:t7', 'f1', SPLITS => ['10', '20', '30', '40']

hbase> create 't10', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'hadoop'

*** hbase> create 't11', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }

hbase> # Optionally pre-split the table into NUMREGIONS, using

hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)

hbase> create 't12', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}

hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}

hbase> create 't13', 'f1', {SPLIT_ENABLED => false, MERGE_ENABLED => false}

hbase> create 't14', {NAME => 'f1', DFS_REPLICATION => 1}

You can also keep around a reference to the created table:

*** hbase> t1 = create 't1', 'f1'
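The SPLITS => ['10', '20', '30', '40'] example above pre-splits the table into five regions. The following plain-Ruby sketch shows which region a row key would fall into, assuming each region covers the keys from one split point (inclusive) up to the next (exclusive). SPLIT_POINTS and region_index are hypothetical names, and the code only mimics the boundary rule.

```ruby
SPLIT_POINTS = %w[10 20 30 40]

# Region 0 has no lower bound; the last region has no upper bound.
# A key belongs to the region whose index equals the number of
# split points that are less than or equal to the key.
def region_index(row_key, splits)
  splits.count { |split| row_key >= split }
end

%w[05 10 15 45 5].each do |key|
  puts "#{key} -> region #{region_index(key, SPLIT_POINTS)}"
end
```

The key '5' lands in the last region, not the first, because keys compare lexicographically. Split points should therefore be chosen with the actual key format in mind.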

(1): create '<table>', '<family1>' [, '<family2>', ..., '<familyN>']

create 'student','info','score'

(2): create '<table>', {NAME => 'colFamilyName'} [, {NAME => 'colFamilyNameN'}]

create 'student',{'NAME'=>'info'},{'NAME'=>'score'}

alter: alter a table

help 'alter'

describe: show a table's description

describe 'student'

disable: disable a table

disable 'student'

drop: delete a table (the table must be disabled first)

drop 'student'

enable: enable a table

enable 'student'

exists: check whether a table exists

exists 'student'

get_table: get a reference to a table

t = get_table 'student'

t.put '1', 'info:name', 'zhang'

t.put '1','info:age','18'

t.put '1','info:sex','male'

t.put '1','score:math','89'

t.put '1','score:eng','91'

t.put '1','score:phy','88'

t.put '1','score:chem','99'

t.scan

list: list all user tables

list

list_regions: list the regions of a table

list_regions 'student'

show_filters: list the available filters

show_filters

3. namespace

create_namespace: create a namespace

help 'create_namespace'

create_namespace 'ns1'

create_namespace 'ns1', {'PROPERTY_NAME'=>'PROPERTY_VALUE'}

alter_namespace: change a namespace's properties

help 'alter_namespace'

alter_namespace 'ns1', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}

alter_namespace 'ns1', {METHOD => 'unset', NAME=>'PROPERTY_NAME'}

describe_namespace: show a namespace's description

describe_namespace 'ns1'

drop_namespace: delete a namespace (it must be empty)

drop_namespace 'ns2'

list_namespace: list all namespaces

list_namespace

list_namespace_tables: list the tables in a namespace

list_namespace_tables 'ns1'

4. dml

append: append to a cell's value

help 'append'

append 'student','2','info:name','li'

append 'student','2','info:age','19'

t=get_table 'student'

t.append '2','info:sex','female'
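Unlike put, which replaces the current value, append concatenates the new bytes onto whatever the cell already holds (or creates the cell if it is empty). A toy plain-Ruby model of that behavior, using an ordinary hash as a stand-in for the table (append_value is a hypothetical helper):

```ruby
# store: { [row, column] => value }, a stand-in for a table.
def append_value(store, row, column, value)
  key = [row, column]
  store[key] = (store[key] || '') + value
  store[key]
end

store = {}
puts append_value(store, '2', 'info:name', 'li')
puts append_value(store, '2', 'info:name', 'ng')
```

Running the same append twice therefore changes the stored value each time, which is something to keep in mind when retrying commands.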

count: count the rows in a table

help 'count'

count 'student'

count 'student', FILTER => "RowFilter(=, 'binary:1')"

count 'student', FILTER => "(RowFilter(=, 'binary:1')) AND (FamilyFilter(=,'substring:info'))"

delete: delete a cell value (ts1 stands for a timestamp in milliseconds)

hbase> delete 'student', '1', 'info:name'

hbase> delete 'student', '1', 'info:name',ts1

deleteall: delete all cells in a row, optionally limited to a column and timestamp

hbase> deleteall 'student', '1'

hbase> deleteall 'student', '1', 'info:name'

hbase> deleteall 'student', '1', 'info:name', ts1

get: read a row or specific cells

get 'student', '1'

get 'student', '1', {TIMERANGE => [1303668804000, 2303668904000]}

get 'student', '1', {COLUMN => 'info:name'}

get 'student', '1', {COLUMN => ['info:name', 'info:age', 'info:sex']}

get 'student', '1', {COLUMN => 'info:name', TIMESTAMP => 1303668804000}

get 'student', '1', {COLUMN => 'info:name', TIMERANGE => [1303668804000, 2303668904000], VERSIONS => 4}

get 'student', '1', {FILTER => "ValueFilter(=, 'binary:18')"}

get 'student', '1', 'info:name'

get 'student', '1', 'info:name', 'info:age', 'info:sex'

get 'student', '1', ['info:name', 'info:age', 'info:sex']

get 'student', '1', {COLUMN => ['info:name', 'info:age', 'info:sex'], ATTRIBUTES => {'mykey'=>'myvalue'}}

get 'student', '1', {COLUMN => ['info:name', 'info:age', 'info:sex'], AUTHORIZATIONS => ['PRIVATE','SECRET']}

get 'student', '1', {CONSISTENCY => 'TIMELINE'}

get 'student', '1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}
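The TIMESTAMP and TIMERANGE values in the examples above are milliseconds since the Unix epoch. A plain-Ruby sketch for converting between Time objects and such values, plus a range check that treats the lower bound as inclusive and the upper bound as exclusive (the helper names are hypothetical):

```ruby
# Convert a Time to epoch milliseconds using integer arithmetic.
def to_millis(time)
  time.to_i * 1000 + time.usec / 1000
end

# Convert epoch milliseconds back to a UTC Time.
def from_millis(ms)
  Time.at(ms / 1000, (ms % 1000) * 1000).utc
end

# True when min <= ts < max.
def in_timerange?(ts, min, max)
  ts >= min && ts < max
end

puts from_millis(1303668804000)
puts to_millis(Time.at(1303668804))
puts in_timerange?(1303668804000, 1303668804000, 2303668904000)
```

The sample lower bound 1303668804000 falls in April 2011, so a range starting there covers any data written with current timestamps.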

get_counter

Return a counter cell value at specified table/row/column coordinates.

A counter cell should be managed with atomic increment functions on HBase

and the data should be binary encoded (as long value). Example:

hbase> get_counter 'student', '1', 'info:c1'

The same commands also can be run on a table reference.

hbase> t.get_counter '1', 'info:c1'
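"Binary encoded (as long value)" means the counter cell holds an 8-byte big-endian signed integer, not the digits of the number as text. A plain-Ruby sketch of that encoding (encode_counter and decode_counter are hypothetical helpers, not shell commands):

```ruby
# Encode an integer as an 8-byte big-endian signed value.
def encode_counter(n)
  [n].pack('q>')
end

# Decode 8 big-endian bytes back into an integer.
def decode_counter(bytes)
  bytes.unpack('q>').first
end

puts encode_counter(10).bytes.inspect
puts decode_counter(encode_counter(10))
puts '10'.bytesize
```

A value written with put as the string '10' is two bytes long, so it is not a valid counter. Counter cells are best created and updated with incr.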

get_splits

get_splits 't1'

incr

Increments a cell 'value' at specified table/row/column coordinates.

To increment a cell value in table 'ns1:t1' or 't1' at row 'r1' under column

'c1' by 1 (can be omitted) or 10 do:

hbase> incr 'student', '1', 'info:c1'

hbase> incr 'student', '1', 'info:c1', 1

hbase> incr 'student', '1', 'info:c1', 10

hbase> incr 'student', '1', 'info:c1', 10, {ATTRIBUTES=>{'mykey'=>'myvalue'}}

hbase> incr 'student', '1', 'info:c1', {ATTRIBUTES=>{'mykey'=>'myvalue'}}

hbase> incr 'student', '1', 'info:c1', 10, {VISIBILITY=>'PRIVATE|SECRET'}

The same commands also can be run on a table reference.

hbase> t.incr '1', 'info:c1'

hbase> t.incr '1', 'info:c1', 1

hbase> t.incr '1', 'info:c1', 10, {ATTRIBUTES=>{'mykey'=>'myvalue'}}

hbase> t.incr '1', 'info:c1', 10, {VISIBILITY=>'PRIVATE|SECRET'}

put: write a value

help 'put'

put 'student','3','info:name','zhao'

put 'student','3','info:age','18'

t=get_table 'student'

t.put '3','info:sex','male'

t.scan

scan: scan a table

help 'scan'

Scan a table; pass table name and optionally a dictionary of scanner

specifications. Scanner specifications may include one or more of:

TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW,

ROWPREFIXFILTER, TIMESTAMP,

MAXLENGTH, COLUMNS, CACHE, RAW, VERSIONS, ALL_METRICS, METRICS,

REGION_REPLICA_ID, ISOLATION_LEVEL, READ_TYPE,

ALLOW_PARTIAL_RESULTS,BATCH or MAX_RESULT_SIZE

Some examples:

scan 'student'

scan 'student', {COLUMNS => 'info:name'}

scan 'student', {COLUMNS => [ 'info:name', 'info:age'], LIMIT => 10, STARTROW => '2'}

scan 'student', {COLUMNS => [ 'info:name', 'info:age'], TIMERANGE => [1303668804000, 2303668904000]}

scan 'student', {REVERSED => true}

scan 'student', {ALL_METRICS => true}

scan 'student', {METRICS => ['RPC_RETRIES', 'ROWS_FILTERED']}

scan 'student', {ROWPREFIXFILTER => '2', FILTER => "(QualifierFilter (>=, 'binary:name')) "}

scan 'student', {FILTER =>org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}

scan 'student', {CONSISTENCY => 'TIMELINE'}

scan 'student', {ISOLATION_LEVEL => 'READ_UNCOMMITTED'}

scan 'student', {MAX_RESULT_SIZE => 1}

scan 'student', { COLUMNS => [ 'info:name', 'info:age'], ATTRIBUTES => {'mykey' => 'myvalue'}}

scan 'student', { COLUMNS => [ 'info:name', 'info:age'], AUTHORIZATIONS => ['PRIVATE','SECRET']}

t=get_table 'student'

t.scan
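The effect of STARTROW, STOPROW, ROWPREFIXFILTER, and LIMIT on which rows come back can be mimicked over a sorted list of row keys. This plain-Ruby sketch assumes STARTROW is inclusive and STOPROW is exclusive; scan_keys and its keyword arguments are invented for illustration and only model row selection, not columns or versions.

```ruby
# Return the row keys a scan would visit, in lexicographic order.
def scan_keys(row_keys, startrow: nil, stoprow: nil, prefix: nil, limit: nil)
  result = row_keys.sort
  result = result.select { |k| k >= startrow } if startrow
  result = result.select { |k| k < stoprow } if stoprow
  result = result.select { |k| k.start_with?(prefix) } if prefix
  result = result.first(limit) if limit
  result
end

keys = %w[1 2 3 10 21]
puts scan_keys(keys).inspect
puts scan_keys(keys, startrow: '2', limit: 2).inspect
puts scan_keys(keys, startrow: '1', stoprow: '2').inspect
puts scan_keys(keys, prefix: '2').inspect
```

Because keys sort lexicographically, a scan starting at '2' returns 2, 21, 3 and never sees 10.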

truncate

truncate 't1'

Fun with Big Data, Part 22: HBase Shell
