40. Time To Live (TTL)
ColumnFamilies can set a TTL length in seconds, and HBase will automatically delete rows once the expiration time is reached. This applies to all versions of a row, even the current one. The TTL time encoded in HBase for the row is specified in UTC.
Store files which contain only expired rows are deleted on minor compaction. Setting hbase.store.delete.expired.storefile to false disables this feature. Setting the minimum number of versions to a value other than 0 also disables this. See HColumnDescriptor for more information.
Recent versions of HBase also support setting time to live on a per cell basis. See HBASE-10560 for more information. Cell TTLs are submitted as an attribute on mutation requests (Appends, Increments, Puts, etc.) using Mutation#setTTL. If the TTL attribute is set, it will be applied to all cells updated on the server by the operation. There are two notable differences between cell TTL handling and ColumnFamily TTLs:
Cell TTLs are expressed in units of milliseconds instead of seconds.
A cell TTL cannot extend the effective lifetime of a cell beyond a ColumnFamily level TTL setting.
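The two differences above can be modelled with a little shell arithmetic. This is only a sketch: effective_ttl_ms is a hypothetical helper, not an HBase API. It assumes the ColumnFamily TTL is given in seconds (or as FOREVER) and the cell TTL in milliseconds, and it returns whichever lifetime is shorter.

```bash
#!/bin/bash
# Hypothetical helper (not part of HBase): combines a ColumnFamily TTL (seconds)
# with a cell TTL (milliseconds). The shorter of the two lifetimes wins.
effective_ttl_ms() {
    local cf_ttl_s=$1 cell_ttl_ms=$2
    if [ "$cf_ttl_s" = "FOREVER" ]; then
        # No family-level limit, so the cell TTL applies as-is.
        echo "$cell_ttl_ms"
        return
    fi
    local cf_ttl_ms=$(( cf_ttl_s * 1000 ))
    if [ "$cell_ttl_ms" -lt "$cf_ttl_ms" ]; then
        echo "$cell_ttl_ms"
    else
        echo "$cf_ttl_ms"
    fi
}

effective_ttl_ms 600 5000       # cell TTL is shorter: prints 5000
effective_ttl_ms 600 900000     # cell TTL cannot outlive the family TTL: prints 600000
effective_ttl_ms FOREVER 5000   # prints 5000
```

The second call shows the unit trap: 900000 is 15 minutes in milliseconds, which exceeds the 600-second family TTL, so the family TTL still bounds the cell's lifetime.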
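The content above comes from the official Apache HBase site and is included for reference. The rest of this post tries it out in practice.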
Create the table:
create 'dc:event',{NAME => 'f1'},{NAME => 'cf'},{NAME => 'f2'}
View the table structure:
desc "dc:event"
'dc:event',
{NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
{NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
{NAME => 'f2', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
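When only the TTL of each column family matters, the describe output can be reduced to one "family TTL" pair per line. The sketch below is a hypothetical helper, not an HBase tool. It assumes the output has the shape shown above (one {NAME => ..., TTL => ...} block per family, no nested braces) and relies on grep and sed being available.

```bash
#!/bin/bash
# Hypothetical helper: reads `desc` output on stdin and prints "<family> <TTL>"
# for every column family block. Assumes blocks contain no nested braces.
family_ttls() {
    grep -o "{NAME => '[^']*'[^}]*}" |
        sed -E "s/.*NAME => '([^']*)'.*TTL => '([^']*)'.*/\1 \2/"
}

# Sample input shortened from the describe output above.
printf '%s\n' "'dc:event', {NAME => 'cf', VERSIONS => '1', TTL => 'FOREVER', BLOCKSIZE => '65536'}, {NAME => 'f1', VERSIONS => '1', TTL => 'FOREVER', BLOCKSIZE => '65536'}" | family_ttls
```

Saving this summary before altering a table gives a record of the previous TTL values in case they need to be restored.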
Put data:
put 'dc:event','866925023233621','f1:eventid','866925023233621'
put 'dc:event','866925023233622','f1:eventid','866925023233621'
put 'dc:event','866925023233623','f1:eventid','866925023233621'
put 'dc:event','866925023233624','f1:eventid','866925023233621'
put 'dc:event','866925023233625','f1:eventid','866925023233621'
put 'dc:event','866925023233626','f1:eventid','866925023233621'
put 'dc:event','866925023233627','f1:eventid','866925023233621'
put 'dc:event','866925023233628','f1:eventid','866925023233621'
put 'dc:event','866925023233629','f1:eventid','866925023233621'
put 'dc:event','866925023233630','f1:eventid','866925023233621'
put 'dc:event','8669250232336-21','cf:eventid','866925023233621'
put 'dc:event','8669250232336-22','cf:eventid','866925023233621'
put 'dc:event','8669250232336-23','cf:eventid','866925023233621'
put 'dc:event','8669250232336-24','cf:eventid','866925023233621'
put 'dc:event','8669250232336-25','cf:eventid','866925023233621'
put 'dc:event','8669250232336-26','cf:eventid','866925023233621'
put 'dc:event','8669250232336-27','cf:eventid','866925023233621'
put 'dc:event','8669250232336-28','cf:eventid','866925023233621'
put 'dc:event','8669250232336-29','cf:eventid','866925023233621'
put 'dc:event','8669250232336-30','cf:eventid','866925023233621'
put 'dc:event','866925023233-6-21','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-22','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-23','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-24','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-25','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-26','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-27','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-28','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-29','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-30','f2:eventid','866925023233621'
scan 'dc:event'
hbase(main):048:0> scan 'dc:event'
ROW COLUMN+CELL
866925023233-6-21 column=f2:eventid, timestamp=1536805384815, value=866925023233621
866925023233-6-22 column=f2:eventid, timestamp=1536805384873, value=866925023233621
866925023233-6-23 column=f2:eventid, timestamp=1536805384881, value=866925023233621
866925023233-6-24 column=f2:eventid, timestamp=1536805384890, value=866925023233621
866925023233-6-25 column=f2:eventid, timestamp=1536805384898, value=866925023233621
866925023233-6-26 column=f2:eventid, timestamp=1536805384907, value=866925023233621
866925023233-6-27 column=f2:eventid, timestamp=1536805384922, value=866925023233621
866925023233-6-28 column=f2:eventid, timestamp=1536805384936, value=866925023233621
866925023233-6-29 column=f2:eventid, timestamp=1536805384946, value=866925023233621
866925023233-6-30 column=f2:eventid, timestamp=1536805384958, value=866925023233621
8669250232336-21 column=cf:eventid, timestamp=1536805310816, value=866925023233621
8669250232336-22 column=cf:eventid, timestamp=1536805310850, value=866925023233621
8669250232336-23 column=cf:eventid, timestamp=1536805310861, value=866925023233621
8669250232336-24 column=cf:eventid, timestamp=1536805310870, value=866925023233621
8669250232336-25 column=cf:eventid, timestamp=1536805310881, value=866925023233621
8669250232336-26 column=cf:eventid, timestamp=1536805310890, value=866925023233621
8669250232336-27 column=cf:eventid, timestamp=1536805310911, value=866925023233621
8669250232336-28 column=cf:eventid, timestamp=1536805310918, value=866925023233621
8669250232336-29 column=cf:eventid, timestamp=1536805310930, value=866925023233621
8669250232336-30 column=cf:eventid, timestamp=1536805310937, value=866925023233621
866925023233621 column=f1:eventid, timestamp=1536805258985, value=866925023233621
866925023233622 column=f1:eventid, timestamp=1536805259053, value=866925023233621
866925023233623 column=f1:eventid, timestamp=1536805259060, value=866925023233621
866925023233624 column=f1:eventid, timestamp=1536805259070, value=866925023233621
866925023233625 column=f1:eventid, timestamp=1536805259078, value=866925023233621
866925023233626 column=f1:eventid, timestamp=1536805259084, value=866925023233621
866925023233627 column=f1:eventid, timestamp=1536805259112, value=866925023233621
866925023233628 column=f1:eventid, timestamp=1536805259119, value=866925023233621
866925023233629 column=f1:eventid, timestamp=1536805259127, value=866925023233621
866925023233630 column=f1:eventid, timestamp=1536805259143, value=866925023233621
30 row(s) in 0.0920 seconds
Next, set a TTL of 600 seconds on cf and f1, then scan the table again once the TTL has elapsed:
1. disable 'dc:event'
2. alter "dc:event", NAME => 'cf', TTL => 600
   alter "dc:event", NAME => 'f1', TTL => 600
3. enable 'dc:event'
4. scan 'dc:event'
ROW COLUMN+CELL
866925023233-6-21 column=f2:eventid, timestamp=1536805384815, value=866925023233621
866925023233-6-22 column=f2:eventid, timestamp=1536805384873, value=866925023233621
866925023233-6-23 column=f2:eventid, timestamp=1536805384881, value=866925023233621
866925023233-6-24 column=f2:eventid, timestamp=1536805384890, value=866925023233621
866925023233-6-25 column=f2:eventid, timestamp=1536805384898, value=866925023233621
866925023233-6-26 column=f2:eventid, timestamp=1536805384907, value=866925023233621
866925023233-6-27 column=f2:eventid, timestamp=1536805384922, value=866925023233621
866925023233-6-28 column=f2:eventid, timestamp=1536805384936, value=866925023233621
866925023233-6-29 column=f2:eventid, timestamp=1536805384946, value=866925023233621
866925023233-6-30 column=f2:eventid, timestamp=1536805384958, value=866925023233621
10 row(s) in 0.0740 seconds
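Of the table's three column families (cf, f1 and f2), a TTL was set only on cf and f1. Once the TTL elapsed, the data in cf and f1 was cleared automatically and no longer appears in the scan, while the data in f2, which has no TTL set, is still there.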
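The expiry time of a cell can be worked out from the timestamp printed by scan. The sketch below uses two hypothetical helpers, expiry_ms and is_expired. It assumes timestamps are epoch milliseconds, the TTL is in seconds, and a cell counts as expired once the current time is strictly later than timestamp plus TTL; the exact boundary handling is an assumption, not taken from the HBase source.

```bash
#!/bin/bash
# Hypothetical helpers: model when a cell expires under a ColumnFamily TTL.
# ts_ms  : cell timestamp in epoch milliseconds (as printed by scan)
# ttl_s  : ColumnFamily TTL in seconds
expiry_ms() {
    local ts_ms=$1 ttl_s=$2
    echo $(( ts_ms + ttl_s * 1000 ))
}

# Succeeds when now_ms is later than the computed expiry time.
is_expired() {
    local ts_ms=$1 ttl_s=$2 now_ms=$3
    [ "$now_ms" -gt "$(expiry_ms "$ts_ms" "$ttl_s")" ]
}

# First f1 row from the scan output, with TTL => 600.
expiry_ms 1536805258985 600     # prints 1536805858985

now_ms=$(( $(date +%s) * 1000 ))
if is_expired 1536805258985 600 "$now_ms"; then
    echo "expired"
fi
```

Because the TTL is measured from the cell timestamp, rows written before the alter start their 600 seconds at their original write time, not at the time the TTL was set.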
Comparison of the table's TTL before and after the change:
Shell script for changing HBase TTLs:
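#!/bin/bash -l
# Decide up front whether this step needs a rollback plan.
# If it does, check the TTL of the corresponding production tables first:
# the TTL setting can be restored, but data that has already expired cannot.
WB_DIR=$(cd "$(dirname "$0")" && pwd)
HBASE_NAMESPACE='hochoy'
origin_tables="tabTest1 tabTest2 tabTest3"
alter_ttl="alter_hbase.script"

# Convert a number of years (fractions allowed) into whole seconds,
# counting 365 days per year.
get_ttl_value(){
    years=${1}
    ttl=$(echo "scale = 0; 60 * 60 * 24 * 365 * ${years}" | bc)
    # Drop any fractional part left by bc.
    echo "${ttl%.*}"
}

# Generate an HBase shell script that sets the TTL of column family 'f'
# on every table in origin_tables.
gen_alt_script(){
    ttl=${1}
    echo '' > "${WB_DIR}/${alter_ttl}"
    for table in ${origin_tables}
    do
        echo "desc '${HBASE_NAMESPACE}:${table}'" >> "${WB_DIR}/${alter_ttl}"
        echo "disable '${HBASE_NAMESPACE}:${table}'" >> "${WB_DIR}/${alter_ttl}"
        echo "alter '${HBASE_NAMESPACE}:${table}', {NAME=>'f',TTL=>${ttl} }" >> "${WB_DIR}/${alter_ttl}"
        echo "enable '${HBASE_NAMESPACE}:${table}'" >> "${WB_DIR}/${alter_ttl}"
        echo "desc '${HBASE_NAMESPACE}:${table}'" >> "${WB_DIR}/${alter_ttl}"
    done
    echo "exit" >> "${WB_DIR}/${alter_ttl}"
}

# The single argument is either a number of years or the literal FOREVER.
if [ $# -lt 1 ]; then
    echo "Usage:
    Input value of TTL please!
    "
    exit 1
fi

if [ "${1}" = "FOREVER" ]; then
    gen_alt_script FOREVER
else
    ttl=$(get_ttl_value "${1}")
    gen_alt_script "${ttl}"
fi

# Show the generated script, then run it in the HBase shell.
cat "${WB_DIR}/${alter_ttl}"
hbase shell "${WB_DIR}/${alter_ttl}"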
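The script passes its first argument straight to bc, so a mistyped value would end up in the generated alter statements. A small guard can reject bad input before anything is written. The sketch below is an assumption about how such a check could look: is_valid_ttl_arg is a hypothetical function, not part of the original script. It accepts FOREVER or a positive number of years such as 1 or 0.5.

```bash
#!/bin/bash
# Hypothetical pre-check for the script's first argument.
# Accepts the literal FOREVER or a positive decimal number of years.
is_valid_ttl_arg() {
    case "$1" in
        FOREVER) return 0 ;;
        ''|*[!0-9.]*|*.*.*|.) return 1 ;;   # empty, non-numeric, or malformed
    esac
    # Reject values that contain only zeros, such as 0 or 0.0.
    case "$1" in
        *[1-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

for arg in 1 0.5 FOREVER 0 abc 1.2.3; do
    if is_valid_ttl_arg "$arg"; then
        echo "$arg: ok"
    else
        echo "$arg: rejected"
    fi
done
```

In the script it could be called right after the argument-count check, exiting with a non-zero status when the value is rejected.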