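A semester has gone by, and I've read a fair amount of HBase source code.
I'm using the break to write a short summary and deepen my understanding.
The version I read is 0.98.9, the latest stable release. (0.99 is actually already out, probably still in testing... As an aside, a senior labmate of mine is really impressive: he fixed a bug in HBase, so he should be listed among the contributors of the next version. I'm envious, and I want to learn from him!)
I won't paste much code. Some classes run to several thousand lines (this interface is fairly short, so it can be pasted...), and probably nobody besides us researchers would read through them; blogging is, after all, fast-food-style sharing and learning. I also can't write as well as the experts do, with polished text and diagrams. A student's energy is limited, so I take brief notes and then go back to reading more code. Please go easy on me. Comments and discussion are welcome.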
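Cell is the most basic unit of storage in HBase. It carries the following information: row, column family, column qualifier, timestamp, type, MVCC version, value, and tags.
A Cell's uniqueness is determined by the combination of five of these: row, column family, column qualifier, timestamp, and type.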
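To make the uniqueness rule concrete, here is a minimal sketch of an identity key built from those five fields. ToyCellKey and ToyType are hypothetical names made up for this post, not HBase classes, and strings are used instead of byte arrays only to keep it short.

```java
import java.util.Objects;

// Hypothetical illustration only: not an HBase class.
final class ToyCellKey {
    // Assumed, simplified stand-in for the cell type.
    enum ToyType { PUT, DELETE }

    final String row;
    final String family;
    final String qualifier;
    final long timestamp;
    final ToyType type;

    ToyCellKey(String row, String family, String qualifier, long timestamp, ToyType type) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
        this.type = type;
    }

    // Two keys are the same cell only if all five fields match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ToyCellKey)) return false;
        ToyCellKey k = (ToyCellKey) o;
        return row.equals(k.row)
                && family.equals(k.family)
                && qualifier.equals(k.qualifier)
                && timestamp == k.timestamp
                && type == k.type;
    }

    @Override
    public int hashCode() {
        return Objects.hash(row, family, qualifier, timestamp, type);
    }

    public static void main(String[] args) {
        ToyCellKey a = new ToyCellKey("r1", "cf", "URI", 1L, ToyType.PUT);
        ToyCellKey b = new ToyCellKey("r1", "cf", "Parser", 1L, ToyType.PUT);
        ToyCellKey c = new ToyCellKey("r1", "cf", "URI", 1L, ToyType.PUT);
        System.out.println(a.equals(b)); // false: the qualifier differs
        System.out.println(a.equals(c)); // true: all five fields match
    }
}
```

Note that the value is deliberately left out of the key: it is the payload, not part of the identity.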
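row is the well-known row value, i.e. the row key;
column family: I think of it as simply a column;
column qualifier is also a column: the family is a group, and a qualifier is a sub-column inside that group. That is how I understand it.
The column qualifier is optional, but the column family is required. In other words, when creating a table or inserting/updating data, the qualifier can be left out, but a column family must be specified.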
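timestamp is a time stamp. As I understand it, it works as a kind of concurrency protocol, the timestamp protocol, whose purpose is to avoid heavy-handed locking and mainly to keep data operations atomic. By default the timestamp is the current time. Cells with a newer timestamp sort first, so that the data a reader sees is the latest; this is the defining trait of the timestamp protocol.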
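The "newest first" ordering can be sketched with a small comparator. This is only an illustration under my own assumptions: ToyKey and NewestFirstDemo are hypothetical names, not HBase's CellComparator. It follows what the interface's Javadoc describes: compare row, family, and qualifier byte by byte, then treat the greater timestamp as the lesser value.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only: not HBase's CellComparator.
final class NewestFirstDemo {

    // Toy stand-in for the key part of a cell.
    static final class ToyKey {
        final byte[] row;
        final byte[] family;
        final byte[] qualifier;
        final long timestamp;

        ToyKey(String row, String family, String qualifier, long timestamp) {
            this.row = row.getBytes(StandardCharsets.UTF_8);
            this.family = family.getBytes(StandardCharsets.UTF_8);
            this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
            this.timestamp = timestamp;
        }
    }

    static final Comparator<ToyKey> ORDER = (a, b) -> {
        // Byte-wise (unsigned, lexicographic) comparison; Arrays.compareUnsigned needs Java 9+.
        int c = Arrays.compareUnsigned(a.row, b.row);
        if (c != 0) return c;
        c = Arrays.compareUnsigned(a.family, b.family);
        if (c != 0) return c;
        c = Arrays.compareUnsigned(a.qualifier, b.qualifier);
        if (c != 0) return c;
        // Arguments reversed on purpose: the larger (newer) timestamp sorts first.
        return Long.compare(b.timestamp, a.timestamp);
    };

    // Sorts keys that differ only in timestamp and returns the timestamps in sorted order.
    static List<Long> sortedTimestamps(long... timestamps) {
        List<ToyKey> keys = new ArrayList<>();
        for (long ts : timestamps) {
            keys.add(new ToyKey("r1", "cf", "q", ts));
        }
        keys.sort(ORDER);
        List<Long> out = new ArrayList<>();
        for (ToyKey k : keys) {
            out.add(k.timestamp);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortedTimestamps(100L, 300L, 200L)); // [300, 200, 100]
    }
}
```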
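type is the operation type. Generally only write operations are assigned a type; reads have none. Typical types are "put", "delete", and similar operations that modify data.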
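MVCC version: multi-version concurrency control. Because the data exists in multiple versions, every read is guaranteed to get a result without having to wait on a lock, and what it gets is the newest data visible to it. (The data may be in the middle of being modified, but since that change has not been committed, it will not be seen.)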
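Here is a toy model of that visibility rule. It is my own simplified, single-threaded sketch and not HBase's implementation: ToyMvccStore, begin, commit, and read are all hypothetical names. Each write gets an increasing sequence number, and a reader only sees versions at or below the "read point", which advances only once every earlier write has committed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, single-threaded toy: not HBase's MVCC implementation.
final class ToyMvccStore {

    private static final class Version {
        final long seqId;
        final String value;

        Version(long seqId, String value) {
            this.seqId = seqId;
            this.value = value;
        }
    }

    private final List<Version> versions = new ArrayList<>(); // appended in seqId order
    private final Set<Long> committed = new HashSet<>();
    private long nextSeqId = 1;
    private long readPoint = 0; // every seqId <= readPoint is committed

    // Stage a write. It is stored right away but stays invisible until committed.
    long begin(String value) {
        long id = nextSeqId++;
        versions.add(new Version(id, value));
        return id;
    }

    // Mark a write as committed and advance the read point over the contiguous committed prefix.
    void commit(long seqId) {
        committed.add(seqId);
        while (committed.contains(readPoint + 1)) {
            readPoint++;
        }
    }

    // Return the newest visible value, or null if nothing is visible yet. Never blocks.
    String read() {
        String latest = null;
        for (Version v : versions) {
            if (v.seqId <= readPoint) {
                latest = v.value;
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        ToyMvccStore store = new ToyMvccStore();
        long w1 = store.begin("v1");
        store.commit(w1);
        long w2 = store.begin("v2");      // in progress, not committed
        System.out.println(store.read()); // v1
        store.commit(w2);
        System.out.println(store.read()); // v2
    }
}
```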
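value is the value stored in the cell; this one is easy to understand.
tags: the number of tags is not fixed, so a single cell may carry many tags. Tags are also optional. (Figure from the internet.)
For example, in the figure above, URI and Parser are both qualifiers, and whatever is to the right of the = sign is the value; everything else is labeled, so I won't go over it again.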
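In the source code, Cell is just an interface: it declares methods and provides no implementation (as I'm sure everyone knows). So which methods does it have? (I won't explain the deprecated ones.)
For each of row, column family, column qualifier, value, and tags, Cell provides three corresponding accessors of the form getXxxArray(), getXxxOffset(), and getXxxLength().
The first returns the backing array, which may hold more than just that field, and the third returns the field's length in bytes. The second returns the index in the array where the field's bytes begin. For example, if the first byte of the row sits at index 5 of the array, then getRowOffset() returns 5.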
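A small sketch of how the array/offset/length triple is meant to be read. The backing array and the slice helper here are made up for illustration (OffsetLengthDemo is a hypothetical name); with a real Cell the three numbers would come from getRowArray(), getRowOffset(), and getRowLength().

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of the array/offset/length convention.
final class OffsetLengthDemo {

    // Copies `length` bytes starting at `offset` out of a shared backing array.
    static byte[] slice(byte[] array, int offset, int length) {
        return Arrays.copyOfRange(array, offset, offset + length);
    }

    public static void main(String[] args) {
        // Five bytes of unrelated data, then the row "row-1", then the family "cf".
        byte[] backing = "-----row-1cf".getBytes(StandardCharsets.US_ASCII);
        int rowOffset = 5; // index of the first row byte
        int rowLength = 5; // number of row bytes

        byte[] row = slice(backing, rowOffset, rowLength);
        System.out.println(new String(row, StandardCharsets.US_ASCII)); // row-1
    }
}
```

The point of the convention is that several fields can share one array and be read in place; copying, as slice does here, is the expensive path.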
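There are also getTimestamp(), getTypeByte(), and getSequenceId().
Huh? What is that third one? According to the interface's Javadoc, the sequenceId is a region-specific, unique, monotonically increasing sequence ID given to each Cell, and getSequenceId() replaces the deprecated getMvccVersion(). It always exists for cells in the memstore but is not retained forever.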
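Because Cell is the most basic, you could say lowest-level, physical storage format, all of the information above is stored in units of one byte. The corresponding arrays are therefore byte arrays.
By the way, the array length of each field is limited.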
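The maximum row length is Short.MAX_VALUE: 32,767 bytes. (Isn't the unit a byte? Why Short? It just means the row length itself is counted in two bytes, nothing more.)
The maximum family length is Byte.MAX_VALUE: 127 bytes.
The maximum qualifier length is Short.MAX_VALUE: 32,767 bytes.
The maximum value length is Integer.MAX_VALUE: 2,147,483,647 bytes. (The Javadoc in this version says 2,147,483,648, which is off by one.)
timestamp and sequenceId are both at most Long.MAX_VALUE (too many digits to list; it is 2^63 - 1).
type is a single byte, so it is at most Byte.MAX_VALUE.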
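To see why a length counted in two bytes caps the row at Short.MAX_VALUE, here is a minimal sketch of a two-byte length prefix. It only illustrates the general idea, not the exact on-disk layout; RowLengthPrefixDemo and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of a two-byte length prefix; not the exact HBase layout.
final class RowLengthPrefixDemo {

    // Writes the row length as a signed 2-byte short, followed by the row bytes.
    static byte[] withLengthPrefix(byte[] row) {
        if (row.length > Short.MAX_VALUE) {
            throw new IllegalArgumentException(
                    "row is " + row.length + " bytes, max is " + Short.MAX_VALUE);
        }
        ByteBuffer buf = ByteBuffer.allocate(Short.BYTES + row.length);
        buf.putShort((short) row.length);
        buf.put(row);
        return buf.array();
    }

    // Reads the 2-byte length back from the front of the encoded array.
    static short readLength(byte[] encoded) {
        return ByteBuffer.wrap(encoded).getShort();
    }

    public static void main(String[] args) {
        byte[] encoded = withLengthPrefix(new byte[] {'a', 'b', 'c'});
        System.out.println(encoded.length);      // 5: two length bytes plus three row bytes
        System.out.println(readLength(encoded)); // 3
    }
}
```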
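(A quick review of the Java primitive types while we're at it:
byte 1 byte, short 2 bytes, int 4 bytes, long 8 bytes, float 4 bytes, double 8 bytes, char 2 bytes)
To compute the maximum of a signed integer type with n bits, halve 2^n to make room for the negative numbers and subtract one: 2^(n-1) - 1.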
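The 2^(n-1) - 1 rule can be checked against the constants in the JDK with a few lines. SignedMaxDemo and maxSigned are hypothetical names used only for this check.

```java
// Hypothetical helper that checks the 2^(n-1) - 1 rule against the JDK constants.
final class SignedMaxDemo {

    // Largest value of a signed two's-complement integer with the given number of bits (1..64).
    static long maxSigned(int bits) {
        // For bits == 64 the shift yields Long.MIN_VALUE, and subtracting 1 wraps to Long.MAX_VALUE.
        return (1L << (bits - 1)) - 1;
    }

    public static void main(String[] args) {
        System.out.println(maxSigned(Byte.SIZE) == Byte.MAX_VALUE);       // true (127)
        System.out.println(maxSigned(Short.SIZE) == Short.MAX_VALUE);     // true (32,767)
        System.out.println(maxSigned(Integer.SIZE) == Integer.MAX_VALUE); // true (2,147,483,647)
        System.out.println(maxSigned(Long.SIZE) == Long.MAX_VALUE);       // true
    }
}
```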
OK. That's all.
2015/01/24
from Reid Chan
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.hbase;
import org.apache.hadoop.hbase.classification.InterfaceAudience;
import org.apache.hadoop.hbase.classification.InterfaceStability;
/**
* The unit of storage in HBase consisting of the following fields:
*
* 1) row
* 2) column family
* 3) column qualifier
* 4) timestamp
* 5) type
* 6) MVCC version
* 7) value
*
*
* Uniqueness is determined by the combination of row, column family, column qualifier,
* timestamp, and type.
*
* The natural comparator will perform a bitwise comparison on row, column family, and column
* qualifier. Less intuitively, it will then treat the greater timestamp as the lesser value with
* the goal of sorting newer cells first.
*
* This interface should not include methods that allocate new byte[]'s such as those used in client
* or debugging code. These users should use the methods found in the {@link CellUtil} class.
* Currently for to minimize the impact of existing applications moving between 0.94 and 0.96, we
* include the costly helper methods marked as deprecated.
*
* Cell implements Comparable which is only meaningful when comparing to other keys in the
* same table. It uses CellComparator which does not work on the -ROOT- and hbase:meta tables.
*
* In the future, we may consider adding a boolean isOnHeap() method and a getValueBuffer() method
* that can be used to pass a value directly from an off-heap ByteBuffer to the network without
* copying into an on-heap byte[].
*
* Historic note: the original Cell implementation (KeyValue) requires that all fields be encoded as
* consecutive bytes in the same byte[], whereas this interface allows fields to reside in separate
* byte[]'s.
*
*/
@InterfaceAudience.Public
@InterfaceStability.Evolving
public interface Cell {
//1) Row
/**
* Contiguous raw bytes that may start at any index in the containing array. Max length is
* Short.MAX_VALUE which is 32,767 bytes.
* @return The array containing the row bytes.
*/
byte[] getRowArray();
/**
* @return Array index of first row byte
*/
int getRowOffset();
/**
* @return Number of row bytes. Must be < rowArray.length - offset.
*/
short getRowLength();
//2) Family
/**
* Contiguous bytes composed of legal HDFS filename characters which may start at any index in the
* containing array. Max length is Byte.MAX_VALUE, which is 127 bytes.
* @return the array containing the family bytes.
*/
byte[] getFamilyArray();
/**
* @return Array index of first family byte
*/
int getFamilyOffset();
/**
* @return Number of family bytes. Must be < familyArray.length - offset.
*/
byte getFamilyLength();
//3) Qualifier
/**
* Contiguous raw bytes that may start at any index in the containing array. Max length is
* Short.MAX_VALUE which is 32,767 bytes.
* @return The array containing the qualifier bytes.
*/
byte[] getQualifierArray();
/**
* @return Array index of first qualifier byte
*/
int getQualifierOffset();
/**
* @return Number of qualifier bytes. Must be < qualifierArray.length - offset.
*/
int getQualifierLength();
//4) Timestamp
/**
* @return Long value representing time at which this cell was "Put" into the row. Typically
* represents the time of insertion, but can be any value from 0 to Long.MAX_VALUE.
*/
long getTimestamp();
//5) Type
/**
* @return The byte representation of the KeyValue.TYPE of this cell: one of Put, Delete, etc
*/
byte getTypeByte();
//6) MvccVersion
/**
* @deprecated as of 1.0, use {@link Cell#getSequenceId()}
*
* Internal use only. A region-specific sequence ID given to each operation. It always exists for
* cells in the memstore but is not retained forever. It may survive several flushes, but
* generally becomes irrelevant after the cell's row is no longer involved in any operations that
* require strict consistency.
* @return mvccVersion (always >= 0 if exists), or 0 if it no longer exists
*/
@Deprecated
long getMvccVersion();
/**
* A region-specific unique monotonically increasing sequence ID given to each Cell. It always
* exists for cells in the memstore but is not retained forever. It will be kept for
* {@link HConstants#KEEP_SEQID_PERIOD} days, but generally becomes irrelevant after the cell's
* row is no longer involved in any operations that require strict consistency.
* @return seqId (always > 0 if exists), or 0 if it no longer exists
*/
long getSequenceId();
//7) Value
/**
* Contiguous raw bytes that may start at any index in the containing array. Max length is
* Integer.MAX_VALUE which is 2,147,483,648 bytes.
* @return The array containing the value bytes.
*/
byte[] getValueArray();
/**
* @return Array index of first value byte
*/
int getValueOffset();
/**
* @return Number of value bytes. Must be < valueArray.length - offset.
*/
int getValueLength();
/**
* @return the tags byte array
*/
byte[] getTagsArray();
/**
* @return the first offset where the tags start in the Cell
*/
int getTagsOffset();
/**
* @return the total length of the tags in the Cell.
*/
int getTagsLength();
/**
* WARNING do not use, expensive. This gets an arraycopy of the cell's value.
*
* Added to ease transition from 0.94 -> 0.96.
*
* @deprecated as of 0.96, use {@link CellUtil#cloneValue(Cell)}
*/
@Deprecated
byte[] getValue();
/**
* WARNING do not use, expensive. This gets an arraycopy of the cell's family.
*
* Added to ease transition from 0.94 -> 0.96.
*
* @deprecated as of 0.96, use {@link CellUtil#cloneFamily(Cell)}
*/
@Deprecated
byte[] getFamily();
/**
* WARNING do not use, expensive. This gets an arraycopy of the cell's qualifier.
*
* Added to ease transition from 0.94 -> 0.96.
*
* @deprecated as of 0.96, use {@link CellUtil#cloneQualifier(Cell)}
*/
@Deprecated
byte[] getQualifier();
/**
* WARNING do not use, expensive. this gets an arraycopy of the cell's row.
*
* Added to ease transition from 0.94 -> 0.96.
*
* @deprecated as of 0.96, use {@link CellUtil#getRowByte(Cell, int)}
*/
@Deprecated
byte[] getRow();
}