BIGTABLE
一个结构化数据的分布式数据存储系统,主要解决大数据量存储的问题。
数据模型为:
Rows 通过行键来指定,例如com.cnn.www
Column Families Column keys are grouped into sets called column families,
which form the basic unit of access control. All data
stored in a column family is usually of the same type (we
compress data in the same column family together). A
column family must be created before data can be stored
under any column key in that family; after a family has
been created, any column key within the family can be
used. It is our intent that the number of distinct column
families in a table be small (in the hundreds at most), and
that families rarely change during operation. In contrast,
a table may have an unbounded number of columns.
Timestamps 时间戳,用于指定版本号
Bigtable的底层存储为例如GFS之类的分布式文件系统。
The Google SSTable _le format is used internally to
store Bigtable data. An SSTable provides a persistent,
ordered immutable map from keys to values, where both
keys and values are arbitrary byte strings.
Bigtable relies on a highly-available and persistent
distributed lock service called Chubby
Chubby我个人理解为是一种文件锁,例如table server,master等存活检测,或者是表是否存在的检测都需要依靠它。
Each tablet is assigned to one tablet server at a time. The
master keeps track of the set of live tablet servers, and
the current assignment of tablets to tablet servers, including
which tablets are unassigned. When a tablet is
unassigned, and a tablet server with suf_cient room for
the tablet is available, the master assigns the tablet by
sending a tablet load request to the tablet server.
Bigtable uses Chubby to keep track of tablet servers.
When a tablet server starts, it creates, and acquires an
exclusive lock on, a uniquely-named _le in a speci_c
Chubby directory. The master monitors this directory
(the servers directory) to discover tablet servers. A tablet
server stops serving its tablets if it loses its exclusive
lock: e.g., due to a network partition that caused the
server to lose its Chubby session. (Chubby provides an
ef_cient mechanism that allows a tablet server to check
whether it still holds its lock without incurring network
traf_c.) A tablet server will attempt to reacquire an exclusive
lock on its _le as long as the _le still exists. If the
_le no longer exists, then the tablet server will never be
able to serve again, so it kills itself. Whenever a tablet
server terminates (e.g., because the cluster management
system is removing the tablet server's machine from the
cluster), it attempts to release its lock so that the master
will reassign its tablets more quickly.
Table server的工作原理:
更多细节可以阅读:
http://labs.google.com/papers/bigtable.html