LMDB: A Lightweight Memory-Mapped Database, an Introduction (Translation)

1.     Overview

LMDB is compact, fast, powerful, and robust and implements a simplified variant of the BerkeleyDB (BDB) API. (BDB is also very powerful, and verbosely documented in its own right.) After reading this page, the main \ref mdb documentation should make sense. Thanks to Bert Hubert for creating the initial version of this writeup.

Additional notes:

LMDB stands for Lightning Memory-Mapped Database. Its on-disk layout is simple: one directory containing one data file and one lock file. The data can be copied and transferred freely. Access is simple too: there is no separate database server process to run. The code that accesses the data just links against the LMDB library and supplies the file path when opening.

2.     Usage Workflow

1)    Open the environment first

Everything starts with an environment, created by #mdb_env_create(). Once created, this environment must also be opened with #mdb_env_open().

#mdb_env_open() gets passed a name which is interpreted as a directory path. Note that this directory must exist already, it is not created for you. Within that directory, a lock file and a storage file will be generated. If you don't want to use a directory, you can pass the #MDB_NOSUBDIR option, in which case the path you provided is used directly as the data file, and another file with a "-lock" suffix added will be used for the lock file.

2)    Begin a transaction

Once the environment is open, a transaction can be created within it using #mdb_txn_begin(). Transactions may be read-write or read-only, and read-write transactions may be nested. A transaction must only be used by one thread at a time. Transactions are always required, even for read-only access. The transaction provides a consistent view of the data.

3)    Open a database

Once a transaction has been created, a database can be opened within it using #mdb_dbi_open(). If only one database will ever be used in the environment, a NULL can be passed as the database name. For named databases, the #MDB_CREATE flag must be used to create the database if it doesn't already exist. Also, #mdb_env_set_maxdbs() must be called after #mdb_env_create() and before #mdb_env_open() to set the maximum number of named databases you want to support.

Note: a single transaction can open multiple databases. Generally databases should only be opened once, by the first transaction in the process. After the first transaction completes, the database handles can freely be used by all subsequent transactions.

4)    Getting and setting data

Within a transaction, #mdb_get() and #mdb_put() can store single key/value pairs if that is all you need to do (but see \ref Cursors below if you want to do more).

A key/value pair is expressed as two #MDB_val structures. This struct has two fields, \c mv_size and \c mv_data. The data is a \c void pointer to an array of \c mv_size bytes.

Because LMDB is very efficient (and usually zero-copy), the data returned in an #MDB_val structure may be memory-mapped straight from disk. In other words look but do not touch (or free() for that matter). Once a transaction is closed, the values can no longer be used, so make a copy if you need to keep them after that.

3.     Cursors

@section Cursors Cursors

To do more powerful things, we must use a cursor.

Within the transaction, a cursor can be created with #mdb_cursor_open(). With this cursor we can store/retrieve/delete (multiple) values using #mdb_cursor_get(), #mdb_cursor_put(), and #mdb_cursor_del().

#mdb_cursor_get() positions itself depending on the cursor operation requested, and for some operations, on the supplied key. For example, to list all key/value pairs in a database, use operation #MDB_FIRST for the first call to #mdb_cursor_get(), and #MDB_NEXT on subsequent calls, until the end is hit.

To retrieve all keys starting from a specified key value, use #MDB_SET. For more cursor operations, see the \ref mdb docs.

When using #mdb_cursor_put(), either the function will position the cursor for you based on the \b key, or you can use operation #MDB_CURRENT to use the current position of the cursor. Note that \b key must then match the current position's key.

4.     Summary

@subsection summary Summarizing the Opening

So we have a cursor in a transaction which opened a database in an environment which is opened from a filesystem after it was separately created.

Or, we create an environment, open it from a filesystem, create a transaction within it, open a database within that transaction, and create a cursor within all of the above.

Got it?

5.     Multithreading/Multiprocessing

@section thrproc Threads and Processes 

LMDB uses POSIX locks on files, and these locks have issues if one process opens a file multiple times. Because of this, do not #mdb_env_open() a file multiple times from a single process. Instead, share the LMDB environment that has opened the file across all threads. Otherwise, if a single process opens the same environment multiple times, closing it once will remove all the locks held on it, and the other instances will be vulnerable to corruption from other processes. In other words: a process should open a given environment only once and share that environment among all of its threads.

Also note that a transaction is tied to one thread by default using Thread Local Storage. If you want to pass read-only transactions across threads, you can use the #MDB_NOTLS option on the environment.

6.     Transactions for Data Operations

@section txns Transactions, Rollbacks, etc.

To actually get anything done, a transaction must be committed using #mdb_txn_commit(). Alternatively, all of a transaction's operations can be discarded using #mdb_txn_abort(). In a read-only transaction, any cursors will \b not automatically be freed. In a read-write transaction, all cursors will be freed and must not be used again.

For read-only transactions, obviously there is nothing to commit to storage. The transaction still must eventually be aborted to close any database handle(s) opened in it, or committed to keep the database handles around for reuse in new transactions.

In addition, as long as a transaction is open, a consistent view of the database is kept alive, which requires storage. A read-only transaction that no longer requires this consistent view should be terminated (committed or aborted) when the view is no longer needed (but see below for an optimization).

There can be multiple simultaneously active read-only transactions but only one that can write. Once a single read-write transaction is opened, all further attempts to begin one will block until the first one is committed or aborted. This has no effect on read-only transactions, however, and they may continue to be opened at any time.

7.     Duplicate Keys (One-to-Many Model)

@section dupkeys Duplicate Keys

#mdb_get() and #mdb_put() respectively have no and only some support for multiple key/value pairs with identical keys. If there are multiple values for a key, #mdb_get() will only return the first value.

When multiple values for one key are required, pass the #MDB_DUPSORT flag to #mdb_dbi_open(). In an #MDB_DUPSORT database, by default #mdb_put() will not replace the value for a key if the key existed already. Instead it will add the new value to the key. In addition, #mdb_del() will pay attention to the value field too, allowing for specific values of a key to be deleted.

Finally, additional cursor operations become available for traversing through and retrieving duplicate values.

8.     Optimization

@section optim Some Optimization

If you frequently begin and abort read-only transactions, as an optimization, it is possible to only reset and renew a transaction.

#mdb_txn_reset() releases any old copies of data kept around for a read-only transaction. To reuse this reset transaction, call #mdb_txn_renew() on it. Any cursors in this transaction must also be renewed using #mdb_cursor_renew().

Note that #mdb_txn_reset() is similar to #mdb_txn_abort() and will close any databases you opened within the transaction.

To permanently free a transaction, reset or not, use #mdb_txn_abort().

9.     Cleanup

@section cleanup Cleaning Up

For read-only transactions, any cursors created within it must be closed using #mdb_cursor_close().

It is very rarely necessary to close a database handle, and in general they should just be left open.

10.    Looking Ahead

@section onward The Full API

The full \ref mdb documentation lists further details, like how to:

  \li size a database (the default limits are intentionally small)

  \li drop and clean a database

  \li detect and report errors

  \li optimize (bulk) loading speed

  \li (temporarily) reduce robustness to gain even more speed

  \li gather statistics about the database

  \li define custom sort orders

Note: this article is a simple translation of the intro.doc file in the LMDB source tree. For detailed usage demos, see the follow-up posts.


