Sounds impressive and cool. I wonder about Table engine performance though.
My experiences with Berkeley DB (using the native C API) were positive, from a performance point of view, largely because BDB has no built-in query support. When I wanted to look something up, and I had no index on the desired value (BDB calls indices "secondary databases"), I always knew I was doing a linear table scan, because I had to write the loop myself. No SQL meant that inefficient queries could not hide in plain sight.
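To make that concrete, here is a minimal sketch in Python that uses plain dicts as a stand-in for a key-value store. It is not the Berkeley DB API: the names `primary`, `scan_by_city`, `build_city_index`, and `lookup_by_city` are made up for illustration, and the "secondary database" is approximated by a second dict that maps a field value to primary keys.

```python
# Hypothetical in-memory stand-in for a key-value store (NOT the Berkeley DB API).
primary = {
    1: {"name": "alice", "city": "Oslo"},
    2: {"name": "bob", "city": "Lima"},
    3: {"name": "carol", "city": "Oslo"},
}


def scan_by_city(store, city):
    """No index: the loop is written by hand, so the full scan is obvious."""
    return [key for key, record in store.items() if record["city"] == city]


def build_city_index(store):
    """Build a secondary index: city -> list of primary keys."""
    index = {}
    for key, record in store.items():
        index.setdefault(record["city"], []).append(key)
    return index


def lookup_by_city(index, city):
    """With the index: a single dict lookup instead of visiting every record."""
    return index.get(city, [])


city_index = build_city_index(primary)
```

Both `scan_by_city(primary, "Oslo")` and `lookup_by_city(city_index, "Oslo")` return the same keys; the difference is that the first one visibly touches every record, which is exactly the cost an SQL layer can hide.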
Berkeley DB trades away the flexibility of SQL queries for (1) excellent lookup performance in simpler cases, and (2) the ability to mix relational and object-based data storage, reducing the famous impedance mismatch of relational data modeling. For some applications, this trade-off works beautifully. I daresay that it holds true for most web apps which really don't need a relational data store, and have either outgrown pure in-memory storage or just sensibly want to persist their state to disk. (Hint: if you use an ORM, then you probably don't want a relational data store.)
For some other apps, the lack of SQL queries hurts. I once worked on a trading system implemented on top of a proprietary engine which did not support arbitrary queries. It made writing even trivial reports ("list all transactions done two days ago by trader X using instrument Y") unnecessarily painful. (Of course, on a typical RDB with a non-trivial schema and a large dataset, these arbitrary queries would take 20 minutes to run, so catch-22.)
The Table engine for Tokyo Cabinet seems to do everything implicitly: fast, arbitrary queries. If it works on the kind of schema-less, free-form data the author refers to, and has working master-master replication, that sounds almost too good to be true.