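In other words, the original idea was to find a way to turn random I/O operations into sequential I/O writes. The next question to consider is how to perform insert, delete, and update operations on top of the LSM data structure. The paper covers this in detail, as follows:
Data recovery: this uses the usual approach, a Write Ahead Log (WAL), to guarantee the reliability of the data.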
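The write-ahead idea can be sketched in a few lines. This is only a toy illustration, not the design from the paper: the class name `TinyWalStore`, the JSON record format, and the use of `None` as a delete marker (tombstone) are all assumptions made for the example, and it ignores torn writes, log truncation, and flushing to on-disk components.

```python
import json
import os


class TinyWalStore:
    """Toy in-memory table protected by an append-only write-ahead log.

    Hypothetical example: names and record format are assumptions.
    """

    TOMBSTONE = None  # assumed marker meaning "this key was deleted"

    def __init__(self, log_path):
        self.log_path = log_path
        self.memtable = {}
        self._replay()
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        # Recovery: re-apply every logged record in the order it was written.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                self.memtable[record["key"]] = record["value"]

    def _append(self, key, value):
        # Sequential write: the record is appended to the end of the log and
        # forced to disk before the in-memory table is changed.
        self.log.write(json.dumps({"key": key, "value": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.memtable[key] = value

    def put(self, key, value):
        # Insert and update are the same operation: append a new version.
        self._append(key, value)

    def delete(self, key):
        # Delete appends a tombstone instead of rewriting older data.
        self._append(key, self.TOMBSTONE)

    def get(self, key):
        # Returns None for both missing and deleted keys.
        return self.memtable.get(key)

    def close(self):
        self.log.close()
```

Every mutation becomes an append to the end of one file, so the disk only sees sequential writes, and reopening the store on the same log path rebuilds the in-memory state after a crash.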
Simple cost-benefit arguments allow one to compute the break-even point for trading central-memory residence against disc accesses for data. If an item is accessed frequently enough, it should be main memory resident. In current technology, “frequently enough” means about every five minutes.
In other words, data that is used frequently should be kept in memory whenever possible. Here "frequently" means roughly once every five minutes: if an item is accessed at least that often, it is worth keeping it in memory.
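A rough sketch of how such a break-even point can be computed is shown below. The memory price of 5K$/MB is the figure used in the passage quoted further down; the cost of about 2K$ per disc access per second and the 1 KB record size are assumptions chosen for illustration, and the function name is hypothetical.

```python
def break_even_seconds(cost_per_access_per_sec, memory_cost_per_mb, record_kb):
    """Access interval at which caching a record costs the same as
    serving it from disc.

    A record read once every T seconds needs 1/T disc accesses per second,
    which costs cost_per_access_per_sec / T dollars of disc capacity.
    Keeping it in memory costs record_kb * (price per KB) dollars.
    Setting the two equal and solving for T gives the break-even interval.
    """
    memory_cost_per_kb = memory_cost_per_mb / 1000.0
    cost_to_cache = memory_cost_per_kb * record_kb
    return cost_per_access_per_sec / cost_to_cache


# Assumed prices: 2000 $ per access/second, 5000 $ per MB, 1 KB record.
interval = break_even_seconds(2000, 5000, 1)
print(interval)  # 400.0 seconds, i.e. a few minutes
```

With these assumed prices the break-even interval is 400 seconds, which is the same order of magnitude as the "about every five minutes" in the quote. Records accessed more often than that are cheaper to keep in memory; records accessed less often are cheaper to leave on disc.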
Changing topics, another interesting question is: “When does it make economic sense to use more memory to save some cpu power?”, or conversely save some memory at the expense of some cpu cycles? This issue comes up in code optimization where one can save some instructions by unwinding loops and in data structure design where one can pack data at the expense of masking and shifting operations to extract the data. (The first question is the idea of trading space for time.)
The logic is quite similar to the Five Minute Rule. One picks a certain price for memory (say 5K$/MB) and a certain price per MIP (say 50K$/MIP). This means that 5 bytes costs about .005$. Similarly, one instruction per second costs about .005$. So 5 bytes costs about as much as 1 instruction per second. This gives the rule:
Spend 5 bytes of main memory to save 1 instruction per second