LSM-Tree(32)

3.3 Multi-Component LSM-trees(3)

Now there is a canonical size for C0, determined by the point at which the total cost of the LSM-tree (memory cost for C0 plus media/disk arm cost for the C1 component) is minimized. To arrive at this balance, we start with a large C0 component and pack the C1 component closely on disk media. If the C0 component is sufficiently large, we will have a very small I/O rate to C1. We can now decrease the size of C0, trading off expensive memory for inexpensive disk space, until the I/O rate to service C1 increases to a point where the disk arms sitting over the C1 component media are running at full rate. At this point, further savings in memory cost for C0 will result in increased media cost, as we are required to spread out the C1 component over fractionally full disks to reduce the disk arm load, and at some point as we continue to shrink C0 we will reach a minimum cost point.

Now it is common in the two-component LSM-tree that the canonical size we determine for C0 will still be quite expensive in terms of memory use. An alternative is to consider adopting an LSM-tree of three or more components. Conceptually, if the size of the C0 component is so large that the memory cost is a significant factor, then we consider creating another intermediate-size disk-based component between the two extremes. This will permit us to limit the cost of disk arms while reducing the size of the C0 component.
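The two-component trade-off described above can be sketched as a toy cost model. This is only an illustration under stated assumptions, not the paper's formula: the merge I/O rate into C1 is assumed to be proportional to the size ratio C1/C0, and every name and price here (`total_cost`, `canonical_c0_size`, `io_factor`, the dollar figures) is hypothetical.

```python
def total_cost(s0_mb, s1_mb, cost_mem, cost_disk, cost_arm, io_factor):
    """Toy total cost: memory for C0 plus media (storage or arms) for C1."""
    memory = cost_mem * s0_mb
    # Assumption: the merge I/O rate into C1 grows as C0 shrinks relative to C1.
    io_rate = io_factor * s1_mb / s0_mb
    # Media cost is the larger of the space C1 needs and the disk arms
    # needed to carry the I/O (C1 is spread over more disks once arms saturate).
    media = max(cost_disk * s1_mb, cost_arm * io_rate)
    return memory + media


def canonical_c0_size(s1_mb, cost_mem, cost_disk, cost_arm, io_factor):
    """Scan integer C0 sizes (in MB) and return the cheapest one."""
    return min(
        range(1, s1_mb + 1),
        key=lambda s0: total_cost(s0, s1_mb, cost_mem, cost_disk, cost_arm, io_factor),
    )


# Hypothetical prices and workload, chosen only for illustration.
params = dict(s1_mb=10_000, cost_mem=100.0, cost_disk=1.0, cost_arm=25.0, io_factor=40.0)
best = canonical_c0_size(**params)
# C0 size at which the arms over a densely packed C1 become saturated.
saturation = params["cost_arm"] * params["io_factor"] / params["cost_disk"]
print(best, saturation)
```

With these made-up numbers the cheapest C0 lands below the saturation size, which is consistent with the passage: shrinking C0 past the point where the arms run at full rate keeps lowering total cost for a while, until the extra media cost outweighs the memory saved.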

TODO: translate this section myself.
