1,Flash Cache简介

Flash cache 本身是Facebook的一个开源项目,(准确的说是一个Linux的模块),可以动态加载。Flashcache通过在文件系统(VFS)和设备驱动之间新增了一次缓存层,来实现对热门的缓存。Flashcache是另一种缓存,一般用SSD作为介质的缓存(一般的缓存用的是内存),通过将传统硬盘上的热门数据缓存到SSD上,然后利用SSD优秀的读性能,来加速系统。这个方法较之内存缓存,没有内存快,但是空间可以比内存大很多。如下图:

  现在很多硬件厂商也会在存储设备中增加这个功能来,例如IBM XIV的Flash cache和 EMC的VFCache

2, XIV Flash Cache

从11.1.0开始,XIV Gen3开始支持可选的Flash caching feature,极大的提高小数据块,随机读的性能,适用于Data patterns一直改变的环境。

XIV flash caching is implemented as an extension of the primary cache layer.   每个模块400GB Cache,那一个系统就是6TB(400GB*15),从11.4开始,每个Module能支持800GB。Flash cache 只用于读操作,当不再需要Cache中的数据时,直接drop掉就行了。


XIV(4)--Flash caching_第1张图片

XIV flash caching overview

  Flash caching算法是嵌入在XIV Firmware中的,能自动适应相应的IO类型,对用户透明,不需要管理员手动的做performance turning。    


XIV系统中有2种类型的Cache:main 和 extended  
--The main cache handles host write I/Os and then destages them directly to the disk drive. //Main cache主要处理主机写IO然后直接Flush到Disk    
--The extended cache handles the caching of random read miss operations less than 64 KB. //Extended cache主要处理小于64KB的随机读操作    

Sequential read prefetches (larger than 64 KB) are handled in main dynamic random access memory (DRAM) cache. //大于64KB的顺序读操作是由main cache处理的


3,Flash Cache learning

A flash cache map is built as read misses occur in the DRAM cache. The process, known as flash cache learning


关于XIV中Main cache和Extended Cache(Flash Cache)是如何扮演各自角色的,请看下图:

XIV(4)--Flash caching_第2张图片  

XIV flash cache learning

 

The cache node immediately checks the extended cache for the requested I/O. If the requested I/O exists in the extended flash cache, it is served to the host through the main cache. The I/O operation is now complete and is recorded as a flash cache read hit.


If the operation results in a true read miss (not in the DRAM cache and not in extended flash cache), the request is forwarded in an unmodified state to the disk drive (SAS layer). The I/O is retrieved from the disk drive and served to the host through the main cache. From a host perspective, the I/O operation is now complete and is recorded as a read miss. The related pages are copied into reserved buffers in the main cache.

 

Important: Any read larger than 64 KB bypasses the extended flash cache.

当Buffer中的data达到512KB时,会顺序地写到Flash cache中。这种方式延长了Flash cache中的寿命。


Note: XIV在系统重启和Firmware升级中能保存Flash Cache中的数据    

XIV Storage System software Version 11.2 introduced improved flash caching algorithms,providing a performance boost of up to 4.5 times over systems without flash cache for random database-type workloads. This boost is accomplished by storing and computing all flash cache-related data integrity checking tasks in DRAM rather than on the flash cache.


4,Approaches for Using SSDs in a Storage System

XIV(4)--Flash caching_第3张图片

Approach of the tier. With policies that moved the data … or As a caching layer. This is the approach of XIV.


5,SSD failure

  • No redistribution in case of failure/phase-out

  • SSD Failure:

  -Reinitialize the metadata of the SSD (so it is not used)
  -The degraded module continue to server reads from its DRAM cache and large sequential reads from its disks
  -Small read misses are redirected to the secondary

  • SSD Phase-out and not failed

- Its data is invalidated (on writes)
=> if phased-in not all data is lost


During a rebuild, following a SAS disk failure in a module, the data on SSD in that module is

not invalidated. Rather, it is gradually updated to contain the new data blocks (the same way

the DRAM does).

------------------------------------------------------------------------------

后记:关于EMC VFCache和IBM Flash Cache,感兴趣的可以看看比特网的下图对比。

参考文章:http://storage.chinabyte.com/223/12261223.shtml