参考
《Linux内核设计与实现》
*******************************************
页高速缓存是linux内核实现的一种主要磁盘缓存,它主要用来减少对磁盘的IO操作,具体地讲,是通过把磁盘中的数据缓存到物理内存中,把对磁盘的访问变为对物理内存的访问。为什么要这么做呢?一,速度;二临时局部原理。有关这两个概念,相信熟悉操作系统的我们不会太陌生。页高速缓存是由RAM中的物理页组成的,缓存中的每一页都对应着磁盘中的多个块。每当内核开始执行一个页IO操作时,就先到高速缓存中找。这样就可以大大减少磁盘操作。
一个物理页可能由多个不连续的物理磁盘块组成。也正是由于页面中映射的磁盘块不一定连续,所以在页高速缓存中检测特定数据是否已被缓存就变得不那么容易了。另外linux页高速缓存对被缓存页的范围定义的非常宽。缓存的目标是任何基于页的对象,这包含各种类型的文件和各种类型的内存映射。为了满足普遍性要求,linux使用定义在linux/fs.h中的结构体address_space结构体描述页高速缓存中的页面,如下:
struct address_space {
struct inode *host; /* owning inode */
struct radix_tree_root page_tree; /* radix tree of all pages */
spinlock_t tree_lock; /* page_tree lock */
unsigned int i_mmap_writable; /* VM_SHARED ma count */
struct prio_tree_root i_mmap; /* list of all mappings */
struct list_head i_mmap_nonlinear; /* VM_NONLINEAR ma list */
spinlock_t i_mmap_lock; /* i_mmap lock */
atomic_t truncate_count; /* truncate re count */
unsigned long nrpages; /* total number of pages */
pgoff_t writeback_index; /* writeback start offset */
struct address_space_operations *a_ops; /* operations table */
unsigned long flags; /* gfp_mask and error flags */
struct backing_dev_info *backing_dev_info; /* read-ahead information */
spinlock_t private_lock; /* private lock */
struct list_head private_list; /* private list */
struct address_space *assoc_mapping; /* associated buffers */
};
struct address_space_operations {
int (*writepage)(struct page *, struct writeback_control *);
int (*readpage) (struct file *, struct page *);
int (*sync_page) (struct page *);
int (*writepages) (struct address_space *, struct writeback_control *);
int (*set_page_dirty) (struct page *);
int (*readpages) (struct file *, struct address_space *,struct list_head *, unsigned);
int (*prepare_write) (struct file *, struct page *, unsigned, unsigned);
int (*commit_write) (struct file *, struct page *, unsigned, unsigned);
sector_t (*bmap)(struct address_space *, sector_t);
int (*invalidatepage) (struct page *, unsigned long);
int (*releasepage) (struct page *, int);
int (*direct_IO) (int, struct kiocb *, const struct iovec *,loff_t, unsigned long);
};
background-color: rgb(255, 255, 255);">这里面最重要的两个就是readpage()与writepage()了。对于readpage()方法而言,首先,一个address_space对象和一个偏移量会被传给该方法,这两个参数用来在页高速缓存中搜素需要的数据:
page = find_get_page(mapping, index);
mapping是指定的地址空间,index是文件中的指定位置。如果要搜索的页并没在高速缓存中,那么内核将分配一个新页面,然后将其加入到页高速缓存中,如下
int error;
cached_page = page_cache_alloc_cold(mapping);
if (!cached_page)
/* error allocating memory */
error = add_to_page_cache_lru(cached_page, mapping, index, GFP_KERNEL);
if (error)
/* error adding page to page cache */
page = __grab_cache_page(mapping, index, &cached_page, &lru_pvec);
status = a_ops->prepare_write(file, page, offset, offset+bytes);
page_fault = filemap_copy_from_user(page, offset, buf, bytes);
status = a_ops->commit_write(file, page, offset, offset+bytes);