"Understanding the Linux Kernel" Study Notes, Chapter 12: The Virtual Filesystem

Linux manages to support multiple filesystem types in the same way other Unix variants do, through a concept called the Virtual Filesystem.

The idea behind the Virtual Filesystem is to put a wide range of information in the kernel to represent many different types of filesystems; there is a field or function to support each operation provided by all real filesystems supported by Linux.


12.1 The Role of the Virtual Filesystem (VFS)

The Virtual Filesystem (also known as Virtual Filesystem Switch or VFS) is a kernel software layer that handles all system calls related to a standard Unix filesystem. Its main strength is providing a common interface to several kinds of filesystems.

Filesystems supported by the VFS may be grouped into three main classes: disk-based filesystems, network filesystems, and special filesystems.

Unix directories build a tree whose root is the / directory. The root directory is contained in the root filesystem, which in Linux is usually of type Ext2 or Ext3. All other filesystems can be "mounted" on subdirectories of the root filesystem.


12.1.1 The Common File Model

The key idea behind the VFS consists of introducing a common file model capable of representing all supported filesystems.

The common file model consists of the following object types:

  • The superblock object stores information concerning a mounted filesystem.
  • The inode object stores general information about a specific file.
  • The file object stores information about the interaction between an open file and a process.
  • The dentry object stores information about the linking of a directory entry (that is, a particular name of the file) with the corresponding file.
The most recently used dentry objects are contained in a disk cache named the dentry cache, which speeds up the translation from a file pathname to the inode of the last pathname component.
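
One way to picture the common file model is a table of function pointers that each filesystem fills in with its own routines, so that generic code can call through the table without knowing which filesystem it is talking to. The following is a minimal user-space sketch under that assumption. All names in it (toy_file, toy_file_ops, diskfs_read, nullfs_read, toy_vfs_read) are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for kernel objects; not the real ones. */
struct toy_file;

struct toy_file_ops {
    /* Each concrete filesystem supplies its own read implementation. */
    long (*read)(struct toy_file *f, char *buf, size_t len);
};

struct toy_file {
    const struct toy_file_ops *ops; /* chosen when the file is opened */
    const char *data;               /* backing bytes for this toy example */
    size_t pos;                     /* current offset into data */
};

/* A "disk-like" filesystem: copies bytes from the backing data. */
static long diskfs_read(struct toy_file *f, char *buf, size_t len)
{
    size_t left = strlen(f->data) - f->pos;
    if (len > left)
        len = left;
    memcpy(buf, f->data + f->pos, len);
    f->pos += len;
    return (long)len;
}

/* A "special" filesystem: always reports end of file. */
static long nullfs_read(struct toy_file *f, char *buf, size_t len)
{
    (void)f;
    (void)buf;
    (void)len;
    return 0;
}

const struct toy_file_ops diskfs_ops = { .read = diskfs_read };
const struct toy_file_ops nullfs_ops = { .read = nullfs_read };

/* Generic entry point: the caller never needs to know the filesystem type. */
long toy_vfs_read(struct toy_file *f, char *buf, size_t len)
{
    assert(f != NULL && f->ops != NULL && f->ops->read != NULL);
    return f->ops->read(f, buf, len);
}
```

A caller would build a toy_file with ops set to &diskfs_ops or &nullfs_ops and then call toy_vfs_read() in exactly the same way for both.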

12.1.2 System Calls Handled by the VFS


12.2 VFS Data Structures

12.2.1 Superblock Objects

12.2.2 Inode Objects

All information needed by the filesystem to handle a file is included in a data structure called an inode. A filename is a casually assigned label that can be changed, but the inode is unique to the file and remains the same as long as the file exists.
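
The difference between a filename and an inode can be observed from user space. The sketch below assumes a POSIX system and a rename within a single filesystem. It compares the inode number reported by stat() before and after rename(). The helper names inode_of and rename_keeps_inode are made up for this note.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Return the inode number of `path`, or 0 if stat() fails. */
ino_t inode_of(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return 0;
    return st.st_ino;
}

/*
 * Rename `old_name` to `new_name` and report whether the inode number
 * is unchanged. Returns 1 if unchanged, 0 if different, -1 on error.
 */
int rename_keeps_inode(const char *old_name, const char *new_name)
{
    ino_t before = inode_of(old_name);

    if (before == 0 || rename(old_name, new_name) != 0)
        return -1;
    return inode_of(new_name) == before ? 1 : 0;
}
```

On common local filesystems the function is expected to return 1: only the name changed, not the file.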


12.2.3 File Objects

A file object describes how a process interacts with a file it has opened. The object is created when the file is opened and consists of a file structure.
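
Because each open() creates its own file object, two descriptors obtained by opening the same file twice keep separate file offsets. The sketch below, which assumes a POSIX system, reads through one descriptor and then queries both offsets with lseek(). The function name independent_offsets is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path` twice, read `n` bytes through the first descriptor only,
 * and store both resulting offsets. Returns 0 on success, -1 on error.
 */
int independent_offsets(const char *path, size_t n, off_t *first, off_t *second)
{
    char buf[64];
    int rc = -1;

    assert(path != NULL && first != NULL && second != NULL);
    assert(n <= sizeof(buf));

    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);

    if (fd1 >= 0 && fd2 >= 0 && read(fd1, buf, n) == (ssize_t)n) {
        *first = lseek(fd1, 0, SEEK_CUR);  /* advanced by the read */
        *second = lseek(fd2, 0, SEEK_CUR); /* untouched by the read */
        rc = 0;
    }
    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return rc;
}
```

Descriptors duplicated with dup() or inherited across fork() behave differently: they refer to the same file object and therefore share one offset.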


12.2.4 dentry Objects

Once a directory entry is read into memory, it is transformed by the VFS into a dentry object based on the dentry structure.

The kernel creates a dentry object for every component of a pathname that a process looks up; the dentry object associates the component to its corresponding inode.


12.2.5 The dentry Cache

To maximize efficiency in handling dentries, Linux uses a dentry cache, which consists of two kinds of data structures:

  • A set of dentry objects in the in-use, unused, or negative state.
  • A hash table to derive the dentry object associated with a given filename and a given directory quickly.
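
The hash table described above can be sketched as a lookup keyed on the pair (parent directory, component name). This is a toy user-space illustration only: the structure layout, the hash function, and every toy_* name are assumptions made for this note and do not match the kernel's dentry code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_HASH_BUCKETS 64

/* Hypothetical toy dentry: a (parent, name) pair mapped to an inode number. */
struct toy_dentry {
    const struct toy_dentry *parent; /* directory containing this entry */
    const char *name;                /* one pathname component */
    unsigned long ino;               /* 0 marks a negative entry (no file) */
    struct toy_dentry *next;         /* collision chain within a bucket */
};

static struct toy_dentry *toy_hash[TOY_HASH_BUCKETS];

/* Mix the parent's address with the component name (djb2-style hash). */
static unsigned toy_bucket(const struct toy_dentry *parent, const char *name)
{
    unsigned long h = 5381UL + (unsigned long)(uintptr_t)parent;

    while (*name != '\0')
        h = h * 33UL + (unsigned char)*name++;
    return (unsigned)(h % TOY_HASH_BUCKETS);
}

/* Insert a dentry at the head of its bucket's chain. */
void toy_d_add(struct toy_dentry *d)
{
    assert(d != NULL && d->name != NULL);
    unsigned b = toy_bucket(d->parent, d->name);
    d->next = toy_hash[b];
    toy_hash[b] = d;
}

/* Find the dentry for `name` inside `parent`, or NULL on a cache miss. */
struct toy_dentry *toy_d_lookup(const struct toy_dentry *parent, const char *name)
{
    struct toy_dentry *d = toy_hash[toy_bucket(parent, name)];

    for (; d != NULL; d = d->next)
        if (d->parent == parent && strcmp(d->name, name) == 0)
            return d;
    return NULL; /* a real kernel would now ask the filesystem itself */
}
```

Resolving a path such as /etc/passwd then becomes a chain of lookups: "etc" in the root dentry, then "passwd" in the dentry returned for "etc".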

12.2.6 Files Associated with a Process


12.3 Filesystem Types

12.3.1 Special Filesystems

Special filesystems may provide an easy way for system programs and administrators to manipulate the data structures of the kernel and to implement special features of the operating system.


12.3.2 Filesystem Type Registration

The VFS must keep track of all filesystem types whose code is currently included in the kernel.

Each registered filesystem is represented as a file_system_type object.

All filesystem-type objects are inserted into a singly linked list.
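
Registration on a singly linked list can be sketched in user space as follows. The code is only loosely modeled on the idea of registering filesystem types. The structure toy_fs_type and the functions toy_register_fs and toy_unregister_fs are hypothetical, and the real file_system_type object carries more fields than a name and a link.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down counterpart of a filesystem-type object. */
struct toy_fs_type {
    const char *name;         /* for example "ext2" */
    struct toy_fs_type *next; /* link to the next registered type */
};

static struct toy_fs_type *toy_file_systems; /* head of the list */

/* Return the address of the link pointing to `name`, or of the tail link. */
static struct toy_fs_type **toy_find_fs(const char *name)
{
    struct toy_fs_type **p = &toy_file_systems;

    while (*p != NULL && strcmp((*p)->name, name) != 0)
        p = &(*p)->next;
    return p;
}

/* Append `fs` to the list. Returns 0 on success, -1 if the name is taken. */
int toy_register_fs(struct toy_fs_type *fs)
{
    assert(fs != NULL && fs->name != NULL);
    struct toy_fs_type **p = toy_find_fs(fs->name);

    if (*p != NULL)
        return -1;
    fs->next = NULL;
    *p = fs;
    return 0;
}

/* Unlink the type called `name`. Returns 0 on success, -1 if not found. */
int toy_unregister_fs(const char *name)
{
    assert(name != NULL);
    struct toy_fs_type **p = toy_find_fs(name);

    if (*p == NULL)
        return -1;
    *p = (*p)->next;
    return 0;
}
```

Walking the list through a pointer to the link, rather than a pointer to the node, lets insertion and removal share one search routine without a special case for the head.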


12.4 Filesystem Handling

Being a tree of directories, every filesystem has its own root directory. The directory on which a filesystem is mounted is called the mount point. A mounted filesystem is a child of the mounted filesystem to which the mount point directory belongs.


12.4.1 Namespaces

Every process might have its own tree of mounted filesystems -- the so-called namespace of the process.

A process gets a new namespace if it is created by the clone() system call with the CLONE_NEWNS flag set.

When a process mounts or unmounts a filesystem, it modifies only its own namespace.

The namespace of a process is represented by a namespace structure pointed to by the namespace field of the process descriptor.


12.4.2 Filesystem Mounting

In Linux, it is possible to mount the same filesystem several times.


12.4.3 Mounting a Generic Filesystem

12.4.4 Mounting the Root Filesystem

The Linux kernel allows the root filesystem to be stored in many different places.

Mounting the root filesystem is a two-stage procedure, shown in the following list:

  1. The kernel mounts the special rootfs filesystem, which simply provides an empty directory that serves as initial mount point.
  2. The kernel mounts the real root filesystem over the empty directory.

12.4.5 Unmounting a Filesystem


12.5 Pathname Lookup

12.5.1 Standard Pathname Lookup

12.5.2 Parent Pathname Lookup

12.5.3 Lookup of Symbolic Links


12.6 Implementations of VFS System Calls

12.6.1 The open() System Call

The open() system call is serviced by the sys_open() function, which receives as its parameters the pathname of the file to be opened (filename), some access mode flags (flags), and a permission bit mask (mode) that is used if the file must be created.
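
From user space, the three parameters map directly onto the arguments of the open() library call. The sketch below assumes a POSIX system. The helper name create_and_write is made up for this note, and the chosen flags and mode are just one plausible combination.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create `filename` if needed, truncate it, and write `len` bytes.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t create_and_write(const char *filename, const void *data, size_t len)
{
    assert(filename != NULL && data != NULL);

    int flags = O_WRONLY | O_CREAT | O_TRUNC; /* access mode plus creation flags */
    mode_t mode = S_IRUSR | S_IWUSR;          /* 0600; consulted only on creation */

    int fd = open(filename, flags, mode);
    if (fd < 0)
        return -1;

    ssize_t written = write(fd, data, len);
    if (close(fd) != 0)
        return -1;
    return written;
}
```

The mode argument matters only when O_CREAT actually creates the file, and the permissions that result are further restricted by the process umask.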


12.6.2 The read() and write() System Calls

12.6.3 The close() System Call


12.7 File Locking

12.7.1 Linux File Locking

12.7.2 File-Locking Data Structures

12.7.3 FL_FLOCK Locks

12.7.4 FL_POSIX Locks

