Paging: Introduction
Crux: How to virtualize memory without segments?
Here comes paging.
Idea: Instead of splitting up the address space into three logical segments (each of variable size), we split up the address space into fixed-sized units, each of which we call a page. For example, a tiny 64-byte address space can be divided into 4 pages of 16 bytes each.
With paging, physical memory is also split into fixed-sized units; we call each of these units of physical memory a page frame.
Advantages of Paging
- Flexibility: The system can support the abstraction of an address space effectively, regardless of how a process uses the address space. We thus don't need to make assumptions about how the heap and stack grow or how they are used.
- Simplicity: The OS keeps a free list of all free pages. When a program with a 64-byte address space arrives, the OS only needs to find four free 16-byte pages for it.
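The simplicity point can be sketched in a few lines of C. This is a minimal illustration, not how any real OS does it: the free list is modeled as a plain array of flags, and the names `frame_free`, `NUM_FRAMES`, and `alloc_frames` are hypothetical, as is the assumption that physical memory holds eight frames with frame 0 reserved for the OS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_FRAMES 8   /* assumed: 128 bytes of physical memory, 16-byte frames */

/* Hypothetical free-frame map: frame_free[i] is true if frame i is unused.
 * Frame 0 is assumed to be reserved for the OS. */
static bool frame_free[NUM_FRAMES] = {
    false, true, true, true, true, true, true, true
};

/*
 * Claim `count` free frames and record their numbers in `out`.
 * The frames do not need to be contiguous; any free frame will do.
 * Returns true on success. On failure, nothing stays allocated.
 */
bool alloc_frames(size_t count, int out[])
{
    assert(out != NULL);

    size_t found = 0;
    for (int f = 0; f < NUM_FRAMES && found < count; f++) {
        if (frame_free[f]) {
            frame_free[f] = false;
            out[found++] = f;
        }
    }

    if (found < count) {
        /* Not enough free frames: give back the ones we claimed. */
        for (size_t i = 0; i < found; i++)
            frame_free[out[i]] = true;
        return false;
    }
    return true;
}
```

Because every page has the same size, the allocator never has to search for a hole that is "big enough"; it only has to count free frames.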
In order to record where each virtual page of the address space is placed in physical memory, the OS keeps a per-process data structure known as a page table. The major role of a page table is to store address translations for each of the virtual pages of the address space, thus letting us know where in physical memory they live. In the example above, the page table would have the following entries: (Virtual Page 0 -> Physical Frame 3), (VP 1 -> PF 7), (VP 2 -> PF 5), and (VP 3 -> PF 2).
Now, consider that there is an instruction that loads the data at a virtual address into a register:
movl <virtual address>, %eax
To translate this virtual address into the physical address, we have to first split it into two components: the virtual page number (VPN), and the offset within the page. For this example, as the virtual address space of the process is 64 bytes, we need 6 bits total for the virtual address (2^6 = 64).
As there are four pages altogether, we need the top 2 bits to select the page (namely, the VPN), and the remaining 4 bits form the offset, which selects one of the 16 bytes within the page.
Now let's assume the virtual address is 21, which in binary form is 010101. The top 2 bits (01) give VPN 1, and the low 4 bits (0101) give offset 5.
With the virtual page number 1, we can now index our page table and find which physical frame virtual page 1 resides in. In the page table above, the physical page number (PPN) (a.k.a. physical frame number or PFN) is 7 (binary 111). Thus, we can translate this virtual address by replacing the VPN with the PFN and then issue the load to physical memory.
The VPN 01 has been replaced with PFN 111, but the offset 0101 remains the same. The final physical address is 1110101 (117 in decimal).
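The translation steps above can be sketched in C. The page table contents and the 2-bit VPN / 4-bit offset split come from the example; the names `translate`, `page_table`, and the constants are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define OFFSET_BITS 4                      /* 16-byte pages */
#define PAGE_SIZE   (1u << OFFSET_BITS)
#define NUM_PAGES   4                      /* 64-byte address space */
#define OFFSET_MASK (PAGE_SIZE - 1)        /* 0x0F */

/* Page table from the example: index = VPN, value = PFN. */
static const uint8_t page_table[NUM_PAGES] = { 3, 7, 5, 2 };

/* Translate a 6-bit virtual address into a physical address. */
uint32_t translate(uint32_t vaddr)
{
    assert(vaddr < NUM_PAGES * PAGE_SIZE);

    uint32_t vpn    = vaddr >> OFFSET_BITS;   /* top 2 bits */
    uint32_t offset = vaddr & OFFSET_MASK;    /* low 4 bits, unchanged */
    uint32_t pfn    = page_table[vpn];        /* look up the frame */

    /* Replace the VPN with the PFN and keep the offset. */
    return (pfn << OFFSET_BITS) | offset;
}
```

With this table, `translate(21)` splits 21 into VPN 1 and offset 5, finds PFN 7, and returns 117, matching the worked example.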
Where Are Page Tables Stored?
Page tables can get extremely large, much larger than the small segment table or base/bounds pair we have discussed previously. Because they are so big, we don't keep any special on-chip hardware in the MMU to store the page table of the currently-running process. Instead, we store the page table for each process in memory somewhere.
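To get a feel for "extremely large", here is a rough size calculation for a linear page table. The numbers used in the usage note (a 32-bit address space, 4 KB pages, 4-byte PTEs) are assumed typical values, not figures from the text above, and the function name `linear_page_table_bytes` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Size in bytes of a linear page table with one PTE per virtual page.
 *   va_bits     - width of a virtual address
 *   offset_bits - bits used for the offset (log2 of the page size)
 *   pte_bytes   - size of one page-table entry
 */
uint64_t linear_page_table_bytes(unsigned va_bits, unsigned offset_bits,
                                 unsigned pte_bytes)
{
    assert(va_bits >= offset_bits);
    assert(va_bits - offset_bits < 64);

    /* The remaining bits form the VPN, so there are 2^(VPN bits) pages. */
    uint64_t num_pages = 1ull << (va_bits - offset_bits);
    return num_pages * pte_bytes;
}
```

Under those assumptions, `linear_page_table_bytes(32, 12, 4)` gives 4,194,304 bytes, i.e. 4 MB per process. For the tiny 64-byte example with 1-byte entries, `linear_page_table_bytes(6, 4, 1)` gives just 4 bytes.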
What's Actually In The Page Table?
The simplest form is a linear page table, which is just an array. The OS indexes the array by the VPN, and looks up the page-table entry (PTE) at that index in order to get the desired PFN.
As for the contents of each PTE, we have a number of different bits. A valid bit is commonly used to indicate whether the particular translation is valid. For example, all the unused space between a running process's stack and heap is marked invalid. Thus, the valid bit is crucial for supporting a sparse address space.
We also might have protection bits, indicating whether the page can be read from, written to, or executed from.
A present bit indicates whether this page is in physical memory or on disk (swapped out).
A dirty bit indicates whether the page has been modified since it was brought into memory.
A reference bit (a.k.a. accessed bit) is sometimes used to track whether a page has been accessed, which is useful in determining which pages are popular and thus should be kept in memory.
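One way to picture these bits is as flags packed next to the PFN in a single integer. The layout below is purely hypothetical (it is not the format of x86 or any other real architecture), and the names `pte_t`, `pte_make`, `pte_pfn`, and `pte_allows` are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t pte_t;

/* Hypothetical bit positions, chosen only for illustration. */
#define PTE_VALID      (1u << 0)   /* translation is valid */
#define PTE_PRESENT    (1u << 1)   /* page is in physical memory, not on disk */
#define PTE_DIRTY      (1u << 2)   /* page modified since it was brought in */
#define PTE_REFERENCED (1u << 3)   /* page has been accessed */
#define PTE_READ       (1u << 4)   /* protection bits */
#define PTE_WRITE      (1u << 5)
#define PTE_EXEC       (1u << 6)
#define PTE_PFN_SHIFT  8           /* PFN lives in the bits above the flags */

/* Build a PTE from a frame number and a set of flag bits. */
static inline pte_t pte_make(uint32_t pfn, uint32_t flags)
{
    assert((flags >> PTE_PFN_SHIFT) == 0);   /* flags must fit below the PFN */
    return (pfn << PTE_PFN_SHIFT) | flags;
}

/* Extract the frame number from a PTE. */
static inline uint32_t pte_pfn(pte_t pte)
{
    return pte >> PTE_PFN_SHIFT;
}

/* True if an access needing the given protection bits can proceed
 * without trapping to the OS. */
bool pte_allows(pte_t pte, uint32_t prot_needed)
{
    if (!(pte & PTE_VALID))   return false;   /* unused part of address space */
    if (!(pte & PTE_PRESENT)) return false;   /* swapped out: needs the OS */
    return (pte & prot_needed) == prot_needed;
}
```

For instance, an entry built with `pte_make(7, PTE_VALID | PTE_PRESENT | PTE_READ)` allows reads but rejects writes.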
Paging: Also Too Slow
Consider again the load instruction from before:
movl 21, %eax
The system must first fetch the proper page table entry from the process's page table, perform the translation, and then finally get the desired data from physical memory. To do so, the hardware must know where the page table is for the currently-running process. Assume that a single page-table base register contains the physical address of the starting location of the page table. To find the location of the desired PTE, the hardware will thus perform the following computation:
VPN = (VirtualAddress & VPN_MASK) >> SHIFT;
PTEAddr = PageTableBaseRegister + (VPN * sizeof(PTE));
In our example, VPN_MASK would be 0x30 (110000 in binary) to pick out the VPN bits from the full virtual address; SHIFT is set to 4 (the number of bits in the offset) so as to get the VPN. The VPN is thus used as an index into the array of PTEs pointed to by the page table base register to get the physical address of the PTE.
Once this physical address is known, the hardware can fetch the PTE from memory, extract the PFN, and concatenate it with the offset from the virtual address to form the desired physical address.
Finally, the hardware can fetch the desired data from memory and put it into register eax.
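The whole access path can be simulated in C to make the cost visible. This is a sketch under several assumptions: physical memory is a 128-byte array, the page table sits at physical address 0, a PTE is a single byte holding only the PFN (no valid or protection bits), and the load fetches one byte rather than a full word. The names `phys_mem`, `read_phys`, `install_example_page_table`, `load_byte`, and `mem_refs` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VPN_MASK    0x30      /* 110000: the VPN bits */
#define OFFSET_MASK 0x0F      /* 001111: the offset bits */
#define SHIFT       4         /* number of offset bits */
#define PHYS_SIZE   128       /* assumed: eight 16-byte frames */

typedef uint8_t pte_t;        /* simplification: a PTE holds only the PFN */

static uint8_t  phys_mem[PHYS_SIZE];
static uint32_t page_table_base = 0;   /* stands in for PageTableBaseRegister */
static unsigned mem_refs = 0;          /* counts physical memory accesses */

/* Every read of physical memory goes through here so it can be counted. */
static uint8_t read_phys(uint32_t paddr)
{
    assert(paddr < PHYS_SIZE);
    mem_refs++;
    return phys_mem[paddr];
}

/* Store the example page table (VPN 0..3 -> PFN 3, 7, 5, 2) in memory. */
void install_example_page_table(void)
{
    static const pte_t pfns[4] = { 3, 7, 5, 2 };
    for (uint32_t vpn = 0; vpn < 4; vpn++)
        phys_mem[page_table_base + vpn * sizeof(pte_t)] = pfns[vpn];
}

/* Load one byte from a virtual address, as the hardware would. */
uint8_t load_byte(uint32_t vaddr)
{
    uint32_t vpn      = (vaddr & VPN_MASK) >> SHIFT;
    uint32_t pte_addr = page_table_base + vpn * sizeof(pte_t);
    pte_t    pte      = read_phys(pte_addr);        /* reference 1: the PTE */

    uint32_t offset   = vaddr & OFFSET_MASK;
    uint32_t paddr    = ((uint32_t)pte << SHIFT) | offset;
    return read_phys(paddr);                        /* reference 2: the data */
}
```

After `install_example_page_table()`, a single `load_byte(21)` reads physical address 117 and leaves `mem_refs` at 2: one access for the PTE and one for the data itself.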
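For every memory reference (whether an instruction fetch or an explicit load or store), paging requires us to perform one extra memory reference in order to first fetch the translation from the page table. This doubles the number of memory accesses, and memory accesses are costly.
Therefore, without careful design, paging has two real problems:
- It runs slowly, because every memory reference needs an extra page-table lookup in memory.
- It takes up too much memory, because every process needs its own (potentially large) page table.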
Summary
Pro: Paging does not lead to external fragmentation, and it is quite flexible.
Con: A slower machine (extra memory accesses for translation) and wasted memory (space taken up by page tables).