There is no visible difference between a non-aliasing VIPT cache and a
PIPT cache. Address bit allocation for a 32-byte cache line VIPT
non-aliasing cache with 4K page size:
[0:1] = byte offset into word
[4:2] = word offset
[N-1:5] = virtual index
[31:N] = physical tag
where N <= PAGE_SHIFT, otherwise it would be an aliasing VIPT cache.
For a PIPT non-aliasing cache:
[0:1] = byte offset into word
[4:2] = word offset
[M-1:5] = physical index
[31:M] = physical tag
where M doesn't matter as it can never alias with itself.
The requirements for N is such the CPU visible conditions which qualify a
cache as being VIPT non-aliasing also satisfy PIPT - and a non-aliasing
VIPT cache has the same properties as a PIPT cache.
- Latency: The physical address is available from the MMU some time, perhaps a few cycles, after the virtual address is available from the address generator.
- Aliasing: Multiple virtual addresses can map to a single physical address. Most processors guarantee that all updates to that single physical address will happen in program order. To deliver on that guarantee, the processor must ensure that only one copy of a physical address resides in the cache at any given time.
- Granularity: The virtual address space is broken up into pages. For instance, a 4 GB virtual address space might be cut up into 1048576 pages of 4 KB size, each of which can be independently mapped. There may be multiple page sizes supported; see virtual memory for elaboration.
A historical note: some early virtual memory systems were very slow, because they required an access to the page table (held in main memory) before every programmed access to main memory.[NB 1] With no caches, this effectively cut the speed of the machine in half. The first hardware cache used in a computer system was not actually a data or instruction cache, but rather a TLB.
Caches can be divided into 4 types, based on whether the index or tag correspond to physical or virtual addresses:
- Physically indexed, physically tagged (PIPT) caches use the physical address for both the index and the tag. While this is simple and avoids problems with aliasing, it is also slow, as the physical address must be looked up (which could involve a TLB miss and access to main memory) before that address can be looked up in the cache.
- Virtually indexed, virtually tagged (VIVT) caches use the virtual address for both the index and the tag. This caching scheme can result in much faster lookups, since the MMU doesn't need to be consulted first to determine the physical address for a given virtual address. However, VIVT suffers from aliasing problems, where several different virtual addresses may refer to the same physical address. The result is that such addresses would be cached separately despite referring to the same memory, causing coherency problems. Another problem is homonyms, where the same virtual address maps to several different physical addresses. It is not possible to distinguish these mappings by only looking at the virtual index, though potential solutions include: flushing the cache after a context switch, forcing address spaces to be non-overlapping, tagging the virtual address with an address space ID (ASID), or using physical tags. Additionally, there is a problem that virtual-to-physical mappings can change, which would require flushing cache lines, as the VAs would no longer be valid.
- Virtually indexed, physically tagged (VIPT) caches use the virtual address for the index and the physical address in the tag. The advantage over PIPT is lower latency, as the cache line can be looked up in parallel with the TLB translation, however the tag can't be compared until the physical address is available. The advantage over VIVT is that since the tag has the physical address, the cache can detect homonyms. VIPT requires more tag bits, as the index bits no longer represent the same address.
- Physically indexed, virtually tagged (PIVT) caches are only theoretical as they would basically be useless.[13]
The speed of this recurrence (the load latency) is crucial to CPU performance, and so most modern level-1 caches are virtually indexed, which at least allows the MMU's TLB lookup to proceed in parallel with fetching the data from the cache RAM.
But virtual indexing is not the best choice for all cache levels. The cost of dealing with virtual aliases grows with cache size, and as a result most level-2 and larger caches are physically indexed.
Caches have historically used both virtual and physical addresses for the cache tags, although virtual tagging is now uncommon. If the TLB lookup can finish before the cache RAM lookup, then the physical address is available in time for tag compare, and there is no need for virtual tagging. Large caches, then, tend to be physically tagged, and only small, very low latency caches are virtually tagged. In recent general-purpose CPUs, virtual tagging has been superseded by vhints, as described below.