The evolution of driver page remapping

Two weeks ago, this page looked at the newVM_UNPAGED flag, introduced in 2.6.15-rc2 to mark virtual memory areas (VMAs) which are not made up of "normal" pages. These areas are usually created by device drivers which map special memory areas (which may or may not be device I/O memory) into user space. Your editor now humbly suggests that readers ignore that article; things have changed significantly since then.

As it turns out, Linus didn't like the VM_UNPAGED idea, so he rewrote the code for 2.6.15-rc4. TheVM_UNPAGED VMA flag is gone, replaced by VM_PFNMAP. The new flag has a very similar meaning: it marks the VMA as containing special page table entries which should not be touched by the VM subsystem. In particular, it states that there is no page structure associated with any page in that VMA, so the VM subsystem should not go looking for one. Even in cases where that structure does exist (such as remappings of real memory), the VM code will pretend that it does not.

The advantage of the reworked code is that it takes out a number of special cases; theVM_PFNMAP VMAs can be treated just like normal VMAs in more places. Things quickly got a bit more complicated, however. The initialVM_PFNMAP code assumed that a linear range of addresses was being mapped into user space. In fact, some drivers piece together memory in more complicated ways.

So a subsequent patch added explicit support for "incomplete" VMAs, marked with yet another flag:VM_INCOMPLETE. When the kernel detects that a driver is creating something other than a straightforward, linear mapping, it sets that flag and emits a warning. It also requires, in this case, that the pages being remapped carry thePG_reserved flag - even though this flag is being phased out. Remapping RAM in this way always required that flag in the past, so this requirement is not a change as far as drivers are concerned.

The patch adding VM_INCOMPLETE notes that "In the long run we almost certainly want to export a totally different interface for that, though." In this case, "in the long run" meant about one day, when yet another patch was merged adding a new function:

int vm_insert_page(struct vm_area_struct *vma, 
                       unsigned long address,
                       struct page *page);

This function inserts the given page into vma, mapped at the givenaddress. It does not put out warnings, and does not require that PG_reserved be set. What itdoes require is that the page be an order-zero allocation obtained for this purpose; it is not possible to remap arbitrary RAM pages withvm_insert_page(). Since a page structure is required, the new function is also unsuitable for remapping I/O memory. But it is useful for drivers which wish to map a set of pages into a user-space address range.

Just which driver might want to do something like that became clear when another patch was merged for 2.6.15-rc5. It removed the GPL-only export forvm_insert_page() and included this commit message:

Make vm_insert_page() available to NVidia module. It used to use remap_pfn_range(), which wasn't GPL-only either, and the new interface is actually simpler and does more checking, so we shouldn't unnecessarily discourage people from switching over.

Some developers objected to this change, seeing it as an explicit endorsement of the proprietary NVidia drivers. Others, however, saw it as a simple attempt to avoid breaking drivers without a good reason. The kernel developers may well be working toward taking a stronger stand against proprietary modules, but this particular interface will not be the place where that battle is fought.

你可能感兴趣的:(insert,interface,patch,Allocation,Warnings,structure)