Hot Patching

Well, how frustrating is this... as far as I can tell, .Text completely ate my last post, except for a couple of sentences at the top. I've switched to writing these posts off-line (like a real blogger!), so hopefully that's the last time this will happen to me.

Some co-workers and I were discussing a change you may have noticed in recent Windows kernel binaries. Disassembling a kernel function shows an odd-looking instruction at the top:

 

   lkd> u NtCreateFile
   nt!NtCreateFile:
   80570d48 8bff             mov     edi,edi
   80570d4a 55               push    ebp
   80570d4b 8bec             mov     ebp,esp
   ...

Notice the 'mov edi,edi' at the top - that would seem to be a creative no-op, and in fact, it is. This seemingly useless instruction exists to enable a new capability in modern kernels: hot patching. Hot patching addresses the availability requirements of modern servers (and workstations, for that matter) by letting certain hotfixes patch the live kernel and redirect calls from the existing function into a replacement (or potentially a filter function, etc). Administrators need something like this so they can apply hotfixes in a timely manner, rather than waiting for a maintenance window. The world is a better place when servers are patched!

Technically, this works by replacing the two bytes of that MOV at the top of a function with a jump instruction. Now, some of you may have noticed that, on x86, two bytes will only get you a short jump, which can reach roughly 128 bytes in either direction (it carries a signed 8-bit displacement). Clearly this isn't going to do it - you'd have to write the new function into memory that is probably occupied by another function, which is no good. However, a little extra searching reveals this:

 

   lkd> u NtCreateFile-0x10
   nt!NtOpenFile+0x55:
   80570d38 0f841afdffff     je      nt!FsRtlCurrentBatchOplock+0xf7 (80570a58)
   80570d3e e9cd440600       jmp     nt!FsRtlRegisterUncProvider+0x18b (805d5210)
   80570d43 90               nop
   80570d44 90               nop
   80570d45 90               nop
   80570d46 90               nop
   80570d47 90               nop
   nt!NtCreateFile:
   80570d48 8bff             mov     edi,edi
   ...

Well, just what we needed - five spare bytes, which is exactly enough for a long jump within the current code segment (one opcode byte plus a 32-bit relative displacement). Because those bytes sit immediately before the MOV, they're in range for our short jump. A hotfix can write a relative long jump to the replacement function into those NOP bytes on the fly, then overwrite the MOV with a short jump back to them.
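To make the byte-level mechanics concrete, here's a rough Python sketch of how the two jumps could be encoded. The helper names (`short_jmp`, `near_jmp`) and the replacement address are made up for illustration; only the NtCreateFile addresses come from the listing above. This is not how the actual hotfix machinery is implemented, and it ignores the hard part of writing these bytes safely into a running kernel.

```python
import struct

def short_jmp(src: int, dst: int) -> bytes:
    """Encode a 2-byte x86 short jump (EB rel8) placed at src, targeting dst."""
    rel = dst - (src + 2)  # displacement is relative to the next instruction
    if not -128 <= rel <= 127:
        raise ValueError("target is out of short-jump range")
    return b"\xEB" + struct.pack("<b", rel)

def near_jmp(src: int, dst: int) -> bytes:
    """Encode a 5-byte x86 near jump (E9 rel32) placed at src, targeting dst."""
    rel = (dst - (src + 5)) & 0xFFFFFFFF  # wrap to 32 bits, like the CPU does
    return b"\xE9" + struct.pack("<I", rel)

# Sanity check against a jump the debugger already decoded in the listing:
# 80570d3e e9cd440600  jmp ... (805d5210)
assert near_jmp(0x80570D3E, 0x805D5210) == bytes.fromhex("e9cd440600")

FUNC = 0x80570D48         # nt!NtCreateFile, from the listing
PAD = FUNC - 5            # first of the five NOP bytes (80570d43)
REPLACEMENT = 0xF7A01000  # hypothetical address of the replacement function

pad_patch = near_jmp(PAD, REPLACEMENT)  # goes over the five NOPs
entry_patch = short_jmp(FUNC, PAD)      # goes over 'mov edi,edi'

print("pad bytes:  ", pad_patch.hex())
print("entry bytes:", entry_patch.hex())  # eb f9, i.e. "jmp $-5"
```

Note that the short jump always comes out as the same two bytes, EB F9, no matter which function is being patched, because the padding is always exactly five bytes before the entry point. Only the long jump's displacement varies.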

This design is interesting for several reasons. First, Microsoft chose to use the "creative no-op", rather than two real NOP instructions, even though they then used five real NOPs for the long jump. The reason is easy - the MOV is executed every time the function is called, at least for as long as the function is not hooked. Because you could be talking about a measurable performance impact on high-traffic code paths, those two bytes should execute as fast as possible, and with minimal side effects. The MOV is just what the doctor ordered - as a single instruction, it should execute faster than two consecutive NOP instructions.

Why didn't Microsoft just put the five NOPs in as the first bytes of the function (rather than using the short jump)? Again, the answer is performance - they have optimized for the common case (the unhooked function), rather than the uncommon case. This costs two extra bytes per function statically, but an unhooked call executes one filler instruction instead of three, which saves you 2/3 of the instructions (assuming you would otherwise have reserved your five bytes with MOV / MOV / NOP).

One other question that came up was why Microsoft didn't just use Detours, or something much like it. Detours relies on disassembly to perform runtime hooks like these, and it winds up overwriting a number of bytes at the beginning of the function to make this work. Getting that right for arbitrary code is difficult, at best - the implemented solution is much more robust. The cost is an extra seven bytes per function, but it's worth it, IMO.
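One way to see the robustness argument: with a fixed layout, deciding whether a function can be patched is a plain byte comparison, with no disassembler involved. Here's a small Python sketch of that check. The function name `is_hot_patchable` is my own invention, and I'm assuming the padding is always 0x90 bytes as in the listing above; a real patcher may well accept other filler.

```python
HOTPATCH_PAD = b"\x90" * 5    # five NOPs immediately before the entry point
HOTPATCH_ENTRY = b"\x8B\xFF"  # mov edi,edi

def is_hot_patchable(image: bytes, entry: int) -> bool:
    """Return True if the function at offset `entry` has the pad + MOV layout."""
    if entry < len(HOTPATCH_PAD):
        return False  # not enough room in front of the function for the pad
    pad = image[entry - len(HOTPATCH_PAD):entry]
    head = image[entry:entry + len(HOTPATCH_ENTRY)]
    return pad == HOTPATCH_PAD and head == HOTPATCH_ENTRY

# Bytes 80570d3e..80570d4c, copied from the debugger listings in this post:
# jmp rel32, five NOPs, mov edi,edi, push ebp, mov ebp,esp
snippet = bytes.fromhex("e9cd440600" "9090909090" "8bff" "55" "8bec")

print(is_hot_patchable(snippet, 10))  # True: offset 10 is nt!NtCreateFile
print(is_hot_patchable(snippet, 12))  # False: offset 12 is 'push ebp'
```

A Detours-style hook, by contrast, has to decode however many whole instructions happen to cover the bytes it wants to overwrite, and then relocate them somewhere else.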

This is an interesting, and instructive, bit of low-level software engineering, where space, performance, and robustness must be balanced.
