One might think that this topic is not a big deal. We have a guideline called ARM-EABI (Embedded Application Binary Interface). Among other things, this document specifies which registers must be preserved, how the stack must be handled, and so on. We just have to follow these rules, don't we?
Well, life is not so simple. The developers of SquirrelFish Extreme, the JavaScript JIT compiler of WebKit, pushed the limits of the x86 ABI in two ways.
The first one is the unusual way of passing arguments. A typical compiler pushes the arguments onto the stack, and they are later removed either by the caller or by the callee function. In SquirrelFish Extreme, however, the arguments are never removed. The same argument list is passed to the high-level C++ callback functions again and again. The first 16 items of this list hold the temporary arguments, which may be changed by the JITted code, but the items after the first 16 are constants (for example, there is a pointer to the global data among them). In this way, the JIT can pass all the important constants to the C++ callbacks without any stack manipulation, so the code runs faster. Fortunately, the ARM-EABI passes arguments in a similar way, except that the first 4 are passed in registers. Since those are only temporary arguments, we can mimic the constant argument passing of the x86 JIT on ARM as well.
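To make the layout concrete, here is a minimal sketch of such a reusable argument list. It is not the actual WebKit code: the names (ArgumentList, exampleCallback, callTwice), the number of constant slots, and the position of the global data slot are all assumptions made for illustration. Only the "16 temporary items, then constants" split comes from the description above.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kTemporarySlots = 16; // from the description above
constexpr std::size_t kConstantSlots = 2;   // assumption for this sketch
constexpr std::size_t kGlobalDataSlot = 0;  // assumption: index within constants

// Hypothetical argument list: built once, then reused for every callback.
struct ArgumentList {
    std::uintptr_t temporaries[kTemporarySlots]; // rewritten by the JITted code
    std::uintptr_t constants[kConstantSlots];    // written once, never touched again
};

// A stand-in for a C++ callback: it reads its per-call argument from a
// temporary slot and finds the shared constant at a fixed offset.
constexpr std::uintptr_t exampleCallback(const ArgumentList& args) {
    return args.temporaries[0] + args.constants[kGlobalDataSlot];
}

// Two consecutive "calls": only a temporary slot changes between them,
// the constants are neither pushed again nor removed.
constexpr std::uintptr_t callTwice() {
    ArgumentList args{};
    args.constants[kGlobalDataSlot] = 1000; // e.g. a pointer to the global data
    args.temporaries[0] = 1;
    const std::uintptr_t first = exampleCallback(args);
    args.temporaries[0] = 2;
    const std::uintptr_t second = exampleCallback(args);
    return first + second;
}
```

The point of the sketch is that the constants always sit at the same offset from the start of the list, so a callback can reach them without the JIT emitting any extra pushes.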
The second feature of SquirrelFish Extreme is much less portable. On x86 architectures, even the return address is stored on the stack.
[ arg n ][ arg n-1 ] ... [ arg 2 ][ arg 1 ][ return address ]
Figure 1
When an exception occurs, SquirrelFish Extreme changes the return address of the C++ callback function. The new return address points to the exception handler in the JIT code. Note: only the high-level callbacks can throw exceptions; the JIT code itself never throws any. As you can see in Figure 1 (above), overwriting this return address is not that difficult on x86.
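The trick can be illustrated with a small simulation. This is only a model of the stack from Figure 1 as an array, not real return address patching (which needs compiler- and platform-specific code). The addresses and the function name are hypothetical.

```cpp
#include <cstdint>

constexpr std::uintptr_t kNormalReturn = 0x1000;     // hypothetical address in JIT code
constexpr std::uintptr_t kExceptionHandler = 0x2000; // hypothetical handler address

// Returns the address the simulated callback would return to.
constexpr std::uintptr_t returnTargetAfterCallback(bool callbackThrows) {
    // Simulated stack, lowest address first: [return address][arg 1][arg 2]
    std::uintptr_t stack[3] = { kNormalReturn, 11, 22 };
    std::uintptr_t* args = &stack[1]; // the callback knows where its arguments are

    // The return address sits directly below the first argument, so
    // redirecting the return is a single store.
    if (callbackThrows)
        args[-1] = kExceptionHandler;

    return stack[0];
}
```

In this model, returnTargetAfterCallback(false) yields the normal return address, while returnTargetAfterCallback(true) yields the handler address.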
On ARM, the return address is stored in the link register. The link register is a scratch register: the callee function can do anything with it as long as it is still able to return to the caller. However, we noticed that most of the time the compiler simply pushes it onto the top of the stack with an stmdb sp!, { ... , lr } or an str lr, [sp, #-4]! instruction.
Our method is the following: if the callback function starts with one of the instructions above, we know that the return address is stored on the top of the stack (exactly the same way as on x86). We wrote a smart algorithm to detect these instructions, even if the instruction scheduler of the compiler has reordered them. If the algorithm cannot find a matching instruction, the JIT creates a stub function for that specific callback function. The stub function provides a way to overwrite the return address even for those functions. The failure rate is very low: only 1 stub needs to be generated for every 40-50 callbacks.
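To give a feel for the detection step, here is a simplified sketch that recognizes the two instructions by their 32-bit ARM encodings. It is not the algorithm used in the JIT, whose details are not described here: the function names and the fixed scan window are assumptions. A real detector would also have to check that the instructions scheduled before the match do not modify sp or lr.

```cpp
#include <cstddef>
#include <cstdint>

// The condition field (bits 31-28) is masked out in both checks.

// stmdb sp!, { ..., lr }: bits 27-16 are 0x92D, bit 14 selects lr.
// pc (bit 15) must not be in the list, otherwise lr would not be the
// register stored at the highest address (old sp - 4).
constexpr bool isStmdbSpWithLr(std::uint32_t insn) {
    return (insn & 0x0FFF0000u) == 0x092D0000u
        && (insn & 0x0000C000u) == 0x00004000u;
}

// str lr, [sp, #-4]!
constexpr bool isStrLrPreDecrement(std::uint32_t insn) {
    return (insn & 0x0FFFFFFFu) == 0x052DE004u;
}

// Scans the first `window` instructions of a function, because the
// scheduler may have moved the push away from the very first slot.
// Simplification: the other instructions in the window are not inspected.
constexpr bool savesLrOnStack(const std::uint32_t* code, std::size_t window) {
    for (std::size_t i = 0; i < window; ++i) {
        if (isStmdbSpWithLr(code[i]) || isStrLrPreDecrement(code[i]))
            return true;
    }
    return false;
}
```

For example, 0xE92D4010 (stmdb sp!, { r4, lr }) and 0xE52DE004 (str lr, [sp, #-4]!) are both accepted, while 0xE92D0010 (stmdb sp!, { r4 }) is rejected. In the scheme described above, a function for which the scan fails would get a stub instead.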
If you like this kind of technical post, just write an encouraging comment, and I will continue the series. :)