JIT-related techniques: http://www.cs.toronto.edu/~matz/dissertation/matzDissertation-latex2html/node1.html
cti: context-threaded-interpreter
Direct-threaded interpreters: use indirect branches to dispatch bytecodes, but deeply-pipelined architectures rely on branch prediction for performance. Due to the poor correlation between the virtual program's control flow and the hardware program counter, which we call the context problem, direct threading's indirect branches are poorly predicted by the hardware, limiting performance.
Context threading: improves branch prediction and performance by aligning hardware and virtual machine state. Linear virtual instructions are dispatched with native calls and returns, aligning the hardware and virtual PC. Thus, sequential control flow is predicted by the hardware return stack. We convert virtual branching instructions to native branches, mobilizing the hardware's branch prediction resources.
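The direct-threaded dispatch that context threading sets out to improve can be sketched as follows. This is only a minimal illustration, assuming GCC or Clang (it relies on the labels-as-values extension); the three-opcode instruction set and the name run_direct_threaded are hypothetical and do not come from the dissertation. Note that every handler ends in its own indirect branch, which is the kind of branch the hardware predicts poorly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical three-instruction virtual machine (illustrative only). */
enum { OP_PUSH, OP_ADD, OP_HALT };

/* Translates bytecode to direct-threaded code, runs it, and returns
 * the value on top of the stack when OP_HALT is reached. */
int run_direct_threaded(const int *bytecode, size_t n)
{
    /* Handler addresses, taken with the GCC/Clang && extension. */
    static void *const handlers[] = { &&do_push, &&do_add, &&do_halt };
    void *threaded[64];
    int stack[16];
    size_t sp = 0;
    size_t i = 0;
    void **vpc;                       /* virtual program counter */

    assert(n <= 64);

    /* Translation pass: replace each opcode by its handler address,
     * so that dispatch needs no table lookup at run time. */
    while (i < n) {
        int op = bytecode[i];
        assert(op >= OP_PUSH && op <= OP_HALT);
        threaded[i] = handlers[op];
        i++;
        if (op == OP_PUSH) {          /* OP_PUSH carries one inline operand */
            assert(i < n);
            threaded[i] = (void *)(intptr_t)bytecode[i];
            i++;
        }
    }

    vpc = threaded;

    /* One indirect branch per virtual instruction. */
#define DISPATCH() goto *(*vpc++)

    DISPATCH();

do_push:
    assert(sp < 16);
    stack[sp++] = (int)(intptr_t)*vpc++;
    DISPATCH();

do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    DISPATCH();

do_halt:
    assert(sp >= 1);
    return stack[sp - 1];
#undef DISPATCH
}
```

Calling run_direct_threaded on the program { OP_PUSH, 2, OP_PUSH, 5, OP_ADD, OP_HALT } with n = 6 returns 7. A context-threaded interpreter would instead emit a native call per linear virtual instruction, which cannot be shown in portable C.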
Branch predictor: Two-way branching is usually implemented with a conditional jump instruction. A conditional jump can either be "not taken" and continue execution with the first branch of code which follows immediately after the conditional jump, or it can be "taken" and jump to a different place in program memory where the second branch of code is stored.
Labels as Values:
static const int array[] = { &&foo - &&foo,&&bar - &&foo,&&hack - &&foo };
goto *(&&foo + array[ i ]);
This is more friendly to code living in shared libraries, as it reduces the number of dynamic relocations that are needed, and by consequence, allows the data to be read-only.
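A complete function built around the two lines above might look like the sketch below. It assumes GCC or Clang, since labels as values are an extension; the function name jump_relative and the return codes are made up for illustration. The noinline attribute is a precaution, because the GCC documentation warns that label addresses may differ if the containing function is inlined or cloned.

```c
#include <assert.h>

/* Jumps to foo, bar, or hack depending on i (0, 1, or 2) and returns
 * a code identifying the label that was reached. */
__attribute__((noinline))
int jump_relative(int i)
{
    /* Offsets relative to &&foo. No absolute addresses are stored,
     * so the table needs no dynamic relocations and can be read-only. */
    static const int offsets[] = { &&foo - &&foo, &&bar - &&foo, &&hack - &&foo };

    assert(i >= 0 && i < 3);
    goto *(&&foo + offsets[i]);

foo:
    return 10;
bar:
    return 20;
hack:
    return 30;
}
```

With these definitions, jump_relative(1) lands on bar and returns 20.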
Link register: a special-purpose register which holds the address to return to when a function call completes.
Thunk: a piece of code that performs a delayed computation, i.e. lazy computation or deferred execution.
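As a rough illustration of a thunk used for lazy computation, the sketch below stores a function and its argument and only runs the function when the value is first demanded. All names (thunk, thunk_make, thunk_force, square, evaluations) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A suspended computation: a function, its argument, and a cached result. */
typedef struct {
    int (*compute)(int);
    int arg;
    bool forced;     /* true once the value has been computed */
    int value;       /* valid only when forced is true */
} thunk;

/* Counts how often the underlying function really runs. */
int evaluations = 0;

int square(int x)
{
    evaluations++;
    return x * x;
}

/* Creates a thunk without running the computation. */
thunk thunk_make(int (*compute)(int), int arg)
{
    thunk t = { compute, arg, false, 0 };
    return t;
}

/* Runs the computation on first use and returns the cached value afterwards. */
int thunk_force(thunk *t)
{
    assert(t != NULL && t->compute != NULL);
    if (!t->forced) {
        t->value = t->compute(t->arg);
        t->forced = true;
    }
    return t->value;
}
```

Creating a thunk with thunk_make(square, 7) leaves evaluations at 0; the first thunk_force returns 49 and bumps the counter to 1, and later calls reuse the cached value.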
Stub: Method stub, in computer programming, a piece of code used to stand in for some other programming functionality.
Trampolines:
Here is another description of trampolines, from the GCC compiler documentation: http://gcc.gnu.org/onlinedocs/gccint/Trampolines.html
[1] is a small piece of code that is created at run time when the address of a nested function is taken.
Suppose function A calls function B, which happens to be too far away for the short CALL instruction to reach. The linker inserts a tiny trampoline function T which executes a long branch to function B, and rewrites A's call so that it actually calls T. However, T must be placed close to function A, or the short CALL in A won't reach it. For this reason, T needs to go in the same section as A.
[2]
Trampolines are small, hand-written pieces of assembly code used to perform various tasks in the mono runtime. They are generated at runtime using the native code generation macros used by the JIT. They usually have a corresponding C function they can fall back to if they need to perform a more complicated task. They can be viewed as ways to pass control from JITted code back to the runtime.
The common code for all architectures is in mini-trampolines.c; this file contains the trampoline creation functions plus the C functions called by the trampolines. The tramp-<ARCH>.c files contain the arch-dependent code which creates the trampolines themselves.
Most, but not all trampolines consist of two parts:
The generic part saves the machine state to the stack, and calls one of the trampoline functions in mini-trampolines.c with the state, the call site, and the argument passed by the specific trampoline. After the C function returns, it either returns normally, or branches to the address returned by the C function, depending on the trampoline type.
Trampoline types are given by the MonoTrampolineType enumeration in mini.h.
The platform specific code for trampolines is in the file tramp-<ARCH>.c for each architecture, while the cross platform code is in mini-trampolines.c. There are two types of functions in mini-trampolines.c:
Trampoline creation functions have the following signature:
gpointer mono_arch_create_foo_trampoline (<args>, MonoTrampInfo **info, gboolean aot)
The function should return a pointer to the newly created trampoline, allocating memory from either the global code manager, or from a domain's code manager. If INFO is not NULL, it is set to a pointer to a MonoTrampInfo structure, which contains information about the trampoline, like its name, unwind info, etc. This is used for two purposes:
JIT trampolines are used to JIT compile a method the first time it is called. When the JIT compiles a call instruction, it doesn't compile the called method right away. Instead, it creates a JIT trampoline, and emits a call instruction referencing the trampoline. When the trampoline is called, it calls mono_magic_trampoline () which compiles the target method, and returns the address of the compiled code to the trampoline, which branches to it. This process is somewhat slow, so mono_magic_trampoline () tries to patch the calling JITted code so it calls the compiled code instead of the trampoline from now on. This is done by mono_arch_patch_callsite () in tramp-<ARCH>.c.
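The compile-on-first-call and patch-the-call-site idea can be modelled in plain C with a function pointer standing in for the call instruction. This is only a conceptual sketch, not Mono code: call_site, fake_compile, jit_trampoline, invoke, and compiled_double are hypothetical names, and the real trampoline is generated machine code that finds the call site from the return address rather than receiving it as an argument.

```c
#include <assert.h>
#include <stddef.h>

typedef struct call_site call_site;
typedef int (*target_fn)(call_site *site, int arg);

/* Stand-in for a call instruction: the slot holds the current call target. */
struct call_site {
    target_fn target;
};

/* Counts how many times the pretend JIT has compiled the method. */
int compile_count = 0;

/* The "compiled" method body. */
int compiled_double(call_site *site, int arg)
{
    (void)site;
    return 2 * arg;
}

/* Stand-in for the JIT: "compiles" the method and returns its address. */
target_fn fake_compile(void)
{
    compile_count++;
    return compiled_double;
}

/* Plays the role of the trampoline: compile, patch the caller, then
 * continue into the freshly compiled code. */
int jit_trampoline(call_site *site, int arg)
{
    target_fn code = fake_compile();
    assert(site != NULL && code != NULL);
    site->target = code;        /* later calls bypass the trampoline */
    return code(site, arg);
}

/* Executes the call instruction represented by the site. */
int invoke(call_site *site, int arg)
{
    return site->target(site, arg);
}
```

A site initialised with jit_trampoline compiles on the first invoke and afterwards points straight at compiled_double, so compile_count stays at 1 no matter how many further calls are made.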
There is one virtual call trampoline per vtable slot index. The trampoline uses this index plus the 'this' argument which is passed in a fixed register/stack slots by the managed calling convention to obtain the virtual method which needs to be compiled. It then patches the vtable slot with the address of the newly compiled method.
<TODO IMT>
Jump trampolines are very similar to JIT trampolines; they even use the same mono_magic_trampoline () C function. They are used to implement the LDFTN and the JMP IL opcodes.
Class init trampolines are used to implement the type initialization semantics of the CLI spec. They call the mono_class_init_trampoline () C function which executes the class initializer of the class passed as the trampoline argument, then replaces the code calling the class init trampoline with NOPs so it is not executed anymore.
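The run-once-then-become-a-NOP behaviour can be imitated in C by letting the initialization hook overwrite its own call slot with a do-nothing function. Again this is a simplified model under stated assumptions, not Mono's implementation: klass, init_site, nop_hook, class_init_trampoline, and read_static_field are hypothetical names, and the real runtime patches machine code rather than a function pointer.

```c
#include <assert.h>
#include <stddef.h>

typedef struct klass klass;
typedef struct init_site init_site;

/* Pretend class with one static field and a counter of initializer runs. */
struct klass {
    int static_field;
    int init_runs;
};

/* Stand-in for the code location that calls the class init trampoline. */
struct init_site {
    void (*hook)(klass *k, init_site *site);
};

/* Plays the role of the NOPs written over the call site. */
void nop_hook(klass *k, init_site *site)
{
    (void)k;
    (void)site;
}

/* Runs the "class initializer" once, then disables the call site. */
void class_init_trampoline(klass *k, init_site *site)
{
    assert(k != NULL && site != NULL);
    k->static_field = 42;       /* effect of the class initializer */
    k->init_runs++;
    site->hook = nop_hook;      /* the call is not executed anymore */
}

/* Code that touches a static field and so must ensure initialization. */
int read_static_field(klass *k, init_site *site)
{
    site->hook(k, site);
    return k->static_field;
}
```

With a site initialised to class_init_trampoline, the first read_static_field runs the initializer and every later call goes through nop_hook, so init_runs stays at 1.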
The generic class init trampoline is similar to the class init trampolines, but is used for initializing classes which are only known at run-time, in generic-shared code. It receives the class to be initialized in a register instead of from a specific trampoline. This means there is only one instance of this trampoline.
RGCTX lazy fetch trampolines are used for fetching values from a runtime generic context, lazily initializing the values if they do not exist yet. There is one instance of this trampoline for each offset value.
AOT trampolines are similar to the JIT trampolines but instead of receiving a MonoMethod to compile, they receive an image+token pair. If the method identified by this pair is also AOT compiled, the address of its compiled code can be obtained without loading the metadata for the method.
AOT PLT trampolines handle calls made from AOT code through the PLT.
Delegate trampolines are used to handle the first call made to the delegate through its Invoke method. They call mono_delegate_trampoline () which creates a specialized calling sequence optimized to the delegate instance before calling it. Further calls will go through to this optimized code sequence.
Monitor trampolines implement the fastpath of Monitor.Enter/Exit on some platforms.