Android 4.3 Dalvik Stack layout
In what follows, the "top" of the stack is at a low position in memory, and the "bottom" of the stack is in a high position (put more simply, they grow downward). They may be merged with the native stack at a later date. The interpreter assumes that they have a fixed size, determined when the thread is created.

Dalvik's registers (of which there can be up to 64K) map to the "ins" (method arguments) and "locals" (local variables). The "outs" (arguments to called methods) are specified by the "invoke" operand. The return value, which is passed through the interpreter rather than on the stack, is retrieved with a "move-result" instruction.

    Low addresses (0x00000000)

                         +- - - - - - - - -+
                         -  out0           -
                         +-----------------+  <-- stack ptr (top of stack)
                         +  VM-specific    +
                         +  internal goop  +
                         +-----------------+  <-- curFrame: FP for cur function
                         +  v0 == local0   +
    +-----------------+  +-----------------+
    +  out0           +  +  v1 == in0      +
    +-----------------+  +-----------------+
    +  out1           +  +  v2 == in1      +
    +-----------------+  +-----------------+
    +  VM-specific    +
    +  internal goop  +
    +-----------------+  <-- frame ptr (FP) for previous function
    +  v0 == local0   +
    +-----------------+
    +  v1 == local1   +
    +-----------------+
    +  v2 == in0      +
    +-----------------+
    +  v3 == in1      +
    +-----------------+
    +  v4 == in2      +
    +-----------------+
    -                 -
    -                 -
    -                 -
    +-----------------+  <-- interpStackStart

    High addresses (0xffffffff)

Note the "ins" and "outs" overlap -- values pushed into the "outs" area become the parameters to the called method. The VM guarantees that there will be enough room for all possible "outs" on the stack before calling into a method.

All "V registers" are 32 bits, and all stack entries are 32-bit aligned. Registers are accessed as a positive offset from the frame pointer, e.g. register v2 is fp[2]. 64-bit quantities are stored in two adjacent registers, addressed by the lower-numbered register, and are in host order. 64-bit quantities do not need to start in an even-numbered register.

We push two stack frames on when calling an interpreted or native method directly from the VM (e.g. invoking <clinit> or via reflection "invoke()"). The first is a "break" frame, which allows us to tell when a call return or exception unroll has reached the VM call site. Without the break frame the stack might look like an uninterrupted series of interpreted method calls. The second frame is for the method itself.

The "break" frame is used as an alternative to adding additional fields to the StackSaveArea struct itself. They are recognized by having a NULL method pointer.

When calling a native method from interpreted code, the stack setup is essentially identical to calling an interpreted method. Because it's a native method, though, there are never any "locals" or "outs".

For native calls into JNI, we want to store a table of local references on the stack. The GC needs to scan them while the native code is running, and we want to trivially discard them when the method returns. See JNI.c for a discussion of how this is managed. In particular note that it is possible to push additional call frames on without calling a method.
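The text above is the official English description of the interpreter stack, taken from the comments in Stack.cpp.
A rough translation, with notes:
Interpreter stack layout
The top of the interpreter stack is at a low memory address and the bottom is at a high address; put simply, the stack grows downward.
At some later date the interpreter stack may be merged with the native stack, which is currently separate from it.
The interpreter assumes that each stack has a fixed size, determined when the thread is created.
Dalvik's registers (up to 64K of them) map onto the "ins" (method arguments) and "locals" (local variables) of a frame. The "outs" (arguments passed to called methods) are named by the operands of the "invoke" instruction.
A method's return value is passed through the interpreter and retrieved with the "move-result" instruction; it is never placed on the stack.
SP, stack ptr (top of stack): the top-of-stack pointer, used when pushing and popping.
curFrame, FP for cur function: the frame pointer of the method that is currently executing.
FP, frame ptr (FP) for previous function: the frame pointer of the calling method's frame. A frame is made up of 32-bit slots, and a 64-bit value occupies two of them.
(See the diagram above.)
Note that the "ins" and "outs" overlap in the diagram: values written into the caller's outs area become the parameters of the called method, so the caller's outs are the callee's ins. The VM guarantees that the stack has enough room for all possible outs before calling into a method.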
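The overlap can be made concrete with a little pointer arithmetic. The following is a minimal sketch of the layout in the diagram, not the VM's own code: the helper names (`outsFromFp`, `calleeFp`, `firstInIndex`) and the constant `kSaveAreaWords` are hypothetical, and the save area is assumed to be five 32-bit words, as on a 32-bit build without the optional padding fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: the save area ("VM-specific internal goop") is five 32-bit words.
constexpr std::size_t kSaveAreaWords = 5;

// The caller's outs sit directly above (at lower addresses than) its save area,
// and the save area sits directly above the caller's frame pointer.
inline uint32_t* outsFromFp(uint32_t* callerFp, std::size_t outCount) {
    return callerFp - kSaveAreaWords - outCount;
}

// The callee's registers end exactly where the caller's save area begins.
inline uint32_t* calleeFp(uint32_t* callerFp, std::size_t calleeRegisters) {
    return callerFp - kSaveAreaWords - calleeRegisters;
}

// Ins occupy the highest-numbered registers of the callee.
inline std::size_t firstInIndex(std::size_t registersSize, std::size_t insSize) {
    return registersSize - insSize;
}
```

With a callee that has 3 registers, 2 of which are ins, the callee's v1 and v2 are the same memory words as the caller's out0 and out1, which is exactly what the side-by-side boxes in the diagram show.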
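All v registers are 32 bits wide, and all stack entries are 32-bit (4-byte) aligned.
Registers are accessed at a positive offset from the frame pointer (not the stack pointer); for example, register v2 is fp[2].
A 64-bit value is stored in two adjacent registers and addressed by the lower-numbered one. The two halves are laid out in host (native) byte order, and a 64-bit value does not need to start in an even-numbered register.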
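As an illustration of these access rules, here is a minimal sketch with hypothetical helper names (`getRegister`, `getRegisterWide`, `setRegisterWide`); it assumes `fp` points at v0 of the current frame. Because a wide value may start in an odd-numbered register, its address is only guaranteed to be 4-byte aligned, so the sketch copies the two words with `memcpy` rather than dereferencing a 64-bit pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: register vN is simply fp[N].
inline uint32_t getRegister(const uint32_t* fp, std::size_t idx) {
    return fp[idx];
}

// Reads the 64-bit value held in registers idx and idx + 1, in host byte order.
inline int64_t getRegisterWide(const uint32_t* fp, std::size_t idx) {
    int64_t value;
    std::memcpy(&value, &fp[idx], sizeof(value));
    return value;
}

// Writes a 64-bit value into registers idx and idx + 1, in host byte order.
inline void setRegisterWide(uint32_t* fp, std::size_t idx, int64_t value) {
    std::memcpy(&fp[idx], &value, sizeof(value));
}
```

Which of the two registers receives the high half depends on the endianness of the host, which is what "host order" means here.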
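(Stack frames during a method call:)
When an interpreted or native method is called directly from the VM (for example when invoking <clinit>, or via reflection "invoke()"), two stack frames are pushed.
The first is the break frame. It lets us tell when a call return or an exception unwind has reached the VM call site; without it, the stack could look like one uninterrupted series of interpreted method calls.
A break frame consists of nothing but a StackSaveArea struct, 20 bytes when none of the optional fields are compiled in, and its method field is NULL.
The StackSaveArea struct: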
struct StackSaveArea {
#ifdef PAD_SAVE_AREA
    u4          pad0, pad1, pad2;
#endif

#ifdef EASY_GDB
    /* make it easier to trek through stack frames in GDB */
    StackSaveArea* prevSave;
#endif

    /* saved frame pointer for previous frame, or NULL if this is at bottom */
    u4*         prevFrame;

    /* saved program counter (from method in caller's frame) */
    const u2*   savedPc;

    /* pointer to method we're *currently* executing; handy for exceptions */
    const Method* method;

    union {
        /* for JNI native methods: bottom of local reference segment */
        u4          localRefCookie;

        /* for interpreted methods: saved current PC, for exception stack
         * traces and debugger traces */
        const u2*   currentPc;
    } xtra;

    /* Native return pointer for JIT, or 0 if interpreted */
    const u2*   returnAddr;
#ifdef PAD_SAVE_AREA
    u4          pad3, pad4, pad5;
#endif
};
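I do not know what the part that was highlighted in red is for. If it is compiled in, the struct would be 8 bytes larger.
The second frame is the frame for the method itself, called the regular frame. A regular frame first reserves 4 * registers bytes (registers being the method's register count) and is then followed by a 20-byte StackSaveArea struct whose method field points to the current method.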
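The sizes above can be checked with a small model. This is a sketch under stated assumptions, not the VM's code: `SaveAreaModel` is a hypothetical stand-in for StackSaveArea that stores each pointer as a `uint32_t` so that it has the 32-bit size on any host, and the helper functions are made up for illustration. It also ignores the room the VM additionally guarantees for the callee's outs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 32-bit model of the five unconditional StackSaveArea fields
// (PAD_SAVE_AREA and EASY_GDB are assumed to be undefined).
struct SaveAreaModel {
    uint32_t prevFrame;
    uint32_t savedPc;
    uint32_t method;      // 0 (NULL) marks a break frame
    uint32_t xtra;        // localRefCookie or currentPc
    uint32_t returnAddr;
};
static_assert(sizeof(SaveAreaModel) == 20, "five 32-bit fields");

// A break frame is only a save area.
constexpr std::size_t breakFrameBytes() {
    return sizeof(SaveAreaModel);
}

// A regular frame is the method's registers plus a save area.
constexpr std::size_t regularFrameBytes(std::size_t registersSize) {
    return registersSize * sizeof(uint32_t) + sizeof(SaveAreaModel);
}

// Bytes pushed when the VM calls a method directly: break frame + regular frame.
constexpr std::size_t vmCallBytes(std::size_t registersSize) {
    return breakFrameBytes() + regularFrameBytes(registersSize);
}

// Break frames are recognized by their NULL method pointer.
inline bool isBreakFrame(const SaveAreaModel& area) {
    return area.method == 0;
}
```

For a method with 5 registers, the regular frame takes 5 * 4 + 20 = 40 bytes, and a direct call from the VM pushes 60 bytes in total.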
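When a native method is called from interpreted code, the stack is set up in essentially the same way as for a call to an interpreted method. A native method's frame never has "locals" or "outs", however, because native code keeps its own local variables and outgoing arguments on the native stack.
(On every dvmCallMethod, before the method executes, either dvmPushInterpFrame (Java to Java) or dvmPushJNIFrame (Java to native) is called.)
For native calls into JNI, we want to store a table of local references on the stack. The GC needs to scan these tables while the native code is running, and we want to be able to discard the references trivially when the native method returns. See JNI.c for a discussion of how this is managed. Note in particular that additional call frames can be pushed without any method being called.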
Android 5.0 ART ShadowFrame layout
/*
* Return sp-relative offset for a Dalvik virtual register, compiler
* spill or Method* in bytes using Method*.
* Note that (reg >= 0) refers to a Dalvik register, (reg == -1)
* denotes an invalid Dalvik register, (reg == -2) denotes Method*
* and (reg <= -3) denotes a compiler temporary. A compiler temporary
* can be thought of as a virtual register that does not exist in the
* dex but holds intermediate values to help optimizations and code
* generation. A special compiler temporary is one whose location
* in frame is well known while non-special ones do not have a requirement
* on location in frame as long as code generator itself knows how
* to access them.
*
* +---------------------------+
* | IN[ins-1] | {Note: resides in caller's frame}
* | . |
* | IN[0] |
* | caller's ArtMethod | ... StackReference<ArtMethod>
* +===========================+ {Note: start of callee's frame}
* | core callee-save spill | {variable sized}
* +---------------------------+
* | fp callee-save spill |
* +---------------------------+
* | filler word | {For compatibility, if V[locals-1] used as wide
* +---------------------------+
* | V[locals-1] |
* | V[locals-2] |
* | |
* | . | ... (reg == 2)
* | V[1] | ... (reg == 1)
* | V[0] | ... (reg == 0) <---- "locals_start"
* +---------------------------+
* | Compiler temp region | ... (reg <= -3)
* | |
* | |
* +---------------------------+
* | stack alignment padding | {0 to (kStackAlignWords-1) of padding}
* +---------------------------+
* | OUT[outs-1] |
* | OUT[outs-2] |
* | . |
* | OUT[0] |
* | StackReference<ArtMethod> | ... (reg == -2) <<== sp, 16-byte aligned
* +===========================+
 */
static int GetVRegOffset(const DexFile::CodeItem* code_item,
                         uint32_t core_spills, uint32_t fp_spills,
                         size_t frame_size, int reg, InstructionSet isa) {
  DCHECK_EQ(frame_size & (kStackAlignment - 1), 0U);
  DCHECK_NE(reg, static_cast<int>(kVRegInvalid));
  int spill_size = POPCOUNT(core_spills) * GetBytesPerGprSpillLocation(isa)
      + POPCOUNT(fp_spills) * GetBytesPerFprSpillLocation(isa)
      + sizeof(uint32_t);  // Filler.
  int num_ins = code_item->ins_size_;
  int num_regs = code_item->registers_size_ - num_ins;
  int locals_start = frame_size - spill_size - num_regs * sizeof(uint32_t);
  if (reg == static_cast<int>(kVRegMethodPtrBaseReg)) {
    // The current method pointer corresponds to special location on stack.
    return 0;
  } else if (reg <= static_cast<int>(kVRegNonSpecialTempBaseReg)) {
    /*
     * Special temporaries may have custom locations and the logic above deals with that.
     * However, non-special temporaries are placed relative to the locals. Since the
     * virtual register numbers for temporaries "grow" in negative direction, reg number
     * will always be <= to the temp base reg. Thus, the logic ensures that the first
     * temp is at offset -4 bytes from locals, the second is at -8 bytes from locals,
     * and so on.
     */
    int relative_offset =
        (reg + std::abs(static_cast<int>(kVRegNonSpecialTempBaseReg)) - 1) * sizeof(uint32_t);
    return locals_start + relative_offset;
  } else if (reg < num_regs) {
    return locals_start + (reg * sizeof(uint32_t));
  } else {
    // Handle ins.
    return frame_size + ((reg - num_regs) * sizeof(uint32_t))
        + sizeof(StackReference<mirror::ArtMethod>);
  }
}
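Given a register number reg, GetVRegOffset returns the offset, in bytes relative to sp, of the stack slot that holds that register.
int spill_size: as the diagram shows, this is the size of the region at the top of the callee's frame. It covers the core callee-save spills, the fp callee-save spills, and one fixed 4-byte filler word, which per the diagram is kept for the case where the last local, V[locals-1], is used as a wide (64-bit) value.
int num_ins: the number of argument registers of the called method, read from the dex file (ins_size_).
int num_regs: the number of local-variable registers, that is, the total register count from the dex file minus the number of argument registers.
int locals_start: computed as frame_size - spill_size - num_regs * sizeof(uint32_t), where frame_size is the total size of the frame (obtained from the ComputeFrameSize method in ralloc_util.cc). It is the byte offset from sp to the first local, V[0].
Finally, the result depends on the kind of register. The Method* slot is at offset 0; compiler temporaries and locals are found by adding an offset to locals_start; ins live in the caller's frame, so their offset is computed from frame_size plus the size of the caller's method reference.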
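To make the arithmetic concrete, here is a simplified, standalone re-creation of the same computation. It is a sketch under stated assumptions rather than the ART function: `FrameInfo` and `vregOffset` are hypothetical names, spill counts are passed in directly instead of being derived from bit masks, and the constants assume a 32-bit target where each spill slot and the method reference are 4 bytes. The special register numbers follow the source comment (-2 for Method*, -3 and below for compiler temporaries).

```cpp
#include <cassert>
#include <cstdint>

// Assumed values for illustration only (32-bit target).
constexpr int kMethodPtrReg = -2;    // per the comment: reg == -2 denotes Method*
constexpr int kFirstTempReg = -3;    // per the comment: reg <= -3 is a compiler temp
constexpr int kWordBytes = 4;        // sizeof(uint32_t)
constexpr int kSpillSlotBytes = 4;   // assumed size of one GPR/FPR spill slot
constexpr int kMethodRefBytes = 4;   // assumed size of the caller's method reference

// Hypothetical summary of the inputs GetVRegOffset works with.
struct FrameInfo {
    int registers_size;    // total registers declared in the dex code item
    int ins_size;          // how many of those are arguments
    int core_spill_count;  // number of spilled core registers
    int fp_spill_count;    // number of spilled floating-point registers
    int frame_size;        // total frame size in bytes (16-byte aligned)
};

inline int vregOffset(const FrameInfo& f, int reg) {
    // Spill area at the top of the frame, plus the 4-byte filler word.
    const int spill_size =
        (f.core_spill_count + f.fp_spill_count) * kSpillSlotBytes + kWordBytes;
    const int num_regs = f.registers_size - f.ins_size;  // locals only
    const int locals_start = f.frame_size - spill_size - num_regs * kWordBytes;

    if (reg == kMethodPtrReg) {
        return 0;  // the method pointer sits at sp
    }
    if (reg <= kFirstTempReg) {
        // First temp is 4 bytes below V[0], the second 8 bytes below, and so on.
        return locals_start + (reg - kFirstTempReg - 1) * kWordBytes;
    }
    if (reg < num_regs) {
        return locals_start + reg * kWordBytes;  // an ordinary local
    }
    // An "in": it lives in the caller's frame, past the caller's method reference.
    return f.frame_size + (reg - num_regs) * kWordBytes + kMethodRefBytes;
}
```

For example, with a 64-byte frame, two core spills, no fp spills, and a method with 5 registers of which 2 are ins, spill_size is 2 * 4 + 4 = 12 and locals_start is 64 - 12 - 3 * 4 = 40. V[0] is then at sp + 40, V[2] at sp + 48, the first in (v3) at sp + 68, and the first compiler temporary at sp + 36.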