Source: http://cm.bell-labs.com/cm/cs/who/dmr/clcs.html
Abridged translation by 刘建文 (http://www.semi-translate.com/blog | http://blog.csdn.net/keminlau )
A calling sequence is the conventional sequence of instructions that call and return from a procedure. Because programs tend to make many procedure calls, a compiler should ensure that this sequence is as economical as possible.
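A (procedure or function) calling sequence is the agreed sequence of instructions for calling a procedure and returning from it. Because a program is built from many function calls, the compiler should make this sequence as economical as possible.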
A calling sequence is influenced by the semantics of the language and by the register layout, addressing modes, instruction set, and existing conventions of the target machine. It is closely connected with allocation of storage for variables local to procedures.
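The calling sequence is shaped by many factors: the semantics of the language, the register layout, the addressing modes, the instruction set, and the established conventions of the target machine. It is also closely tied to how storage is allocated for local variables.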
This document sets forth the issues involved in designing a calling sequence for the C language, and discusses experience with various environments. The system implementer and hardware designer might find hints about the efficient support of C; the ordinary user might gain some sense of how hard it can be to support even a simple language on today's hardware. The first several sections discuss the requirements of a procedure call and return forced by the C language and conventions, including the support of recursion and variable-sized argument lists, and methods of returning values of various types (which might be ignored by the calling procedure). Then some sample designs are presented. The last sections discuss various topics, such as growing the stack, supporting structure valued procedures, and the library routines setjmp and longjmp. An appendix presents the Interdata 8/32 calling sequence in detail.
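This document sets out the issues to consider when designing a calling sequence for C, and discusses experience in various environments. System implementers and hardware designers may find hints on supporting C efficiently; ordinary users may get a sense of how hard it is to support even a simple language on today's hardware. The first several sections discuss the requirements that the C language and its conventions place on procedure call and return, including support for recursion, variable-sized argument lists, and ways of returning values of various types (which the caller may ignore). Some sample designs follow. The last sections cover further topics such as growing the stack, supporting structure-valued procedures, and the library routines setjmp and longjmp.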
What features must be provided by the call of a C procedure? Procedures may call themselves recursively without explicit declaration. Because procedures can pass the addresses of local variables and arguments to subprocedures, keeping local variables in static cells is unacceptably complex. Thus, C run-time environments are built around a stack that contains procedure arguments and local variables. The maximum amount of stack space needed by a given program is data dependent and difficult to estimate in advance. Thus, it is convenient to permit the stack to start small and grow as needed. When this cannot be done automatically, stack overflow should at least be clearly signaled and stack size easily increased.
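What must a C procedure call provide? A procedure may call itself recursively without any explicit declaration. Because a procedure can pass the addresses of its local variables or arguments to subprocedures, giving local variables fixed storage is all but unworkable. C run-time environments are therefore built around a stack that holds procedure arguments and local variables. Because a program's stack usage depends on its data, the maximum stack space it needs is hard to estimate in advance. The usual convention is thus to start with a small stack and grow it on demand, with some mechanism for handling stack overflow.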
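To make the point about recursion and addresses of locals concrete, here is a minimal sketch in C. The names `sum_to` and `add_into` are hypothetical and chosen only for illustration. Each active call of `sum_to` needs its own copy of `partial`, and that copy's address is handed to a subprocedure, which is why one static cell per variable would not do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes through a pointer to its caller's local. */
static void add_into(long *total, long amount)
{
    assert(total != NULL);
    *total += amount;
}

/* Sums 1..n recursively. Every activation owns a separate `partial`
   and passes its address down to a subprocedure. */
long sum_to(int n)
{
    long partial = 0;                   /* one copy per active call */

    if (n <= 0)
        return 0;
    add_into(&partial, n);              /* callee updates this activation's local */
    add_into(&partial, sum_to(n - 1));  /* recursion creates a new `partial` */
    return partial;
}
```

With this sketch, `sum_to(4)` evaluates to 10, with five activations of `sum_to` on the stack at the deepest point.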
The C language definition does not contemplate the question of procedures whose formal and actual argument lists disagree in number or type. Nevertheless, as a practical matter, the issue must be dealt with, especially for the printf function, which is routinely called with a variable number of arguments of varying types.
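The C language definition does not address procedures whose formal and actual argument lists differ in number or type. As a practical matter, though, these cases have to be handled, above all for printf.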
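As an illustration of a procedure called with a varying number of arguments, the sketch below uses the `<stdarg.h>` macros of present-day standard C. The function `sum_ints` is a hypothetical example, and the convention that the first argument carries the count is an assumption of this sketch; printf conveys the same information through its format string instead.

```c
#include <stdarg.h>

/* Hypothetical example: `count` tells the callee how many int
   arguments follow, since the callee cannot discover this itself. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);             /* start after the last named argument */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);    /* fetch the next argument as an int */
    va_end(ap);
    return total;
}
```

A call such as `sum_ints(3, 1, 2, 3)` returns 6. If the caller passes fewer values than `count` claims, the behavior is undefined, which is exactly the kind of disagreement between caller and callee discussed here.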
In general, only the calling procedure knows how many arguments are passed, while the called procedure does not. On the other hand, only the called procedure knows how many automatic variables and temporary locations it will need, while the calling procedure does not. This issue of handling variable-length argument lists can dominate the design of the calling sequence on some machines; often the hardware support for procedure calls assumes that the size of the argument list is known to the called procedure at compile time.
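Usually only the calling procedure knows how many arguments are being passed; the called procedure does not (kemin: Is that so? Don't both of them know?). Conversely, only the called procedure knows how many local variables it will use; the calling procedure does not. On some machines, handling variable-length argument lists dominates the design of the calling sequence. Hardware support for procedure calls often assumes that the called procedure knows the size of the argument list at compile time.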
This assumption is correct for many languages (e.g., Pascal), but in practice not for C. (Interestingly, the assumption is also correct in theory for Fortran, but again not always in practice. Many Fortran programs incorrectly call one entry point in a subroutine to initialize a large argument list, then another that assumes that the old arguments are still valid. Were it not for the ubiquitous printf, implementors of C would be as justified as many implementors of Fortran in adhering to published standard rather than custom.)
The following things happen during a C call and return:
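- The arguments are evaluated before the call and placed in an agreed location;
- the return address is pushed onto the stack;
- control passes to the called procedure;
- the calling procedure's context is saved before the called procedure runs (translator's note: which information is saved, and where, is not spelled out here);
- the called procedure obtains stack space for its local variables;
- the called procedure's bookkeeping (context) registers are initialized; by this point the called procedure must be able to find the argument values;
- the body of the called procedure executes;
- the return value, if any, is put in a safe place and the stack space is freed; the calling procedure's bookkeeping registers and other registers are restored; and the saved return address is popped and control returns to the point of the call.
Some machine architectures will make certain of these jobs trivial, and make others difficult. These tradeoffs will be the subject of the next several sections.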
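The steps of a call and return can be sketched as a toy simulation in C with an explicit stack array. Everything here is an assumption made for illustration: the downward-growing stack, the frame layout, the names `simulate_call` and `callee_add`, and the token 1234 standing in for a return address. It does not describe any real machine's calling sequence.

```c
#include <assert.h>

#define STACK_WORDS 64

/* Toy machine state (hypothetical, not a real ABI). */
static long stack[STACK_WORDS];
static int sp = STACK_WORDS;   /* stack grows downward; sp indexes the top word */
static int fp = STACK_WORDS;   /* frame pointer, the "bookkeeping register" */

static void push(long v) { assert(sp > 0); stack[--sp] = v; }
static long pop(void)    { assert(sp < STACK_WORDS); return stack[sp++]; }

/* Called procedure: adds its two arguments using one local. */
static long callee_add(void)
{
    long result;

    push(fp);                  /* save the caller's bookkeeping register */
    assert(sp > 0);
    sp -= 1;                   /* obtain stack space for one local */
    fp = sp + 1;               /* initialize the callee's frame pointer */
    /* Layout: [fp-1] local, [fp] saved fp, [fp+1] return address,
       [fp+2] first argument, [fp+3] second argument. */
    stack[fp - 1] = stack[fp + 2] + stack[fp + 3];   /* body */
    result = stack[fp - 1];    /* return value kept in a safe place */
    sp = fp;                   /* free the local */
    fp = (int)pop();           /* restore the caller's frame pointer */
    return result;
}

/* Calling procedure's side of the sequence. */
long simulate_call(long a, long b)
{
    long value, return_address;

    push(b);                   /* arguments go to the agreed place */
    push(a);
    push(1234);                /* stand-in for the return address */
    value = callee_add();      /* control passes to the called procedure */
    return_address = pop();    /* a real return instruction would do this */
    assert(return_address == 1234);
    sp += 2;                   /* caller discards the arguments */
    return value;
}
```

In this sketch `simulate_call(2, 3)` returns 5 and leaves `sp` and `fp` where they started, so calls can be repeated. Here the caller removes the arguments and the callee saves the frame pointer; other divisions of the work are equally possible.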
Parts of the job of calling and returning can be done either by the calling or the called procedure. This section discusses some of the tradeoffs.
As mentioned above, some of the status of the calling procedure must be saved at the call and restored on return. This status includes the return address, information that allows arguments and local variables to be addressed (e.g. the stack pointer), and the values of any register variables. This information can be saved either by the calling procedure or the called procedure. If the calling procedure saves the information, it knows which registers contain information it will need again, and need not save the other registers. On the other hand, the called procedure need not save the registers that it will not change.
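Should the extra work of calling and returning (that is, the prologs and epilogs) be done by the calling procedure or by the called procedure?
As noted above, some of the calling procedure's status must be saved at the call and restored on return. This status includes the return address, the information needed to address arguments and local variables (such as the stack pointer), and the values of certain registers (kemin: it still does not say which ones). It can be saved by either the calling or the called procedure. If the calling procedure saves it, it knows which register values it will need again after the return; on the other hand, the called procedure need not save registers that it does not change.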
Program space considerations play a role in the decision. If the status is saved by the called procedure, the save/restore code is generated once per procedure; if the saving is done by the calling procedure, the code is generated once per call of the procedure. Because nearly all procedures are called at least once, and most more than once, considerable code space can be gained by saving the status in the called procedure. If status saving takes several instructions, code space might be further reduced (at the expense of time) by sharing the status saving code sequence in a specially-called subroutine (this is done on the PDP-11). Similarly, branching to a common return sequence can further shrink code space. If the save or return code is indeed common, the calling procedure's status information must be in a known place in the called procedure's environment. Of course, the space gained by sharing save and return sequences must be balanced against the time required to branch to and from the common code.
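Program space plays a part in dividing the work. If the called procedure saves the status, the save and restore code appears once per procedure; if the calling procedure saves it, the code appears once per call. Since nearly every procedure is called at least once, and most more than once, one copy per procedure saves program space. If the save and restore code is long, moving it into a specially called subroutine saves still more space, at some cost in time. Likewise, branching to a common return sequence shrinks the code further. If the save or return code really is shared, the calling procedure's status must be kept in a place known to the called procedure. The space saved by sharing must of course be weighed against the time spent branching to and from the common code.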
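The space tradeoff is simple arithmetic, sketched below in C. The function names and the cost model are assumptions of this sketch: each save/restore sequence occupies `seq_bytes`, and the shared-subroutine variant pays `branch_bytes` per procedure to reach a single shared copy.

```c
/* Status saved by each called procedure: one sequence per procedure. */
long callee_saved_bytes(long procedures, long seq_bytes)
{
    return procedures * seq_bytes;
}

/* Status saved by the calling procedure: one sequence per call site. */
long caller_saved_bytes(long call_sites, long seq_bytes)
{
    return call_sites * seq_bytes;
}

/* One shared save/restore routine plus a branch from each procedure. */
long shared_routine_bytes(long procedures, long seq_bytes, long branch_bytes)
{
    return seq_bytes + procedures * branch_bytes;
}
```

With purely hypothetical figures of 100 procedures, 400 call sites, a 12-byte sequence, and a 4-byte branch, these give 1200, 4800, and 412 bytes respectively. The model ignores the extra time spent branching, which is the cost the text weighs against the space saved.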
The calling procedure must assume responsibility for passing the arguments, since it alone knows how many there are. However, these arguments must eventually be made part of the called procedure's environment.
Arguments can be passed in several ways. The calling procedure might establish a piece of memory (in effect, a structure) containing the arguments, and pass its address. This requires an additional bookkeeping register, and so increases the amount of status information to be saved, and slows down the call.
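The calling procedure is responsible for passing the arguments, and the place where they are stored eventually becomes part of the called procedure's environment.
Arguments can be passed in several ways. In one, the calling procedure sets up a block of memory containing the arguments and passes its address to the called procedure. This needs an extra register, which adds to the status information to be saved and slows down the call.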
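The argument-block approach can be mimicked at the source level in C, as a rough sketch. The structure `area_args` and the functions `area` and `call_area` are hypothetical names; a compiler using this scheme would build the block itself rather than rely on the programmer.

```c
/* Hypothetical argument block, built by the calling procedure. */
struct area_args {
    int width;
    int height;
};

/* The called procedure receives only the block's address and
   reaches every argument through it. */
int area(const struct area_args *args)
{
    return args->width * args->height;
}

/* Calling side: fill in the block in the caller's own storage,
   then pass its address. */
int call_area(int width, int height)
{
    struct area_args block = { width, height };

    return area(&block);
}
```

Here `call_area(3, 4)` returns 12. The pointer `args` plays the role of the additional bookkeeping register: it must be kept available for as long as the called procedure needs its arguments.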
Most often, the caller knows the location of the called procedure's stack frame. Then the calling procedure can store the arguments adjacent to (or part of) the new stack frame, where the called procedure can find them. In practice, this requires that the stack be allocated from a contiguous area of memory.
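The calling procedure usually knows where the called procedure's stack frame is. It can store the arguments next to that frame, where the called procedure can easily find them without the extra cost described above. This requires that the stack be allocated from a contiguous area of memory.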
Because most calls pass only a few arguments, it is tempting to pass the first arguments in registers and the rest in some other way. Although this approach might be effective, it requires an exceptionally regular register architecture, and has not yet proved practical.*
NOTE: * Note added 2003: it did prove practical shortly thereafter.
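Because most procedures pass only a few arguments, an attractive way to improve efficiency is to pass them in registers. This mechanism requires an exceptionally regular register architecture (?). It has since been shown to be practical.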