Chez Scheme Programming Language (4th Edition)
- Introduction
Scheme is a general-purpose computer programming language. It is a high-level language, supporting operations on structured data such as strings, lists, and vectors, as well as operations on more traditional data such as numbers and characters. While Scheme is often identified with symbolic applications, its rich set of data types and flexible control structures make it a truly versatile language. Scheme has been employed to write text editors, optimizing compilers, operating systems, graphics packages, expert systems, numerical applications, financial analysis packages, virtual reality systems, and practically every other type of application imaginable. Scheme is a fairly simple language to learn, since it is based on a handful of syntactic forms and semantic concepts and since the interactive nature of most implementations encourages experimentation. Scheme is a challenging language to understand fully, however; developing the ability to use its full potential requires careful study and practice.
Scheme programs are highly portable across versions of the same Scheme implementation on different machines, because machine dependencies are almost completely hidden from the programmer. They are also portable across different implementations because of the efforts of a group of Scheme language designers who have published a series of reports, the "Revised Reports" on Scheme. The most recent, the "Revised⁶ Report" [24], emphasizes portability through a set of standard libraries and a standard mechanism for defining new portable libraries and top-level programs.
Although some early Scheme systems were inefficient and slow, many newer compiler-based implementations are fast, with programs running on par with equivalent programs written in lower-level languages. The relative inefficiency that sometimes remains results from run-time checks that support generic arithmetic and help programmers detect and correct various common programming errors. These checks may be disabled in many implementations.
Scheme supports many types of data values, or objects, including characters, strings, symbols, lists or vectors of objects, and a full set of numeric data types, including complex, real, and arbitrary-precision rational numbers.
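A minimal sketch of the numeric types mentioned above, assuming an R6RS implementation; the variable names are illustrative only. Exact rationals stay exact, and integers grow as needed instead of overflowing.

```scheme
(import (rnrs))

;; Exact rationals are not rounded: 1/3 + 1/6 is exactly 1/2.
(define third-plus-sixth (+ 1/3 1/6))

;; Exact integers have arbitrary precision.
(define big (expt 2 100))

;; An exact value can be converted to an inexact (floating-point) one on request.
(define half-as-flonum (inexact third-plus-sixth))

(display (list third-plus-sixth big half-as-flonum))
(newline)
```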
The storage required to hold the contents of an object is dynamically allocated as necessary and retained until no longer needed, then automatically deallocated, typically by a garbage collector that periodically recovers the storage used by inaccessible objects. Simple atomic values, such as small integers, characters, booleans, and the empty list, are typically represented as immediate values and thus incur no allocation or deallocation overhead.
Regardless of representation, all objects are first-class data values; because they are retained indefinitely, they may be passed freely as arguments to procedures, returned as values from procedures, and combined to form new objects. This is in contrast with many other languages where composite data values such as arrays are either statically allocated and never deallocated, allocated on entry to a block of code and unconditionally deallocated on exit from the block, or explicitly allocated and deallocated by the programmer.
Scheme is a call-by-value language, but for at least mutable objects (objects that can be modified), the values are pointers to the actual storage. These pointers remain behind the scenes, however, and programmers need not be conscious of them except to understand that the storage for an object is not copied when an object is passed to or returned from a procedure.
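The following sketch illustrates the point about call-by-value and hidden pointers. The helper names `zero-first!` and `rebind` are hypothetical. Mutating a vector inside a procedure is visible to the caller, because the storage is not copied, whereas assigning to the parameter affects only the procedure's local variable.

```scheme
(import (rnrs))

;; Hypothetical helper: overwrite element 0 of the vector it is given.
(define (zero-first! v)
  (vector-set! v 0 0))

;; Hypothetical helper: assign a fresh vector to the parameter.
;; Only the local variable v changes; the caller's variable is unaffected.
(define (rebind v)
  (set! v (vector 9 9 9))
  v)

(define data (vector 1 2 3))

(zero-first! data)                 ; the caller sees the change: data is #(0 2 3)
(define rebound (rebind data))     ; rebound is #(9 9 9); data is still #(0 2 3)
```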
At the heart of the Scheme language is a small core of syntactic forms from which all other forms are built. These core forms, a set of extended syntactic forms derived from them, and a set of primitive procedures make up the full Scheme language. An interpreter or compiler for Scheme can be quite small and potentially fast and highly reliable. The extended syntactic forms and many primitive procedures can be defined in Scheme itself, simplifying the implementation and increasing reliability.
Scheme programs share a common printed representation with Scheme data structures. As a result, any Scheme program has a natural and obvious internal representation as a Scheme object. For example, variables and syntactic keywords correspond to symbols, while structured syntactic forms correspond to lists. This representation is the basis for the syntactic extension facilities provided by Scheme for the definition of new syntactic forms in terms of existing syntactic forms and procedures. It also facilitates the implementation of interpreters, compilers, and other program transformation tools for Scheme directly in Scheme, as well as program transformation tools for other languages in Scheme.
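As a small illustration of programs represented as data, the sketch below builds an expression as an ordinary list, takes it apart with list operations, and then evaluates it. It assumes the R6RS `(rnrs eval)` library is available; the variable names are illustrative.

```scheme
(import (rnrs) (rnrs eval))

;; A quoted expression is just a list whose first element is the symbol *.
(define expr '(* (+ 1 2) 4))

(define operator (car expr))        ; the symbol *
(define first-operand (cadr expr))  ; the list (+ 1 2)

;; eval treats the same list as a program.
(define result (eval expr (environment '(rnrs))))
```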
Scheme variables and keywords are lexically scoped, and Scheme programs are block-structured. Identifiers may be imported into a program or library or bound locally within a given block of code such as a library, program, or procedure body. A local binding is visible only lexically, i.e., within the program text that makes up the particular block of code. An occurrence of an identifier of the same name outside this block refers to a different binding; if no binding for the identifier exists outside the block, then the reference is invalid. Blocks may be nested, and a binding in one block may shadow a binding for an identifier of the same name in a surrounding block. The scope of a binding is the block in which the bound identifier is visible minus any portions of the block in which the identifier is shadowed. Block structure and lexical scoping help create programs that are modular, easy to read, easy to maintain, and reliable. Efficient code for lexical scoping is possible because a compiler can determine before program evaluation the scope of all bindings and the binding to which each identifier reference resolves. This does not mean, of course, that a compiler can determine the values of all variables, since the actual values are not computed in most cases until the program executes.
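A short sketch of nested blocks and shadowing, using the hypothetical names `x` and `show-scopes`. Each `let` introduces a block, and an inner binding of `x` shadows the outer ones only within its own block.

```scheme
(import (rnrs))

(define x 'outer)

(define (show-scopes)
  (let ([x 'middle])                      ; shadows the top-level x
    (let ([inner (let ([x 'inner]) x)]    ; innermost x is visible only here
          [middle x])                     ; refers to the middle binding
      (list inner middle))))

(define result (show-scopes))
;; Outside the let forms, x still refers to the top-level binding.
```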
In most languages, a procedure definition is simply the association of a name with a block of code. Certain variables local to the block are the parameters of the procedure. In some languages, a procedure definition may appear within another block or procedure so long as the procedure is invoked only during execution of the enclosing block. In others, procedures can be defined only at top level. In Scheme, a procedure definition may appear within another block or procedure, and the procedure may be invoked at any time thereafter, even if the enclosing block has completed its execution. To support lexical scoping, a procedure carries the lexical context (environment) along with its code.
Furthermore, Scheme procedures are not always named. Instead, procedures are first-class data objects like strings or numbers, and variables are bound to procedures in the same way they are bound to other objects.
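The sketch below shows a procedure that carries its lexical environment with it and is invoked after the enclosing block has finished. `make-counter` is a hypothetical name chosen for illustration. The final definition uses an unnamed procedure directly as an argument.

```scheme
(import (rnrs))

;; Each call to make-counter creates a new binding for count,
;; which the returned procedure retains and updates.
(define (make-counter)
  (let ([count 0])
    (lambda ()
      (set! count (+ count 1))
      count)))

(define c1 (make-counter))
(define c2 (make-counter))

(define a1 (c1))   ; first call on c1
(define a2 (c1))   ; second call on c1
(define b1 (c2))   ; c2 has its own independent count

;; Procedures need not be named: this lambda is passed straight to map.
(define squares (map (lambda (n) (* n n)) '(1 2 3)))
```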
As with procedures in most other languages, Scheme procedures may be recursive. That is, any procedure may invoke itself directly or indirectly. Many algorithms are most elegantly or efficiently specified recursively. A special case of recursion, called tail recursion, is used to express iteration, or looping. A tail call occurs when one procedure directly returns the result of invoking another procedure; tail recursion occurs when a procedure recursively tail-calls itself, directly or indirectly. Scheme implementations are required to implement tail calls as jumps (gotos), so the storage overhead normally associated with recursion is avoided. As a result, Scheme programmers need master only simple procedure calls and recursion and need not be burdened with the usual assortment of looping constructs.
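A minimal sketch of iteration expressed through tail recursion; `sum-to` is a hypothetical example procedure. Because the recursive call to `loop` is a tail call, it runs in constant stack space no matter how large `n` is.

```scheme
(import (rnrs))

;; Sum the integers from 1 to n with an accumulator.
(define (sum-to n)
  (let loop ([i 1] [acc 0])
    (if (> i n)
        acc
        (loop (+ i 1) (+ acc i)))))   ; tail call: nothing remains to do afterward

(define total (sum-to 1000000))
```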
Scheme supports the definition of arbitrary control structures with continuations. A continuation is a procedure that embodies the remainder of a program at a given point in the program. A continuation may be obtained at any time during the execution of a program. As with other procedures, a continuation is a first-class object and may be invoked at any time after its creation. Whenever it is invoked, the program immediately continues from the point where the continuation was obtained. Continuations allow the implementation of complex control mechanisms including explicit backtracking, multithreading, and coroutines.
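One simple use of a continuation is a nonlocal exit. In this sketch, `first-negative` is a hypothetical helper. Invoking `return` abandons the rest of the `for-each` traversal and continues from the point where `call/cc` was called.

```scheme
(import (rnrs))

;; Return the first negative number in ls, or #f if there is none.
(define (first-negative ls)
  (call/cc
    (lambda (return)
      (for-each (lambda (x)
                  (when (negative? x)
                    (return x)))     ; escape immediately with x
                ls)
      #f)))                          ; reached only if no element was negative
```

For example, `(first-negative '(3 1 -4 1 -5))` stops at `-4` without visiting the remaining elements.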
Scheme also allows programmers to define new syntactic forms, or syntactic extensions, by writing transformation procedures that determine how each new syntactic form maps to existing syntactic forms. These transformation procedures are themselves expressed in Scheme with the help of a convenient high-level pattern language that automates syntax checking, input deconstruction, and output reconstruction. By default, lexical scoping is maintained through the transformation process, but the programmer can exercise control over the scope of all identifiers appearing in the output of a transformer. Syntactic extensions are useful for defining new language constructs, for emulating language constructs found in other languages, for achieving the effects of in-line code expansion, and even for emulating entire languages in Scheme. Most large Scheme programs are built from a mix of syntactic extensions and procedure definitions.
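The sketch below defines a small syntactic extension with the high-level pattern language (`syntax-rules`). The form `swap!` and the procedure `demo` are hypothetical names. Because lexical scoping is maintained through the transformation, the `tmp` introduced by the expansion does not collide with the caller's own variable named `tmp`.

```scheme
(import (rnrs))

;; (swap! a b) exchanges the values of the variables a and b.
(define-syntax swap!
  (syntax-rules ()
    [(_ a b)
     (let ([tmp a])
       (set! a b)
       (set! b tmp))]))

;; The caller's tmp is distinct from the tmp introduced by the expansion.
(define (demo)
  (let ([tmp 1] [other 2])
    (swap! tmp other)
    (list tmp other)))
```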
Scheme evolved from the Lisp language and is considered to be a dialect of Lisp. Scheme inherited from Lisp the treatment of values as first-class objects, several important data types, including symbols and lists, and the representation of programs as objects, among other things. Lexical scoping and block structure are features taken from Algol 60 [21]. Scheme was the first Lisp dialect to adopt lexical scoping and block structure, first-class procedures, the treatment of tail calls as jumps, continuations, and lexically scoped syntactic extensions.
Common Lisp [27] and Scheme are both contemporary Lisp languages, and the development of each has been influenced by the other. Like Scheme but unlike earlier Lisp languages, Common Lisp adopted lexical scoping and first-class procedures, although Common Lisp's syntactic extension facility does not respect lexical scoping. Common Lisp's evaluation rules for procedures are different from the evaluation rules for other objects, however, and it maintains a separate namespace for procedure variables, thereby inhibiting the use of procedures as first-class objects. Also, Common Lisp does not support continuations or require proper treatment of tail calls, but it does support several less general control structures not found in Scheme. While the two languages are similar, Common Lisp includes more specialized constructs, while Scheme includes more general-purpose building blocks out of which such constructs (and others) may be built.
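Translator's note: Common Lisp uses different forms to define procedures (DEFUN) and ordinary variables (DEFVAR), so binding a procedure name naturally works differently from binding an ordinary variable.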
The remainder of this chapter describes Scheme's syntax and naming conventions and the typographical conventions used throughout this book.
1.1 Scheme Syntax
Scheme programs are made up of keywords, variables, structured forms, constant data (numbers, characters, strings, quoted vectors, quoted lists, quoted symbols, etc.), whitespace, and comments.
Keywords, variables, and symbols are collectively called identifiers. Identifiers may be formed from letters, digits, and certain special characters, including ?, !, ., +, -, *, /, <, =, >, :, $, %, ^, &, _, ~, and @.
There is no inherent limit on the length of a Scheme identifier; programmers may use as many characters as necessary. Long identifiers are no substitute for comments, however, and frequent use of long identifiers can make a program difficult to format and consequently difficult to read. A good rule is to use short identifiers when the scope of the identifier is small and longer identifiers when the scope is larger.
Identifiers may be written in any mix of upper- and lower-case letters, and case is significant, i.e., two identifiers are different even if they differ only in case. For example, abcde, Abcde, AbCdE, and ABCDE all refer to different identifiers. This is a change from previous versions of the Revised Report.
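A brief sketch of case sensitivity, assuming an R6RS implementation; the identifiers are arbitrary. Names that differ only in case denote separate variables and separate symbols.

```scheme
(import (rnrs))

;; Three distinct variables whose names differ only in case.
(define abcde 1)
(define Abcde 2)
(define ABCDE 3)

(define total (+ abcde Abcde ABCDE))

;; The corresponding symbols are distinct objects as well.
(define same-symbol? (eq? 'abcde 'ABCDE))
```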
Structured forms and list constants are enclosed within parentheses, e.g., (a b c) or (* (- x 2) y).
Strings are enclosed in double quotation marks, e.g., "I am a string".
Details of the syntax for each type of constant data are given in the individual sections of Chapter 6.
Scheme expressions may span several lines, and no explicit terminator is required. Since the number of whitespace characters (spaces and newlines) between expressions is not significant, Scheme programs should be indented to show the structure of the code in a way that makes the code as readable as possible. Comments may appear on any line of a Scheme program, between a semicolon ( ; ) and the end of the line. Comments explaining a particular Scheme expression are normally placed at the same indentation level as the expression, on the line before the expression. Comments explaining a procedure or group of procedures are normally placed before the procedures, without indentation. Multiple comment characters are often used to set off the latter kind of comment, e.g., ;;; The following procedures ....
Two other forms of comments are supported: block comments and datum comments. Block comments are delimited by #| and |# pairs, and may be nested. A datum comment consists of a #; prefix and the datum that follows it.
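The three comment styles can be seen together in the following sketch, which assumes an R6RS reader; `square` and `lst` are hypothetical names used only for illustration.

```scheme
(import (rnrs))

;;; Procedure-level comment: placed before the definition, not indented.
(define (square x)
  ;; Expression-level comment: same indentation as the expression it explains.
  (* x x))   ; a line comment runs from the semicolon to the end of the line

#| A block comment may span several lines
   #| and may contain nested block comments |#
   before it is closed. |#

;; The datum comment #; hides exactly one datum, here the list (not four).
(define lst '(three #;(not four) element list))
```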
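Some Scheme values, such as procedures and ports, do not have standard printed representations and can thus never appear as a constant in the printed syntax of a program. This book uses the notation #<description> when showing the output of operations that return such values, e.g., #<procedure> or #<port>.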
1.2 Scheme Naming Conventions
Scheme's naming conventions are designed to provide a high degree of regularity. The following is a list of these naming conventions:
Programmers should employ these same conventions in their own code whenever possible.
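The list of conventions itself does not appear above, so the sketch below is only an illustration based on the names of standard R6RS procedures, not a reconstruction of the list: predicates end in a question mark, procedures that modify their arguments end in an exclamation mark, conversions are written x->y, and operations on a data type usually carry the type name as a prefix. The variable names are hypothetical.

```scheme
(import (rnrs))

(define v (vector 3 1 2))

(define is-vector (vector? v))         ; predicate: name ends in ?
(define a-before-b (char<? #\a #\b))   ; comparison predicate with a char prefix

(vector-set! v 0 10)                   ; side effect on v: name ends in !

(define as-list (vector->list v))      ; conversion: written x->y
(define len (string-length "hello"))   ; type name used as a prefix
```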
1.3 Typographical and Notational Conventions
A standard procedure or syntactic form whose sole purpose is to perform some side effect is said to return unspecified. This means that an implementation is free to return any number of values, each of which can be any Scheme object, as the value of the procedure or syntactic form. Do not count on these values being the same across implementations, the same across versions of the same implementation, or even the same across two uses of the procedure or syntactic form. Some Scheme systems routinely use a special object to represent unspecified values. Printing of this object is often suppressed by interactive Scheme systems, so that the values of expressions returning unspecified values are not printed.
While most standard procedures return a single value, the language supports procedures that return zero, one, more than one, or even a variable number of values via the mechanisms described in Section 5.8. Some standard expressions can evaluate to multiple values if one of their subexpressions evaluates to multiple values, e.g., by calling a procedure that returns multiple values. When this situation can occur, an expression is said to return "the values" rather than simply "the value" of its subexpression. Similarly, a standard procedure that returns the values resulting from a call to a procedure argument is said to return the values returned by the procedure argument.
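A minimal sketch of a procedure that returns more than one value, assuming an R6RS implementation; `quotient-and-remainder` is a hypothetical name. The values are received either with `call-with-values` or with `let-values`.

```scheme
(import (rnrs))

;; Return two values: the integer quotient and the remainder.
(define (quotient-and-remainder n d)
  (values (div n d) (mod n d)))

;; call-with-values passes the returned values as arguments to list.
(define as-list
  (call-with-values
    (lambda () (quotient-and-remainder 17 5))
    list))

;; let-values binds each returned value to its own variable.
(define rebuilt
  (let-values ([(q r) (quotient-and-remainder 17 5)])
    (+ (* q 5) r)))
```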
This book uses the words "must" and "should" to describe program requirements, such as the requirement to provide an index that is less than the length of the vector in a call to vector-ref. If the word "must" is used, it means that the requirement is enforced by the implementation, i.e., an exception is raised, usually with condition type &assertion. If the word "should" is used, an exception may or may not be raised, and if not, the behavior of the program is undefined.
The phrase "syntax violation" is used to describe a situation in which a program is malformed. Syntax violations are detected prior to program execution. When a syntax violation is detected, an exception of type &syntax is raised and the program is not executed.
The typographical conventions used in this book are straightforward. All Scheme objects are printed in a typewriter typeface, just as they are to be typed at the keyboard. This includes syntactic keywords, variables, constant objects, Scheme expressions, and example programs. An italic typeface is used to set off syntax variables in the descriptions of syntactic forms and arguments in the descriptions of procedures. Italics are also used to set off technical terms the first time they appear. In general, names of syntactic forms and procedures are never capitalized, even at the beginning of a sentence. The same is true for syntax variables written in italics.
In the description of a syntactic form or procedure, one or more prototype patterns show the syntactic form or forms or the correct number or numbers of arguments for an application of the procedure. The keyword or procedure name is given in typewriter font, as are parentheses. The remaining pieces of the syntax or arguments are shown in italics, using a name that implies the type of expression or argument expected by the syntactic form or procedure. Ellipses are used to specify zero or more occurrences of a subexpression or argument. For example, (or expr ...) describes the or syntactic form, which has zero or more subexpressions, and (member obj list) describes the member procedure, which expects two arguments, an object and a list.
In most cases, the type of argument required is obvious, as with vector, obj, or binary-input-port. In others, primarily within the descriptions of numeric routines, abbreviations are used, such as int for integer, exint for exact integer, and fx for fixnum. These abbreviations are explained at the start of the sections containing the affected entries.