Object-Oriented Software Construction (2nd edition), Chapter 6: Abstract data types (Part 1)

This opened my mind, I started to grasp what it means to use the tool known as algebra. I’ll be damned if anyone had ever told me before: over and again Mr. Dupuy [the mathematics teacher] was making pompous sentences on the subject, but not once would he say this simple word: it is a division of labor, which like any division of labor produces miracles, and allows the mind to concentrate all of its forces on just one side of objects, on just one of their qualities.

What a difference it would have made for us if Mr. Dupuy had told us: This cheese is soft or it is hard; it is white, it is blue; it is old, it is young; it is yours, it is mine, it is light or it is heavy. Of so many qualities let us consider only the weight. Whatever that weight may be, let us call it A. Now, without thinking of the weight any more, let us apply to A everything that we know of quantities.

Such a simple thing; yet no one was saying it to us in that faraway province…

Stendhal, The Life of Henry Brulard, 1836.

 

For abstraction consists only in separating the perceptible qualities of bodies, either from other qualities, or from the bodies to which they apply. Errors arise when this separation is poorly done or wrongly applied: poorly done in philosophical questions, and wrongly applied in physical and mathematical questions. An almost sure way to err in philosophy is to fail to simplify enough the objects under study; and an infallible way to obtain defective results in physics and mathematics is to view the objects as less composite than they are.

Denis Diderot, A Letter on the Blind for the Benefit of Those Who Can See, 1749.


Letting objects play the lead role in our software architectures requires that we describe them adequately. This chapter shows how.

You are perhaps impatient to dive into the depths of object technology and explore the details of multiple inheritance, dynamic binding and other joys; then you may at first look at this chapter as an undue delay since it is mostly devoted to the study of some mathematical concepts (although all the mathematics involved is elementary).

But in the same way that even the most gifted musician will benefit from learning a little music theory, knowing about abstract data types will help you understand and enjoy the practice of object-oriented analysis, design and programming, however attractive the concepts might already appear without the help of the theory. Since abstract data types establish the theoretical basis for the entire method, the consequences of the ideas introduced in this chapter will be felt throughout the rest of this book.

There is more. As we will see at chapter end, these consequences actually extend beyond the study of software proper, yielding a few principles of intellectual investigation which one may perhaps apply to other disciplines.

6.1 CRITERIA

To obtain proper descriptions of objects, we need a method satisfying three conditions:

• The descriptions should be precise and unambiguous.

• They should be complete — or at least as complete as we want them in each case (we may decide to leave some details out).

• They should not be overspecifying.

The last point is what makes the answer non-trivial. It is after all easy to be precise, unambiguous and complete if we “spill the beans” by giving out all the details of the objects’ representation. But this is usually too much information for the authors of software elements that need to access the objects.

This observation is close to the comments that led to the notion of information hiding. The concern there was that by providing a module’s source code (or, more generally, implementation-related elements) as the primary source of information for the authors of software elements that rely on that module, we may drown them in a flood of details, prevent them from concentrating on their own job, and hamper prospects of smooth evolution. Here the danger is the same if we let modules use a certain data structure on the basis of information that pertains to the structure’s representation rather than to its essential properties.

6.2 IMPLEMENTATION VARIATIONS

To understand better why the need for abstract data descriptions is so crucial, let us explore further the potential consequences of using physical representation as the basis for describing objects.

A well-known and convenient example is the description of stack objects. A stack object serves to pile up and retrieve other objects in a last-in, first-out (“LIFO”) manner, the latest inserted element being the first one to be retrieved. The stack is a ubiquitous structure in computing science and in many software systems; the typical compiler or interpreter, for example, is peppered with stacks of many kinds.

Stacks, it must be said, are also ubiquitous in didactic presentations of abstract data types, so much so that Edsger Dijkstra is said to have once quipped that “abstract data types are a remarkable theory, whose purpose is to describe stacks”. Fair enough. But the notion of abstract data type applies to so many more advanced cases in the rest of this book that I do not feel ashamed of starting with this staple example. It is the simplest I know which includes about every important idea about abstract data types.

Stack representations

Several possible physical representations exist for stacks:

The figure illustrates three of the most common representations. Each has been given a name for ease of reference:

ARRAY_UP: represent a stack through an array representation and an integer count whose value ranges from 0 (for an empty stack) to capacity, the size of the array representation; stack elements are stored in the array at indices 1 up to count.

ARRAY_DOWN: like ARRAY_UP, but with elements stored from the end of the array rather than from the beginning. Here the integer is called free (it is the index of the highest free array position, or 0 if all positions are occupied) and ranges from capacity for an empty stack down to 0. The stack elements are stored in the array at indices capacity down to free + 1.

LINKED: a linked representation which stores each stack element in a cell with two fields: item representing the element, and previous containing a pointer to the cell containing the previously pushed element. The representation also needs last, a pointer to the cell representing the top.

Next to each representation, the figure shows a program extract (in Pascal-like notation) giving the corresponding implementation for a basic stack operation: pushing an element x onto the top.

For the array representations, ARRAY_UP and ARRAY_DOWN, the instructions increase or decrease the top indicator (count or free) and assign x to the corresponding array element. Since these representations support stacks of at most capacity elements, robust implementations should include guards of the respective forms

if count < capacity then …

if free > 0 then …

which the figure omits for simplicity.
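
Since the figure itself is not reproduced here, the two array-based push operations, with the guards included, might look roughly as follows. This is a minimal sketch in Python rather than the Pascal-like notation of the figure; the class names ArrayUpStack and ArrayDownStack and the choice of raising OverflowError are assumptions made for illustration. Index 0 of the underlying list is left unused so that indices match the 1-based description above.

```python
class ArrayUpStack:
    """ARRAY_UP: elements live at indices 1..count of a fixed-size array."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Slot 0 is unused so that indices run from 1 to capacity.
        self.representation = [None] * (capacity + 1)
        self.count = 0  # 0 means the stack is empty

    def push(self, x):
        if self.count < self.capacity:  # guard omitted by the figure
            self.count += 1
            self.representation[self.count] = x
        else:
            raise OverflowError("stack is full")


class ArrayDownStack:
    """ARRAY_DOWN: elements live at indices capacity down to free + 1."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.representation = [None] * (capacity + 1)
        self.free = capacity  # highest free index; 0 means the array is full

    def push(self, x):
        if self.free > 0:  # guard omitted by the figure
            self.representation[self.free] = x
            self.free -= 1
        else:
            raise OverflowError("stack is full")
```

Both classes expose their fields directly, which is exactly the kind of representation-level view the rest of this chapter argues against.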

For LINKED, the linked representation, pushing an element requires four operations: create a new cell n (done here with Pascal’s new procedure, which allocates space for a new object); assign x to the new cell’s item field; chain the new cell to the earlier stack top by assigning to its previous field the current value of last; and update last so that it will now be attached to the newly created cell.
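
The four steps just listed can be sketched in Python as follows. The names Cell and LinkedStack are hypothetical; item, previous and last follow the description above, and Python's object creation stands in for Pascal's new procedure.

```python
class Cell:
    """One cell of the linked representation."""

    def __init__(self):
        self.item = None      # the stack element held by this cell
        self.previous = None  # the cell holding the previously pushed element


class LinkedStack:
    def __init__(self):
        self.last = None  # cell representing the top; None for an empty stack

    def push(self, x):
        n = Cell()              # 1. create a new cell
        n.item = x              # 2. assign x to the new cell's item field
        n.previous = self.last  # 3. chain the new cell to the earlier top
        self.last = n           # 4. attach last to the newly created cell
```

No capacity guard is needed here, since the linked representation has no fixed upper bound.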

Although these are the most frequently used stack representations, many others exist. For example if you need two stacks of elements of the same type, and have only limited space available, you may rely on a single array with two integer top markers, count as in ARRAY_UP and free as in ARRAY_DOWN; one of the stacks will grow up and the other will grow down. The representation is full if and only if count = free.
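
As an illustration of this head-to-head arrangement, here is a small Python sketch. The class name TwoStacks and the method names push_up, push_down and is_full are assumptions; count and free play the roles described above, and slot 0 is again left unused to keep indices 1-based.

```python
class TwoStacks:
    """Two stacks sharing one array: one grows up, the other grows down."""

    def __init__(self, capacity):
        self.representation = [None] * (capacity + 1)  # slot 0 unused
        self.count = 0        # top of the upward stack (as in ARRAY_UP)
        self.free = capacity  # highest free index (as in ARRAY_DOWN)

    def is_full(self):
        # The shared array is full if and only if count = free.
        return self.count == self.free

    def push_up(self, x):
        if self.is_full():
            raise OverflowError("shared array is full")
        self.count += 1
        self.representation[self.count] = x

    def push_down(self, x):
        if self.is_full():
            raise OverflowError("shared array is full")
        self.representation[self.free] = x
        self.free -= 1
```

Either stack may use any share of the array; only the combined size is bounded.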

The advantage, of course, is to lessen the risk of running out of space: with two arrays of capacity n representing stacks under ARRAY_UP or ARRAY_DOWN, you exhaust the available space whenever either stack reaches n elements; with a single array of size 2n holding two head-to-head stacks, you run out when the combined size reaches 2n, a less likely occurrence if the two stacks grow independently. (For any variable values p and q, max (p + q) ≤ max (p) + max (q).)

Each of these and other possible representations is useful in some cases. Choosing one of them as “the” definition of stacks would be a typical case of overspecification. Why should we consider ARRAY_UP, for example, more representative than LINKED? The most visible properties of ARRAY_UP — the array, the integer count, the upper bound — are irrelevant to an understanding of the underlying structure.

The danger of overspecification

Why is it so bad to use a particular representation as specification?

The results of the Lientz and Swanson maintenance study, which you may recall, give a hint. More than 17% of software costs was found to come from the need to take into account changes of data formats. As was noted in the discussion, too many programs are closely tied to the physical structure of the data they manipulate. A method relying on the physical representation of data structures to guide analysis and design would not be likely to yield flexible software.

So if we are to use objects or object types as the basis of our system architectures, we should find a better description criterion than the physical representation.

How long is a middle initial?

Lest stacks make us forget that, beyond the examples favored by computer scientists, data structures are ultimately connected with real-life objects, here is an amusing example, taken from a posting on the Risks forum (comp.risks Usenet newsgroup) of the dangers of a view of data that is too closely dependent on concrete properties:

My dear mother blessed (or perhaps cursed) all of her children with two middle initials, in my case “D” and “E”. This has caused me a good deal of trouble.

It seems that TRW sells certain parts of your credit information, such as your name and a demographic profile. I recently got a new credit card from Gottchalks and found to my chagrin that my name had been truncated to “Darrell D. Long”. I went to the credit manager and was assured that things would be fixed. Well, two things happened: I got a new credit card, this time as “Darrell E. Long”, and TRW now has an annotation in my file to the effect “File variation: middle initial is E”. Soon after this I start getting mail for “Darrell E. Long” (along with the usual “Darrell Long” and “Darrell D. Long” and the occasional “Darrell D. E. Long”).

I called up the credit bureau and it seems that the programmer who coded up the TRW database decided that all good Americans are entitled to only one middle initial. As the woman on the phone patiently told me “They only allocated enough megabytes (sic) in the system for one middle initial, and it would probably be awfully hard to change”.

Aside from the typical example of technobabble justification (“megabytes”), the lesson here is the need to avoid tying software to the exact physical properties of data. TRW’s system seems similar to those programs, mentioned in an earlier discussion, which “knew” that postal codes consist of exactly five digits.
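
To make the lesson concrete, here is a hedged Python sketch of a name record that does not bake in "exactly one middle initial". Everything in it (the class PersonName, its queries middle_initials and full) is hypothetical and has nothing to do with the actual TRW system; the point is only that clients ask for the initials through a query and never depend on how many characters were set aside for them.

```python
class PersonName:
    """A name whose clients never see how the middle initials are stored."""

    def __init__(self, first, last, middle_initials=()):
        self._first = first
        self._last = last
        # Any number of initials, including none; clients do not rely on this.
        self._middle_initials = tuple(middle_initials)

    def middle_initials(self):
        """Query: all middle initials, in order."""
        return self._middle_initials

    def full(self):
        """Query: the name as it would appear on a card or an envelope."""
        middle = [initial + "." for initial in self._middle_initials]
        return " ".join([self._first] + middle + [self._last])
```

With this interface, a change from one stored initial to several would stay local to the class.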

The author of the message reproduced above was mainly concerned about junk mail, an unpleasant but not life-threatening event; the archives of the Risks forum are full of computer-originated name confusions with more serious consequences. The “millennium problem”, mentioned in the discussion of software maintenance, is another example of the dangers of accessing data based on physical representation, this one with hundreds of millions of dollars’ worth of consequences.

6.3 TOWARDS AN ABSTRACT VIEW OF OBJECTS

How do we retain completeness, precision and non-ambiguity without paying the price of overspecification?

Using the operations

In the stack example, what unites the various representations in spite of all their differences is that they describe a “container” structure (a structure used to contain other objects), where certain operations are applicable and enjoy certain properties. By focusing not on a particular choice of representation but on these operations and properties, we may be able to obtain an abstract yet useful characterization of the notion of stack.

The operations typically available on a stack are the following:

• A command to push an element on top of a stack. Let us call that operation put.

• A command to remove the stack’s top element, if the stack is not empty. Let us call it remove.

• A query to find out what the top element is, if the stack is not empty. Let us call it item.

• A query to determine whether the stack is empty. (This will enable clients to determine beforehand if they can use remove and item.)

In addition we may need a creator operation giving us a stack, initially empty. Let us call it make.
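
The operations listed above can be written down as an interface with no commitment to a representation. The following Python sketch uses the standard abc module; the class names Stack and ListStack are assumptions, the constructor plays the role of make, and raising IndexError on an empty stack is one possible way (chosen here for illustration) to handle the "if the stack is not empty" conditions.

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """The stack operations, with nothing said about representation."""

    @abstractmethod
    def put(self, x):
        """Command: push x on top."""

    @abstractmethod
    def remove(self):
        """Command: remove the top element (stack must not be empty)."""

    @abstractmethod
    def item(self):
        """Query: the top element (stack must not be empty)."""

    @abstractmethod
    def empty(self):
        """Query: is the stack empty?"""


class ListStack(Stack):
    """One possible implementation; its constructor acts as make."""

    def __init__(self):
        self._elements = []  # initially empty

    def put(self, x):
        self._elements.append(x)

    def remove(self):
        if self.empty():
            raise IndexError("remove applied to an empty stack")
        self._elements.pop()

    def item(self):
        if self.empty():
            raise IndexError("item applied to an empty stack")
        return self._elements[-1]

    def empty(self):
        return not self._elements
```

Note that remove returns nothing and item changes nothing, matching the split between commands and queries in the list above.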

Two points may have caught your attention and will deserve more explanation later in this chapter. First, the operation names may seem surprising; for the moment, just think of put as meaning push, remove as meaning pop, and item as meaning top. Details shortly (on the facing page, actually). Second, the operations have been divided into three categories: creators, which yield objects; queries, which return information about objects; and commands, which can modify objects. This classification will also require some more comments.

In a traditional view of data structures, we would consider that the notion of stack is given by some data declaration corresponding to one of the above representations, for example (representation ARRAY_UP, Pascal-like syntax):

count: INTEGER

representation: array [1 .. capacity] of STACK_ELEMENT_TYPE

where capacity, a constant integer, is the maximum number of elements on the stack. Then put, remove, item, empty and make would be routines (subprograms) that work on the object structures defined by these declarations.

The key step towards data abstraction is to reverse the viewpoint: forget for the moment about the representation; take the operations themselves as defining the data structure. In other words, a stack is any structure to which clients may apply the operations listed above.
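
A short Python sketch may help show what this reversal buys. The client function drain below (a hypothetical name) uses only item, remove and empty, so it works unchanged with an array-based and a linked implementation; the two classes ArrayStack and LinkedStack and the default capacity of 100 are illustrative assumptions.

```python
class ArrayStack:
    """Array-based implementation (0-based here, for brevity)."""

    def __init__(self, capacity=100):
        self._representation = [None] * capacity
        self._count = 0

    def put(self, x):
        if self._count == len(self._representation):
            raise OverflowError("stack is full")
        self._representation[self._count] = x
        self._count += 1

    def remove(self):
        if self.empty():
            raise IndexError("remove applied to an empty stack")
        self._count -= 1

    def item(self):
        if self.empty():
            raise IndexError("item applied to an empty stack")
        return self._representation[self._count - 1]

    def empty(self):
        return self._count == 0


class LinkedStack:
    """Linked implementation: each cell is an (item, previous) pair."""

    def __init__(self):
        self._last = None

    def put(self, x):
        self._last = (x, self._last)

    def remove(self):
        if self.empty():
            raise IndexError("remove applied to an empty stack")
        self._last = self._last[1]

    def item(self):
        if self.empty():
            raise IndexError("item applied to an empty stack")
        return self._last[0]

    def empty(self):
        return self._last is None


def drain(stack):
    """Client code: empties the stack, returning its elements top first."""
    result = []
    while not stack.empty():
        result.append(stack.item())
        stack.remove()
    return result
```

From the point of view of drain, both classes are simply stacks.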

A laissez-faire policy for the society of modules

The method just outlined for describing data structures shows a rather selfish approach to the world of data structures: like an economist of the most passionate supply-side, invisible-hand, let-the-free-market-decide school, we are interested in individual agents not so much for what they are internally as for what they have to offer to each other. The world of objects (and hence of software architecture) will be a world of interacting agents, communicating on the basis of precisely defined protocols.

The economic analogy will indeed accompany us throughout this presentation; the agents — the software modules — are called suppliers and clients; the protocols will be called contracts, and much of object-oriented design is indeed Design by Contract, the title of a later chapter.

As always with analogies, we should not get too carried away: this work is not a textbook on economics, and contains no hint of its author’s views in that field. It will suffice for the moment to note the remarkable analogies of the abstract data type approach to some theories of how human agents should work together. Later in this chapter we will again explore what abstract data types can tell us beyond their original area of application.

Name consistency

For the moment, let us get back to more immediate concerns, and make sure you are comfortable with the above example specification in all its details. If you have encountered stacks before, the operation names chosen for the discussion of stacks may have surprised or even shocked you. Any self-respecting computer scientist will know stack operations under other names: push (called put here), pop (called remove here) and top (called item here).

Why use anything else than the traditional terminology? The reason is a desire to take a high-level view of data structures — especially “containers”, those data structures used to keep objects.

Stacks are just one brand of container; more precisely, they belong to a category of containers which we may call dispensers. A dispenser provides its clients with a mechanism for storing (put), retrieving (item) and removing (remove) objects, but without giving them any control over the choice of object to be stored, retrieved or removed. For example, the LIFO policy of stacks implies that you may only retrieve or remove the element that was stored last. Another brand of dispenser is the queue, which has a first-in, first-out (FIFO) policy: you store at one end, retrieve and remove at the other; the element that you retrieve or remove is the oldest one stored but not yet removed. An example of a container which is not a dispenser is an array, where you choose, through integer indices, the positions where you store and retrieve objects.
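
The following Python sketch illustrates two dispensers under the same operation names. StackDispenser, QueueDispenser and retrieve_all are hypothetical names, and the queue relies on collections.deque from the standard library; emptiness checks are left out to keep the sketch short.

```python
from collections import deque


class StackDispenser:
    """LIFO policy: item and remove apply to the element stored last."""

    def __init__(self):
        self._elements = []

    def put(self, x):
        self._elements.append(x)

    def item(self):
        return self._elements[-1]

    def remove(self):
        self._elements.pop()

    def empty(self):
        return not self._elements


class QueueDispenser:
    """FIFO policy: item and remove apply to the oldest stored element."""

    def __init__(self):
        self._elements = deque()

    def put(self, x):
        self._elements.append(x)

    def item(self):
        return self._elements[0]

    def remove(self):
        self._elements.popleft()

    def empty(self):
        return not self._elements


def retrieve_all(dispenser):
    """The client has no say in which element comes out next."""
    result = []
    while not dispenser.empty():
        result.append(dispenser.item())
        dispenser.remove()
    return result
```

The client code is identical for both; only the dispenser's policy decides the order of retrieval.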

Because the similarities between various kinds of container (dispensers, arrays and others) are more important than the differences between their individual storage, retrieval and removal properties, this book constantly adheres to a standardized terminology which downplays the differences between data structure variants and instead emphasizes the commonality. So the basic operation to retrieve an element will always be called item, the basic operation to remove an element will always be called remove and so on.

These naming issues may appear superficial at first — “cosmetic”, as programmers sometimes say. But do not forget that one of our eventual aims is to provide the basis for powerful, professional libraries of reusable software components. Such libraries will contain tens of thousands of available operations. Without a systematic and clear nomenclature, both the developers and the users of these libraries would quickly be swamped in a flood of specific and incompatible names, providing a strong (and unjustifiable) obstacle to large-scale reuse.

Naming, then, is not cosmetic. Good reusable software is software that provides the right functionality and provides it under the right names.

The names used here for stack operations are part of a systematic set of naming conventions used throughout this book. A later chapter will introduce them in more detail.

How not to handle abstractions

In software engineering as in other scientific and technical disciplines, a seminal idea may seem obvious once you have been exposed to it, even though it may have taken a long time to emerge. The bad ideas and the complicated ones (they are often the same) often appear first; it takes time for the simple and the elegant to take over.

This observation is true of abstract data types. Although good software developers have always (as a result of education or mere instinct) made good use of abstraction, many of the systems in existence today were designed without much consideration of this goal.

I once did a little involuntary experiment which provided a good illustration of this state of affairs. While setting up the project part of a course which I was teaching, I decided to provide students with a sort of anonymous marketplace, where they could place mock “for sale” announcements of software modules, without saying who was the source of the advertisement. (The idea, which may or may not have been a good one, was to favor a selection process based only on a precise specification of the modules’ advertized facilities.) The mail facility of a famous operating system commonly favored by universities seemed to provide the right base mechanism (why write a new mail system just for a course project?); but naturally that mail facility shows the sender’s name when it delivers a message to its recipients. I had access to the source of the corresponding code — a huge C program — and decided, perhaps foolishly, to take that code, remove all references to the sender’s name in delivered messages, and recompile.

Aided by a teaching assistant, I thus embarked on a task which seemed obvious enough although not commonly taught in software engineering courses: systematic program deconstruction. Sure enough, we quickly found the first place where the program accessed the sender’s name, and we removed the corresponding code. This, we naively thought, would have done the job, so we recompiled and sent a test mail message; but the sender’s name was still there! Thus began a long and surreal process: time and again, believing we had finally found the last reference to the sender’s name, we would remove it, recompile, and mail a test message, only to find the name duly recorded once again in its habitual field. Like the Hydra in its famous fight, the mailer kept growing a new head every time we thought we had cut the last neck.

Finally, repeating for the modern era the earlier feat of Hercules, we slew the beast for good; by then we had removed more than twenty code extracts which all accessed, in some way or other, information about the message sender.

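[Greek mythology] The Hydra: a nine-headed monster of Greek mythology that grew two new heads for each one cut off; it was later slain by Hercules.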

 

Although the previous sections have only got us barely started on our road to abstract data types, it should be clear by now that any program written in accordance with even the most elementary concepts of data abstraction would treat MAIL_MESSAGE as a carefully defined abstract notion, supporting a query operation, perhaps called sender, which returns information about the message sender. Any portion of the mail program that needs this information would obtain it solely through the sender query. Had the mail program been designed according to this seemingly obvious principle, it would have been sufficient, for the purpose of my little exercise, to modify the code of the sender query. Most likely, the software would also then have provided an associated command operation set_sender to update sender information, making the job even easier.
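
A minimal Python sketch of this design follows. It is in no way the structure of the actual mail program; MailMessage, format_header and format_log_entry are invented for illustration, with sender and set_sender named as in the text. To keep the original class intact for comparison, the sketch overrides the query in a subclass, AnonymousMailMessage, where the text speaks of modifying the query directly.

```python
class MailMessage:
    """A mail message seen through its operations only."""

    def __init__(self, sender, body):
        self._sender = sender
        self._body = body

    def sender(self):
        """Query: information about the message sender."""
        return self._sender

    def set_sender(self, name):
        """Command: update the sender information."""
        self._sender = name

    def body(self):
        """Query: the text of the message."""
        return self._body


# Two hypothetical parts of the mail program that need the sender's name.
# Both obtain it solely through the sender query.
def format_header(message):
    return "From: " + message.sender()


def format_log_entry(message):
    return "delivered message from " + message.sender()


class AnonymousMailMessage(MailMessage):
    """The whole anonymization exercise: one change, in one place."""

    def sender(self):
        return "anonymous"
```

Every client of sender picks up the change at once; there are no twenty code extracts to hunt down.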

What is the real moral of that little story (besides lowering the reader’s guard in preparation for the surprise mathematical offensive of the next section)? After all, the mail program in question is successful, at least judging by its widespread use. But it typifies the current quality standard in the industry. Until we move significantly beyond that standard, the phrase “software engineering” will remain a case of wishful thinking.

Oh yes, one more note. Some time after my brief encounter with the mail program, I read that certain network hackers had intruded into the computer systems of highly guarded government laboratories, using a security hole of that very mail program — a hole which was familiar, so the press reported, to all those in the know. I was not in the know; but, when I learned the news, I was not surprised.
