The range design in D2

betty_betty2008  2009-04-08

Reposted from the group: http://dlang.group.iteye.com/group/topic/10615

Hello,

Walter, Bartosz and myself have been hard at work trying to find the
right abstraction for iteration. That abstraction would replace the
infamous opApply and would allow for external iteration, thus paving the
way to implementing real generic algorithms.

We considered an STL-style container/iterator design. Containers would
use the newfangled value semantics to enforce ownership of their
contents. Iterators would span containers in various ways.

The main problem with that approach was integrating built-in arrays into
the design. STL's iterators are generalized pointers; D's built-in
arrays are, however, not pointers, they are "pairs of pointers" that
cover contiguous ranges in memory. Most people who've used D gained the
intuition that slices are superior to pointers in many ways, such as
easier checking for validity, higher-level compact primitives,
streamlined and safe interface. However, if STL iterators are
generalized pointers, what is the corresponding generalization of D's
slices? Intuitively that generalization should also be superior to
iterators.
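To make the "pair of pointers" picture a little more concrete, here is a minimal sketch in present-day D. The helper name `middleView` and the sample values are illustrative assumptions, not something from the post.

```d
import std.stdio : writeln;

// Illustrative helper (not from the post): returns a view of the
// elements at indices 1, 2 and 3 without copying them.
int[] middleView(int[] data)
{
    return data[1 .. 4];
}

void main()
{
    int[] data = [10, 20, 30, 40, 50];
    int[] middle = middleView(data);

    // The slice is a pointer plus a length over the same memory.
    assert(middle.length == 3);
    assert(middle.ptr == data.ptr + 1);

    // Writing through the slice changes the original array.
    middle[0] = 99;
    assert(data[1] == 99);

    writeln(middle); // prints [99, 30, 40]
}
```

Because the bounds travel with the slice, validity is a property of one value rather than of two separately held iterators, which is the advantage the paragraph above alludes to.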

In a related development, the Boost C++ library has defined ranges as
pairs of two iterators and implemented a series of wrappers that accept
ranges and forward their iterators to STL functions. The main outcome of
Boost ranges has been to decrease the verboseness and perils of naked
iterator manipulation (see
http://www.boost.org/doc/libs/1_36_0/libs/range/doc/intro.html). So a
C++ application using Boost could avail itself of containers, ranges,
and iterators. The Boost notion of range is very close to a
generalization of D's slice.

We have considered that design too, but that raised a nagging question.
In most slice-based D programming, using bare pointers is not necessary.
Could there then be a way to use _only_ ranges and eliminate iterators
altogether? A container/range design would be much simpler than one also
exposing iterators.
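As a rough illustration of what a range-only design looks like in practice, the sketch below defines a tiny range and iterates it externally, with no opApply and no iterator pair. It uses the primitive names of present-day Phobos (`empty`, `front`, `popFront`); the design document linked in the post may have used different names. The `Counter` type and the `collect` helper are hypothetical.

```d
import std.algorithm.iteration : sum;
import std.range.primitives : isInputRange;

// Hypothetical example type: yields current, current + 1, ... up to
// (but not including) limit.
struct Counter
{
    int current;
    int limit;

    // The three primitives a foreach loop looks for on a range.
    @property bool empty() const { return current >= limit; }
    @property int front() const { return current; }
    void popFront() { ++current; }
}

static assert(isInputRange!Counter);

// Gathers every element of the range into an array.
int[] collect(Counter r)
{
    int[] result;
    foreach (e; r) // external iteration driven by the loop itself
        result ~= e;
    return result;
}

void main()
{
    assert(collect(Counter(0, 5)) == [0, 1, 2, 3, 4]);

    // A generic algorithm accepts the same type with no extra glue.
    assert(sum(Counter(1, 4)) == 6);
}
```

A single value describes the whole traversal, so there is no separate "end" object that could be mismatched with the "begin".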

All these questions aside, there are several other imperfections in the
STL, many caused by the underlying language. For example STL is
incapable of distinguishing between input/output iterators and forward
iterators. This is because C++ cannot reasonably implement a type with
destructive copy semantics, which is what would be needed to make said
distinction. We wanted the Phobos design to provide appropriate answers
to such questions, too. This would be useful particularly because it
would allow implementation of true and efficient I/O integrated with
iteration. STL has made an attempt at that, but istream_iterator and
ostream_iterator are, with all due respect, a joke that builds on
another joke, the iostreams.
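For readers wondering how a library can tell single-pass ranges from forward ranges, here is a small sketch based on present-day Phobos, which marks forward ranges with a `save` primitive. This is not necessarily the destructive-copy mechanism the post has in mind; `OnePass` and `countAll` are hypothetical names introduced only for illustration.

```d
import std.range.primitives : isInputRange, isForwardRange,
                              empty, popFront, save;

// Hypothetical single-pass source.
struct OnePass
{
    int[] pending;

    @property bool empty() const { return pending.length == 0; }
    @property int front() const { return pending[0]; }
    void popFront() { pending = pending[1 .. $]; }
    // No `save`: the type does not promise that a copy can be
    // iterated independently of the original.
}

static assert( isInputRange!OnePass);
static assert(!isForwardRange!OnePass);

// Built-in slices offer `save`, so they qualify as forward ranges.
static assert(isForwardRange!(int[]));

// Counts elements by walking an independent checkpoint; accepted
// only for ranges that can hand one out.
size_t countAll(R)(R r) if (isForwardRange!R)
{
    auto checkpoint = r.save;
    size_t n;
    for (; !checkpoint.empty; checkpoint.popFront())
        ++n;
    return n;
}

void main()
{
    int[] a = [1, 2, 3];
    assert(countAll(a) == 3);

    // The single-pass type is rejected at compile time.
    static assert(!__traits(compiles, countAll(OnePass([1, 2, 3]))));
}
```

The distinction lives in the type system, so an algorithm that needs to revisit elements cannot be handed a stream-like source by accident.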

After much thought and discussions among Walter, Bartosz and myself, I
defined a range design and reimplemented all of std.algorithm and much
of std.stdio in terms of ranges alone. This is quite a thorough test
because the algorithms are diverse and stress-test the expressiveness
and efficiency of the range design. Along the way I made the interesting
realization that certain union/difference operations are needed as
primitives for ranges. There are also a few bugs in the compiler and
some needed language enhancements (e.g. returning a reference from a
function); Walter is committed to implementing them.

I put together a short document for the range design. I definitely
missed about a million things and have been imprecise about another
million, so feedback would be highly appreciated. See:

http://ssli.ee.washington.edu/~aalexand/d/tmp/std_range.html

Andrei
----------------------End of post-----------------------
Source:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.announce&article_id=12922

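Supplementary material: one or two examples taken from another article by Andrei, to give a rough idea of how ranges are used and what they can do: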

import std.range;  // the module that provides chain; I have not yet found this module in the current D2
int[] a, b, c;
…
foreach (e; chain(a, b, c))
{
    … use e   // do something with e
}

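The example above needs only three ranges to traverse three containers in one pass; a function written with iterators would need six of them (three pairs). Note also that the code does not concatenate the three containers a, b and c. All chain does is walk through them one after another. Now look at the following code, which sorts across the three containers: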

sort(chain(a,b,c));
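The two fragments above leave the array contents elided, so here is a self-contained sketch that can be compiled with a present-day D compiler. The sample values and the helper name `sortAcross` are assumptions made for illustration; `chain` and `sort` are the Phobos functions from `std.range` and `std.algorithm.sorting`.

```d
import std.algorithm.sorting : sort;
import std.range : chain;

// Illustrative helper: sorts the elements of three arrays as if they
// were one sequence, without allocating a combined array.
void sortAcross(int[] a, int[] b, int[] c)
{
    sort(chain(a, b, c));
}

void main()
{
    int[] a = [5, 2, 8];
    int[] b = [3, 7, 9];
    int[] c = [1, 4, 6];

    // chain builds a lazy view; a, b and c stay three separate arrays.
    int total = 0;
    foreach (e; chain(a, b, c))
        total += e;
    assert(total == 45);

    // Sorting the view moves elements across the three arrays in place.
    sortAcross(a, b, c);
    assert(a == [1, 2, 3]);
    assert(b == [4, 5, 6]);
    assert(c == [7, 8, 9]);
}
```

After the call the smallest values end up in a and the largest in c, which shows that the chained view is writable as well as readable.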


std.range offers many more uses.
