[Translation] Looking Back at Postgres


Joseph M. Hellerstein (translated by 次郎)

ABSTRACT

This is a recollection of the UC Berkeley Postgres project, which was led by Mike Stonebraker from the mid-1980s to the mid-1990s. The article was solicited for Stonebraker’s Turing Award book [Bro19], as one of many personal/historical recollections. As a result it focuses on Stonebraker’s design ideas and leadership. But Stonebraker was never a coder, and he stayed out of the way of his development team. The Postgres codebase was the work of a team of brilliant students and the occasional university “staff programmers” who had little more experience (and only slightly more compensation) than the students. I was lucky to join that team as a student during the latter years of the project. I got helpful input on this writeup from some of the more senior students on the project, but any errors or omissions are mine. If you spot any such, please contact me and I will try to fix them.

OPENING

Postgres was Michael Stonebraker’s most ambitious project: his grand effort to build a one-size-fits-all database system. A decade long, it generated more papers, Ph.D.s, professors, and companies than anything else he did. It also covered more technical ground than any other single system he built. Despite the risk inherent in taking on that scope, Postgres also became the most successful software artifact to come out of Stonebraker’s research groups, and his main contribution to open source. It is an example of a “second system” [Bro75] that succeeded. As of the time of writing, over thirty years since the project started, the open-source PostgreSQL system is the most popular independent open-source database system in the world, and the fourth most popular database system in the world. Meanwhile, companies built from a Postgres base have generated a total of over $2.6 billion in acquisitions. By any measure, Stonebraker’s Postgres vision resulted in enormous and ongoing impact.

This paper is published under the Creative Commons Attribution 4.0 International (CC-BY 4.0) license. Authors reserve their rights to disseminate the work on their personal and corporate Web sites with the appropriate attribution.

Context

Stonebraker had enormous success in his early career with the Ingres research project at Berkeley [SHWK76], and the subsequent start-up he founded with Larry Rowe and Eugene Wong: Relational Technology, Inc. (RTI).

As RTI was developing in the early 1980s, Stonebraker began working on database support for data types beyond the traditional rows and columns of Codd’s original relational model. A motivating example current at the time was to provide database support for Computer-Aided Design (CAD) tools for the microelectronics industry. In a paper in 1983, Stonebraker and students Brad Rubenstein and Antonin Guttman explained how that industry needed support for “new data types such as polygons, rectangles, text strings, etc.,” “efficient spatial searching,” “complex integrity constraints,” and “design hierarchies and multiple representations” of the same physical constructions [SRG83]. Based on motivations such as these, the group started work on indexing (including Guttman’s influential R-trees for spatial indexing [Gut84]) and on adding Abstract Data Types (ADTs) to a relational database system. ADTs were a popular new programming language construct at the time, pioneered by subsequent Turing Award winner Barbara Liskov and explored in database application programming by Stonebraker’s new collaborator, Larry Rowe. In a paper in SIGMOD Record in 1983 [OFS83], Stonebraker and students James Ong and Dennis Fogg describe an exploration of this idea as an extension to Ingres called ADT-Ingres, which included many of the representational ideas that were explored more deeply, and with more system support, in Postgres.

POSTGRES: AN OVERVIEW

As indicated by the name, Postgres was “Post-Ingres”: a system designed to take what Ingres could do, and go beyond. The signature theme of Postgres was the introduction of what Stonebraker eventually called Object-Relational database features: support for object-oriented programming ideas within the data model and declarative query language of a database system. But Stonebraker also decided to pursue a number of other technical challenges in Postgres that were independent of object-oriented support, including active database rules, versioned data, tertiary storage, and parallelism.

Two papers were written on the design of Postgres: an early design in SIGMOD 1986 [SR86] and a “mid-flight” design description in CACM 1991 [SK91]. The Postgres research project ramped down in 1992 with the founding of Stonebraker’s Illustra startup, which involved Stonebraker, key Ph.D. student Wei Hong, and then-chief-programmer Jeff Meredith. In Figure 1, the features mentioned in the 1986 paper are marked with an asterisk*; those from the 1991 paper that were not in the 1986 paper are marked with a dagger†. Other goals listed below were tackled in the system and the research literature, but not in either design paper. Many of these topics were addressed in Postgres well before they were studied or reinvented by others; in many cases Postgres was too far ahead of its time and the ideas caught fire later, with a contemporary twist.

We briefly discuss each of these Postgres contributions, and connections to subsequent work in computing.

1. Supporting ADTs in a Database System
   • Complex Objects (i.e., nested or non-first-normal form data)*
   • User-Defined Abstract Data Types and Functions*
   • Extensible Access Methods for New Data Types*
   • Optimizer Handling of Queries with Expensive UDFs
2. Active Databases and Rules Systems (Triggers, Alerts)*
   • Rules implemented as query rewrites†
   • Rules implemented as record-level triggers†
3. Log-centric Storage and Recovery
   • Reduced-complexity recovery code by treating the log as data,* using non-volatile memory for commit status†
   • No-overwrite storage and time travel queries†
4. Support for querying data on new deep storage technologies, notably optical disks*
5. Support for multiprocessors or custom processors*
6. Support for a variety of language models
   • Minimal changes to the relational model and support for declarative queries*
   • Exposure of “fast path” access to internal APIs, bypassing the query language†
   • Multi-lingual support†

Figure 1: Postgres features first mentioned in the 1986 paper* and the 1991 paper†.

Supporting ADTs in a Database System

The signature goal of Postgres was to support new Object-Relational features: the extension of database technology to support a combination of the benefits of relational query processing and object-oriented programming. Over time the Object-Relational ideas pioneered in Postgres have become standard features in most modern database systems.

Complex objects. It is quite common for data to be represented in the form of nested bundles or “objects.” A classic example is a purchase order, which has a nested set of products, quantities, and prices in the order. Relational modeling religion dictated that such data should be restructured and stored in an unnested format, using multiple flat entity tables (orders, products) with flat relationship tables (product_in_order) connecting them. The classic reason for this flattening is that it reduces duplication of data (a product being described redundantly in many purchase orders), which in turn avoids complexity or errors in updating all redundant copies. But in some cases you want to store the nested representation, because it is natural for the application (say, a circuit layout engine in a CAD tool), and updates are rare. This data modeling debate is at least as old as the relational model.
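To make the flattening argument concrete, here is a minimal Python sketch of unnesting a purchase order into the three flat tables named above. The class names, field names, and sample data are assumptions made up for illustration; nothing here is Postgres code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LineItem:
    product: str
    description: str   # repeated in every order that mentions the product
    quantity: int
    unit_price: float


@dataclass
class PurchaseOrder:
    order_id: int
    customer: str
    items: List[LineItem]   # the nested "object" part of the order


def flatten(orders):
    """Unnest purchase orders into orders, products, and product_in_order rows."""
    order_rows = []
    product_rows = {}            # product -> description, stored exactly once
    product_in_order_rows = []
    for order in orders:
        order_rows.append((order.order_id, order.customer))
        for item in order.items:
            product_rows[item.product] = item.description
            product_in_order_rows.append(
                (order.order_id, item.product, item.quantity, item.unit_price)
            )
    return order_rows, sorted(product_rows.items()), product_in_order_rows


# Hypothetical sample data: "bolt" is described redundantly in both orders.
orders = [
    PurchaseOrder(1, "alice", [LineItem("bolt", "M6 steel bolt", 100, 0.10),
                               LineItem("nut", "M6 steel nut", 100, 0.05)]),
    PurchaseOrder(2, "bob", [LineItem("bolt", "M6 steel bolt", 20, 0.12)]),
]
order_rows, product_rows, product_in_order_rows = flatten(orders)
```

In the flat form the bolt description lives in one row, so correcting it is a single update; in the nested form it would have to be fixed inside every order that contains it.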

A key aspect of Postgres was to “have your cake and eat it too” from a data modeling perspective: Postgres retained tables as its “outermost” data type, but allowed columns to have “complex” types including nested tuples or tables. One of its more esoteric implementations, first explored in the ADT-Ingres prototype, was to allow a table-typed column to be specified declaratively as a query definition: “Quel as a data type” [SAHR84].

The “post-relational” theme of supporting both declarative queries and nested data has recurred over the years, often as an outcome of arguments about which is better. At the time of Postgres in the 1980s and 1990s, some of the object-oriented database groups picked up the idea and pursued it to a standard language called OQL, which has since fallen from use.

Around the turn of the millennium, declarative queries over nested objects became a research obsession for a segment of the database community in the guise of XML databases; the resulting XQuery language (headed by Don Chamberlin of SQL fame) owes a debt to the complex object support in Postgres’ Postquel language. XQuery had broad adoption and implementation in industry, but never caught on with users. The ideas are being revisited yet again today in query language designs for the JSON data model popular in browser-based applications. Like OQL, these languages are in many cases an afterthought in groups that originally rejected declarative queries in favor of developer-centric programming (the “NoSQL” movement), only to want to add queries back to the systems post-hoc. In the meantime, as Postgres has grown over the years (and shifted syntax from Postquel to versions of SQL that reflect many of these goals), it has incorporated support for nested data like XML and JSON into a general-purpose DBMS without requiring any significant rearchitecting. The battle swings back and forth, but the Postgres approach of extending the relational framework with extensions for nested data has shown time and again to be a natural end-state for all parties after the arguments subside.

User-defined abstract data types and functions. In addition to offering nested types, Postgres pioneered the idea of having opaque, extensible Abstract Data Types (ADTs), which are stored in the database but not interpreted by the core database system. In principle this was always part of Codd’s relational model: integers and strings were traditional, but really any atomic data types with predicates can be captured in the relational model. The challenge was to provide that mathematical flexibility in software. To enable queries that interpret and manipulate these objects, an application programmer needs to be able to register User-Defined Functions (UDFs) for these types with the system, and be able to invoke those UDFs in queries. User-Defined Aggregate (UDA) functions are also desirable to summarize collections of these objects in queries. Postgres was the pioneering database system supporting these features in a comprehensive way.
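The registration idea can be tried out in a few lines. The sketch below uses Python's standard `sqlite3` module as a stand-in (it is SQLite, not Postgres, and it is not the Postgres registration interface): a rectangle is stored as an opaque text value that the engine does not interpret, while a registered UDF and UDA give it meaning inside queries. The `x1,y1,x2,y2` encoding, the `shapes` table, and the function names are assumptions for illustration.

```python
import sqlite3


def parse(rect):
    """Decode the opaque 'x1,y1,x2,y2' encoding (an assumed format)."""
    return [float(v) for v in rect.split(",")]


def area(rect):
    """Scalar UDF: area of one rectangle value."""
    x1, y1, x2, y2 = parse(rect)
    return abs(x2 - x1) * abs(y2 - y1)


class BoundingBox:
    """User-defined aggregate: smallest rectangle enclosing all inputs."""

    def __init__(self):
        self.box = None

    def step(self, rect):
        x1, y1, x2, y2 = parse(rect)
        lo_x, hi_x = min(x1, x2), max(x1, x2)
        lo_y, hi_y = min(y1, y2), max(y1, y2)
        if self.box is None:
            self.box = [lo_x, lo_y, hi_x, hi_y]
        else:
            self.box = [min(self.box[0], lo_x), min(self.box[1], lo_y),
                        max(self.box[2], hi_x), max(self.box[3], hi_y)]

    def finalize(self):
        if self.box is None:
            return None
        return ",".join(f"{v:g}" for v in self.box)


conn = sqlite3.connect(":memory:")
conn.create_function("area", 1, area)                  # register the UDF
conn.create_aggregate("bounding_box", 1, BoundingBox)  # register the UDA

conn.execute("CREATE TABLE shapes (id INTEGER PRIMARY KEY, rect TEXT)")
conn.executemany("INSERT INTO shapes (rect) VALUES (?)",
                 [("0,0,2,3",), ("1,1,5,2",), ("4,0,6,6",)])

# The engine only sees text; the registered code runs inside the query.
big_shapes = conn.execute(
    "SELECT id FROM shapes WHERE area(rect) > 5 ORDER BY id").fetchall()
overall_box = conn.execute(
    "SELECT bounding_box(rect) FROM shapes").fetchone()[0]
```

The query syntax does not change at all once the functions are registered, which is the point the paragraph above makes about flexibility in software.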

Why put this functionality into the DBMS, rather than the applications above? The classic answer was the significant performance benefit of “pushing code to data,” rather than “pulling data to code.” Postgres showed that this is quite natural within a relational framework: it involved modest changes to a relational metadata catalog, and mechanisms to invoke foreign code, but the query syntax, semantics, and system architecture all worked out simply and elegantly.

Postgres was a bit ahead of its time in exploring this feature. In particular, the security implications of uploading unsafe code to a server were not an active concern in the database research community at the time. This became problematic when the technology started to get noticed in industry. Stonebraker commercialized Postgres in his Illustra start-up, which was acquired by Informix in large part for its ability to support extensible “DataBlades” (extension packages) including UDFs. Informix’s Postgres-based technology, combined with their strong parallel database offering, made Informix a significant threat to Oracle. Oracle invested heavily in negative marketing about the risks of Informix’s ability to run “unprotected” user-defined C code. Some trace the demise of Informix to this campaign, although Informix’s financial shenanigans (and subsequent federal indictment of its then-CEO) were certainly more problematic. Now, decades later, all the major database vendors support the execution of user-defined functions in one or more languages, using newer technologies to protect against server crashes or data corruption.

Meanwhile, the Big Data stacks of the 2000s (including the MapReduce phenomenon that gave Stonebraker and DeWitt such heartburn [DS08]) are a re-realization of the Postgres idea of user-defined code hosted in a query framework. MapReduce looks very much like a combination of software engineering ideas from Postgres combined with parallelism ideas from systems like Gamma and Teradata, with some minor innovation around mid-query restart for extreme-scalability workloads. Postgres-based start-ups Greenplum and Aster showed around 2007 that parallelizing Postgres could result in something much higher-function and practical than MapReduce for most customers, but the market still wasn’t ready for any of this technology in 2008. By now, in 2018, nearly every Big Data stack primarily serves a workload of parallel SQL with UDFs, very much like the design Stonebraker and team pioneered in Postgres.

Extensible access methods for new data types. Relational databases evolved around the same time as B-trees in the early 1970s, and B-trees helped fuel Codd’s dream of “physical data independence”: B-tree indexes provide a level of indirection that adaptively reorganizes physical storage without requiring applications to change. The main limitation of B-trees and related structures was that they only support equality lookups and 1-dimensional range queries. What if you have 2-dimensional range queries of the kind typical in mapping and CAD applications? This problem was au courant at the time of Postgres, and the R-tree [Gut84] developed by Antonin Guttman in Stonebraker’s group was one of the most successful new indexes developed to solve this problem in practice. Still, the invention of an index structure does not solve the end-to-end systems problem of DBMS support for multi-dimensional range queries. Many questions arise. Can you add an access method like R-trees to your DBMS easily? Can you teach your optimizer that said access method will be useful for certain queries? Can you get concurrency and recovery correct?

This was a very ambitious aspect of the Postgres agenda: a software architecture problem affecting most of a database engine, from the optimizer to the storage layer and the logging and recovery system. R-trees became a powerful driver and the main example of the elegant extensibility of Postgres’ access method layer and its integration into the query optimizer. Postgres demonstrated, in an opaque ADT style, how to register an abstractly described access method (the R-tree, in this case), and how a query optimizer could recognize an abstract selection predicate (a range selection in this case) and match it to that abstractly described access method. Questions of concurrency control were less of a focus in the original effort: the lack of a unidimensional ordering on keys made B-tree-style locking inapplicable.¹

¹ The Postgres challenge of extensible access methods inspired one of my first research projects at the end of graduate school: the Generalized Search Trees (GiST) [HNP95] and subsequent notion of Indexability theory [HKM+02]. I implemented GiST in Postgres during a postdoc semester, which made it even easier to add new indexing logic in Postgres. Marcel Kornacker’s thesis at Berkeley solved the difficult concurrency and recovery problems raised by extensible indexing in GiST in a templated way [KMH97].
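The registration-and-matching idea described above can be sketched as a small catalog. This is a hedged illustration only: the `Catalog` class, the operator names, and the `parcels` table are hypothetical, and the real Postgres catalogs carry far more information (operator classes, cost functions, support routines).

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class AccessMethod:
    name: str
    operators: FrozenSet[str]   # predicates this method can evaluate


@dataclass(frozen=True)
class Index:
    table: str
    column: str
    method: AccessMethod


@dataclass(frozen=True)
class Predicate:
    table: str
    column: str
    operator: str


class Catalog:
    """Toy metadata catalog: access methods are described, not hard-coded."""

    def __init__(self):
        self.methods = {}
        self.indexes = []

    def register_access_method(self, name, operators):
        self.methods[name] = AccessMethod(name, frozenset(operators))

    def create_index(self, table, column, method_name):
        self.indexes.append(Index(table, column, self.methods[method_name]))

    def usable_indexes(self, pred):
        """What an optimizer would ask: which indexes can serve this predicate?"""
        return [ix for ix in self.indexes
                if ix.table == pred.table
                and ix.column == pred.column
                and pred.operator in ix.method.operators]


catalog = Catalog()
catalog.register_access_method("btree", {"=", "<", "<=", ">", ">="})
catalog.register_access_method("rtree", {"overlaps", "contains"})
catalog.create_index("parcels", "id", "btree")
catalog.create_index("parcels", "boundary", "rtree")

spatial = catalog.usable_indexes(Predicate("parcels", "boundary", "overlaps"))
no_match = catalog.usable_indexes(Predicate("parcels", "boundary", "<"))
```

The optimizer never needs to know what an R-tree is; it only checks whether the predicate's operator appears in the description the access method registered.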

PostgreSQL today leverages both the original software architecture of extensible access methods (it has B-tree, GiST, SP-GiST, and Gin indexes) and the extensibility and high concurrency of the Generalized Search Tree (GiST) interface as well. GiST indexes power the popular PostgreSQL-based PostGIS geographic information system; Gin indexes power PostgreSQL’s internal text indexing support.

Optimizer handling of queries with expensive UDFs. In traditional query optimization, the challenge was generally to minimize the amount of tuple-flow (and hence I/O) you generate in processing a query. This meant that operators that filter tuples (selections) are good to do early in the query plan, while operators that can generate new tuples (joins) should be done later. As a result, query optimizers would “push” selections below joins and order them arbitrarily, focusing instead on cleverly optimizing joins and disk accesses. UDFs changed this: if you have expensive UDFs in your selections, the order of executing UDFs can be critical to optimizing performance. Moreover, if a UDF in a selection is really time-consuming, it’s possible that it should happen after joins (i.e., selection “pullup”). Doing this optimally complicated the optimizer space.

I took on this problem as my first challenge in graduate school and it ended up being the subject of both my M.S. with Stonebraker at Berkeley and my Ph.D. at Wisconsin under Jeff Naughton, with ongoing input from Stonebraker. Postgres was the first DBMS to capture the costs and selectivities of UDFs in the database catalog. We approached the optimization problem by coming up with an optimal ordering of selections, and then an optimal interleaving of the selections along the branches of each join tree considered during plan search. This allowed for an optimizer that maintained the textbook dynamic programming architecture of System R, with a small additional sorting cost to get the expensive selections ordered properly.³
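The first half of that approach, ordering the selections, can be illustrated with a short sketch. The text above does not spell out the ordering criterion; the version below assumes the rank metric from the published predicate-ordering literature, rank = (selectivity - 1) / cost, applied in ascending order. The predicate names, costs, and selectivities are invented, and costs are assumed to be strictly positive.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Selection:
    name: str
    cost: float          # cost of evaluating the predicate on one tuple (> 0)
    selectivity: float   # fraction of tuples that pass, between 0 and 1


def expected_cost(order):
    """Expected per-tuple cost of applying the selections in the given order."""
    total, surviving = 0.0, 1.0
    for sel in order:
        total += surviving * sel.cost    # only surviving tuples pay this cost
        surviving *= sel.selectivity
    return total


def rank(sel):
    """Assumed rank metric: very selective, cheap predicates sort first."""
    return (sel.selectivity - 1.0) / sel.cost


def order_selections(selections):
    return sorted(selections, key=rank)


# Hypothetical predicates with made-up catalog statistics.
selections = [
    Selection("image_match", cost=100.0, selectivity=0.1),
    Selection("in_region", cost=10.0, selectivity=0.9),
    Selection("cheap_filter", cost=1.0, selectivity=0.5),
]
best = order_selections(selections)
best_cost = expected_cost(best)

# Sanity check against brute force over all orderings.
brute_force_cost = min(expected_cost(p) for p in permutations(selections))
```

With these numbers the expensive `image_match` predicate runs last even though it is the most selective, because its cost per tuple dominates. Deciding whether such a predicate should move above a join is the harder interleaving step, which this sketch does not attempt.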

The expensive function optimization feature was disabled in the PostgreSQL source trees early on, in large part because there weren’t compelling use cases at that time for expensive user-defined functions.⁵ The examples we used revolved around image processing, which is finally becoming a mainstream data processing task in 2018. Of course, today in the era of Big Data and machine learning workloads, expensive functions have become quite common, and I expect this problem to return to the fore. Once again, Postgres was well ahead of its time.

Active Databases and Rule Systems

The Postgres project began at the tail end of the AI community’s interest in rule-based programming as a way to represent knowledge in “expert systems.” That line of thinking was not successful; many say it led to the much discussed “AI winter” that persisted through the 1990s.

However, rule programming persisted in the database community in two forms. The first was theoretical work around declarative logic programming using Datalog. This was a bugbear of Stonebraker’s; he really seemed to hate the topic and famously criticized it in multiple “community” reports over the years.⁷ The second database rules agenda was pragmatic work on what was eventually dubbed Active Databases and Database Triggers, which evolved to be a standard feature of relational databases. Stonebraker characteristically voted with his feet to work on the more pragmatic variant.

³ When I started grad school, this was one of three topics that Stonebraker wrote on the board in his office as options for me to think about for a Ph.D. topic. I think the second was function indexing, but I cannot remember the third.

⁵ Ironically, my code from grad school was fully deleted from the PostgreSQL source tree by a young open-source hacker named Neil Conway, who some years later started a Ph.D. with me at UC Berkeley and is now one of Stonebraker’s Ph.D. grandchildren.

⁷ Datalog survived as a mathematical foundation for declarative languages, and has found application over time in multiple areas of computing including software-defined networks and compilers. Datalog is declarative querying “on steroids” as a fully expressive programming model. I was eventually drawn into it as a natural design choice, and have pursued it in a variety of applied settings outside of traditional database systems.

Stonebraker’s work on database rules began with Eric Hanson’s Ph.D., which initially targeted Ingres but quickly transitioned to the new Postgres project. It expanded to the Ph.D. work of Spyros Potamianos on PRS2: Postgres Rules System 2. A theme in both implementations was the potential to implement rules in two different ways. One option was to treat rules as query rewrites, reminiscent of the work on rewriting views that Stonebraker pioneered in Ingres. In this scenario, a rule logic of “on condition then action” is recast as “on query then rewrite to a modified query and execute it instead.” For example, a query like “append a new row to Mike’s list of awards” might be rewritten as “raise Mike’s salary by 10%.” The other option was to implement a more physical “on condition then action,” checking conditions at a row level by using locks inside the database. When such locks were encountered, the result was not to wait (as in traditional concurrency control), but to execute the associated action.⁹
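The two implementation styles described above can be contrasted in a toy Python sketch. Everything here is hypothetical (the `Query` shape, the class names, the marker mechanism); it only mirrors the idea of rewriting a query before execution versus firing an action when a marked row is touched, and it is not how PRS2 was coded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    action: str   # e.g. "append" or "update"
    table: str
    detail: str


class RewriteRuleSystem:
    """Option 1: 'on query then rewrite to a modified query'."""

    def __init__(self):
        self.rules = []   # list of (condition, rewrite) function pairs

    def add_rule(self, condition, rewrite):
        self.rules.append((condition, rewrite))

    def rewrite(self, query):
        for condition, rewrite in self.rules:
            if condition(query):
                return rewrite(query)   # execute this query instead
        return query


class Table:
    """Option 2: row-level markers that fire an action when encountered."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.rule_markers = {}   # row key -> action (stands in for a rule lock)

    def mark_row(self, key, action):
        self.rule_markers[key] = action

    def update(self, key, value):
        self.rows[key] = value
        action = self.rule_markers.get(key)
        if action is not None:
            action(self, key)    # do not wait on the marker; run the action


def is_award_for_mike(query):
    return (query.action, query.table, query.detail) == ("append", "awards", "Mike")


def raise_salary(query):
    return Query("update", "salaries", "raise Mike by 10%")


rewriter = RewriteRuleSystem()
rewriter.add_rule(is_award_for_mike, raise_salary)
rewritten = rewriter.rewrite(Query("append", "awards", "Mike"))
untouched = rewriter.rewrite(Query("append", "awards", "Joe"))

fired = []
award_counts = Table({"Mike": 3, "Joe": 1})
award_counts.mark_row("Mike", lambda table, key: fired.append((key, table.rows[key])))
award_counts.update("Mike", 4)
award_counts.update("Joe", 2)
```

The rewrite style works on whole statements before any data is touched, while the marker style pays a check on every row access, which hints at why the two map naturally onto per-statement and per-row triggers.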

In the end, neither the query rewriting scheme nor the row-level locking scheme was declared a “winner” for implementing rules in Postgres; both were kept in the released system. Eventually all of the rules code was scrapped and rewritten in PostgreSQL, but the current source still retains both the notions of per-statement and per-row triggers.

The Postgres rules systems were very influential in their day, and went “head-to-head” with research from IBM’s Starburst project and MCC’s HiPac project. Today, “triggers” are part of the SQL standard and implemented in many of the major database engines. They are used somewhat sparingly, however. One problem is that this body of work never overcame the issues that led to AI winter: the interactions within a pile of rules can become untenably confusing as the rule set grows even modestly. In addition, triggers still tend to be relatively time-consuming in practice, so database installations that have to run fast tend to avoid the use of triggers. But there has been a cottage industry in related areas like materialized view maintenance, Complex Event Processing and stream queries, all of which are in some way extensions of ideas explored in the Postgres rules systems.

⁹ The code for row-level rules in PRS2 was notoriously tricky. A bit of searching in the Berkeley Postgres archives unearthed the following source code comment (probably from Spyros Potamianos) in Postgres version 3.1, circa 1991:

    DESCRIPTION: Take a deeeeeeep breath & read. If you can avoid hacking the code below (i.e. if you have not been "volunteered" by the boss to do this dirty job) avoid it at all costs. Try to do something less dangerous for your (mental) health. Go home and watch horror movies on TV. Read some Lovecraft. Join the Army. Go and spend a few nights in people's park. Commit suicide ... Hm, you keep reading, eh? Oh, well, then you deserve what you get. Welcome to the gloomy labyrinth of the tuple level rule system, my poor hacker...

Log-centric Storage and Recovery

Stonebraker described his design for the Postgres storage system this way:

When considering the POSTGRES storage system, we were guided by a missionary zeal to do something different. All current commercial systems use a storage manager with a write-ahead log (WAL), and we felt that this technology was well understood. Moreover, the original Ingres prototype from the 1970s used a similar storage manager, and we had no desire to do another implementation. [SK91]

While this is cast as pure intellectual restlessness, there were technical motivations for the work as well. Over the years, Stonebraker repeatedly expressed distaste for the complex write-ahead logging schemes pioneered at IBM and Tandem for database recovery. One of his core objections was based on a software engineering intuition that nobody should rely upon something that complicated, especially for functionality that would only be exercised in rare, critical scenarios after a crash.

The Postgres storage engine unified the notion of primary storage and historical logging into a single, simple disk-based representation. At base, the idea was to keep each record in the database in a linked list of versions stamped with transaction IDs; in some sense, this is “the log as data” or “the data as a log,” depending on your point of view. The only additional metadata required is a list of committed transaction IDs and wall-clock times. This approach simplifies recovery enormously, since there’s no “translating” from a log representation back to a primary representation. It also enables “time-travel” queries: you can run queries “as of” some wall-clock time, and access the versions of the data that were committed at that time. The original design of the Postgres storage system (which reads very much as if Stonebraker wrote it in one creative session of brainstorming) contemplated a number of efficiency problems and optimizations to this basic scheme, along with some wet-finger analyses of how performance might play out [Sto87]. The resulting implementation in Postgres was somewhat simpler.

Stonebraker’s idea of “radical simplicity” for transactional storage was deeply counter-cultural at the time, when the database vendors were differentiating themselves by investing heavily in the machinery of high-performance transaction processing. Benchmark winners at the time achieved high performance and recoverability via highly optimized, complex write-ahead logging systems. Once they had write-ahead logs working well, the vendors also began to innovate on follow-on ideas such as transactional replication based on log shipping, which would be difficult in the Postgres scheme. In the end, the Postgres storage system never excelled on performance; versioning and time-travel were removed from PostgreSQL over time and replaced by write-ahead logging.11 But the time-travel functionality was interesting and remained unique. Moreover, Stonebraker’s ethos regarding simple software engineering for recovery has echoes today both in the context of NoSQL systems (which choose replication rather than write-ahead logging) and main-memory databases (which often use multi-versioning and compressed commit logs). The idea of versioned relational databases and time-travel queries is still relegated to esoterica today, popping up in occasional research prototypes and minor open-source projects. It is an idea that is ripe for a comeback in our era of cheap storage and continuously streaming data.

11 Unfortunately, PostgreSQL still isn’t particularly fast for transaction processing: its embrace of write-ahead logging is somewhat half-hearted. Oddly, the PostgreSQL team kept much of the storage overhead of Postgres tuples to provide multiversion concurrency control, something that was never a goal of the Berkeley Postgres project. The result is a storage system that can emulate Oracle’s snapshot isolation with a fair bit of extra I/O overhead, but one that does not support Stonebraker’s original idea of time travel or simple recovery. The slowness of the PostgreSQL storage engine is not intrinsic to the system, however. The Greenplum fork of PostgreSQL integrated an interesting alternative high-performance compressed storage engine. It was designed by Matt McCline, a veteran of Jim Gray’s team at Tandem. It also did not support time travel.

Queries over New Deep Storage Technologies

In the middle of the Postgres project, Stonebraker signed on as a co-PI on a large grant for digital earth science called Project Sequoia. Part of the grant proposal was to handle unprecedented volumes of digital satellite imagery requiring up to 100 terabytes of storage, far more data than could be reasonably stored on magnetic disks at the time. The center of the proposed solution was to explore the idea of a DBMS (namely Postgres) facilitating access to near-line “tertiary” storage provided by robotic “jukeboxes” for managing libraries of optical disks or tapes.

A couple of different research efforts came out of this. One was the Inversion file system: an effort to provide a UNIX filesystem abstraction above an RDBMS. In an overview paper for Sequoia, Stonebraker described this in his usual cavalier style as “a straightforward exercise” [Sto95]. In practice, this kept Stonebraker student (and subsequent Cloudera founder) Mike Olson busy for a couple of years, and the final result was not exactly straightforward [Ols93] nor did it survive in practice.13

The other main research thrust on this front was the incorporation of tertiary storage into a more typical relational database stack, which was the subject of Sunita Sarawagi’s Ph.D. thesis. The main theme was to change the scale at which you think about managing space (i.e., data in storage and the memory hierarchy) and time (coordinating query and cache scheduling to minimize undesirable I/Os). One of the key issues in that work was to store and retrieve large multidimensional arrays in tertiary storage. Echoing work in multidimensional indexing, the basic ideas included breaking up the array into chunks and storing together the chunks that are fetched together, including replicating chunks to enable multiple physical “neighbors” for a given chunk of data. A second issue was to think about how disk becomes a cache for tertiary storage. Finally, query optimization and scheduling had to take into account the long load times of tertiary storage and the importance of “hits” in the disk cache; this affects both the plan chosen by a query optimizer, and the time at which that plan is scheduled for execution.

Tape and optical disk robots are not widely used at present. But the issues of tertiary storage are very prevalent in the cloud, which has deep storage hierarchies in 2018: from attached solid-state disks to reliable disk-like storage services (e.g., AWS EBS) to archival storage (e.g., AWS S3) to deep storage (e.g., AWS Glacier). It is still the case today that these storage tiers are relatively detached, and there is little database support for reasoning about storage across these tiers. I would not be surprised if the issues explored on this front in Postgres are revisited in the near term.
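The array-chunking idea from the tertiary storage work above can be illustrated with a small sketch. This is not Sarawagi's algorithm, only a toy under stated assumptions: a two-dimensional array, fixed-size rectangular chunks, and a cost model that simply counts how many chunks a window query must fetch. The function names chunk_of and chunks_for_box are hypothetical.

```python
from itertools import product

def chunk_of(index, chunk_shape):
    """Return the id of the chunk holding the array cell at `index`."""
    return tuple(i // c for i, c in zip(index, chunk_shape))

def chunks_for_box(lo, hi, chunk_shape):
    """Return ids of all chunks overlapping the inclusive box [lo, hi]."""
    first = chunk_of(lo, chunk_shape)
    last = chunk_of(hi, chunk_shape)
    ranges = (range(f, l + 1) for f, l in zip(first, last))
    return list(product(*ranges))

# Hypothetical 1000 x 1000 image; the query reads a 50 x 50 window.
lo, hi = (120, 130), (169, 179)

# Layout A: one chunk per image row (row-major strips).
row_strips = chunks_for_box(lo, hi, chunk_shape=(1, 1000))

# Layout B: square 100 x 100 tiles, so cells that are fetched together
# tend to live in the same chunk.
square_tiles = chunks_for_box(lo, hi, chunk_shape=(100, 100))

print(len(row_strips), len(square_tiles))
```

Under these assumptions the window touches 50 row strips but only one square tile. When each chunk fetch may involve a robotic media load, that difference dominates query time, which is why chunk shape and placement mattered so much.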

13 Some years after Inversion, Bill Gates tilted against this same windmill with WinFS, an effort to rebuild the most widely-used filesystem in the world over a relational database backend. WinFS was delivered in developer releases of Windows but never made it to market. Gates later called this his greatest disappointment at Microsoft.

Support for Multiprocessors: XPRS

Stonebraker never architected a large parallel database system, but he led many of the motivating discussions in the field. His “Case for Shared Nothing” paper [Sto86] documented the coarse-grained architectural choices in the area; it popularized the terminology used by the industry, and threw support behind shared-nothing architectures like those of Gamma and Teradata, which were rediscovered by the Big Data crowd in the 2000s.

Ironically, Stonebraker’s most substantive contribution to the area of parallel databases was a “shared-memory” architecture called XPRS, which stood for eXtended Postgres on RAID and Sprite. XPRS was the “Justice League” of Berkeley systems in the early 1990s: a brief combination of Stonebraker’s Postgres system, John Ousterhout’s Sprite distributed OS, and Dave Patterson’s and Randy Katz’s RAID storage architectures. Like many multi-faculty efforts, the execution of XPRS was actually determined by the grad students who worked on it. The primary contributor ended up being Wei Hong, who wrote his Ph.D. thesis on parallel query optimization in XPRS. Hence the main contribution of XPRS to the literature and industry was parallel query optimization, with no real consideration of issues involving RAID or Sprite.15

In principle, parallelism “blows up” the plan space for a query optimizer by making it multiply the traditional choices made during query optimization (data access, join algorithms, join orders) against all possible ways of parallelizing each choice. The basic idea of what Stonebraker called “The Wei Hong Optimizer” was to cut the problem in two: run a traditional single-node query optimizer in the style of System R, and then “parallelize” the resulting single-node query plan by scheduling the degree of parallelism and placement of each operator based on data layouts and system configuration. This approach is heuristic, but it makes parallelism an additive cost to traditional query optimization, rather than a multiplicative cost.

15 Of the three projects, Postgres and RAID both had enormous impact. Sprite is best remembered for Mendel Rosenblum’s Ph.D. thesis on Log Structured File Systems (LFS), which had nothing of note to do with distributed operating systems. All three projects involved new ideas for disk storage beyond mutating single copies in place. LFS and the Postgres storage manager are rather similar, both rethinking logs as primary storage, and requiring expensive background reorganization. I once gently probed Stonebraker about rivalries or academic scoops between LFS and Postgres, but I never got any good stories out of him. Maybe it was something in the water in Berkeley at the time.

Although “The Wei Hong Optimizer” was designed in the context of Postgres, it became the standard approach for many of the parallel query optimizers in industry.
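The additive-versus-multiplicative point can be checked with a small counting sketch. It is a toy under loud assumptions, not the XPRS optimizer: plans are left-deep, the serial search is plain exhaustive enumeration rather than System R's dynamic programming, and "parallelizing" means choosing one degree of parallelism per join independently. The table names, algorithm list, and degree list are all hypothetical.

```python
from itertools import permutations, product

TABLES = ["R", "S", "T", "U"]
JOIN_ALGOS = ["nested_loop", "hash", "merge"]
DEGREES = [1, 2, 4, 8]  # candidate degrees of parallelism for each join

def serial_plans(tables, algos):
    """Yield left-deep single-node plans: a join order plus one algorithm per join."""
    for order in permutations(tables):
        for choice in product(algos, repeat=len(tables) - 1):
            yield order, choice

def count_joint_search(tables, algos, degrees):
    """One-phase search: every serial plan crossed with every parallelization."""
    n_joins = len(tables) - 1
    n_serial = sum(1 for _ in serial_plans(tables, algos))
    return n_serial * len(degrees) ** n_joins

def count_two_phase_search(tables, algos, degrees):
    """Two-phase search: pick one serial plan, then pick a degree for each join."""
    n_joins = len(tables) - 1
    n_serial = sum(1 for _ in serial_plans(tables, algos))
    return n_serial + n_joins * len(degrees)

joint = count_joint_search(TABLES, JOIN_ALGOS, DEGREES)
two_phase = count_two_phase_search(TABLES, JOIN_ALGOS, DEGREES)
print(joint, two_phase)
```

With four tables the joint search considers 41472 candidates while the two-phase search considers 660. The price is that the two-phase search can miss a plan that is poor serially but excellent in parallel, which is the heuristic trade-off described above.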

Support for a Variety of Language Models

One of Stonebraker’s recurring interests since the days of Ingres was the programmer API to a database system. In his Readings in Database Systems series, he frequently included work like Carlo Zaniolo’s GEM language as important topics for database system aficionados to understand. This interest in language undoubtedly led him to partner up with Larry Rowe on Postgres, which in turn deeply influenced the design of the Postgres data model and its Object-Relational approach. Their work focused largely on data-centric applications they saw in the commercial realm, including both business processing and emerging applications like CAD/CAM and GIS.

One issue that was forced upon Stonebraker at the time was the idea of “hiding” the boundary between programming language constructs and database storage. Various competing research projects and companies exploring Object-Oriented Databases (OODBs) were targeting the so-called “impedance mismatch” between imperative object-oriented programming languages like Smalltalk, C++, and Java, and the declarative relational model. The OODB idea was to make programming language objects be optionally marked “persistent,” and handled automatically by an embedded DBMS. Postgres supported storing nested objects and ADTs, but its relational-style declarative query interface meant that each roundtrip to the database was unnatural for the programmer (requiring a shift to declarative queries) and expensive to execute (requiring query parsing and optimization). To compete with the OODB vendors, Postgres exposed a so-called “Fast Path” interface: basically a C/C++ API to the storage internals of the database. This enabled Postgres to be moderately performant in academic OODB benchmarks, but never really addressed the challenge of allowing programmers in multiple languages to avoid the impedance mismatch problem. Instead, Stonebraker branded the Postgres model as “Object-Relational,” and simply sidestepped the OODB workloads as a “zero-billion dollar” market. Today, essentially all commercial relational database systems are “Object-Relational” database systems.

This proved to be a sensible decision. Today, none of the OODB products exist in their envisioned form, and the idea of “persistent objects” in programming languages has largely been discarded. By contrast, there is widespread usage of object-relational mapping layers (fueled by early efforts like Java Hibernate and Ruby on Rails) that allow declarative databases to be tucked under nearly any imperative object-oriented programming language as a library, in a relatively seamless way. This application-level approach is different from both OODBs and Stonebraker’s definition of Object-Relational DBs. In addition, lightweight persistent key-value stores have succeeded as well, in both non-transactional and transactional forms. These were pioneered by Stonebraker’s Ph.D. student Margo Seltzer, who wrote BerkeleyDB as part of her Ph.D. thesis at the same time as the Postgres group, which presaged the rise of distributed “NoSQL” key-value stores like Dynamo, MongoDB, and Cassandra.
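As a rough illustration of how an object-relational mapping layer tucks a declarative database under an object-oriented language, here is a minimal sketch. It is not how Hibernate or Rails work internally. The Mapper class, its save and where methods, and the Employee type are hypothetical names, and SQLite (through Python's sqlite3 module) stands in for the database so the example runs without a server.

```python
import sqlite3
from dataclasses import dataclass, fields, astuple

@dataclass
class Employee:
    name: str
    dept: str
    salary: int

class Mapper:
    """Toy mapping layer: one table per dataclass, one column per field."""

    def __init__(self, conn, cls):
        self.conn, self.cls = conn, cls
        self.table = cls.__name__.lower()
        self.cols = [f.name for f in fields(cls)]
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {self.table} ({', '.join(self.cols)})"
        )

    def save(self, obj):
        # The object becomes a row; the caller never writes SQL.
        marks = ", ".join("?" for _ in self.cols)
        self.conn.execute(
            f"INSERT INTO {self.table} VALUES ({marks})", astuple(obj)
        )

    def where(self, **conds):
        # Keyword arguments are turned into a declarative query behind the scenes.
        clause = " AND ".join(f"{k} = ?" for k in conds)
        sql = f"SELECT {', '.join(self.cols)} FROM {self.table} WHERE {clause}"
        rows = self.conn.execute(sql, tuple(conds.values()))
        return [self.cls(*row) for row in rows]

conn = sqlite3.connect(":memory:")
employees = Mapper(conn, Employee)
employees.save(Employee("ada", "db", 120))
employees.save(Employee("bob", "os", 100))
found = employees.where(dept="db")
print(found)
```

The program works only with Employee objects, while the library generates the SQL. The database stays relational and declarative underneath, in contrast to the OODB approach of making language objects themselves persistent.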

SOFTWARE IMPACT

Open Source

Postgres was always an open source project with steady releases, but in its first many years it was targeted at usage in research, not in production.

As the Postgres research project was winding down, two students in Stonebraker’s group, Andrew Yu and Jolly Chen, modified the system’s parser to accept an extensible variant of SQL rather than the original Postquel language. The first Postgres release supporting SQL was Postgres95; the next was dubbed PostgreSQL.

A set of open-source developers became interested in PostgreSQL and “adopted” it even as the rest of the Berkeley team was moving on to other interests. Over time the core developers for PostgreSQL have remained fairly stable, and the open-source project has matured enormously. Early efforts focused on code stability and user-facing features, but over time the open source community made significant modifications and improvements to the core of the system as well, from the optimizer to the access methods and the core transaction and storage system. Since the mid-1990s, very few of the PostgreSQL internals came out of the academic group at Berkeley. The last contribution may have been my GiST implementation in the latter half of the 1990s, but even that was rewritten and cleaned up substantially by open-source volunteers (from Russia, in that case). The open source community around PostgreSQL deserves enormous credit for running a disciplined process that has soldiered on over decades to produce a remarkably high-impact and long-running project.

While many things have changed in 25 years, the basic architecture of PostgreSQL remains quite similar to the university releases of Postgres in the early 1990s, and developers familiar with the current PostgreSQL source code would have little trouble wandering through the Postgres 3.1 source code (c. 1991). Everything from source code directory structures to process structures to data structures remains remarkably similar. The code from the Berkeley Postgres team had excellent bones.

PostgreSQL today is without question the most high-function open-source DBMS, supporting features that are often missing from commercial products. It is also (according to one influential rankings site) the most popular widely used independent open source database in the world,17 and its impact continues to grow: in both 2017 and 2018 it was the fastest-growing database system in the world in popularity [DE19c]. PostgreSQL is used across a wide variety of industries and applications, which is perhaps not surprising given its ambition of broad functionality.

17 According to DB-Engines, PostgreSQL today is the fourth most popular DBMS in the world, after Oracle, MySQL and MS SQL Server, all of which are corporate offerings (MySQL was acquired by Oracle many years ago) [DE19a]. See the DB-Engines ranking methodology for a discussion of the rules for this ranking [DE19b].

Heroku is a cloud SaaS provider that is now part of Salesforce. Postgres was adopted by Heroku in 2010 as the default database for its platform. Heroku chose Postgres because of its operational reliability. With Heroku’s support, more major application frameworks such as Ruby on Rails and Django for Python began to recommend Postgres as their default database.

PostgreSQL today supports an extension framework that makes it easy to add additional functionality to the system via UDFs and related modifications. There is now an ecosystem of PostgreSQL extensions, akin to the Illustra vision of DataBlades, but in open source. Some of the more interesting extensions include the Apache MADlib library for machine learning in SQL, and the Citus library for parallel query execution.

One of the most interesting open-source applications built over Postgres is the PostGIS Geographic Information System, which takes advantage of many of the features in Postgres that originally inspired Stonebraker to start the project.

Commercial Adaptations

PostgreSQL has long been an attractive starting point for building commercial database systems, given its permissive open source license, its robust codebase, its flexibility, and breadth of functionality. Summing the acquisition prices listed below, Postgres has led to over $2.6 billion in acquisitions.19 Many of the commercial efforts that built on PostgreSQL have addressed what is probably its key limitation: the ability to scale out to a parallel, shared-nothing architecture.20

19 Note that this is a measure in real transaction dollars, and is much more substantial than the values often thrown around in high tech. Numbers in the billions are often used to describe estimated value of stock holdings, but are often inflated by 10x or more against contemporary value in hopes of future value. The transaction dollars of an acquisition measure the actual market value of the company at the time of acquisition. It is fair to say that Postgres has generated more than $2.6 billion of real commercial value.

20 Parallelizing PostgreSQL requires a fair bit of work, but is eminently doable by a small, experienced team. Today, industry-managed open-source forks of PostgreSQL such as Greenplum and CitusDB offer this functionality. It is a shame that PostgreSQL wasn’t parallelized in a true open source way much earlier. If PostgreSQL had been extended with shared-nothing features in open source in the early 2000s, it is quite possible that the open source Big Data movement would have evolved quite differently and more effectively.

Illustra was Stonebraker’s second major start-up company, founded in 1992, seeking to commercialize Postgres as RTI had commercialized Ingres.21 The founding team included some of the core Postgres team including recent Ph.D. alumnus Wei Hong and then-chief programmer Jeff Meredith, along with Ingres alumni Paula Hawthorn and Michael Ubell. Postgres M.S. student Mike Olson joined shortly after the founding, and I worked on the Illustra handling of optimizing expensive functions as part of my Ph.D. work. There were three main efforts in Illustra: to extend SQL92 to support user-defined types and functions as in Postquel, to make the Postgres code base robust enough for commercial use, and to foster the market for extensible database servers via examples of “DataBlades,” domain-specific plug-in components of data types and functions. Illustra was acquired by Informix in 1997 for an estimated $400M [Mon96] and its DataBlade architecture was integrated into a more mature Informix query processing codebase as Informix Universal Server.

21 Illustra was actually the third name proposed for the company. Following the painterly theme established by Ingres, Illustra was originally called Miró. For trademark reasons the name was changed to Montage, but that also ran into trademark problems.

Netezza was a startup founded in 1999, which forked the PostgreSQL codebase to build a high-performance parallel query processing engine on custom FPGA-based hardware. Netezza was quite successful as an independent company, and had its IPO in 2007. It was eventually acquired by IBM, with a value of $1.7B [IBM10].

Greenplum was the first effort to offer a shared-nothing parallel, scale-out version of PostgreSQL. Founded in 2003, Greenplum forked from the public PostgreSQL distribution, but maintained the APIs of PostgreSQL to a large degree, including the APIs for user-defined functions. In addition to parallelization, Greenplum extended PostgreSQL with an alternative high-performance compressed columnar storage engine, and a parallelized rule-driven query optimizer called Orca. Greenplum was acquired by EMC in 2010 for an estimated $300M [Mal10]; in 2012, EMC consolidated Greenplum into its subsidiary, Pivotal. In 2015, Pivotal chose to release Greenplum and Orca back into open source. One of the efforts at Greenplum that leveraged its Postgres API was the MADlib library for machine learning in SQL [HRS+12]. MADlib lives on today as an Apache project. Another interesting open-source project based on Greenplum is Apache HAWQ, a Pivotal design that runs the “top half” of Greenplum (i.e., the parallelized PostgreSQL query processor and extensibility APIs) in a decoupled fashion over Big Data stores such as the Hadoop File System.

EnterpriseDB was founded in 2004 as an open-source-based business, selling PostgreSQL in both a vanilla and enhanced edition with related services for enterprise customers. A key feature of the enhanced EnterpriseDB Advanced Server is a set of database compatibility features with Oracle, to allow application migration off of Oracle.

Aster Data was founded in 2005 by two Stanford students, to build a parallel engine for analytics. Its core single-node engine was based on PostgreSQL. Aster focused on queries for graphs and on analytics packages based on UDFs that could be programmed with either SQL or MapReduce interfaces. Aster Data was acquired by Teradata in 2011 for $263M [Sho11]. While Teradata never integrated Aster into its core parallel database engine, it still maintains Aster as a standalone product for use cases outside the core of Teradata’s warehousing market.

ParAccel was founded in 2006, selling a shared-nothing parallel version of PostgreSQL with column-oriented, shared-nothing storage. ParAccel enhanced the Postgres optimizer with new heuristics for queries with many joins. In 2011, Amazon invested in ParAccel, and in 2012 announced AWS Redshift, a hosted data warehouse as a service in the public cloud based on ParAccel technology. In 2013, ParAccel was acquired by Actian (who also had acquired Ingres) for an undisclosed amount, meaning it was not a material expense for Actian. Meanwhile, AWS Redshift has been an enormous success for Amazon: for many years it was the fastest-growing service on AWS, and many believe it is poised to put long-time data warehousing products like Teradata and Oracle Exadata out of business. In this sense, Postgres may achieve its ultimate dominance in the cloud.

CitusDB was founded in 2010 to offer a shared-nothing parallel implementation of PostgreSQL. While it started as a fork of PostgreSQL, as of 2016 CitusDB is implemented via public PostgreSQL extension APIs and can be installed into a vanilla PostgreSQL installation. Also as of 2016, the CitusDB extensions are available in open source.

LESSONS

You can draw a host of lessons from the success of Postgres, a number of them defiant of conventional wisdom.

The highest-order lesson I draw comes from the fact that Postgres defied Fred Brooks’ “Second System Effect” [Bro75]. Brooks argued that designers often follow up on a successful first system with a second system that fails due to being overburdened with features and ideas. Postgres was Stonebraker’s second system, and it was certainly chock full of features and ideas. Yet the system succeeded in prototyping many of the ideas, while delivering a software infrastructure that carried a number of the ideas to a successful conclusion. This was not an accident: at base, Postgres was designed for extensibility, and that design was sound. With extensibility as an architectural core, it is possible to be creative and stop worrying so much about discipline: you can try many extensions and let the strong succeed. Done well, the “second system” is not doomed; it benefits from the confidence, pet projects, and ambitions developed during the first system. This is an early architectural lesson from the more “server-oriented” database school of software engineering, which defies conventional wisdom from the “component-oriented” operating systems school of software engineering.

Another lesson is that a broad focus (“one size fits many”) can be a winning approach for both research and practice. To coin some names, “MIT Stonebraker” made a lot of noise in the database world in the early 2000s that “one size doesn’t fit all.” Under this banner he launched a flotilla of influential projects and startups, but none took on the scope of Postgres. It seems that “Berkeley Stonebraker” defies the later wisdom of “MIT Stonebraker,” and I have no issue with that.22 Of course there’s wisdom in the “one size doesn’t fit all” motto (it’s always possible to find modest markets for custom designs!), but the success of “Berkeley Stonebraker’s” signature system, well beyond its original intents, demonstrates that a broad majority of database problems can be solved well with a good general-purpose architecture. Moreover, the design of that architecture is a technical challenge and accomplishment in its own right. In the end, as in most science and engineering debates, there isn’t only one good way to do things. Both Stonebrakers have lessons to teach us. But at base, I’m still a fan of the broader agenda that “Berkeley Stonebraker” embraced.

22 As Emerson said, “a foolish consistency is the hobgoblin of little minds”.

A final lesson I take from Postgres is the unpredictable potential that can come from open-sourcing your research. In his Turing talk, Stonebraker speaks about the “serendipity” of PostgreSQL succeeding in open source, largely via people outside Stonebraker’s own sphere. It’s a wonderfully modest quote:

[A] pick-up team of volunteers, none of whom have anything to do with me or Berkeley, have been shepherding that open source system ever since 1995. The system that you get off the web for Postgres comes from this pick-up team. It is open source at its best and I want to just mention that I have nothing to do with that and that collection of folks we all owe a huge debt of gratitude to [Sto14].

I’m sure all of us who have written open source would love for that kind of “serendipity” to come our way. But it’s not all serendipity: the roots of that good luck were undoubtedly in the ambition, breadth and vision that Stonebraker had for the project, and the team he mentored to build the Postgres prototype. If there’s a lesson there, it might be to “do something important and set it free.” It seems to me (to use a Stonebrakerism) that you can’t skip either part of that lesson.

ACKNOWLEDGMENTS

I’m indebted to my old Postgres buddies Wei Hong, Jeff Meredith, and Mike Olson for their remembrances and input, and to Craig Kerstiens for his input on modern-day PostgreSQL.

REFERENCES

[Bro75] Frederick P. Brooks. The mythical man-month, 1975.

[Bro19] Michael L. Brodie, editor. Making Databases Work. Morgan & Claypool, 2019.

[DE19a] DB-Engines. DB-Engines ranking, 2019. https://db-engines.com/en/ranking (Last accessed January 4, 2019).

[DE19b] DB-Engines. Method of calculating the scores of the DB-Engines ranking, 2019. https://db-engines.com/en/ranking_definition (Last accessed January 4, 2019).

[DE19c] DB-Engines. PostgreSQL is the DBMS of the year 2018, January 2019. https://db-engines.com/en/blog_post/79 (Last accessed January 4, 2019).

[DS08] David DeWitt and Michael Stonebraker. Mapreduce: A major step backwards. The Database Column, 1:23, 2008.

[Gut84] Antonin Guttman. R-trees: A dynamic index structure for spatial searching. In Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data, SIGMOD ’84, pages 47–57, New York, NY, USA, 1984. ACM.

[HKM+02] Joseph M. Hellerstein, Elias Koutsoupias, Daniel P. Miranker, Christos H. Papadimitriou, and Vasilis Samoladas. On a model of indexability and its bounds for range queries. J. ACM, 49(1):35–55, January 2002.

[HNP95] Joseph M. Hellerstein, Jeffrey F. Naughton, and Avi Pfeffer. Generalized search trees for database systems. In Proceedings of the 21st International Conference on Very Large Data Bases, VLDB ’95, pages 562–573, San Francisco, CA, USA, 1995. Morgan Kaufmann Publishers Inc.

[HRS+12] Joseph M. Hellerstein, Christopher Ré, Florian Schoppmann, Daisy Zhe Wang, Eugene Fratkin, Aleksander Gorajek, Kee Siong Ng, Caleb Welton, Xixuan Feng, Kun Li, et al. The MADlib analytics library: or MAD skills, the SQL. Proceedings of the VLDB Endowment, 5(12):1700–1711, 2012.

[IBM10] IBM to acquire Netezza, September 2010. http://www-03.ibm.com/press/us/en/pressrelease/32514.wss#release (Last accessed January 22, 2018).

[KMH97] Marcel Kornacker, C. Mohan, and Joseph M. Hellerstein. Concurrency and recovery in generalized search trees. In Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, SIGMOD ’97, pages 62–72, New York, NY, USA, 1997. ACM.

[Mal10] Om Malik. Big Data = Big Money: EMC Buys Greenplum. In GigaOm, July 2010. https://gigaom.com/2010/07/06/emc-buys-greenplum/.

[Mon96] John Monroe. Informix acquires illustra for complex data management. In Federal Computer Week, January 1996.

[OFS83] James Ong, Dennis Fogg, and Michael Stonebraker. Implementation of data abstraction in the relational database system ingres. ACM Sigmod Record, 14(1):1–14, 1983.

[Ols93] Michael A. Olson. The design and implementation of the inversion file system. 1993.

[SAHR84] Michael Stonebraker, Erika Anderson, Eric Hanson, and Brad Rubenstein. Quel as a data type. In Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data, SIGMOD ’84, pages 208–214, New York, NY, USA, 1984. ACM.

[Sho11] Erick Shonfeld. Big pay day for big data. teradata buys aster data for $263 million. In TechCrunch, May 2011. https://techcrunch.com/2011/03/03/teradata-buys-aster-data-263-million/ (Last accessed January 22, 2018).

[SHWK76] Michael Stonebraker, Gerald Held, Eugene Wong, and Peter Kreps. The design and implementation of ingres. ACM Transactions on Database Systems (TODS), 1(3):189–222, 1976.

[SK91] Michael Stonebraker and Greg Kemnitz. The postgres next generation database management system. Commun. ACM, 34(10):78–92, October 1991.

[SR86] Michael Stonebraker and Lawrence A. Rowe. The design of postgres. In Proceedings of the 1986 ACM SIGMOD International Conference on Management of Data, SIGMOD ’86, pages 340–355, New York, NY, USA, 1986. ACM.

[SRG83] M. Stonebraker, B. Rubenstein, and A. Guttman. Application of abstract data types and abstract indices to cad bases. IEEE Trans. on Software Engineering, 1983.

[Sto86] Michael Stonebraker. The case for shared nothing. IEEE Database Eng. Bull., 9(1):4–9, 1986.

[Sto87] Michael Stonebraker. The design of the postgres storage system. In Proceedings of the 13th International Conference on Very Large Data Bases, VLDB ’87, pages 289–300, San Francisco, CA, USA, 1987. Morgan Kaufmann Publishers Inc.

[Sto95] Michael Stonebraker. An overview of the sequoia 2000 project. Digital Technical Journal, 7(3):39–49, 1995.

[Sto14] Michael Stonebraker. The land sharks are on the squawk box, 2014. https://www.acm.org/turing-lecture-stonebraker (Last accessed January 4, 2019).

Reposted from: https://my.oschina.net/innovation/blog/3017918
