The opening of Chapter 12 sums up the thread running through the whole book: when we face different problems, we should know that different options exist.
And when we face a complex system, we usually have to combine many different DBs to serve different needs:
a cache to optimize read-heavy workloads, a search index for the search feature, and so on.
Combining specialized tools by deriving data
For example, it is common to need to integrate an OLTP database with a full-text search index in order to handle queries for arbitrary keywords.
Besides the database and the search index, perhaps you need to keep copies of the data in analytics systems (data warehouses, or batch and stream processing systems);
It seems I had a misconception about stream processing systems: I thought they were only the real-time kind, but apparently stream processing systems are also used heavily for user analytics?
This passage is also very insightful: sometimes you need to think at the level of the whole organization, because different people have completely different needs for the same data.
So when you design an application, you must think comprehensively, e.g. where else this data could be used, and then use different tools to make it convenient for other teams in the company.
Surprisingly often I see software engineers make statements like, “In my experience, 99% of people only need X” or “...don’t need X” (for various values of X). I think that such statements say more about the experience of the speaker than about the actual usefulness of a technology. The range of different things you might want to do with data is dizzyingly wide. What one person considers to be an obscure and pointless feature may well be a central requirement for someone else. The need for data integration often only becomes apparent if you zoom out and consider the dataflows across an entire organization.
So total order broadcast is equivalent to consensus.
In formal terms, deciding on a total order of events is known as total order broadcast, which is equivalent to consensus (see “Consensus algorithms and total order broadcast” on page 366). Most consensus algorithms are designed for situations in which the throughput of a single node is sufficient to process the entire stream of events, and these algorithms do not provide a mechanism for multiple nodes to share the work of ordering the events. It is still an open research problem to design consensus algorithms that can scale beyond the throughput of a single node and that work well in a geographically distributed setting.
This is still an unsolved problem; consensus algorithms today are all still based on a single node.
If the single-node bottleneck must be avoided, conflict resolution may have to happen at the application level.
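To make the single-node bottleneck concrete, here is a minimal sketch (all names are my own illustration, not from the book) of a sequencer that imposes a total order: every event, from every producer, must pass through one process that stamps it with the next sequence number.

```python
import itertools

# Illustrative sketch of why total order broadcast is limited by one
# node's throughput: a single sequencer stamps every event, and all
# consumers apply events in stamp order, so the sequencer must see
# the entire event stream.

class Sequencer:
    def __init__(self):
        self._counter = itertools.count()

    def stamp(self, event):
        # Every producer's events funnel through this one method call,
        # which is exactly the single-node bottleneck in the text.
        return (next(self._counter), event)

seq = Sequencer()
ordered = [seq.stamp(e) for e in ["a", "b", "c"]]
# Replicas that apply events in stamp order all agree on the same order.
```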
Batch and Stream Processing
Batch and stream processors mainly exist to make sure the right data ends up in the right place.
They have a lot in common; the biggest difference is that a stream processor's input is unbounded.
And their implementations are converging more and more:
Spark performs stream processing on top of a batch processing engine by breaking the stream into microbatches, whereas Apache Flink performs batch processing on top of a stream processing engine [5]. In principle, one type of processing can be emulated on top of the other, although the performance characteristics vary: for example, microbatching may perform poorly on hopping or sliding windows [6].
http://data-artisans.com/batch-is-a-special-case-of-streaming/
[5] Kostas Tzoumas: “Batch Is a Special Case of Streaming,” data-artisans.com, September 15, 2015.
This is helpful for me, since I don't know much about functional programming. It seems functional programming is mainly about being deterministic and not mutating its input?
Batch processing has a quite strong functional flavor (even if the code is not written in a functional programming language): it encourages deterministic, pure functions whose output depends only on the input and which have no side effects other than the explicit outputs, treating inputs as immutable and outputs as append-only. Stream processing is similar, but it extends operators to allow managed, fault-tolerant state (see “Rebuilding state after a failure” on page 478).
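That functional flavor can be sketched in a few lines (the job and names are my own illustration): a deterministic, pure batch job that treats its input as immutable and whose output is its only effect.

```python
# Illustrative sketch of batch processing's functional flavor: a pure,
# deterministic function whose output depends only on its input and
# which has no side effects, so a failed run can simply be retried.

def word_count(records):
    counts = {}
    for line in records:                  # input is treated as immutable
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return sorted(counts.items())         # the output is the only effect

# Determinism means re-running the job on the same input is always safe.
assert word_count(["a b a"]) == [("a", 2), ("b", 1)]
```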
Derived data lets us do a gradual migration.
Derived views allow gradual evolution. If you want to restructure a dataset, you do not need to perform the migration as a sudden switch. Instead, you can maintain the old schema and the new schema side by side as two independently derived views onto the same underlying data. You can then start shifting a small number of users to the new view in order to test its performance and find any bugs, while most users continue to be routed to the old view. Gradually, you can increase the proportion of users accessing the new view, and eventually you can drop the old view [10].
The beauty of such a gradual migration is that every stage of the process is easily reversible if something goes wrong: you always have a working system to go back to. By reducing the risk of irreversible damage, you can be more confident about going ahead, and thus move faster to improve your system [11].
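The routing step of such a migration can be sketched like this (the function and parameters are hypothetical, not taken from the Stripe post): hash each user into a stable bucket and compare it against a rollout percentage.

```python
import hashlib

# Hypothetical sketch of gradual migration between two derived views:
# both views stay up to date side by side, and a rollout percentage
# decides which view serves each user.

def serve_from_new_view(user_id: str, rollout_percent: int) -> bool:
    # Hashing makes the assignment stable: a given user always lands
    # in the same bucket, so they see a consistent view across requests.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < rollout_percent

# Ramp 1% -> 10% -> 100%; setting the percentage back to 0 is a full,
# instant rollback, because the old view was never torn down.
```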
https://stripe.com/blog/online-migrations
[10] Jacqueline Xu: “Online Migrations at Scale,” stripe.com, February 2, 2017.
http://conferences.oreilly.com/software-architecture/sa2015/public/schedule/detail/40388
[11] Molly Bartlett Dishman and Martin Fowler: “Agile Architecture,” at O’Reilly Software Architecture Conference, March 2015.
Lambda architecture
The lambda architecture simply runs two systems in parallel, one batch and one stream.
Both systems read from the same always-growing dataset.
If batch processing is used to reprocess historical data, and stream processing is used to process recent updates, then how do you combine the two? The lambda architecture [12] is a proposal in this area that has gained a lot of attention.
The core idea of the lambda architecture is that incoming data should be recorded by appending immutable events to an always-growing dataset, similarly to event sourcing (see “Event Sourcing” on page 457). From these events, read-optimized views are derived. The lambda architecture proposes running two different systems in parallel: a batch processing system such as Hadoop MapReduce, and a separate stream-processing system such as Storm.
In the lambda approach, the stream processor consumes the events and quickly produces an approximate update to the view; the batch processor later consumes the same set of events and produces a corrected version of the derived view.
So the stream gives an approximate view first, and some time later the batch gives the exact view.
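That read path can be sketched with a toy counting example (names and the reconciliation step are my own simplification; real lambda deployments track exactly which events each batch run covered):

```python
# Toy sketch of the lambda architecture: the speed layer updates an
# approximate view incrementally, the batch layer later reprocesses
# the full event log into an exact view, and reads merge the two.

batch_view = {}     # exact counts, rebuilt by each batch run
speed_view = {}     # counts for events not yet covered by a batch run

def record_event(key):
    """Speed layer: fast incremental update."""
    speed_view[key] = speed_view.get(key, 0) + 1

def batch_recompute(all_events):
    """Batch layer: reprocess the complete, immutable event history."""
    batch_view.clear()
    for key in all_events:
        batch_view[key] = batch_view.get(key, 0) + 1
    # Simplification: assume this batch run covered everything seen so far.
    speed_view.clear()

def read(key):
    return batch_view.get(key, 0) + speed_view.get(key, 0)
```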
There are already unified solutions?!
More recent work has enabled the benefits of the lambda architecture to be enjoyed without its downsides, by allowing both batch computations (reprocessing historical data) and stream computations (processing events as they arrive) to be implemented in the same system [15].
[15] Raul Castro Fernandez, Peter Pietzuch, Jay Kreps, et al.: “Liquid: Unifying Nearline and Offline Big Data Integration,” at 7th Biennial Conference on Innovative Data Systems Research (CIDR), January 2015.
- Tools for windowing by event time, not by processing time, since processing time is meaningless when reprocessing historical events (see “Reasoning About Time” on page 468). For example, Apache Beam provides an API for expressing such computations, which can then be run using Apache Flink or Google Cloud Dataflow.
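The core of event-time windowing can be sketched in a few lines (an illustrative tumbling-window counter, not Beam's actual API): the window is derived from the timestamp carried by the event itself, so reprocessing historical events reproduces the same windows.

```python
from collections import defaultdict

# Illustrative sketch of windowing by event time: each event carries
# its own timestamp, so window assignment does not depend on when the
# processor happens to run.

WINDOW_SECONDS = 60   # one-minute tumbling windows

def window_counts(events):
    """events: iterable of (event_time_seconds, key) pairs."""
    counts = defaultdict(int)
    for event_time, key in events:
        window_start = (event_time // WINDOW_SECONDS) * WINDOW_SECONDS
        counts[(window_start, key)] += 1
    return dict(counts)

# Replaying the same historical events always yields the same windows,
# which is what makes reprocessing meaningful.
```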
Flink seems really powerful...
This point is really interesting!
At a most abstract level, databases, Hadoop, and operating systems all perform the same functions: they store some data, and they allow you to process and query that data [16]. A database stores data in records of some data model (rows in tables, documents, vertices in a graph, etc.) while an operating system’s filesystem stores data in files—but at their core, both are “information management” systems [17].
http://www.cs.virginia.edu/~zaher/classes/CS656/p365-ritchie.pdf
[16] Dennis M. Ritchie and Ken Thompson: “The UNIX Time-Sharing System,” Communications of the ACM, volume 17, number 7, pages 365–375, July 1974. doi: 10.1145/361011.361061
http://people.eecs.berkeley.edu/~brewer/cs262/systemr.html
[17] Eric A. Brewer and Joseph M. Hellerstein: “CS262a: Advanced Topics in Computer Systems,” lecture notes, University of California, Berkeley, cs.berkeley.edu, August 2011.
This passage clearly defines many of the book's concepts; worth keeping it somewhere searchable:
DDIA concept definitions
Composing Data Storage Technologies
Over the course of this book we have discussed various features provided by databases and how they work, including:
- Secondary indexes, which allow you to efficiently search for records based on the value of a field (see “Other Indexing Structures” on page 85)
- Materialized views, which are a kind of precomputed cache of query results (see “Aggregation: Data Cubes and Materialized Views” on page 101)
- Replication logs, which keep copies of the data on other nodes up to date (see “Implementation of Replication Logs” on page 158)
- Full-text search indexes, which allow keyword search in text (see “Full-text search and fuzzy indexes” on page 88) and which are built into some relational databases [1]
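All four features share one shape: a derived dataset kept consistent with a primary dataset as writes flow through. A minimal illustration of that shape (names are my own), using a secondary index maintained from a stream of changes:

```python
# Illustrative sketch of the common shape behind these features: a
# derived structure (here, a secondary index on "email") kept in sync
# with the primary records as each change is applied.

primary = {}     # key -> record
by_email = {}    # derived data: email -> key

def apply_change(key, record):
    """Apply one change-log entry (record=None means deletion)."""
    old = primary.get(key)
    if old is not None:
        by_email.pop(old["email"], None)   # drop the stale index entry
    if record is None:
        primary.pop(key, None)
    else:
        primary[key] = record
        by_email[record["email"]] = key
```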
I think this passage nicely captures the philosophy of distributed systems: the whole system is composed of different components, each component is an independent unit, and the system integrates the different components together.
This is consistent with the Unix philosophy: each component must do one thing, and do it well.
Viewed like this, batch and stream processors are like elaborate implementations of triggers, stored procedures, and materialized view maintenance routines. The derived data systems they maintain are like different index types. For example, a relational database may support B-tree indexes, hash indexes, spatial indexes (see “Multi-column indexes” on page 87), and other types of indexes. In the emerging architecture of derived data systems, instead of implementing those facilities as features of a single integrated database product, they are provided by various different pieces of software, running on different machines, administered by different teams.
Especially this sentence:
In the emerging architecture of derived data systems, instead of implementing those facilities as features of a single integrated database product, they are provided by various different pieces of software, running on different machines, administered by different teams.
It is a bit unrealistic for a single DB to try to provide every feature itself.
Ha! My thinking overlaps with Martin's again, which proves I did understand his point!
I feel the passage below shows that unifying writes is more appropriate, or rather that the mainstream is converging toward unifying writes.
Where will these developments take us in the future? If we start from the premise that there is no single data model or storage format that is suitable for all access patterns, I speculate that there are two avenues by which different storage and processing tools can nevertheless be composed into a cohesive system:
Federated databases: unifying reads
It is possible to provide a unified query interface to a wide variety of underlying storage engines and processing methods—an approach known as a federated database or polystore [18, 19]. For example, PostgreSQL’s foreign data wrapper feature fits this pattern [20]. Applications that need a specialized data model or query interface can still access the underlying storage engines directly, while users who want to combine data from disparate places can do so easily through the federated interface.
A federated query interface follows the relational tradition of a single integrated system with a high-level query language and elegant semantics, but a complicated implementation.
Unbundled databases: unifying writes
While federation addresses read-only querying across several different systems, it does not have a good answer to synchronizing writes across those systems. We said that within a single database, creating a consistent index is a built-in feature. When we compose several storage systems, we similarly need to ensure that all data changes end up in all the right places, even in the face of faults. Making it easier to reliably plug together storage systems (e.g., through change data capture and event logs) is like unbundling a database’s index-maintenance features in a way that can synchronize writes across disparate technologies [7, 21].
The unbundled approach follows the Unix tradition of small tools that do one thing well [22], that communicate through a uniform low-level API (pipes), and that can be composed using a higher-level language (the shell) [16].
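A toy sketch of that unbundled write path (all names are illustrative): one append-only log, with each storage system consuming it at its own pace from its own offset.

```python
# Illustrative sketch of unifying writes through an event log: every
# write is appended to one ordered log, and each storage system is an
# independent consumer that applies the log in order.

log = []    # durable, append-only event log

class Consumer:
    def __init__(self, apply_fn):
        self.offset = 0          # each consumer tracks its own position
        self.apply = apply_fn

    def catch_up(self):
        # A slow or crashed consumer simply lags behind; once it runs
        # again, it replays from its offset and misses nothing.
        while self.offset < len(log):
            self.apply(log[self.offset])
            self.offset += 1

db, search = {}, {}
db_consumer = Consumer(lambda ev: db.update({ev["key"]: ev["value"]}))
search_consumer = Consumer(lambda ev: search.update({ev["value"]: ev["key"]}))

log.append({"key": 1, "value": "hello"})
db_consumer.catch_up()
search_consumer.catch_up()    # same log, same order, different speeds
```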
No: Martin focuses on unifying writes mainly because its engineering is more complex.
Making unbundling work
Federation and unbundling are two sides of the same coin: composing a reliable, scalable, and maintainable system out of diverse components. Federated read-only querying requires mapping one data model into another, which takes some thought but is ultimately quite a manageable problem. I think that keeping the writes to several storage systems in sync is the harder engineering problem, and so I will focus on it.
So I should look into how unifying reads is implemented, since it doesn't need to sync on write?
To read:
http://wp.sigmod.org/?p=1629
[18] Michael Stonebraker: “The Case for Polystores,” wp.sigmod.org, July 13, 2015.
Loose coupling really just means letting each individual component do one thing well; it only needs to keep its interface stable, and nobody else has to care how it is implemented or optimized internally.
The big advantage of log-based integration is loose coupling between the various components, which manifests itself in two ways:
At a system level, asynchronous event streams make the system as a whole more robust to outages or performance degradation of individual components. If a consumer runs slow or fails, the event log can buffer messages (see “Disk space usage” on page 450), allowing the producer and any other consumers to continue running unaffected. The faulty consumer can catch up when it is fixed, so it doesn’t miss any data, and the fault is contained. By contrast, the synchronous interaction of distributed transactions tends to escalate local faults into large-scale failures (see “Limitations of distributed transactions” on page 363).
At a human level, unbundling data systems allows different software components and services to be developed, improved, and maintained independently from each other by different teams. Specialization allows each team to focus on doing one thing well, with well-defined interfaces to other teams’ systems. Event logs provide an interface that is powerful enough to capture fairly strong consistency properties (due to durability and ordering of events), but also general enough to be applicable to almost any kind of data.
This sentence is key: building for scale you don't need is wasted effort.
Wu Jun also discussed this in his column: Google builds for 10x the current workload first, and only upgrades after that.
As I said in the Preface, building for scale that you don’t need is wasted effort and may lock you into an inflexible design. In effect, it is a form of premature optimization.
When I have time, I should look at the current research directions.
Similarly, it would be great to be able to precompute and update caches more easily. Recall that a materialized view is essentially a precomputed cache, so you could imagine creating a cache by declaratively specifying materialized views for complex queries, including recursive queries on graphs (see “Graph-Like Data Models” on page 49) and application logic. There is interesting early-stage research in this area, such as differential dataflow [24, 25], and I hope that these ideas will find their way into production systems.
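The basic idea of a materialized view as a precomputed, incrementally maintained cache can be sketched like this (a toy aggregate of my own, far simpler than differential dataflow):

```python
# Illustrative sketch of incremental materialized-view maintenance:
# each write applies its delta to the stored aggregate, so reads never
# have to re-run the full query over the base table.

orders = []                 # base table of (product, price)
revenue_by_product = {}     # materialized view: SUM(price) GROUP BY product

def insert_order(product, price):
    orders.append((product, price))
    # Apply only the delta instead of recomputing the whole aggregate.
    revenue_by_product[product] = revenue_by_product.get(product, 0) + price

insert_order("book", 30)
insert_order("book", 12)
# Reads now hit the precomputed view rather than scanning `orders`.
```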
The approach of unbundling databases by composing specialized storage and processing systems with application code is also becoming known as the “database inside-out” approach [26], after the title of a conference talk I gave in 2014 [27]. However, calling it a “new architecture” is too grandiose. I see it more as a design pattern, a starting point for discussion, and we give it a name simply so that we can better talk about it.
Martin is too modest.
These ideas are not mine; they are simply an amalgamation of other people’s ideas from which I think we should learn. In particular, there is a lot of overlap with dataflow languages such as Oz [28] and Juttle [29], functional reactive programming (FRP) languages such as Elm [30, 31], and logic programming languages such as Bloom [32]. The term unbundling in this context was proposed by Jay Kreps [7].
Really not on the same level...
In the microservices approach, the code that processes the purchase would probably query an exchange-rate service or database in order to obtain the current rate for a particular currency.
In the dataflow approach, the code that processes purchases would subscribe to a stream of exchange rate updates ahead of time, and record the current rate in a local database whenever it changes. When it comes to processing the purchase, it only needs to query the local database.
I think the real reason microservices are more popular may be that they are the lazy approach, the same idea as in operating systems: if you don't need it, don't update eagerly.
Huh? I was wrong: the network round trip takes longer than we think, so the eager update is actually faster?
The second approach has replaced a synchronous network request to another service with a query to a local database (which may be on the same machine, even in the same process). Not only is the dataflow approach faster, but it is also more robust to the failure of another service. The fastest and most reliable network request is no network request at all! Instead of RPC, we now have a stream join between purchase events and exchange rate update events (see “Stream-table join (stream enrichment)” on page 473).
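The stream-table join described here can be sketched as follows (function names are my own illustration):

```python
# Illustrative sketch of the dataflow approach: subscribe to exchange
# rate updates ahead of time and keep them in a local table, so that
# processing a purchase is a local lookup, not a network request.

rates = {}    # local table, kept current by the rate-update stream

def on_rate_update(currency, rate):
    """Consumer of the exchange-rate update stream."""
    rates[currency] = rate

def on_purchase(amount, currency):
    """Stream-table join: enrich each purchase with the latest local rate."""
    return amount * rates[currency]    # no RPC on the purchase path

on_rate_update("EUR", 1.1)
usd_amount = on_purchase(100, "EUR")   # joins against the local table
```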
Martin differs from what I thought here; maybe I don't understand the dataflow mindset well enough yet.
Subscribing to a stream of changes, rather than querying the current state when needed, brings us closer to a spreadsheet-like model of computation: when some piece of data changes, any derived data that depends on it can swiftly be updated. There are still many open questions, for example around issues like time-dependent joins, but I believe that building applications around dataflow ideas is a very promising direction to go in.
So Martin prefers derived data to be done eagerly. That makes sense: derived data has to be processed sooner or later, and when to process it depends on your app's specific needs.
Taken together, the write path and the read path encompass the whole journey of the data, from the point where it is collected to the point where it is consumed (probably by another human). The write path is the portion of the journey that is precomputed—i.e., that is done eagerly as soon as the data comes in, regardless of whether anyone has asked to see it. The read path is the portion of the journey that only happens when someone asks for it. If you are familiar with functional programming languages, you might notice that the write path is similar to eager evaluation, and the read path is similar to lazy evaluation.
Looks like I need to learn functional programming too...
This point is important: in the future, applications may well be stateful, so the whole architecture could change.
The huge popularity of web applications in the last two decades has led us to certain assumptions about application development that are easy to take for granted. In particular, the client/server model—in which clients are largely stateless and servers have the authority over data—is so common that we almost forget that anything else exists. However, technology keeps moving on, and I think it is important to question the status quo from time to time.
Storm’s distributed RPC feature supports this usage pattern (see “Message passing and RPC” on page 468). For example, it has been used to compute the number of people who have seen a URL on Twitter—i.e., the union of the follower sets of everyone who has tweeted that URL [48].
[48] Nathan Marz: “Trident: A High-Level Abstraction for Realtime Computation,” blog.twitter.com, August 2, 2012.