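There are plenty of introductions to the CAP theorem online, but I feel many of them do not explain it clearly. I had planned to write my own, then found that the Wikipedia article does a better job, so I am quoting it directly (the notes in parentheses, shown in red in the original post, are my own annotations):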
In theoretical computer science, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees:[1][2][3] (all three cannot be satisfied at once)
Consistency | Availability | Partition tolerance |
---|---|---|
Every read receives the most recent write or an error (reads see the latest data) | Every request receives a (non-error) response – without guarantee that it contains the most recent write (a response always comes back) | The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes (i.e., the system is distributed) |
In other words, the CAP theorem states that in the presence of a network partition (P is a given), one has to choose between consistency and availability (trade off C against A). Note that consistency as defined in the CAP theorem is quite different from the consistency guaranteed in ACID database transactions.
No distributed system is safe from network failures, thus network partitioning generally has to be tolerated (since the system is distributed, P has to be chosen). In the presence of a partition, one is then left with two options: consistency or availability. When choosing consistency over availability, the system will return an error or a time-out if particular information cannot be guaranteed to be up to date due to network partitioning (since the system is distributed and the data must be the latest, there will inevitably be times when it is unavailable). When choosing availability over consistency, the system will always process the query and try to return the most recent available version of the information, even if it cannot guarantee it is up to date due to network partitioning (since the system is distributed and must stay available, there is no way to guarantee the data is the latest).
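To make the two options concrete, here is a minimal toy sketch of my own, not taken from the Wikipedia article. The names `Replica`, `Unavailable`, `mode`, and `primary_value` are all hypothetical, and a real data store would involve far more than this (quorums, timeouts, conflict resolution). It only illustrates how a replica cut off from the primary might answer a read under each choice:

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised by a CP replica that cannot confirm its data is current."""


@dataclass
class Replica:
    mode: str                  # "CP" or "AP" (hypothetical labels)
    value: str = "v1"          # last value this replica has seen
    partitioned: bool = False  # True when cut off from the primary

    def read(self, primary_value: str) -> str:
        # primary_value stands in for the primary's current state; the
        # replica can only observe it while the network link is up.
        if not self.partitioned:
            # Healthy network: sync first, so the read is both
            # consistent and available.
            self.value = primary_value
            return self.value
        if self.mode == "CP":
            # Consistency over availability: refuse rather than risk
            # returning stale data.
            raise Unavailable("cannot confirm latest write during partition")
        # Availability over consistency: answer with the possibly stale copy.
        return self.value


cp = Replica(mode="CP")
ap = Replica(mode="AP")

# No partition: both replicas return the latest write.
print(cp.read("v2"), ap.read("v2"))  # v2 v2

# A partition starts, then the primary accepts a new write "v3".
cp.partitioned = ap.partitioned = True
print(ap.read("v3"))                 # v2 (stale, but still a response)
try:
    cp.read("v3")
except Unavailable as exc:
    print("CP replica:", exc)
```

While the link is up, the two modes behave identically; they only diverge once `partitioned` is set, which is the only situation in which the theorem forces a choice.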
In the absence of network failure – that is, when the distributed system is running normally – both availability and consistency can be satisfied. (This sentence is key: C, A, and P can all be satisfied at once, provided there is no network failure.)
CAP is frequently misunderstood as if one has to choose to abandon one of the three guarantees at all times. In fact, the choice is really between consistency and availability only when a network partition or failure happens; at all other times, no trade-off has to be made.[4][5] (This sentence is also key: it further points out that all three can be satisfied at once while the network is healthy. In practice, though, the network does fail at times, and only then can CAP not be fully satisfied.)
Database systems designed with traditional ACID guarantees in mind such as RDBMS choose consistency over availability, whereas systems designed around the BASE philosophy, common in the NoSQL movement for example, choose availability over consistency.[6] (different databases follow different philosophies in making this choice)
The PACELC theorem builds on CAP by stating that even in the absence of partitioning, another trade-off between latency and consistency occurs.
According to University of California, Berkeley computer scientist Eric Brewer, the theorem first appeared in autumn 1998.[6] It was published as the CAP principle in 1999[7] and presented as a conjecture by Brewer at the 2000 Symposium on Principles of Distributed Computing (PODC).[8] In 2002, Seth Gilbert and Nancy Lynch of MIT published a formal proof of Brewer's conjecture, rendering it a theorem.[1] (first proposed, later proved)
In 2012, Brewer clarified some of his positions, including why the often-used "two out of three" concept can be misleading or misapplied, and the different definition of consistency used in CAP relative to the one used in ACID.[6] (do not misread it)
A similar theorem stating the trade-off between consistency and availability in distributed systems was published by Birman and Friedman in 1996.[9] The result of Birman and Friedman restricted this lower bound to non-commuting operations.
Finally, let's look at where common databases fall under the CAP principle: