Hyperledger Fabric v1.4 (LTS) Series (3.10): Key Concepts - The Ordering Service
Audience: Architects, ordering service admins, channel creators
This topic serves as a conceptual introduction to ordering: how orderers interact with peers, the role they play in a transaction flow, and an overview of the currently available implementations of the ordering service, with a particular focus on the Raft implementation.
Many distributed blockchains, such as Ethereum and Bitcoin, are not permissioned, which means that any node can participate in the consensus process, wherein transactions are ordered and bundled into blocks. Because of this fact, these systems rely on probabilistic consensus algorithms which eventually guarantee ledger consistency to a high degree of probability, but which are still vulnerable to divergent ledgers (also known as a ledger “fork”), where different participants in the network have a different view of the accepted order of transactions.
Hyperledger Fabric works differently. It features a kind of a node called an orderer (it’s also known as an “ordering node”) that does this transaction ordering, which along with other nodes forms an ordering service. Because Fabric’s design relies on deterministic consensus algorithms, any block a peer validates as generated by the ordering service is guaranteed to be final and correct. Ledgers cannot fork the way they do in many other distributed blockchains.
Translator's note: the "deterministic" consensus algorithms here stand in contrast to Bitcoin's "probabilistic" consensus algorithms.
In addition to promoting finality, separating the endorsement of chaincode execution (which happens at the peers) from ordering gives Fabric advantages in performance and scalability, eliminating bottlenecks which can occur when execution and ordering are performed by the same nodes.
In addition to their ordering role, orderers also maintain the list of organizations that are allowed to create channels. This list of organizations is known as the “consortium”, and the list itself is kept in the configuration of the “orderer system channel” (also known as the “ordering system channel”). By default, this list, and the channel it lives on, can only be edited by the orderer admin. Note that it is possible for an ordering service to hold several of these lists, which makes the consortium a vehicle for Fabric multi-tenancy.
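As a rough illustration, the consortium is typically declared in the profile used to generate the orderer system channel's genesis block in `configtx.yaml`. The sketch below uses hypothetical profile and organization names (`SampleGenesisProfile`, `SampleConsortium`, `Org1`, `Org2`); a real network defines its own, and the referenced anchors would be defined elsewhere in the same file.

```yaml
# Sketch of a configtx.yaml genesis profile (names are illustrative, not prescriptive)
Profiles:
    SampleGenesisProfile:            # profile used to create the orderer system channel
        Orderer:
            <<: *OrdererDefaults     # orderer settings defined elsewhere in the file
            Organizations:
                - *OrdererOrg        # the ordering organization(s)
        Consortiums:
            SampleConsortium:        # the list of organizations allowed to create channels
                Organizations:
                    - *Org1
                    - *Org2
```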
Orderers also enforce basic access control for channels, restricting who can read and write data to them, and who can configure them. Remember that who is authorized to modify a configuration element in a channel is subject to the policies that the relevant administrators set when they created the consortium or the channel. Configuration transactions are processed by the orderer, as it needs to know the current set of policies to execute its basic form of access control. In this case, the orderer processes the configuration update to make sure that the requestor has the proper administrative rights. If so, the orderer validates the update request against the existing configuration, generates a new configuration transaction, and packages it into a block that is relayed to all peers on the channel. The peers then process the configuration transactions in order to verify that the modifications approved by the orderer do indeed satisfy the policies defined in the channel.
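To give a flavor of the policies the orderer consults, channel-level access control in a v1.4 network is commonly expressed with `ImplicitMeta` rules in `configtx.yaml`, roughly as sketched below. The rules shown are the common sample defaults, shown only for illustration.

```yaml
# Sketch of channel-level policies in configtx.yaml (illustrative defaults)
Policies:
    Readers:
        Type: ImplicitMeta
        Rule: "ANY Readers"          # who may read data from the channel
    Writers:
        Type: ImplicitMeta
        Rule: "ANY Writers"          # who may submit transactions to the channel
    Admins:
        Type: ImplicitMeta
        Rule: "MAJORITY Admins"      # who may modify the channel configuration
```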
Everything that interacts with a blockchain network, including peers, applications, admins, and orderers, acquires their organizational identity from their digital certificate and their Membership Service Provider (MSP) definition.
For more information about identities and MSPs, check out our documentation on Identity and Membership.
Just like peers, ordering nodes belong to an organization. And similar to peers, a separate Certificate Authority (CA) should be used for each organization. Whether this CA will function as the root CA, or whether you choose to deploy a root CA and then intermediate CAs associated with that root CA, is up to you.
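For context, an organization's identity, including the orderer organization's, is declared in `configtx.yaml` roughly as below; the MSP directory points at certificates issued by that organization's CA. The names and path here are hypothetical.

```yaml
# Sketch of an orderer organization definition (names and paths are illustrative)
Organizations:
    - &OrdererOrg
        Name: OrdererOrg
        ID: OrdererMSP               # the MSP ID other components use to refer to this org
        MSPDir: crypto-config/ordererOrganizations/example.com/msp
```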
We’ve seen from our topic on Peers that they form the basis for a blockchain network, hosting ledgers, which can be queried and updated by applications through smart contracts.
Specifically, applications that want to update the ledger are involved in a process with three phases that ensures all of the peers in a blockchain network keep their ledgers consistent with each other.
In the first phase, a client application sends a transaction proposal to a subset of peers that will invoke a smart contract to produce a proposed ledger update and then endorse the results. The endorsing peers do not apply the proposed update to their copy of the ledger at this time. Instead, the endorsing peers return a proposal response to the client application. The endorsed transaction proposals will ultimately be ordered into blocks in phase two, and then distributed to all peers for final validation and commit in phase three.
For an in-depth look at the first phase, refer back to the Peers topic.
After the completion of the first phase of a transaction, a client application has received an endorsed transaction proposal response from a set of peers. It’s now time for the second phase of a transaction.
In this phase, application clients submit transactions containing endorsed transaction proposal responses to an ordering service node. The ordering service creates blocks of transactions which will ultimately be distributed to all peers on the channel for final validation and commit in phase three.
Ordering service nodes receive transactions from many different application clients concurrently. These ordering service nodes work together to collectively form the ordering service. Its job is to arrange batches of submitted transactions into a well-defined sequence and package them into blocks. These blocks will become the blocks of the blockchain!
The number of transactions in a block depends on channel configuration parameters related to the desired size and maximum elapsed duration for a block (the `BatchSize` and `BatchTimeout` parameters, to be exact). The blocks are then saved to the orderer's ledger and distributed to all peers that have joined the channel. If a peer happens to be down at this time, or joins the channel later, it will receive the blocks after reconnecting to an ordering service node, or by gossiping with another peer. We'll see how this block is processed by peers in the third phase.
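As a sketch of where these parameters live, the `Orderer` section of `configtx.yaml` in a v1.4 network looks roughly like this; the endpoint is hypothetical and the values are the sample defaults, not tuned recommendations.

```yaml
# Sketch of the block-cutting parameters in configtx.yaml (values are illustrative)
Orderer: &OrdererDefaults
    OrdererType: solo                # or kafka / etcdraft
    Addresses:
        - orderer.example.com:7050   # hypothetical orderer endpoint
    BatchTimeout: 2s                 # maximum time to wait before cutting a block
    BatchSize:
        MaxMessageCount: 10          # maximum number of transactions in a block
        AbsoluteMaxBytes: 99 MB      # hard cap on serialized block size
        PreferredMaxBytes: 512 KB    # preferred block size
```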
The first role of an ordering node is to package proposed ledger updates. In this example, application A1 sends a transaction T1 endorsed by E1 and E2 to the orderer O1. In parallel, Application A2 sends transaction T2 endorsed by E1 to the orderer O1. O1 packages transaction T1 from application A1 and transaction T2 from application A2 together with other transactions from other applications in the network into block B2. We can see that in B2, the transaction order is T1,T2,T3,T4,T6,T5 – which may not be the order in which these transactions arrived at the orderer! (This example shows a very simplified ordering service configuration with only one ordering node.)
It’s worth noting that the sequencing of transactions in a block is not necessarily the same as the order received by the ordering service, since there can be multiple ordering service nodes that receive transactions at approximately the same time. What’s important is that the ordering service puts the transactions into a strict order, and peers will use this order when validating and committing transactions.
This strict ordering of transactions within blocks makes Hyperledger Fabric a little different from other blockchains where the same transaction can be packaged into multiple different blocks that compete to form a chain. In Hyperledger Fabric, the blocks generated by the ordering service are final. Once a transaction has been written to a block, its position in the ledger is immutably assured. As we said earlier, Hyperledger Fabric’s finality means that there are no ledger forks — validated transactions will never be reverted or dropped.
We can also see that, whereas peers execute smart contracts and process transactions, orderers most definitely do not. Every authorized transaction that arrives at an orderer is mechanically packaged in a block — the orderer makes no judgement as to the content of a transaction (except for channel configuration transactions, as mentioned earlier).
At the end of phase two, we see that orderers have been responsible for the simple but vital processes of collecting proposed transaction updates, ordering them, and packaging them into blocks, ready for distribution.
The third phase of the transaction workflow involves the distribution and subsequent validation of blocks from the orderer to the peers, where they can be applied to the ledger.
Phase 3 begins with the orderer distributing blocks to all peers connected to it. It’s also worth noting that not every peer needs to be connected to an orderer — peers can cascade blocks to other peers using the gossip protocol.
Each peer will validate distributed blocks independently, but in a deterministic fashion, ensuring that ledgers remain consistent. Specifically, each peer in the channel will validate each transaction in the block to ensure it has been endorsed by the required organization’s peers, that its endorsements match, and that it hasn’t become invalidated by other recently committed transactions which may have been in-flight when the transaction was originally endorsed. Invalidated transactions are still retained in the immutable block created by the orderer, but they are marked as invalid by the peer and do not update the ledger’s state.
The second role of an ordering node is to distribute blocks to peers. In this example, orderer O1 distributes block B2 to peer P1 and peer P2. Peer P1 processes block B2, resulting in a new block being added to ledger L1 on P1. In parallel, peer P2 processes block B2, resulting in a new block being added to ledger L1 on P2. Once this process is complete, the ledger L1 has been consistently updated on peers P1 and P2, and each may inform connected applications that the transaction has been processed.
In summary, phase three sees the blocks generated by the ordering service applied consistently to the ledger. The strict ordering of transactions into blocks allows each peer to validate that transaction updates are consistently applied across the blockchain network.
For a deeper look at phase 3, refer back to the Peers topic.
While every ordering service currently available handles transactions and configuration updates the same way, there are nevertheless several different implementations for achieving consensus on the strict ordering of transactions between ordering service nodes.
For information about how to stand up an ordering node (regardless of the implementation the node will be used in), check out our documentation on standing up an ordering node.
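Although the details belong in the deployment documentation, the sketch below shows the general shape of an ordering node's local configuration (`orderer.yaml`); the paths and the MSP ID are hypothetical.

```yaml
# Sketch of a minimal orderer.yaml General section (paths and IDs are illustrative)
General:
    ListenAddress: 0.0.0.0
    ListenPort: 7050
    GenesisMethod: file              # bootstrap from a genesis block file
    GenesisFile: /var/hyperledger/orderer/genesis.block
    LocalMSPDir: /var/hyperledger/orderer/msp
    LocalMSPID: OrdererMSP
    TLS:
        Enabled: true
```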
Solo
The Solo implementation of the ordering service is aptly named: it features only a single ordering node. As a result, it is not, and never will be, fault tolerant. For that reason, Solo implementations cannot be considered for production, but they are a good choice for testing applications and smart contracts, or for creating proofs of concept. However, if you ever want to extend this PoC network into production, you might want to start with a single node Raft cluster, as it may be reconfigured to add additional nodes.
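The switch is just the consensus type selected in the channel configuration. As a minimal sketch (hostname and certificate paths are hypothetical), a proof of concept that may later need to grow could use a single-consenter Raft configuration instead of `solo`:

```yaml
# Sketch: solo versus a single-node Raft (etcdraft) ordering service
Orderer: &OrdererDefaults
    # OrdererType: solo             # single node, never fault tolerant, test/PoC only
    OrdererType: etcdraft           # single node today, can grow by adding consenters later
    EtcdRaft:
        Consenters:
            - Host: orderer.example.com
              Port: 7050
              ClientTLSCert: tls/orderer/server.crt
              ServerTLSCert: tls/orderer/server.crt
```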
Raft
New as of v1.4.1, Raft is a crash fault tolerant (CFT) ordering service based on an implementation of the Raft protocol in `etcd`. Raft follows a "leader and follower" model, where a leader node is elected (per channel) and its decisions are replicated by the followers. Raft ordering services should be easier to set up and manage than Kafka-based ordering services, and their design allows different organizations to contribute nodes to a distributed ordering service.
Kafka
Similar to Raft-based ordering, Apache Kafka is a CFT implementation that uses a “leader and follower” node configuration. Kafka utilizes a ZooKeeper ensemble for management purposes. The Kafka based ordering service has been available since Fabric v1.0, but many users may find the additional administrative overhead of managing a Kafka cluster intimidating or undesirable.
As stated above, a Solo ordering service is a good choice when developing test, development, or proof-of-concept networks. For that reason, it is the default ordering service deployed in our Build your first network tutorial, as, from the perspective of other network components, a Solo ordering service processes transactions identically to the more elaborate Kafka and Raft implementations while saving on the administrative overhead of maintaining and upgrading multiple nodes and clusters. Because a Solo ordering service is not crash fault tolerant, it should never be considered a viable alternative for a production blockchain network. For networks which wish to start with only a single ordering node but might wish to grow in the future, a single node Raft cluster is a better option.
For information on how to configure a Raft ordering service, check out our documentation on configuring a Raft ordering service.
The go-to ordering service choice for production networks, the Fabric implementation of the established Raft protocol uses a “leader and follower” model, in which a leader is dynamically elected among the ordering nodes in a channel (this collection of nodes is known as the “consenter set”), and that leader replicates messages to the follower nodes. Because the system can sustain the loss of nodes, including leader nodes, as long as there is a majority of ordering nodes (what’s known as a “quorum”) remaining, Raft is said to be “crash fault tolerant” (CFT). In other words, if there are three nodes in a channel, it can withstand the loss of one node (leaving two remaining). If you have five nodes in a channel, you can lose two nodes (leaving three remaining nodes).
From the perspective of the service they provide to a network or a channel, Raft and the existing Kafka-based ordering service (which we’ll talk about later) are similar. They’re both CFT ordering services using the leader and follower design. If you are an application developer, smart contract developer, or peer administrator, you will not notice a functional difference between an ordering service based on Raft versus Kafka. However, there are a few major differences worth considering, especially if you intend to manage an ordering service:
Raft is easier to set up. Although Kafka has scores of admirers, even those admirers will (usually) admit that deploying a Kafka cluster and its ZooKeeper ensemble can be tricky, requiring a high level of expertise in Kafka infrastructure and settings. Additionally, there are many more components to manage with Kafka than with Raft, which means that there are more places where things can go wrong. And Kafka has its own versions, which must be coordinated with your orderers. With Raft, everything is embedded into your ordering node.
Kafka and Zookeeper are not designed to be run across large networks. They are designed to be CFT but should be run in a tight group of hosts. This means that practically speaking you need to have one organization run the Kafka cluster. Given that, having ordering nodes run by different organizations when using Kafka (which Fabric supports) doesn’t give you much in terms of decentralization because the nodes will all go to the same Kafka cluster which is under the control of a single organization. With Raft, each organization can have its own ordering nodes, participating in the ordering service, which leads to a more decentralized system.
Raft is supported natively. While Kafka-based ordering services are currently compatible with Fabric, users are required to get the requisite images and learn how to use Kafka and ZooKeeper on their own. Likewise, support for Kafka-related issues is handled through Apache, the open-source developer of Kafka, not Hyperledger Fabric. The Fabric Raft implementation, on the other hand, has been developed and will be supported within the Fabric developer community and its support apparatus.
Where Kafka uses a pool of servers (called "Kafka brokers") and the admin of the orderer organization specifies how many nodes they want to use on a particular channel, Raft allows the users to specify which ordering nodes will be deployed to which channel. In this way, peer organizations can make sure that, if they also own an orderer, this node will be made a part of an ordering service of that channel, rather than trusting and depending on a central admin to manage the Kafka nodes.
Raft is the first step toward Fabric’s development of a byzantine fault tolerant (BFT) ordering service. As we’ll see, some decisions in the development of Raft were driven by this. If you are interested in BFT, learning how to use Raft should ease the transition.
Note: Similar to Solo and Kafka, a Raft ordering service can lose transactions after acknowledgement of receipt has been sent to a client. For example, if the leader crashes at approximately the same time as a follower provides acknowledgement of receipt. Therefore, application clients should listen on peers for transaction commit events regardless (to check for transaction validity), but extra care should be taken to ensure that the client also gracefully tolerates a timeout in which the transaction does not get committed in a configured timeframe. Depending on the application, it may be desirable to resubmit the transaction or collect a new set of endorsements upon such a timeout.
While Raft offers many of the same features as Kafka, albeit in a simpler and easier-to-use package, it functions substantially differently from Kafka under the covers and introduces a number of new concepts, or twists on existing concepts, to Fabric.
Log entry. The primary unit of work in a Raft ordering service is a “log entry”, with the full sequence of such entries known as the “log”. We consider the log consistent if a majority (a quorum, in other words) of members agree on the entries and their order, making the logs on the various orderers replicated.
Consenter set. The ordering nodes actively participating in the consensus mechanism for a given channel and receiving replicated logs for the channel. This can be all of the nodes available (either in a single cluster or in multiple clusters contributing to the system channel), or a subset of those nodes.
Finite-State Machine (FSM). Every ordering node in Raft has an FSM and collectively they’re used to ensure that the sequence of logs in the various ordering nodes is deterministic (written in the same sequence).
Quorum. Describes the minimum number of consenters that need to affirm a proposal so that transactions can be ordered. For every consenter set, this is a majority of nodes. In a cluster with five nodes, three must be available for there to be a quorum. If a quorum of nodes is unavailable for any reason, the ordering service cluster becomes unavailable for both read and write operations on the channel, and no new logs can be committed.
Leader. This is not a new concept — Kafka also uses leaders, as we’ve said — but it’s critical to understand that at any given time, a channel’s consenter set elects a single node to be the leader (we’ll describe how this happens in Raft later). The leader is responsible for ingesting new log entries, replicating them to follower ordering nodes, and managing when an entry is considered committed. This is not a special type of orderer. It is only a role that an orderer may have at certain times, and then not others, as circumstances determine.
Follower. Again, not a new concept, but what’s critical to understand about followers is that the followers receive the logs from the leader and replicate them deterministically, ensuring that logs remain consistent. As we’ll see in our section on leader election, the followers also receive “heartbeat” messages from the leader. In the event that the leader stops sending those messages for a configurable amount of time, the followers will initiate a leader election and one of them will be elected the new leader.
Every channel runs on a separate instance of the Raft protocol, which allows each instance to elect a different leader. This configuration also allows further decentralization of the service in use cases where clusters are made up of ordering nodes controlled by different organizations. While all Raft nodes must be part of the system channel, they do not necessarily have to be part of all application channels. Channel creators (and channel admins) have the ability to pick a subset of the available orderers and to add or remove ordering nodes as needed (as long as only a single node is added or removed at a time).
While this configuration creates more overhead in the form of redundant heartbeat messages and goroutines, it lays necessary groundwork for BFT.
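As a hedged sketch, a channel's consenter set is declared in the `EtcdRaft` section of the channel's orderer configuration. Here three hypothetical organizations each contribute one ordering node (hosts and certificate paths are illustrative only):

```yaml
# Sketch of a Raft consenter set spanning three organizations (hosts and paths are hypothetical)
Orderer:
    OrdererType: etcdraft
    EtcdRaft:
        Consenters:
            - Host: orderer.org1.example.com
              Port: 7050
              ClientTLSCert: tls/org1/orderer/server.crt
              ServerTLSCert: tls/org1/orderer/server.crt
            - Host: orderer.org2.example.com
              Port: 7050
              ClientTLSCert: tls/org2/orderer/server.crt
              ServerTLSCert: tls/org2/orderer/server.crt
            - Host: orderer.org3.example.com
              Port: 7050
              ClientTLSCert: tls/org3/orderer/server.crt
              ServerTLSCert: tls/org3/orderer/server.crt
```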
In Raft, transactions (in the form of proposals or configuration updates) are automatically routed by the ordering node that receives the transaction to the current leader of that channel. This means that peers and applications do not need to know who the leader node is at any particular time. Only the ordering nodes need to know.
When the orderer validation checks have been completed, the transactions are ordered, packaged into blocks, consented on, and distributed, as described in phase two of our transaction flow.
Although the process of electing a leader happens within the orderer’s internal processes, it’s worth noting how the process works.
Raft nodes are always in one of three states: follower, candidate, or leader. All nodes initially start out as a follower. In this state, they can accept log entries from a leader (if one has been elected), or cast votes for leader. If no log entries or heartbeats are received for a set amount of time (for example, five seconds), nodes self-promote to the candidate state. In the candidate state, nodes request votes from other nodes. If a candidate receives a quorum of votes, then it is promoted to a leader. The leader must accept new log entries and replicate them to the followers.
For a visual representation of how the leader election process works, check out The Secret Lives of Data.
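The heartbeat and election timing described above is tunable. In a v1.4 channel configuration these knobs live under `EtcdRaft.Options`, roughly as sketched below; the values shown are the usual sample defaults, included only for illustration.

```yaml
# Sketch of Raft protocol timing options (illustrative defaults)
EtcdRaft:
    Options:
        TickInterval: 500ms          # length of one Raft "tick"
        ElectionTick: 10             # ticks without a heartbeat before a follower becomes a candidate
        HeartbeatTick: 1             # ticks between leader heartbeats
```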
If an ordering node goes down, how does it get the logs it missed when it is restarted?
While it’s possible to keep all logs indefinitely, in order to save disk space, Raft uses a process called “snapshotting”, in which users can define how many bytes of data will be kept in the log. This amount of data will conform to a certain number of blocks (which depends on the amount of data in the blocks. Note that only full blocks are stored in a snapshot).
For example, let's say lagging replica `R1` was just reconnected to the network. Its latest block is `100`. Leader `L` is at block `196`, and is configured to snapshot at an amount of data that in this case represents 20 blocks. `R1` would therefore receive block `180` from `L` and then make a `Deliver` request for blocks `101` to `180`. Blocks `180` to `196` would then be replicated to `R1` through the normal Raft protocol.
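The amount of block data accumulated between snapshots is itself a channel configuration option; in v1.4 it sits alongside the other Raft options, something like the sketch below (the value is purely illustrative):

```yaml
# Sketch: snapshotting threshold for a Raft ordering service (value is illustrative)
EtcdRaft:
    Options:
        SnapshotIntervalSize: 20 MB  # take a snapshot after roughly this much block data
```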
The other crash fault tolerant ordering service supported by Fabric is an adaptation of a Kafka distributed streaming platform for use as a cluster of ordering nodes. You can read more about Kafka at the Apache Kafka Web site, but at a high level, Kafka uses the same conceptual “leader and follower” configuration used by Raft, in which transactions (which Kafka calls “messages”) are replicated from the leader node to the follower nodes. In the event the leader node goes down, one of the followers becomes the leader and ordering can continue, ensuring fault tolerance, just as with Raft.
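For comparison, selecting the Kafka implementation in a v1.4 `configtx.yaml` looks roughly like the sketch below. The broker addresses are hypothetical, and the Kafka cluster and its ZooKeeper ensemble are deployed and managed separately from Fabric itself.

```yaml
# Sketch of a Kafka-based ordering service selection (broker addresses are illustrative)
Orderer: &OrdererDefaults
    OrdererType: kafka
    Kafka:
        Brokers:                     # bootstrap list; orderers discover the rest of the cluster
            - kafka0.example.com:9092
            - kafka1.example.com:9092
            - kafka2.example.com:9092
            - kafka3.example.com:9092
```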
The management of the Kafka cluster, including the coordination of tasks, cluster membership, access control, and controller election, among others, is handled by a ZooKeeper ensemble and its related APIs.
Kafka clusters and ZooKeeper ensembles are notoriously tricky to set up, so our documentation assumes a working knowledge of Kafka and ZooKeeper. If you decide to use Kafka without having this expertise, you should complete, at a minimum, the first six steps of the Kafka Quickstart guide before experimenting with the Kafka-based ordering service. You can also consult this sample configuration file for a brief explanation of the sensible defaults for Kafka and ZooKeeper.
To learn how to bring up a Kafka-based ordering service, check out our documentation on Kafka.