Btcd区块链的构建(一)

我们在介绍Btcd协议消息时提到,协议设计的目标就是同步transaction或者block,最终在各节点上形成一致的区块链。本文开始,我们将逐步分析节点在收到transaction或者block后如何处理它们,如如何将transaction打包成区块并挖矿、如何将block添加到区块链中等问题。同步transaction后,“矿工”可以将交易打包进新的区块,并向网络传播新区块。所以,对于独立“挖矿”的节点或者通过“矿池”与“矿工”连接的“全节点”而言,它们有两方面的区块来源,即“挖矿”产生的新区块和收到的其它节点转发的区块,如下图所示。

Btcd区块链的构建(一)_第1张图片

同步其它节点转发的区块,就是通过协议消息inv-getdata-block实现的,我们在《Btcd区块链协议消息解析》中介绍过这一过程;“挖矿”的过程我们将在后文中详细介绍。无论是“挖”出新区块还是收到其他节点转发的区块,都需要经过一系列验证后,再加入到区块链上;同时,当区块链的共识机制需要作微调,即发生“软分叉”时,节点之间也需要一个共识机制来支持新的区块或者兼容老的区块,这些过程或者机制的实现位于btcd/blockchain包中。我们将重点介绍btcd/blockchain包,其中的文件有:

  • fullblocktests文件夹: 包含自动生成测试用例的实现类,用于测试区块的共识参数;
  • indexers文件夹: 实现了transaction到block之间和transaction到关联的bitcoin地址之间的索引关系;
  • testdata文件夹: 包含用于测试的block数据;
  • process.go: ProcessBlock()方法所在的文件,它是btcd/blockchain对区块进行处理的入口方法,也是我们重点分析对象;
  • accept.go: maybeAcceptBlock()方法所在的文件,它是区块被链接到区块链上的入口方法;
  • validate.go: 实现了验证transaction和block的各种方法,如CheckBlockSanity()、CheckProofOfWork、CheckTransactionSanity等等;
  • chain.go: 定义了BlockChain类型,包括了区块加入区块链,或者区块链构建的主要实现过程;
  • blockindex.go: 实现了blockIndex,用于维护blockchain上block与其hash之间的映射关系和父block与子block(s)的关系;
  • chainio.go: 提供了向db中读写区块或者相关元数据(如ThresholdCaches、BlockIndex等等)的方法;
  • checkpoints.go: 实现了BlockChain中与Checkpoint相关的方法,如findPreviousCheckpoint()等等,BlockChain中的checkpoints是一系列指定的区块,内置在bitcoin节点中,用于验证区块工作量;
  • compress.go: 实现了一种改进的VLQ(Variable Length Quantity)正整数编码方式,同时实现了对交易输出额进行压缩编码和输出脚本(锁定脚本)进行压缩编码的算法,进而实现了对交易输出TxOut的压缩;
  • difficulty.go: 提供了与区块难度值相关的方法,如区块头里的难度值与整数值之间的转换、通过难度值计算工作量和难度值调整算法等等;
  • mediatime.go: 定义了MedianTimeSource接口和其实现mediaTime类型,通过计算各时间源的中位值来较正当前时间;
  • merkle.go: 实现了构建Merkle树的算法;
  • notifications.go: 定义了NotificationCallback和Notification,用于blockchain向其消费者回调NTBlockAccepted、NTBlockConnected及NTBlockDisconnected事件;
  • scriptval.go: 实现了验证交易中的锁定脚本与解锁脚本的机制,支持并发地验证多个脚本对;
  • thresholdstate.go: 定义了新的共识规则在向区块链网络部署时的状态,按部署或者节点接受的进度依次为Defined、Started、LockedIn、Active或者Failed等状态,更重要地,实现了状态转换的算法和计算区块上某个共识规则的部署状态的方法,我们将在后文中详细介绍;
  • timersort.go: 实现了支持升序排序的时间戳集合类型timeSorter,timeSorter实际上是[]int64类型,这是Go语言特性的一个体现,即通过类型定义可以扩展任意类型,包括基础类型的方法集;
  • versionbits.go: 实现了BIP9中描述的关于区块版本号的定义,区块版本号是一个32位且按小端模式存储的整型值,以001开头,其后的每一位代表一个BIP部署;同时也实现了根据部署状态确定下一个区块的目标版本号的方法,用于设定新“挖”出区块的版本号,我们将在后文中详细分析;
  • utxoviewpoint.go: 维护了UTXO的集合,包括有新的transaction时向集合中添加新的UTXO或者移除已经花费的UTXO;
  • doc.go: 包blockchain的doc说明;
  • error.go: 定义了与构建区块链相关的error类型及对应的字符串;
  • log.go: 提供了logger相关的方法;
  • xxx_test.go: 各文件对应的测试文件;

ProcessBlock()是btcd/blockchain中开始对block进行各种验证并接入区块链的核心方法,从上图中也可以看到,收到其它节点同步过来的区块或者矿工“挖”出块后均交由ProcessBlock处理。接下来,我们将从ProcessBlock()入手逐步分析区块处理的各个步骤,为了便于后续分析,我们看介绍BlockChain的定义:

//btcd/blockchain/chain.go

// BlockChain provides functions for working with the bitcoin block chain.
// It includes functionality such as rejecting duplicate blocks, ensuring blocks
// follow all rules, orphan handling, checkpoint handling, and best chain
// selection with reorganization.
type BlockChain struct {
    // The following fields are set when the instance is created and can't
    // be changed afterwards, so there is no need to protect them with a
    // separate mutex.
    checkpoints         []chaincfg.Checkpoint
    checkpointsByHeight map[int32]*chaincfg.Checkpoint
    db                  database.DB
    chainParams         *chaincfg.Params
    timeSource          MedianTimeSource
    notifications       NotificationCallback
    sigCache            *txscript.SigCache
    indexManager        IndexManager

    // The following fields are calculated based upon the provided chain
    // parameters.  They are also set when the instance is created and
    // can't be changed afterwards, so there is no need to protect them with
    // a separate mutex.
    //
    // minMemoryNodes is the minimum number of consecutive nodes needed
    // in memory in order to perform all necessary validation.  It is used
    // to determine when it's safe to prune nodes from memory without
    // causing constant dynamic reloading.  This is typically the same value
    // as blocksPerRetarget, but it is separated here for tweakability and
    // testability.
    minRetargetTimespan int64 // target timespan / adjustment factor
    maxRetargetTimespan int64 // target timespan * adjustment factor
    blocksPerRetarget   int32 // target timespan / target time per block
    minMemoryNodes      int32

    // chainLock protects concurrent access to the vast majority of the
    // fields in this struct below this point.
    chainLock sync.RWMutex

    // These fields are configuration parameters that can be toggled at
    // runtime.  They are protected by the chain lock.
    noVerify bool

    // These fields are related to the memory block index.  The best node
    // is protected by the chain lock and the index has its own locks.
    bestNode *blockNode
    index    *blockIndex

    // These fields are related to handling of orphan blocks.  They are
    // protected by a combination of the chain lock and the orphan lock.
    orphanLock   sync.RWMutex
    orphans      map[chainhash.Hash]*orphanBlock
    prevOrphans  map[chainhash.Hash][]*orphanBlock
    oldestOrphan *orphanBlock

    // These fields are related to checkpoint handling.  They are protected
    // by the chain lock.
    nextCheckpoint  *chaincfg.Checkpoint
    checkpointBlock *btcutil.Block

    // The state is used as a fairly efficient way to cache information
    // about the current best chain state that is returned to callers when
    // requested.  It operates on the principle of MVCC such that any time a
    // new block becomes the best block, the state pointer is replaced with
    // a new struct and the old state is left untouched.  In this way,
    // multiple callers can be pointing to different best chain states.
    // This is acceptable for most callers because the state is only being
    // queried at a specific point in time.
    //
    // In addition, some of the fields are stored in the database so the
    // chain state can be quickly reconstructed on load.
    stateLock     sync.RWMutex
    stateSnapshot *BestState

    // The following caches are used to efficiently keep track of the
    // current deployment threshold state of each rule change deployment.
    //
    // This information is stored in the database so it can be quickly
    // reconstructed on load.
    //
    // warningCaches caches the current deployment threshold state for blocks
    // in each of the **possible** deployments.  This is used in order to
    // detect when new unrecognized rule changes are being voted on and/or
    // have been activated such as will be the case when older versions of
    // the software are being used
    //
    // deploymentCaches caches the current deployment threshold state for
    // blocks in each of the actively defined deployments.
    warningCaches    []thresholdStateCache
    deploymentCaches []thresholdStateCache

    // The following fields are used to determine if certain warnings have
    // already been shown.
    //
    // unknownRulesWarned refers to warnings due to unknown rules being
    // activated.
    //
    // unknownVersionsWarned refers to warnings due to unknown versions
    // being mined.
    unknownRulesWarned    bool
    unknownVersionsWarned bool
}

其中各个字段的意义如下:

  • checkpoints: 链上预设的一些checkpoints,节点启动时指定,实际由链上的某些高度上的区块充当;
  • checkpointsByHeight: 记录了checkpoints和对应的区块的高度之间的映射关系,初始化时由checkpoints解析而来;
  • db: 存储区块的database对象;
  • chainParams: 与区块链相关的参数配置,如网络号、创世区块(Genesis Block)、难度调整周期、DNSSeeds等等;对这些参数进行定制即可以生成一个与Bitcoin类似的新的"代币";
  • timeSource: 用于较正节点时钟的“时钟源”,节点可以在与Peer进行version消息交换的时候把Peer添加到自己的“时钟源”中,这样节点通过计算与不同Peer节点的时钟差的中位值,来估计其与网络同步时间的差,从而在利用timestamp进行区块较验之前先较正自己的时钟,以防节点时钟未同步造成较验失败;
  • notifications: 向bockchain注册的回调事件接收者;
  • sigCache: 用于缓存公钥与签名验证的结果,对于已经通过解锁脚本和锁定脚本验证的交易,对应的公钥和签名会被缓存,后续相同的交易验证时可以直接从缓存中读到验证结果;
  • indexManager: 用于索引链上transaction或者block的indexer;
  • minRetargetTimespan: 难度调整的最小周期,公链上其值为3.5天;
  • maxRetargetTimespan: 难度调整的最大周期,公链上其值为56天;
  • blocksPerRetarget: 难度调整周期内的区块数,公链上难度调整周期是14天,对应的区块数是2016个;
  • minMemoryNodes: 该值暂未用到;
  • chainLock: 保护访问区块链的读写锁;
  • noVerify: 是否关闭脚本验证的开关;
  • bestNode: 指向主链上的“尾”节点,即高度最高的区块;
  • index: 指向一个区块索引器,用于索引实例化后内存中的各区块;
  • orphanLock: 保护“孤儿区块”相关对象的读写锁;
  • orphans: 记录“孤儿区块”与其Hash之间的映射;
  • prevOrphans: 记录“孤儿区块”与其父区块Hash之间的映射,当父区块写入区块链后,要检索prevOrphans将对应的区块从“孤儿区块池”从移除并写入区块链;
  • oldestOrphan: 处于“孤儿”状态时间最长的区块,“孤儿区块池”最多维护100个“孤儿区块”,当超过限制时,开始移除“最老”的“孤儿区块”;
  • stateLock: 保护stateSnapshot的读写锁;
  • stateSnapshot: 主链相关信息的快照;
  • warningCaches: 缓存区块对于所有可能的部署的thresholdState,用于当节点收到大量新的版本的区块,且对应的共识规则在新的区块里已经部署或者即将部署时,发出警告提示,为了兼容新版本的区块,可能需要升级btcd版本;
  • deploymentCaches: 缓存区块对于已知的部署提案的thresholdState;
  • unknownRulesWarned: 标识是否已经警告过未知共识规则已经部署或者将被部署;
  • unknownVersionsWarned: 标识是否已经警告过收到过多未知新版本的区块;当有新版本的区块时,往往有新的共识规则正在部署,所以警告未知新版本与未知共识规则部署是相关的;

在了解了BlockChain的定义后,我们开始从ProcessBlock()分析处理区块的各个环节:

//btcd/blockchain/process.go

// ProcessBlock is the main workhorse for handling insertion of new blocks into
// the block chain.  It includes functionality such as rejecting duplicate
// blocks, ensuring blocks follow all rules, orphan handling, and insertion into
// the block chain along with best chain selection and reorganization.
//
// When no errors occurred during processing, the first return value indicates
// whether or not the block is on the main chain and the second indicates
// whether or not the block is an orphan.
//
// This function is safe for concurrent access.
func (b *BlockChain) ProcessBlock(block *btcutil.Block, flags BehaviorFlags) (bool, bool, error) {
    b.chainLock.Lock()
    defer b.chainLock.Unlock()

    fastAdd := flags&BFFastAdd == BFFastAdd
    dryRun := flags&BFDryRun == BFDryRun

    blockHash := block.Hash()
    log.Tracef("Processing block %v", blockHash)

    // The block must not already exist in the main chain or side chains.
    exists, err := b.blockExists(blockHash)                                              (1)
    if err != nil {
        return false, false, err
    }
    if exists {
        str := fmt.Sprintf("already have block %v", blockHash)
        return false, false, ruleError(ErrDuplicateBlock, str)
    }

    // The block must not already exist as an orphan.
    if _, exists := b.orphans[*blockHash]; exists {                                      (2)
        str := fmt.Sprintf("already have block (orphan) %v", blockHash)
        return false, false, ruleError(ErrDuplicateBlock, str)
    }

    // Perform preliminary sanity checks on the block and its transactions.
    err = checkBlockSanity(block, b.chainParams.PowLimit, b.timeSource, flags)           (3)
    if err != nil {
        return false, false, err
    }

    // Find the previous checkpoint and perform some additional checks based
    // on the checkpoint.  This provides a few nice properties such as
    // preventing old side chain blocks before the last checkpoint,
    // rejecting easy to mine, but otherwise bogus, blocks that could be
    // used to eat memory, and ensuring expected (versus claimed) proof of
    // work requirements since the previous checkpoint are met.
    blockHeader := &block.MsgBlock().Header
    checkpointBlock, err := b.findPreviousCheckpoint()
    if err != nil {
        return false, false, err
    }
    if checkpointBlock != nil {
        // Ensure the block timestamp is after the checkpoint timestamp.
        checkpointHeader := &checkpointBlock.MsgBlock().Header
        checkpointTime := checkpointHeader.Timestamp
        if blockHeader.Timestamp.Before(checkpointTime) {                                (4)
            str := fmt.Sprintf("block %v has timestamp %v before "+
                "last checkpoint timestamp %v", blockHash,
                blockHeader.Timestamp, checkpointTime)
            return false, false, ruleError(ErrCheckpointTimeTooOld, str)
        }
        if !fastAdd {
            // Even though the checks prior to now have already ensured the
            // proof of work exceeds the claimed amount, the claimed amount
            // is a field in the block header which could be forged.  This
            // check ensures the proof of work is at least the minimum
            // expected based on elapsed time since the last checkpoint and
            // maximum adjustment allowed by the retarget rules.
            duration := blockHeader.Timestamp.Sub(checkpointTime)
            requiredTarget := CompactToBig(b.calcEasiestDifficulty(
                checkpointHeader.Bits, duration))
            currentTarget := CompactToBig(blockHeader.Bits)
            if currentTarget.Cmp(requiredTarget) > 0 {                                   (5)
                str := fmt.Sprintf("block target difficulty of %064x "+
                    "is too low when compared to the previous "+
                    "checkpoint", currentTarget)
                return false, false, ruleError(ErrDifficultyTooLow, str)
            }
        }
    }

    // Handle orphan blocks.
    prevHash := &blockHeader.PrevBlock
    prevHashExists, err := b.blockExists(prevHash)
    if err != nil {
        return false, false, err
    }
    if !prevHashExists {
        if !dryRun {
            log.Infof("Adding orphan block %v with parent %v",
                blockHash, prevHash)
            b.addOrphanBlock(block)                                                      (6)
        }

        return false, true, nil
    }

    // The block has passed all context independent checks and appears sane
    // enough to potentially accept it into the block chain.
    isMainChain, err := b.maybeAcceptBlock(block, flags)                                 (7)
    if err != nil {
        return false, false, err
    }

    // Don't process any orphans or log when the dry run flag is set.
    if !dryRun {
        // Accept any orphan blocks that depend on this block (they are
        // no longer orphans) and repeat for those accepted blocks until
        // there are no more.
        err := b.processOrphans(blockHash, flags)                                        (8)
        if err != nil {
            return false, false, err
        }

        log.Debugf("Accepted block %v", blockHash)
    }

    return isMainChain, false, nil
}

ProcessBlock()输入的是指向btcutil.Block类型的block,它对wire.MsgBlock进行了封装,可以看作是访问wire.MsgBlock的辅助类型,在btcd/blockchain中看到的block类型均是btcutil.Block类型,所以在解析代码之前,我们先看一下它的定义:

//btcd/vendor/github.com/btcsuite/btcutil/block.go

// Block defines a bitcoin block that provides easier and more efficient
// manipulation of raw blocks.  It also memoizes hashes for the block and its
// transactions on their first access so subsequent accesses don't have to
// repeat the relatively expensive hashing operations.
type Block struct {
    msgBlock        *wire.MsgBlock  // Underlying MsgBlock
    serializedBlock []byte          // Serialized bytes for the block
    blockHash       *chainhash.Hash // Cached block hash
    blockHeight     int32           // Height in the main block chain
    transactions    []*Tx           // Transactions
    txnsGenerated   bool            // ALL wrapped transactions generated
}

ProcessBlock()输出的第一个值表示区块是否加入了主链,第二值表示区块是否是“孤儿区块”。其主要执行步骤是:

  1. 首先检查区块是否已经在链上,如代码(1)处所示;
  2. 然后检查区块是否在“孤儿区块池”中,如代码(2)处所示;
  3. 代码(3)处调用checkBlockSanity()对区块结构进行检查,包括对区块头,如工作量和Merkle树根等等,和交易集合的检查;
  4. 通过区块结构检查后,根据最近的Checkpoint与区块之间的时间差,计算预期的最小工作量,如果区块的工作量低于预期的最小工作量则被拒绝,如代码(5)所示; 这是通过Checkpoint机制防止伪造工作量证明的过程,需要注意的是,区块头中表示目标难度的值越大,则表示工作量越小,反之,其值越小,则表示工作量越大;
  5. 通过Checkpoint检查后,接着检测区块的父区块是否在链上,如果不在链上,则将区块加入“孤儿区块池”,如代码(6)处所示;
  6. 如果父区块在链上,代码(7)处调用maybeAcceptBlock()对区块先进行上下文检查,如根据父区块计算期望工作量、期望的timestamp范围、区块高度是否正确等等,然后根据父区块是否在主链上决定是扩展主链还是侧链,并进一步对区块中的交易进行验证;交易验证通过后,最终将区块接入区块链,如果扩展的是侧链,还要比较侧链扩展后的工作量是否超过主链,如果超过,则将侧链变为主链,其中详细的过程将在后文中介绍;
  7. 区块加入区块链后,最后调用processOrphans()检测“孤儿区块池”中是否有“孤儿区块”的父区块是刚刚入链的区块,如果有,则将“孤儿区块”也加入区块链,并重复这一检查;

通过ProcessBlock()我们可以看到区块处理的几个阶段:

Btcd区块链的构建(一)_第2张图片

其中的checkBlockSanity、maybeAcceptBlock及processOrphans等过程中又有很复杂的步骤,它们到底作了哪些检查,如何保证区块链的一致性,我们将在下一篇文章《Btcd区块链的构建(二)》中展开介绍。

==大家可以关注我的微信公众号,后续文章将在公众号中同步更新:==
Btcd区块链的构建(一)_第3张图片

你可能感兴趣的:(Btcd区块链的构建(一))