Distributed Systems Lab3B: KVRaft snapshot

When should the InstallSnapshot RPC be sent?

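  • Extended Raft paper, bottom left of page 12: "when the leader has already discarded the next log entry that it needs to send to a follower." So the trigger is the moment the leader is about to send an AppendEntries RPC but finds that nextIndex[server] already falls inside its snapshot. In that case it sends InstallSnapshot instead of AppendEntries.
  • The snapshot "fast-forwards" the follower's state machine: it skips the intermediate steps and brings the follower in line with the leader in one step.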
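
The decision above can be sketched in a few lines of Go. This is only an illustration, not the lab's reference code: the `Leader` type, its fields, and the `rpcFor` helper are hypothetical names, and the sketch assumes `lastIncludedIndex` is the last log index covered by the leader's own snapshot.

```go
package main

// Leader is a hypothetical, stripped-down view of the leader's state.
type Leader struct {
	lastIncludedIndex int   // last log index covered by the leader's snapshot
	nextIndex         []int // next log index to send to each peer
}

// needsSnapshot reports whether the entry at nextIndex[server] has already
// been discarded into the snapshot, so it can no longer be sent in an
// AppendEntries RPC.
func (l *Leader) needsSnapshot(server int) bool {
	return l.nextIndex[server] <= l.lastIncludedIndex
}

// rpcFor returns the name of the RPC the leader should send to server.
func (l *Leader) rpcFor(server int) string {
	if l.needsSnapshot(server) {
		return "InstallSnapshot"
	}
	return "AppendEntries"
}
```

Note the boundary case: when nextIndex[server] is lastIncludedIndex+1, prevLogIndex equals lastIncludedIndex, and AppendEntries still works as long as the leader keeps lastIncludedTerm around to use as prevLogTerm.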

A follower receives an AppendEntries RPC, runs the consistency check, and finds it has already discarded the referenced log entry

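  • The entry at prevLogIndex is no longer in the follower's remaining log because it is at or below the follower's lastIncludedIndex, i.e. it has already gone into the follower's snapshot. How should this be handled? Treat it as a match!
  • The reason: anything that has gone into a snapshot has necessarily been committed and applied, and anything committed must also be present in the leader's log.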
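
A minimal sketch of that consistency check, under stated assumptions: the `Follower` type and `matches` method are hypothetical names, and `log` is assumed to hold only the entries after the snapshot, so that `log[i]` has index `lastIncludedIndex+1+i`.

```go
package main

// Entry is one Raft log entry (command omitted for brevity).
type Entry struct {
	Index int
	Term  int
}

// Follower is a hypothetical, stripped-down view of the follower's state.
type Follower struct {
	lastIncludedIndex int     // last index covered by the follower's snapshot
	lastIncludedTerm  int     // term of that entry
	log               []Entry // entries after the snapshot only
}

// matches implements the AppendEntries consistency check when part of the
// log has been compacted into a snapshot.
func (f *Follower) matches(prevLogIndex, prevLogTerm int) bool {
	switch {
	case prevLogIndex < f.lastIncludedIndex:
		// Already inside the snapshot: committed and applied, so it must
		// agree with the leader's log. Count it as a match.
		return true
	case prevLogIndex == f.lastIncludedIndex:
		// The boundary entry's term is still known, so compare it.
		return prevLogTerm == f.lastIncludedTerm
	}
	pos := prevLogIndex - f.lastIncludedIndex - 1
	if pos >= len(f.log) {
		return false // the follower's log is too short
	}
	return f.log[pos].Term == prevLogTerm
}
```

When the check passes this way, the handler presumably also has to skip any entries in the RPC whose index is at or below lastIncludedIndex before appending the rest.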

How to implement the InstallSnapshot RPC handler

Figure 13 is too vague; for the actual implementation, it is better to follow the instructor's lecture notes:
  • ignore if term is old (not the current leader)
  • ignore if follower already has last included index/term
    • it’s an old/delayed RPC
  • if not ignored:
    • empty the log, replace with fake “prev” entry
    • set lastApplied to lastIncludedIndex
    • replace service state (e.g. k/v table) with snapshot contents
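
The steps above could be sketched roughly as follows. This follows the lecture-note steps rather than Figure 13 literally, and it is not the lab's reference code: the type and field names are hypothetical, `log[0]` is assumed to be the fake "prev" entry sitting at the snapshot boundary, and raising `commitIndex` is an extra assumption that is not in the notes. The handler returns true when the snapshot was installed, in which case the caller would hand `Data` to the service (for example over applyCh).

```go
package main

// Entry is one Raft log entry (command omitted for brevity).
type Entry struct {
	Index int
	Term  int
}

// InstallSnapshotArgs mirrors the non-chunked fields of Figure 13.
type InstallSnapshotArgs struct {
	Term              int
	LastIncludedIndex int
	LastIncludedTerm  int
	Data              []byte
}

// Raft is a hypothetical, stripped-down follower state.
type Raft struct {
	currentTerm int
	commitIndex int
	lastApplied int
	log         []Entry // log[0] is the fake "prev" entry at the snapshot boundary
	snapshot    []byte
}

// alreadyHas reports whether the follower already holds the entry at
// (index, term), either in its own snapshot or in its log.
func (rf *Raft) alreadyHas(index, term int) bool {
	first := rf.log[0].Index
	if index <= first {
		return true // covered by the follower's own snapshot
	}
	pos := index - first
	return pos < len(rf.log) && rf.log[pos].Term == term
}

// InstallSnapshot returns true if the snapshot was installed.
func (rf *Raft) InstallSnapshot(args *InstallSnapshotArgs) bool {
	if args.Term < rf.currentTerm {
		return false // old term: sender is not the current leader
	}
	if args.Term > rf.currentTerm {
		rf.currentTerm = args.Term
	}
	if rf.alreadyHas(args.LastIncludedIndex, args.LastIncludedTerm) {
		return false // old or delayed RPC
	}
	// Empty the log and replace it with a fake "prev" entry.
	rf.log = []Entry{{Index: args.LastIncludedIndex, Term: args.LastIncludedTerm}}
	rf.snapshot = args.Data
	rf.lastApplied = args.LastIncludedIndex
	if rf.commitIndex < args.LastIncludedIndex {
		rf.commitIndex = args.LastIncludedIndex // assumption, not in the notes
	}
	return true
}
```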

Crash and restart

This explains why the snapshot must include the duplicate table:
  • service reads snapshot from disk
  • Raft reads persisted log from disk
  • Raft log may start before the snapshot (but definitely not after)
    • Raft will (re-)send committed entries on the applyCh
      • since applyIndex starts at zero after reboot
    • service will see repeats, must detect repeated index, ignore
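
To make this concrete, here is a minimal sketch of a service that snapshots its duplicate table together with the k/v data and ignores repeated indices after a restart. All names (`KVServer`, `Op`, `snapshotState`, `apply`, `restore`) are hypothetical, the standard `encoding/gob` package stands in for the course's own encoder, and client sequence numbers are assumed to start at 1.

```go
package main

import (
	"bytes"
	"encoding/gob"
)

// Op is a client write; Seq is assumed to start at 1 for each client.
type Op struct {
	ClientID int64
	Seq      int
	Key      string
	Value    string
}

// KVServer is a hypothetical, stripped-down k/v service.
type KVServer struct {
	kv          map[string]string
	lastSeq     map[int64]int // duplicate table: client -> highest applied Seq
	lastApplied int           // highest Raft index reflected in kv
}

// snapshotState is what gets written to the snapshot.
type snapshotState struct {
	KV          map[string]string
	LastSeq     map[int64]int
	LastApplied int
}

// apply executes op at the given Raft index and reports whether it changed
// the table. Replayed indices and duplicate client requests are ignored.
func (s *KVServer) apply(index int, op Op) bool {
	if index <= s.lastApplied {
		return false // Raft re-sent an entry the snapshot already covers
	}
	s.lastApplied = index
	if op.Seq <= s.lastSeq[op.ClientID] {
		return false // client retry of a request that was already executed
	}
	s.kv[op.Key] = op.Value
	s.lastSeq[op.ClientID] = op.Seq
	return true
}

// snapshot serializes the k/v table together with the duplicate table.
func (s *KVServer) snapshot() ([]byte, error) {
	var buf bytes.Buffer
	st := snapshotState{KV: s.kv, LastSeq: s.lastSeq, LastApplied: s.lastApplied}
	if err := gob.NewEncoder(&buf).Encode(st); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// restore rebuilds a server from snapshot bytes, as after a reboot.
func restore(data []byte) (*KVServer, error) {
	var st snapshotState
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&st); err != nil {
		return nil, err
	}
	if st.KV == nil { // gob omits empty maps, so re-create them
		st.KV = map[string]string{}
	}
	if st.LastSeq == nil {
		st.LastSeq = map[int64]int{}
	}
	return &KVServer{kv: st.KV, lastSeq: st.LastSeq, lastApplied: st.LastApplied}, nil
}
```

If `lastSeq` were left out of the snapshot, a restored server would have the k/v data but no memory of which client requests produced it, so a client retry arriving after the restart could be executed a second time.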

Note

  • After Raft restarts, lastApplied should be set to the snapshot's lastIncludedIndex!
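
A minimal sketch of that initialization, assuming the snapshot boundary is persisted alongside the rest of the Raft state. The `PersistedState` type and `restart` function are hypothetical names, and initializing `commitIndex` the same way is an additional assumption beyond the note above.

```go
package main

// PersistedState is a hypothetical view of what Raft reads back from disk.
type PersistedState struct {
	CurrentTerm       int
	LastIncludedIndex int
	LastIncludedTerm  int
}

// Raft is a hypothetical, stripped-down Raft peer.
type Raft struct {
	currentTerm       int
	commitIndex       int
	lastApplied       int
	lastIncludedIndex int
	lastIncludedTerm  int
}

// restart rebuilds the volatile state after a reboot. Entries up to
// LastIncludedIndex are no longer in the log and are already reflected in
// the service's snapshot, so applying resumes right after that index.
func restart(p PersistedState) *Raft {
	return &Raft{
		currentTerm:       p.CurrentTerm,
		lastIncludedIndex: p.LastIncludedIndex,
		lastIncludedTerm:  p.LastIncludedTerm,
		commitIndex:       p.LastIncludedIndex, // assumption: snapshotted entries are committed
		lastApplied:       p.LastIncludedIndex,
	}
}
```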

See also

https://pdos.csail.mit.edu/6.824/notes/l-spinnaker.txt
