Lab 2D is the final stage of Lab 2. It adds a snapshot mechanism, and it is worth finding out in advance when and how this lab takes snapshots, otherwise you will run into all kinds of problems. For example, during testing I found the leader mysteriously deadlocked, and the tests complained that lastApplied did not match the expected command index.
The code changes in this stage are fairly extensive, because they touch how rf.log is indexed and adjust the flow of log replication. Writing Lab 2D felt much like the earlier labs: run the tests, find a bug, patch it, repeat.
In addition, this lab has to pass all of the Lab 2A+2B+2C+2D tests. Along the way I found a few more bugs left over from earlier stages and fixed them too; in the end everything passes.
Lab 2D asks us to implement snapshots: persist the state a node has reached after executing its log, and delete all log entries that precede the snapshot. This keeps each node's log from eating too much memory, and it means a node that restarts does not have to re-execute a huge number of log commands from scratch; it only needs to load the snapshot and replay the entries that come after it.
go test -run 2 # runs all of the Lab 2 tests
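While iterating it can be handy to run only the snapshot tests; assuming the standard 6.824 test naming (the 2D test names contain "2D"), that is:
go test -run 2D # runs only the Lab 2D tests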
1. The handout suggests first adding the lastIncludedIndex variable, adapting the existing code to it, and re-running the Lab 2B and 2C tests.
2. Because part of the log is discarded after a snapshot, we must record the index of the last log entry absorbed into the snapshot, lastIncludedIndex, and use it to translate a log index into a position in the rf.log array.
3. When a follower's log is so far behind the leader's that some of the entries it is missing have already been folded into the leader's snapshot, the leader must send that follower an InstallSnapshot RPC.
4. In the paper, snapshots are sent in chunks via the offset mechanism, but this lab does not require that, so there is no need to implement offsets.
5. After taking a snapshot, every node must discard the log entries covered by it so their memory can be reclaimed; once discarded, those entries must no longer be reachable.
6. Even after a snapshot trims rf.log, the leader still has to send prevLogIndex and prevLogTerm in AppendEntries RPCs. Be careful: the snapshot may have emptied rf.log entirely, in which case those values come from lastIncludedIndex and lastIncludedTerm; for the same reason, lastIncludedIndex and lastIncludedTerm must also be persisted (see the sketch after this list).
7. When you are done, the code should not only pass the Lab 2D tests but pass all of the Lab 2 tests in one run.
8. The handout does not recommend implementing the CondInstallSnapshot function; implementing the InstallSnapshot RPC handler is enough.
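As a concrete illustration of hint 6, here is a minimal sketch that mirrors the logic used later in my Start function. The helper name fillPrevLog and the nextIndex parameter are hypothetical; args is an AppendEntriesArgs as in the earlier labs.
func (rf *Raft) fillPrevLog(args *AppendEntriesArgs, nextIndex int) {
	// nextIndex is the first entry the leader wants to send to this follower.
	if nextIndex-1 == rf.lastIncludedIndex && rf.lastIncludedIndex != 0 {
		// the entry just before nextIndex lives inside the snapshot
		args.PrevLogIndex = rf.lastIncludedIndex
		args.PrevLogTerm = rf.lastIncludedTerm
	} else {
		args.PrevLogIndex = rf.log[nextIndex-1-rf.lastIncludedIndex].Index
		args.PrevLogTerm = rf.log[nextIndex-1-rf.lastIncludedIndex].Term
	}
}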
Those are the hints from the lab handout. Below, at each piece of the implementation, I also note the situations I ran into and the points that need attention. The changes break down into the following parts:
Adjust the Raft struct
Adjust the persist function
Adjust the readPersist function
Implement the Snapshot function
Implement the InstallSnapshot RPC
Adjust the AppendEntries function
Adjust the Start function
Adjust the ticker function
Note: almost every place the code indexes into the rf.log array now needs rf.lastIncludedIndex. So, on top of the code from the earlier stages, the "-rf.lastIncludedIndex" offset has to be added in expressions like rf.log[j-rf.lastIncludedIndex] and rf.log[rf.lastApplied-rf.lastIncludedIndex].
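A tiny hedged helper makes the index-translation convention explicit; logAt is hypothetical (my actual code writes the subtraction inline, as shown later), and it assumes rf.log keeps a dummy entry at position 0.
// logAt maps a global log index n to its slot in the trimmed rf.log slice.
func (rf *Raft) logAt(n int) LogEntry {
	return rf.log[n-rf.lastIncludedIndex]
}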
Because the snapshot mechanism trims rf.log, we need to record two extra values, lastIncludedIndex and lastIncludedTerm: the index and term of the last log entry trimmed away by the snapshot. Every access to a given index in the rf.log array goes through lastIncludedIndex; for example, the entry with index n is rf.log[n-rf.lastIncludedIndex]. So both fields go straight into the Raft struct and are protected by the rf.mu mutex so the different goroutines can use them safely.
The code is as follows:
type Raft struct {
mu sync.Mutex // Lock to protect shared access to this peer's state
peers []*labrpc.ClientEnd // RPC end points of all peers
persister *Persister // Object to hold this peer's persisted state
me int // this peer's index into peers[]
dead int32 // set by Kill()
// Your data here (2A, 2B, 2C).
// Look at the paper's Figure 2 for a description of what
// state a Raft server must maintain.
peerNum int
// persistent state
currentTerm int
voteFor int
log []LogEntry
lastIncludedIndex int
lastIncludedTerm int
// volatile state
commitIndex int
lastApplied int
// lastHeardTime time.Time
state string
lastLogIndex int
lastLogTerm int
// send each committed command to applyCh
applyCh chan ApplyMsg
// Candidate synchronize election with condition variable
mesMutex sync.Mutex // used to lock variable opSelect
messageCond *sync.Cond
// opSelect == 1 -> start election, opSelect == -1 -> stay still, opSelect == 2 -> be a leader, opSelect == 3 -> election timeout
opSelect int
// special state for leader
nextIndex []int
matchIndex []int
}
lastIncludedIndex and lastIncludedTerm must be persisted, and the node's log is no longer complete; it is the log as trimmed by the snapshot mechanism.
The code is as follows:
func (rf *Raft) persist() {
// Your code here (2C).
// Example:
// w := new(bytes.Buffer)
// e := labgob.NewEncoder(w)
// e.Encode(rf.xxx)
// e.Encode(rf.yyy)
// data := w.Bytes()
// rf.persister.SaveRaftState(data)
w := new(bytes.Buffer)
e := labgob.NewEncoder(w)
e.Encode(rf.currentTerm)
e.Encode(rf.voteFor)
e.Encode(rf.log)
e.Encode(rf.lastIncludedIndex)
e.Encode(rf.lastIncludedTerm)
data := w.Bytes()
rf.persister.SaveRaftState(data)
// fmt.Printf("%v raft%d persists its state, term:%d, voteFor:%d, logLength:%d\n", time.Now(), rf.me, rf.currentTerm, rf.voteFor, len(rf.log))
}
Note:
Before the snapshot mechanism existed, a node that crashed and restarted set rf.commitIndex and rf.lastApplied to 0; since its log was complete and untrimmed, it recovered its state by re-executing the log commands as the leader pushed commitIndex forward again.
With snapshots, a restarting node's log has been trimmed, so it can start by loading the snapshot to recover its state, and from the persisted lastIncludedIndex and lastIncludedTerm it knows which entries were trimmed and which commitIndex and lastApplied the snapshotted state corresponds to. So if a snapshot exists, commitIndex and lastApplied after a restart should not be 0.
In addition, the snapshot may have trimmed the log down to nothing, in which case lastLogIndex and lastLogTerm cannot be read from rf.log; they should then equal lastIncludedIndex and lastIncludedTerm.
These are exactly the adjustments this function needs.
The code is as follows:
func (rf *Raft) readPersist(data []byte) {
if data == nil || len(data) < 1 { // bootstrap without any state?
return
}
r := bytes.NewBuffer(data)
d := labgob.NewDecoder(r)
var term int
var voteFor int
var log []LogEntry
var lastIncludedIndex int
var lastIncludedTerm int
if d.Decode(&term) != nil || d.Decode(&voteFor) != nil || d.Decode(&log) != nil || d.Decode(&lastIncludedIndex) != nil || d.Decode(&lastIncludedTerm) != nil {
// fmt.Printf("Error: raft%d readPersist.", rf.me)
} else {
rf.mu.Lock()
rf.currentTerm = term
rf.voteFor = voteFor
rf.log = log
rf.lastIncludedIndex = lastIncludedIndex
rf.lastIncludedTerm = lastIncludedTerm
rf.commitIndex = lastIncludedIndex
rf.lastApplied = lastIncludedIndex
var logLength = len(rf.log)
// fmt.Printf("%v raft%d readPersist, term:%d, voteFor:%d, logLength:%d, lastIncludedIndex:%d, lastIncludedTerm:%d\n", time.Now(), rf.me, rf.currentTerm, rf.voteFor, logLength, lastIncludedIndex, lastIncludedTerm)
if logLength == 1 {
rf.lastLogIndex = rf.lastIncludedIndex
rf.lastLogTerm = rf.lastIncludedTerm
} else {
rf.lastLogTerm = rf.log[logLength-1].Term
rf.lastLogIndex = rf.log[logLength-1].Index
}
rf.mu.Unlock()
}
}
This function is invoked on the node itself (in the tests, the test harness calls it) to create a new snapshot.
The flow is: check whether the requested snapshot is already out of date, and return immediately if it is; otherwise, using the index covered by the snapshot, trim the node's own log array. Creating the snapshot changes the state that must be persisted (rf.log, rf.lastIncludedIndex, rf.lastIncludedTerm), so the updated state has to be saved together with the snapshot; the handout recommends persister.SaveStateAndSnapshot for this.
Note: in 6.824 the tester takes a snapshot every 10 commands, so the node must not hold the rf.mu mutex while it sends committed commands to applyCh. If it does, the goroutine on the other end of applyCh (the tester's applier) may call Snapshot, which also needs rf.mu; Snapshot blocks waiting for the lock, that goroutine stops draining applyCh, and the pending send on applyCh never completes either. The node is deadlocked: Snapshot cannot acquire rf.mu, while the apply side is waiting for the snapshot path to finish before it can continue.
func (rf *Raft) Snapshot(index int, snapshot []byte) {
// Your code here (2D).
rf.mu.Lock()
if index <= rf.lastIncludedIndex {
rf.mu.Unlock()
return
}
// fmt.Printf("%v raft%d persists and creates a snapshot from %d to %d\n", time.Now(), rf.me, rf.lastIncludedIndex, index)
for cutIndex, val := range rf.log {
if val.Index == index {
rf.lastIncludedIndex = index
rf.lastIncludedTerm = val.Term
rf.log = rf.log[cutIndex+1:]
var tempLogArray []LogEntry = make([]LogEntry, 1)
// make sure the log array is valid starting with index=1
rf.log = append(tempLogArray, rf.log...)
}
}
w := new(bytes.Buffer)
e := labgob.NewEncoder(w)
e.Encode(rf.currentTerm)
e.Encode(rf.voteFor)
e.Encode(rf.log)
e.Encode(rf.lastIncludedIndex)
e.Encode(rf.lastIncludedTerm)
data := w.Bytes()
rf.persister.SaveStateAndSnapshot(data, snapshot)
rf.mu.Unlock()
}
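To make the deadlock concern above concrete, here is a minimal sketch of the apply pattern I use everywhere (the function name applyCommitted is hypothetical; the fields are the ones from the struct above): drop rf.mu before the channel send and re-acquire it afterwards.
func (rf *Raft) applyCommitted() {
	rf.mu.Lock()
	for rf.lastApplied < rf.commitIndex {
		rf.lastApplied++
		msg := ApplyMsg{
			CommandValid: true,
			Command:      rf.log[rf.lastApplied-rf.lastIncludedIndex].Command,
			CommandIndex: rf.log[rf.lastApplied-rf.lastIncludedIndex].Index,
		}
		rf.mu.Unlock() // release the lock so a concurrent Snapshot() can take it
		rf.applyCh <- msg
		rf.mu.Lock() // re-acquire before touching shared state again
	}
	rf.mu.Unlock()
}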
With snapshots in place, after the leader trims its log it can happen during log replication that a follower is missing entries the leader has already folded into its snapshot and discarded. The leader then calls this RPC to ship its snapshot to that follower.
Designing the InstallSnapshot RPC also means designing the structs used in the call.
From the paper we can directly derive the following two simple structs:
type InstallSnapshotArgs struct {
Term int
LeaderId int
LastIncludedIndex int
LastIncludedTerm int
Data []byte
}
type InstallSnapshotReply struct {
Term int
}
Note: you must check whether an incoming snapshot is stale, so that an old snapshot does not replace a newer local one and roll the data back.
Also note that there are two cases when a node receives a snapshot: 1. args.LastIncludedIndex is larger than the node's own lastLogIndex; the node simply deletes its entire log, updates its local state from args, and sets lastLogIndex to args.LastIncludedIndex and lastLogTerm to args.LastIncludedTerm. 2. args.LastIncludedIndex is smaller than the node's lastLogIndex; the node only trims away the entries up to and including LastIncludedIndex, and lastLogIndex is left unchanged.
The handout states that when a node receives an InstallSnapshot, it must wrap the snapshot in an ApplyMsg and push it onto applyCh. If you only update the node's local state and never deliver the snapshot through applyCh, the tests fail.
When inserting that ApplyMsg into applyCh, the earlier rule applies again: do not hold the rf.mu mutex during the send, or you can deadlock with the test harness calling Snapshot.
Those are the points to watch; the concrete implementation follows:
func (rf *Raft) InstallSnapshot(args *InstallSnapshotArgs, reply *InstallSnapshotReply) {
rf.mu.Lock()
reply.Term = rf.currentTerm
if reply.Term > args.Term {
rf.mu.Unlock()
return
}
if args.LastIncludedIndex > rf.lastIncludedIndex {
// fmt.Printf("%v raft%v install snapshot:%d to %d from leader%d \n", time.Now(), rf.me, rf.lastIncludedIndex, args.LastIncludedIndex, args.LeaderId)
rf.lastIncludedIndex = args.LastIncludedIndex
rf.lastIncludedTerm = args.LastIncludedTerm
if rf.lastLogIndex < args.LastIncludedIndex {
rf.lastLogIndex = args.LastIncludedIndex
rf.lastLogTerm = args.LastIncludedTerm
rf.log = rf.log[0:1]
} else {
for cutIndex, val := range rf.log {
if val.Index == args.LastIncludedIndex {
rf.log = rf.log[cutIndex+1:]
var tempLogArray []LogEntry = make([]LogEntry, 1)
// make sure the log array is valid starting with index=1
rf.log = append(tempLogArray, rf.log...)
}
}
}
if rf.lastApplied < rf.lastIncludedIndex {
rf.lastApplied = rf.lastIncludedIndex
}
if rf.commitIndex < rf.lastIncludedIndex {
rf.commitIndex = rf.lastIncludedIndex
}
w := new(bytes.Buffer)
e := labgob.NewEncoder(w)
e.Encode(rf.currentTerm)
e.Encode(rf.voteFor)
e.Encode(rf.log)
e.Encode(rf.lastIncludedIndex)
e.Encode(rf.lastIncludedTerm)
data := w.Bytes()
rf.persister.SaveStateAndSnapshot(data, args.Data)
var snapApplyMsg ApplyMsg
snapApplyMsg.SnapshotValid = true
snapApplyMsg.SnapshotIndex = args.LastIncludedIndex
snapApplyMsg.SnapshotTerm = args.LastIncludedTerm
snapApplyMsg.Snapshot = args.Data
// fmt.Printf("%v raft%d LII:%d, LIT:%d, LLI:%d, LLT:%d, LA:%d, CI:%d, len:%d\n", time.Now(), rf.me, rf.lastIncludedIndex, rf.lastIncludedTerm, rf.lastLogIndex, rf.lastLogTerm, rf.lastApplied, rf.commitIndex, len(rf.log))
rf.mu.Unlock()
rf.applyCh <- snapApplyMsg
} else {
// fmt.Printf("%v raft%v recv an out of date snapshot:%d to %d from leader%d \n", time.Now(), rf.me, rf.lastIncludedIndex, args.LastIncludedIndex, args.LeaderId)
rf.mu.Unlock()
}
}
This function is simply the interface the leader uses to invoke InstallSnapshot; it follows the same pattern as the earlier send* helpers.
func (rf *Raft) sendInstallSnapshot(server int, args *InstallSnapshotArgs, reply *InstallSnapshotReply) bool {
ok := rf.peers[server].Call("Raft.InstallSnapshot", args, reply)
return ok
}
When a follower searches the leader's args for the log-replication match point, logIndex starts at the position of the follower's last entry and is decremented (logIndex--) while comparing entries. If it walks all the way down to logIndex = 0, i.e. not even the first entry the node holds in memory matched, there are two cases: 1. the node has never taken a snapshot, so its log is untrimmed; then none of its entries are usable and replication must restart from scratch, so the follower reports the match point index=0, term=0 and the leader resends the log from the beginning. 2. the node has taken a snapshot; then it must also compare lastIncludedIndex and lastIncludedTerm against args.PrevLogIndex and args.PrevLogTerm. If they match, the entries after the match point can simply be copied into rf.log. If they do not match, the follower sends its lastIncludedTerm and lastIncludedIndex back to the leader; since everything in the snapshot has already been committed, the leader necessarily has those entries too, so this gives the leader an exact match point.
func (rf *Raft) AppendEntries(args *AppendEntriesArgs, reply *AppendEntriesReply) {
rf.mu.Lock()
defer rf.mu.Unlock()
reply.Term = rf.currentTerm
if args.Term < rf.currentTerm {
reply.Success = false
} else {
// fmt.Printf("raft%d receive ae from leader%d\n", rf.me, args.LeaderId)
if args.Entries == nil {
// if the args.Entries is empty, it means that the ae message is a heartbeat message.
if args.LeaderCommit > rf.commitIndex {
// fmt.Printf("%v raft%d update commitIndex from %d to %d\n", time.Now(), rf.me, rf.commitIndex, args.LeaderCommit)
rf.commitIndex = args.LeaderCommit
for rf.lastApplied < rf.commitIndex {
rf.lastApplied++
var applyMsg = ApplyMsg{}
applyMsg.Command = rf.log[rf.lastApplied-rf.lastIncludedIndex].Command
applyMsg.CommandIndex = rf.log[rf.lastApplied-rf.lastIncludedIndex].Index
applyMsg.CommandValid = true
// fmt.Printf("%v raft%d insert the msg%d into applyCh\n", time.Now(), rf.me, rf.lastApplied)
rf.mu.Unlock()
rf.applyCh <- applyMsg
// fmt.Printf("%v raft%d insert the msg%d into applyCh\n", time.Now(), rf.me, rf.lastApplied)
rf.mu.Lock()
}
}
reply.Success = true
} else {
// if the args.Entries is not empty, it means that we should update our entries to be aligned with leader's.
var match bool = false
if args.PrevLogTerm > rf.lastLogTerm {
reply.Term = rf.lastLogTerm
// fmt.Printf("%v 1 raft%d prevIndex:%d lastIndex:%d\n", time.Now(), rf.me, args.PrevLogIndex, rf.lastLogIndex)
reply.Success = false
} else if args.PrevLogTerm == rf.lastLogTerm {
if args.PrevLogIndex <= rf.lastLogIndex {
match = true
} else {
reply.Term = rf.lastLogTerm
reply.ConflictIndex = rf.lastLogIndex
// fmt.Printf("%v 2 raft%d prevIndex:%d lastIndex:%d\n", time.Now(), rf.me, args.PrevLogIndex, rf.lastLogIndex)
reply.Success = false
}
} else if args.PrevLogTerm < rf.lastLogTerm {
// ---------------key region--------------
var logIndex = len(rf.log) - 1
for logIndex >= 0 {
if rf.log[logIndex].Term > args.PrevLogTerm {
logIndex--
continue
}
if rf.log[logIndex].Term == args.PrevLogTerm {
reply.Term = args.PrevLogTerm
if rf.log[logIndex].Index >= args.PrevLogIndex {
match = true
} else {
// fmt.Printf("%v 3 raft%d prevIndex:%d lastIndex:%d\n", time.Now(), rf.me, args.PrevLogIndex, rf.lastLogIndex)
reply.ConflictIndex = rf.log[logIndex].Index
reply.Success = false
}
break
}
if logIndex == 0 && rf.lastIncludedIndex != 0 {
if rf.lastIncludedTerm == args.PrevLogTerm && rf.lastIncludedIndex == args.PrevLogIndex {
match = true
} else {
reply.Success = false
// fmt.Printf("%v 4 raft%d prevIndex:%d lastIndex:%d\n", time.Now(), rf.me, args.PrevLogIndex, rf.lastLogIndex)
reply.Term = rf.lastIncludedTerm
reply.ConflictIndex = rf.lastIncludedIndex
}
}
if rf.log[logIndex].Term < args.PrevLogTerm {
reply.Term = rf.log[logIndex].Term
// fmt.Printf("%v 5 raft%d prevIndex:%d lastIndex:%d\n", time.Now(), rf.me, args.PrevLogIndex, rf.lastLogIndex)
reply.Success = false
break
}
}
}
if match {
// Notice!!
// we need to consider a special case: a follower may receive an out-of-date log replication request, and it should do nothing in that case
// otherwise it would synchronize with the stale request and delete its latest entries
var length = len(args.Entries)
var index = args.PrevLogIndex + length
reply.Success = true
if index < rf.lastLogIndex {
// check if the ae is out-of-date
if index <= rf.lastIncludedIndex || args.Entries[length-1].Term == rf.log[index-rf.lastIncludedIndex].Term {
// fmt.Printf("%v raft%d receive a out-of-date ae and do nothing. prevLogIndex:%d, length:%d from leader%d\n", time.Now(), rf.me, args.PrevLogIndex, length, args.LeaderId)
return
}
}
// fmt.Printf("%v raft%d recv preIndex:%d,len:%d,leader:%d\n", time.Now(), rf.me, args.PrevLogIndex, length, args.LeaderId)
if args.PrevLogIndex+1 < rf.lastIncludedIndex {
for cutIndex, val := range args.Entries {
if val.Index == rf.lastIncludedIndex {
rf.log = make([]LogEntry, 1)
rf.log = append(rf.log, args.Entries[cutIndex+1:]...)
}
}
} else {
rf.log = rf.log[:args.PrevLogIndex+1-rf.lastIncludedIndex]
rf.log = append(rf.log, args.Entries...)
}
// fmt.Printf("%v raft%d log:%v\n", time.Now(), rf.me, rf.log)
var logLength = len(rf.log)
rf.lastLogIndex = rf.log[logLength-1].Index
rf.lastLogTerm = rf.log[logLength-1].Term
rf.persist()
}
}
if rf.currentTerm < args.Term {
// fmt.Printf("%v raft%d update term from %d to %d\n", time.Now(), rf.me, rf.currentTerm, args.Term)
}
rf.currentTerm = args.Term
rf.state = "follower"
rf.changeOpSelect(-1)
rf.messageCond.Broadcast()
}
}
The adjustments here are mainly for followers whose logs are so far behind that they need InstallSnapshot.
Note: Start contains a loop that keeps trying sendAppendEntries to a follower for log replication. Every iteration releases and re-acquires the lock, and after re-acquiring it the node's own state may have changed. In 2D the additional change we must consider is that a snapshot may have trimmed rf.log. Likewise, after sendAppendEntries returns and the lock is re-acquired, any operation on rf.log must first check whether rf.lastIncludedIndex has changed, otherwise the access into rf.log can go wrong.
So there is a case where, while the leader is probing a follower for the match point, the next loop iteration finds the leader has taken a snapshot and rf.log has been trimmed; nextIndex is now below rf.lastIncludedIndex, meaning the match point must lie inside the snapshot, and at that point the snapshot is sent directly.
In addition, after the leader sends a snapshot to a follower and the follower's state is updated, the leader's local bookkeeping for that follower must be updated as well: both rf.matchIndex and rf.nextIndex.
In the normal case, when the leader probes for the log-replication match point by decrementing nextIndex, finding nextIndex-1-rf.lastIncludedIndex == 0 with rf.lastIncludedIndex != 0 means the leader's first in-memory entry is not the match point either: the leader has snapshotted and trimmed rf.log, so it still has to check whether the last entry covered by the snapshot matches. If it does not, the leader first sends the snapshot to fix the badly lagging entries, and afterwards sends the log entries it still holds.
So whenever Start hits the edge case nextIndex-1-rf.lastIncludedIndex == 0, we have to account for the snapshot having trimmed rf.log.
Finally, wherever Start applies committed messages by inserting them into the applyCh channel, it must not hold the rf.mu mutex; the consequences were described above.
The implementation is as follows:
func (rf *Raft) Start(command interface{}) (int, int, bool) {
index := -1
term := -1
isLeader := true
// Your code here (2B).
_, isLeader = rf.GetState()
if !isLeader {
return index, term, isLeader
}
rf.mu.Lock()
// var length = len(rf.log)
// index = rf.log[length-1].Index + 1
rf.lastLogTerm = rf.currentTerm
rf.lastLogIndex = rf.nextIndex[rf.me]
index = rf.nextIndex[rf.me]
term = rf.lastLogTerm
var peerNum = rf.peerNum
var entry = LogEntry{Index: index, Term: term, Command: command}
rf.log = append(rf.log, entry)
// if command == 0 {
// fmt.Printf("%v leader%d send a command:%v to update followers' log, index:%d term:%d\n", time.Now(), rf.me, command, index, term)
// } else {
// fmt.Printf("%v leader%d receive a command:%v, index:%d term:%d\n", time.Now(), rf.me, command, index, term)
// }
// fmt.Printf("%v leader%d receive a command:%v, index:%d term:%d\n", time.Now(), rf.me, command, index, term)
rf.matchIndex[rf.me] = index
rf.nextIndex[rf.me] = index + 1
rf.persist()
// rf.mu.Unlock()
for i := 0; i < peerNum; i++ {
if i == rf.me {
continue
}
// rf.mu.Lock()
go func(id int, nextIndex int) {
var args = &AppendEntriesArgs{}
rf.mu.Lock()
if rf.currentTerm > term {
rf.mu.Unlock()
return
}
if rf.nextIndex[id] > nextIndex+1 {
// an out-of-date goroutine should not send the RPC, to save network bandwidth
rf.mu.Unlock()
return
}
args.Entries = make([]LogEntry, 0)
// if rf.nextIndex[id] < index {
// for j := rf.nextIndex[id] + 1; j <= index; j++ {
// args.Entries = append(args.Entries, rf.log[j])
// }
// }
if nextIndex < index {
for j := nextIndex + 1; j <= index; j++ {
args.Entries = append(args.Entries, rf.log[j-rf.lastIncludedIndex])
}
}
args.Term = term
args.LeaderId = rf.me
rf.mu.Unlock()
for {
var reply = &AppendEntriesReply{}
rf.mu.Lock()
if rf.currentTerm > term {
// fmt.Printf("%v raft%d is no longer leader and stop sending log to raft%d\n", time.Now(), rf.me, id)
rf.mu.Unlock()
return
}
if nextIndex <= rf.lastIncludedIndex {
if nextIndex != rf.lastIncludedIndex {
rf.mu.Unlock()
return
}
// fmt.Printf("%v leader%d send installsnapshot to raft%d 674\n", time.Now(), rf.me, id)
var snapArgs InstallSnapshotArgs
var snapReply InstallSnapshotReply
snapArgs.Term = rf.currentTerm
snapArgs.LastIncludedIndex = rf.lastIncludedIndex
snapArgs.LastIncludedTerm = rf.lastIncludedTerm
snapArgs.LeaderId = rf.me
snapArgs.Data = rf.persister.ReadSnapshot()
rf.mu.Unlock()
var count = 0
for {
if count == 3 {
return
}
if rf.sendInstallSnapshot(id, &snapArgs, &snapReply) {
break
}
count++
}
rf.mu.Lock()
if rf.currentTerm < snapReply.Term {
rf.currentTerm = snapReply.Term
rf.state = "follower"
rf.voteFor = -1
// fmt.Printf("%v raft%d sendInstallSnapshot finds a higher term, updates its term to %d\n", time.Now(), rf.me, snapReply.Term)
} else {
if rf.matchIndex[id] < snapArgs.LastIncludedIndex {
rf.matchIndex[id] = snapArgs.LastIncludedIndex
}
if rf.nextIndex[id] <= snapArgs.LastIncludedIndex {
rf.nextIndex[id] = snapArgs.LastIncludedIndex + 1
}
}
rf.mu.Unlock()
return
}
if nextIndex-1-rf.lastIncludedIndex == 0 && rf.lastIncludedIndex != 0 {
args.PrevLogIndex = rf.lastIncludedIndex
args.PrevLogTerm = rf.lastIncludedTerm
} else {
args.PrevLogIndex = rf.log[nextIndex-1-rf.lastIncludedIndex].Index
args.PrevLogTerm = rf.log[nextIndex-1-rf.lastIncludedIndex].Term
}
// fmt.Printf(" 679---nextIndex=%d, rf.lastIncludedIndex=%d\n", nextIndex, rf.lastIncludedIndex)
// args.PrevLogIndex = rf.log[nextIndex-1-rf.lastIncludedIndex].Index
// args.PrevLogTerm = rf.log[nextIndex-1-rf.lastIncludedIndex].Term
args.Entries = rf.log[nextIndex-rf.lastIncludedIndex : index+1-rf.lastIncludedIndex]
// args.Entries = append([]LogEntry{rf.log[nextIndex]}, args.Entries...)
// fmt.Printf("%v leader%d send log:%d-%d to raft%d\n", time.Now(), rf.me, nextIndex, index, id)
rf.mu.Unlock()
var count = 0
for {
if count == 3 {
return
}
// if sendAppendEntries fails, retry (give up after 3 attempts)
if rf.sendAppendEntries(id, args, reply) {
break
}
count++
}
rf.mu.Lock()
if reply.Term > args.Term {
// fmt.Printf("%v when sending log leader%d find a higher term, term:%d\n", time.Now(), rf.me, args.Term)
if reply.Term > rf.currentTerm {
rf.currentTerm = reply.Term
rf.state = "follower"
rf.voteFor = -1
// fmt.Printf("%v raft%d sendAppendEntreis finds a higher term, updates its term to %d\n", time.Now(), rf.me, reply.Term)
rf.mu.Unlock()
break
}
// fmt.Printf("%v goroutine (term:%d, raft%d send log to raft%d) is out of date. Stop the goroutine.\n", time.Now(), args.Term, rf.me, id)
rf.mu.Unlock()
break
}
var ifSendInstallSnapshot bool
if !reply.Success {
// fmt.Printf("%v fail nextIndex:%d prevIndex:%d prevTerm:%d reply.Term:%d\n", time.Now(), nextIndex, args.PrevLogIndex, args.PrevLogTerm, reply.Term)
if nextIndex <= rf.lastIncludedIndex {
ifSendInstallSnapshot = true
} else if rf.log[nextIndex-1-rf.lastIncludedIndex].Term > reply.Term {
for rf.log[nextIndex-1-rf.lastIncludedIndex].Term > reply.Term {
nextIndex--
if nextIndex-1-rf.lastIncludedIndex == 0 && rf.lastIncludedIndex != 0 {
if rf.lastIncludedTerm != reply.Term {
// fmt.Printf("%v leader%d send installsnapshot to raft%d 750\n", time.Now(), rf.me, id)
ifSendInstallSnapshot = true
}
}
}
if reply.ConflictIndex != 0 {
nextIndex = reply.ConflictIndex + 1
if nextIndex <= rf.lastIncludedIndex {
// fmt.Printf("%v leader%d send installsnapshot to raft%d 758\n", time.Now(), rf.me, id)
ifSendInstallSnapshot = true
}
}
} else {
if reply.ConflictIndex != 0 {
nextIndex = reply.ConflictIndex + 1
if nextIndex <= rf.lastIncludedIndex {
// fmt.Printf("%v leader%d send installsnapshot to raft%d 766\n", time.Now(), rf.me, id)
ifSendInstallSnapshot = true
}
} else {
nextIndex--
}
}
if ifSendInstallSnapshot {
var snapArgs InstallSnapshotArgs
var snapReply InstallSnapshotReply
snapArgs.Term = rf.currentTerm
snapArgs.LastIncludedIndex = rf.lastIncludedIndex
snapArgs.LastIncludedTerm = rf.lastIncludedTerm
snapArgs.LeaderId = rf.me
snapArgs.Data = rf.persister.ReadSnapshot()
rf.mu.Unlock()
var count = 0
for {
if count == 3 {
return
}
if rf.sendInstallSnapshot(id, &snapArgs, &snapReply) {
break
}
count++
}
rf.mu.Lock()
if rf.currentTerm < snapReply.Term {
rf.currentTerm = snapReply.Term
rf.state = "follower"
rf.voteFor = -1
// fmt.Printf("%v raft%d sendInstallSnapshot finds a higher term, updates its term to %d\n", time.Now(), rf.me, snapReply.Term)
} else {
if rf.matchIndex[id] < snapArgs.LastIncludedIndex {
rf.matchIndex[id] = snapArgs.LastIncludedIndex
}
if rf.nextIndex[id] <= snapArgs.LastIncludedIndex {
rf.nextIndex[id] = snapArgs.LastIncludedIndex + 1
}
if index > snapArgs.LastIncludedIndex+1 {
nextIndex = snapArgs.LastIncludedIndex + 1
} else {
rf.mu.Unlock()
return
}
}
}
// fmt.Printf("%v leader%d try sending nextIndex:%d log to follower%d\n", time.Now(), rf.me, nextIndex, id)
// nextIndex--
if nextIndex == 0 {
// fmt.Printf("Error:leader%d send log to raft%d, length:%d \n", rf.me, id, len(args.Entries))
rf.mu.Unlock()
break
}
rf.mu.Unlock()
} else {
if rf.matchIndex[id] < index {
// fmt.Printf("%v leader%d send log from %d to %d to raft%d\n", time.Now(), rf.me, nextIndex, index, id)
rf.matchIndex[id] = index
} else {
// fmt.Printf("%v leader%d send out of date log from %d to %d to raft%d\n", time.Now(), rf.me, nextIndex, index, id)
rf.mu.Unlock()
return
}
// we need to check whether a majority of the raft nodes have reached agreement.
var mp = make(map[int]int)
for _, val := range rf.matchIndex {
mp[val]++
}
var tempArray = make([]num2num, 0)
for k, v := range mp {
tempArray = append(tempArray, num2num{key: k, val: v})
}
// sort.Slice(tempArray, func(i, j int) bool {
// return tempArray[i].val > tempArray[j].val
// })
sort.Slice(tempArray, func(i, j int) bool {
return tempArray[i].key > tempArray[j].key
})
var voteAddNum = 0
for j := 0; j < len(tempArray); j++ {
if tempArray[j].val+voteAddNum >= (rf.peerNum/2)+1 {
if rf.commitIndex < tempArray[j].key {
// fmt.Printf("%v %d nodes have received msg%d, leader%d update commitIndex from %d to %d\n", time.Now(), tempArray[j].val+voteAddNum, tempArray[j].key, rf.me, rf.commitIndex, tempArray[j].key)
rf.commitIndex = tempArray[j].key
for rf.lastApplied < rf.commitIndex {
rf.lastApplied++
var applyMsg = ApplyMsg{}
applyMsg.Command = rf.log[rf.lastApplied-rf.lastIncludedIndex].Command
applyMsg.CommandIndex = rf.log[rf.lastApplied-rf.lastIncludedIndex].Index
applyMsg.CommandValid = true
// fmt.Printf("%v leader%d insert the msg%d into applyCh\n", time.Now(), rf.me, rf.lastApplied)
rf.mu.Unlock()
rf.applyCh <- applyMsg
rf.mu.Lock()
}
break
}
}
voteAddNum += tempArray[j].val
}
rf.mu.Unlock()
break
}
time.Sleep(10 * time.Millisecond)
}
}(i, rf.nextIndex[i])
// we update the nextIndex array at first time, even if the follower hasn't received the msg.
if index+1 > rf.nextIndex[i] {
rf.nextIndex[i] = index + 1
}
// rf.mu.Unlock()
}
rf.mu.Unlock()
return index, term, isLeader
}
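The commit-counting logic above sorts a small helper type, num2num, that comes from my earlier lab code. A minimal definition consistent with how it is used here would be:
// num2num pairs a matchIndex value (key) with how many peers report it (val).
type num2num struct {
	key int
	val int
}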
The ticker needs adjusting because my earlier code is not compatible with snapshots; the handling here is a bit specific to my implementation.
Previously, when the node became leader, nextIndex was initialized straight from the last entry of rf.log:
rf.nextIndex[i] = rf.log[length-1].Index + 1
With snapshots, rf.log may have been trimmed down to just the dummy entry, so this becomes:
if length == 1 && rf.lastIncludedIndex != 0 {
rf.nextIndex[i] = rf.lastIncludedIndex + 1
} else {
rf.nextIndex[i] = rf.log[length-1].Index + 1
}
I derive nextIndex from rf.log here, but the snapshot may have left rf.log with no real entries, so that case has to be handled. Nothing else in ticker changed.
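For context, a minimal sketch of where this sits, assuming it is the loop in ticker that runs right after the node wins an election (with length taken as len(rf.log)):
// Sketch only: re-initialize nextIndex for every peer after becoming leader,
// falling back to the snapshot boundary when rf.log holds only the dummy entry.
length := len(rf.log)
for i := 0; i < rf.peerNum; i++ {
	if length == 1 && rf.lastIncludedIndex != 0 {
		rf.nextIndex[i] = rf.lastIncludedIndex + 1
	} else {
		rf.nextIndex[i] = rf.log[length-1].Index + 1
	}
}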