General background:
Type of gossip:
Push gossip: A node that has new information initiates gossip message to some other random node: Message usually contains full state.Propagation fast when less than half of the nodes are infected
Pull gossip:A node that does not have new information initiate gossip and request information from some other random node: Request message contains digest only and some other node respond with requestor’s outdated value. Propagation fast when more than half of the nodes are infected
Push pull gossip: After pull gossip, requestor send back values for which it has later version.
Message size VS precision:
Precise reconciliation: Each data has its own version and part of the data is sent in gossip message. (Random chosen, Data with oldest version, Data with latest version)
ScuttlebuttReconciliation:Maintain a global version on each node and compare global version first. Fall back to precise reconciliation only if global version is different
Akka Gossip data structure: contains a set of members, a vector clock, and a gossip overview
Members:
Each member has the following possible status: Joining, Weaklyup, Up, Leaving, Down, Exiting, Removed and itsunique address, unique address is different from host/port address in that it has a unique uuid, so that a host/port joining the cluster twice can be distinguished
Members are changed when ClusetCoreDaemon received joining or down message or convergence happens
Vector clock:
Keep a treeMap of node to version mapping.
Vector clock comparison: can have 4 results: SAME, BEFORE, AFTER, CONCURRENT. If node is the same, compare version.
Vector clock merge: take the latestversion of each node
Vector clock version increment: whenever current node detects any cluster member change (join, leaving, downing, quarantined, converges,unreachable etc) and gossip is updated (ClusterCoreDaemon. updateLatestGossip),the vector clock will add current node, which makes current node’s vector clock version increase
Gossip overview:
Contains a “seen” set, which contains unique address that have seen the gossip message and replied
Akka ClusterCoreDaemon interacts with Gossip:
Choose gossip target:
Must be in the member list.
Not down or exiting
Reachable
If a node in the above set is not seen by the current node,it takes priority (probability of 0.8). Otherwise, just a random node isselected from the set
Probability of sending message to unseen node reduced graduallywith larger cluster size (>400) to avoid sending too many messages to asingle node
Gossip message is sent to a group of at fixed interval(gossip tick), to speed up convergence, if less than of the members have seenthe gossip message, the interval is reduced
Gossip message contains the from address, to address and aserialized gossip object includes seen set, vector clock etc
Process gossip message:
Received gossip messages are ignored, if
To address does not match current node.
The sender is notreachable from current node.
The sender is not a local member.
Remote gossip member does not contain current node.
Compare local and remote vector clock
If same: merge remote/current seen nodes. send response back if remote node has not seenthis node
If current vector clock is newer: take the current gossip, send response back
If remote vector clock is newer: take the remote gossip, send response back ifremote node has not seen this node
If conflicting: prune conflict of both current/remote gossipand merge them, send response back
Always add current node to the seen set of the latest gossip
Received gossip messages are dropped if they are enqueuedtoo long in mailbox to prevent overwhelming the current node.
Gossip convergence:
Conditions:
No members are unreachable exceptthose with exiting or down status
No members are not in the seen set
After convergence, only gossip status containing the version vector clock is sent. Unless there is a change to the member status (ClusterCoreDaemon.gossipStatusTo ClusterCoreDaemon.receiveGossipStatus),it falls back to normal gossip. (ClusterCoreDaemon.gossipTo ClusterCoreDaemon.receiveGossip)
Also, if a node is in the seen set, it will send gossip status only.
Reference:
http://blog.sina.com.cn/s/blog_912389e50100z0dt.html