General background:

Types of gossip:

  • Push gossip: A node that has new information initiates a gossip message to some other random node. The message usually contains the full state. Propagation is fast when fewer than half of the nodes are infected.

  • Pull gossip: A node that does not have new information initiates gossip and requests information from some other random node. The request message contains a digest only, and the other node responds with the values for which the requester is outdated. Propagation is fast when more than half of the nodes are infected.

  • Push-pull gossip: After a pull gossip, the requester sends back the values for which it has a later version.
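
The three exchange styles above can be sketched as follows. This is a minimal illustration, not Akka code: the State layout (key mapped to a value and a version) and the helper names newer_entries, digest_of and apply_updates are assumptions made for the example.

```python
from typing import Dict, Tuple

# Assumed layout: key -> (value, version); a higher version is newer.
State = Dict[str, Tuple[str, int]]


def digest_of(state: State) -> Dict[str, int]:
    """A digest carries only key -> version, not the values."""
    return {key: version for key, (_, version) in state.items()}


def newer_entries(source: State, digest: Dict[str, int]) -> State:
    """Entries of `source` that are newer than the versions listed in `digest`."""
    return {
        key: entry for key, entry in source.items()
        if entry[1] > digest.get(key, -1)
    }


def apply_updates(target: State, updates: State) -> None:
    """Keep whichever version of each key is newer."""
    for key, entry in updates.items():
        if key not in target or target[key][1] < entry[1]:
            target[key] = entry


def push(sender: State, receiver: State) -> None:
    # Push: the sender ships its full state unprompted.
    apply_updates(receiver, sender)


def pull(requester: State, responder: State) -> None:
    # Pull: the requester sends a digest; the responder returns what the requester lacks.
    apply_updates(requester, newer_entries(responder, digest_of(requester)))


def push_pull(requester: State, responder: State) -> None:
    # Push-pull: a pull, then the requester sends back what the responder lacks.
    responder_digest = digest_of(responder)
    pull(requester, responder)
    apply_updates(responder, newer_entries(requester, responder_digest))
```

After push_pull both sides hold the newest version of every key, whereas push and pull each update only one side.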


Message size vs. precision:

  • Precise reconciliation: Each piece of data has its own version, and only part of the data is sent in a gossip message. The part to send can be chosen at random, by oldest version, or by latest version.

  • Scuttlebutt reconciliation: Maintain a global version on each node and compare the global versions first. Fall back to precise reconciliation only if the global versions differ.
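
A rough sketch of the trade-off described above. It follows the simplified description in these notes rather than the full Scuttlebutt protocol; the function names, the limit parameter and the choice of the "latest version" policy are assumptions for illustration.

```python
from typing import Dict, Tuple

# Assumed layout: key -> (value, per-key version).
State = Dict[str, Tuple[str, int]]


def precise_delta(local: State, remote_digest: Dict[str, int], limit: int) -> State:
    """Precise reconciliation: send at most `limit` entries the peer lacks."""
    stale = [
        (key, entry) for key, entry in local.items()
        if entry[1] > remote_digest.get(key, -1)
    ]
    # "Latest version" policy: prefer the most recently changed entries.
    stale.sort(key=lambda item: item[1][1], reverse=True)
    return dict(stale[:limit])


def reconcile(local: State, local_global: int,
              remote_digest: Dict[str, int], remote_global: int,
              limit: int = 10) -> State:
    """Compare the global versions first; do per-key work only when they differ."""
    if local_global == remote_global:
        return {}  # same global version: nothing to send
    return precise_delta(local, remote_digest, limit)
```

A smaller limit keeps each gossip message small at the cost of needing more rounds to bring the peer fully up to date.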

Akka Gossip data structure: contains a set of members, a vector clock, and a gossip overview


Members:

Each member has a status and a unique address. The possible statuses are Joining, WeaklyUp, Up, Leaving, Down, Exiting, and Removed. A unique address differs from a host/port address in that it carries a unique uuid, so that a host/port that joins the cluster twice can be distinguished.

Members are changed when ClusterCoreDaemon receives a joining or down message, or when convergence happens.
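
As an illustration of the member model described above, here is a small sketch. The class and field names are hypothetical and are not Akka's own types; the unique id is modelled as a random UUID string, following the description in these notes.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class MemberStatus(Enum):
    JOINING = "Joining"
    WEAKLY_UP = "WeaklyUp"
    UP = "Up"
    LEAVING = "Leaving"
    DOWN = "Down"
    EXITING = "Exiting"
    REMOVED = "Removed"


@dataclass(frozen=True)
class UniqueAddress:
    host: str
    port: int
    # Assumption: a fresh random id per incarnation of the node.
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass(frozen=True)
class Member:
    unique_address: UniqueAddress
    status: MemberStatus


# The same host/port joining twice yields two distinguishable addresses.
first = UniqueAddress("10.0.0.1", 2552)
second = UniqueAddress("10.0.0.1", 2552)
same_host_port = (first.host, first.port) == (second.host, second.port)
distinguishable = first != second
```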


Vector clock:

Keeps a TreeMap of node-to-version mappings.

Vector clock comparison: has four possible results: SAME, BEFORE, AFTER, CONCURRENT. Entries for the same node are compared by version.

Vector clock merge: take the latest version of each node.

Vector clock version increment: whenever the current node detects any cluster member change (join, leaving, downing, quarantined, converges, unreachable, etc.) and the gossip is updated (ClusterCoreDaemon.updateLatestGossip), the vector clock adds the current node, which increases the current node's version in the vector clock.
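
The comparison, merge and increment rules above can be sketched as follows. This is an illustrative implementation, not Akka's VectorClock class; it uses a plain dict where the notes mention a TreeMap, and treats a missing node as version 0.

```python
from enum import Enum
from typing import Dict, Optional


class Ordering(Enum):
    SAME = "same"
    BEFORE = "before"
    AFTER = "after"
    CONCURRENT = "concurrent"


class VectorClock:
    def __init__(self, versions: Optional[Dict[str, int]] = None) -> None:
        self.versions: Dict[str, int] = dict(versions or {})

    def increment(self, node: str) -> "VectorClock":
        """Return a new clock with `node`'s version bumped by one."""
        bumped = dict(self.versions)
        bumped[node] = bumped.get(node, 0) + 1
        return VectorClock(bumped)

    def compare(self, other: "VectorClock") -> Ordering:
        """Compare node by node; a node missing from a clock counts as version 0."""
        nodes = set(self.versions) | set(other.versions)
        less = any(self.versions.get(n, 0) < other.versions.get(n, 0) for n in nodes)
        greater = any(self.versions.get(n, 0) > other.versions.get(n, 0) for n in nodes)
        if less and greater:
            return Ordering.CONCURRENT
        if less:
            return Ordering.BEFORE
        if greater:
            return Ordering.AFTER
        return Ordering.SAME

    def merge(self, other: "VectorClock") -> "VectorClock":
        """Take the latest version of each node."""
        nodes = set(self.versions) | set(other.versions)
        return VectorClock(
            {n: max(self.versions.get(n, 0), other.versions.get(n, 0)) for n in nodes}
        )
```

Two clocks that each advanced a different node from a common ancestor compare as CONCURRENT, which is the case that later requires a merge.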

Gossip overview:

Contains a “seen” set, which holds the unique addresses of the nodes that have seen the gossip message and replied.

 

 

Akka ClusterCoreDaemon interacts with Gossip:

Choose gossip target: 

  • Must be in the member list.

  • Not down or exiting

  • Reachable

If a node in the above set is not in the seen set of the current node's gossip, it takes priority (probability of 0.8). Otherwise, a random node is selected from the set.

The probability of sending a message to an unseen node is reduced gradually for larger cluster sizes (>400) to avoid sending too many messages to a single node.
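
A sketch of the target selection described above. The function names are hypothetical, excluding the current node from the candidates is an assumption, and the linear decay (down to one tenth of the base probability at three times the threshold) is an illustrative guess at what "reduced gradually" could look like, not a statement of Akka's exact formula.

```python
import random
from typing import Dict, List, Optional, Set


def eligible_targets(members: Dict[str, str], unreachable: Set[str],
                     self_node: str) -> List[str]:
    """Members that are not Down or Exiting and are reachable.

    Assumption: the current node never gossips to itself.
    """
    return sorted(
        node for node, status in members.items()
        if node != self_node
        and status not in ("Down", "Exiting")
        and node not in unreachable
    )


def different_view_probability(cluster_size: int, base: float = 0.8,
                               threshold: int = 400) -> float:
    """Probability of preferring a node that has not seen the gossip."""
    if cluster_size <= threshold:
        return base
    # Assumed shape: linear decay to base / 10 at 3 * threshold nodes.
    high = threshold * 3
    minimum = base / 10
    if cluster_size >= high:
        return minimum
    fraction = (cluster_size - threshold) / (high - threshold)
    return base - (base - minimum) * fraction


def select_target(candidates: List[str], seen: Set[str], cluster_size: int,
                  rng: random.Random) -> Optional[str]:
    """Prefer an unseen node with the adjusted probability, else pick any candidate."""
    if not candidates:
        return None
    unseen = [node for node in candidates if node not in seen]
    if unseen and rng.random() < different_view_probability(cluster_size):
        return rng.choice(unseen)
    return rng.choice(candidates)
```

Passing the random generator in explicitly keeps the selection reproducible in tests.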

 

Gossip messages are sent to a group of nodes at a fixed interval (the gossip tick). To speed up convergence, the interval is reduced while fewer than a threshold fraction of the members have seen the gossip message.

A gossip message contains the from address, the to address, and a serialized gossip object that includes the seen set, the vector clock, etc.

 

Process gossip message:

Received gossip messages are ignored if:

  • The to address does not match the current node.

  • The sender is not reachable from the current node.

  • The sender is not a local member.

  • The member list of the remote gossip does not contain the current node.
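
The checks above could be expressed as a single guard function. The Envelope type, its fields and the returned reason strings are assumptions for illustration and do not mirror Akka's message classes.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Envelope:
    sender: str               # from address
    to: str                   # to address
    gossip_members: Set[str]  # members listed in the remote gossip


def ignore_reason(envelope: Envelope, self_node: str,
                  local_members: Set[str], unreachable: Set[str]) -> Optional[str]:
    """Return why the message should be ignored, or None if it should be processed."""
    if envelope.to != self_node:
        return "to address does not match current node"
    if envelope.sender in unreachable:
        return "sender is not reachable"
    if envelope.sender not in local_members:
        return "sender is not a local member"
    if self_node not in envelope.gossip_members:
        return "remote gossip does not contain current node"
    return None
```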


Compare the local and remote vector clocks:

  • If same: merge the remote and current seen sets. Send a response back if the remote node has not seen this node.

  • If the current vector clock is newer: take the current gossip and send a response back.

  • If the remote vector clock is newer: take the remote gossip. Send a response back if the remote node has not seen this node.

  • If conflicting: prune the conflicts of both the current and remote gossip, merge them, and send a response back.

Always add the current node to the seen set of the latest gossip.
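
The four cases above can be sketched as one decision function. This is a simplified illustration that tracks only the vector clock and the seen set; member pruning is left out, and starting the merged gossip with an empty seen set in the conflicting case is an assumption. All names are hypothetical.

```python
from typing import Dict, Set, Tuple

Clock = Dict[str, int]


def compare(local: Clock, remote: Clock) -> str:
    """Return 'same', 'before', 'after' or 'concurrent' for local relative to remote."""
    nodes = set(local) | set(remote)
    less = any(local.get(n, 0) < remote.get(n, 0) for n in nodes)
    greater = any(local.get(n, 0) > remote.get(n, 0) for n in nodes)
    if less and greater:
        return "concurrent"
    if less:
        return "before"
    if greater:
        return "after"
    return "same"


def receive_gossip(self_node: str, local_clock: Clock, local_seen: Set[str],
                   remote_clock: Clock, remote_seen: Set[str]
                   ) -> Tuple[Clock, Set[str], bool]:
    """Return (winning clock, new seen set, whether to send a response back)."""
    order = compare(local_clock, remote_clock)
    remote_has_seen_us = self_node in remote_seen
    if order == "same":
        clock, seen, reply = dict(local_clock), local_seen | remote_seen, not remote_has_seen_us
    elif order == "after":    # the current gossip is newer
        clock, seen, reply = dict(local_clock), set(local_seen), True
    elif order == "before":   # the remote gossip is newer
        clock, seen, reply = dict(remote_clock), set(remote_seen), not remote_has_seen_us
    else:                     # conflicting: merge the clocks
        nodes = set(local_clock) | set(remote_clock)
        clock = {n: max(local_clock.get(n, 0), remote_clock.get(n, 0)) for n in nodes}
        seen, reply = set(), True  # assumption: merged gossip starts unseen
    # Always add the current node to the seen set of the latest gossip.
    return clock, seen | {self_node}, reply
```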

Received gossip messages are dropped if they have been enqueued too long in the mailbox, to prevent overwhelming the current node.

 

Gossip convergence:

Conditions:

  • No members are unreachable, except those with Exiting or Down status.

  • All members are in the seen set.
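
The two conditions above translate into a short predicate. This is a sketch under the conditions as stated in these notes; the function name and the representation of members as a node-to-status dict are assumptions.

```python
from typing import Dict, Set


def has_converged(members: Dict[str, str], unreachable: Set[str],
                  seen: Set[str]) -> bool:
    """True when no relevant member is unreachable and every member has seen the gossip."""
    # Unreachable nodes block convergence unless they are Exiting or Down.
    blockers = [
        node for node in unreachable
        if members.get(node) not in ("Exiting", "Down")
    ]
    all_seen = all(node in seen for node in members)
    return not blockers and all_seen
```

For example, a cluster with one unreachable Down member can still converge, while one unreachable Up member prevents it.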

 

After convergence, only a gossip status containing the vector clock version is sent (ClusterCoreDaemon.gossipStatusTo, ClusterCoreDaemon.receiveGossipStatus). When there is a change to the member status, it falls back to normal gossip (ClusterCoreDaemon.gossipTo, ClusterCoreDaemon.receiveGossip).

Also, if the target node is already in the seen set, only a gossip status is sent to it.

 

 

Reference:

http://blog.sina.com.cn/s/blog_912389e50100z0dt.html