Gossip Protocol

Is communication protocol used in a distributed system to distribute the information effectively. The main concept is that each node randomly choose a set of nodes to exchange the information.

Therefore better in synchronisation and communication.

Think of it like co-workers gossip with eachother

Use cases

  • Centralised state management
  • Fault detection

Example

For example in the case where we need to detect which node is down in our system.

The Gossip protocol, each node will:

  • Maintain a list of other nodes which contains their ID and heartbeats.
  • Periodically increment heartbeat counter of each node in its membership
    • Per gossip tick, it only probe 1 node $O(1)$
    • If 1 node heartbeat has not been incremented for a period of time, it's considered as offline.
    • Next, it send the whole membership list to another node

Example:

Pasted image 20230626232812.png

  1. s0 maintains a membership node as shown
  2. s0 probe s2 on its tick and notice that s2 (ID = 2) heartbeat has not incremented for a while
  3. s0 then send this whole membership list other node which include s2 heartbeat counter.
    • It keeps propagating the rumor that S2 is down to other node, until someone update its membership table and send back to s0.

Doing this, we go through $O(N)$ nodes with each node only do $O(1)$ probe.