# 1.5 Classical Distributed Algorithms

## Lesson 1: Snapshots

### What is Global Snapshot?

* In the cloud: Each application or service is running on multiple servers which handle concurrent events and interact with each other. Thus, the ability to obtain a "global photograph" of the system is important
* **Global snapshot = global state = individual state of each process/channel in the distributed system**
* First solution: Synchronize clocks of all processes
  * Ask all processes to record their states at know time t
  * Problems
    * Time synchronization always has error
    * Does not record the state of messages
  * Causality is enough!

### Global Snapshot Algorithm

* System model:
  * N processes in the system
  * Two uni-directional communication channels between each ordered process pair
  * Channels are FIFO
  * No failures
  * All messages arrive intact and are not duplicated
* Requirements
  * Snapshot should not interfere with/block normal application actions
  * Each process is able to record its own state
  * The global state is collected in a distributed manner
  * Any process may initiate the snapshot
* Chandy-Lamport Global Snapshot Algorithm
  * First, Initiator Pi records its own state
    * The initiator process creates special messages called "Marker" messages
    * For all other processes j, Pi sends out a Marker message on outgoing channel Cij (N-1 channels in total)
    * Starts recording the incoming messages on each of the incoming channels at Pi: Cji for j = 1 to N excluding i
  * Whenever a process Pi receives a Marker message on an incoming channel Cki
    * If this is the first Marker Pi is seeing
      * Pi records its own state first
      * Marks the state of channel Cki as "empty"
      * For j = 1 to N except i, Pi sends out a Marker message on outgoing channel Cij
      * Starts recording incoming messages on each of the incoming channels at Pi: Cji for j = 1 to N except i and k
    * Else (if it has already seen a Marker message):
      * Mark the state of channel Cki as all the messages that have arrived on it since recording was turned on for Cki
  * The algorithm terminates when
    * All processes have received a Marker (to record their own state)
    * All processes have received a Marker on all the N-1 incoming channels (to make sure each process has its state recorded)
  * Then, optionally, a central server collects all these partial state pieces to obtain the full global snapshot

### **Consistent Cuts**

* Cut = time frontier at each process and at each channel
* Events at the process/channel that happen before the cut are "in the cut"
* Consistent cut: A cut that obeys causality
  * A cut is consistent iff for each pair of events (e, f) in the system s.t. event e is in the cut C, and if f -> e, then f is also in the cut C

![](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MgUNfJKyHgsAEQ4H3Pg%2F-MgUOidJ7YBpdCYyNHhw%2FScreen%20Shot%202021-08-07%20at%202.57.18%20PM.png?alt=media\&token=fbd0727d-da3d-4345-8a11-7ab9d468a10b)

* Any run of the Chandy-Lamport Global Snapshot algorithm creates a consistent cut
  * Proof by contradiction

### Safety and Liveness

* Liveness: Guarantee that something good will happen eventually
* Safety: Guarantee that something bad will never happen
* Can be difficult to satisfy both in an asynchronous distributed system
  * Failure detector: Completeness/liveness & accuracy/safety cannot both be guaranteed in an asynchronous distributed system
  * Consensus: Decisions/liveness and correct decisions/safety cannot both be guaranteed by any consensus protocol in an asynchronous distributed system
* Liveness w\.r.t. a property Pr in a given state S means
  * S satisfies Pr, or there is some causal path of global states from S to S' where S' satisfies Pr
* Safety w\.r.t. a property Pr in a given state S means
  * S satisfies Pr, and all global states S' reachable from S also satisfy Pr
* Chandy-Lamport algorithm can be used to detect global properties that are stable (once true, stays true forever afterwards)

## Lesson 2: Multicast

### Multicast Ordering

* Different communication forms
  * Multicast: Message sent to a group of processes
  * Broadcast: Message sent to all processes anywhere
  * Unicast: Message sent from one sender process to one receiver process
* FIFO Ordering
  * Multicasts from each sender are received in the order they are sent, at all receivers
  * Doesn't care about multicasts from different senders
* Casual Ordering
  * Multicasts whose send events are causally related must be received in the same causality-obeying order at all receivers
  * Concurrent multicasts are ok to be received in different orders at different receivers
  * Casual Ordering -> FIFO Ordering (the reverse is not true)
* Total Ordering/Atomic Broadcast
  * Ensures all receivers receive all multicasts in the same order
  * Doesn't care about the order of multicast sending
  * May need to delay delivery of some messages at sender

### Implementing Ordering

* FIFO Ordering
  * Each receiver Pi maintains a per-sender sequence number Pi\[1...N], initially all zeros
  * Pi\[j] is the latest sequence number Pi has received from Pj
  * Send multicast from Pj:
    * Pj\[j] += 1
    * Include new Pj\[j] in the multicast message
  * Receive multicast (Pi receives from Pj with sequence number S in the message):
    * If S == Pi\[j] + 1, then
      * Deliver message to application
      * Pi\[j] += 1
    * Else
      * Buffer this multicast until the above condition is true

![FIFO ordering example 1 (from the final exam)](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MgdrEvsSG7lOQda3bSA%2F-Mgdx6U2_RJ3x9E2WHz2%2Ffifo1.jpeg?alt=media\&token=5c728825-609a-410d-a5bc-219c3457a583)

![FIFO ordering example 2 (from the final exam)](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MgdrEvsSG7lOQda3bSA%2F-MgdxL_Daz-505IhuZIN%2Ffifo2.jpeg?alt=media\&token=8f2121ee-477f-4897-aae7-2901b9e6a875)

![FIFO ordering example 3 (from the final exam)](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MgdrEvsSG7lOQda3bSA%2F-MgdxbD_qtxNPFirYJq9%2Ffifo3.jpeg?alt=media\&token=3754a6b3-6dc2-4695-b8d0-ce97c9497779)

* Casual Ordering
  * Each receiver Pi maintains a per-sender sequence number Pi\[1...N], initially all zeros
  * Send multicast from Pj:
    * Pj\[j] += 1
    * Include new **entire vector** Pj\[1...N] in the multicast message
  * Receive multicast (Pi receives from Pj with vector M\[1...N], buffer it until both:)
    * This message is the next one Pi is expecting from Pj, i.e. M\[j] = Pi\[j] + 1
    * All multicasts, anywhere in the group, which happened-before M, have been received at Pi, i.e.
      * For all k != j, M\[k] <= Pi\[k] (Receiver satisfies causality)
    * When the above two conditions are met, deliver M to application and set Pi\[j] = M\[j]

![Casual ordering example 1 (from the final exam)](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MgdrEvsSG7lOQda3bSA%2F-Mgdxsm1w01sA-uxpR2l%2Fcasual1.jpeg?alt=media\&token=f2d5e9b5-2a9a-4893-a681-a857a7362ded)

![Casual ordering example 2 (from the final exam)](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MgdrEvsSG7lOQda3bSA%2F-Mgdy17fKV4YCErzEIkV%2Fcasual2.jpeg?alt=media\&token=7537421f-c202-4472-83f3-65b54aee7c03)

![Casual ordering example 3 (from the final exam)](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MgdrEvsSG7lOQda3bSA%2F-MgdyDuIgVRm9e1dn3I_%2Fcasual3.jpeg?alt=media\&token=15c56a77-5c3e-4335-9492-5c799c38d764)

* Total Ordering: Sequencer-based approach

![](https://1313833672-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MMTslgmrrtRXvxD2lk9%2F-MgUyAEMBI6kPuo94hcT%2F-MgV83P5UsyVHgUkrgx-%2FScreen%20Shot%202021-08-07%20at%206.24.10%20PM.png?alt=media\&token=dad325f2-ed7c-4a03-8003-55ec40f9e909)

### Reliable Multicast

* Reliable multicast loosely says that every process in the group receives all multicasts
  * Reliability is orthogonal to ordering
* When it comes to process failures, the definition becomes vague
* Definition: Need all correct/non-faulty processes to receive the same set of multicasts as all other correct processes

### Virtual Synchrony

* Each process maintains a membership list, called a View
* An update to this membership list is called a View Change
* Virtual synchrony guarantees that all view changes are delivered in the same order at all correct processes
  * A multicast M is said to be "delivered in a view V at process Pi" iff Pi receives view V, and then some time before Pi receives the next view, it delivers multicast M
* Views may be delivered at different physical times at processes, but they are delivered in the same order
* Virtual synchrony ensures that
  * The set of multicasts delivered in a given view is the same set at all correct processes that were in that view
    * **What happens in a view, stays in that view**
  * The sender of the multicast message also belongs to that view
  * If a process Pi does not deliver a multicast M in view V while other processes in the view V delivered M in V, the Pi will be forcibly removed from the next view delivered after V at the other processes
* TODO: Add some examples

## Lesson 3: Paxos

* TODO
