Syllabus   Blank Homework  
Notes   Labs   Scores   Blank

Lecture Notes
Dr. Tong Lai Yu, March 2010
    0. Review and Overview
    1. An Introduction to Distributed Systems
    2. Deadlocks
    3. Distributed Systems Architecture
    4. Processes
    5. Communication
    6. Distributed OS Theories
        7. Distributed Mutual Exclusions
    8. Agreement Protocols
    9. Distributed Scheduling
    10. Distributed Resource Management
    11. Recovery and Fault Tolerance
    12. Security and Protection
    Life well spent is long.
    						Leonardo da Vinci
    Agreement Protocols
    1. Failure Masking by Redundancy

      Triple Modular Redundancy

      Protection against process failures can be achieved by replicating
      processes into groups.

      Design issues

      a) Communication in a flat group. b) Communication in a simple hierarchical group.

    2. The Byzantine Generals Problem
      reliable computer systems must be able to cope with the failure of one or more of its components

      failed component --> conflicting information

      several divisions of an army ( e.g. a coalition of forces ) are camped outside an enemy city

      Generals can communicate with one another by oral messengers only

      decide upon a plan of action
      some generals may be traitors, trying to prevent loyal generals from reaching agreement

      • All loyal generals decide upon the same plan of action
      • A small number of traitors cannot cause the loyal generals to adopt a bad plan
        v(i) = information communicated by General i,    i = 1, 2, ..., n

      1. all loyal Generals agree on same value ( order )
      2. If Commander is loyal, then every loyal General obeys the order he sends

      Impossibility Results
      Consider n = 3, one is a traitor

      For 2 to be satisfied, General 1 must "attack"

      This cannot be distinguished from the previous case. Hence whenever General 1 receives an "attack" order from the commander, he must attack.
      In a similar argument, General 2 must "retreat".
      This violates 1.
      Hence, no solution exists for 3 generals that works in the presence of a single traitor.

      Lamport-Shostak-Pease Algorithm
      Solution exists only if

      # of traitors < 1/3 of total # of generals i.e. 3m + 1 generals can cope with at most m traitors

      Also known as Oral Messenge Algorithm OM(m)    m ≥ 0

      n generals commander sends messages to n - 1 generals

      Solution for n ≥ 3m + 1    m traitors

      majority ( 1, 1, 0 ) = 1
      majority ( v1, ..., vi, .. vn-1 ) = vi    ( majority value )
       = RETREAT    if no such value exists
      e.g. majority ( 1, 2, 3 ) = RETREAT

      Algorithm OM(0)    ( i.e. m = 0 )

      1. The commander sends his value to every general.
      2. Each general uses the value from the commander, or uses the value RETREAT if he receives no value.

      Algorithm OM(m),   , m > 0

      1. The commander sends his value to every general.
      2. For each i, let vi = value General i receives from commander or else RETREAT ( if receives no value ) General i acts as the commander in Algorithm OM(m-1) to send value vi to other n - 2 generals.
      3. For each i, and each j ≠ i, let vj = value General i received from General j in step 2 or else RETREAT
        General i uses value of majority( v1, ..., vn-1 )
      Example: Consier m = 1 and n = 4

      Nonfaulty Commander


      General 1: majority ( v, v, y ) = v

      General 2: majority ( v, v, x ) = v

      Commander, General 1, General 2 all agree on same value

    3. General 1: majority ( x, y, z )
    4. General 2: majority ( y, x, z )
    5. General 3: majority ( z, x, y )
    6. all agree on same value

      Faulty Commander



      faulty source processor

      faulty processor

      ( n - 2 ) rounds,
      message complexity: O( nm )

      Dolev et al.'s algorithm

      message complexity ~ polynomial time
      but requires 2m + 3 rounds

    7. Consensus Problem

        every processor broadcasts its initial value to all other processors

      1. All nonfaulty processors agree on the same single value
      2. If the initial value of every nonfaulty processor is v, then all nonfaulty processors must agree on value v

    8. Interactive Consistency Problem

        every processor broadcasts its initial value to all other processors
      1. All nonfaulty processors agree on the same vector ( v1, ..., vn )
      2. If the i-th processor is nonfaulty and its initial value is vi, then i-th value to be agreed on by all nonfaulty processors must be vi

      All are related

      Byzantine agreement problem is a special case of interactive consistency problem, in which the initial value of only one processor ( source ) is interested
      Conversely, if each of the n processors runs a copy of the Byzantine agreement protocol, the interactive consistency problem is solved.

      Consensus can be solved using consistency.
      They can be derived from Byzantine agreement problem.

    9. Extension to Case n > 3m + 1

      We have considered the case n = 3m + 1 and found the solution.

      When n > 3m + 1, we designate 3m + 1 processors as active processors and the rest passive processors

      The active processors first make an agreement then send results to passive processors

      The passive processors can use majority rule to find the value

    10. Signed Messages


      Generals: unforgeable signed messages

      1. A loyal general's signature cannot be forged, and any alteration of the contents of his signed messages can be detected
      2. Anyone can verify the authenticity of a general's signature
      now a three-general solution does exist

      any m, need m + 2 generals

    11. Applications
    12. fault-tolerant clock synchronization ( use Interactive Consistency algorithm to agree on some value )
    13. atomic commit in DDBS ( to commit or abort ~ "attack" or "retreat" )