In a few hundred years, when the history of our time is written from a long-term perspective, it is likely that the most important event those historians will see is not technology, not the Internet, not e-commerce. It is an unprecedented change in the human condition. For the first time, they will have to manage themselves.
Peter Drucker
Distributed OS Theories
Physical Clocks
Problem
Sometimes we simply need the exact time, not just an ordering.
Solution
Coordinated Universal Time (UTC):
Problem
Suppose we have a distributed system with a UTC receiver
somewhere in it => we still have to distribute its time to each machine.
Basic principle
In practice: 1 - r ≤ dC/dt ≤ 1 + r.
Goal
Never let two clocks in any system differ by more than δ time units =>
synchronize at least every δ/(2r) seconds.
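A quick numeric check of the bound above, as a minimal sketch. Two clocks that drift in opposite directions at the maximum rate r move apart at 2r seconds per second, so they stay within δ only if they are resynchronized at least every δ/(2r) seconds. The function name and the sample values are illustrative assumptions, not part of the notes.

```python
def max_sync_interval(delta: float, r: float) -> float:
    """Longest allowed time (seconds) between two synchronizations.

    delta: maximum tolerated difference between any two clocks (seconds)
    r:     maximum drift rate of a clock (seconds of error per second)
    """
    if delta <= 0 or r <= 0:
        raise ValueError("delta and r must be positive")
    # Worst case: one clock runs fast by r, the other slow by r,
    # so the gap grows at 2r per second.
    return delta / (2 * r)


# Assumed example: drift rate 1e-5 (roughly 0.86 s per day), skew limit 1 ms.
interval = max_sync_interval(delta=1e-3, r=1e-5)
print(f"resynchronize at least every {interval:.1f} s")
```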
Global positioning system
Basic idea
You can get an accurate account of time as a side effect of GPS.
Problem
Assuming that the clocks of the satellites are accurate and
synchronized:
Principal operation
Observation
4 satellites => 4 equations in 4 unknowns ( with Δ_{r} as one of them )
Clock Synchronization Principle
Principle I
Every machine asks a time server for the accurate time at least once
every δ/(2r) seconds (Network Time Protocol).
Note
Okay, but you need an accurate measure of round trip delay, including
interrupt handling and processing incoming messages.
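The round-trip correction mentioned in the note can be sketched with the usual four-timestamp estimate. This assumes the network delay is the same in both directions, which is exactly where the accuracy concern above comes from. The function and variable names are illustrative assumptions.

```python
def estimate_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from one request/reply.

    t1: request sent      (client clock)
    t2: request received  (server clock)
    t3: reply sent        (server clock)
    t4: reply received    (client clock)
    """
    # Time spent on the network in both directions, excluding the time
    # the server spent handling the request.
    delay = (t4 - t1) - (t3 - t2)
    # Server clock minus client clock, assuming symmetric one-way delays.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return offset, delay


# Assumed example: the client is 5 units behind the server, each one-way
# trip takes 2 units, and the server needs 1 unit to process the request.
offset, delay = estimate_offset_and_delay(t1=10, t2=17, t3=18, t4=15)
print(offset, delay)
```

The client would then move its clock toward `t4 + offset`, gradually rather than in one jump.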
Principle II
Let the time server scan all machines periodically, calculate an
average, and inform each machine how it should adjust its time relative
to its present time.
Note
Okay, you'll probably get every machine in sync. You don't even need
to propagate UTC time.
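Principle II can be sketched as a simple averaging step. This is a minimal illustration that ignores message delays and faulty clocks; the function name and the sample readings are assumptions made for the example.

```python
def average_adjustments(server_time, client_times):
    """Return the adjustment each machine should apply, server first.

    The time server polls every machine, averages all readings
    (including its own), and tells each machine how far to move
    relative to its present time.
    """
    clocks = [server_time] + list(client_times)
    average = sum(clocks) / len(clocks)
    return [average - c for c in clocks]


# Assumed readings in minutes: server at 180, clients at 205 and 170.
adjustments = average_adjustments(180, [205, 170])
print(adjustments)
```

A negative adjustment means the machine is ahead of the average. It should absorb that value by running slow for a while, not by stepping its clock back.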
Fundamental
You'll have to take into account that setting the time back is never
allowed => smooth adjustments.
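One way to picture a smooth adjustment: each timer interrupt normally adds a fixed amount to the clock, and while a correction is pending the clock adds slightly more or slightly less than that amount. The sketch below assumes integer time units; `tick` and `max_slew` are hypothetical parameters chosen for illustration.

```python
def slew(clock, correction, tick=10, max_slew=1):
    """Absorb `correction` gradually and return the successive clock readings.

    tick:     units normally added per timer interrupt
    max_slew: largest deviation from `tick` allowed per interrupt
    """
    if not 0 < max_slew < tick:
        raise ValueError("max_slew must be positive and smaller than tick")
    readings = []
    remaining = correction
    while remaining != 0:
        # Clamp this interrupt's share of the correction to +/- max_slew.
        step = max(-max_slew, min(max_slew, remaining))
        clock += tick + step  # always positive, so time never runs backwards
        remaining -= step
        readings.append(clock)
    return readings


# A clock that is 3 units ahead is slowed down over three interrupts.
readings = slew(clock=1000, correction=-3)
print(readings)
```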
Problem
We first need to introduce a notion of ordering before we can order anything.
The happened-before relation ( → )
e_{11} → e_{12} , e_{12} → e_{22}
e_{21} → e_{13} , e_{14} ∥ e_{24}
Realization
To realize the relation → we need a clock C_{i} at each
process P_{i} in the system, and adjust the clock according
to the following rules.
Condition requirements:
Implementation rules:
→ is irreflexive and defines a partial order among events.
A total ordering relation ( => ) can be defined on top of the above by breaking ties between equal timestamps with process identifiers.
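A minimal sketch of a logical clock, under a commonly used formulation of the rules: the clock advances by d on every local or send event, and on a receive it jumps past the timestamp carried by the message. The total order is obtained by comparing (timestamp, process id) pairs. Class and method names are assumptions for illustration.

```python
class LamportClock:
    def __init__(self, pid, d=1):
        self.pid = pid
        self.d = d        # positive increment between successive events
        self.time = 0

    def tick(self):
        """Local event or send: advance the clock by d."""
        self.time += self.d
        return self.time

    def receive(self, msg_time):
        """Receive: move past both the local clock and the message timestamp."""
        self.time = max(self.time, msg_time) + self.d
        return self.time


p1, p2 = LamportClock(pid=1), LamportClock(pid=2)
a = ("a", p1.tick(), p1.pid)        # P1 sends a message stamped with its clock
b = ("b", p2.tick(), p2.pid)        # unrelated local event on P2
c = ("c", p2.receive(a[1]), p2.pid) # P2 receives the message from P1

# Total order: sort by timestamp, break ties with the process id.
total_order = [name for name, _, _ in sorted([c, b, a], key=lambda e: (e[1], e[2]))]
print(total_order)
```

Events a and b carry the same timestamp and are not related by →; the process id only serves to place them in some fixed order.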
Example: Totally Ordered Multicasting
n = number of processes in a distributed system
Each event in process P_{i} ~ vector clock C_{i} ( integer vector of length n )
C_{i} = ( C_{i}[1], C_{i}[2], ... , C_{i}[n] )
Comparing two vector timestamps of events a and b
Equal  t^{a} = t^{b}  iff  all i,  t^{a}[i] = t^{b}[i]
Not Equal  t^{a} ≠ t^{b}  iff  some i,  t^{a}[i] ≠ t^{b}[i]
Less Than or Equal  t^{a} ≤ t^{b}  iff  all i,  t^{a}[i] ≤ t^{b}[i]
Not Less Than or Equal  t^{a} ≰ t^{b}  iff  some i,  t^{a}[i] > t^{b}[i]
Less Than  t^{a} < t^{b}  iff  t^{a} ≤ t^{b} and t^{a} ≠ t^{b}
Not Less Than  t^{a} ≮ t^{b}  iff  not ( t^{a} ≤ t^{b} and t^{a} ≠ t^{b} )
Concurrent  t^{a} ∥ t^{b}  iff  t^{a} ≮ t^{b} and t^{b} ≮ t^{a}
Implementation Rules:
Assertion.
At any instant, for all i, j: C_{i}[i] ≥ C_{j}[i]
Now, a → b iff t^{a} < t^{b}
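The comparisons above translate directly into code. The sketch below also includes one common formulation of the update rules (increment your own component on every event, take the componentwise maximum on receive); that part is an assumption. Indices are 0-based here, unlike the 1-based notation in the notes.

```python
def leq(ta, tb):
    """ta ≤ tb: every component of ta is at most the matching one in tb."""
    return all(x <= y for x, y in zip(ta, tb))

def less(ta, tb):
    """ta < tb: ta ≤ tb and the two vectors differ."""
    return leq(ta, tb) and tuple(ta) != tuple(tb)

def concurrent(ta, tb):
    """ta ∥ tb: neither timestamp is less than the other."""
    return not less(ta, tb) and not less(tb, ta)


class VectorClock:
    def __init__(self, i, n, d=1):
        self.i, self.d = i, d
        self.v = [0] * n

    def tick(self):
        """Local or send event: advance this process's own component."""
        self.v[self.i] += self.d
        return tuple(self.v)

    def receive(self, tm):
        """Merge the message timestamp, then count the receive event itself."""
        self.v = [max(x, y) for x, y in zip(self.v, tm)]
        return self.tick()


p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
ta = p0.tick()        # P0 sends a message
tb = p1.tick()        # independent local event on P1
tc = p1.receive(ta)   # P1 receives the message from P0

print(ta, tb, tc)
print(less(ta, tc), concurrent(ta, tb))
```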
Example: Updating a replicated database and leaving it in an inconsistent state.
if Send( M_{1} ) → Send( M_{2} )
then the recipient should receive M_{1} before
M_{2}
i.e. Send( M_{1} ) → Send( M_{2} ) requires Receive( M_{1} ) → Receive( M_{2} )
Birman-Schiper-Stephenson Protocol: Enforcing Causal Ordering of Messages
Assumes broadcast communication channels that do not lose or corrupt messages ( i.e. everyone talks to everyone ). Uses vector clocks to "count" the number of messages ( i.e. set d = 1 ). There are n processes.
Vector Time:
The Protocol:
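A minimal sketch of the delivery rule as it is commonly described: a message from P_{j} stamped with vector VT_{m} is delivered at P_{i} only when it is the next message expected from P_{j} and P_{i} has already delivered everything the sender had seen from the other processes; otherwise it is buffered. Class and method names are assumptions, and indices are 0-based.

```python
class BSSProcess:
    def __init__(self, pid, n):
        self.pid = pid
        self.vt = [0] * n    # vt[k]: messages from P_k delivered here (sent, for k == pid)
        self.buffer = []     # received but not yet deliverable
        self.delivered = []  # payloads in delivery order

    def broadcast(self, payload):
        """Count the send and stamp the message with the updated vector."""
        self.vt[self.pid] += 1
        return (self.pid, tuple(self.vt), payload)

    def _deliverable(self, sender, vt_m):
        next_from_sender = vt_m[sender] == self.vt[sender] + 1
        seen_the_rest = all(vt_m[k] <= self.vt[k]
                            for k in range(len(self.vt)) if k != sender)
        return next_from_sender and seen_the_rest

    def receive(self, msg):
        """Buffer the message, then deliver whatever has become deliverable."""
        self.buffer.append(msg)
        progress = True
        while progress:
            progress = False
            for m in list(self.buffer):
                sender, vt_m, payload = m
                if self._deliverable(sender, vt_m):
                    self.buffer.remove(m)
                    self.vt = [max(a, b) for a, b in zip(self.vt, vt_m)]
                    self.delivered.append(payload)
                    progress = True


p0, p1, p2 = (BSSProcess(i, 3) for i in range(3))
m1 = p0.broadcast("m1")
p1.receive(m1)
m2 = p1.broadcast("m2")  # causally after m1

p2.receive(m2)           # arrives first, must wait for m1
held_back = list(p2.delivered)
p2.receive(m1)           # releases both, in causal order
print(held_back, p2.delivered)
```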
Example
Schiper-Eggli-Sandoz were able to solve the problem without broadcast channels.
Chandy-Lamport Global State Recording Protocol ( Snapshot Algorithm )
The goal of this distributed algorithm is to capture a consistent global state. It assumes all communication channels are FIFO. It uses a distinguished message called a marker to start the algorithm.
Example
In this example, all processes are connected by communications channels C_{ij}. Messages being sent over the channels are represented by arrows between the processes.
Snapshot s_{1}:
Snapshot s_{2}: now a message is in transit on C_{12} and C_{21}.
The recorded process states and channel states must be collected and assembled to form the global state. ( e.g. send G.S. to all processes in finite time )
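A small simulation of the marker rules as they are commonly stated: a process records its own state and sends a marker on every outgoing channel before any further message; on the first marker it receives it records its state and treats that channel as empty; messages that arrive on a channel after the local record but before that channel's marker are recorded as in transit. The two-process money-transfer scenario and all names are assumptions for illustration.

```python
from collections import deque

MARKER = "MARKER"

class Process:
    def __init__(self, pid, balance, peers, channels):
        self.pid, self.balance = pid, balance
        self.peers = peers            # ids of the other processes
        self.channels = channels      # channels[(src, dst)] is a FIFO queue
        self.recorded_state = None
        self.channel_state = {}       # src -> messages recorded as in transit
        self.recording = set()        # incoming channels still being recorded

    def send(self, dst, amount):
        self.balance -= amount
        self.channels[(self.pid, dst)].append(amount)

    def start_snapshot(self):
        self._record()

    def _record(self):
        self.recorded_state = self.balance
        self.recording = set(self.peers)
        self.channel_state = {p: [] for p in self.peers}
        for p in self.peers:          # marker precedes any later message
            self.channels[(self.pid, p)].append(MARKER)

    def receive(self, src):
        msg = self.channels[(src, self.pid)].popleft()
        if msg == MARKER:
            if self.recorded_state is None:
                self._record()        # channel from src is recorded as empty
            self.recording.discard(src)
        else:
            self.balance += msg
            if src in self.recording: # after our record, before src's marker
                self.channel_state[src].append(msg)


channels = {(1, 2): deque(), (2, 1): deque()}
p1 = Process(1, 100, [2], channels)
p2 = Process(2, 50, [1], channels)

p1.send(2, 20)       # in flight on C_12
p2.send(1, 10)       # in flight on C_21
p1.start_snapshot()  # P1 records 80 and sends a marker on C_12
p1.send(2, 5)        # sent after the marker, not part of the snapshot

for _ in range(3):   # P2 gets 20, then the marker, then 5
    p2.receive(1)
for _ in range(2):   # P1 gets 10, then P2's marker
    p1.receive(2)

total = (p1.recorded_state + p2.recorded_state
         + sum(p1.channel_state[2]) + sum(p2.channel_state[1]))
print(p1.recorded_state, p2.recorded_state, p1.channel_state, p2.channel_state, total)
```

The recorded total equals the 150 units that exist in the system, even though no single instant of the run had exactly these balances visible everywhere at once.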
Termination
each process must ensure that
C = { c_{1}, c_{2}, ... ,c_{n} }
c_{i} - cut event, the local state of
site ( or process ) S_{i} at that instant
A cut is consistent if for all S_{i}, all S_{j}, there is no e_{i}, no e_{j} such that
( e_{i} → e_{j} ) and ( e_{j} → c_{j} ) and not ( e_{i} → c_{i} )
i.e. every message received before a cut event
was sent before the cut event
at the sender site in the cut.
Time of a cut
C = { c_{1}, c_{2}, ... ,c_{n} }
C_{i} - vector clock of c_{i}
T_{C} = sup ( C_{1}, C_{2}, ... , C_{n} )
T_{C}[k] = max ( C_{1}[k], C_{2}[k], ... , C_{n}[k] )
T_{C} = ( C_{1}[1], C_{2}[2], ... , C_{n}[n] )   (1)
T_{C} = sup ( C_{1}, C_{2}, ... , C_{n} ) = ( C_{1}[1], C_{2}[2], ... , C_{n}[n] )
On the other hand if (1) is true
we have C_{i}[i] ≥ C_{j}[i] for all i, j.
This implies that the events c_{i} are concurrent
and the cut is consistent.
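Condition (1) gives a direct test for consistency: compute the componentwise maximum of the vector clocks of the cut events and compare it with the "diagonal" built from each site's own component. A minimal sketch, with 0-based indices and assumed sample clocks:

```python
def cut_time(clocks):
    """T_C: componentwise maximum (sup) of the cut events' vector clocks."""
    return [max(column) for column in zip(*clocks)]

def is_consistent(clocks):
    """Check condition (1): T_C[i] equals C_i[i] for every site i."""
    diagonal = [clocks[i][i] for i in range(len(clocks))]
    return cut_time(clocks) == diagonal


# Site 1 has seen only the first event of site 0, which lies inside the cut.
good_cut = [[2, 0], [1, 3]]
# Site 1 has seen the second event of site 0, but site 0 was cut after its
# first event: a message is received inside the cut yet sent outside it.
bad_cut = [[1, 0], [2, 3]]

print(cut_time(good_cut), is_consistent(good_cut))
print(cut_time(bad_cut), is_consistent(bad_cut))
```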
System Model
Huang's Termination Detection Protocol:


Example
Total number of messages passed: 8 (one to start each computation, one to return the weight).
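A minimal sketch of weight throwing as the protocol is commonly described: the controlling agent starts with weight 1, every computation message carries part of the sender's weight, an idle process returns its weight to the controlling agent, and termination is declared when the agent holds weight 1 again. The topology below (one agent and four processes) is a hypothetical one chosen so that the message count matches the figure quoted above; it is not taken from the notes. Exact fractions avoid rounding problems.

```python
from fractions import Fraction

class Node:
    def __init__(self, name, weight=0):
        self.name = name
        self.weight = Fraction(weight)

class WeightThrowing:
    def __init__(self):
        self.controller = Node("controller", 1)
        self.messages = 0

    def send_work(self, src, dst):
        """Computation message: src keeps half its weight, dst gets the rest."""
        half = src.weight / 2
        src.weight -= half
        dst.weight += half
        self.messages += 1

    def go_idle(self, node):
        """Control message: an idle process returns all of its weight."""
        self.controller.weight += node.weight
        node.weight = Fraction(0)
        self.messages += 1

    def terminated(self):
        return self.controller.weight == 1


system = WeightThrowing()
p1, p2, p3, p4 = (Node(f"P{i}") for i in range(1, 5))

system.send_work(system.controller, p1)  # controller starts P1
system.send_work(p1, p2)                 # P1 starts P2
system.send_work(p1, p3)                 # P1 starts P3
system.send_work(p3, p4)                 # P3 starts P4
still_running = not system.terminated()

for p in (p2, p4, p3, p1):               # each process finishes and reports
    system.go_idle(p)

print(system.messages, system.terminated())
```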