- What is a distributed system?
- A collection of independent computers that appears to its users as a single coherent system.
- It consists of multiple computers that do not share memory.
- Each computer has its own memory and runs its own operating system.
- The computers can communicate with each other through a communication network.
Figure 1. A distributed system organized as middleware.
The middleware layer extends over multiple machines,
and offers each application the same interface.
- Why distributed systems?
Advantages of distributed systems over traditional time-sharing systems
- Much better price/performance ratio
- Resource sharing
- Enhanced performance -- tasks can be executed concurrently; load distribution to reduce response time
- Higher reliability -- data replication
- Easier modular expansion -- hardware and software resources
can be easily added without replacing existing resources
A few issues.
- Transparency in a Distributed System
A distributed system needs to hide the fact that its processes and resources
are physically distributed across multiple computers.
Figure 2. Different forms of transparency in a distributed system (ISO, 1995).
Question: Is complete transparency a good idea?
Degree of Transparency
Aiming at full distribution transparency may be too much:
- Users may be located in different continents;
distribution is apparent and not something you want to hide
- Completely hiding failures of networks and nodes is
(theoretically and practically) impossible
- You cannot distinguish a slow computer from a failing one
- You can never be sure that a server actually performed an operation before a crash
- A distributed system may consist of devices carried around by people; exposing the distribution of the system is then more desirable than hiding it
- Full transparency will cost performance, for example:
- Keeping Web caches exactly up-to-date with the master copy
- Immediately flushing write operations to disk for fault tolerance
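The point that a slow computer cannot be distinguished from a failing one can be illustrated with a timeout-based failure detector. This is only a minimal sketch: the class name `TimeoutFailureDetector`, the `timeout` parameter, and the explicit `now` arguments (used instead of a real clock to keep the example deterministic) are assumptions made for illustration.

```python
class TimeoutFailureDetector:
    """Suspects a node when no heartbeat arrived within `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # node name -> time of last heartbeat

    def heartbeat(self, node, now):
        # Record that `node` was heard from at time `now`.
        self.last_seen[node] = now

    def suspected(self, now):
        # A node is suspected if its last heartbeat is older than the timeout.
        # The detector cannot tell whether the node crashed or is just slow.
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)


detector = TimeoutFailureDetector(timeout=2.0)
for node in ("fast", "slow", "crashed"):
    detector.heartbeat(node, now=0.0)

detector.heartbeat("fast", now=1.0)           # answers promptly
suspects_at_3 = detector.suspected(now=3.0)   # "slow" and "crashed" look the same

detector.heartbeat("fast", now=3.0)
detector.heartbeat("slow", now=3.5)           # the late heartbeat finally arrives
suspects_at_4 = detector.suspected(now=4.0)   # only "crashed" is still suspected
```

At time 3 the slow node is wrongly suspected. A longer timeout reduces such false suspicions but delays the detection of real crashes.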
- Openness of Distributed Systems
Open distributed system:
Be able to interact with services from other open systems, irrespective of the underlying environment:
- Systems should conform to well-defined interfaces
- Systems should support portability of applications
- Systems should easily interoperate
At least make the distributed system independent of the heterogeneity of the underlying environment.
Policies vs. Mechanisms
Defining openness: Requires support for different policies specified by applications and users:
- What level of consistency do we require for client cached data?
- Which operations do we allow downloaded code to perform?
- Which QoS requirements do we adjust in the face of varying bandwidth?
- What level of secrecy do we require for communication?
Implementing openness: Ideally, a distributed system provides only mechanisms:
- Allow (dynamic) setting of caching policies, preferably per cachable item
- Support different levels of trust for mobile code
- Provide adjustable QoS parameters per data stream
- Offer different encryption algorithms
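The separation of policy from mechanism can be sketched with a cache whose freshness rule is set per cached item. This is an illustrative sketch only: `PolicyCache`, `max_age`, and the explicit `now` arguments are hypothetical names, not part of any particular system.

```python
class PolicyCache:
    """Caching mechanism; the freshness policy is supplied per item by the caller."""

    def __init__(self):
        self._entries = {}  # key -> (value, stored_at, max_age)

    def put(self, key, value, now, max_age):
        # `max_age` is the policy: how long this particular item may be served.
        self._entries[key] = (value, now, max_age)

    def get(self, key, now):
        # Return the cached value if still fresh under its own policy, else None.
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, stored_at, max_age = entry
        if now - stored_at > max_age:
            del self._entries[key]  # stale: evict
            return None
        return value


cache = PolicyCache()
cache.put("logo.png", b"...", now=0, max_age=3600)   # static asset: long-lived
cache.put("stock-quote", 101.5, now=0, max_age=5)    # volatile data: short-lived

fresh_logo = cache.get("logo.png", now=60)       # still served from the cache
stale_quote = cache.get("stock-quote", now=60)   # expired under its own policy
```

The cache itself never decides how fresh is fresh enough; the application makes that choice for each item.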
- Scale in Distributed Systems
Many developers of modern distributed
systems readily use the adjective "scalable" without making
clear how and why their system actually scales.
At least three components:
- Number of users and/or processes (size scalability)
- Maximum distance between nodes (geographical scalability)
- Number of administrative domains (administrative scalability)
Most systems account only, to a certain extent, for
size scalability. The (non)solution: powerful servers.
Today, the challenge lies in geographical and administrative scalability.
Techniques for Scaling:
- Hide communication latencies: Avoid waiting for responses; do something else:
- Make use of asynchronous communication
- Have a separate handler for incoming responses
- Do some work at client side (See Figure 4 below)
- Problem: not every application fits this model
- Distribution: Partition data and computations across multiple nodes:
- Move computations to clients (Java applets)
- Decentralized naming services (DNS) (See Figure 5 below)
- Decentralized information systems (WWW)
- Replication/caching: Make copies of data available at different nodes:
- Replicated file servers and databases
- Mirrored Web sites
- Web caches (in browsers and proxies)
- File caching (at server and client)
- Scalability Problems:
Figure 3. Examples of scalability limitations
Characteristics of decentralized algorithms:
- No machine has complete information about the system state.
- Machines make decisions based only on local information.
- Failure of one machine does not ruin the algorithm.
- There is no implicit assumption that a global clock exists.
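As one possible illustration of these characteristics, the sketch below estimates a network-wide average by pairwise gossip: two nodes meet, exchange their local values, and both keep the mean. It is a single-process simulation under assumed names (`gossip_average`, `rounds`, `seed`); no node ever sees the full list of values and no global clock is used.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Each round, one random pair of nodes averages their local values."""
    rng = random.Random(seed)
    state = list(values)  # state[i] is known only to node i
    for _ in range(rounds):
        i, j = rng.sample(range(len(state)), 2)  # two distinct nodes meet
        mean = (state[i] + state[j]) / 2
        state[i] = state[j] = mean  # purely local update, no coordinator
    return state


readings = [10.0, 20.0, 30.0, 40.0]
estimates = gossip_average(readings, rounds=200)
```

Each exchange preserves the sum of all values, so every local estimate drifts toward the global mean (25.0 here) without any machine holding the complete system state.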
- Scaling Techniques:
Hiding latency: Avoid waiting for
responses to remote service requests (e.g. do other useful work
at the requester's side).
Figure 4. The difference between letting (a) a server
or (b) a client check forms as they are being filled.
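Hiding latency with asynchronous communication and a separate response handler might look like the following sketch. `remote_call` is a stand-in that only sleeps to simulate network delay, and `on_reply` is a hypothetical handler name; a real system would issue actual network requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_call(name, delay=0.05):
    # Stand-in for a remote request; `delay` simulates network latency.
    time.sleep(delay)
    return f"{name}: done"


replies = []


def on_reply(future):
    # Separate handler that runs when a response arrives.
    replies.append(future.result())


with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(remote_call, f"req{i}") for i in range(3)]
    for f in futures:
        f.add_done_callback(on_reply)
    local_result = sum(range(1000))  # useful local work while requests are in flight
# Leaving the `with` block waits for all outstanding requests to finish.
```

The three requests overlap with each other and with the local computation, so the total waiting time is roughly one round trip rather than three.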
Another important scaling technique is distribution, e.g. the
Domain Name System (DNS):
The DNS name space is hierarchically organized into a tree of domains, which are divided into nonoverlapping zones.
Figure 5. An example of dividing the DNS name space into zones.
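The idea of nonoverlapping zones can be sketched as follows: each zone answers only for its own names and refers the client to a child zone for everything below a delegated subdomain. This is a toy model, not the DNS protocol; the `Zone` class, the `resolve` function, and the example names and address are assumptions for illustration.

```python
class Zone:
    """One nonoverlapping zone: holds its own records and delegations."""

    def __init__(self, name):
        self.name = name
        self.records = {}      # fully qualified host name -> address
        self.delegations = {}  # subdomain suffix -> Zone responsible for it

    def lookup(self, host):
        # Answer from local records, or refer the client to a child zone.
        if host in self.records:
            return ("answer", self.records[host])
        for suffix, child in self.delegations.items():
            if host == suffix or host.endswith("." + suffix):
                return ("referral", child)
        return ("nxdomain", None)


def resolve(root, host):
    """Iterative resolution: follow referrals from the root downward."""
    zone, path = root, []
    while True:
        path.append(zone.name)
        kind, value = zone.lookup(host)
        if kind != "referral":
            return value, path
        zone = value


root, edu, example = Zone("."), Zone("edu"), Zone("example.edu")
root.delegations["edu"] = edu
edu.delegations["example.edu"] = example
example.records["www.example.edu"] = "192.0.2.10"  # documentation-only address

address, path = resolve(root, "www.example.edu")
```

No single server stores the whole name space; each lookup touches only the zones on the path from the root to the name being resolved.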
- Pitfalls when Developing Distributed Systems
False assumptions made by first-time developers:
- The network is reliable.
- The network is secure.
- The network is homogeneous.
- The topology does not change.
- Latency is zero.
- Bandwidth is infinite.
- Transport cost is zero.
- There is one administrator.
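Dropping the first assumption (a reliable network) typically means that every remote call needs a timeout and a retry strategy. The sketch below retries with exponential backoff; `FlakyNetwork` is a hypothetical channel that loses a fixed number of messages, and the parameter names are assumptions for illustration.

```python
import time


class FlakyNetwork:
    """Hypothetical channel that loses the first `failures` messages."""

    def __init__(self, failures):
        self.failures = failures
        self.sent = 0

    def send(self, message):
        self.sent += 1
        if self.sent <= self.failures:
            raise TimeoutError("no reply")
        return f"ack:{message}"


def send_with_retry(network, message, attempts=4, base_delay=0.1, sleep=time.sleep):
    """Retry on timeout, doubling the wait after each failed attempt."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return network.send(message)
        except TimeoutError:
            if attempt == attempts:
                raise  # give up: the caller must handle the failure
            sleep(delay)
            delay *= 2


waits = []  # record the backoff delays instead of actually sleeping
reply = send_with_retry(FlakyNetwork(failures=2), "hello", sleep=waits.append)
```

Blind retries are only safe for idempotent operations, because a timeout does not reveal whether the server performed the request before the reply was lost.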
- Distributed Computing Systems
Used for high performance computing
Cluster Computing Systems
mainly for parallel programming
e.g. Beowulf -- Linux-based cluster
Grid Computing Systems
- fabric -- provides interfaces to local resources at a specific site
- connectivity -- consists of communication protocols for supporting
grid transactions that may use multiple resources;
also authenticates users for access to certain resources
- resource -- responsible for managing a single resource; uses information
from the connectivity layer to call the fabric layer
- collective -- handles access to multiple resources, typically consisting
of services for discovering and allocating resources
e.g. the SETI@home (Search for Extraterrestrial Intelligence) project:
thousands of people share the unused processor cycles of their
PCs in the vast search for signs of "rational" signals from outer space.
- Distributed Information Systems
Transaction Processing Systems
Figure 8. Example primitives for transactions.
BEGIN_TRANSACTION(server, transaction);
READ(transaction, file-1, data);
WRITE(transaction, file-2, data);
newData := MODIFIED(data);
IF WRONG(newData) THEN
  ABORT_TRANSACTION(transaction);
ELSE
  WRITE(transaction, file-2, newData);
  END_TRANSACTION(transaction);
END IF;
Characteristic properties of transactions:
- Atomic: To the outside world, the transaction happens indivisibly.
- Consistent: The transaction does not violate system invariants.
- Isolated: Concurrent transactions do not interfere with each other.
- Durable: Once a transaction commits, the changes are permanent.
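Atomicity and consistency can be tried out with Python's standard `sqlite3` module, as in the sketch below. The `account` table, the `transfer` function, and the `CHECK` constraint standing in for a system invariant are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # invariant: no negative balances
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False  # the invariant was violated; nothing was changed


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer(conn, "alice", "bob", 30)        # succeeds and commits
rejected = transfer(conn, "alice", "bob", 500) # would overdraw alice: rolled back
final = balances(conn)
```

In the rejected transfer the credit to `bob` is executed first and then undone by the rollback, so no other reader ever observes a half-finished transfer.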
A nested transaction can be constructed from a number of subtransactions.
Figure 9. A nested transaction.
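Durability applies to the top-level transaction only: the results of a
subtransaction become permanent only when the top-level transaction commits.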
A transaction processing (TP) monitor allows an application
to access multiple servers/databases.
Figure 10. The role of a TP monitor in distributed systems.
Enterprise Application Integration
The more applications became decoupled from the databases they were built
upon, the more we needed facilities to integrate applications
independent from their databases; we want application components
to communicate directly with each other.
Figure 11. Middleware as a communication facilitator in enterprise application integration
RPC and RMI (remote method invocation) are examples of middleware;
RMI operates on objects.
Both RPC and RMI require the caller and the callee to be up and running
at the time of communication.
Message-oriented middleware (MOM) lets applications
send messages to logical contact points (publish/subscribe).
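The decoupling that message-oriented middleware provides can be sketched with an in-process broker that queues messages per subscriber. `MessageBroker` and its methods are hypothetical names for a toy model, not the API of any real MOM product.

```python
from collections import defaultdict, deque


class MessageBroker:
    """Toy message-oriented middleware with per-subscriber queues."""

    def __init__(self):
        self._queues = defaultdict(dict)  # topic -> {subscriber: deque of messages}

    def subscribe(self, topic, subscriber):
        self._queues[topic].setdefault(subscriber, deque())

    def publish(self, topic, message):
        # The sender returns immediately; messages wait in the queues.
        for queue in self._queues[topic].values():
            queue.append(message)

    def receive(self, topic, subscriber):
        # Drain everything that accumulated while the subscriber was away.
        queue = self._queues[topic][subscriber]
        messages = list(queue)
        queue.clear()
        return messages


broker = MessageBroker()
broker.subscribe("orders", "billing")
broker.subscribe("orders", "shipping")

broker.publish("orders", "order-1")  # no subscriber is listening right now
broker.publish("orders", "order-2")

billing_inbox = broker.receive("orders", "billing")
shipping_inbox = broker.receive("orders", "shipping")
```

Unlike an RPC, the publisher never blocks on a receiver, and each subscriber picks up its messages whenever it is ready.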
- Distributed Pervasive Systems
Small nodes, mobile, usually embedded, battery-powered
Requirements for pervasive systems
- Embrace contextual changes.
- Encourage ad hoc composition.
- Recognize sharing as the default.
Electronic Health Care Systems
A great example of a pervasive system
Questions to be addressed for health care systems:
- Where and how should monitored data be stored?
- How can we prevent loss of crucial data?
- What infrastructure is needed to generate and propagate alerts?
- How can physicians provide online feedback?
- How can extreme robustness of the monitoring system be realized?
- What are the security issues and how can the proper policies be enforced?
Monitoring a person in a pervasive electronic health care system
with ECG (electrocardiogram) sensors, using
(a) a local hub or
(b) a continuous wireless connection.
Sensor Networks
Characteristics: many nodes (10s to 1000s), simple CPUs, often battery-powered
Used for processing information; mostly use wireless communications
Questions concerning sensor networks:
- How do we (dynamically) set up an efficient tree in a sensor network?
- How does aggregation of results take place? Can it be controlled?
- What happens when network links fail?
Consider sensor networks from a database perspective/API!
Figure 13a. Organizing a sensor network database, while storing
and processing data (a) only at the operator's site or ...
Figure 13b. Organizing a sensor network database, while storing
and processing data ... or (b) only at the sensors.
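One possible answer to the aggregation question is to combine results along a routing tree, so that each sensor forwards a single partial result instead of all raw readings. The sketch below is a single-process illustration; the tree layout, the sensor names, and the `aggregate` function are assumptions.

```python
def aggregate(tree, readings, node):
    """Return (sum, count) for the subtree rooted at `node`."""
    total, count = readings[node], 1
    for child in tree.get(node, []):
        # Each child sends one partial aggregate, not its raw data.
        child_total, child_count = aggregate(tree, readings, child)
        total += child_total
        count += child_count
    return total, count


# Hypothetical routing tree: the operator's site is attached to sensor "s0".
tree = {"s0": ["s1", "s2"], "s1": ["s3", "s4"]}
readings = {"s0": 20.0, "s1": 22.0, "s2": 19.0, "s3": 21.0, "s4": 23.0}

total, count = aggregate(tree, readings, "s0")
average = total / count
```

Forwarding `(sum, count)` pairs rather than averages keeps the result exact when subtrees have different sizes, and every link carries one small message regardless of how many sensors lie below it.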