|To: |Troy Tuckett |
|From: |David Harborth |
|Class: |POS/355 |
|Date: |8/20/12 |
|Re: |Individual Assignment for Week 4 |
Four Types of Failures for a Distributed System
There are two types of systems that people can use when setting up their network: a distributed system or a centralized system. In this memo I will point out the failures that can happen in either system and how to isolate and fix two of them.
A distributed system is a collection of autonomous computers connected through a network and running distribution middleware. The middleware allows the computers to communicate with each other and share resources, while allowing the end user to use the system as he or she would use a single integrated computing facility (Emmerich, 1997).
There are several types of failures that can happen in a distributed system. I will list four of them:
1. Halting failures: A component simply stops. There is no way to detect the failure except by timeout: it either stops sending "I'm alive" (heartbeat) messages or fails to respond to requests. Your computer freezing is a halting failure.
2. Fail-stop: A halting failure with some kind of notification to other components. A network file server telling its clients it is about to go down is a fail-stop.
3. Network failures: A network link breaks....
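As an illustration of the timeout-based detection described for halting failures, here is a minimal sketch in Python. It is not taken from any particular system: the class name HeartbeatMonitor, its methods, and the timeout value are all hypothetical and chosen only for this example. The idea is that each component reports "I'm alive" periodically, and any component that has been silent longer than the timeout is suspected of having halted.

```python
import time


class HeartbeatMonitor:
    """Hypothetical helper: tracks the last heartbeat seen from each component."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        # timeout_seconds: how long a component may stay silent before it is suspected.
        # clock: injectable time source so the logic can be exercised without waiting.
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.last_seen = {}

    def record_heartbeat(self, component):
        # Called whenever an "I'm alive" message arrives from a component.
        self.last_seen[component] = self.clock()

    def suspected_failures(self):
        # A component is suspected once its silence exceeds the timeout.
        now = self.clock()
        return sorted(
            name
            for name, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )
```

In use, a monitor would call record_heartbeat each time a message arrives and poll suspected_failures on a schedule. A timeout can only raise suspicion: a slow component or a broken network link looks the same to the monitor as a halted one, which is why a fail-stop notification is easier to act on.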