<< Exercise 4 | Index | Exercise 6 >>

To add a question because something is unclear or was not understood, just insert the question and add the prefix %q% for each addition (q like question). This is the "question-style". Like this:

* %q% What kind of problems could have decentralized nature?
  • What kind of problems could have decentralized nature?

If you want to answer a question or add a comment please put a %a% in front. This is thea "answer-style" (a lilke answer). An example:

* %a% This is an addition to something that I consider important.
  • This is an addition to something that I consider important.

For citations or references to the slides of Prof. Suri pleas add the lecture and slide number in braces: (<lecture>.<slide>).

Please make sure that you enter an author name, else your changes will not be saved!

Exercise 5

Distributed Systems

  1. There are many trade-offs to be made when designing a distributed system. Explain the more important trade-offs below and how they relate to each other:
    1. Centralized vs. de-centralized applications
      Actually application control:
      Centralized: easier, simpler, efficient operations; but SPF, slow coordination restoration.
      Distributed: complex, robust, no SPF, more natural for certain problems.
    2. Sequential vs. concurrent distribution
      Highly sequential: more control, less performance. E.g. RPC
      Highly concurrent/parallel: less control, more performance. E.g. DSM, group-oriented systems.
    3. Transparency level
      Invisible: can be used if distribution is not a part of the problem. Makes applications simpler.
      Visible: needed if problem location relates to distribution. Maintenance and location easier.
    4. Function- vs. data-shipping
      Code shipping: for remote execution, using local data (e.g.Applets) - demands less bandwidth, but also less control of results (data is unknown because not shipped).
      Data shipping: demands more bandwidth, but total control over results.
    5. Service vs. server - NOT a TRADE-OFF, but definitions.
      Server: physical resource, can hold several services.
      Service: logical resource provided in a system, hosted in a server.
    6. Scale vs. performance
      Scalability: the larger the system, the harder to keep it sync, so performance (coordination effort), so performance loss, but all benefits of scalability (more reliable, available).
      Performance: smaller systems demand less msg exchange, so more performance.
    7. Openness vs. determinacy
      Openness: the more open a system is (i.e. compatible to 3rd.part systems), the less deterministic is its behaviour, due to high number of unknown events that can happen.
      Determinacy: the less the system, the higher is the determinacy level, because behaviour well known.
    8. Synchrony vs. Asynchrony - I suppose it's meant synchronous vs. asynchronous systems?
      Synchronous systems: hard to obtain (demands real time in every process, so high sync effort), but ensures timeliness properties, so simplifies applications.
      Asynchronous systems: simpler to implement, but apps must cope with timeout issues (if fully async system, FLP impossibility result applies!).
    9. Are there any more trade-off that you can come up with? ?
  2. Explain the importance of a Failure Model. Give examples of a few models and where they might be useful.
    A failure model extends a component's specification to include ways it can legally fail (i.e. without violating its specification). This is important to limit the possible number of failures which must be handled, being those the ones included in the spec. Examples:
  • Omissive failure model: (time domain)
    • Crash or fail-silent: fails "cleanly"
    • Omission: process doesn't respond any more
    • Timing: early or late responses
  • Assertive failure model: (value domain)
    • Sintactic: wrong data structure
    • Semantic: wrong data (e.g. wrong data range)
  • Byzantine failure model: "two-face-behaviour"

    Where might they be useful??

  1. Explain the consequence of the FLP Impossibility result, published by Fischer, Lynch and Patterson in 1985.
    It says that in a fully asynchronous system, if faults occur, it may be impossible to guarantee the deterministic solution of basic problems such as consensus, because msg delivery delays are unbounded or unknown. Consequence is that you must use means to avoid such a situation like delay timeouts.
  2. Three main activities of distributed applications are: a) Coordination, Sharing and Replication. Explain these terms using examples of real-world problems. Also try to find a problem that can/should/must be solved with a combination of them.
  • Coordination: participants cooperate to achieve a common goal.
  • Sharing: participants compete for resources.
  • Replication: performing same sequence of actions or maintaining a set of duplicated data items.
  • Examples: DSM (Distributed Shared Memory) uses sharing, coordination (write/read) and can use replication (for backup). In a dist. database (say, Amazon.com or E-bay) sharing (database access), replication (backup) and coordination (sync among nodes) are used.


  1. What is a Message Bus?
    Is a shared medium over which participants can exchange msgs. It can be only a virtual/logical or even physical.
  2. What is the difference between "push" and "pull" semantics of a pub-sub system?
    In "push" strategy immediately when a msg is published on the bus it's sent to every subscriber, it's a so called active bus.
    In "pull" strategy the subscribers must periodically poll the bus in order to check the availability and get the msgs, this is called passive bus.
  3. What are the benefits of using a pub-sub protocol compared to a multicast one? Which systems cannot use one or the other
    Benefits of Pub-sub protocols are: temporal decoupling - publisher and subscriber do not have any time relationship, are not synchronized = must not coexist temporally - and spatial decoupling - publisher and subscriber must not know each other.
    In a multicast protocol nodes must know each other and are temporal coupled.

    In pratice, pub-sub systems with "push" semantics are implemented as multicast systems.

    Pub-sub systems cannot be used in systems with strong timing constraints, like hard RT systems, because in this case temporal coupling is needed. This I copied from exercise notes, but I'm not really sure it's correct.

Voting Algorithms

  1. One approach to include non-determinism in a replication scheme could be to let each node vote on the result of a computation. How many nodes have to agree on the same result for the replication to work?
    Considering each node has one vote, it's a case of simple majority, so half plus one. Or??
    I guess you thought of weighted voting. But if there is no state-update scheme every node needs to agree on the outcome or otherwise they will be out of sync. --chrschn
  2. A simple voting scheme is to have all nodes to agree on a result before committing it. This is of course not very efficient (why?). Instead weighted voting can be used, explain how this works. Especially explain the requirements 2w > v and w + r > v where v is the votes in the system, r the votes in a read quorum and w the votes in a write quorum.
    Consensus is not very efficient because:
  • you must wait for all msgs to be received in all nodes in order to agree and commit the operation.
  • if any node fails, systems freezes, so every node is a SPF.

    In weighted voting each node is assigned some votes, not necessarily one. Requirements are:
  • 2w > v: guarantees that only one write operation will be granted execution.
  • w + r > v: ensures that either one write or one read operation, but NOT BOTH, will be granted execution. So no writing and reading simultaneously in different partitions [ chrschn]. This prevents WR and RD of the same variable.
  1. A distributed system with five nodes is to implement a weighted voting algorithm. Assign weights to the nodes such that two failing (silent) nodes can be tolerated. One node can be assumed to be fault-free.
    One option is:
  • five nodes: 4 usual nodes with weight ONE, 1 fault-free node with weight THREE.
  • w = 4
  • r = 4
  • Total votes v = 7

    Checking conditions:
    2w > v : 2 x 4 = 8 > 7 : OK
    r + w > v: 4 + 4 = 8 > 7 : OK
    Conclusion: up to 3 failing nodes are tolerated.

Nach oben

Recent Changes

Nach oben

Zuletzt geändert am 16 März 2005 09:56 Uhr von chrschn