<< Exercise 3 | Index | Exercise 5 >>

To add a question because something is unclear or was not understood, just insert the question and add the prefix %q% for each addition (q like question). This is the "question-style". Like this:

* %q% What kind of problems could have decentralized nature?
  • What kind of problems could have decentralized nature?

If you want to answer a question or add a comment please put a %a% in front. This is thea "answer-style" (a lilke answer). An example:

* %a% This is an addition to something that I consider important.
  • This is an addition to something that I consider important.

For citations or references to the slides of Prof. Suri pleas add the lecture and slide number in braces: (<lecture>.<slide>).

Please make sure that you enter an author name, else your changes will not be saved!

Exercise 4

Leader Election and Deadlock

  1. An algorithm that uses one node to coordinate the work of the whole system may be helped by a Leader Election algorithm. Explain how timestamps can be used to implement such an algorithm. Is causal ordering enough for messages in this algorithm, why/why not?
    When a new leader needs to be elected, every client requests to become the leader, similar to distributed request for a LOCK. The clients add a time stamp to their leader-request, and the client with the lowest timestamp (or, if the timestamps are identical, with the lowest order according to the total ordering protocol of the system) is granted to become the leader. This decision is made after the clients have received LEADER request from all other clients.

    Causal ordering is sufficient for this algorithm because the clients wait with their leader decision until they have received the LEADER request from all other nodes, and all messages are timestamped with the clients' local time.
    I'm not really sure about this.
    I think it's enough because you wait for all msgs and check their contents (timestamps), so order doesn't really matters.
  2. Describe the Bully algorithm and the Ring algorithm to implement leader election using examples of at least 6 nodes.
    Assume we have 6 nodes P1 to P6. For the ring algorithm they have their logical order P3, P1, P2, P5, P4, P6, P3. Node P3 initiates the leader election because the previous leader P6 has crashed.
    • Bully algorithm
      1. P3 sends ELECTION to P4, P5, P6
      2. P3 receives OK from P4, P5
      3. P4 sends ELECTION to P5, P6 and
        P5 sends ELECTION to P6
      4. P4 receives OK from P5 and
        P5 receives no response
      5. P5 sends COORDINATOR to P1, ..., P4 to indicate it is the new leader
    • Ring algorithm
      1. P3 sends ELECTION [3] to P1
      2. P1 sends ACK to P3 and
        P1 sends ELECTION [3, 1] to P2
      3. P2 sends ACK to P1 and
        P2 sends ELECTION [3, 1, 2] to P5
      4. P5 sends ACK to P2 and
        P5 sends ELECTION [3, 1, 2, 5] to P4
      5. P4 sends ACK to P5 and
        P4 sends ELECTION [3, 1, 2, 5, 4] to P6
      6. P4 doesn't receive an ACK, so
        P4 sends ELECTION [3, 1, 2, 5, 4] to P3
      7. P3 sends ACK to P4 and recognizes that this was the election it initiated (because it is the first node on the list).
        P3 sends COORDINATOR 5 to P1
      8. P1 sends ACK to P3 and
        P1 sends COORDINATOR 5 to P2
      9. ...
        (message goes round once again).
  3. What is the difference between Deadlock Avoidance techniques and Deadline Detection and Resolution techniques?
    I guess the exercise should read "Deadlock Detection and Resolution"?
    Deadlock avoidance tries to prevent the occurence of deadlocks. This can be done by design through eliminationg one of the four preconditions for deadlocks. Another way is to track down resource requests and detect possible cirular dependencies so that requesting a conflicting resource can be denied until other processes release resources, so that the cirular dependency is broken.

    Deadlock detection comes into action if a deadlock has already occured. It tries to detect processes that don't make progress anymore and kills these processes until normal operation can continue.
  4. There are four conditions for a deadlock situation to appear. To eliminate deadlocks by design one of them needs to be eliminated. Which one will be the hardest to eliminate respectively easiest, and why?
    I think mutual exclusion is the hardest condition because for parallel operations on a shared resource it is required to perform some atomic (mutual excluded) operations on it to keep data consistency. The only work-around would be not to share the resource, which contradicts the idea of a distributed system (or at least some demanded properties of a DS).

    The easiest approch would be to eliminate the hold-and-wait condition. This would require processes to request all resources they need for some action at once. If a process holds resources and needs some more it must release all resources first and then re-request them plus the additional resource. To cope with the "request resources at once" a process can request each needed resource one after the other. If a resource is busy and can't be locked, it must release all previously requested resources and start requesting them again from the beginning.

    Of course this overhead reduces performance on the one hand, but it eliminates deadlocks on the other hand.
  5. The term (non-)preemption is also used in the area of Real-Time systems with regards to the scheduling algorithms. What does it mean in this context? Which "resource" is shared in this context? Can you have deadlocks here too?
    The scheduling strategies can be devided into preemptive, where processes get the CPU for limited time intervals, and non-preemptive, where processes get the CPU until they release it. So the shared resource here is the CPU.

    The problem with non-preemtive schedulers is that the operating system cannot force a process to stop working. Only a system call or an I/O operation interrupts the process. As there is no other resource involved no cirular waiting can occur and thus no deadlocks (regarding the CPU, not the rest of the system).
  6. What is livelock? What would be a livelock situation for the dining philosophers? What is a race condition?
    A livelock is a situation where processes are not waiting passively but still can't make progress. For the dining philosophers this would be a situation where all philophers absolutely synchroneous try to grep the left stick, can't get the right stick, release the left stick, grep the right stick, can't get the left stick, release the right stick, grep the left stick, ...

    I have already described race conditions here.


  1. Explain the concept of consistency in a distributed system. What is an inconsistency?
    The system state has some integrity constrains which must be true at any time. If all constrains are true, the distributed system is consitent, else it is inconsistend.

    If the global state of the distributed system is observed it is important to have a consintent view of the system, e. g. through a (strongly) consitent cut in the timeline diagram.
  2. To evaluate consistency the concept of state cut (or snapshot) is used. Explain the difficulties of taking a cut of a distributed global state.
    When a normal message is sent to all clients requesting to take a snapshot of their current state this could possibly be an inconsistent view of the system because of the network latency (the messages need to be delivered to each client) and the messages in transit while the snapshot of each client is taken.
  3. Describe the Chandy & Lamport snapshot protocol using an example. Why do es the algorithm require directed FIFO channels? Can it not be used in a bus environment?
    The algorithm works as follows: The initiating client sends a special MARKER message to all of its outgoing channels and saves a snapshot of its state. It continues to operate normally until it has received MARKER messages from all of its incoming channels. The other clients receive the MARKER and immediately save their local state. Then they also sends MARKER messages to all outgoing channels and wait until they receive MARKERs from all incoming channels. They add the channel state (all messages received between the first and the last MARKER) to the saved state.

    With this states the global state is captured, all local states and messages in transit are included. This is a consitend cut (not strongly consitend).

    This algorithm needs FIFO channels so that the clients know which messages have been sent before the MARKER, and which afterwards. The directed channels are needed so that the clients can distinguish the channels they have to send MARKERs to and the channels they need to wait for MARKERs.
  4. What is replication? What is object replication?
    Replication is spreading information across (multiple) sites.
  5. Explain how replication can achieve both reliability and availability.
    If information is replicated it is stored multiple times at multiple locations. In the case the information can't be accessed at one location it can be retrieved from another location. Thus the availability and in parallel the reliability increases.
  6. Can replication speed up writes to a distributed storage?
    In general no because on a write the distributed copies need to be updated or invalidated in some way. So that the change needs to propagate through the network which makes it slower compared to a local operation.
  7. Explain the main differences between the following consistency models:
    1. Strict Consistency
      The read value is always the last (in physical time) written one. This is the same for a distributed as for a standalone system.
    2. Sequential Consistency
      All read operations see the write operations in the same order.
    3. Causal Consistency
      In comparison to sequential consitency only causual related write operations must be seen in the same order, e. g. a write and a rewrite of the same variable.
    4. FIFO Consistency
      Only the writes of each process are seen in the same order by all processes. Writes of different processes my be interleaved.
    5. Weak Consistency?
      All writes can be seen in any order? I'm just guessing, I did not find a definition of this consistency.
  8. What does it mean that a set of transactions coming to a database (from multiple processes) is serializable? Why is this important?
    A sequence of transactions must be executed in serial order only if there is dependency between their results. In any other case their executions can be made in parallel, so that the overall process execution time can be reduced. To the user the result looks exactly the same as if the transactions had been executed in a serial order.
    So, serializable operations can be made in parallel but look like executed in series. It's important to enhance system performance.
  9. For databases one often uses atomic transactions. What are the properties of a (correct) atomic transaction?
    - a rollback mechanism is needed to undo changes in case of failure or abort.
    - persistent storage is needed (data doesn't get lost).
    - changes are only visible after transaction is committed.
    - logs of executed operations must be written.


  1. What is a Network Partition? What does it mean that two (or more) partitions diverge?
    If there's a link failure between nodes of a DS, the whole network can be partitioned, where nodes from one part cannot communicate with nodes of the other part. Due to this the global states of the partitions will be different if both keep running. In this case one says the partitions have diverged.
  2. Describe how the concept of Major and Minor partitions can be used to circumvent the divergence problem.
    Using a deterministic algorithm one must ensure that after a partitioning has occured the separated partitions are independently able to find out if they are the minor or the major partition. Then only the major partition keeps executing, while the minor one stops. In this case no divergence occurs.
  3. Why does the algorithm for determining if a partition is the primary partition have to be deterministic?
    Because no extra information can be obtained from the network, since partitions can't communicate. So a partition must be able to determine if it's the primary one only based on the nodes present in its network part and some criteria.

Consensus and Group Membership

  1. Describe the consensus problem.
    Consensus problem occurs in a DS if there is no central scheduler to tell each node what to do next. So independent nodes must reach an agreement on what to execute. A Distributed Consensus algorithm which solves this problem could be e.g. every node sending to every node a proposal of, say, which task to execute next. Then every node evaluates the values based on a criteria and the reaches a decision, which is the same to all nodes.
  2. Formulate Total Ordering as a consensus problem?
    For every new msg received a process proposes a msg to be delivered, and so does every other process. They receive all proposals and based on a criteria elect the msg to be delivered, which is the same to everyone. So total ordering is achieved.
  3. What is a membership service and what does it provide to its users?
    A membership (group) service provides an updated view of all users in the group and their status.
  4. How would you describe group membership as a consensus problem?
    Every time a process realizes that a member no longer is alive it proposes a new group view to every member in the group. Every one does the same and based on a criteria they decide on which is the new group view.
  5. Describe using an example why group views and messages must be coordinated in a group membership protocol.
    Say you have processes p1, p2, p3, p4. Say p4 has sometimes a slower network link and msgs to it may take longer to arrive. After deciding on a group view, p1 sends msgs to p2,p3,p4 distributing new tasks to be executed. p2 and p3 deliver the msg immediately. After that p2 crashes or leaves the group. p1 realizes and proposes a new group view, which is accepted by everyone, including p4. But p4 hasn't yet received the FIRST msg from p1, the one distributing tasks. So, if p4 gets it now, it will execute a task related to other view (context), which is wrong.
    Solution would be delaying the new view until p4 receives the first msg. Or coordinating msgs and views with some other mechanism.
  6. What is Atomic Broadcast? How can it be solved using consensus?
    Atomic Broadcast is the delivery of all msgs to all nodes at the same time.

    Nodes can use consensus to agree on which and if a msg should be delivered, and thus only delivering msgs when all nodes are able to do it at the same time.


  1. Explain the phases of the 2-phase-commit protocol.
    See Exercise 1, Distributed Systems, Question 6.
  2. How would you have to change the protocol to cope with lost messages in the first phase? In the second?
    Basically one must implemented timeouts. If in the 1st.phase msgs from the coordinator get lost, some nodes won't never get the COMMIT-REQ msg, and the coordinator will note it because it won't receive any AGREE or ABORT from the nodes after a while (timeout). Same happens if a node receives COMMIT-REQ msg and answers but answer msg gets lost.

    In 2nd.phase, if a msg from coordinator to any node gets lost, this node won't commit or abort, but the coordinator will realize it by not receiving any ACK, and can then resend the msg.
  3. How would you cope with the coordinator failing after sending the prepare message?
    The nodes should implement timeouts as well. In this case they'd respond with AGREE/ABORT msgs, but would never get the 2nd.phase msgs (COMMIT/ABORT), so if they have timeouts they can realize it and safely abort the operation.

Nach oben

Recent Changes

Nach oben

Zuletzt geändert am 10 März 2005 16:48 Uhr von chrschn