To add a question because something is unclear or was not understood, just insert the question and add the prefix `%q%` for each addition (`q` like question). This is the "question-style". Like this:

 `* %q% What kind of problems could have decentralized nature?` What kind of problems could have decentralized nature?

If you want to answer a question or add a comment please put a `%a%` in front. This is thea "answer-style" (`a` lilke answer). An example:

 `* %a% This is an addition to something that I consider important.` This is an addition to something that I consider important.

For citations or references to the slides of Prof. Suri pleas add the lecture and slide number in braces: `(<lecture>.<slide>)`.

Please make sure that you enter an author name, else your changes will not be saved!

# Exercise 3

## Timing in Distributed Systems

1. What is the difference between logical time and real time? Why is not real time being used for all distributed systems?
In the slides of lecture 3, the terms "logical time" and "logical clock" get mixed with the terms "local time" and "local clock". The local clock is defined as the physical clock of the device (3.6). The "logical time" is first introduced in 3.24, in combination with the "happend-before" relation, but also used in the diagram on slide 3.10, which actually shows local clocks. I'm not sure which definition is acutally meant here, so I use the "logcial time" as introduced on slide 3.24.

The use of logical time is limited to the order in which events take place, and not the duration between sent and receipt of a message, for instance (with respects to the the "happend-before" relation). If one event a precedes a non-concurrent event b, than the logical time of event a is smaller than the time of b: LC(a) < LC(b).

Of course real time could be used for each message, but this requires that all participants have the real time available which is hard to achieve:
1. The local hardware clocks of the devices slowly drift away.
2. Time synchronization algorithms over a network suffer from the network latency.
I think it's meant: logical time = local clocks (time as measured within a machine, computer). So:
1. real time: monotonically increasing CONTINUOUS function
2. logical time (local clock): physical clocks, monotonically increasing DISCRETE function, which has a granularity and drifts.
Why not... is correct answered above.
2. What is a Timestamp?
A time stamp is some time notion (logical or real time) that is attached to an event.
3. What kind of problems are Local Clocks useful for? What is needed in order to use the local clock as a global clock?
The local clock can be used for measuring time intervals where both start and end time origin from the local clock. For example round-trip delays, local timeouts or local timers fulfill that condition.

To use the local clock as global clock the system needs to know the time difference between both clocks, because usually it has no global clock built in. Some time source in the network must exist which provides either the time difference for that client directely (active time server) or the global time so that the client can calculate its difference to the global clock (passive time server).
4. Explain the following properties of a global clock algorithm:
1. Granularity (gV)
The smallest measurable time interval, the atomic period of time
2. Precision (πV)
The maximum absolute time difference between any two clients (the lower-right distance on slide 3.8 which just shows a "v")
3. Accuracy (αV)
The maximum absolute time difference between the time server/global clock and any of the clients (the upper-left distance on slide 3.8 which just shows a "v")
4. Skew (δ)
I'm not sure about the following because it is not clearly defined. It's just one possible interpretation of slide 3.10. I will have to get some more literature.
The skew between to clients A and B is the maximum absolute difference between their clocks. This maximum is reached at the end of the re-sync period, so that for any time t holds |CA(t) - CB(t)| < σ. The skew for two clients depends on their drift rates.
5. Drift rate (ρ)
The time amount per second the local clock runs faster or slower than the real time.
5. What is the difference between convergence and consensus clock synchronization algorithms?
• Convergence clock synchronization: Clocks are approximately in sync, within some range X. The client's clocks converge towards the global time.
• Convergence clock synchronization: The clocks are exactely synchronized.
6. What is UTC?
UTC stands for Universal Time, Coordianted. This is the legal world time with a general time zone assigned. UTC is the successor of the Greenwich Mean/Meridian Time (GMT). The Central European Time CET is UTC + 1 hour (without time savings).
7. Explain how the Cristian Algorithm and the Berkeley Algorithm achieve clock synchronization. What is the main difference between the two algorithms?
• Cristian's Algorithm uses a passive time server that has an exact time source, e. g. WWV. The clients periodically ask the time server for the time and measure the round-trip time between request and receipt of the time. So they can add the half of the round-trip time to the server's time to compensate the network latency.
• The Berkeley Algorithm uses an active time server without any reliable time source. The time server periodically sends its time to all the clients which reply with the difference of their time to the server's time. The server computes the average of all time differences and sends correction values to each client and itself. The resulting time may drift from the UTC.
8. Revisited: What is the difference between an Asynchronous and a Synchronous distributed algorithm?
• Synchronous: An upper limit for any delays in actions exist. This means the time a client needs to wait for some event is bounded.
• Asynchronous: No such upper bounds exist, processing time and network latency may be arbitrary long.
9. Why are not all distributed systems synchronous?
It is very difficult to guaranty an upper limit for all events. This implies that all resources and participants involved are also synchronous and could lead to some other limitations.

But synchrony might not always be needed. The internet, for example, also uses the "best-effort" method for the data transfer, and it works surprisingly well.
Synchrony often conflicts with some desirable features like e.g. resource sharing, openness, interactivity, scalability.

## Ordering

1. How can a posteriori ordering be achieved in a simple way in a distributed system?
Reviewing the log files and try to determine the causality of the events that happened.
Just generate log files using timestamps and actions taken.
2. Explain the difference between these different types of orderings:
1. Physical order
This is the objective "real-world" order in that all events occured.
2. Potential causal order
The potential causal order tracks which events are eventually related, e. g. by timestamping all events. A physical order is a potential causal order.
3. Happened-before order
For two events in a happened-before order one took place before the other. This ordering leaves some events uncomparable, e. g. for concurrent processes that don't communicate with each other.
3. Explain why the happened-before ordering is transitive.
If event A occured before B, but B occured before C, then A also happened before C, of course.
4. Describe the relations between the events in Figure 1 using the one of the following relations: `-->`, `<--`, `-/->`, `||`
1. f and j:     f `-->` j
2. a and m:    a `||` m Is that right? I'm not sure. Yes, it's correct.
3. c and j:     c `-->` j
4. k and e:     k `-->` e
5. j and k:     j `<--` k
• • Figure 1: The events in a distributed computation.
5. Why is happened-before ordering considered to be too pessimistic?
Because a fact that a an event happened-before some other event doesn't necessarily mean they're causally related.
6. Describe how happened-before (Causal Ordering) can be implemented in a message passing system.
I say just by using Lamport clocks. Is it that simple?
7. Explain how Lamport clocks can be used to implement Event Timestamps that are monotonically increasing and with the requirement that all processes agree on the same value.
Each process has a local clock i.e. a counter increased by one at each event (send, process, receive). Local cloks are all initialized to zero. In each msg sent the process piggybacks the actual local clock value. In each msg received the process compares it's local clock value with the received clock value plus one (the receive event itself). If greater, the received value is adopted as local one. After some exchanging of messages all processes have the same clock value. Is that correct?
8. Logical clocks give rise to a partial ordering on events. Describe what a partial ordering is, and show that the following orderings are/are not partial orderings:
1. For the set of exams the assistant is grading; the number of points on the exam.
This is a partial ordering. Let E the set of all exams, and for e in E let P(e) the number of points on the exam. Then there might be some exams e != e' with P(e) = P(e'). So the order is partial.
2. A bridge can hold only one person at the time. For the set of persons passing across the bridge; the time at which they step on to the bridge.
This is a total ordering. Let P the set of people, and for p in P let T(p) the time they step on the bridge. Because the bridge can be passed by only one person at a time T(p) = T(p') requires p = p'. This shows that the order is total.
9. Describe a simple implementation that will enforce FIFO Ordering of messages between two nodes in a distributed system. Why may FIFO not be sufficient when more than two nodes communicate with each-other?
For a FIFO ordering between two nodes each sender can tag the messages it sends with a monotonically increasing sequence number. The receiver can order the received messages before they get delivered.

For more than two nodes the sequence numbers can be used per host-to-host communication channel, but this only helps that the messages "per channel" are delivered in the FIFO order. As soon as A, B and C talk to each other no FIFO ordering of all messages is possible this way, because the sequence numbers of the hosts don't reflect any order between each other. Or: because there's no identification of the sender, just of the msg number. So if a node sees more than one sender (at least 2 other nodes, so 3 nodes at all), it can't tell who sent what, and no FIFO ordering is possible.
10. Describe a protocol that implements Total Ordering in a distributed message passing system.
We use a protocol that gives us a partial ordering, for example the happened-before relation with Lamport clocks. Then we define a total order TO on that partial order PO based on the unique host ID (two messages from the same host will never have the same logical timestamp).

Let ID(m) the unique host ID of the sender of m. For any two messages m != m' we define:
1. PO(m) < PO(m') => TO(m) = PO(m)
2. PO(m) > PO(m') => TO(m) = PO(m)
3. PO(m) = PO(m') => TO(m) = ID(m)
11. Explain what a Covert Channel is and why it can "destroy" a total ordering protocol.
A covert channel is some source of events that is not done by means of the ordering protocol, like a sensor for some physical occurence. This means there exists some causality in the system that can't be tracked down by other participants that follow the protocol and don't know about this external events. The events are not reflected by the protocol order, so they lead to wrong states in the system.
Covert channel = hidden channel = cohort = another comm channel besides the ordering protocol, like e.g. other protocols (RPC/RMI) or the physical environment.

I was trying to build some example, but I can't imagine anything which points out the ordering isse. It always leads to some "lost-message" problem, but not really to an order problem.
Take the examples in (3.42): on the left one, a node A sends a msg to say a central node using the ordering protocol. Then node A sends another msg to say node B using RPC, which then generates a response msg from node B to the central node. Well, if this consequent msg is received/delivered by the central node BEFORE the first msg from node A gets there, it's receiving a response msg of something that theoretically NEVER happened!
Same goes for the right side, but using the physical environment "protocol" instead of a comm. one.

## Recent Changes

Zuletzt geändert am 10 März 2005 11:56 Uhr von chrschn