Dhrubajyoti Ghosh

Snapshots of shared memory

We have an \( n \)-length array \( \texttt{A} \) where each \( \texttt{A[i]} \) is updated by a different writer \( \texttt{W[i]} \). There is also a reader \( \texttt{R} \) who reads \( \texttt{A} \) from left to right and whose task is to return a snapshot reflecting the contents of \( \texttt{A} \) at some time.

The task is not straightforward. Due to the ever-changing nature of \( \texttt{A} \), simply reading it sequentially might return a snapshot that \( \texttt{A} \) was never equal to!

The reader and writers need to work together to avoid this issue. It will be assumed that

the read of a single cell and a write to that cell do not overlap (think of them as atomic operations),
\( \texttt{R} \) should not return an 'old' valid snapshot, i.e. in the order \( \texttt{A=V1} \rightarrow \texttt{A=V2} \rightarrow \texttt{R's read} \) (with none of the events overlapping), \( \texttt{R} \) should not return \( \texttt{V1} \) as its snapshot.

Solution #1

In addition to writing a value to its cell, the writer also includes the timestamp.

The reader keeps reading the array in an indefinite loop (call each iteration a \( \texttt{collect} \) ), until two consecutive \( \texttt{collects} \) return the same snapshot. This snapshot is valid since it must have existed at any point in the time gap between the two \( \texttt{collect} \)s (timestamping the values is crucial in ensuring this).

The issue here is that \( \texttt{R} \) might keep on iterating this loop forever because the \( \texttt{collect} \)s could always keep changing.

Solution #2

In #1, the writers were causing the trouble and did not bother to do anything about it. They will have to 'help' \( \texttt{R} \) by taking a valid snapshot of \( \texttt{A} \) before writing (so a write is now a read followed by the update), and if \( \texttt{R} \) is stuck on the loop for too long, it uses those snapshots. At this point some things are not clear:

how the writers manage to always take a valid snapshot, and
how and when \( \texttt{R} \) chooses among these snapshots so as to not violate B.

The solution is a recursive one. A read \( \texttt{r} \) operates as follows (note that writers and \( \texttt{R} \) both implement reads now):

Keep repeating \( \texttt{collect} \) as before. If two consecutive \( \texttt{collect} \)s are the same, use that \( \texttt{collect} \) as it is a valid snapshot. If this does not happen, eventually there must be some writer \( \texttt{W} \) who has changed(/updated) the array twice (PHP).
This means \( \texttt{W} \) must have executed an entire write (starting after its first update) within the interval in which read \( \texttt{r} \) was executed.
This is the cue for recursion, since \( \texttt{W} \)'s write has a read embedded in it. End the definition of \( \texttt{r} \) by requiring that IF \( \texttt{W} \)'s (embedded) read had returned a snapshot to \( \texttt{W} \), then \( \texttt{r} \) should use that snapshot, otherwise it returns nothing.

In fact, \( \texttt{W} \)'s embedded read SHOULD have returned a snapshot and moreover the snapshot is valid (meaning \( \texttt{r} \) always returns a valid snapshot). To see this, look at \( \texttt{W} \)'s embedded read - either it must have had two consecutive \( \texttt{collect} \)s in which case we do have a valid snapshot, or there must be a writer \( \texttt{W'} \) whose write interval falls within \( \texttt{W} \)'s embedded read interval. Clearly \( \texttt{W'} \neq \texttt{W} \). Apply the same reasoning to \( \texttt{W'} \). How long can this go on recursively? There are at most \( n \) writers. Down the line, we will run out of writers and there will be some super-embedded read which finds two consecutive equal \( \texttt{collect} \)s and thus has a valid snapshot. This snapshot gets carried up the recursion chain and is returned by \( \texttt{r} \).

Illustrations for #2 coming up.

References

Atomic Snapshots of Shared Memory, Yehuda Afek, Hagit Attiya, Danny Dolev, Eli Gafni, Michael Merritt, Nir Shavit
The Art of Multiprocessor Programming, by Maurice Herlihy and Nir Shavit.