patternMinor
Mutual exclusion algorithm using only atomic reads and writes for unknown number of processes and robust to process abortion
Viewed 0 times
exclusionwritesnumberrobustprocessunknownabortionalgorithmreadsprocesses
Problem
I am trying to come up with a mutual exclusion algorithm that is based only on atomic reads and atomic writes of shared memory (i.e. no compare-and-swap or similar).
Apart from mutual exclusion and deadlock freedom, it needs to fulfill the following properties:
I have come up with a way that looks to me like it could work, but I wanted to run it by you guys to see if I made a mistake somewhere or if there is a more elegant solution.
The idea is to combine a modified version of Szymanski's Three-Bit Linear Wait Algorithm (Figure 1 on page 4 of "Mutual Exclusion Revisited") with a process-naming-scheme based loosely on Moir and Anderson's "Wait-Free Algorithms for Fast, Long-Lived Renaming" (M&A).
As a first step, I define an "order" for incoming processes by assigning them an ID like M&A:
M&A use the following mutex primitive (see Figure 2 on page 5):
where out of n processes entering the block, at most 1 will stop there, at most n - 1 will move right, and at most n - 1 will move down.
These primitives can be linked together into a grid, where I decided to number the boxes differently from M&A, to allow me to calculate the box number without knowing the total number of boxes:
The box number can be calculated using
Any new process will initially be assigned a large, globally unique ID (e.g. timestamp + huge random number), which it will use as the value for p in the mutex primitive algorithm a
Apart from mutual exclusion and deadlock freedom, it needs to fulfill the following properties:
- It needs to be robust in the face of processes dying at any point
- It needs to handle an ex-ante unknown number of processes competing for the lock
- Every few seconds I need one of the processes to enter the critical section (I don't care which)
- It should be as memory-efficient as possible
- All my processes share a reasonably synchronized clock-source (sub-second accuracy)
I have come up with a way that looks to me like it could work, but I wanted to run it by you guys to see if I made a mistake somewhere or if there is a more elegant solution.
The idea is to combine a modified version of Szymanski's Three-Bit Linear Wait Algorithm (Figure 1 on page 4 of "Mutual Exclusion Revisited") with a process-naming-scheme based loosely on Moir and Anderson's "Wait-Free Algorithms for Fast, Long-Lived Renaming" (M&A).
As a first step, I define an "order" for incoming processes by assigning them an ID like M&A:
M&A use the following mutex primitive (see Figure 2 on page 5):
where out of n processes entering the block, at most 1 will stop there, at most n - 1 will move right, and at most n - 1 will move down.
These primitives can be linked together into a grid, where I decided to number the boxes differently from M&A, to allow me to calculate the box number without knowing the total number of boxes:
The box number can be calculated using
box = (m*(m-1))/2 + r, where m is the number of total moves made (including the move into the top-left box, i.e. m ≥ 1) and r is the number of moves to the right (r ≥ 0).Any new process will initially be assigned a large, globally unique ID (e.g. timestamp + huge random number), which it will use as the value for p in the mutex primitive algorithm a
Solution
Since I don't really care about fairness or starvation (it doesn't matter which process enters the critical section), I don't really need to use an algorithm as complex as Szymanski's (also pointed out by @WanderingLogic in a comment above).
I found a quite beautiful alternative: An algorithm by Burns and Lynch:
You can find it on page 4 (836) of their paper "Mutual Exclusion Using Indivisible Reads and Writes".
It has a couple of major advantages:
Aside: The idea of this algorithm can be visualized as follows:
I'm a process and I'm standing somewhere in a row with other people (all other processes).
When I want to enter the critical section, I do the following:
I implemented the whole idea from above (including M&A) with this algorithm and am currently testing the heck out of it and so far it seems very stable.
The implementation is very straight forward. There were really only two additional caveats I had to consider (unless, of course, I missed something (pointers welcome)):
Due to the shared data-structures that my code uses, it's currently not really in a state that makes it meaningful to post here, but with a little bit of work, I could put together a self-contained version. If anyone is interested, please leave a comment and I will do so.
I found a quite beautiful alternative: An algorithm by Burns and Lynch:
program Process_i;
type flag = (down, up);
shared var F : array [1..N] of flag;
var j : 1..N;
begin
while true do begin
1: F[i] := down;
2: remainder; (* remainder region *)
3: F[i] := down;
4: for j := 1 to i-1 do
if F[j] = up then goto 3;
5: F[i] := up;
6: for j := 1 to i-1 do
if F[j] = up then goto 3;
7: for j := i+1 to N do
if F[j] = up then goto 7;
8: critical; (* critical region *)
end
end.You can find it on page 4 (836) of their paper "Mutual Exclusion Using Indivisible Reads and Writes".
It has a couple of major advantages:
- It is much simpler
- It uses less memory (In fact, they state that their algorithm is optimal in that respect)
- All shared memory has only a single writer (but multiple readers, of course)
- It is easy to turn the busy-waits into delayed retries (see here)
Aside: The idea of this algorithm can be visualized as follows:
I'm a process and I'm standing somewhere in a row with other people (all other processes).
When I want to enter the critical section, I do the following:
- 3/4: I look to my left and keep my hand down as long as I see someone with their hand up.
- 5: If no-one to my left has their hand up, I put mine up.
- 6: I check again if anyone to my left has meanwhile put their hand up. If so, I put mine back down and start over. Otherwise, I keep my hand up.
- 7: Everyone to my right goes first, so I look to my right and wait until I don't see any hands up.
- 8: Once all hands to my right are down, I can enter the critical section.
- 1: When I'm done, I put my hand back down.
I implemented the whole idea from above (including M&A) with this algorithm and am currently testing the heck out of it and so far it seems very stable.
The implementation is very straight forward. There were really only two additional caveats I had to consider (unless, of course, I missed something (pointers welcome)):
- If a process makes it to line 7 in the algorithm, it might end up hitting the
goto, in which case, in my implementation (see here), the entry to the critical section is denied and the process tries again later. When the retry happens (possibly with a big delay), I need to refresh the process's flag in such a way that it didn't have even the tiniest moment during which the flag was expired, or, if it did, I need to detect that and jump back to line 3 in the algorithm instead of continuing at line 7.
- I added a check whether the critical section needs to be entered at all (i.e. a statement that limits the rate at which the critical section is entered). It is most efficient to perform this check right before a process even tries to enter the critical section. To make 100% sure that no thread can ever exceed the rate limit, though, a second check will need to be performed when a process successfully entered the critical section.
Due to the shared data-structures that my code uses, it's currently not really in a state that makes it meaningful to post here, but with a little bit of work, I could put together a self-contained version. If anyone is interested, please leave a comment and I will do so.
Code Snippets
program Process_i;
type flag = (down, up);
shared var F : array [1..N] of flag;
var j : 1..N;
begin
while true do begin
1: F[i] := down;
2: remainder; (* remainder region *)
3: F[i] := down;
4: for j := 1 to i-1 do
if F[j] = up then goto 3;
5: F[i] := up;
6: for j := 1 to i-1 do
if F[j] = up then goto 3;
7: for j := i+1 to N do
if F[j] = up then goto 7;
8: critical; (* critical region *)
end
end.Context
StackExchange Computer Science Q#48466, answer score: 2
Revisions (0)
No revisions yet.