HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

How does paxos algorithm handle partial failures of accept messages?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
handleacceptfailuresalgorithmpartialmessagesdoeshowpaxos

Problem

I've been reading the basic paxos algorithm from the "Paxos Made Simple" paper. I can't understand how paxos algorithm maintains the safety properties under partial failures of the accept messages, meaning cases where the proposer receives prepare responses from a majority, but the subsequent accept message does not reach all the acceptors. It seems to me this could lead in a violation of the safety properties, where there will be no majority that has agreed on the same value.

I created a contrived example of the use-case I am thinking about. In every round, the proposer sends a prepare message to a majority of nodes, gets a response from all of them, but then the accept message arrives only to one of them (the other ones get lost). More specifically, as shown in the image below:

  • in round 1, the proposer sends a propose (1) message. The acceptors haven't received any prepare message yet, so they all reply with a promise. The proposer gets a majority in phase 1, so sends an accept(1, A) message. This message reaches the first node that accepts it.



  • in round 2, the proposer sends a propose (2) message to a majority, which does not contain the node that has accepted a value. As a result, the proposes gets a majority and sends an accept(2, B) message, which reaches only the second node.



  • the round 3 is similar, where none of the selected acceptors during the first phase have accepted any value yet. As a result, the proposer picks a value and sends an accept(3, C) message, which reaches only the third node.



  • in round 4, the majority has to contain at least one of the acceptors that has accepted a value. Let's say this is the first node that has accepted the value A, which returns the number of the proposal (1) as a reply to the prepare(4) message the proposer sent. Since there's only one accepted value, the proposer uses this one and sends an accept(4, A) message, which reaches the fourth node.



  • in round 5, the proposer sends a propose(5) message to a majorit

Solution

The paper's specification for the phase 2(b) of the algorithm, which is how an acceptor responds to an accept message is the following:


If an acceptor receives an accept request for a proposal numbered n,
it accepts the proposal unless it has already responded to a prepare
request having a number greater than n.

This means that acceptors are allowed to accept multiple proposals (even with different values) in a single instance of Paxos, as long as they have not responded to a prepare request with a number greater than the number of the accept message. As a result, even after the last row of the example, a proposer could bring the system to an agreement, by making some of the nodes accept a different value.

For example, the proposer could send a subsequent prepare(6) message to nodes 1,2,5,6, followed by an accept(6, B) that reaches everyone, thus reaching a majority and concluding the consensus. The important thing to note is that after this point, it's impossible to form a majority with a different value, because any potential majority quorum would contain at least one node returning (6, B) during the prepare phase.

Context

StackExchange Computer Science Q#104890, answer score: 2

Revisions (0)

No revisions yet.