debugMinor

Contamination in Atomic Broadcast in crash stop failure model

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

stopmodelbroadcastcontaminationfailurecrashatomic

Problem

I'm reading several papers about different broadcast algorithms. Most of them are designed for crash stop model but they also consider contamination scenarios.

For example for a Causal Atomic Broadcast when a faulty process
fails to delivers messages in order and then starts broadcasting other
messages based on its inconsistent state which causes inconsistent
state in other correct processes.

The above is just an example why a specific process could have such a weird behavior although our failure model is crash stop. Follwoing paper describes it more.

A Modular Approach to Fault-tolerant Broadcasts and Related Problems.
V. Hadzilacos and S. Toueg, Distributed Systems, 97-145, 1993.

See here section 3.11 (page 22) for more sophisticated example. This is really confusing to me. Why do we don't consider this type of failure as byzantine failures? In crash stop model a single failure should result in a crash. A faulty process can NOT continue to run without crashing. What am I missing?

Solution

I think you are confusing the failure model with the correctness goals of a protocol.

-
A failure model specifies what kinds of things can go wrong (i.e., what kinds of failures the protocol is designed to deal with and promises to be able to handle).

-
A correctness goal describes what properties the protocol promises to provide (i.e., what properties it needs to achieve, to qualify as correct).

We then analyze whether a particular protocol meets a particular correctness goal in a particular failure model. For example, protocol might achieve the correctness goal in one failure model but not in another.

Example of failure models include crash-only; or Byzantine failures; or others. See Sections 2.3 and 2.6 for examples of kinds of failures. A failure model would specify which of those failures are in scope (and thus, implicitly, which are out of scope).

An example of a correctness goal is "Causal Broadcast" or "Atomic Broadcast" or "Causal Atomic Broadcast". For instance, the correctness goal that we call "Causal Broadcast" is that the protocol must ensure Causal Order for all delivered messages (see Section 3.3 of the paper you mention). There might be a protocol that achieves Causal Order in the presence of crashes but not in the presence of Byzantine failures.

With that understanding, Section 3.11 of your paper starts out with an example of a failure that can be problematic. That's a type of failure that isn't just a crash. Yes, it could be considered an instance of a Byzantine failure. However, it is also a more specific type of failure; one can articulate a more specific failure model that is more constrained than arbitrary Byzantine failures, but broader than just crashes, and which includes that failure as the sort of failure that can occur. It may be useful to think about what is achievable in that intermediate failure model. There are more failure models in the world than just "crash stop" and "Byzantine".

Context

StackExchange Computer Science Q#75822, answer score: 2

Revisions (0)

No revisions yet.