HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMajor

How to prove greedy algorithm is correct

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
algorithmprovecorrecthowgreedy

Problem

I have a greedy algorithm that I suspect might be correct, but I'm not sure. How do I check whether it is correct? What are the techniques to use for proving a greedy algorithm correct? Are there common patterns or techniques?

I'm hoping this will become a reference question that can be used to point beginners to; hence its broader-than-usual scope. Please take care to give general, didactically presented answers that are illustrated by at least one example but nonetheless cover many situations. Thanks!

Solution

Ultimately, you'll need a mathematical proof of correctness. I'll get to some proof techniques for that below, but first, before diving into that, let me save you some time: before you look for a proof, try random testing.
Random testing

As a first step, I recommend you use random testing to test your algorithm. It's amazing how effective this is: in my experience, for greedy algorithms, random testing seems to be unreasonably effective. Spend 5 minutes coding up your algorithm, and you might save yourself an hour or two trying to come up with a proof.

The basic idea is simple: implement your algorithm. Also, implement a reference algorithm that you know to be correct (e.g., one that exhaustively tries all possibilities and takes the best). It's fine if your reference algorithm is asymptotically inefficient, as you'll only run this on small problem instances. Then, randomly generate one million small problem instances, run both algorithms on each, and check whether your candidate algorithm gives the correct answer in every case.

Empirically, if your candidate greedy algorithm is incorrect, typically you'll often discover this during random testing. If it seems to be correct on all test cases, then you should move on to the next step: coming up with a mathematical proof of correctness.
Mathematical proofs of correctness

OK, so we need to prove our greedy algorithm is correct: that it outputs the optimal solution (or, if there are multiple optimal solutions that are equally good, that it outputs one of them).

The basic principle is an intuitive one:

Principle: If you never make a bad choice, you'll do OK.

Greedy algorithms usually involve a sequence of choices. The basic proof strategy is that we're going to try to prove that the algorithm never makes a bad choice. Greedy algorithms can't backtrack -- once they make a choice, they're committed and will never undo that choice -- so it's critical that they never make a bad choice.

What would count as a good choice? If there's a single optimal solution, it's easy to see what is a good choice: any choice that's identical to the one made by the optimal solution. In other words, we'll try to prove that, at any stage in the execution of the greedy algorithms, the sequence of choices made by the algorithm so far exactly matches some prefix of the optimal solution. If there are multiple equally-good optimal solutions, a good choice is one that is consistent with at least one of the optima. In other words, if the algorithm's sequence of choices so far matches a prefix of one of the optimal solutions, everything's fine so far (nothing has gone wrong yet).

To simplify life and eliminate distractions, let's focus on the case where there are no ties: there's a single, unique optimal solution. All the machinery will carry over to the case where there can be multiple equally-good optima without any fundamental changes, but you have to be a bit more careful about the technical details. Start by ignoring those details and focusing on the case where the optimal solution is unique; that'll help you focus on what is essential.

There's a very common proof pattern that we use. We'll work hard to prove the following property of the algorithm:

Claim: Let $S$ be the solution output by the algorithm and $O$ be the optimum solution. If $S$ is different from $O$, then we can tweak $O$ to get another solution $O^*$ that is different from $O$ and strictly better than $O$.

Notice why this is useful. If the claim is true, it follows that the algorithm is correct. This is basically a proof by contradiction. Either $S$ is the same as $O$ or it is different. If it is different, then we can find another solution $O^*$ that's strictly better than $O$ -- but that's a contradiction, as we defined $O$ to be the optimal solution and there can't be any solution that's better than that. So we're forced to conclude that $S$ can't be different from $O$; $S$ must always equal $O$, i.e., the greedy algorithm always outputs the correct solution. If we can prove the claim above, then we've proven our algorithm correct.

Fine. So how do we prove the claim? We think of a solution $S$ as a vector $(S_1,\dots,S_n)$ which corresponds to the sequence of $n$ choices made by the algorithm, and similarly, we think of the optimal solution $O$ as a vector $(O_1,\dots,O_n)$ corresponding to the sequence of choices that would lead to $O$. If $S$ is different from $O$, there must exist some index $i$ where $S_i \ne O_i$; we'll focus on the smallest such $i$. Then, we'll tweak $O$ by changing $O$ a little bit in the $i$th position to match $S_i$, i.e., we'll tweak the optimal solution $O$ by changing the $i$th choice to the one chosen by the greedy algorithm, and then we'll show that this leads to an even better solution. In particular, we'll define $O^*$ to be something like

$$O^* = (O_1,O_2,\dots,O_{i-1},S_i,O_{i+1},O_{i+2},\dots,O_n),$$

except that often we'll have to modify the

Context

StackExchange Computer Science Q#59964, answer score: 42

Revisions (0)

No revisions yet.