patternMinor
Greedy algorithm for Set Cover problem - need help with approximation
Viewed 0 times
problemcoverapproximationneedwithhelpalgorithmforgreedyset
Problem
I want to approximate how close is the greedy algorithm to the optimal solution for the Set Cover Problem, which I'm sure most of you are familiar with, but just in case, you can visit the link above.
The problem is NP-Hard, and I'm trying to find a bound on how well does the greedy algorithm perform.
I know it looks a lot, but please bare with me. I pretty much did most of the work, I'm just missing that last small piece.
Here is the pseudo code:
Input: $U$ - set of elements, $F$ - family of sets s.t. $\bigcup_{S\in F}S=U$
Output: $C$ - a family of sets; $C\subseteq F$ s.t. $\bigcup_{S\in C}S=U$
The algorithm is pretty straight forward, and it is easy to see that it is indeed polynomial.
This is my attempt to approximate:
Claim 1: In a set $U$ of $m$ objects, that can be covered with $k$ sets, there has to be a set $S\subseteq U$ whose size is at least $\frac{1}{k}m$.
Proof: Trivial. (I decided not to prove it)
A corollary is that given the situation described in that claim, the greedy algorithm will choose a set whose size is at least $\frac{1}{k}m$.
Claim 2: Given a universe $U$, if there exist a cover of size $k$, then after $k$ iterations, the greedy algorithm will cover at least half of the elements, meaning at least $\frac{1}{2}n$ elements.
Proof: By claim 1, in the first iteration the algorithm will cover at least $\frac{1}{k}n$ elements. Upon entering the second iteration, there are at most $n-n\frac{1}{k}$ elements, and so the greedy algorithm will at least cover additional $\frac{1}{k}(n-n\frac{1}{k})$ elements. In general, on the $i$'th iteration, the algorithm will cover $\frac{1}{k}(n-n\frac{i-1}{k})$ elements. So after $k$ iterations:
$$\sum_{i=1}^{k}\frac{1}{k}(n-n\frac{i-1}{k})=\sum_{i=0}^{k-1}\frac{1}{k}(n-n\frac{i}{k})$$
$$=\sum_{i=0}^{k-1}\frac{n}{k}-\sum_
The problem is NP-Hard, and I'm trying to find a bound on how well does the greedy algorithm perform.
I know it looks a lot, but please bare with me. I pretty much did most of the work, I'm just missing that last small piece.
Here is the pseudo code:
Input: $U$ - set of elements, $F$ - family of sets s.t. $\bigcup_{S\in F}S=U$
Output: $C$ - a family of sets; $C\subseteq F$ s.t. $\bigcup_{S\in C}S=U$
initially C is empty
while U is not empty do:
choose S from F that maximizes the cover of elements in U
add S to C
subtract S's elements from U
return CThe algorithm is pretty straight forward, and it is easy to see that it is indeed polynomial.
This is my attempt to approximate:
Claim 1: In a set $U$ of $m$ objects, that can be covered with $k$ sets, there has to be a set $S\subseteq U$ whose size is at least $\frac{1}{k}m$.
Proof: Trivial. (I decided not to prove it)
A corollary is that given the situation described in that claim, the greedy algorithm will choose a set whose size is at least $\frac{1}{k}m$.
Claim 2: Given a universe $U$, if there exist a cover of size $k$, then after $k$ iterations, the greedy algorithm will cover at least half of the elements, meaning at least $\frac{1}{2}n$ elements.
Proof: By claim 1, in the first iteration the algorithm will cover at least $\frac{1}{k}n$ elements. Upon entering the second iteration, there are at most $n-n\frac{1}{k}$ elements, and so the greedy algorithm will at least cover additional $\frac{1}{k}(n-n\frac{1}{k})$ elements. In general, on the $i$'th iteration, the algorithm will cover $\frac{1}{k}(n-n\frac{i-1}{k})$ elements. So after $k$ iterations:
$$\sum_{i=1}^{k}\frac{1}{k}(n-n\frac{i-1}{k})=\sum_{i=0}^{k-1}\frac{1}{k}(n-n\frac{i}{k})$$
$$=\sum_{i=0}^{k-1}\frac{n}{k}-\sum_
Solution
You have shown that if there is a set cover of size $k$ and there are $n$ elements left uncovered, then the greedy algorithm finds $k$ sets which cover at least $n/2$ elements, and so at most $n/2$ elements are left uncovered. You can prove by induction that if there is a set cover of size $k$ and there are $n$ elements left uncovered, then for each $t$, the first $tk$ sets found by the greedy algorithm cover all but at most $n/2^t$ elements. The idea now is to pick a value of $t$ such that $n/2^t < 1$. Since the number of elements left uncovered by the algorithm is an integer, if the number $m$ of elements left uncovered satisfies $m < 1$, then in fact $m = 0$, i.e., all elements are covered.
As a side note, you can actually improve on your reasoning and show that the first $k$ sets cover at least $(1-1/e)n$ elements, and so leave only $n/e$ elements uncovered. This improves the bound from your $k\log_2 n$ to the sharp $k\ln n$, which is known to be tight (unless P=NP).
As a side note, you can actually improve on your reasoning and show that the first $k$ sets cover at least $(1-1/e)n$ elements, and so leave only $n/e$ elements uncovered. This improves the bound from your $k\log_2 n$ to the sharp $k\ln n$, which is known to be tight (unless P=NP).
Context
StackExchange Computer Science Q#40764, answer score: 4
Revisions (0)
No revisions yet.