HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Is finding the smallest collection of subsets so that the number of elements among the subsets is <= the number of subsets NP-hard?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
smallestamongthenumberelementscollectionsubsetshardthatfinding

Problem

Given a collection of non-empty subsets of $\{1,2,\ldots,N\}$ ($N$ not fixed), the problem is to find the smallest non-empty collection of subsets so that the number of distinct elements appearing among the union of subsets is less than or equal to the number of subsets chosen. This seems like an NP-hard problem but I can't prove it. Any help?

Solution

This is too long for a comment but is not really a full answer.

Let's re-formulate the problem setting slightly as a bipartite graph $G$, with $\{1,2,\ldots,N\} = X$ as one set of vertices and the collection of subsets (say, $Y$) as the other set of vertices, with an edge between $i \in X$ and $J \in Y$ iff $i \in J$. For any set of vertices $W$ in $Y$, define $N(W)$ to be the neighborhood of $W$ in $G$ -- namely, the set of all vertices in $X$ that are adjacent to at least one vertex in $W$.

Then, the problem asks to find a minimal subset $W \subseteq Y$ such that $|N(W)| \leq |W|$. However, if we change the criterion slightly to $|N(W)| \lt |W|$, we find that such a subset exists if and only if there does not exist a matching between $X$ and $Y$ that covers $Y$, for it is exactly equivalent to Hall's Marriage Theorem.

Since such matchings are findable in polynomial time (e.g.: via Hopcroft-Karp), I think this version of the problem should probably be relatively easy but I'd have to do some more work to figure out if and how the standard algorithms for bipartite matching expose these deficient sets and whether or not minimal deficient sets can also be obtained easily. Furthermore, I can't tell right away how much harder the original version of the problem (which allows $|N(W)| = |W|$) is than this modified version.

Context

StackExchange Computer Science Q#29328, answer score: 2

Revisions (0)

No revisions yet.