patternMinor
Finding a subset with few neighbors
Viewed 0 times
withfewfindingsubsetneighbors
Problem
Given a bipartite graph $G(X+Y,E)$, how can I find a non-empty subset $Y'\subseteq Y$, such that $|N(Y')| \leq |Y'|$ (where $N$ is the set of neighbors)?
If $|Y|\geq |X|$ then the problem is easy - $Y'=Y$ satisfies the requirements. The interesting case is $|Y|
-
In bipartite expansion, the goal is to find a subset $Y'$ containing a fraction $\beta$ of the vertices of $Y$ which minimizes the size of $N(Y')$. The problem is hard to approximate. However, our problem is possibly easier, since we only require $Y'$ to be non-empty, and only require $|N(Y')|$ to be at most $|Y'|$ - we do not try to minimize it.
-
In union minimization, we are given a set-system and an integer $k$, and the goal is to find $k$ sets such that their union is minimized. We can view the set-system as a bipartite graph with sets on one side and elements on the other side; then, the goal is to find a $Y'$ of size $k$, such that $|N(Y')|$ is minimized. This is hard to approximate, but again our problem is potentially easier since it does not insist on $k$ and does not require minimization.
So, can the problem be solved in polynomial time?
If $|Y|\geq |X|$ then the problem is easy - $Y'=Y$ satisfies the requirements. The interesting case is $|Y|
-
In bipartite expansion, the goal is to find a subset $Y'$ containing a fraction $\beta$ of the vertices of $Y$ which minimizes the size of $N(Y')$. The problem is hard to approximate. However, our problem is possibly easier, since we only require $Y'$ to be non-empty, and only require $|N(Y')|$ to be at most $|Y'|$ - we do not try to minimize it.
-
In union minimization, we are given a set-system and an integer $k$, and the goal is to find $k$ sets such that their union is minimized. We can view the set-system as a bipartite graph with sets on one side and elements on the other side; then, the goal is to find a $Y'$ of size $k$, such that $|N(Y')|$ is minimized. This is hard to approximate, but again our problem is potentially easier since it does not insist on $k$ and does not require minimization.
So, can the problem be solved in polynomial time?
Solution
I have an idea and will be happy to know if it is correct, or can be corrected.
Consider the integer linear program:
\begin{align}
\text{maximize} && \sum_{j\in Y} y_j - \sum_{i\in X} x_i
\\
\text{subject to} && y_j - x_i \leq 0 && \forall (i,j)\in E
\\
&& 0\leq x_i \leq 1,~~ 0\leq y_j \leq 1 && \forall i\in X, j\in Y
\\
&& x_i\in \mathbb{Z},~~ y_j\in\mathbb{Z} && \forall i\in X, j\in Y
\end{align}
In this program, there is a variable $x_i$ for each node $i$ in $X$, and a variable $y_j$ for each node $j$ in $Y$.
Let $X'$ be the subset of $X$ defined by $\{i\in X: x_i=1\}$, and $Y'$ the subset of $Y$ defined by $\{j\in Y: y_j=1\}$.
The constraints $y_j - x_i \leq 0$ ensure that, if some vertex $j$ is in $Y'$, then all its neighbors are in $X'$, so $X'\supseteq N(Y')$.
The objective is to maximize the difference $|Y'|-|X'|$. If the optimal solution is at least 0, then $|Y'|\geq |X'| \geq |N(Y')|$, so $Y'$ is the required set. On the other hand, if the optimal solution is negative, there is no such set.
In general, integer linear programs are hard to solve. But in this case, the matrix of constraints is Totally unimodular. It satisfies the Hoffman-Gale conditions: every entry in the matrix is one of $+1, 0, -1$; every row in the matrix contains at most two nonzero entries; and if there are two nonzero entries, they have opposite signs. Therefore, we can solve the LP without the integrality constraints, and we are guaranteed to get an integral solution which gives us $Y'$.
The only problem is that the above LP does not guarantee that $Y'$ is non-empty. Adding a constraint such as $\sum_{j\in Y} y_j\geq 1$ would invalidate the argument of total-unimodularity. A possible solution is to solve $|Y|$ different linear programs: for each vertex $j\in Y$, solve the LP with the constraint $0\leq y_j \leq 1$ replaced with the constraint $y_j = 1$. If for some $j\in Y$ the objective is at least 0, we have found a non-empty $Y'$ as required. Otherwise, such $Y'$ does not exist.
Consider the integer linear program:
\begin{align}
\text{maximize} && \sum_{j\in Y} y_j - \sum_{i\in X} x_i
\\
\text{subject to} && y_j - x_i \leq 0 && \forall (i,j)\in E
\\
&& 0\leq x_i \leq 1,~~ 0\leq y_j \leq 1 && \forall i\in X, j\in Y
\\
&& x_i\in \mathbb{Z},~~ y_j\in\mathbb{Z} && \forall i\in X, j\in Y
\end{align}
In this program, there is a variable $x_i$ for each node $i$ in $X$, and a variable $y_j$ for each node $j$ in $Y$.
Let $X'$ be the subset of $X$ defined by $\{i\in X: x_i=1\}$, and $Y'$ the subset of $Y$ defined by $\{j\in Y: y_j=1\}$.
The constraints $y_j - x_i \leq 0$ ensure that, if some vertex $j$ is in $Y'$, then all its neighbors are in $X'$, so $X'\supseteq N(Y')$.
The objective is to maximize the difference $|Y'|-|X'|$. If the optimal solution is at least 0, then $|Y'|\geq |X'| \geq |N(Y')|$, so $Y'$ is the required set. On the other hand, if the optimal solution is negative, there is no such set.
In general, integer linear programs are hard to solve. But in this case, the matrix of constraints is Totally unimodular. It satisfies the Hoffman-Gale conditions: every entry in the matrix is one of $+1, 0, -1$; every row in the matrix contains at most two nonzero entries; and if there are two nonzero entries, they have opposite signs. Therefore, we can solve the LP without the integrality constraints, and we are guaranteed to get an integral solution which gives us $Y'$.
The only problem is that the above LP does not guarantee that $Y'$ is non-empty. Adding a constraint such as $\sum_{j\in Y} y_j\geq 1$ would invalidate the argument of total-unimodularity. A possible solution is to solve $|Y|$ different linear programs: for each vertex $j\in Y$, solve the LP with the constraint $0\leq y_j \leq 1$ replaced with the constraint $y_j = 1$. If for some $j\in Y$ the objective is at least 0, we have found a non-empty $Y'$ as required. Otherwise, such $Y'$ does not exist.
Context
StackExchange Computer Science Q#103157, answer score: 2
Revisions (0)
No revisions yet.