patternMinor
Finding a maximal independent set in parallel
Viewed 0 times
independentparallelfindingmaximalset
Problem
On a graph $G(V,E)$, we do the following process:
For example:
MY QUESTION IS: what is the average number of steps that this process takes before all nodes are colored?
My current calculation leads me to an $O(1)$ estimate, which seems too good to be true. Here is the calculation:
Consider a node $v$ with $d$ neighbours. The probability that $v$ will be the smallest among its neighbours is $1/(d+1)$. If this happens, then $v$ and all its neighbours will be colored. So the expected number of vertices colored each step is $(d+1)/(d+1)=1$ per node. Hence the total expected number of vertices colored each step is $O(n)$, so in $O(1)$ time all nodes will be colored.
If this analysis is wrong (which is probably the case), then what is the actual number of steps?
EDIT: As noted by @JukkaSuomela, the algorithm described above is due to Metivier et al, 2011 and is explained and analyzed in these lecture notes. They prove that the run time is $O(\log n)$.
But, I am still not convinced that this analysis is tight. In all graphs I checked, it seems that the algorithm completes in $O(1)$ expected time.
My question is now: what is a worst-case graph in which this algorithm indeed requires $O(\log n)$ steps in average?
- Initially, all nodes in $V$ are uncolored.
- While there are uncolored nodes in $V$, each uncolored node does the following:
- Selects a random real number and sends it to all its neighbours;
- Compares its number to the number of its neighbours; if its own number is strictly smallest, then the neighbour paints itself red and notifies all its neighbours.
- If a neighbour became red, then this node paints itself black.
For example:
- Suppose the graph is a path: a-b-c-d-e.
- Suppose the numbers in the first step are: 1-2-0-3-4.
- Nodes a and c are painted red; nodes b and d are painted black.
- In the second step, only node e remains uncolored; it is trivially minimal so it paints itself red.
MY QUESTION IS: what is the average number of steps that this process takes before all nodes are colored?
My current calculation leads me to an $O(1)$ estimate, which seems too good to be true. Here is the calculation:
Consider a node $v$ with $d$ neighbours. The probability that $v$ will be the smallest among its neighbours is $1/(d+1)$. If this happens, then $v$ and all its neighbours will be colored. So the expected number of vertices colored each step is $(d+1)/(d+1)=1$ per node. Hence the total expected number of vertices colored each step is $O(n)$, so in $O(1)$ time all nodes will be colored.
If this analysis is wrong (which is probably the case), then what is the actual number of steps?
EDIT: As noted by @JukkaSuomela, the algorithm described above is due to Metivier et al, 2011 and is explained and analyzed in these lecture notes. They prove that the run time is $O(\log n)$.
But, I am still not convinced that this analysis is tight. In all graphs I checked, it seems that the algorithm completes in $O(1)$ expected time.
My question is now: what is a worst-case graph in which this algorithm indeed requires $O(\log n)$ steps in average?
Solution
There is a minor mistake, the probability of $v$ being minimal is $1/(d+1)$. That doesn't change anything, but it is still worth pointing out.
The problem is that the events you are summing aren't disjoint. Consider two vertices that aren't adjacent but have a common neighbour. If both vertices end up being minimal amongst their neighbours, then the common neighbour gets counted as being coloured twice.
We need to examine what happens in a vertex more closely. Let's compute the probability (and thus expectation of the indicator variabele) of a single vertex getting coloured.
Assume for the sake of analysis that the graph is $d$-regular and contains no triangles or squares. We will compute the chance that a given vertex $v$ does not get coloured. There are $d+1$ possibilities: the $v$ might be the smallest amongst it neighbours, $v$ might be the second smallest, third smallest, and so on... We are not interested in the case that it is the smallest, since then it gets coloured.
If $v$ is the second smallest, there is one vertex that is smaller. The probability of this vertex being smallest amongst its neighbours is $\frac{1}{d}$ (and thus $v$ getting coloured). The probability of the remaining neighbours being smallest amongst their neighbours is 0 (since $v$ is smaller). The total probability (of $v$ not getting coloured) from this case is $\frac{1}{d+1}\cdot\frac{d-1}{d}$.
If $v$ is third smallest, there are two smaller vertices. The probability from this case is $\frac{1}{d+1}\cdot(\frac{d-1}{d})^2$.
Continuing on like this, we find that the probability that $v$ does not get coloured is $\frac{1}{d+1}\Sigma_{i=1}^d (\frac{d-1}{d})^i$. The probability of a given vertex getting coloured is thus $1-\frac{1}{d+1}\Sigma_{i=1}^d (\frac{d-1}{d})^i$. As $d\to \infty$, the probability becomes $\frac{1}{e}\approx 0.368$.
Hence the probability that a given vertex gets coloured is a constant, so the expected number of vertices that gets coloured in the first step is indeed $O(n)$.
That does not make the algorithm $O(1)$. Assuming the probability stays constant (which is not entirely justified considering nodes get coloured and no longer participate in the algorithm, the graph does not remain $d$-regular), in the $k$-th step there will be (in expectation) ${(\frac{1}{e})}^kn$ uncoloured nodes. The decay is exponential, so the algorithm takes $O(\log n)$ steps.
Perhaps somebody else will be able to offer a more precise analysis, but from my argumentation it seems likely the algorithm is $O(\log n)$.
The problem is that the events you are summing aren't disjoint. Consider two vertices that aren't adjacent but have a common neighbour. If both vertices end up being minimal amongst their neighbours, then the common neighbour gets counted as being coloured twice.
We need to examine what happens in a vertex more closely. Let's compute the probability (and thus expectation of the indicator variabele) of a single vertex getting coloured.
Assume for the sake of analysis that the graph is $d$-regular and contains no triangles or squares. We will compute the chance that a given vertex $v$ does not get coloured. There are $d+1$ possibilities: the $v$ might be the smallest amongst it neighbours, $v$ might be the second smallest, third smallest, and so on... We are not interested in the case that it is the smallest, since then it gets coloured.
If $v$ is the second smallest, there is one vertex that is smaller. The probability of this vertex being smallest amongst its neighbours is $\frac{1}{d}$ (and thus $v$ getting coloured). The probability of the remaining neighbours being smallest amongst their neighbours is 0 (since $v$ is smaller). The total probability (of $v$ not getting coloured) from this case is $\frac{1}{d+1}\cdot\frac{d-1}{d}$.
If $v$ is third smallest, there are two smaller vertices. The probability from this case is $\frac{1}{d+1}\cdot(\frac{d-1}{d})^2$.
Continuing on like this, we find that the probability that $v$ does not get coloured is $\frac{1}{d+1}\Sigma_{i=1}^d (\frac{d-1}{d})^i$. The probability of a given vertex getting coloured is thus $1-\frac{1}{d+1}\Sigma_{i=1}^d (\frac{d-1}{d})^i$. As $d\to \infty$, the probability becomes $\frac{1}{e}\approx 0.368$.
Hence the probability that a given vertex gets coloured is a constant, so the expected number of vertices that gets coloured in the first step is indeed $O(n)$.
That does not make the algorithm $O(1)$. Assuming the probability stays constant (which is not entirely justified considering nodes get coloured and no longer participate in the algorithm, the graph does not remain $d$-regular), in the $k$-th step there will be (in expectation) ${(\frac{1}{e})}^kn$ uncoloured nodes. The decay is exponential, so the algorithm takes $O(\log n)$ steps.
Perhaps somebody else will be able to offer a more precise analysis, but from my argumentation it seems likely the algorithm is $O(\log n)$.
Context
StackExchange Computer Science Q#39563, answer score: 3
Revisions (0)
No revisions yet.