HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Find all edges in a tree whose removal doesn't separate color classes

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
removaledgeswhoseallcolorseparatedoesnclassesfindtree

Problem

Given a tree of $N$ vertices, each colored in one of the colors $1,\dots,N$.

Let’s call an edge a separator if when the edge is removed from the tree, all vertices of each color stay connected.

The goal is to find all separators in a tree in $O(N)$ time and $O(N)$ memory.

My first approach was to use Farach-Colton and Bender algorithm to find LCA of all vertices of each color, and then the separator is an edge that doesn’t lie on a path between the LCA and some vertex of the same color for any color, but I can’t come up with any ideas to move further.

Also, for the reference, there’s $O(n \log n)$ approach to solve this task, which doesn’t seem to be optimizable. Let’s store in the vertice the count of all vertices of each color (now the edge is separator if in subtree there’re either no occurrences, or all occurrences of each color), but it’s a quadratic solution. Now we use small-to-large approach (merge these arrays to the one that belongs to the largest child, and pass the reference to that array to the parent), and it becomes $O(n \log n)$.

Solution

A useful lemma

First, root the tree arbitrarily if necessary, and orient all edges towards the root.

Denote by $b(v)$ the number of colours that appear at least once in the subtree rooted at $v$, and by $e(v)$ the number of colours that are completely contained in the subtree rooted at $v$. The following lemma is key to efficiently finding all separators:

Lemma: The edge from $v$ to its parent is a separator iff $b(v) = e(v)$.

Proof sketch: The direction "the edge from $v$ to its parent is a separator $\implies b(v) = e(v)$" is easy to see: some number $k$ of colours appear in their entirety in the subtree rooted at $v$, so it must be that $b(v) = e(v) = k$. The converse can be shown by induction on the number of colours in the entire tree, but intuitively, if we perform a series of steps, in each of which we insert a set of vertices of a fresh colour into the entire tree, then every step that adds a colour that is completely contained in the subtree rooted at $v$ increases both $b(v)$ and $e(v)$ by 1, every step that adds a colour that appears both inside and outside this subtree increases just $b(v)$ by 1, and there is no way to increase just $e(v)$, so it is impossible to recover the state $b(v) = e(v)$ while an incompletely contained colour remains in the subtree.

Computing $b(\cdot)$ and $e(\cdot)$ using deltas

Define $b^+(v)$ to be the number of colours that appear at $v$ but nowhere else in the subtree below it. Clearly $b(v) = b^+(v) + \sum_{u \in \{u: (u,v)\in E\}} b(u)$, which we can compute in a single bottom-up pass over $T$ if we can compute $b^+(v)$ for each vertex. Notice that $b^+(v)$ is either 1 or 0, since the only colour that can possibly contribute is the colour of $v$ itself.

Similarly define $e^+(v)$ to be the number of colours that are completely contained in the subtree rooted at $v$, but not in any proper subtree. Clearly $e(v) = e^+(v) + \sum_{u \in \{u: (u,v)\in E\}} e(u)$. Notice that, unlike for $b^+(v)$, $e^+(v)$ can be larger than 1, since a vertex can be the LCA of multiple colours.

Computing $b^+(\cdot)$ efficiently

Perform a bottom-up depth-first traversal, using an array $s$ of $N$ flags to keep track of whether or not a colour has been seen in any of the child subtrees. Specifically:

  • Initially, set every element of $s$ to false.



  • On entry to a vertex $v$ having colour $c(v)$:



  • Set $s[c(v)]$ to false.



  • Recurse to process all children.



  • If $s[c(v)]$ is still false, then colour $c(v)$ appears at $v$ but nowhere below it, so set $b^+(v) = 1$. Otherwise set $b^+(v) = 0$.



  • Set $s[c(v)]$ to true.



Computing $e^+(\cdot)$ efficiently

Every colour $c$ for which $LCA(c) = v$ contributes 1 to the value of $e^+(v)$. Thus it suffices to calculate the LCA for each colour, and increase a counter at the corresponding vertex each time.

The total number of LCA computations is $O(N)$, since the colours partition the tree vertices, and the LCA of a set of vertices $\{v_1, v_2, \dots\}$ can be computed as $LCA(v_1, LCA(v_2, LCA(\dots)))$. Using Farach-Colton and Bender's LCA algorithm, after $O(N)$ preprocessing each LCA can be computed in constant time, for $O(N)$ computation of all $e^+(\cdot)$ values overall.

Final pass

A final bottom-up pass computes $b(v)$ and $e(v)$ for each vertex $v$ from $b^+(v)$ and $e^+(v)$ and the values of $b(v)$ and $e(v)$ of child subtrees as outlined above. At each vertex $v$, test whether $b(v) = e(v)$, and if so output the edge leading from $v$ to its parent as a separator. This takes constant time per tree edge, so $O(N)$ overall, making the entire algorithm $O(N)$-time.

Context

StackExchange Computer Science Q#104407, answer score: 4

Revisions (0)

No revisions yet.