HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Finding number of edges with weight less than k on the path between two nodes in a tree

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
pathnumberedgesthenodeswiththanweighttwobetween

Problem

You are given a tree with $n$ nodes and $n-1$ edges. Each edge has a weight. There are $m$ queries. In each query you have to answer the following question:

  • How many edges are there in the path between node $a$ and node $b$ with weight less than $k$?



$a$, $b$, and $k$ vary in each query. You're allowed to preprocess the tree to make it easy to answer each query.

Consider the tree with edges

1
 weight=4/ \weight=5
        2   3
weight=4|  
        4
weight=4|  
        5


Between nodes 3 and 5, the total number of edges with weight less than 5 is 3, i.e. {1-2, 2-4, 4-5}.

I was trying to solve by brute force method by checking each path.

I want to know if there is a better way of solving this.

Solution

You can answer each query in time $O(\log^3 n)$, plus whatever time is needed to answer a lowest-common-ancestor query (a tricky $O(1)$-time solution exists, but you might as well go with the easier $O(\log n)$-time solution, since it doesn't hurt the final time complexity), after doing $O(n\log^3 n)$ preprocessing.

We only need counts from each vertex to the root

Let's call the number of edges of weight less than $k$ on the path from a vertex $v$ to the root $f(v, k)$. We can calculate the desired final answer -- the number of edges of weight less than $k$ on the path from $a$ to $b$ -- using $f(a, k) + f(b, k) - 2f(lca(a, b), k)$, where $lca(u, v)$ is a the lowest common ancestor of $u$ and $v$.

From now on we will focus on answering the simpler question of computing $f(v, k)$.

Heavy-light decomposition

The heavy-light decomposition decomposes the tree into disjoint paths in such a way that at most $O(\log n)$ such "heavy paths" are visited on the path from any vertex to the root. This can be accomplished with a DFS in $O(n)$ time. The idea here is that, since the heavy paths are disjoint, if we have a way to quickly compute the number of edges of weight less than any given $k$ for each one, then we can nearly compute $f(v, k)$ for any $v$ and $k$ just by summing these counts over all heavy paths that we visit on the way from $v$ to the root. And we do have a way to do this counting: Simply by sorting the edges within each heavy path by weight in a preprocessing step, we can later binary-search for any $k$ to count the number of edges of weight less than $k$ in this particular heavy path in $O(\log n)$ time.

But why only "nearly"? Because we have so far ignored the possibility that $v$ might not begin exactly at the bottom of a heavy path -- it could begin part-way up. We could try to "patch things up" by walking down to the bottom of the heavy path containing $v$, subtracting 1 from the count for each edge of weight less than $k$ that we see -- but this won't be worst-case efficient, since an individual heavy path can be as long as $n-1$ edges. We need a way to handle this bottommost heavy path efficiently.

As noted by commenter xskxzr, the issue of beginning part-way up a heavy path is not only confined to the bottommost vertex $v$ in a path to the root -- it can also happen in the middle of such a path, such as in the path from 11 to 0 in the figure on the linked page. Fortunately, the same trick, described next, works for both cases.

Fenwick trees of sorted lists

The subproblem we are now focussed on is: Given the ordered list of edge weights $x_1, \dots, x_m$ encountered while traversing down a particular heavy path, we want to preprocess this list to enable us to be able to later efficiently answer queries of the form "How many of the first $r$ elements are less than $k$?", with both $r$ and $k$ part of the query. One way is to use a Fenwick tree which has as its elements not individual integers, but sorted lists of integers.

A "regular" Fenwick tree with single-integer elements takes $O(n)$ space. Since each $x_i$ belongs to at most $O(\log n)$ buckets, increasing the size of the Fenwick tree's elements from single integers to sorted lists of integers fortunately only increases space usage by a $\log n$ factor. The associative operation in a "regular" Fenwick tree is usually addition; in our Fenwick tree it's "set union". For efficiency, all lists should be populated in a first pass, and then only sorted once, in a second pass. This can be done in $O(n\log^2 n)$ time for a single heavy path, so $O(n\log^3 n)$ overall.

To process an $(r, k)$-query for a particular heavy path, a subquery in that heavy path's Fenwick tree is performed on each of the sequence of buckets found by stripping least-significant 1-bits from $r$, and the counts from these subqueries are summed. That much is standard for Fenwick trees, but the individual subqueries are more complicated: Instead of just reading out the integer value from each bucket, we must binary-search the sorted list in that bucket for $k$; the leftmost position where $k$ appears (or would appear) tells us the number of edges of weight less than $k$ in that bucket.

An example might help: Suppose $r = 43$. In binary, 43 is 101011 (32 + 8 + 2 + 1). To answer a query for $r = 43$ (and some $k$ value), we would binary-search for $k$ in the following 4 buckets: 101011, 101010, 101000, 100000. Each bucket is found by stripping the least-significant 1-bit from the previous bucket, so there cannot be more than $\log n$ subqueries. Each subquery is a binary search, so cannot take longer than $O(\log n)$ time. Finally, there are at most $O(\log n)$ heavy paths encountered on the path from any vertex to the root, so the worst-case time to compute $f(v, k)$ is $O(\log^3 n)$.

Context

StackExchange Computer Science Q#102065, answer score: 3

Revisions (0)

No revisions yet.