Balanced Binary Search Tree Two-Sum with Constraints
Problem
My question spawns from this question. The question is straightforward:
Can we find whether there exist 2 values in a balanced binary search tree whose sum equals a given target value?
Now the constraints:
- Constant additional space with respect to $n$, the size of the tree. By this I mean you can use a constant number of $O(\log n)$-bit words however you wish. Assume references to nodes also take $\log n$ bits to store.
- Read-only access to the tree. No modifications are allowed. This prevents a threaded traversal.
- A node $n$ has the following attributes: n.left, n.right, n.value. This means we have no access to a node's parent. This prevents a successor and predecessor traversal.
- The algorithm must run in worst-case linear time with respect to $n$, the size of the tree.
It's worth noting that if we remove any one of the four constraints, we can satisfy the other three relatively easily.
My initial thought is that this is not possible, simply because we can keep essentially no memory of where we are in the tree while traversing it.
We could, however, implement a successor-and-predecessor in-order traversal using binary searches from the root to find successors and predecessors. This would take $O(n \log n)$ time.
Another thought was that we could use a randomized algorithm, possibly a Monte Carlo algorithm with repetition or a Las Vegas algorithm. Such an algorithm could do a random walk along one path, then have another search look for a compatible node based on the random walk's nodes. I'm not entirely sure whether this would work; these are just some thoughts.
If this is indeed not possible, as I suspect, I would also be interested in the following additional modification:
- The algorithm must run in average-case linear time.
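The $O(n \log n)$ fallback mentioned above can be sketched concretely. This is an illustrative sketch (the names are mine, not from the question): two value cursors move inward, and each cursor step re-finds the next or previous value by a fresh root-to-leaf binary search, so only a constant number of $O(\log n)$-bit words are ever held. It assumes distinct values, as in a standard BST.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def successor(root, v):
    """Smallest value strictly greater than v, found by one root-to-leaf search."""
    best, node = None, root
    while node is not None:
        if node.value > v:
            best = node.value
            node = node.left
        else:
            node = node.right
    return best

def predecessor(root, v):
    """Largest value strictly less than v, found by one root-to-leaf search."""
    best, node = None, root
    while node is not None:
        if node.value < v:
            best = node.value
            node = node.right
        else:
            node = node.left
    return best

def two_sum(root, target):
    """O(n log n) time, O(1) extra words: two-pointer scan by value."""
    if root is None:
        return False
    lo = successor(root, float("-inf"))   # minimum value in the tree
    hi = predecessor(root, float("inf"))  # maximum value in the tree
    while lo is not None and hi is not None and lo < hi:
        s = lo + hi
        if s == target:
            return True
        if s < target:
            lo = successor(root, lo)      # advance the low cursor
        else:
            hi = predecessor(root, hi)    # retreat the high cursor
    return False
```

Each of the $O(n)$ cursor moves costs one $O(\log n)$ descent, giving the $O(n \log n)$ total while respecting read-only access and the no-parent-pointer constraint.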
Solution
I don't know how to do this in $O(n)$ time and $O(1)$ space, but I can show you how to do it in $O(n)$ time and $O(\lg \lg n)$ space. In particular, given any tree of depth $O(\lg n)$, I'll show how to do an in-order traversal in $O(n)$ time and $O(\lg \lg n)$ space. That will be enough to solve this problem.
Warmup
Let me first establish some notation. At any point in the traversal, we'll currently be at some leaf node $x_1$. Let $x_1,x_2,x_3,\dots,x_d$ be the nodes on the path from $x_1$ to the root (so $x_1$ is the leaf and $x_d$ is the root of the tree), where $d$ denotes the depth of the tree. Note that since the tree has depth $O(\lg n)$, it follows that this sequence has length $O(\lg n)$.
For convenience, I'll assume that the tree is a complete binary tree, i.e., all leaves are at the same depth. This simplifies explanation of the main ideas. All of the techniques here generalize to an arbitrary binary tree, as long as its maximum depth is $O(\lg n)$.
An easy way to do a traversal would be to keep track of the entire sequence $x_1,x_2,x_3,\dots,x_d$. When you want to find the successor, you update the entire sequence. For instance, if $x_1$ is the left child of $x_2$, this involves updating just $x_1$ to its right sibling. If $x_1$ is the right child of $x_2$ and $x_2$ is the left child of $x_3$, this involves updating $x_1$ and $x_2$. And so on. Notice that as we repeatedly compute the successor, the first parts of the sequence get updated more often: $x_1$ changes every time, $x_2$ changes half of the time, $x_3$ changes $1/4$ of the time, and so on. This gives an $O(n)$ time traversal (since $1 + 1/2 + 1/4 + \cdots = O(1)$), but it requires $O(\lg n)$ space.
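The warmup scheme can be sketched directly (an illustrative sketch; the names are mine): keep the whole root-to-current path as an explicit list of node references, and compute each successor by the update rule just described.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder(root):
    """Yield values in order, storing the explicit root-to-current path.

    The path holds at most d nodes, i.e. O(lg n) words in a balanced tree.
    Every tree edge is pushed and popped exactly once, which is the
    1 + 1/2 + 1/4 + ... amortized argument in code form: O(n) total time.
    """
    path = []                      # path[0] is the root, path[-1] the current node
    node = root
    while path or node is not None:
        while node is not None:    # descend to the leftmost unvisited node
            path.append(node)
            node = node.left
        node = path.pop()          # visit, then move into the right subtree
        yield node.value
        node = node.right
```

On a balanced tree, `list(inorder(root))` produces the values in sorted order while never holding more than the current root-to-node path.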
The main attraction
Our improvement will be to store only part of that sequence: we'll store $x_1,x_2,x_4,x_8,\dots,x_k$ (where $k=2^{\lfloor \lg d \rfloor}$). In other words, we're only storing the power-of-two-indexed elements of the path from the current leaf to the root. Since $d=O(\lg n)$, we're only storing $O(\lg \lg n)$ elements of the sequence.
When we want to find the successor, we might need to update one or more of these elements. As we repeatedly compute successors, the early parts change more often: $x_1$ always changes, $x_2$ changes $1/2$ of the time, $x_4$ changes $1/2^3$ of the time, $x_8$ changes $1/2^7$ of the time, $x_{16}$ changes $1/2^{15}$ of the time, and so on. How do we do the updates? To see if $x_2$ needs to change, we go down from $x_2$ to $x_1$ to find the successor to $x_1$. If $x_2$ doesn't need to change, we're done. Otherwise, we go down from $x_4$ to $x_2$ to find the successor to $x_2$ (at the same height as $x_2$). And so on. When $x_{2^i}$ changes, we go down from $x_{2^{i+1}}$ to $x_{2^i}$ to compute the successor to $x_{2^i}$ at the same height, and this takes $2^i$ steps of computation since we have to traverse $2^i$ levels of the tree; this happens for a $1/2^{2^i-1}$ fraction of the successor operations. Taking all of that into account, the amortized amount of work we do per successor operation is
$$1 + {2 \over 2} + {4 \over 2^3} + {8 \over 2^7} + {16 \over 2^{15}} + \cdots
= O(1).$$
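As a quick numeric sanity check (my addition, not part of the original answer), the partial sums of this series settle just above 2.56, confirming the constant bound:

```python
# Term i of the series 1 + 2/2 + 4/2^3 + 8/2^7 + ... is 2^i / 2^(2^i - 1).
# The terms die off doubly exponentially, so ten of them are plenty.
total = sum(2 ** i / 2 ** (2 ** i - 1) for i in range(10))
print(total)  # roughly 2.563: the amortized work per successor is O(1)
```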
This yields an in-order traversal that takes $O(n)$ time and $O(\lg \lg n)$ space.
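Putting it together: given any in-order traversal primitive, the two-sum question reduces to the classic two-pointer scan over an ascending and a descending stream. The sketch below is my illustration (names are made up); it uses the plain stack-based traversal as a stand-in, so its space is $O(\lg n)$, but substituting a space-efficient traversal tightens only the space term, since the scan itself keeps a constant number of words.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def ascending(root):
    """Standard in-order traversal (stand-in for a more space-efficient one)."""
    stack, node = [], root
    while stack or node is not None:
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        yield node.value
        node = node.right

def descending(root):
    """Mirror image: reverse in-order, yielding values in decreasing order."""
    stack, node = [], root
    while stack or node is not None:
        while node is not None:
            stack.append(node)
            node = node.right
        node = stack.pop()
        yield node.value
        node = node.left

def two_sum(root, target):
    """Two-pointer scan over both streams: O(n) time, twice the traversal's space."""
    lo, hi = ascending(root), descending(root)
    a, b = next(lo, None), next(hi, None)
    while a is not None and b is not None and a < b:  # a < b keeps the nodes distinct
        s = a + b
        if s == target:
            return True
        if s < target:
            a = next(lo, None)
        else:
            b = next(hi, None)
    return False
```

Each stream is advanced at most $n$ times in total, so the whole scan is $O(n)$ time regardless of which traversal primitive backs it.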
Bonus answer
As an extra bonus, I'll show how to do an in-order traversal in $O(n \lg^* n)$ time and $O(\lg^* n)$ space. $O(\lg^* n)$ "might as well be constant" for all practical purposes, so maybe this is of interest.
The construction is a small modification of the scheme I gave above. Define $a_1,a_2,a_3,\dots$ by $a_1 = 1$ and $a_{i+1} = 2^{a_i}$. Now, instead of storing the whole sequence $x_1,\dots,x_d$, we'll store just the elements $x_{a_1},x_{a_2},x_{a_3},\dots,x_{a_\ell}$, where $\ell = \lg^* d = \lg^*(\lg n)$. Notice that $x_{a_i}$ changes for a $1/2^{a_i-1}$ fraction of the successor operations, and when it does change, we do $a_{i+1} - a_i$ work. Therefore, the amortized time per successor operation is
$$T = \sum_{i=1}^\ell {a_{i+1} - a_i \over 2^{a_i-1}}.$$
With a bit of manipulation we can show
$$T \le \sum_{i=1}^\ell {a_{i+1} \over 2^{a_i-1}} = \sum_{i=1}^\ell {2^{a_i} \over 2^{a_i-1}} = \sum_{i=1}^\ell 2 = 2\ell = O(\lg^* (\lg n)) = O(\lg^* n).$$
Therefore, a traversal of the entire tree can be done in $O(n \lg^* n)$ time, using $O(\lg^* n)$ space.
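A quick numeric check (my addition) of the terms in this bound: with $a_{i+1} = 2^{a_i}$, each term $a_{i+1}/2^{a_i-1}$ works out to exactly $2$.

```python
# a_1 = 1 and a_{i+1} = 2^{a_i}: the tower 1, 2, 4, 16, 65536, ...
a = [1]
for _ in range(4):
    a.append(2 ** a[-1])

# Each term a_{i+1} / 2^(a_i - 1) = 2^(a_i) / 2^(a_i - 1) = 2.
terms = [a[i + 1] / 2 ** (a[i] - 1) for i in range(4)]
print(terms)  # [2.0, 2.0, 2.0, 2.0]
```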
Context
StackExchange Computer Science Q#77452, answer score: 5