patternMinor
Order-preserving update of a sublist of a list of mutable objects in sublinear time
Viewed 0 times
orderupdatesublistobjectspreservingtimesublinearlistmutable
Problem
Description
Say I have a source list like:
Is it possible to make the insertions and removals in the derived list, while maintaining order, in faster than $O(n)$ time, $n$ being the size of the list?
Finally, I'd like to be able to know the index of where the insertion or removal takes place in the derived list in faster than $O(n)$ time.
An idea that doesn't work
I could keep the derived list in a tree, sorted by the source index. This way, when I needed to insert my item in derived tree-list, it would take $O(log n)$. However, I would not be able to know that index of that insertion without walking the entire tree.
Thanks for your help!
Problem Statement
Let $T(i)$ be a function that:
Can $T$ and its side effects be faster than $O(n)$?
Any list-like data structures can be used within these constraints:
Say I have a source list like:
["a","b","c","d"] and want to run a capitalization filter so if I change, "b" to "B", my derived list looks like ["B"]. Next, if I change my source list's "d" to "D", the derived list should look like ["B","D"].Is it possible to make the insertions and removals in the derived list, while maintaining order, in faster than $O(n)$ time, $n$ being the size of the list?
Finally, I'd like to be able to know the index of where the insertion or removal takes place in the derived list in faster than $O(n)$ time.
An idea that doesn't work
I could keep the derived list in a tree, sorted by the source index. This way, when I needed to insert my item in derived tree-list, it would take $O(log n)$. However, I would not be able to know that index of that insertion without walking the entire tree.
Thanks for your help!
Problem Statement
- Let $S$ be an ordered list
- Let $Si.filterProperty$ be $true$ for some elements in $S$
- Let $F$ be an ordered list of all elements in $S$ where $Si.filterProperty$ is true
Let $T(i)$ be a function that:
- Sets $Si.filterProperty$ to $true$
- Inserts $Si$ into $F$, ordered in the same fashion as $S$
- Returns the index in $F$ where $Si$ was inserted
Can $T$ and its side effects be faster than $O(n)$?
Any list-like data structures can be used within these constraints:
- Insertion and side effects of insertion are faster than $O(n)$
- We must know the index of the newly inserted item into $F$
Solution
At the time of writing I was not absolutely sure what the problem
was. This lead to a more general second answer. See details in
discussion at the end of this answer.
Apparently, the "idea that does not work" does not work because you do
not know the index of an element $S_i$ in $F$ organized as a tree.
This is what is addressed here. The problem is restated for clarity, as your statement of it is a bit hard to follow.
Restatement of the problem:
Given a set $S$ with an order relation, maintain an ordered list $F$
where elements of $S$ can be inserted or removed, and such that the
index of an element in $F$ is returned when the element is inserted.
Actually, the solution will do more, as it is possible to access any
element of $F$ given its index, or to retrieve the index given an
element of $F$. All operations are in time $\log n$.
Solution:
You keep the ordered derived list $F$ in a self-balancing binary
tree. I checked with AVL trees, but other types (such as red-black
trees) should work as well. In each node $N$ of the tree you keep a
count $\text{weight}(N)$ of the number of elements of $F$ in the subtree
rooted in $N$, node $N$ included.
This can be done by updating weight counts as you walk down from the
root of the AVL tree for $F$ to reach the place for the insertion or deletion.
Then, during the retracing phase that rebalances the tree, when
needed, you also update the weight counts according to the
rebalancing. That only add constant cost to each elementary step,
thus does not change the $\log n$ time complexity of the algorithm.
Hence it is easy to use this weight to compute in $\log n$ time the
index of an element of the list $F$ when accessing it, either on the
way down from the AVL tree root, or by walking up to the root when
accessing the element directly.
The index is computed on the fly by adding the weights of all the immediate
siblings on the left of the path from the AVL tree root to the
concerned element of $F$.
Conversely, it is easy to find in $\log n$ time an element of $F$ from
its (left to right) index by walking down from the root and leaving
enough elements on the left.
Computing the index of $S_i$ in $F$ using weights:
Recall that the weight of a node $N$ is the number of nodes in the subtree rooted at $N$, node $N$ included.
Computing the index of $S_i$ from the weights in the tree is precisely
done as follows: you simply do a search for $S_i$ in the standard way of
binary search trees, with a modification to compute the index. Below
is a pseudo-code recursive version, adapted from the searching algorithm in
wikipedia. It raises an exception if $S_i$ is not in the AVL tree, and
the index count start at 0 for the first element. To have the count start at 1, the first return statement should be
When the path from the root goes to the right daughter, that means
that all nodes in the left daugther subtree, plus the current node, precede
$S_i$, and have to be added to whatever other predecessors will be
found while searching in the right daughter subtree.
I do not give here the modifications of the insertion and deletion code necessary to update the weight count, as this is a bit too long for an answer here, and not so hard to do.
About the question, and the two proposed answers
The precise specification of the problem took some time to agree upon.
This first answer to the question was written on the basis of an
initial specification that did not mention explicitly that the source list $S$
could be modified dynamically, by insertion or deletion, thus possibly
inducing a similar modification in the filtered list $F$.
The elements of list $S$ are mutable objects and have a filter-property
$P_F$ that may be true or false depending on operations performed on
an element. The purpose of the question is to maintain a sublist $F$ of
elements $s\in S$ such that $P_F(s)$, similarly ordered, such that the
rank in $F$ of an element $s\in S$ is known when it in inserted in $F$ due to
a mutation.
I assume in this first answer here that the list $S$ is constant,
except for changes to its mutable elements. Thus on may see the list
$S$ as a set (of list elements), totally ordered by the order of the
list, and that relative order of two elements can be checked in
constant time $O(1)$, for example by attaching its $S$-list rank to
each element.
In a much later comment to this answer, the author of the question
made it clear that he also wanted to be able to modify the list $S$
with
was. This lead to a more general second answer. See details in
discussion at the end of this answer.
Apparently, the "idea that does not work" does not work because you do
not know the index of an element $S_i$ in $F$ organized as a tree.
This is what is addressed here. The problem is restated for clarity, as your statement of it is a bit hard to follow.
Restatement of the problem:
Given a set $S$ with an order relation, maintain an ordered list $F$
where elements of $S$ can be inserted or removed, and such that the
index of an element in $F$ is returned when the element is inserted.
Actually, the solution will do more, as it is possible to access any
element of $F$ given its index, or to retrieve the index given an
element of $F$. All operations are in time $\log n$.
Solution:
You keep the ordered derived list $F$ in a self-balancing binary
tree. I checked with AVL trees, but other types (such as red-black
trees) should work as well. In each node $N$ of the tree you keep a
count $\text{weight}(N)$ of the number of elements of $F$ in the subtree
rooted in $N$, node $N$ included.
This can be done by updating weight counts as you walk down from the
root of the AVL tree for $F$ to reach the place for the insertion or deletion.
Then, during the retracing phase that rebalances the tree, when
needed, you also update the weight counts according to the
rebalancing. That only add constant cost to each elementary step,
thus does not change the $\log n$ time complexity of the algorithm.
Hence it is easy to use this weight to compute in $\log n$ time the
index of an element of the list $F$ when accessing it, either on the
way down from the AVL tree root, or by walking up to the root when
accessing the element directly.
The index is computed on the fly by adding the weights of all the immediate
siblings on the left of the path from the AVL tree root to the
concerned element of $F$.
Conversely, it is easy to find in $\log n$ time an element of $F$ from
its (left to right) index by walking down from the root and leaving
enough elements on the left.
Computing the index of $S_i$ in $F$ using weights:
Recall that the weight of a node $N$ is the number of nodes in the subtree rooted at $N$, node $N$ included.
Computing the index of $S_i$ from the weights in the tree is precisely
done as follows: you simply do a search for $S_i$ in the standard way of
binary search trees, with a modification to compute the index. Below
is a pseudo-code recursive version, adapted from the searching algorithm in
wikipedia. It raises an exception if $S_i$ is not in the AVL tree, and
the index count start at 0 for the first element. To have the count start at 1, the first return statement should be
return node.left.weight+1.function Find-recursive(S_i, node): // call initially with node = root
if node = Null then
raise Key_Not_Found
else if node.key = S_i then
if node.left = Null then return 0
else return node.left.weight
else if key < node.key then
return Find-recursive(key, node.left)
else
if node.left = Null then return Find-recursive(key, node.right)+1
else return Find-recursive(key, node.right)+node.left.weight+1When the path from the root goes to the right daughter, that means
that all nodes in the left daugther subtree, plus the current node, precede
$S_i$, and have to be added to whatever other predecessors will be
found while searching in the right daughter subtree.
I do not give here the modifications of the insertion and deletion code necessary to update the weight count, as this is a bit too long for an answer here, and not so hard to do.
About the question, and the two proposed answers
The precise specification of the problem took some time to agree upon.
This first answer to the question was written on the basis of an
initial specification that did not mention explicitly that the source list $S$
could be modified dynamically, by insertion or deletion, thus possibly
inducing a similar modification in the filtered list $F$.
The elements of list $S$ are mutable objects and have a filter-property
$P_F$ that may be true or false depending on operations performed on
an element. The purpose of the question is to maintain a sublist $F$ of
elements $s\in S$ such that $P_F(s)$, similarly ordered, such that the
rank in $F$ of an element $s\in S$ is known when it in inserted in $F$ due to
a mutation.
I assume in this first answer here that the list $S$ is constant,
except for changes to its mutable elements. Thus on may see the list
$S$ as a set (of list elements), totally ordered by the order of the
list, and that relative order of two elements can be checked in
constant time $O(1)$, for example by attaching its $S$-list rank to
each element.
In a much later comment to this answer, the author of the question
made it clear that he also wanted to be able to modify the list $S$
with
Code Snippets
function Find-recursive(S_i, node): // call initially with node = root
if node = Null then
raise Key_Not_Found
else if node.key = S_i then
if node.left = Null then return 0
else return node.left.weight
else if key < node.key then
return Find-recursive(key, node.left)
else
if node.left = Null then return Find-recursive(key, node.right)+1
else return Find-recursive(key, node.right)+node.left.weight+1Context
StackExchange Computer Science Q#43447, answer score: 3
Revisions (0)
No revisions yet.