patternpythonMinor
Using range minimal query for lowest common ancestor (LCA)
Viewed 0 times
rangequerylowestlcaforusingancestorminimalcommon
Problem
Here is my code to use range minimal query to resolve the LCA problem. Any advice on code bugs, performance improvement in terms of algorithm time complexity, code style are appreciated.
More specifically, my questions are:
Major ideas of my idea are:
```
from collections import defaultdict
class TreeNode:
def __init__(self, value, left, right):
self.value = value
self.left = left
self.right = right
class Solution:
def __init__(self):
self.index_node_map = defaultdict(int)
self.node_index_map = defaultdict(int)
self.depth = []
self.rmp_map = defaultdict(tuple) # key: (index, step), value: (min value, index)
def in_order(self, root, level):
if root.left:
self.in_order(root.left, level+1)
self.depth.append(level)
self.index_node_map[len(self.depth)-1]=root.value
self.node_index_map[root.value] = len(self.depth)-1
if root.right:
self.in_order(root.right, level+1)
def build_depth_index(self, root, level):
self.in_order(root, level)
print self.depth
print self.index_node_map
print self.node_index_map
self.build_rmq_map(self.depth)
def build_rmq_map(self, result):
step = len(result).bit_length()
for i in range(len(result)):
self.rmp_map[(i,0)]=(result[i],i)
for s in range(1, step+1):
for i in range(len(result)):
if i + 2 **(s-1) < len(
More specifically, my questions are:
- I am using in-order traverse to build depth array. Should I use other traverse, like pre-order (dfs), level-order, or post order?
- I am using two additional data structures,
index_node_mapandnode_index_map. I'm wondering if there are any ways to remove them, as they make my code seem lengthy.
Major ideas of my idea are:
- Build depth array of each node by in-order traverse
- Build MRQ index for depth array built in first step
- For any two nodes, their LCA are the lowest depth node between them (in depth array)
```
from collections import defaultdict
class TreeNode:
def __init__(self, value, left, right):
self.value = value
self.left = left
self.right = right
class Solution:
def __init__(self):
self.index_node_map = defaultdict(int)
self.node_index_map = defaultdict(int)
self.depth = []
self.rmp_map = defaultdict(tuple) # key: (index, step), value: (min value, index)
def in_order(self, root, level):
if root.left:
self.in_order(root.left, level+1)
self.depth.append(level)
self.index_node_map[len(self.depth)-1]=root.value
self.node_index_map[root.value] = len(self.depth)-1
if root.right:
self.in_order(root.right, level+1)
def build_depth_index(self, root, level):
self.in_order(root, level)
print self.depth
print self.index_node_map
print self.node_index_map
self.build_rmq_map(self.depth)
def build_rmq_map(self, result):
step = len(result).bit_length()
for i in range(len(result)):
self.rmp_map[(i,0)]=(result[i],i)
for s in range(1, step+1):
for i in range(len(result)):
if i + 2 **(s-1) < len(
Solution
This review describes my thoughts as a fellow competitive programmer.
About your questions
Major problems
-
You have God object in the solution, which, well, "solves the problem". Meaning, it mixes together all of the following:
Moreover, all these are highly tied together and you cannot, say, debug RMQ separately. It is a very common problem among competitive programmers (especially beginners). Though, some do not consider it to be a problem, but you're posting on codereview.
-
you represent two-dimensional array with
My suggestions
Get rid of God object
I'd suggest the following objects and functions:
Of course, that design can be de-coupled even further. Say,
Other suggestions
About your questions
- In-order traversal is a must here. Otherwise the RMQ-based algorithm won't work - it requires that for each vertexes A and B their LCA occurs between any occurrences of A and B.
- You will need some kind of maps to map tree node to their position in the RMQ array. So, at least one of them should stay. However, you can get rid of
index_node_mapif you make RMQ store node's "value" (which is actually node's id, as far as I understood from your code).
Major problems
- Your code does not satisfy PEP 8 at least in terms of whitespaces (it's easy to check and fix with tools like
pep8,flake8,autopep8), that significantly reduces readability.
-
You have God object in the solution, which, well, "solves the problem". Meaning, it mixes together all of the following:
- In-order traversal of a tree.
- Mapping between tree nodes and their indexes in the traversal.
- Calculating depths of vertexes.
- Reducing LCA to RMQ with
- Solution of RMQ problem with sparse table.
Moreover, all these are highly tied together and you cannot, say, debug RMQ separately. It is a very common problem among competitive programmers (especially beginners). Though, some do not consider it to be a problem, but you're posting on codereview.
- You initialize object outside of constructor, in the
build_depth_indexfunction. It may lead to a false conclusion that one may re-use the object with different inputs by callingbuild_depth_indexwith new tree. Despite "work in the constructor" being a kind of anti-pattern, I feel it's OK to move all initialization to the constructor in this case:
- You won't be able to reuse or not initialize the object.
- You will have less mutable state to care about (as opposed to the "initialize everything in
build_depth_indexand make the object reusable" approach), which is generally helpful for debugging.
- There are a bunch of implicit invariants, e.g.:
- You assume that all values in the tree are different. It's not enforced with
asserts, so it's possible to accidentally apply your algorithm to "wrong" tree and get "almost correct" results.
- In the
in_orderfunction you assume that each vertex occurs in the in-order traversal only once (because you "reenumerate" vertexes), but it's not the case for non-binary trees. Algorithm is more general than that. I'd suggest to avoid writing general algorithms for some "specific" cases because that is prone to copy-pasting and makes the algorithm harder to debug.
- There is some debug output in the middle, which cannot be switched off.
- No need in
defaultdicts. I'd suggest to avoiddefaultdicts when implementing data structures (esp. RMQ) because typically you should initialize the structure first and use it afterwards.
-
self.rmp_map = defaultdict(tuple) # key: (index, step), value: (min value, index)you represent two-dimensional array with
dict. Please don't do that, use arrays.- Use
namedtupleinstead of regular tuples to improve readability.
My suggestions
Get rid of God object
I'd suggest the following objects and functions:
class Rmq:
def __init__(self, array, key):
"""Initializes RMQ structure for 'array', assuming that the value
being minimized is returned by the 'key' function, if applied to
array element"""
def get_minimal_element(self, l, r):
"""Returns minimal element of array from position l to r (inclusive)"""
def get_in_order_traverse(tree):
"""Traverses tree in-order and returns a list of nodes."""
def get_depths(tree):
"""Calculates depths of all vertexes in the tree and returns mapping
from vertex's value to its depth (root's depth is 0)."""
class Lca:
def __init__(self, tree):
"""Initializes LCA solver for the given tree."""
def get_lca(self, a, b):
"""Given two vertexes a and b returns their LCA as a TreeNode."""Of course, that design can be de-coupled even further. Say,
Lca.__init__ can take an optional argument specifying Rmq's factory so we can plug different implementations of Rmq for debugging.Other suggestions
- Rename
TreeNode.valuetoTreeNode.id, because LCA does care about node "values", it cares about their ids only.
- Even better approach is to make
TreeNodes hashable by overloading__hash__and__eq__so functions do not depend anymore on "what is node's id and how to find it".
- Replace
TreeNode.leftandTreeNode.rightwith a list calledTreeNode.children. For the sake of generality - algorithm does not care.
- Use
loggingmodule for debug output so it can be switched on/off dynamically.
- Use
namedtupleinstead of tuples.
- In your sparse table implementation you can ignore all those segments which extend beyond array's length and get rid of
if. You never need to access them, as all requests to RMQ lie inside the original array.
Code Snippets
class Rmq:
def __init__(self, array, key):
"""Initializes RMQ structure for 'array', assuming that the value
being minimized is returned by the 'key' function, if applied to
array element"""
def get_minimal_element(self, l, r):
"""Returns minimal element of array from position l to r (inclusive)"""
def get_in_order_traverse(tree):
"""Traverses tree in-order and returns a list of nodes."""
def get_depths(tree):
"""Calculates depths of all vertexes in the tree and returns mapping
from vertex's value to its depth (root's depth is 0)."""
class Lca:
def __init__(self, tree):
"""Initializes LCA solver for the given tree."""
def get_lca(self, a, b):
"""Given two vertexes a and b returns their LCA as a TreeNode."""Context
StackExchange Code Review Q#154319, answer score: 7
Revisions (0)
No revisions yet.