HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Finding the tree height in Python

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
theheightpythonfindingtree

Problem

I'm trying to get an efficient algorithm to calculate the height of a tree in Python for large datasets. The code I have works for small datasets, but takes a long time for really large ones (100,000 items) so I'm trying to figure out ways to optimize it but am getting stuck. Sorry if it seems like a really newbie question, I'm pretty new to Python.

The input is a list length and a list of values, with each list item pointing to its parent, with list item -1 indicating the root of the tree. So with an input of:

5
4 -1 4 1 1


The answer would be 3 — the tree is: ({key:1, children: [{key: 3}, {key:4, children:[{key:0, {key:2}]}] }

Here is the code that I have so far:

import sys, threading
sys.setrecursionlimit(10**7) # max depth of recursion
threading.stack_size(2**25)  # new thread will get stack of such size

class TreeHeight:
    def read(self):
        self.n = int(sys.stdin.readline())
        self.parent = list(map(int, sys.stdin.readline().split()))

    def getChildren(self, node, nodes):
        parent = {'key': node, 'children': []}
        children = [i for i, x in enumerate(nodes) if x == parent['key']]
        for child in children:
            parent['children'].append(self.getChildren(child, nodes))
        return parent

    def compute_height(self, tree):
        if len(tree['children']) == 0:
            return 0
        else:
            max_values = []
            for child in tree['children']:
                max_values.append(self.compute_height(child))
            return 1 + max(max_values)

def main():
  tree = TreeHeight()
  tree.read()
  treeChild = tree.getChildren(-1, tree.parent)
  print(tree.compute_height(treeChild))

threading.Thread(target=main).start()

Solution

You don't need to explicitly build the tree to compute its depth.

class TreeDepth(object):
    def __init__(self, parents):
        self._parents = parents
        self._n = len(parents)
        self._max_depth = None
        self._depths = [None] * self._n

    def max_depth(self):
        if self._max_depth is None:
            for idx, parent in enumerate(self._parents):
                depth = self.get_depth(idx)
                if self._max_depth >> TreeDepth([4, -1, 4, 1, 1]).max_depth()
3


If my math is good, this goes over each item in self._parents at most twice, so it will have \$O(n)\$ performance. Also, I have used a recursive approach to compute the depth, but if speed is your main goal, you will probably want to turn that into an iterative approach:

def max_depth2(self):
    if self._max_depth is not None:
        return self._max_depth
    for idx, parent in enumerate(self._parents):
        parent_stack = []
        while parent != -1 and self._depths[idx] is None:
            parent_stack.append(idx)
            idx, parent = parent, self._parents[parent]
        if parent == -1:
            depth = 1
        else:
            depth = self._depths[idx]
        while parent_stack:
          self._depths[parent_stack.pop()] = depth
          depth += 1
        if self._max_depth < depth:
            self._max_depth = depth
    return self._max_depth

Code Snippets

class TreeDepth(object):
    def __init__(self, parents):
        self._parents = parents
        self._n = len(parents)
        self._max_depth = None
        self._depths = [None] * self._n

    def max_depth(self):
        if self._max_depth is None:
            for idx, parent in enumerate(self._parents):
                depth = self.get_depth(idx)
                if self._max_depth < depth:
                    self._max_depth = depth
        return self._max_depth

    def get_depth(self, idx):
        depth = self._depths[idx]
        if depth is not None:
            return depth
        parent = self._parents[idx]
        if parent == -1:
            depth = 1
        else:
            depth = self.get_depth(parent) + 1
        self._depths[idx] = depth
        return depth

>>> TreeDepth([4, -1, 4, 1, 1]).max_depth()
3
def max_depth2(self):
    if self._max_depth is not None:
        return self._max_depth
    for idx, parent in enumerate(self._parents):
        parent_stack = []
        while parent != -1 and self._depths[idx] is None:
            parent_stack.append(idx)
            idx, parent = parent, self._parents[parent]
        if parent == -1:
            depth = 1
        else:
            depth = self._depths[idx]
        while parent_stack:
          self._depths[parent_stack.pop()] = depth
          depth += 1
        if self._max_depth < depth:
            self._max_depth = depth
    return self._max_depth

Context

StackExchange Code Review Q#128635, answer score: 4

Revisions (0)

No revisions yet.