HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

Lopsided Trees and Recursion

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
andlopsidedrecursiontrees

Problem

The original problem is:


Define the height of a binary tree to be the number of nodes in the
longest path from the root to a leaf. The empty tree is considered to
have height 0. A node is k-balanced if its left and right subtrees
differ in height by at most k. A tree is k-balanced if all of its
nodes are k-balanced. The empty tree is considered to be k-balanced.


For example, the tree below has height 4.

o
   / \
  o   o
 / \
o   o
   /
  o




This tree is 2-balanced but not 1-balanced, because the left subtree
of the root has height 3 and the right subtree of the root has height
1. Your task is to write a method that takes a balance factor k and a number of nodes n and returns the maximum height of a k-balanced tree
with n nodes.

One of the answer(right and fast) is:

int fewest_node(int k,int h)
{
    if(h  n)
            return h - 1;
}


and my solution(terribly slow) is:

int maxHeight(int k, int n)
{
    int i,left,right;
    if(k >= n)
        return n;
    if(n > 2)
    {
        left = maxHeight(k,n-1);
        if(left  k)
            {
                i--;
                left = maxHeight(k,i);
                right = maxHeight(k,n-1-i);
            }
            return left+1;
        }
    }
    else 
        return n;
}


My question:

-
Is my solution really correct? I mean, does it work regardless of its efficiency? Or in other words, is it right logically? (BTW, I think it's right, and I have tested the output when n is less than 30)

-
Why is my solution so slow? As you can see, I'm a newbie to DSA.

Solution

I would consider your first question as offtopic. So I will only answer the second question.

For each call to maxHeight you make a lot of recursive calls. Consider what happens if n is large. Say n=100, then can you tell me how many times you call maxHeight? I can't work it out other than "lots". Also consider that you will be calling maxHeight with the same arguments very often, calculating the same results over and over.

Lets look at an example recursive function that bears some resemblance to your code:

int f(int n)
{
    if (n == 0)
        return 1;

    int sum = 0;
    while(n--){ // In your code: while((left - right) > k)
        sum += f(n);
    }
}


While this looks innocent, even for small n you will iterate a really long time. For f(1) you will need one call to f(0). For f(2) you will need one call to f(1) (which calls f(0)) and one to f(0) which totals 3 calls. For f(3) you will need one call to f(2) (which then does 3 more calls), f(1) (which does 1 extra call for you) and f(0) i.e. (3+1) + (1+1) + 1 = 7. Continuing on f(4) will need (7 + 1) + (3 + 1) + (1+1) + 1 = 15. I think you can see where this is heading. For f(n) you will need 2^n-1 calls. So for n=300 you have 2^300 ~= 10^100 calls to f(0). To put that into perspective considering that it's been estimated that there are about 10^80 atoms in the observable universe. I wouldn't want to stick around and wait until that completes. Not even if every atom in the observable universe was a 1 THz computer working optimally in parallel to calculate this.

Your code exhibits the same type of pattern, you make many recursive calls per call to maxHeight so you are probably even worse off.

Comparing to:

int fewest_node(int k,int h)
{
    if(h  n)
            return h - 1;
}


Calling fewest_node(int j, int h) will result in at most 2^h recursive calls, which is still a lot, but less than your code. But here is the catch, fewest_node() is very simple in structure and the compiler can expand the function in a way similar to loop-unrolling which removes a lot of iteration and overhead. This is a case of K.I.S.Silly.

One way that both of the solutions can be speed up dramatically is by using Dynamic Programming, like this:

#define MAX_N 1000 // Adjust to taste
#define MAX_K 100

int dp_table[MAX_N *MAX_K]={0}; // Not sure if this is legal, otherwise use memset

int fewest_node(int k,int h)
{
    int index = k*MAX_N+h;
    if(!dp_table[index]){
        if(h  n)
            return h - 1;
}


Simply store each calculated result for fewest_node and start from the bottom up with small h. You can also dynamically allocated a table of the required size when you have received your n and k for the problem. This will drop the worst case time from 2^n to n*k which is eons faster. You can do the same for your code. Just introduce a table of cached results.

Code Snippets

int f(int n)
{
    if (n == 0)
        return 1;

    int sum = 0;
    while(n--){ // In your code: while((left - right) > k)
        sum += f(n);
    }
}
int fewest_node(int k,int h)
{
    if(h <= 0)
        return 0;
    return 1+fewest_node(k,h-1)+fewest_node(k,h-1-k);
}

int maxHeight2(int k, int n)
{
    for(int h = 1;;h++)
        if(fewest_node(k,h) > n)
            return h - 1;
}
#define MAX_N 1000 // Adjust to taste
#define MAX_K 100

int dp_table[MAX_N *MAX_K]={0}; // Not sure if this is legal, otherwise use memset

int fewest_node(int k,int h)
{
    int index = k*MAX_N+h;
    if(!dp_table[index]){
        if(h <= 0)
            return 0;
        dp_table[index] = 1+fewest_node(k,h-1)+fewest_node(k,h-1-k);    
    }
    return dp_table[index];
}

int maxHeight2(int k, int n)
{
    for(int h = 1;;h++)
        if(fewest_node(k,h) > n)
            return h - 1;
}

Context

StackExchange Code Review Q#66835, answer score: 6

Revisions (0)

No revisions yet.