HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Longest repeating and non-overlapping substring

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
nonsubstringrepeatingoverlappingandlongest

Problem

Given a string find the length of longest repeating non-overlapping substring in it. In other words, find 2 identical substrings of maximum length which do not overlap.

Examples:

Input : str = "aabaabaaba"

Output : 4(aaba)

Input : str = "aaaaaaaaaaa"

Output : 5(aaaaa)

Input : str = "banana"

Output : 2(an or na)

I have written following recursive code.

#include
using namespace std;

void f(string str, int i, int j, int len,int& mx_len){
    mx_len = max(mx_len,len);       
    if(j>=str.length())
        return; 
    if(str[i]==str[j] && (j-i) > len)
        f(str,i+1,j+1,len+1,mx_len);    
    f(str,i+1,j+1,0,mx_len);    
    f(str,i,j+1,0,mx_len);
}

int main(){
    string s;
    cin>>s;
    int mx_len=0;   
    f(s,0,1,0,mx_len);  
    cout<<mx_len<<endl;
    return 0;
}


I do understand that this can be done with dynamic programming. My focus here is on learning recursive style.

I am writing 3 statements that are causing recursive calls. Can the number of statements that cause recursive calls be reduced?

What changes,apart from making it dynamic programming (by caching or tabulation), can be done in this code to make it better?

Also do comment on coding style.

Solution

Includes and namespaces

First of all, bits/stdc++.h isn't part of the C++ standard. Instead, only include the files you actually need:

#include 
#include 


Next, it's considered bad practice to use using namespace …, so let's get rid of it.

Your function f

There are several things we can improve here. First of all, it's name is ambiguous. What does f do? It calculates the length of the longest repeating and non-overlapping substring. Let's give it a better name and type:

size_t longest_repeating_substring_length(const std::string & str){
    …
}


'But wait', I here you say. 'That's not my function anymore'. It sure isn't. The problem with your function is that it's possible to use it wrong, e.g.

f("123", 100, 0, 0,mx_len);
//       ^^^ oh oh


It's a fine way to implement the dynamic approach, but it shouldn't be available to the user by default. Therefore, let's "hide" it:

namespace detail {
    void longest_repeating_substring_length(
        const std::string & str, size_t i, size_t j, size_t len, size_t & mx_len
    ){
        mx_len = std::max(mx_len,len);

        if(j >= str.length()) {
            return; 
        }
        if(str[i] == str[j] && (j - i) > len) {
            longest_repeating_substring_length(str,i+1,j+1,len+1,mx_len);    
        }
        longest_repeating_substring_length(str,i+1,j+1,0,mx_len);    
        longest_repeating_substring_length(str,i,j+1,0,mx_len);
    }
}


Why is the const std::string& necessary? First of all, const will make sure that we don't accidentially change your string in our function. And & will get rid of additional copies that would happen throughout the execution.

Note that we can get rid of mx_len:

namespace detail {
    size_t longest_repeating_substring_length(
        const std::string & str, size_t i, size_t j, size_t len
    ){
        size_t mx_len = len;

        if(j >= str.length()) {
            return mx_len; 
        }
        if(str[i] == str[j] && (j - i) > len) {
            mx_len = longest_repeating_substring_length(str,i+1,j+1,len+1);    
        }
        mx_len = std::max(mx_len, longest_repeating_substring_length(str,i+1,j+1,0));    
        mx_len = std::max(mx_len, longest_repeating_substring_length(str,i,j+1,0);
        return mx_len;
    }
}


Depending on which variant you'll use, you end up with the following code:

namespace detail {
    void longest_repeating_substring_length(
        const std::string & str, size_t i, size_t j, size_t len, size_t & mx_len
    ){
        mx_len = std::max(mx_len,len);

        if(j >= str.length()) {
            return; 
        }
        if(str[i] == str[j] && (j - i) > len) {
            longest_repeating_substring_length(str,i+1,j+1,len+1,mx_len);    
        }

        longest_repeating_substring_length(str,i+1,j+1,0,mx_len);    
        longest_repeating_substring_length(str,i,j+1,0,mx_len);
    }
}

size_t longest_repeating_substring_length(const std::string & str){
    size_t max_length = 0;
    detail::longest_repeating_substring_length(str, 0, 1, 0, max_length);

    return max_length;
}


Easy-to-use function at global namespace, and function that can possibly get used in a wrong way hidden away in detail.

Which brings us to your new main:

int main(){
    std::string s;
    std::cin >> s;

    std::cout<< longest_repeating_substring_length(s) <<std::endl;

    return 0;
}


Less recursive calls

To get the number of recursive calls down, we have to ask ourselfs "how large can mx_len get in that particular call?"

Well, if we start at position I and J, the maximum we can get is limited by J-I (since we may not overlap both strings) and str.length() - J (since there are only so many characters starting from the second position.

So let's calculate the maximum length we could get for a string:

#include 
size_t max_length(size_t string_length, size_t first_pos, size_t second_pos) {
    assert(first_pos <= second_pos);
    assert(second_pos <= string_length);

    return std::min(second_pos - first_pos, string_length - second_pos);
}

size_t max_length(const std::string &s, size_t first_pos, size_t second_pos) {
    return max_length(s.size(), first_pos, second_pos);
}


Now we have a method to determine in \$\mathcal O(1)\$ whether we should continue a recursive call:

namespace detail {
    void longest_repeating_substring_length(
        const std::string & str, size_t i, size_t j, size_t len, size_t & mx_len
    ){
        // i is always less than j, otherwise we broke our algorithm
        assert(i = str.length()) {
            return; 
        }

        if(str[i] == str[j] && (j - i) > len && max_length(str,i,j)  mx_len){
            longest_repeating_substring_length(str,i+1,j+1,0,mx_len);    
        }
        if(max_length(str, i, j + 1) > mx_len){
            longest_repeating_substring_length(str,i,j+1,0,mx_len);
        }
    }
}


Further remarks

Your code could

Code Snippets

#include <iostream>
#include <string>
size_t longest_repeating_substring_length(const std::string & str){
    …
}
f("123", 100, 0, 0,mx_len);
//       ^^^ oh oh
namespace detail {
    void longest_repeating_substring_length(
        const std::string & str, size_t i, size_t j, size_t len, size_t & mx_len
    ){
        mx_len = std::max(mx_len,len);

        if(j >= str.length()) {
            return; 
        }
        if(str[i] == str[j] && (j - i) > len) {
            longest_repeating_substring_length(str,i+1,j+1,len+1,mx_len);    
        }
        longest_repeating_substring_length(str,i+1,j+1,0,mx_len);    
        longest_repeating_substring_length(str,i,j+1,0,mx_len);
    }
}
namespace detail {
    size_t longest_repeating_substring_length(
        const std::string & str, size_t i, size_t j, size_t len
    ){
        size_t mx_len = len;

        if(j >= str.length()) {
            return mx_len; 
        }
        if(str[i] == str[j] && (j - i) > len) {
            mx_len = longest_repeating_substring_length(str,i+1,j+1,len+1);    
        }
        mx_len = std::max(mx_len, longest_repeating_substring_length(str,i+1,j+1,0));    
        mx_len = std::max(mx_len, longest_repeating_substring_length(str,i,j+1,0);
        return mx_len;
    }
}

Context

StackExchange Code Review Q#143157, answer score: 5

Revisions (0)

No revisions yet.