HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Finding string roots

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
rootsstringfinding

Problem

This is the following Python code that I have coded for an algorithmic problem online. For each line of input, the goal is to find the greatest number of repeating blocks into which the string can be split. For example, if the input is abcabcabcabc, then the output should be 4, because the string is equal to 4 repetitions of abc.

The online compiler says "Time Limit Exceeded". Where in the following code can I optimize to make it much more efficient in terms of time?

inputString = raw_input()
while (inputString !='*'):
   scanned_string = ""
   matched_characters = ""
   frequency = 0
   if (len(inputString) == 1):
     print frequency
   for current_symbol in inputString:
     if not scanned_string:
        scanned_string = scanned_string + current_symbol
     if scanned_string:
        matched_characters = matched_characters + current_symbol
        if (scanned_string.startswith(matched_characters)):
            if (len(scanned_string) == len(matched_characters)):
                frequency = frequency + 1
                matched_characters = ""
        else:
            scanned_string = scanned_string+matched_characters
            matched_characters = ""
            frequency = 1
   print frequency
   inputString = raw_input()

Solution

Notice that:

  • The length of a root must be a divisor of the length of the input string: you could skip other lengths. For example for the input string "abcabcabc" (length = 9), "ab" (length = 2) or "abcd" (length = 4) cannot be a root, so no need to check those



  • The length of a root must be less than equal to half the length of the input string, otherwise it's the input string itself (frequency = 1)



Taking the above into consideration, this implementation should be faster:

def sroot(text):
    text_len = len(text)
    for segment_len in range(1, text_len // 2 + 1):
        if text_len % segment_len != 0:
            continue
        frequency = text_len // segment_len
        if text[:segment_len] * frequency == text:
            return frequency
    return 1


This can be further improved too.
For example, instead of iterating until text_len // 2,
it would be enough to iterate until math.sqrt(text_len),
and when finding a divisor, also try its factor pair as a possible segment.
For example for text_len = 100,
when finding segment_len = 5,
also try segment_len = 100 / 5 = 20.
Iterating until 10 will be a lot faster than iterating until 50 as in the example implementation above.

There are several coding style issues in your code:

-
It's good to put the main logic inside a function. It makes it easier to test. For example you can add assertions like this to test your solution:

assert sroot('abcabcabcabc') == 4
assert sroot('abcdefgh012') == 1
assert sroot('aaaaaaaaaa') == 10
assert sroot('abcdefgh012abcdefgh012') == 2
assert sroot('abcdefgh012abcdefgh012x') == 1


-
Lots of unnecessary parentheses, for example in:

while (inputString !='*'):
if (len(inputString) == 1):


-
This can be simplified:

if not scanned_string:
    scanned_string = scanned_string + current_symbol


to this:

if not scanned_string:
    scanned_string = current_symbol


-
This can be simplified:

matched_characters = matched_characters + current_symbol


to this:

matched_characters += current_symbol

Code Snippets

def sroot(text):
    text_len = len(text)
    for segment_len in range(1, text_len // 2 + 1):
        if text_len % segment_len != 0:
            continue
        frequency = text_len // segment_len
        if text[:segment_len] * frequency == text:
            return frequency
    return 1
assert sroot('abcabcabcabc') == 4
assert sroot('abcdefgh012') == 1
assert sroot('aaaaaaaaaa') == 10
assert sroot('abcdefgh012abcdefgh012') == 2
assert sroot('abcdefgh012abcdefgh012x') == 1
while (inputString !='*'):
if (len(inputString) == 1):
if not scanned_string:
    scanned_string = scanned_string + current_symbol
if not scanned_string:
    scanned_string = current_symbol

Context

StackExchange Code Review Q#71921, answer score: 3

Revisions (0)

No revisions yet.