HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

String pattern riddle

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
riddlestringpattern

Problem

I've tried this problem on Codeforces. Given a set of string patterns, all of which are the same length, and which consist of literal lowercase ASCII characters and ? wildcards, find a pattern that satisfies them all.

I want to optimize and simplify it further and make it a more elegant solution, if possible.

import sys

def main():
    n = int(sys.stdin.readline())
    t = sys.stdin.readlines()
    l = len(t[0])
    result = ''
    if n==1:
        result = t[0].replace('?', 'x')
    else:
        for y in range(0,l-1):
            let = t[0][y]
            for x in range(1,n):
                if let == '?' and t[x][y]!= '?':
                    let = t[x][y]
                if t[x][y] != let and t[x][y] != '?':
                    result += '?'
                    break
                elif x == n-1:
                    if let == '?':
                        result += 'x'
                    else:
                        result += let
    print result

main()

Solution

Use more meaningful variable names, for example:

  • lines instead of t



  • length instead of l



  • Since y is a character position, I would rename to pos



  • let to letter, because it's not much longer and a bit more readable (since let is keyword in some other programming languages, I find it confusing to read)



Since lines[0] is kind of special, and you make many references to it, it would be good to give it a name:

first = lines[0]


Your input lines end with a trailing newline character, but in the main logic of pattern matching this is just noise, not an important detail. I recommend this:

first = lines[0].strip()


This will get rid of the trailing newline character, and in the main logic you don't have to worry about checking letters until first[-1], you can consider the letters in first, which is simpler.

If you make changes to the script, you have to repeat all the tests to see if it's still working. That's very troublesome. To make that easier, add some unit tests, and refactor the main method to make it more testable, for example:

import unittest
from test.test_support import run_unittest

def main():
    n = int(sys.stdin.readline())
    lines = sys.stdin.readlines()[0:n]
    print find_common_pattern(lines)

def find_common_pattern(lines):
    # the rest of the code from the old version of `main`

class FindPatterns(unittest.TestCase):
    def test_given_examples(self):
        self.assertEqual('xab', find_common_pattern(['?ab\n', '??b\n']))
        self.assertEqual('?', find_common_pattern(['a\n', 'b\n']))
        self.assertEqual('xaxb', find_common_pattern(['?a?b\n']))

run_unittest(FindPatterns)


After this, it will be easy to follow my suggestions below and verify that the script is still working correctly, passing all test cases. (Feel free to add more!)

Instead of range(0, length-1), you can write simply range(length-1).

if let == '?' and t[x][y]!= '?':
    let = t[x][y]
if t[x][y] != let and t[x][y] != '?':
    result += '?'


If the first if was true, the second will not be true, and it's a waste to evaluate it again. You could make the second if an elif instead.

The last elif in your for loop is a bit dirty: it's for checking if you're on the last line. The condition will be evaluated for every line where the current position is ?, which is a bit wasteful. It would be better to refactor that loop to perform that check only once at the end, for example:

for pos in range(length):
    letter = first[pos]
    newletter = letter
    for x in range(1, len(lines)):
        if letter == '?' and lines[x][pos] != '?':
            letter = lines[x][pos]
        elif lines[x][pos] != letter and lines[x][pos] != '?':
            newletter = '?'
            break
    if letter == '?':
        result += 'x'
    else:
        result += newletter


After this refactoring, you don't need the value of x inside the loop anymore, so you could rewrite more intuitively as:

for line in lines[1:]:
    if letter == '?' and line[pos] != '?':
        letter = line[pos]
    elif line[pos] != letter and line[pos] != '?':
        newletter = '?'
        break


Since the characters ? and x appear several times in the code, I would treat them as constants and name them for example:

  • ANYCHAR = '?'



  • SAMPLECHAR = 'x'



Instead of running main, the common way is like this:

if __name__ == '__main__':
    main()


This way, your script will still run as before when you do python script.py, and at the same time it will be possible to import it from another python script as a library without automatically running it. It's a good practice to get used to, even if you don't intend to make this a library.

Putting it all together, and some other minor improvements, this would be more Pythonic, and slightly more readable and efficient:

ANYCHAR = '?'
SAMPLECHAR = 'x'

def find_common_pattern(lines):
    first = lines[0].strip()
    lines = lines[1:]
    if not lines:
        return first.replace(ANYCHAR, SAMPLECHAR)
    result = ''
    for pos in range(len(first)):
        letter = first[pos]
        newletter = letter
        for line in lines:
            if line[pos] != ANYCHAR:
                if letter == ANYCHAR:
                    letter = line[pos]
                elif line[pos] != letter:
                    newletter = ANYCHAR
                    break
        if letter == ANYCHAR:
            result += SAMPLECHAR
        else:
            result += newletter
    return result

Code Snippets

first = lines[0]
first = lines[0].strip()
import unittest
from test.test_support import run_unittest

def main():
    n = int(sys.stdin.readline())
    lines = sys.stdin.readlines()[0:n]
    print find_common_pattern(lines)

def find_common_pattern(lines):
    # the rest of the code from the old version of `main`

class FindPatterns(unittest.TestCase):
    def test_given_examples(self):
        self.assertEqual('xab', find_common_pattern(['?ab\n', '??b\n']))
        self.assertEqual('?', find_common_pattern(['a\n', 'b\n']))
        self.assertEqual('xaxb', find_common_pattern(['?a?b\n']))

run_unittest(FindPatterns)
if let == '?' and t[x][y]!= '?':
    let = t[x][y]
if t[x][y] != let and t[x][y] != '?':
    result += '?'
for pos in range(length):
    letter = first[pos]
    newletter = letter
    for x in range(1, len(lines)):
        if letter == '?' and lines[x][pos] != '?':
            letter = lines[x][pos]
        elif lines[x][pos] != letter and lines[x][pos] != '?':
            newletter = '?'
            break
    if letter == '?':
        result += 'x'
    else:
        result += newletter

Context

StackExchange Code Review Q#47638, answer score: 7

Revisions (0)

No revisions yet.