Finding a common prefix/suffix in a list/tuple of strings

Submitted by: @import:stackexchange-codereview

Problem

This question was sparked by one on Stack Overflow in which the OP was looking for a way to find a common prefix among file names (a list of strings). While an answer was given that said to use something from the os library, I began to wonder how one might implement a common_prefix function.

I decided to try my hand at finding out, and along with creating a common_prefix function, I also created a common_suffix function. After verifying that the functions worked, I decided to go the extra mile: I documented my functions and made them into a package of sorts, as I'm sure they will come in handy later.

But before sealing up the package for good, I decided I would try to make my code as "Pythonic" as possible, which led me here.

I made sure to document my code heavily, so I feel confident that I shouldn't have to explain how the functions work, and how to use them:

```
from itertools import zip_longest

def all_same(items: (tuple, list, str)) -> bool:
    '''
    A helper function to test if
    all items in the given iterable
    are identical.

    Arguments:
    items -> the given iterable to be used

    eg.
    >>> all_same([1, 1, 1])
    True
    >>> all_same([1, 1, 2])
    False
    >>> all_same((1, 1, 1))
    True
    >>> all_same((1, 1, 2))
    False
    >>> all_same("111")
    True
    >>> all_same("112")
    False
    '''
    return all(item == items[0] for item in items)

def common_prefix(strings: (list, tuple), _min: int=0, _max: int=100) -> str:
    '''
    Given a list or tuple of strings, find the common prefix
    among them. If a common prefix is not found, an empty string
    will be returned.

    Arguments:
    strings -> the string list or tuple to
    be used.

    _min, _max -> If a common prefix is found,
    its length will be tested against the range _min
    and _max. If its length is not in the range, an
    empty string will be returned, otherwise the prefix
    is returned.

    eg.
    >>> common_prefix([
```

Solution

common_suffix can be written as return common_prefix([s[::-1] for s in strings])[::-1], because the two operations are mirror images of one another; this avoids duplicating the logic.
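
A minimal sketch of that idea, assuming a simple string-only common_prefix. The helper below is an illustrative stand-in, not the code from the question. Each string is reversed, the prefix of the reversed strings is taken, and the result is reversed back:

```python
def common_prefix(strings):
    """Illustrative stand-in for the question's function: longest shared prefix."""
    if not strings:
        return ""
    prefix = []
    # zip stops at the shortest string, so the prefix never overruns any input.
    for chars in zip(*strings):
        if len(set(chars)) != 1:
            break
        prefix.append(chars[0])
    return "".join(prefix)


def common_suffix(strings):
    # Reverse every string, find the prefix, then reverse the result back.
    return common_prefix([s[::-1] for s in strings])[::-1]
```

With this, common_suffix(["testing", "running", "sing"]) gives "ing", and only one function contains the actual comparison logic.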

Also, I think you should not handle _min and _max inside the common_prefix function, because that gives the function a double responsibility: finding the prefix and checking its length against an interval.
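
One way to separate the two responsibilities might look like the sketch below. The name within_length and its parameters are hypothetical, inclusive bounds are an assumption, and os.path.commonprefix merely stands in for the question's common_prefix:

```python
import os


def within_length(text, min_len=0, max_len=100):
    """Hypothetical helper: return text if its length is in range, else ''."""
    # Inclusive bounds are an assumption; the question does not say which it uses.
    return text if min_len <= len(text) <= max_len else ""


names = ["report_2020.txt", "report_2021.txt", "report_final.txt"]

# Step 1: find the prefix (character-wise), with no length logic involved.
prefix = os.path.commonprefix(names)

# Step 2: apply the length check only where the caller actually wants it.
checked = within_length(prefix, min_len=1, max_len=10)
```

The prefix finder stays a single-purpose function, and callers that do not care about length limits never have to think about them.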

Why are you limiting yourself to strings? Python makes it easy to write functions that work on any sequence.

Why do you build the whole result and then return it? You could yield the result item by item instead.
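
As a rough sketch of what yielding item by item could look like (this is an illustration, not the answer's original code), a generator can stop at the first mismatch and accept any iterables, not only strings:

```python
def common_prefix(iterables):
    """Lazily yield the leading items shared by all the iterables."""
    # zip(*iterables) produces one tuple per position and stops at the shortest input.
    for items in zip(*iterables):
        first = items[0]
        if any(item != first for item in items):
            return  # the first mismatch ends the prefix
        yield first
```

A caller that wants a string can join the result, for example "".join(common_prefix(["interstellar", "internet", "interval"])), while list(common_prefix([[1, 2, 3], [1, 2, 4]])) works unchanged on lists.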

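Why do you write so much yourself? Using the itertools module is simpler and more efficient (here all_equal is assumed to be a helper, like all_same in the question, that tests whether all items are equal):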

import itertools

def common_prefix(its):
    yield from itertools.takewhile(all_equal, zip(*its))
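
To make the snippet above runnable on its own, all_equal has to be defined somewhere. The version below is an assumption about what such a helper might look like, not necessarily the one the answer used:

```python
import itertools


def all_equal(items):
    """Assumed helper: True when every element of `items` equals the first."""
    iterator = iter(items)
    first = next(iterator, None)
    return all(item == first for item in iterator)


def common_prefix(its):
    # zip(*its) yields one tuple per position; keep tuples while all members match.
    yield from itertools.takewhile(all_equal, zip(*its))
```

Note that this version yields whole tuples rather than single items, so list(common_prefix(["abc", "abd"])) is [("a", "a"), ("b", "b")]. To recover a string, take the first element of each tuple: "".join(t[0] for t in common_prefix(["prefix", "preview", "prepare"])).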


PS: Because common_prefix is now a generator, common_suffix can no longer slice its result with [::-1]; it will need reversed(list(...)) instead.
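
A possible shape for that generator-friendly common_suffix, assuming an item-by-item common_prefix like the illustrative one repeated here so the example is self-contained:

```python
def common_prefix(iterables):
    """Illustrative generator: yield leading items shared by all inputs."""
    for items in zip(*iterables):
        if any(item != items[0] for item in items):
            return
        yield items[0]


def common_suffix(sequences):
    # reversed() works on any sequence, so the inputs need not be strings.
    backwards = list(common_prefix([reversed(seq) for seq in sequences]))
    # The generator had to be materialised first; now restore the original order.
    return backwards[::-1]
```

The result is a list, so string callers would use "".join(common_suffix(["testing", "running"])) to get "ing".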

Code Snippets

import itertools

def common_prefix(its):
    yield from itertools.takewhile(all_equal, zip(*its))

Context

StackExchange Code Review Q#145757, answer score: 5
