HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Refactor Oxford Comma function (and other functions)

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
commafunctionoxfordandfunctionsotherrefactor

Problem

I wrote a program that managed projects and created a set of SQL files for each project. In the logic, I have a few functions that, each time I look at them, I wonder if they could be written in a better way. They are pretty straight forward and compact, however I feel like there are better, more efficient ways of computing their results.

I would appreciate any comments on coding style or improvements (either in readability or performance) that you may have.

Oxford Comma

This function takes a list as input then passes an Oxford comma-separated string off to another function:

# >>> obj._format_missing_message(['a','b','c'])
# 'a, b, and c'
def _format_missing_message(self, missing): 
        length = len(missing)

        # Show the message if needed
        if length > 0:
            and_clause = '{} and '.format('' if length <= 2 else ',')
            message = and_clause.join([', '.join(missing[:-1]), missing[-1]])
            self._tab_pane.currentWidget().update_status(
                                            "Configs not found for: {}.".format(message))


SQL 'LIKE' String Creator

This next function takes a list of string inputs then finds the SQL LIKE syntax that recognizes all of the list:

```
import difflib

# >>> create_sql_regex(['123','124','125'])
# '12%'
def create_sql_regex(ids):
''' Creates a SQL regex from a list of strings for use in LIKE statement '''
longest_match = ''

# Raise an error if there is only one item in the list as there needs to be at least two
# items to create the regex.
length = len(ids)
if length <= 1:
raise NotEnoughComparisonData(
'Not enough data to compare. Passed list length: {}'.format(length))

# Create the SequenceMatcher and loop through each element in the list, comparing it and
# the previous item.
matcher = difflib.SequenceMatcher()
for item, next_item in zip(ids[:-1], ids[1:]):
matcher.set_seqs(item, next_item)
long_m

Solution

Oxford comma

Are these really your expected outcomes?

# for ['a']
Configs not found for:  and a.

# for ['a', 'b']
Configs not found for: a and b.

# for ['a', 'b', 'c']
Configs not found for: a,  b, and c.


In particular, the case of a single item looks like a bug, and in the case of 3 or more items the 2 spaces between a, b looks a bit strange.

In any case, it would make sense to split the method, for example:

def _format_missing_message(self, missing):
    if missing:
        message = _oxford_comma(missing)
        self._tab_pane.currentWidget().update_status(
            "Configs not found for: {}.".format(message))

@staticmethod
def _oxford_comma(items):
    length = len(items)
    and_clause = '{} and '.format('' if length <= 2 else ',')
    return and_clause.join([',  '.join(items[:-1]), items[-1]])


This way it's easier to unit test the Oxford comma logic.

Finally, I think the method would be slightly easier to read like this:

def _oxford_comma(items):
    length = len(items)
    if length == 1:
        return items[0]
    if length == 2:
        return '{} and {}'.format(*items)
    return '{}, and {}'.format(', '.join(items[:-1]), items[-1])


Count Duplicates

My only objection is the method name, count_duplicates. You are counting elements, not necessarily duplicates, and returning a map of item => count pairs. So for example get_counts would be more natural.

That's all I can pick on ;-)

Code Snippets

# for ['a']
Configs not found for:  and a.

# for ['a', 'b']
Configs not found for: a and b.

# for ['a', 'b', 'c']
Configs not found for: a,  b, and c.
def _format_missing_message(self, missing):
    if missing:
        message = _oxford_comma(missing)
        self._tab_pane.currentWidget().update_status(
            "Configs not found for: {}.".format(message))

@staticmethod
def _oxford_comma(items):
    length = len(items)
    and_clause = '{} and '.format('' if length <= 2 else ',')
    return and_clause.join([',  '.join(items[:-1]), items[-1]])
def _oxford_comma(items):
    length = len(items)
    if length == 1:
        return items[0]
    if length == 2:
        return '{} and {}'.format(*items)
    return '{}, and {}'.format(', '.join(items[:-1]), items[-1])

Context

StackExchange Code Review Q#51238, answer score: 2

Revisions (0)

No revisions yet.