patternpythonMinor
Refactor Oxford Comma function (and other functions)
Viewed 0 times
commafunctionoxfordandfunctionsotherrefactor
Problem
I wrote a program that managed projects and created a set of SQL files for each project. In the logic, I have a few functions that, each time I look at them, I wonder if they could be written in a better way. They are pretty straight forward and compact, however I feel like there are better, more efficient ways of computing their results.
I would appreciate any comments on coding style or improvements (either in readability or performance) that you may have.
Oxford Comma
This function takes a
SQL 'LIKE' String Creator
This next function takes a
```
import difflib
# >>> create_sql_regex(['123','124','125'])
# '12%'
def create_sql_regex(ids):
''' Creates a SQL regex from a list of strings for use in LIKE statement '''
longest_match = ''
# Raise an error if there is only one item in the list as there needs to be at least two
# items to create the regex.
length = len(ids)
if length <= 1:
raise NotEnoughComparisonData(
'Not enough data to compare. Passed list length: {}'.format(length))
# Create the SequenceMatcher and loop through each element in the list, comparing it and
# the previous item.
matcher = difflib.SequenceMatcher()
for item, next_item in zip(ids[:-1], ids[1:]):
matcher.set_seqs(item, next_item)
long_m
I would appreciate any comments on coding style or improvements (either in readability or performance) that you may have.
Oxford Comma
This function takes a
list as input then passes an Oxford comma-separated string off to another function:# >>> obj._format_missing_message(['a','b','c'])
# 'a, b, and c'
def _format_missing_message(self, missing):
length = len(missing)
# Show the message if needed
if length > 0:
and_clause = '{} and '.format('' if length <= 2 else ',')
message = and_clause.join([', '.join(missing[:-1]), missing[-1]])
self._tab_pane.currentWidget().update_status(
"Configs not found for: {}.".format(message))SQL 'LIKE' String Creator
This next function takes a
list of string inputs then finds the SQL LIKE syntax that recognizes all of the list:```
import difflib
# >>> create_sql_regex(['123','124','125'])
# '12%'
def create_sql_regex(ids):
''' Creates a SQL regex from a list of strings for use in LIKE statement '''
longest_match = ''
# Raise an error if there is only one item in the list as there needs to be at least two
# items to create the regex.
length = len(ids)
if length <= 1:
raise NotEnoughComparisonData(
'Not enough data to compare. Passed list length: {}'.format(length))
# Create the SequenceMatcher and loop through each element in the list, comparing it and
# the previous item.
matcher = difflib.SequenceMatcher()
for item, next_item in zip(ids[:-1], ids[1:]):
matcher.set_seqs(item, next_item)
long_m
Solution
Oxford comma
Are these really your expected outcomes?
In particular, the case of a single item looks like a bug, and in the case of 3 or more items the 2 spaces between
In any case, it would make sense to split the method, for example:
This way it's easier to unit test the Oxford comma logic.
Finally, I think the method would be slightly easier to read like this:
Count Duplicates
My only objection is the method name,
That's all I can pick on ;-)
Are these really your expected outcomes?
# for ['a']
Configs not found for: and a.
# for ['a', 'b']
Configs not found for: a and b.
# for ['a', 'b', 'c']
Configs not found for: a, b, and c.In particular, the case of a single item looks like a bug, and in the case of 3 or more items the 2 spaces between
a, b looks a bit strange.In any case, it would make sense to split the method, for example:
def _format_missing_message(self, missing):
if missing:
message = _oxford_comma(missing)
self._tab_pane.currentWidget().update_status(
"Configs not found for: {}.".format(message))
@staticmethod
def _oxford_comma(items):
length = len(items)
and_clause = '{} and '.format('' if length <= 2 else ',')
return and_clause.join([', '.join(items[:-1]), items[-1]])This way it's easier to unit test the Oxford comma logic.
Finally, I think the method would be slightly easier to read like this:
def _oxford_comma(items):
length = len(items)
if length == 1:
return items[0]
if length == 2:
return '{} and {}'.format(*items)
return '{}, and {}'.format(', '.join(items[:-1]), items[-1])Count Duplicates
My only objection is the method name,
count_duplicates. You are counting elements, not necessarily duplicates, and returning a map of item => count pairs. So for example get_counts would be more natural.That's all I can pick on ;-)
Code Snippets
# for ['a']
Configs not found for: and a.
# for ['a', 'b']
Configs not found for: a and b.
# for ['a', 'b', 'c']
Configs not found for: a, b, and c.def _format_missing_message(self, missing):
if missing:
message = _oxford_comma(missing)
self._tab_pane.currentWidget().update_status(
"Configs not found for: {}.".format(message))
@staticmethod
def _oxford_comma(items):
length = len(items)
and_clause = '{} and '.format('' if length <= 2 else ',')
return and_clause.join([', '.join(items[:-1]), items[-1]])def _oxford_comma(items):
length = len(items)
if length == 1:
return items[0]
if length == 2:
return '{} and {}'.format(*items)
return '{}, and {}'.format(', '.join(items[:-1]), items[-1])Context
StackExchange Code Review Q#51238, answer score: 2
Revisions (0)
No revisions yet.