
Break a full name into a dictionary of its parts

Submitted by: @import:stackexchange-codereview

Problem

It seems like I'm repeating myself here. Is there a standard way to deal with code like this?

def getNameParts(name):
    """Return a dictionary of a name's parts.
    name: e.g. 'JOHNSON, John Roberts, Jr. (Jack)'
    Ignores nicknames.
    """
    csSplit = name.split(', ')
    try:
        last = csSplit[0]
    except:
        last = ''
    try:
        first = csSplit[1].split(' ')[0]
    except:
        first = ''
    try:
        middle = ' '.join(csSplit[1].split(' ')[1:])
    except:
        middle = ''
    try:
        suffix = csSplit[2].split(' ')[0]
    except:
        suffix = ''
    partsDict = {'first': first,
                 'last': last,
                 'middle': middle,
                 'suffix': suffix}
    return(partsDict)

Solution

You have some unnecessary try blocks. Watch this:

>>> 'this'.split(', ')
['this']
>>> ''.split(', ')
['']


Look, Ma! No empty lists! Since a split will always have at least one item in the result, you don't need to try to get the first item. It will always be there for you. Watch this:

>>> [][1:]
[]
>>> ' '.join([])
''


Look, Ma! No errors! Slicing a list never raises an IndexError, even when the slice is out of range, and ' '.join() returns '' when its argument is empty. That means that you need only one try block for first and middle together.

An except block should specify what it is expecting. A bare except is dangerous because it also catches KeyboardInterrupt and SystemExit. I have been caught by this myself: I couldn't even interrupt the script with Ctrl+C. If it isn't an IndexError that trips the except, we want to know about it. We don't want to hide it under an except.
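Here is a small sketch of the difference. The helper names (swallow_everything, only_index_errors, press_ctrl_c, out_of_range) are made up for illustration, and raising KeyboardInterrupt by hand only simulates what Ctrl+C does:

```python
def swallow_everything(action):
    """Run action under a bare except (hypothetical helper)."""
    try:
        action()
    except:  # bare except: also catches KeyboardInterrupt and SystemExit
        return 'swallowed'
    return 'ok'


def only_index_errors(action):
    """Run action, handling IndexError and nothing else (hypothetical helper)."""
    try:
        action()
    except IndexError:
        return 'handled IndexError'
    return 'ok'


def press_ctrl_c():
    # Simulates the exception Python raises when the user presses Ctrl+C.
    raise KeyboardInterrupt


def out_of_range():
    # Indexing (unlike slicing) an empty list raises IndexError.
    return [][0]


print(swallow_everything(press_ctrl_c))   # the interrupt is silently eaten
print(only_index_errors(out_of_range))    # the expected error is handled

try:
    only_index_errors(press_ctrl_c)
except KeyboardInterrupt:
    print('interrupt got through, as it should')
```

With the bare except, the simulated Ctrl+C never reaches the caller. With except IndexError, it does.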

Your naming does not comply with PEP 8 (the Python style guide). It says to use lowercase_with_underscores for function and variable names. Besides that, csSplit is a little hard to understand. Sure, one can recognize cs as being short for comma-separated if one thinks about it, but I prefer something a little bit easier to understand. Keeping the same spirit, why not comma_split? I don't really like that name, but it seems a little easier to understand than csSplit.

It doesn't make much difference with short strings, but first and middle both perform the same split operation. To speed things up, make the split only once by assigning a variable to it.

Why the intermediate variable partsDict? We know that the function returns a dictionary of parts by its name and doc string. Therefore, the variable cannot be for clarity. It is longer to use than a simple return, so it isn't for line length. Why then? Just return directly.

The code ends up looking like this:

def get_name_parts(name):
    """Return a dictionary of a name's parts.
    name: e.g. 'JOHNSON, John Roberts, Jr. (Jack)'
    Ignores nicknames.
    """
    comma_split = name.split(', ')
    last = comma_split[0]

    try:
        first_mid = comma_split[1].split(' ')
    except IndexError:
        first_mid = ['']

    first = first_mid[0]
    middle = ' '.join(first_mid[1:])

    try:
        suffix = comma_split[2].split(' ')[0]
    except IndexError:
        suffix = ''

    return {'first': first,
            'last': last,
            'middle': middle,
            'suffix': suffix}


Is it shorter? Not really. It is only two or three lines shorter, but it is clearer, less repetitive, and safer. Happy coding!
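As a quick sanity check, here is the rewritten function run against the docstring's example. The shorter inputs ('JOHNSON, John', 'JOHNSON', and the empty string) are my own illustrative cases, not from the question:

```python
def get_name_parts(name):
    """Return a dictionary of a name's parts.
    name: e.g. 'JOHNSON, John Roberts, Jr. (Jack)'
    Ignores nicknames.
    """
    comma_split = name.split(', ')
    last = comma_split[0]  # split() always returns at least one item

    try:
        first_mid = comma_split[1].split(' ')
    except IndexError:
        first_mid = ['']  # no first/middle section after the last name

    first = first_mid[0]
    middle = ' '.join(first_mid[1:])  # empty slice joins to ''

    try:
        suffix = comma_split[2].split(' ')[0]
    except IndexError:
        suffix = ''  # no suffix section

    return {'first': first,
            'last': last,
            'middle': middle,
            'suffix': suffix}


print(get_name_parts('JOHNSON, John Roberts, Jr. (Jack)'))
# {'first': 'John', 'last': 'JOHNSON', 'middle': 'Roberts', 'suffix': 'Jr.'}

print(get_name_parts('JOHNSON, John'))
# {'first': 'John', 'last': 'JOHNSON', 'middle': '', 'suffix': ''}

print(get_name_parts('JOHNSON'))
# {'first': '', 'last': 'JOHNSON', 'middle': '', 'suffix': ''}

print(get_name_parts(''))
# {'first': '', 'last': '', 'middle': '', 'suffix': ''}
```

Every missing part comes back as an empty string, which matches what the original version with four try blocks returned for these inputs.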

Context

StackExchange Code Review Q#135805, answer score: 9
