
Stripping whitespace in a CSV file

Submitted by: @import:stackexchange-codereview
strippingcsvfilewhitespace

Problem

I am interested in removing leading/trailing whitespace in a .csv file, and I was wondering if there's a better way to execute this:

import csv

with open("csv_file.csv", "rb") as infile:
    r = csv.DictReader(infile)
    fieldnames = r.fieldnames  # list of column names from the header row

    for row in r:
        for f in fieldnames:
            row[f] = row[f].strip()


I am fine with using this method, but I was wondering if there's a way to circumvent using a nested for loop.

Solution

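You can drop the b flag when opening a csv file, at least on Python 3.
Python 3 doesn't let me use it: the csv module there expects a file opened in text mode and raises an error when it is handed bytes. In Python 2, "rb" was the recommended mode, so the flag only makes sense if you are still on Python 2.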
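
To see the difference on Python 3, here is a minimal sketch. The sample data, the temporary file, and the variable names are made up for illustration. It assumes only that the csv module rejects a bytes stream and accepts a text-mode file opened with newline="", which is what the csv documentation recommends.

```python
import csv
import io
import os
import tempfile

# Python 3: the reader refuses a binary stream.
try:
    next(csv.reader(io.BytesIO(b"name,city\n")))
    binary_ok = True
except csv.Error:
    binary_ok = False

# Write a small sample file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", newline="", delete=False
) as tmp:
    tmp.write("name,city\r\n Ada , London \r\n")
    path = tmp.name

try:
    # Text mode; newline="" lets the csv module handle line endings itself.
    with open(path, newline="") as infile:
        rows = [dict(row) for row in csv.DictReader(infile)]
finally:
    os.remove(path)

print(binary_ok)  # False
print(rows)       # [{'name': ' Ada ', 'city': ' London '}]
```

Note that the values still carry their surrounding spaces: opening the file correctly does nothing about whitespace inside the fields.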

To eliminate one level of nesting,
you can use a dict comprehension, and the update method of a dictionary:

import csv

with open("/path/to/file.csv") as infile:
    reader = csv.DictReader(infile)

    for row in reader:
        # Overwrite every value in the row with its stripped version.
        row.update({fieldname: value.strip() for fieldname, value in row.items()})


Notice that I also renamed the variables: r and f say very little about what they hold.
I also dropped the mode argument from the open call, as "r" is the default mode anyway.
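
If the stripping is needed in more than one place, the dict comprehension can be wrapped in a small generator. This is only a sketch: stripped_rows and _strip are hypothetical helper names, not part of the csv module, and stripping the header names as well as the values is my own assumption about what is wanted. Non-string entries (for example the None that DictReader fills in for short rows by default) are passed through unchanged.

```python
import csv
import io


def _strip(item):
    """Strip strings; leave anything else (e.g. None) untouched."""
    return item.strip() if isinstance(item, str) else item


def stripped_rows(reader):
    """Yield each row of a csv.DictReader with whitespace removed.

    Hypothetical helper for illustration, not part of the csv module.
    """
    for row in reader:
        yield {_strip(key): _strip(value) for key, value in row.items()}


# In-memory sample data standing in for a real file.
sample = io.StringIO(" name , city \n Ada , London \n")
cleaned = list(stripped_rows(csv.DictReader(sample)))
print(cleaned)  # [{'name': 'Ada', 'city': 'London'}]
```

The nested loop is still there conceptually, since every value of every row has to be visited, but the call site is reduced to a single for loop over stripped_rows(reader).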

I don't think there's a simpler way to do this.
I tried using a dialect, but this is the best I could get:

import csv

# Register the dialect once, before opening the file.
csv.register_dialect('strip', skipinitialspace=True)

with open("/path/to/file.csv") as infile:
    reader = csv.DictReader(infile, dialect='strip')

    for row in reader:
        print(row)


This does NOT work as intended: skipinitialspace only ignores spaces immediately following the delimiter, so each field loses its leading spaces but keeps its trailing ones.
My first thought was actually to set delimiter=r'\s,\s' when registering the dialect,
but that doesn't work because delimiter must be a single character;
it cannot be a regex.
So, if you want to strip whitespace from both ends of the values,
you have no choice but to do it yourself.
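
The three observations above can be checked with a short sketch. The sample string and the variable names are invented for the demonstration, and the data is read from an in-memory io.StringIO rather than a real file.

```python
import csv
import io

sample = "name, city\nAda , London \n"

# skipinitialspace drops only the spaces right after each delimiter.
half_stripped = [
    dict(row)
    for row in csv.DictReader(io.StringIO(sample), skipinitialspace=True)
]

# A multi-character delimiter is rejected outright.
try:
    csv.reader(io.StringIO(sample), delimiter=r"\s,\s")
    regex_delimiter_ok = True
except TypeError:
    regex_delimiter_ok = False

# Stripping by hand handles both ends of every value.
fully_stripped = [
    {key: value.strip() for key, value in row.items()}
    for row in csv.DictReader(io.StringIO(sample), skipinitialspace=True)
]

print(half_stripped)       # [{'name': 'Ada ', 'city': 'London '}]
print(regex_delimiter_ok)  # False
print(fully_stripped)      # [{'name': 'Ada', 'city': 'London'}]
```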

Context

StackExchange Code Review Q#98932, answer score: 5
