Updating a .csv file

Submitted by: @import:stackexchange-codereview

Problem

I have a CSV file, call it csv_file. It has the following content:

Username, Password
name1, pass1
name2, pass2
...


I also have a dictionary, call it mydict. It has the following content:

mydict = {
    "name2" : "pass2",
    "name3" : "pass3"
     ...
}


I want to update my CSV file to now include name3, pass3, since those aren't in the CSV file but they are in the dictionary.

What's the most efficient, Pythonic way of doing this?

Right now, here's what I have, but I don't think it's very efficient:

import csv

# Python 3: open CSV files in text mode with newline=''
with open(csv_file, newline='') as infile, open(new_csv_file, 'w', newline='') as outfile:

    # skipinitialspace=True because the sample file has a space after each comma
    r = csv.DictReader(infile, skipinitialspace=True)
    w = csv.DictWriter(outfile, r.fieldnames)
    w.writeheader()

    temp_dict = {row['Username']: row['Password'] for row in r}

    for k in mydict:
        if k not in temp_dict:
            temp_dict[k] = mydict[k]

    for value in temp_dict:
        w.writerow({'Username': value, 'Password': temp_dict[value]})


I'm sure there's something I can do to make this better. Any suggestions?

Solution

There's no better way than creating a temporary dictionary to quickly update the contents of the entire file the way you want. However, you can speed things up by not using csv.DictReader and csv.DictWriter, because they build a separate temporary dictionary for each row processed.
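To make the difference concrete, here is a small sketch of what each reader yields per row. The in-memory sample string is a hypothetical stand-in for the file in the question, and skipinitialspace=True is assumed because that sample has a space after each comma.

```python
import csv
import io

# Hypothetical in-memory sample mirroring the file layout in the question.
sample = "Username, Password\nname1, pass1\nname2, pass2\n"

# csv.reader yields every row, including the header, as a plain list of strings.
plain_rows = list(csv.reader(io.StringIO(sample), skipinitialspace=True))

# csv.DictReader consumes the header and builds one mapping per data row.
dict_rows = list(csv.DictReader(io.StringIO(sample), skipinitialspace=True))

print(plain_rows[1])       # first data row as a list
print(dict(dict_rows[0]))  # first data row as a mapping keyed by the header
```

With only two columns, the per-row mapping adds work without adding much convenience over indexing row[0] and row[1].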

The code below is a more efficient version based on that supposition that also effectively updates the file "in-place". Note that the order of the rows in the file may change as a result of storing them temporarily in the dictionary: before Python 3.7, plain dictionaries did not guarantee insertion order. If row order is important on such versions, use a collections.OrderedDict instead.

Also noteworthy: it would be even more efficient to follow @user3757614's suggestion and do a simpler mydict.update(temp_dict) instead, then write mydict.items() out as the updated version of the file. If you want to preserve mydict, make a copy of it first and update the copy with temp_dict's contents.

import csv
import os

mydict = {
    "name2" : "pass2",
    "name3" : "pass3"
#     ...
}

csv_file = 'users.csv'  # file to be updated
tempfilename = os.path.splitext(csv_file)[0] + '.bak'
try:
    os.remove(tempfilename)  # delete any existing temp file
except OSError:
    pass
os.rename(csv_file, tempfilename)

# create a temporary dictionary from the input file
# (Python 3: text mode with newline='' instead of 'rb')
with open(tempfilename, newline='') as infile:
    reader = csv.reader(infile, skipinitialspace=True)
    header = next(reader)  # skip and save header
    temp_dict = {row[0]: row[1] for row in reader}

# only add items from mydict that weren't already present
temp_dict.update({key: value for (key, value) in mydict.items()
                      if key not in temp_dict})

# create updated version of file
# (Python 3: text mode with newline='' instead of 'wb')
with open(csv_file, mode='w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(header)
    writer.writerows(temp_dict.items())

os.remove(tempfilename)  # delete backed-up original
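
The simpler copy-and-update variant mentioned above could look roughly like the following. This is only a sketch for Python 3: the function name merge_into_csv and its parameters are hypothetical, it rewrites the file directly rather than keeping a .bak copy, and it assumes the two-column layout from the question.

```python
import csv

def merge_into_csv(path, new_entries):
    """Add entries missing from the CSV; rows already in the file keep their values."""
    with open(path, newline='') as infile:
        reader = csv.reader(infile, skipinitialspace=True)
        header = next(reader)  # save the header row
        # "if row" skips blank lines, which csv.reader yields as empty lists
        existing = {row[0]: row[1] for row in reader if row}

    merged = dict(new_entries)  # copy, so the caller's dictionary is untouched
    merged.update(existing)     # values already in the file take precedence

    with open(path, 'w', newline='') as outfile:
        writer = csv.writer(outfile)
        writer.writerow(header)
        writer.writerows(merged.items())
    return merged
```

Calling merge_into_csv('users.csv', mydict) would leave mydict unchanged and write the merged rows back. Because the copy of the dictionary is filled first, its entries come before the rows that were only in the file, so the row order differs from the original file.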

Context

StackExchange Code Review Q#98627, answer score: 7
