Manipulating a .csv file to look for two common values to create a key, then summing up values

Submitted by: @import:stackexchange-codereview

Problem

My code reads a .csv file, looks for a couple of values to create its key, and then rolls up the data based on a few business rules I have. I'm trying to learn "the right way", so I'd really appreciate some tips and hints.

```
import csv
import sys

# add DictReader
with open(sys.argv[1], 'rbU') as csvinfile:
    myreader = csv.reader(csvinfile, delimiter=',', quotechar='"')

    # initialize variables and dictionaries
    line_cntr = 0
    dict = {}
    cntr = {}

    for row in myreader:
        # constant values
        nd_amt = row[14]
        deduct_amt = row[15]
        nondeduct_ytd = row[16]
        deduct_ytd = row[17]
        pid = row[18]
        don_date = row[19]
        amount = row[23]
        anon = row[25]
        int_0003 = row[38]
        int_0006 = row[39]
        int_0028 = row[40]

        # create a composite key for our dictionary
        key = pid + ":" + don_date
        # check to see if key exists in dictionary, if not then add
        # as per BR-0010 every group of up to 6 entries with same P_ID and Don_Date should print on their own line (i.e. different entry in Dict).
        if key in dict:
            if cntr[key] % 6 != 0:
                dict[key][14] += row[14]
                dict[key][15] += row[15]
                dict[key][16] += row[16]
                dict[key][17] += row[17]
                dict[key][23] += row[23]
                cntr[key] += 1
            else:
                key = pid + ":" + don_date + ":" + str(cntr[key]//6)
                dict[key] = row
        else:
            dict[key] = row
            cntr[key] = 1

        # debugging
        for key in cntr:
            if cntr[key] > 6:
                print(key, cntr[key])

        # keep track of lines processed for recon
        line_cntr += 1

# add DictWriter
with open(sys.argv[2], 'wb') as csvoutfile:
    mywriter = csv.writer(csvoutfile, delimiter=',', quotechar='"')

    for key in dict:
        outline = (key, dict[key])
        mywriter.w
```

Solution

Overall, your code is pretty clean.

Here are some points for improvement:

- Use the `if __name__ == '__main__'` idiom if this file is meant to be executed directly. See https://docs.python.org/2/library/main.html

- Structure your code so that nothing runs before the `__name__` check or on import. In other words, don't have code blocks that start at the left-hand margin. Instead, place the code in a few functions that are called by a main `run()` function.

- Use an enum or constants to represent the index values for things like `nd_amt`.
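
As a rough sketch of the first two points (not the asker's actual program), the file could be organized along these lines. It assumes Python 3, whereas the `'rbU'` and `'wb'` file modes in the question suggest Python 2. The helper names `read_rows` and `write_rows` are hypothetical, and the business rules are left as a placeholder.

```python
import csv
import sys


def read_rows(path):
    """Return all rows of the CSV file at `path` as lists of strings."""
    with open(path, newline="") as csvinfile:
        return list(csv.reader(csvinfile, delimiter=",", quotechar='"'))


def write_rows(path, rows):
    """Write `rows` (an iterable of sequences) to a CSV file at `path`."""
    with open(path, "w", newline="") as csvoutfile:
        csv.writer(csvoutfile, delimiter=",", quotechar='"').writerows(rows)


def run(in_path, out_path):
    """Read the input, apply the business rules, and write the result."""
    rows = read_rows(in_path)
    # Placeholder: the grouping and summing logic from the question goes here.
    write_rows(out_path, rows)
    return len(rows)


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    if len(sys.argv) == 3:
        run(sys.argv[1], sys.argv[2])
    else:
        print("usage: python script.py INPUT.csv OUTPUT.csv")
```

With this layout, importing the module has no side effects, and `run()` can be called from a test or from another script.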

EDIT: an example of the constants approach:

```
ND_AMT = 14
...

nd_amt = row[ND_AMT]
```
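
The enum variant of the same idea might look like the sketch below. It assumes Python 3.4 or later, where `enum` is in the standard library. The member names are derived from the variable names in the question, while `Col`, `make_key`, and the sample row are made up for illustration. Because `IntEnum` members behave like integers, they can index a list directly.

```python
from enum import IntEnum


class Col(IntEnum):
    """Column positions, taken from the indices used in the question."""
    ND_AMT = 14
    DEDUCT_AMT = 15
    NONDEDUCT_YTD = 16
    DEDUCT_YTD = 17
    PID = 18
    DON_DATE = 19
    AMOUNT = 23


def make_key(row):
    """Build the composite 'pid:don_date' key used to group rows."""
    return row[Col.PID] + ":" + row[Col.DON_DATE]


# A made-up row just wide enough to index; real rows would come from csv.reader.
row = [""] * 24
row[Col.PID] = "P001"
row[Col.DON_DATE] = "2015-07-23"
print(make_key(row))  # P001:2015-07-23
```

Grouping the indices in one class keeps them in a single place, so a change in the file layout only needs to be made once.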

Context

StackExchange Code Review Q#97843, answer score: 3
