Merging three CSV files with a primary key
Problem
This is my first time coding in my life. I don't really know much about coding. I googled whatever I needed and combined it all together. It works great, but I want to know if there is any improvement needed. Is there a better language to write it in?
```
import os, csv, re, time
from datetime import date, timedelta
SendIDs =[]
BouncedEAes=[]
SubKeys=[]
MerchantIDs=[]
EventDates=[]
BounceCategories=[]
BounceReasones=[]
#SendTimes=[] #To find the most recent date that a welcome email was sent.
BounceDate=0
f = open("SendJobs.csv") #To get Send IDs and dates for Welcome emails
for row in csv.reader(f):
    if "Welcome" in row[7]:
        SendIDs.append(row[1])
        # SendTimes.append(
        #     time.strftime("%Y%m%d",
        #     time.strptime(row[5],"%m/%d/%Y %H:%M:%S %p")))
f.close()
# if not os.path.exists('SendIDs.csv'):
#     open('SendIDs.csv', 'w').close()
# f = open("SendIDs.csv")
# for row in csv.reader(f):
#     SendIDs.append(row[0])
# UniqSendIDs = {}.fromkeys(SendIDs).keys()
# f.close()
# f = open("SendIDs.csv","w")
# with f as output:
#     writer = csv.writer(output, lineterminator='\n')
#     for item in UniqSendIDs:
#         writer.writerow([item])
# f.close()
f = open('Bounces.csv')
for row in csv.reader(f):
    for item in SendIDs: #OR UniqSendIDs
        if item == row[1]:
            SubKeys.append(row[2])
            BouncedEAes.append(row[3])
            BounceCategories.append(row[8])
            BounceReasones.append(row[10])
            #EventDate: Only need bounce date, NO time required.
            BounceDate = time.strptime(row[6],"%m/%d/%Y %H:%M:%S %p")
            BounceDate = time.strftime("%m/%d/%Y", BounceDate)
            EventDates.append(BounceDate)
f.close()
f = open('Attributes.csv')
for row in csv.reader(f):
    for item in BouncedEAes:
        if item == row[2]:
            MerchantIDs.append(row[4])
f.close()
SubKeys.insert(0,"SubscriberKey")
BouncedEAes.insert(0,"EmailAddress")
MerchantIDs.insert(0,"Merchant_Number")
E
```
Solution
If it's really your first time coding, then that's great: you actually managed to use some parts of the Python standard library that people exposed to the language for the first time tend to needlessly reinvent, such as the csv module :)
Python version
The first big question: you used Python, but why did you use version 2.7? This version (bugfixes aside) is already 5 years old, and its maintenance was supposed to end this year (it has been extended to 2020 since it's still widely used). If you start from scratch, you might as well start with a fresh version, such as Python 3.4. Generally speaking, don't use Python 2.7 unless you're forced to (there are still reasons); try to adapt to the most recent version instead.
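One concrete difference to keep in mind if you move a csv script like this one to Python 3: the csv module there expects files opened in text mode with newline="", whereas Python 2 code opens them in binary mode ("rb" / "wb"). Here is a minimal sketch of that; the file name and the rows are made up for illustration and are not taken from your script:

```python
import csv
import os
import tempfile

# Hypothetical rows and file name, only for illustration.
rows = [["SubscriberKey", "EmailAddress"], ["42", "someone@example.com"]]
path = os.path.join(tempfile.gettempdir(), "python3_csv_demo.csv")

# Python 3: text mode plus newline="" (Python 2 would use "wb" here).
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Reading back follows the same rule: text mode and newline="".
with open(path, newline="") as f:
    read_back = list(csv.reader(f))
```

Passing newline="" lets the csv module handle line endings itself, which avoids blank lines between rows on Windows.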
Commented out code
Generally speaking, you should never leave commented-out pieces of code in your code. They only hinder readability. I assume that you don't use source control software yet if it's your first time programming, but you might want to look at software like Git in the future, so that every version of your code is saved somewhere and you don't have to keep outdated remnants of code in your source.
Useless stuff in general
More generally, remove from your code what is unneeded, such as unused variables or unused imports. In your case, you could get rid of the re, os and time imports: you don't use re, you only use os in commented-out code that shouldn't be there, and what you use from time is also available in datetime.
The with statement
One thing that is easy to forget is closing files. You didn't forget to close them, which is great, but Python provides a superior alternative to closing files manually in the form of the with statement. Here is a piece of your code rewritten with it:
```
with open("Welcome-"+yesterday.strftime('%Y%m%d')+".csv","wb") as f:
    output=csv.writer(f)
    for row in new_data:
        output.writerow(row)
```
As you can see, I didn't manually call f.close(); this is done automatically when we leave the with block. Actually, with isn't limited to closing files: it can do many other things, depending on the type you use it with.
Use the PEP 8
Python has a style guide which is rather good, and people tend to follow it. Therefore, you could try to rewrite your code following that style guide so that it looks more like what people expect to see when they read Python code. For example, I will rewrite the previous piece of code once again, following PEP 8:
```
with open("Welcome-" + yesterday.strftime('%Y%m%d') + ".csv", "wb") as f:
    output = csv.writer(f)
    for row in new_data:
        output.writerow(row)
```
Basically, it's still the same. I only added spaces. But even those can matter when you read code :)
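Coming back to the unused imports: the two-step time.strptime / time.strftime conversion of the bounce date can probably be done with datetime alone, which would let you drop the time import. A minimal sketch, assuming the timestamp format used in the question; the helper name bounce_day and the sample value are made up for illustration:

```python
from datetime import datetime

# Timestamp layout copied from the question's script.
BOUNCE_FORMAT = "%m/%d/%Y %H:%M:%S %p"


def bounce_day(timestamp):
    """Return only the date part of a bounce timestamp (hypothetical helper)."""
    return datetime.strptime(timestamp, BOUNCE_FORMAT).strftime("%m/%d/%Y")


# Made-up sample value in the assumed format.
print(bounce_day("05/28/2015 10:15:30 AM"))
```

In the loop over Bounces.csv, this would amount to something like EventDates.append(bounce_day(row[6])).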
Context
StackExchange Code Review Q#88314, answer score: 7