HiveBrain v1.2.0
Get Started
← Back to all entries
snippetpythonMinor

Parse emails and create HTML markup from attachments

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
attachmentscreateparsemarkupemailsandfromhtml

Problem

I am developing a script that will act as a surrogate of sorts for a web portal and an email blast program. The web portal is used to create web-to-print files and is very good at creating print-ready PDF files, but currently lacks any email blast or HTML output options for the end user. In order to circumvent this limitation, my script will parse data that is in an email that is automatically sent after each order is placed. In this email are variables from the site that user has submitted as well as images that the user has uploaded.

My code still needs to archive the emails after they have been zipped (to eliminate duplicate files) and send the zip file to the person who ordered the HTML. Despite not being 100%, I was hoping to get some good feedback/criticism on what I have so far. I am not brand new to Python, but I am self taught and figured this is the best place to learn how to refine my code.

Here is my current (expected) workflow:

  • User logs into web portal to order.



  • User uploads images, enters text into fields, and receives a proof of their "HTML".



  • User completes order, checks out. Process is now out of user's hands.



  • An email is generated and sent to me. This email includes all information necessary to create HTML markup.



  • Script reads email, downloads attachments, assigns values to variables based on their "markup" in the email. Specific characters are used to identify text strings to be used.



  • Script creates HTML markup using assigned variables, then adds .html file and all included images to a .zip file fit for distribution.



  • Script archives email to prevent duplicate zip files being created on next run.



  • An email is sent to the user with the zip file attached.



  • User uploads zip file to their own specific email blast program.



Links to my code and supporting files are here:

  • Script



  • Email files



```
import email, getpass, imaplib, os, re, csv, zipfile, glob, threading

detach_dir = 'directory' # directory where to save a

Solution

PEP8

You're not following PEP8, Python's official coding style guide, for example:

  • Don't put multiple imports on the same line



  • Put a space after commas in parameter lists, so m.login(user, pwd) instead of m.login(user,pwd)



  • There shouldn't be whitespace after :



  • Use at least 2 spaces before inline comments (before each # comment)



  • Put a space after # when starting comments, so # comment instead of #comment



  • Some lines are too long (longer than 79 characters)



Bad practices

  • Remove unused imports: getpass, csv, threading



  • No need for the parens in user = ("username"), and it may get confused as a tuple (which it isn't)



  • The variable m is poorly named: judging by the name I might guess it's a "message", when it's actually a mailbox



  • At the point where you do str(message), message might not have been assigned yet



  • At the point where you do os.listdir(path), path might not have been assigned yet



In this code:

resp, items = m.search(None, "ALL")
items = items[0].split()  # getting the mails id

for emailid in items:
    resp, data = m.fetch(emailid, "(RFC822)")


The initialization of resp is pointless, because it will be reassigned anyway in the loop. Effectively you throw away the first assignment of resp. In such cases a common practice is to use _ as the variable name, like this:

_, items = m.search(None, "ALL")


On closer look, I see that after the 2nd assignment too,
resp is not used again. So it could be replaced with _ there too.

message = re.compile(r'\%(.+?)\%', re.DOTALL).findall(content)


This is the same as the above but simpler:

message = re.findall(r'\%(.+?)\%', content, re.DOTALL)


But since you have this code inside a for loop,
it would be best to compile the pattern once before the loop,
and reuse it inside, like this:

re_between_percent = re.compile(r'\%(.+?)\%', re.DOTALL)
for part in mail.walk():
    # ...
    message = re_between_percent.findall(content)


Do the same for the other statements too where you do re.compile(...).findall(...).

Instead of this:

htmlData = open(os.path.join('directory', htmlFile), 'w+')
    htmlData.write(htmlCode)
    print(htmlFile + ' Complete')
    htmlData.close()


The Pythonic way to work with files:

with open(os.path.join('directory', htmlFile), 'w+') as htmlData:
        htmlData.write(htmlCode)
        print(htmlFile + ' Complete')

Code Snippets

resp, items = m.search(None, "ALL")
items = items[0].split()  # getting the mails id

for emailid in items:
    resp, data = m.fetch(emailid, "(RFC822)")
_, items = m.search(None, "ALL")
message = re.compile(r'\%(.+?)\%', re.DOTALL).findall(content)
message = re.findall(r'\%(.+?)\%', content, re.DOTALL)
re_between_percent = re.compile(r'\%(.+?)\%', re.DOTALL)
for part in mail.walk():
    # ...
    message = re_between_percent.findall(content)

Context

StackExchange Code Review Q#26167, answer score: 2

Revisions (0)

No revisions yet.