HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Counting e-mails in a mailbox for each hour

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
mailscountingeachhourmailboxfor

Problem

Pardon if this is silly, but I'm pretty new to coding. I have this code, that works, but I feel like it could be tighter than this, I just can't seem to get it to work any other way. The idea is to read a .txt file find incoming e-mail strings and organize the data by frequency of hour sent.

Here is an example line that I'm looking for in the text:

From email@emailaddress.com Sat Jan 5 09:14:16 2008


Here is my code:

`fname = input("Enter file:")
if len(fname)

Solution

Formatting

Please follow PEP8, the Python style guide.
Most noticeably,
use 4 spaces to indent.

Working with files

The recommended way to work with files is using with ... as, like this:

with open(fname) as fh:
    for line in fh:
        if not line.startswith("From "):
            continue
        words = line.split()
        time.append(words[5])


This way, you don't have to remember to close the filehandle when you're done using it.

Naming

Use plural names for collections.
time is really odd for a list of time strings.
And I suppose that's part of the reason for naming the loop variable i here:

for i in time:
    i.split(":")
    hours.append(i[:2])


When it would have been so much more natural as:

for time in times:
    time.split(":")
    hours.append(time[:2])


Pointless statements

i.split here is completely pointless. It splits the string, but then the result is never used:

for i in time:
    i.split(":")
    hours.append(i[:2])


This is the same thing:

for i in time:
    hours.append(i[:2])


Use list comprehensions

The hours list can be created simply as:

hours = [time[:2] for time in times]


Use collections.Counter

Instead of this:

for h in hours:
        hr[h]=hr.get(h, 0)+1


You could create hr a lot simpler:

from collections import Counter

hr = Counter(hours)


(Of course, the import statement should be at the top of the script.)

Sorting

Instead of this:

l = list()
for k, v in hr.items():
    l.append((k, v))
l.sort()
for k, v in l:
    print(k, v)


You could use the builtin sorted, and greatly simplify:

for hour, count in sorted(hr.items()):
    print(hour, count)


Notice that I renamed the loop variables. How much easier to read that!

Use functions

Rather than just a sequence of statements,
it would be better to decompose your logic to functions.

Suggested implementation

With the above suggestions applied (+ PEP8, do read that!), and then some more:

from collections import Counter

DEFAULT_FILENAME = 'filename.txt'

def input_filename():
    filename = input("Enter file:")
    if not filename:
        filename = DEFAULT_FILENAME
    return filename

def read_times(filename):
    with open(filename) as fh:
        for line in fh:
            if line.startswith("From "):
                yield line.split()[5][:2]

def main():
    filename = input_filename()

    hour_counts = Counter(read_times(filename))

    for hour, count in sorted(hour_counts.items()):
        print(hour, count)

if __name__ == '__main__':
    main()

Code Snippets

with open(fname) as fh:
    for line in fh:
        if not line.startswith("From "):
            continue
        words = line.split()
        time.append(words[5])
for i in time:
    i.split(":")
    hours.append(i[:2])
for time in times:
    time.split(":")
    hours.append(time[:2])
for i in time:
    i.split(":")
    hours.append(i[:2])
for i in time:
    hours.append(i[:2])

Context

StackExchange Code Review Q#111240, answer score: 7

Revisions (0)

No revisions yet.