Finding a pair of words from two files with a particular MD5

Submitted by: @import:stackexchange-codereview

Problem

I have two txt files, each containing a list of 60k+ words. I put together some code that loops through both files, concatenates each pair of words, and checks whether the MD5 of the result matches a given hash:

import hashlib
with open('1.txt') as f:
    a= [line.rstrip() for line in f]
with open('2.txt') as f:
    b= [line.rstrip() for line in f]

for y in a:
    for z in b:
        if hashlib.md5(y+z).hexdigest()=='XXXTHEXXXHASHXXX':
            print y+z
            break


So line 1 in file 1 gets concatenated with line 1 in file 2, then with line 2, and so on. Then line 2 in file 1 gets concatenated with every line in file 2... you get where this is going.

Is there a cleaner way to do this? Also, how can I edit the script to use more cores?

Solution

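In general, no, this would not be the fastest approach, because hashlib.md5(y+z) rehashes the prefix y for every z. However, assuming all the lines are quite short, I don't think the 'general' fast approach of reusing the prefix hash will do any better.

You could hash each prefix once and then use hashlib.md5().copy() to reuse that state for every suffix: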

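import hashlib

with open('1.txt') as f:
    a = [line.rstrip() for line in f]
with open('2.txt') as f:
    b = [line.rstrip() for line in f]

for y in a:
    # Hash the prefix once; every suffix starts from a copy of this state.
    prefix_hash = hashlib.md5(y)
    for z in b:
        candidate = prefix_hash.copy()
        # update() returns None, so it cannot be chained onto copy().
        candidate.update(z)
        if candidate.hexdigest() == 'XXXTHEXXXHASHXXX':
            print y + z
            break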
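
As a quick sanity check of the copy() idea, here is a minimal Python 3 sketch (the code above is Python 2). The helper name digest_with_prefix and the sample words are made up for illustration. It shows that updating a copy of the prefix hash gives the same digest as hashing the concatenation, and that the prefix state itself is left untouched.

```python
import hashlib


def digest_with_prefix(prefix_hash, suffix):
    """Return the hex digest of prefix + suffix without rehashing the prefix."""
    candidate = prefix_hash.copy()  # independent copy of the prefix state
    candidate.update(suffix)        # update() mutates in place and returns None
    return candidate.hexdigest()


# Made-up sample words; Python 3 hashlib needs bytes, not str.
prefix_hash = hashlib.md5(b"hello")
print(digest_with_prefix(prefix_hash, b"world") == hashlib.md5(b"helloworld").hexdigest())
print(prefix_hash.hexdigest() == hashlib.md5(b"hello").hexdigest())
```

Both comparisons print True, which is why the same prefix_hash can be reused for every z in the inner loop.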


Again, assuming len(y) is not that large, this is unlikely to do any better, and might even do worse because of the extra .copy() call on every iteration. But you're welcome to benchmark it.
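
If you do want to benchmark it, something along these lines might be a starting point. This is only a sketch in Python 3: the function names find_pair_naive and find_pair_copy, the synthetic word lists, and the repeat count are all assumptions, so swap in your real files and hash before drawing conclusions.

```python
import hashlib
import timeit


def find_pair_naive(a, b, target):
    """Hash every concatenation from scratch."""
    for y in a:
        for z in b:
            if hashlib.md5(y + z).hexdigest() == target:
                return y + z
    return None


def find_pair_copy(a, b, target):
    """Hash each prefix once and extend a copy of it for every suffix."""
    for y in a:
        prefix_hash = hashlib.md5(y)
        for z in b:
            candidate = prefix_hash.copy()
            candidate.update(z)
            if candidate.hexdigest() == target:
                return y + z
    return None


# Hypothetical stand-ins for the two word lists (bytes, as Python 3 requires).
a = [("left%d" % i).encode() for i in range(200)]
b = [("right%d" % i).encode() for i in range(200)]
# Worst case: the matching pair is the very last one tried.
target = hashlib.md5(a[-1] + b[-1]).hexdigest()

for fn in (find_pair_naive, find_pair_copy):
    seconds = timeit.timeit(lambda: fn(a, b, target), number=5)
    print(fn.__name__, round(seconds, 3))
```

Returning from a function also stops the whole search at the first match, whereas the break in the loops above only leaves the inner loop.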

Context

StackExchange Code Review Q#118858, answer score: 2
