patternpythonCritical
Calculate SHA1 hash from binary and verify with a provided hash
Viewed 0 times
withprovidedhashverifysha1binarycalculateandfrom
Problem
I applied for a job and they asked me to write code with the following requirements:
Get a "toolbar offer" description from
http..update.utorrent.com/installoffer.php?offer=conduit. Parse the
bencoded (see http://en.wikipedia.org/wiki/Bencode) response and
extract the URL of the binary and its SHA1 hash. The hash is contained
in the value of the key 'offer_hash'. The binary is in list for the
key 'offer_urls' — please use the one prefixed with
"http..download.utorrent.com/".
Download the binary and calculate its hash
Verify that the calculated SHA1 hash is identical to the one provided
in the offer details. It is okay to use existing libraries for
bencoding, hashing, etc.
What is wrong with my code? The recruiter said it didn't meet their bar.
```
# Script that verifies the correctness of a toolbar offers
import urllib, bencode , hashlib
from hashlib import sha1
url = "http://update.utorrent.com/installoffer.php?offer=conduit"
filename = "content.txt"
f= urllib.urlretrieve(url, filename)
#Bdecoding the response
with open (str(filename), 'rb') as myfile:
data=myfile.read()
decoded = bencode.bdecode(data)
# Returning the list for the key 'offer_urls'
list = decoded['offer_urls']
#print list
# Returning the URL of the binary that is prefixed with "http://download3.utorrent.com/"
length = len (list)
prefix = "http://download3.utorrent.com/"
i= -1
while i < length:
i = i + 1
if list[i].startswith(prefix) :
break
binary= list[i]
print "The URL of the the binary is: " , binary
# Returning the sha1 hash contained in the value of the key 'offer_hash'
encrypted_hash = decoded['offer_hash']
sha1_hash1 = encrypted_hash.encode('hex')
print "The sha1 hash contained in the value of the key 'offer_hash' is: " , sha1_hash1
# Downloading the binary and calculating its hash
urllib.urlretrieve ( binary , "utct2-en-conduit-20130523.exe")
file = "C:\Python27\utct2-en-conduit-20130523.exe"
wit
Get a "toolbar offer" description from
http..update.utorrent.com/installoffer.php?offer=conduit. Parse the
bencoded (see http://en.wikipedia.org/wiki/Bencode) response and
extract the URL of the binary and its SHA1 hash. The hash is contained
in the value of the key 'offer_hash'. The binary is in list for the
key 'offer_urls' — please use the one prefixed with
"http..download.utorrent.com/".
Download the binary and calculate its hash
Verify that the calculated SHA1 hash is identical to the one provided
in the offer details. It is okay to use existing libraries for
bencoding, hashing, etc.
What is wrong with my code? The recruiter said it didn't meet their bar.
```
# Script that verifies the correctness of a toolbar offers
import urllib, bencode , hashlib
from hashlib import sha1
url = "http://update.utorrent.com/installoffer.php?offer=conduit"
filename = "content.txt"
f= urllib.urlretrieve(url, filename)
#Bdecoding the response
with open (str(filename), 'rb') as myfile:
data=myfile.read()
decoded = bencode.bdecode(data)
# Returning the list for the key 'offer_urls'
list = decoded['offer_urls']
#print list
# Returning the URL of the binary that is prefixed with "http://download3.utorrent.com/"
length = len (list)
prefix = "http://download3.utorrent.com/"
i= -1
while i < length:
i = i + 1
if list[i].startswith(prefix) :
break
binary= list[i]
print "The URL of the the binary is: " , binary
# Returning the sha1 hash contained in the value of the key 'offer_hash'
encrypted_hash = decoded['offer_hash']
sha1_hash1 = encrypted_hash.encode('hex')
print "The sha1 hash contained in the value of the key 'offer_hash' is: " , sha1_hash1
# Downloading the binary and calculating its hash
urllib.urlretrieve ( binary , "utct2-en-conduit-20130523.exe")
file = "C:\Python27\utct2-en-conduit-20130523.exe"
wit
Solution
I'm going to tell you the things in your code that made me wonder.
Extra space after
Use of single-letter variables suggests lack of attention to clean code.
Massive indentation suggests disregard for consistent indentation style.
Downloading the file to disk then reading it into memory is poor use of available APIs. You should download the file directly into memory.
Excess whitespace suggests someone is enter-happy.
"Returning" is technically incorrect here. You aren't in a function, so you aren't returning anything. Use of
Putting spaces after function calls is odd, and you don't do it consistently. There's also rarely a good reason to store the length of a
No space before the
You are using a
Is that hash really encrypted?
Hard-coded filename will be easy to break. You shouldn't even need to do it. The string also contains
or
or in most cases you can use the other slashes, even on Windows.
You got away with what you did, but that's mostly because you are lucky.
This last piece of code did pretty much the same thing as a previous bit of code, but you didn't use a function.
You've put parentheses, which are not necessary.
Your basic problems:
Here's my take on the problem:
```
# Script that verifies the correctness of a toolbar offers
import urllib
import bencode
import hashlib
import contextlib
# it's common practice to put input details like this
# as global constants at the start of your script
URL = "http://update.utorrent.com/installoffer.php?offer=conduit"
PREFIX = "http://download3.utorrent.com/"
# by breaking down your task into functions
# you can make the code easier to follow
def read_bencoded_url(url):
with contextlib.closing(urllib.urlopen(url)) as offer_file:
return bencode.bdecode(offer_file.read())
def find_string_with_prefix(strings, prefix):
for string in strings:
if string.startswith(prefix):
return string
else:
raise Exception("Did not find prefix: {}".format(prefix))
# this function is a bit more complicated
# but it avoids saving the file to disk or loading it entirely into memory.
# instead it hashes it 4096 bytes at a time
def hash_of_url(url):
hasher = hashlib.sha1()
with contextlib.closing(urllib.urlopen(url)) as binary_file:
while True:
data = binary_file.read(4096)
if not data:
break
hasher.update(data)
return hasher.hexdigest()
# the actual high level logic just calls the functions
# this avoid obscuring
# Script that verifies the correctness of a toolbar offers
import urllib, bencode , hashlibExtra space after
bencode suggests lack of attention to detail.from hashlib import sha1
url = "http://update.utorrent.com/installoffer.php?offer=conduit"
filename = "content.txt"
f= urllib.urlretrieve(url, filename)Use of single-letter variables suggests lack of attention to clean code.
#Bdecoding the response
with open (str(filename), 'rb') as myfile:filename is already a string. Converting it to a string again suggests confusion as to what is going on.data=myfile.read()
decoded = bencode.bdecode(data)Massive indentation suggests disregard for consistent indentation style.
Downloading the file to disk then reading it into memory is poor use of available APIs. You should download the file directly into memory.
# Returning the list for the key 'offer_urls'Excess whitespace suggests someone is enter-happy.
list = decoded['offer_urls']"Returning" is technically incorrect here. You aren't in a function, so you aren't returning anything. Use of
list is a bad name for a variable since it's a built-in Python type.#print list
# Returning the URL of the binary that is prefixed with "http://download3.utorrent.com/"
length = len (list)Putting spaces after function calls is odd, and you don't do it consistently. There's also rarely a good reason to store the length of a
list.prefix = "http://download3.utorrent.com/"
i= -1No space before the
=.while i < length:
i = i + 1
if list[i].startswith(prefix) :
break
binary= list[i]You are using a
while loop when you should be using a for loop. It is also going to cause an IndexError, rather than fail gracefully, if the prefix isn't found.print "The URL of the the binary is: " , binary
# Returning the sha1 hash contained in the value of the key 'offer_hash'
encrypted_hash = decoded['offer_hash']Is that hash really encrypted?
sha1_hash1 = encrypted_hash.encode('hex')
print "The sha1 hash contained in the value of the key 'offer_hash' is: " , sha1_hash1
# Downloading the binary and calculating its hash
urllib.urlretrieve ( binary , "utct2-en-conduit-20130523.exe")
file = "C:\Python27\utct2-en-conduit-20130523.exe"Hard-coded filename will be easy to break. You shouldn't even need to do it. The string also contains
"\" which should be escaped, or the whole string should be raw. i.e."C:\\Python27\\utct2-en-conduit-20130425.exe"or
r"C:\Python27\utct2-en-conduit-20130425.exe"or in most cases you can use the other slashes, even on Windows.
"C:/Python27/utct2-en-conduit-20130425.exe"You got away with what you did, but that's mostly because you are lucky.
with open (file, 'rb') as myfile:
downloaded=myfile.read()This last piece of code did pretty much the same thing as a previous bit of code, but you didn't use a function.
k = hashlib.sha1()
k.update(downloaded)
sha1file = k.hexdigest()
print "The sha1 hash of the downloaded binary is: " , sha1file
# Verify that the calculated sha1 hash is identical to the one provided in the offer details
if (sha1file == sha1_hash1) :You've put parentheses, which are not necessary.
print "Test result = Pass"
else :
print "Test result = Fail"Your basic problems:
- Your style is poor and inconsistent
- A few things suggest you don't quite understand what's going on
- You haven't taken care to make the code readable
Here's my take on the problem:
```
# Script that verifies the correctness of a toolbar offers
import urllib
import bencode
import hashlib
import contextlib
# it's common practice to put input details like this
# as global constants at the start of your script
URL = "http://update.utorrent.com/installoffer.php?offer=conduit"
PREFIX = "http://download3.utorrent.com/"
# by breaking down your task into functions
# you can make the code easier to follow
def read_bencoded_url(url):
with contextlib.closing(urllib.urlopen(url)) as offer_file:
return bencode.bdecode(offer_file.read())
def find_string_with_prefix(strings, prefix):
for string in strings:
if string.startswith(prefix):
return string
else:
raise Exception("Did not find prefix: {}".format(prefix))
# this function is a bit more complicated
# but it avoids saving the file to disk or loading it entirely into memory.
# instead it hashes it 4096 bytes at a time
def hash_of_url(url):
hasher = hashlib.sha1()
with contextlib.closing(urllib.urlopen(url)) as binary_file:
while True:
data = binary_file.read(4096)
if not data:
break
hasher.update(data)
return hasher.hexdigest()
# the actual high level logic just calls the functions
# this avoid obscuring
Code Snippets
# Script that verifies the correctness of a toolbar offers
import urllib, bencode , hashlibfrom hashlib import sha1
url = "http://update.utorrent.com/installoffer.php?offer=conduit"
filename = "content.txt"
f= urllib.urlretrieve(url, filename)#Bdecoding the response
with open (str(filename), 'rb') as myfile:data=myfile.read()
decoded = bencode.bdecode(data)# Returning the list for the key 'offer_urls'Context
StackExchange Code Review Q#27824, answer score: 370
Revisions (0)
No revisions yet.