patternpythonMinor
Python PBKDF2 using core modules
Viewed 0 times
modulescoreusingpythonpbkdf2
Problem
I'm trying to implement pbkdf2 in Python 2.7/3.4 without needing the pbkdf2 module. (I'm also trying to avoid using classes.)
Any insight would be appreciated:
Any insight would be appreciated:
from binascii import hexlify, unhexlify
import hmac, struct, hashlib, sys
is_python2 = True if sys.version_info.major == 2 else False
def pbkdf_two(passwd, salt, iters=2048, keylen=64, digestmod=hashlib.sha512):
"""
>>> hexlify(pbkdf_two(b'All n-entities must communicate with other n-entities via n-1 entiteeheehees', unhexlify('1234567878563412'), 500, 16, hashlib.sha1))
'6a8970bf68c92caea84a8df285108586'
"""
dgsz = digestmod().digest_size if callable(digestmod) else digestmod.digest_size
if keylen is None: keylen = dgsz
# Helper function which copies each iteration for h, where h is an hmac seeded with password
def pbhelper(h, salt, itercount, blocksize):
def prf(h, data):
hm = h.copy()
hm.update(data)
return hm.digest()
U = prf(h, salt + struct.pack('>i', blocksize))
T = U
for j in range(2, itercount+1):
U = prf(h, U)
T = "".join([chr( ord(x) ^ ord(y) ) for (x, y) in zip( T, U )]) \
if is_python2 else bytes([x ^ y for (x, y) in zip(T, U)]) # XORing
return T
L = int(keylen/dgsz) # L - number of output blocks to produce
if keylen % dgsz != 0: L += 1
h = hmac.new(key=passwd, msg=None, digestmod=digestmod )
T = b""
for i in range(1, L+1):
T += pbhelper(h, salt, iters, i)
return T[:keylen]Solution
- Review
-
The docstring doesn't explain what the function does, or how to call it, or what it returns. There's just a doctest.
-
If the code were reformatted to fit in 79 columns, as recommended by the Python style guide (PEP8), then we wouldn't have to scroll it horizontally to read it here at Code Review.
-
hexlify and unhexlify are only used by the doctest, so they could be included there.-
sys is not used, so the import could be omitted.-
The key derivation function is called PBKDF2, so I think the function would be better named
pbkdf2, not pbkdf_two.-
I don't think that having default values for arguments to this function is a good idea. The caller needs to think about these values, not rely on the defaults. Note that RFC 2898 doesn't specify any particular values.
-
We're not running out of letters, so why not
password instead of passwd and digest_size instead of dgsz?-
The helper function
pbhelper is called from only one place in the code, so there's nothing to be gained by making it into a local function.-
The fourth argument to
pbhelper is named blocksize, but this is very misleading: it's actually the (1-based) index of the block.-
The iteration variable
j is not used in the body of the loop. It's conventional to name such a variable _.-
Since
j is not used, it doesn't matter what values it takes, and so it's simpler to use range(iters-1) instead of range(2, iters+1).-
Since
h is always the same, there's no need for it to be a parameter to the prf function.-
The
msg argument to hmac.new defaults to None so there's no need to specify it.-
If the code used
bytearray instead of bytes then it would be portable between Python 2 and 3 without needing a version test.-
Instead of:
L = int(keylen/dgsz) # L - number of output blocks to produce
if keylen % dgsz != 0: L += 1the number of blocks can be computed like this, using the floor division operator
//:L = (keylen + dgsz - 1) // dgsz- But even more simply, why not just iterate until the result is long enough? That way you wouldn't have to compute
digest_size, and it would have the advantage that in Python 3.4 or later, the caller could pass in the name of the digest algorithm, just as forhmac.new.
- Revised code
import hmac
import struct
def pbkdf2(password, salt, iters, keylen, digestmod):
"""Run the PBKDF2 (Password-Based Key Derivation Function 2) algorithm
and return the derived key. The arguments are:
password (bytes or bytearray) -- the input password
salt (bytes or bytearray) -- a cryptographic salt
iters (int) -- number of iterations
keylen (int) -- length of key to derive
digestmod -- a cryptographic hash function: either a module
supporting PEP 247, a hashlib constructor, or (in Python 3.4
or later) the name of a hash function.
For example:
>>> import hashlib
>>> from binascii import hexlify, unhexlify
>>> password = b'Squeamish Ossifrage'
>>> salt = unhexlify(b'1234567878563412')
>>> hexlify(pbkdf2(password, salt, 500, 16, hashlib.sha1))
b'9e8f1072bdf5ef042bd988c7da83e43b'
"""
h = hmac.new(password, digestmod=digestmod)
def prf(data):
hm = h.copy()
hm.update(data)
return bytearray(hm.digest())
key = bytearray()
i = 1
while len(key) i', i))
for _ in range(iters - 1):
U = prf(U)
T = bytearray(x ^ y for x, y in zip(T, U))
key += T
i += 1
return key[:keylen]Code Snippets
L = int(keylen/dgsz) # L - number of output blocks to produce
if keylen % dgsz != 0: L += 1L = (keylen + dgsz - 1) // dgszimport hmac
import struct
def pbkdf2(password, salt, iters, keylen, digestmod):
"""Run the PBKDF2 (Password-Based Key Derivation Function 2) algorithm
and return the derived key. The arguments are:
password (bytes or bytearray) -- the input password
salt (bytes or bytearray) -- a cryptographic salt
iters (int) -- number of iterations
keylen (int) -- length of key to derive
digestmod -- a cryptographic hash function: either a module
supporting PEP 247, a hashlib constructor, or (in Python 3.4
or later) the name of a hash function.
For example:
>>> import hashlib
>>> from binascii import hexlify, unhexlify
>>> password = b'Squeamish Ossifrage'
>>> salt = unhexlify(b'1234567878563412')
>>> hexlify(pbkdf2(password, salt, 500, 16, hashlib.sha1))
b'9e8f1072bdf5ef042bd988c7da83e43b'
"""
h = hmac.new(password, digestmod=digestmod)
def prf(data):
hm = h.copy()
hm.update(data)
return bytearray(hm.digest())
key = bytearray()
i = 1
while len(key) < keylen:
T = U = prf(salt + struct.pack('>i', i))
for _ in range(iters - 1):
U = prf(U)
T = bytearray(x ^ y for x, y in zip(T, U))
key += T
i += 1
return key[:keylen]Context
StackExchange Code Review Q#87538, answer score: 5
Revisions (0)
No revisions yet.