HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Recursive bytearray hash function

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
recursivefunctionhashbytearray

Problem

I need a function that hashes a list of different inputs (integral datatypes, strings, bytearrays) in a recursive manner.

def RecHash(v):    
    isSingleElementOfList = False

    if type(v) is list and len(v) == 1:
            isSingleElementList = True

    if type(v) is not list or isSingleElementOfList:   # if v is a single element
        v0 = v
        if isSingleElementOfList: v0 = v[0]

        if type(v0) is bytearray or v0.__class__.__name__ == 'bytes':
            return hash(v0)
        if type(v0) is int:
           return hash(ToByteArray(v0))
        if type(v0) is str:
            return hash(v0.encode('utf-8'))       

        return bytes()
    else:                               # if v is a list
        res = bytearray()
        for vi in v:
            res +=  RecHash(vi)     # recursion
        return hash(res)            # hash the concatenated hashes


and the helper function:

def ToByteArray(x):       
    q, r = divmod(BitAbs(x),8)
    q += bool(r)    
    return ToByteArrayN(x, q)

def ToByteArrayN(x, n):

    B = bytearray()

    for i in range(0, int(n)):
        b = x % 256
        x = x // 256                  # // = integer division => floor
        B.insert(0, b)

    return bytes(B)

def BitAbs(i):
     return i.bit_length()


It is working fine, however, it's very slow (and interestingly, it's not the hash() function that is slowing it down.).

Performance is worst when the list contains mostly numbers, so the ToByteArray function also doesn't seem to perform well.

Solution

ISTR that type(x) is relatively slow. Try saving it in a local variable:

typev = type(v)
if typev is list and len(v) == 1:
        isSingleElementList = True

if typev is not list or isSingleElementOfList:   # if v is a single element
    v0 = v[0] if isSingleElementOfList else v

    type_v0 = type(v0)
    if type_v0 is bytearray or v0.__class__.__name__ == 'bytes':
        return hash(v0)
    if type_v0 is int:
       return hash(ToByteArray(v0))
    if type_v0 is str:
        return hash(v0.encode('utf-8'))       
else:                               # if v is a list
    res = bytearray()
    for vi in v:
        res +=  RecHash(vi)     # recursion
    return hash(res)            # hash the concatenated hashes


Also, btw, your code will choke on nested single element lists. Consider [[[1]]]

Code Snippets

typev = type(v)
if typev is list and len(v) == 1:
        isSingleElementList = True

if typev is not list or isSingleElementOfList:   # if v is a single element
    v0 = v[0] if isSingleElementOfList else v

    type_v0 = type(v0)
    if type_v0 is bytearray or v0.__class__.__name__ == 'bytes':
        return hash(v0)
    if type_v0 is int:
       return hash(ToByteArray(v0))
    if type_v0 is str:
        return hash(v0.encode('utf-8'))       
else:                               # if v is a list
    res = bytearray()
    for vi in v:
        res +=  RecHash(vi)     # recursion
    return hash(res)            # hash the concatenated hashes

Context

StackExchange Code Review Q#156912, answer score: 2

Revisions (0)

No revisions yet.