HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Hash Table: Relation between position of a value and Hash table size

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
sizepositionhashvaluebetweenandrelationtable

Problem

I need to know whether hash table size, $S$ has any impact on distribution of a value in hash table.

At the moment I'm using SHA512 and I can tolerate up to 50 elements in any bucket, and I have $10^6$ elements; I set hash table length to $S=50000$. This is far more larger than what is suggested in the literature [1]. However, if I reduce the table length I would get overflow (as I have tested and saw it).

To find position of an element in hash table I do $hash(val) mod S$.

Question: Does $S$ play any role in distribution of the value in hash table?
In other words, do I need to set $S$ as a prime number or power of two ?

[1]. http://www14.in.tum.de/personen/raab/publ/balls.pdf

Solution

No. If you use SHA512, you can use any modulus $S$ you want: it doesn't need to be prime. A power of two is fine and is no better or worse than a prime.

The reason is that the cryptographic properties of SHA512 ensure that the output of SHA512 is (for all practical purposes) essentially a uniformly random 512-bit string. Because $2^{512}$ is so much larger than your modulus, after reducing modulo $S$, the result will be approximately uniform. Any non-uniformity will be incredibly, exponentially small: so small that it's not detectable within your lifetime, or the lifetime of the solar system. (I know there's another answer that claims non-uniformity is a problem, but that answer failed to quantify the extent of the non-uniformity and thus came to the wrong conclusion; the amount of non-uniformity is so exponentially small that it's completely irrelevant in practice).

So, go ahead and use any modulus that's convenient. It doesn't need to be a power of two or a prime number, if you're using SHA512 as your hash function.

Context

StackExchange Computer Science Q#49640, answer score: 3

Revisions (0)

No revisions yet.