HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Hash size: Are prime numbers "near" powers of two a poor choice for the modulus?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
thearesizenumbershashpoortwochoiceforprime

Problem

Cormen et al.'s "Introduction to Algorithms" says the following about the division method hash function $h(k)=k \text{ mod } m$:


A prime not too close to an exact power of 2 is often a good choice for $m$. For example, suppose we wish to allocate a hash table, with collisions resolved by chaining, to hold roughly $n=2000$ character strings, where a character has 8 bits. We don't mind examining an average of 3 elements in an unsuccesful search, so we allocate a table of size $m=701$. The number $701$ is chose because it is a prime near $2000/3$ but not near any power of 2.

The book, however, does not explain what it means by "near". Is $701$ preferrable over, say, $661$ because $661-512<701-512$? I do not understand how this is relevant to the modulo function involved in the calculation of $h(k)$.

Solution

If a string of 8 bit bytes is interpreted as a large integer, and you calculate the value modulo $2^{24}-1$ then the first, fourth, seventh byte etc. are all hashed to the same value, which means the hash will not be well distributed.

Context

StackExchange Computer Science Q#86237, answer score: 2

Revisions (0)

No revisions yet.