HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

For what kind of data are hash table operations O(1)?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
operationswhatarehashkindfordatatable

Problem

From the answers to (When) is hash table lookup O(1)?, I gather that hash tables have $O(1)$ worst-case behavior, at least amortized, when the data satisfies certain statistical conditions, and there are techniques to help make these conditions broad.

However, from a programmer's perspective, I don't know in advance what my data will be: it often comes from some external source. And I rarely have all the data at once: often insertions and deletions happen at a rate that's not far below the rate of lookups, so preprocessing the data to fine-tune the hash function is out.

So, taking a step out: given some knowledge about data source, how can I determine whether a hash table has a chance of having $O(1)$ operations, and possibly which techniques to use on my hash function?

Solution

Hash table lookup can always be $O(1)$ for static sets, see the 2002 paper by Arne Andersson and Mikkel Thorup: Dynamic ordered sets with exponential search trees


Firstly, we give the first deterministic polynomial-time (in n)
algorithm for constructing a linear space static dictionary with $O(1)$
worst-case access cost (cf. perfect hashing). As mentioned earlier, a
linear space data structure that supports member queries (neighbor
queries are not supported) in constant time can be constructed at a
worst-case cost $O (n^2 W)$ without division [30]. We show that the
dependency of word size can be removed.

In the general case, Andersson et al provide an algorithm for hash-indexed data structures that support lookups and updates in $O(\sqrt{\log n / \log \log n})$. Furthermore, they prove that this bound is optimal. So we know exactly how close we can get to $O(1)$ in the general case.

Context

StackExchange Computer Science Q#477, answer score: 6

Revisions (0)

No revisions yet.