HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

An efficient data structure supporting Insert, Delete, and MostFrequent

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
insertdeleteefficientstructuresupportinganddatamostfrequent

Problem

Assume that we have a set $D$ and each member of $D$ is a data and key pair. We want a data structure that would support the following operations:

  • Insert $(d,k)$ into $D$,



  • Delete member $e$, (no need to search to find $e$, e.g. $e$ points to a member in $D$),



  • MostFrequent, which returns a member $e \in D$ such that $e.key$ is one of the most frequent keys in $D$ (note that the most frequent key doesn't need to be unique).



What would be an efficient implementation of this data structure?

My solution is a heap for the keys and their frequencies prioritized by the frequencies plus a hash table where the hash function maps members with the same key to the same slot in the hash table (with pointers from each part to the other).

This can give $\Theta(\lg n)$ for the first two operations and $\Theta(1)$ for the third (worst case running time).

I am wondering if there is more efficient solution? (or a simpler solution with the same efficiency?)

Solution

In the comparison-based model of computation, you may implement the priority queue using a Fibonacci heap instead of an ordinary heap. This will give you the following bounds: $\mathcal{O}(1)$ amortized time for insert and $\mathcal{O}(\log n)$ amortized time for deletion operations.

If you depart from the comparison-based model and adopt the RAM model where keys are regarded as binary strings, each one contained in one or more machine words, you may implement your priority queue in $o(\log n)$. Indeed, you can achieve for both insert and delete operations $\mathcal{O}(\sqrt{\log \log n}) $ and $\mathcal{O}(1) $ time for the findMin operation. Thorup proved that

If we can sort $n$ keys in time $S(n)$ per key, then we can implement a priority queue supporting find-min in constant time and updates (insert and delete) in $S(n)$ time.

See M. Thorup. Equivalence between priority queues and sorting, 2002. in Proc. FOCS 2002

Since we can sort in $\mathcal{O}(n \sqrt{\log \log n}) $ expected time and linear
space, as shown by

Y. Han and M. Thorup. Integer sorting in $\mathcal{O}(n \sqrt{\log \log n}) $ expected time and linear
space. in Proc. FOCS 2002

the bound is proved.

Context

StackExchange Computer Science Q#6455, answer score: 7

Revisions (0)

No revisions yet.