patternMinor
An efficient data structure supporting Insert, Delete, and MostFrequent
Viewed 0 times
insertdeleteefficientstructuresupportinganddatamostfrequent
Problem
Assume that we have a set $D$ and each member of $D$ is a data and key pair. We want a data structure that would support the following operations:
What would be an efficient implementation of this data structure?
My solution is a heap for the keys and their frequencies prioritized by the frequencies plus a hash table where the hash function maps members with the same key to the same slot in the hash table (with pointers from each part to the other).
This can give $\Theta(\lg n)$ for the first two operations and $\Theta(1)$ for the third (worst case running time).
I am wondering if there is more efficient solution? (or a simpler solution with the same efficiency?)
- Insert $(d,k)$ into $D$,
- Delete member $e$, (no need to search to find $e$, e.g. $e$ points to a member in $D$),
- MostFrequent, which returns a member $e \in D$ such that $e.key$ is one of the most frequent keys in $D$ (note that the most frequent key doesn't need to be unique).
What would be an efficient implementation of this data structure?
My solution is a heap for the keys and their frequencies prioritized by the frequencies plus a hash table where the hash function maps members with the same key to the same slot in the hash table (with pointers from each part to the other).
This can give $\Theta(\lg n)$ for the first two operations and $\Theta(1)$ for the third (worst case running time).
I am wondering if there is more efficient solution? (or a simpler solution with the same efficiency?)
Solution
In the comparison-based model of computation, you may implement the priority queue using a Fibonacci heap instead of an ordinary heap. This will give you the following bounds: $\mathcal{O}(1)$ amortized time for insert and $\mathcal{O}(\log n)$ amortized time for deletion operations.
If you depart from the comparison-based model and adopt the RAM model where keys are regarded as binary strings, each one contained in one or more machine words, you may implement your priority queue in $o(\log n)$. Indeed, you can achieve for both insert and delete operations $\mathcal{O}(\sqrt{\log \log n}) $ and $\mathcal{O}(1) $ time for the findMin operation. Thorup proved that
If we can sort $n$ keys in time $S(n)$ per key, then we can implement a priority queue supporting find-min in constant time and updates (insert and delete) in $S(n)$ time.
See M. Thorup. Equivalence between priority queues and sorting, 2002. in Proc. FOCS 2002
Since we can sort in $\mathcal{O}(n \sqrt{\log \log n}) $ expected time and linear
space, as shown by
Y. Han and M. Thorup. Integer sorting in $\mathcal{O}(n \sqrt{\log \log n}) $ expected time and linear
space. in Proc. FOCS 2002
the bound is proved.
If you depart from the comparison-based model and adopt the RAM model where keys are regarded as binary strings, each one contained in one or more machine words, you may implement your priority queue in $o(\log n)$. Indeed, you can achieve for both insert and delete operations $\mathcal{O}(\sqrt{\log \log n}) $ and $\mathcal{O}(1) $ time for the findMin operation. Thorup proved that
If we can sort $n$ keys in time $S(n)$ per key, then we can implement a priority queue supporting find-min in constant time and updates (insert and delete) in $S(n)$ time.
See M. Thorup. Equivalence between priority queues and sorting, 2002. in Proc. FOCS 2002
Since we can sort in $\mathcal{O}(n \sqrt{\log \log n}) $ expected time and linear
space, as shown by
Y. Han and M. Thorup. Integer sorting in $\mathcal{O}(n \sqrt{\log \log n}) $ expected time and linear
space. in Proc. FOCS 2002
the bound is proved.
Context
StackExchange Computer Science Q#6455, answer score: 7
Revisions (0)
No revisions yet.