HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Sorting a set of $n$ elements containing only $\log n$ unique elements

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
sortingcontaininguniqueelementslogonlyset

Problem

We have a set of $n$ elements which contains at most $\log n$ different numbers. I want to sort this set faster than $O(n \log n)$. Is it possible?

I tried using a hash table to find the set of $\log n$ unique elements, then sort them in $O(\log n \log \log n)$, after that creating a new array of $n$ element with this array of $\log n$ elements in $O(n)$, but it is not the right answer since using hash table may fail, and the worst case still needs more than $O(n \log n)$.

Is there any algorithm faster than $O(n \log n)$?

Solution

Your solution is almost complete. Try replacing the hash table with a comparable data structure, such as a balanced binary search tree. Since the tree will only contain at most $\log n$ elements, all the operations on the tree will take time $O(\log\log n)$, and the resulting algorithm will take time $O(n\log\log n)$.

Any comparison-based algorithm for your problem takes time $\Omega(n\log\log n)$. To see that, notice that there are roughly $(\log n)^n$ different possible relative orderings for your set (the actual number is a bit smaller, but not by much). Any comparison-based decision tree must therefore have depth $\Omega(\log [(\log n)^n]) = \Omega(n\log\log n)$.

If you allow more general algorithms, then you can improve the complexity to randomized $O(n)$ using a hash table, along the lines that you mentioned. Since you can afford a hash table of size $O(n)$ although the occupancy is only $\log n$, the probability that your algorithm exceeds its expected running time significantly will be very, very small.

It is also possible that under a suitable computation model, you can sort you list in deterministic $O(n)$ time. Some people consider this kind of algorithm cheating, though.

Context

StackExchange Computer Science Q#65248, answer score: 6

Revisions (0)

No revisions yet.