HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Finding a value in a sorted array in log R time, R is the number of distinct elements

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
distinctthenumberelementslogarrayvaluetimefindingsorted

Problem

The standard binary search algorithm gives log N time, where N is the total number of elements in the array. When the array has duplicates, I don't see how you could detect those duplicates ahead of time. (Iterating through the array takes N time, which is too much.) Consequently how do you improve the performance from log N to log R?

Solution

There is no such algorithm. Here's an information-theoretic proof that it can't be done, inspired by gnasher79's answer.

Let's focus on the special case where $R=3$. Suppose there is a constant $c$ and an algorithm that examines at most $c$ elements of the array, regardless of $n$. Then this algorithm can't possibly solve your problem.

In particular, consider all arrays of the form $[0,\dots,0,1,2,\dots,2]$, i.e., containing some number of 0's, followed by a single 1, followed by 2's. Let's imagine what the algorithm does when you ask it to find a $1$. There are $n-1$ different possible positions for the $1$. However, after doing $c$ probes, the algorithm learns only $c \lg 3$ bits of information about the position of the $1$. Thus, if $n-1 > 2^{c \lg 3}$, the algorithm can't possibly work correctly for all such arrays: there exists some pair of arrays where the algorithm outputs the same thing in both cases, but the correct answers differs, so the algorithm must be wrong for at least one of those two cases. In other words, if we take $n$ to be sufficiently large (namely, $n> 3^c + 1$), the algorithm will be incorrect.

This implies there is no algorithm for the case $R=3$ whose running time is upper-bounded by a constant.

It follows that there is no algorithm whose running time is $O(\lg R)$ (and whose running time doesn't depend on $n$): if there was, plugging in $R=3$ would give us a $O(1)$-time algorithm for the special case where $R=3$... but I just proved that no such algorithm can exist.

However, the special case $R=2$ can be solved in $O(1)$ time: simply look at the first and last element of the array, and pick whichever one matches the value you're looking for.

Context

StackExchange Computer Science Q#62733, answer score: 6

Revisions (0)

No revisions yet.