HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Counting big enough elements

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
enoughbigcountingelements

Problem

Let $v$ be a vector of positive integers of length $n$. I want to find the highest $k$ such as there are at least $k$ elements of $v$ that are greater or equal to $k$.

What would be the best algorithm for finding $k$?

Examples:

  • $v = (1, 2, 3, \ldots, 100) \rightarrow k = 50$, because there are at least 50 elements that are greater or equal to 50 (there are even 51 of those, fifty through a hundred, but that doesn’t matter); but there are only 50 (so—less than 51) elements greater or equal to 51.



  • $v = (100, 101, 102, 103, 104) \rightarrow k = 5$, because all five elements are greater or equal to five, and there cannot possibly be six elements greater or equal to six.



  • $v = (1, 1, 1, 100, 100, 100) \rightarrow k = 3$



Any ideas how to approach this?

My initial idea is this:

  • Sort the vector in descending order in $O(n \log n)$



  • Starting with the median (it seems, intuitively, as a good place to start), check if a number on a set position $i$ is greater or equal than $i$—since everything to the left of it is greater to equal to it (the vector is sorted), we know there are at least that many elements that are greater or equal to $v[i]$.



  • Go right or left accordingly to what was found—if we found that there are not enough values greater to $v[i]$, let’s try with the smaller-or-equal number, $v[i+1]$. If it was big enough, go the opposite direction, checking $v[i-1]$.



  • Our work is done when there is some $i_m$ for which the requirement holds, but it doesn’t hold for $i_{m} + 1$. That would require at most $O(n)$ checks.



  • Thus, the slowest part is the sort itself, making the algorithm $O(n \log n)$ total.



I’m curious if this is the best we can do, or can we possibly make it even faster.

Solution

You can do it in linear time, using an efficient selection algorithm.

Find the median, and partition the array into numbers below the median and numbers above the median. Check whether the number you're looking for is above or below the median. Then, recurse on the appropriate partition.

The running time satisfies the recurrence $T(n)=T(n/2)+O(n)$, since you can find the median and partition the array in $O(n)$ time. The solution to this recurrence is $T(n)=O(n)$, so the total running time is $O(n)$. This is faster than your $O(n \log n)$ time algorithm.

Context

StackExchange Computer Science Q#119040, answer score: 3

Revisions (0)

No revisions yet.