HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Counting elements that are greater than the median of medians

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
thecountingmedianselementsgreaterarethanthatmedian

Problem

Short version: I want to know where the $-2$ comes from in the formula on p. 221 of CLRS 3rd edition.

Long version: CLRS (3rd ed.) give an algorithm for $O(n)$ worst case arbitrary order statistic of $n$ distinct numbers. The algorithm is roughly:


Input: an array of $n$ elements and $i$, the number of the order statistic to return from the elements.



  • Divide the $n$ elements into $\lfloor n/5 \rfloor$ groups of 5 elements each along with an optional group containing $n\mod{5}$ elements (resulting in $\lceil n/5 \rceil$ groups.)



  • Find the median of each of the groups by sorting.



  • Recurse, using the $\lceil n/5 \rceil$ medians as the array and $\lfloor\lceil n/5 \rceil/2\rfloor$ as the order statistic, resulting in the median-of-medians.



  • Partition the $n$ elements around the median-of-medians (using a quicksort-like $O(n)$ partitioning algorithm.



  • Letting $k-1$ be the number of elements less than the median-of-medians, if $i = k$, return the median-of-medians. Otherwise recurse: if $i k$, then recurse finding the $i-k$th order statistic of the $n-k$ elements greater than the median-of-medians.





Output: the $i$th order statistic of the $n$ numbers.

In the proof of the runtime, CLRS argue that the number of elements greater than the median-of-medians is at least:

$$
3 \bigg(\bigg\lceil \frac{1}2 \bigg\lceil{\frac{n}5} \bigg\rceil \bigg\rceil - 2\bigg)
$$

The reasoning is that half of the medians are greater than the median-of-medians, and each of those medians' groups has at least three elements greater than the median-of-medians (the median itself plus the two elements greater than the median.) That would result in

$$
3 \bigg(\bigg\lceil \frac{1}2 \bigg\lceil{\frac{n}5} \bigg\rceil \bigg\rceil\bigg)
$$

for the lower bound on the number of elements greater than the median-of-medians.

But we must account for two things: the group containing the median-of-medians (the median-of-medians is not greater than itself) and

Solution

In

$$
3 \bigg(\bigg\lceil \frac{1}2 \bigg\lceil{\frac{n}5} \bigg\rceil \bigg\rceil - 2\bigg)
$$

we are subtracting 2 in order to discard the group containing the median of medians and the group of leftovers. So, 2 is the number of groups we are discarding.

First of all, note that there may not be a group of leftover elements: if $n$ is an exact multiple of 5 there will be no leftover group. We are interested in bounding from below the number of elements greater than the median-of-medians in the worst case, so suppose that a leftover group exists. Therefore, it will contain at least an element and no more than 4 elements. If you want to reason in terms of elements to be discarded and not in terms of groups, then we must discard exactly 3 elements for the group containing the median of medians (including the median of the medians), and at most 2 element from the leftovers group (if this group contains 1 or 2 elements you do not discard any element; if this group contains 3 or 4 elements, then you discard respectively 1 or 2 elements which are greater than the group's median). So, you discard in the worst case (leftover group with 4 elements) 3 + 2 elements:

$$
3 \bigg(\bigg\lceil \frac{1}2 \bigg\lceil{\frac{n}5} \bigg\rceil \bigg\rceil\bigg) - 5
$$

Context

StackExchange Computer Science Q#19542, answer score: 3

Revisions (0)

No revisions yet.