HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Worst-case input for median-of-medians with groups of size 3

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
casegroupsmedianswithsizeinputforworstmedian

Problem

Typically, median of medians algorithm is written with groups of size $5$ or $7$ to ensure worst-case linear performance. The argument against groups of size $k=3$ is typically that we get a recurrence of:

$$\begin{align*}
T(n) & \leq T(n/3) + T(2n/3) + O(n)\\
& = O(n \log n)
\end{align*}$$

However, this assumes that there exists an input which would always split the recurrence in the worst possible case. I am not convinced that this input always exists.

Question: Is there a general procedure to create a worst-case input for the median-of-medians algorithm with $k=3$ for an input of any length $n$?

If it helps, you can assume $n$ is of some particular format and that we are always searching for the exact median (as opposed to querying the $i$-th order statistic). You can also assume that the algorithm groups elements deterministically (e.g. incrementally group 3 elements).

You can choose any deterministic implementation if it helps, here is one for reference. Given an input array $A$ of $n$ distinct elements and query $i$ for the $i$-th order statistic, the $\mathrm{SELECT}(A, i)$ procedure is as follows:

  • Divide $A$ into $\lfloor \tfrac{n}{3} \rfloor$ groups of length $3$.



  • We can do this by putting $\{a_1, a_2, a_3\}$ as group $G_1$, then $\{a_4, a_5, a_6\}$ as group $G_2$, and so on...



  • If $n \neq 3k$ then we can create one more group with the remaining 1 or 2 elements.



  • Find the median of each group $G_i$ and store them in a list $M$.



  • We can do this by creating $M = [\mathrm{med}(G_1), \mathrm{med}(G_2), \ldots ]$.



  • Recursively call this function to get the median of medians $x = \mathrm{SELECT}(M, \lfloor \tfrac{n}{6}\rfloor)$.



  • Partition $A$ about $x$ into $A_1$ and $A_2$.



  • Let all elements in $A_1$ be less than $x$. (Assume order is retained)



  • Let all elements in $A_2$ be greater than $x$. (Assume order is retained)



  • If $i = |A_1| + 1$ return $x$



  • If $i



  • If $i > |A_1| + 1$ return $\mathrm{SELECT}(A_2, i - |A_1| - 1)$.



Solution

Whether or not the median-of-medians algorithm with groups of size 3 runs in linear time is an open problem as said in [1] (while they proposed a variant running in linear time). I checked some follow-up papers and no one has a progress on showing the complexity of this algorithm. I believe it still remains open now.

[1] Chen, K., & Dumitrescu, A. (2015, August). Select with groups of 3 or 4. In Workshop on Algorithms and Data Structures (pp. 189-199). Springer, Cham.

Context

StackExchange Computer Science Q#106349, answer score: 8

Revisions (0)

No revisions yet.