patternMinor
Worst-case input for median-of-medians with groups of size 3
Viewed 0 times
casegroupsmedianswithsizeinputforworstmedian
Problem
Typically, median of medians algorithm is written with groups of size $5$ or $7$ to ensure worst-case linear performance. The argument against groups of size $k=3$ is typically that we get a recurrence of:
$$\begin{align*}
T(n) & \leq T(n/3) + T(2n/3) + O(n)\\
& = O(n \log n)
\end{align*}$$
However, this assumes that there exists an input which would always split the recurrence in the worst possible case. I am not convinced that this input always exists.
Question: Is there a general procedure to create a worst-case input for the median-of-medians algorithm with $k=3$ for an input of any length $n$?
If it helps, you can assume $n$ is of some particular format and that we are always searching for the exact median (as opposed to querying the $i$-th order statistic). You can also assume that the algorithm groups elements deterministically (e.g. incrementally group 3 elements).
You can choose any deterministic implementation if it helps, here is one for reference. Given an input array $A$ of $n$ distinct elements and query $i$ for the $i$-th order statistic, the $\mathrm{SELECT}(A, i)$ procedure is as follows:
$$\begin{align*}
T(n) & \leq T(n/3) + T(2n/3) + O(n)\\
& = O(n \log n)
\end{align*}$$
However, this assumes that there exists an input which would always split the recurrence in the worst possible case. I am not convinced that this input always exists.
Question: Is there a general procedure to create a worst-case input for the median-of-medians algorithm with $k=3$ for an input of any length $n$?
If it helps, you can assume $n$ is of some particular format and that we are always searching for the exact median (as opposed to querying the $i$-th order statistic). You can also assume that the algorithm groups elements deterministically (e.g. incrementally group 3 elements).
You can choose any deterministic implementation if it helps, here is one for reference. Given an input array $A$ of $n$ distinct elements and query $i$ for the $i$-th order statistic, the $\mathrm{SELECT}(A, i)$ procedure is as follows:
- Divide $A$ into $\lfloor \tfrac{n}{3} \rfloor$ groups of length $3$.
- We can do this by putting $\{a_1, a_2, a_3\}$ as group $G_1$, then $\{a_4, a_5, a_6\}$ as group $G_2$, and so on...
- If $n \neq 3k$ then we can create one more group with the remaining 1 or 2 elements.
- Find the median of each group $G_i$ and store them in a list $M$.
- We can do this by creating $M = [\mathrm{med}(G_1), \mathrm{med}(G_2), \ldots ]$.
- Recursively call this function to get the median of medians $x = \mathrm{SELECT}(M, \lfloor \tfrac{n}{6}\rfloor)$.
- Partition $A$ about $x$ into $A_1$ and $A_2$.
- Let all elements in $A_1$ be less than $x$. (Assume order is retained)
- Let all elements in $A_2$ be greater than $x$. (Assume order is retained)
- If $i = |A_1| + 1$ return $x$
- If $i
- If $i > |A_1| + 1$ return $\mathrm{SELECT}(A_2, i - |A_1| - 1)$.
Solution
Whether or not the median-of-medians algorithm with groups of size 3 runs in linear time is an open problem as said in [1] (while they proposed a variant running in linear time). I checked some follow-up papers and no one has a progress on showing the complexity of this algorithm. I believe it still remains open now.
[1] Chen, K., & Dumitrescu, A. (2015, August). Select with groups of 3 or 4. In Workshop on Algorithms and Data Structures (pp. 189-199). Springer, Cham.
[1] Chen, K., & Dumitrescu, A. (2015, August). Select with groups of 3 or 4. In Workshop on Algorithms and Data Structures (pp. 189-199). Springer, Cham.
Context
StackExchange Computer Science Q#106349, answer score: 8
Revisions (0)
No revisions yet.