patternMinor
Proving O(log n) bound for the number of iterations when we select the average as the pivot
Viewed 0 times
provingboundthenumberiterationslogpivotaverageforwhen
Problem
Motivation
So the other day I had fun providing a new solution to this famous question. In the analysis part I showed that my little algorithm has space complexity:
To save time for the reader I have changed my space complexity problem to an equivalent complexity problem (the complexity of a function [not of space or time]):
Data:
Array of size
However, this proof is quite informal and I'm not sure how to show that this genuinely represents the worst-case scenario.
So the other day I had fun providing a new solution to this famous question. In the analysis part I showed that my little algorithm has space complexity:
O(k) and Ω(log(k)). However, my rough logic says that we should be able to prove a tighter bound of O(min(k,log(n))), but I was unable to prove it. Moreover it seems like an interesting, general-case problem.To save time for the reader I have changed my space complexity problem to an equivalent complexity problem (the complexity of a function [not of space or time]):
Data:
Array of size
k; where elements, e, are unique integers: 0 than the average, or all the elements x^2 - x^2/2 + x/2 = n-1
=> O(x) = O(sqrt(n))
and here x represents the biggest sub-size.
So in this artificial "worst case" scenario, we get O(sub-array size) = O(min(k,sqrt(n))), which kind of implies that the overall complexity = O(min(k,log(sqrt(n)))) = O(min(k,log(n)))`.However, this proof is quite informal and I'm not sure how to show that this genuinely represents the worst-case scenario.
Solution
We introduce a weight measure for $S$, a set of numbers with respect to its average, namely: $w(S) = \sum_{s\in S} |s-avg(S)|$. The measure $w(S)$ expresses the total distance to the average. Based on this weight we will prove the number of iterations is in $O(\log\ n)$.
If we apply one iteration of your described algorithm on a set $S$ we obtain either the set $S_\leq = \{s \leq avg(S) \mid s\in S\} $ or the set $S_> = \{s>avg(S) \mid s\in S\}$. Now we show that $w(S_\leq)\leq \frac{w(S)}{2}$ and $w(S_>) \leq \frac{w(S)}{2}$. The weight of both sets $S_\leq, S_>$ w.r.t. the old average of $S$ is equal to:
$$\sum_{s\in S_\leq} |s-avg(S)| = \sum_{s\in S_>} |s-avg(S)| = \frac{w(S)}{2},$$
which follows from the observation that $$\sum_{s\in S} (s - avg(S)) = \sum_{s\in S} s - |S|*avg(S) = \sum_{s\in S} s - \sum_{s\in S} s = 0.$$
By the construction of the sets $S_\leq$, $S_>$ only contain negative, and positive terms in $s-avg(S)$, and their sum is the total.
Note that $w(S_\leq) \leq \sum_{s\in S_\leq} |s-avg(S)|$ since $avg(S_\leq)\leq avg(S)$ and symmetrical for the other case.
To complete the complexity bound we confirm that
If we apply one iteration of your described algorithm on a set $S$ we obtain either the set $S_\leq = \{s \leq avg(S) \mid s\in S\} $ or the set $S_> = \{s>avg(S) \mid s\in S\}$. Now we show that $w(S_\leq)\leq \frac{w(S)}{2}$ and $w(S_>) \leq \frac{w(S)}{2}$. The weight of both sets $S_\leq, S_>$ w.r.t. the old average of $S$ is equal to:
$$\sum_{s\in S_\leq} |s-avg(S)| = \sum_{s\in S_>} |s-avg(S)| = \frac{w(S)}{2},$$
which follows from the observation that $$\sum_{s\in S} (s - avg(S)) = \sum_{s\in S} s - |S|*avg(S) = \sum_{s\in S} s - \sum_{s\in S} s = 0.$$
By the construction of the sets $S_\leq$, $S_>$ only contain negative, and positive terms in $s-avg(S)$, and their sum is the total.
Note that $w(S_\leq) \leq \sum_{s\in S_\leq} |s-avg(S)|$ since $avg(S_\leq)\leq avg(S)$ and symmetrical for the other case.
To complete the complexity bound we confirm that
- $w(S)\in O(n^2)$ initially since $S\subseteq\{0,\dots,n-1\}$.
- if $w(S)\leq 1$ it is a base case in $O(1)$ (only 1 more split).
Context
StackExchange Computer Science Q#131150, answer score: 2
Revisions (0)
No revisions yet.