HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Sorting "almost sorted" array in $O(n\log\log n)$

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
sortinglogalmostarraysorted

Problem

Given array $A$ of length $n$, we call it almost sorted if there are at most $\log n$ indices satisfying $A[i] > A[i+1]$.

Find an algorithm that sorts the array in $O(n\log\log n)$.

My attempt:

  • Create an array $B$ of size $\log n + 1$.



  • Go through the array $A$, recognize the $\log n$ pairs, and insert them into $B$.



  • Sort $B$ using insertion sort in time $O(\log^2 n)$.



At this point I am stuck.

Solution

Suppose that $A$ is an array with $m$ indices satisfying $A[i] > A[i+1]$. You can find these indices in $O(n)$. These $m$ indices split $A$ into $m+1$ nondecreasing arrays $B_1,\ldots,B_{m+1}$ of total length $n$. We now merge them according to the following strategy: at each step, take the two shortest arrays, and merge them. You can implement the choice mechanism in $O(m\log m) = O(n\log m)$ using a heap.

When merging two arrays of length $a,b$, the running time is $O(a+b)$. Hence we would like to understand the total sum $S$ of $a+b$, where $(a,b)$ goes over all $m$ pairs of arrays being merged. To this end, we consider a merge tree, which is formed in the following way. We start with $m+1$ vertices corresponding to $B_1,\ldots,B_{m+1}$. When two arrays corresponding to vertices $x,y$ are merged, we create a new vertex with $x,y$ its only children. You can check that
$$
S = \sum_{i=1}^{m+1} |B_i| \mathrm{depth}(B_i).
$$
Consider now a probability distribution $X$ on $[m+1]$ with $\Pr[X=i] = |B_i|/n$. Then $S/n$ is the average codeword length of an optimal prefix code for $X$ (this is because we're essentially running Huffman's algorithm). Therefore $S/n < \log m + 1$, showing that the merging steps take $O(n\log m)$ time in total.

Summarizing, the algorithm runs in time $O(n\log m)$.

(Strictly speaking, to handle the cases $m=1$ and $m=0$, we should replace $m$ with $m+2$.)

Context

StackExchange Computer Science Q#104603, answer score: 6

Revisions (0)

No revisions yet.