HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMinor

Why is finding minimum number of comparisons to sort $n$ elements so difficult?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
whynumberelementsminimumfindingsortdifficultcomparisons

Problem

In The Art of Computer Programming 2nd Ed, Vol 3, Section 5.3.1 then discuss a function $S(n)$ which is define as:


$S(n)$ : The minimum number of comparisons that suffice to sort $n$ elements.

Further, the book regards $\lceil \lg n! \rceil$ as the information theoretic lower bound for $S(n)$.

Using merge insertion they also upper bound the number of comparisons by $F(n)$ where

$$F(n) = \sum_{k = 1}^{n} \lceil \lg \tfrac{3}{4} k \rceil$$

So you can get the bound $\lceil \lg n! \rceil \leq S(n) \leq F(n)$, and for any values $n$ where $\lceil \lg n! \rceil = F(n)$ you can find the exact value of $S(n)$.

My questions are:

-
Why does $S(n)$ not always match the information theoretic lower bound $\lceil \lg n! \rceil$? It seems like if this is all the bits of information we should need, that this is all the comparisons we would need. Why do they differ?

-
Why is $S(n)$ so difficult to compute? It's discussed in the book some but the reasons are still unclear to me. Do you have to brute force and create every possible decision tree for a given $n$ and determine the longest path? Is there not a more efficient way? It seems that $S(n)$ has only been exactly computed for $n \leq 22$ (See A036604 here).

Solution

Let's say the size of the array is such that there are $2^{k-1} < x ≤ 2^k$ possible permutations, so the theoretical lower bound to sort this array is k comparisons.

Any comparison divides the set of possible permutations in two subsets, consistent with each outcome of the comparison. You'd first need a comparison that splits the set of all possible permutations into two subsets, each of size at most $2^{k-1}$. And each subset you need to split into two subsets of size at most $2^{k-2}$ etc. If these subsets don't have the same size, then after j comparisons the size of the largest subset could exceed $2^{k-j}$, and you lost.

Context

StackExchange Computer Science Q#106738, answer score: 3

Revisions (0)

No revisions yet.