How can you shuffle in $O(n)$ time if you need $\Omega(n \log n)$ random bits?
Problem
A shuffling algorithm is supposed to generate a random permutation of a given finite set. So, for a set of size $n$, a shuffling algorithm should return any of the $n!$ permutations of the set uniformly at random.
Also, conceptually, a randomized algorithm can be viewed as a deterministic function of its input and some random seed. Let $S$ be any shuffling algorithm. On input $X$ of size $n$, its output is a function of the randomness it has read. To produce $n!$ different outputs, $S$ must have read at least $\log(n!) = \Omega(n \log n)$ bits of randomness. Hence, any shuffling algorithm must take $\Omega(n \log n)$ time.
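(As a quick numerical sanity check of that bound, not part of the original argument, here is a minimal Python sketch comparing $\log_2(n!)$ with $n \log_2 n$; the two grow at the same rate.)

```python
import math

# Compare log2(n!) -- the number of random bits needed to pick one of
# n! permutations uniformly -- against n * log2(n).
for n in (10, 100, 1000):
    bits_needed = math.log2(math.factorial(n))
    print(f"n={n:5d}  log2(n!)={bits_needed:10.1f}  n*log2(n)={n * math.log2(n):10.1f}")
```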
On the other hand, the Fisher-Yates shuffle is widely believed to run in $O(n)$ time. Is there something wrong with my argument? If not, why is this belief so widespread?
Solution
Your argument is perfectly valid (Fisher-Yates does indeed require $\log(n!)$ bits of randomness); the discrepancy comes from making different assumptions about the cost of random number generation.
You're assuming that generating a random number between $0$ and $n$ takes $O(\log n)$ time.
But when saying that the Fisher-Yates shuffle runs in $O(n)$ time, one assumes that generating a random number takes $O(1)$ time.
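For reference, a minimal Python sketch of the Fisher-Yates shuffle under that assumption: the loop runs $n - 1$ times, drawing one random index per iteration, so if each draw is $O(1)$ the whole shuffle is $O(n)$.

```python
import random

def fisher_yates_shuffle(items):
    """In-place Fisher-Yates shuffle.

    Performs n - 1 iterations, each drawing one uniform random index.
    Under the word-RAM assumption that each draw costs O(1), the whole
    shuffle runs in O(n) time.
    """
    for i in range(len(items) - 1, 0, -1):
        j = random.randint(0, i)               # uniform in [0, i], assumed O(1)
        items[i], items[j] = items[j], items[i]
    return items
```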
An integer index (as used in the Fisher-Yates shuffle) is generally considered to take $O(1)$ space (even though it is technically $O(\log n)$), because a single 64-bit integer can index far more data than can currently fit into the memory of any computer most of us have access to$^{1}$. And the number of random bits that must be generated for such an index has the same complexity as its space.
1: To go beyond the number of elements a 64-bit integer can index, we would need more than $2^{64} = 18{,}446{,}744{,}073{,}709{,}551{,}616$ elements; even at 1 bit per element (which is obviously very small), that is over 2 exabytes of data.
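(A back-of-envelope check of the footnote's arithmetic, as a hedged sketch:)

```python
# Storing 2**64 elements at 1 bit each.
elements = 2 ** 64            # 18,446,744,073,709,551,616 elements
bytes_needed = elements // 8  # 8 bits per byte
print(f"{bytes_needed / 1e18:.1f} exabytes")  # -> 2.3 exabytes
```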
Context
StackExchange Computer Science Q#13990, answer score: 12