HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Shuffling a file on disk using $O(\log n)$ memory

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
filedisklogshufflingusingmemory

Problem

How do you shuffle the bytes in a file (bytes for simplicity) on disk with a small, $O(\log n)$, amount of memory and preferably in-place?

If the file had size $2^m$, then we can first split the file into 2-byte chunks, shuffle them individually, then shuffle each set of neighboring 2 chunks and merge, doing this recursively. Merging can be done using constant memory by reading the start of a chunk and swapping them with another.

But if the file size is not $2^m$ then the above doesn't really work, as for a 3-byte file the last byte cannot be in the middle position. (the chunks here are [1, 2] and [3]).

[EDIT] This doesn't generate all permutations as Yuval Filmus points out.

[EDIT 2] The paper "Random permutations on distributed, external and hierarchical memory" seems like a close match. Other refs/ideas welcome.

Solution

The algorithm you suggest doesn't result in a uniform permutation. An in-place algorithm which works for every file size is the Fisher–Yates shuffle.

Context

StackExchange Computer Science Q#50856, answer score: 5

Revisions (0)

No revisions yet.