patternMinor

Efficiently shuffling items in $N$ buckets using $O(N)$ space

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

sampling cs stackoverflow algorithms randomized-algorithms time-complexity

bucketsefficientlyspaceshufflingitemsusing

Problem

I’ve run into a challenging algorithm puzzle while trying to generate a large amount of test data. The problem is as follows:

-
We have $N$ buckets, $B_1$ through $B_N$. Each bucket $B_i$ maps to a unique item $a_i$ and a count $k_i$. Altogether, the collection holds $T=\sum_1^N{k_i}$ items. This is a more compact representation of a vector of $T$ items where each $a_i$ is repeated $k_i$ times.

-
We want to output a shuffled list of the $T$ items, all permutations equally probable, using only $O(N)$ space and minimal time complexity. (Assume a perfect RNG.)

-
$N$ is fairly large and $T$ is much larger; 5,000 and 5,000,000 in the problem that led me to this investigation.

(EDIT: further research instigated by @YuvalFilmus’s comment shows that this is equivalent to weighted sampling without replacement, a search term that leads to quite a lot of research.)

Now clearly the time complexity is at least $O(T)$ since we have to output that many items. But how closely can we approach that lower bound? Some algorithms:

-
Algorithm 1: Expand the buckets into a vector of $T$ items and use Fisher-Yates. This uses $O(T)$ time, but also $O(T)$ space, which we want to avoid.

-
Algorithm 2: For each step, choose a random number $R$ from $[0,T-1]$. Traverse the buckets, subtracting $k_i$ from $R$ each time, until $R

-
Algorithm 3: Convert the vector of buckets into a balanced binary tree with buckets at the leaf nodes; the depth should be close to $\log_2{N}$. Each node stores the total count of all the buckets under it. To shuffle, choose a random number $R$ from $[0,T-1]$, then descend into the tree accordingly, decrementing each node count as we go; when descending to the right, reduce $R$ by the left count. When we reach a leaf node, output its value. It uses $O(N)$ space and $O(T\log{N})$ time.

-
Algorithm 3a: Same as Algorithm 3, but with a Huffman tree; this should be faster if the $k_i$ values vary widely, since the most often visited nodes will be closer to

Solution

Step 1. Store the buckets cumulatively. E.g. instead of storing $[3, 5, 2, 2]$ you store $[3, 8, 10, 12]$. This allows you to find using binary search in $\log n$ time what bucket the $k$th (out of $T$) item belongs to (and by extension which $a_i$ to output).

Step 2. Use a memory-less permutation of the range $[1, T]$. I personally like Sometimes-Recurse Shuffle. A memory-less permutation can give you $\pi(k)$ directly for arbitrary $k$ in $\log T$ time. Some example code I have written a couple years back show how simple it is (with toy PRF implementation - replace that): https://gist.github.com/orlp/33535eefce782a59e185e4a971cda1a3.

Step 3. To actually shuffle simply enumerate $k = 1, 2, \dots$, compute $\pi(k)$ using the memory-less permutation, and find which bucket $\pi(k)$ belongs to output the proper $a_i$. Total time complexity is $\log(T)$ per item, or $T \log(T)$ for the whole permutation.

The advantage is that you don't have to compute the whole permutation, you can also just directly compute the $k$th element in the permutation.

Context

StackExchange Computer Science Q#100493, answer score: 2

Revisions (0)

No revisions yet.