HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMinor

Efficient algorithm to generate two diffuse, deranged permutations of a multiset at random

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
randommultisetefficientalgorithmgeneratetwoderangeddiffusepermutations

Problem

Background

$\newcommand\ms[1]{\mathsf #1}\def\msD{\ms D}\def\msS{\ms S}\def\mfS{\mathfrak S}\newcommand\mfm[1]{#1}\def\po{\color{#f63}{\mfm{1}}}\def\pc{\color{#6c0}{\mfm{c}}}\def\pt{\color{#08d}{\mfm{2}}}\def\pth{\color{#6c0}{\mfm{3}}}\def\pf{4}\def\pv{\color{#999}5}\def\gr{\color{#ccc}}\let\ss\gr$Suppose I have two identical batches of $n$ marbles. Each marble can be one of $c$ colors, where $c≤n$. Let $n_i$ denote the number of marbles of color $i$ in each batch.

Let $\msS$ be the multiset $\small\{\overbrace{\po,…,\po}^{n_1},\;\overbrace{\pt,…,\pt}^{n_2},\;…,\;\overbrace{\vphantom 1\pc,…,\pc}^{n_c}\}$ representing one batch. In frequency representation, $\msS$ can also be written as $(\po^{n_1} \;\pt^{n_2}\; … \;\pc^{n_c})$.

The number of distinct permutations of $\msS$ is given by the multinomial:
$$\left|\mfS_{\msS}\right|=\binom{n}{n_1,n_2,\dots,n_c}=\frac{n!}{n_1!\,n_2!\cdots n_c!}=n! \prod_{i=1}^c \frac1{n_i!}.$$

Question

Is there an efficient algorithm to generate two diffuse, deranged permutations $P$ and $Q$ of $\msS$ at random? (The distribution should be uniform.)

-
A permutation $P$ is diffuse if for every distinct element $i$ of $P$, the instances of $i$ are spaced out roughly evenly in $P$.


For example, suppose $\msS=(\po^4\;\pt^4)=\{\po,\po,\po,\po,\pt,\pt,\pt,\pt\}$.



  • $\{\po, \po, \po, \pt, \pt, \pt, \pt, \po\}$ is not diffuse



  • $\{\po, \pt, \po, \pt, \po, \pt, \po, \pt\}$ is diffuse




More rigorously:

  • If $n_i=1$, there is only one instance of $i$ to “space out” in $P$, so let $\Delta(i)=0$.



  • Otherwise, let $d(i,j)$ be the distance between instance $j$ and instance $j+1$ of $i$ in $P$. Subtract from it the expected distance between instances of $i$, defining the following:


$$\delta(i,j)=d(i,j)-\frac n{n_i}\qquad\qquad\Delta(i)=\sum_{j=1}^{n_i-1} \delta(i,j)^2$$ If $i$ is evenly spaced in $P$, then $\Delta(i)$ should be zero, or very close to zero if $n_i\nmid n$.

Now define the statistic $s(P)=\sum_{i=1}^c\Delta(i

Solution

One approach: you can reduce this to the following problem: Given a boolean formula $\varphi(x)$, choose an assignment $x$ uniformly at random from among all the satisfying assignments of $\varphi(x)$. This problem is NP-hard, but there are standard algorithms for generating an $x$ that is approximately uniformly distributed, borrowing methods from #SAT algorithms. For instance, one technique is to pick a hash function $h$ whose range has a carefully chosen size (about the same size as the number of satisfying assignments of $\varphi$), choose uniformly at random a value $y$ from within the range of $h$, and then use a SAT solver to find a satisfying assignment to the formula $\varphi(x) \land (h(x)=y)$. To make it efficient, you can choose $h$ to be a sparse linear map.

This might be shooting a flea with a cannon, but if you have no other approaches that seem workable, this is one you could try.

Context

StackExchange Computer Science Q#41777, answer score: 3

Revisions (0)

No revisions yet.