snippetMinor
How to iterate over vectors in order of probability
Viewed 0 times
ordervectorsiterateprobabilityhowover
Problem
Consider a random $n$ dimensional vector $v$ where $v_i \in \{0,1\}$. For each $i$ we know $p_i = P(v_i = 1)$ and let us assume the $v_i$ are independent. Using these probabilities, is there an efficient way to iterate over all possible binary $n$ dimensional vectors in order from most likely to least likely (with arbitrary choices for ties)?
Take for example $p = \{0.8, 0.3, 0.6\}$. The most likely vector is $(1,0,1)$ and the least likely is $\{0,1,0\}$.
For small $n$ we could label each vector with its probability and sort although this isn't very efficient. However, consider $n=100$ for example. This is no longer an option.
Take for example $p = \{0.8, 0.3, 0.6\}$. The most likely vector is $(1,0,1)$ and the least likely is $\{0,1,0\}$.
For small $n$ we could label each vector with its probability and sort although this isn't very efficient. However, consider $n=100$ for example. This is no longer an option.
Solution
You can solve this with a priority queue.
Let $L(x)$ be the log-probability of the vector $x$, i.e.,
$$L(x) = \log \prod_i p_i^{x_i} (1-p_i)^{1-x_i} = \sum_i \ell_i(x_i)$$
where
$$\ell_i(b) = \log (p_i^b (1-p_i)^{1-b}).$$
Since the log is monotone, your problem is equivalent to looking for a way to enumerate the vectors $x \in \{0,1\}^n$ in order of decreasing $L(\cdot)$ value.
Of course, it is easy to identify the vector which has highest probability, i.e., highest $L(\cdot)$-value. We simply put a $1$ in the $i$-th position if $p_i\ge 1/2$ and a $0$ otherwise; call the result $x_{\text{best}}$. So create a priority queue that will hold a set of $n$-vectors. Initialize it to a single element, $x_{\text{best}}$. Assign it the priority $L(x_{\text{best}})$. In general, if $x$ is in the priority queue, its priority will be $L(x)$. Also, create a set $S$ of nodes that have been previously output; initially we will have $S = \emptyset$.
Now repeat the following loop:
-
Find the value of maximum priority and remove it from the priority queue. Call it $x$.
-
Output $x$. Add $x$ to $S$.
-
For each $i=1,2,\dots,n$: Let $y$ denote the vector obtained by flipping the $i$th bit of $x$; if $y \notin S$, add $y$ to the priority queue, with priority $L(y)$.
This loop will output the elements of $\{0,1\}^n$ in order of decreasing $L(\cdot)$-value.
Why does this work? Basically, if $x$ has the highest $L(\cdot)$-value, then the next-highest must be obtained by flipping a single bit; flipping two bits will give you something strictly worse. The same kind of justification can be applied inductively. Or, to think about it another way, this is basically Dijkstra's algorithm applied to the $n$-dimensional cube (vertices are $\{0,1\}^n$), and Dijkstra's algorithm is an efficient way to enumerate the nodes of a graph in order of decreasing distance. Since $L(\cdot)$ is additive, it can be understood as a distance from $x_{\text{best}}$.
Let $L(x)$ be the log-probability of the vector $x$, i.e.,
$$L(x) = \log \prod_i p_i^{x_i} (1-p_i)^{1-x_i} = \sum_i \ell_i(x_i)$$
where
$$\ell_i(b) = \log (p_i^b (1-p_i)^{1-b}).$$
Since the log is monotone, your problem is equivalent to looking for a way to enumerate the vectors $x \in \{0,1\}^n$ in order of decreasing $L(\cdot)$ value.
Of course, it is easy to identify the vector which has highest probability, i.e., highest $L(\cdot)$-value. We simply put a $1$ in the $i$-th position if $p_i\ge 1/2$ and a $0$ otherwise; call the result $x_{\text{best}}$. So create a priority queue that will hold a set of $n$-vectors. Initialize it to a single element, $x_{\text{best}}$. Assign it the priority $L(x_{\text{best}})$. In general, if $x$ is in the priority queue, its priority will be $L(x)$. Also, create a set $S$ of nodes that have been previously output; initially we will have $S = \emptyset$.
Now repeat the following loop:
-
Find the value of maximum priority and remove it from the priority queue. Call it $x$.
-
Output $x$. Add $x$ to $S$.
-
For each $i=1,2,\dots,n$: Let $y$ denote the vector obtained by flipping the $i$th bit of $x$; if $y \notin S$, add $y$ to the priority queue, with priority $L(y)$.
This loop will output the elements of $\{0,1\}^n$ in order of decreasing $L(\cdot)$-value.
Why does this work? Basically, if $x$ has the highest $L(\cdot)$-value, then the next-highest must be obtained by flipping a single bit; flipping two bits will give you something strictly worse. The same kind of justification can be applied inductively. Or, to think about it another way, this is basically Dijkstra's algorithm applied to the $n$-dimensional cube (vertices are $\{0,1\}^n$), and Dijkstra's algorithm is an efficient way to enumerate the nodes of a graph in order of decreasing distance. Since $L(\cdot)$ is additive, it can be understood as a distance from $x_{\text{best}}$.
Context
StackExchange Computer Science Q#24123, answer score: 6
Revisions (0)
No revisions yet.