patternMinor
Lexicographical position of a string in its type class
Viewed 0 times
positionlexicographicaltypeitsclassstring
Problem
I have the following problem:
Given a string $x\in\{1,...,M\}^+$ of length $n$.
Let $S$ be the set of all words with exactly the same numbers of occurences of smybols as in $x$. In fact, $S$ consists of all permutations of $x$. We call this set the type class of $x$.
My question:
How to compute the position of $x$ in the lexicographical ordering of $S$?
I found the paper "Enumerative source coding" written by Thomas M. Cover, where this position function $i_{S}$ is shown for binary alphabets (in the following the alphabet is $\{0,1\}$):
$i_S(x_1x_2\cdots x_n)=\sum\limits_{k=1}^{n}x_k\cdot$ $n-k\choose n(w,k)$
with $\;\;n(w,k)=w-\sum\limits_{j=1}^{k-1}x_j\;\;\;$
and $\;\;w$ is the number of occurrences of $1$ in $x\;\;$ ($w=\sum\limits_{k=1}^{n}x_k$).
Unfortunately, i do not know how to extend this formula to alphabets of size $M$. Any help?
Given a string $x\in\{1,...,M\}^+$ of length $n$.
Let $S$ be the set of all words with exactly the same numbers of occurences of smybols as in $x$. In fact, $S$ consists of all permutations of $x$. We call this set the type class of $x$.
My question:
How to compute the position of $x$ in the lexicographical ordering of $S$?
I found the paper "Enumerative source coding" written by Thomas M. Cover, where this position function $i_{S}$ is shown for binary alphabets (in the following the alphabet is $\{0,1\}$):
$i_S(x_1x_2\cdots x_n)=\sum\limits_{k=1}^{n}x_k\cdot$ $n-k\choose n(w,k)$
with $\;\;n(w,k)=w-\sum\limits_{j=1}^{k-1}x_j\;\;\;$
and $\;\;w$ is the number of occurrences of $1$ in $x\;\;$ ($w=\sum\limits_{k=1}^{n}x_k$).
Unfortunately, i do not know how to extend this formula to alphabets of size $M$. Any help?
Solution
This is yet another example of enumerative encoding. Suppose that $x$ has $\alpha_i$ symbols of type $i$. The size of $S$ is then the multinomial coefficient $\binom{n}{\alpha_1,\ldots,\alpha_M}$. Pascal's identity reads
$$ \binom{n}{\alpha_1,\ldots,\alpha_M} = \sum_{i\colon \alpha_i \neq 0} \binom{n-1}{\alpha_1,\ldots,\alpha_i-1,\ldots,\alpha_M}. $$
In the lexicographic ordering of $S$, the strings starting with $1$ (if any) constitute the first $\binom{n-1}{\alpha_1-1,\ldots,\alpha_M}$ positions, the strings starting with $2$ (if any) constitute the following $\binom{n-1}{\alpha_1,\alpha_2-1,\ldots,\alpha_M}$ positions, and so on. So $x_1$ already gives you an interval of length $\binom{n-1}{\alpha_1,\ldots,\alpha_{x_1}-1,\ldots,\alpha_m}$ to which $x$ belongs in the lexicographical ordering of $S$. To find its location within the interval, recurse: examine $x_2$, then $x_3$, and so on.
Here is the resulting algorithm, returning a $0$-based position:
$$ \binom{n}{\alpha_1,\ldots,\alpha_M} = \sum_{i\colon \alpha_i \neq 0} \binom{n-1}{\alpha_1,\ldots,\alpha_i-1,\ldots,\alpha_M}. $$
In the lexicographic ordering of $S$, the strings starting with $1$ (if any) constitute the first $\binom{n-1}{\alpha_1-1,\ldots,\alpha_M}$ positions, the strings starting with $2$ (if any) constitute the following $\binom{n-1}{\alpha_1,\alpha_2-1,\ldots,\alpha_M}$ positions, and so on. So $x_1$ already gives you an interval of length $\binom{n-1}{\alpha_1,\ldots,\alpha_{x_1}-1,\ldots,\alpha_m}$ to which $x$ belongs in the lexicographical ordering of $S$. To find its location within the interval, recurse: examine $x_2$, then $x_3$, and so on.
Here is the resulting algorithm, returning a $0$-based position:
- Set $P \gets 0$.
- For $i=1,\ldots,n$:
- Set $P \gets P + \sum_{j
- Set $\alpha_{x_i} \gets \alpha_{x_i} - 1$.
- Return $P$.
Context
StackExchange Computer Science Q#28671, answer score: 3
Revisions (0)
No revisions yet.