patternMinor
How many sequences in a prefix code can be compressed by m bits?
Viewed 0 times
canhowbitscodemanyprefixcompressedsequences
Problem
I have a little understanding problem with Appendix A ("Universal Codes") in the paper "Shannon Information and Kolmogorov complexity" by Gründwald and Vitanyi (Link).
At the end of page 50, they say something corresponding to:
For each Prefix-code, the fraction of sequences of length $n$ that can be compressed by more than $m$ bits is less than $2^{-m}$.
Either this is a writing mistake or I should make a break from reading.
It is easy to see, that the number of strings with length $n$ and compression length smaller than $m$ is smaller $2^{m-n}$ for a prefix-code because there are simple not enough "shorter" words.. Did they mean this information?
It is also easy to understand, that the Kraft-Inequality $\sum_{x\in X}2^{-l(x)}\leq1$ holds for a prefix code with source word set $X$ and the compression length $l(x)$ for strings $x\in X$.
Is that only a writing mistake? Respectively can you tell me an explanation?
At the end of page 50, they say something corresponding to:
For each Prefix-code, the fraction of sequences of length $n$ that can be compressed by more than $m$ bits is less than $2^{-m}$.
Either this is a writing mistake or I should make a break from reading.
It is easy to see, that the number of strings with length $n$ and compression length smaller than $m$ is smaller $2^{m-n}$ for a prefix-code because there are simple not enough "shorter" words.. Did they mean this information?
It is also easy to understand, that the Kraft-Inequality $\sum_{x\in X}2^{-l(x)}\leq1$ holds for a prefix code with source word set $X$ and the compression length $l(x)$ for strings $x\in X$.
Is that only a writing mistake? Respectively can you tell me an explanation?
Solution
There is no problem with the statement. An encoding $f$ of the set of strings of length $n$ is a map from $\{0,1\}^n$ to $\{0,1\}^*$ which is injective (one-to-one). Such a mapping compresses a string $x$ by more than $m$ if $|f(x)| < |x| - m$. In our case, all the strings have length $n$, so a string $x \in \{0,1\}^n$ is compressed by more than $m$ if $|f(x)| < n-m$. There are $\sum_{i=0}^{n-m-1} 2^i = 2^{n-m}-1 < 2^{n-m}$ such strings, and since $f$ is injective, the fraction of strings compressed by more than $m$ is less than $2^{n-m}/2^n = 2^{-m}$.
Context
StackExchange Computer Science Q#20114, answer score: 2
Revisions (0)
No revisions yet.