patternModerate
Optional prefix code for the naturals
Viewed 0 times
thecodeoptionalforprefixnaturals
Problem
The naturals $\mathbb{N}$ can be encoded with a binary unary code such as
$$
\begin{align}
1&=0_b\\
2&=10_b\\
3&=110_b\\
&...
\end{align}
$$
The length this the encoding grows linearly with the natural number it represents. For example the length of $110_b$ is 3 and it also represents the number 3.
I am looking for an upper bound on the optimality of the prefix free encoding. For example, can we design a prefix coding of $\mathbb{N}$ such that the length of the encoding grows according to ~$\sqrt{n}$, instead of $n$. Is there an upper bound on the optimality that we can prove?
$$
\begin{align}
1&=0_b\\
2&=10_b\\
3&=110_b\\
&...
\end{align}
$$
The length this the encoding grows linearly with the natural number it represents. For example the length of $110_b$ is 3 and it also represents the number 3.
I am looking for an upper bound on the optimality of the prefix free encoding. For example, can we design a prefix coding of $\mathbb{N}$ such that the length of the encoding grows according to ~$\sqrt{n}$, instead of $n$. Is there an upper bound on the optimality that we can prove?
Solution
I wrote a paper on this. The short answer is that there is no optimal encoding, nor even an optimal sequence of better and better encodings.
Kraft's inequality states that there is a prefix code with word lengths $k_0,k_1,\ldots$ if and only if
$$
\sum_{n=0}^\infty 2^{-k_i} \leq 1.
$$
This gives a positive answer to your question. Concretely, Elias gamma coding has code word length $O(\log n)$. The idea is simple: first you encode the bit-length of the integer using your method (using roughly $\log_2 n + 1$ bits), then you encode the number itself (using another roughly $\log_2 n$ bits), for a total of roughly $2\log_2n + 1$ bits.
Kraft's inequality states that there is a prefix code with word lengths $k_0,k_1,\ldots$ if and only if
$$
\sum_{n=0}^\infty 2^{-k_i} \leq 1.
$$
This gives a positive answer to your question. Concretely, Elias gamma coding has code word length $O(\log n)$. The idea is simple: first you encode the bit-length of the integer using your method (using roughly $\log_2 n + 1$ bits), then you encode the number itself (using another roughly $\log_2 n$ bits), for a total of roughly $2\log_2n + 1$ bits.
Context
StackExchange Computer Science Q#80424, answer score: 12
Revisions (0)
No revisions yet.