HiveBrain v1.2.0
Get Started
← Back to all entries
patternModerate

Optional prefix code for the naturals

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
thecodeoptionalforprefixnaturals

Problem

The naturals $\mathbb{N}$ can be encoded with a binary unary code such as

$$
\begin{align}
1&=0_b\\
2&=10_b\\
3&=110_b\\
&...
\end{align}
$$
The length this the encoding grows linearly with the natural number it represents. For example the length of $110_b$ is 3 and it also represents the number 3.

I am looking for an upper bound on the optimality of the prefix free encoding. For example, can we design a prefix coding of $\mathbb{N}$ such that the length of the encoding grows according to ~$\sqrt{n}$, instead of $n$. Is there an upper bound on the optimality that we can prove?

Solution

I wrote a paper on this. The short answer is that there is no optimal encoding, nor even an optimal sequence of better and better encodings.

Kraft's inequality states that there is a prefix code with word lengths $k_0,k_1,\ldots$ if and only if
$$
\sum_{n=0}^\infty 2^{-k_i} \leq 1.
$$
This gives a positive answer to your question. Concretely, Elias gamma coding has code word length $O(\log n)$. The idea is simple: first you encode the bit-length of the integer using your method (using roughly $\log_2 n + 1$ bits), then you encode the number itself (using another roughly $\log_2 n$ bits), for a total of roughly $2\log_2n + 1$ bits.

Context

StackExchange Computer Science Q#80424, answer score: 12

Revisions (0)

No revisions yet.