patternMinor

Arithmetic coding and "the optimal compression ratio"

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

theoptimalcompressionratioandcodingarithmetic

Problem

Trying to learn more about compression techniques and found something in the wikipedia article on arithmetic coding that I'm not sure I fully grok. In describing how Huffman Coding can sometimes be inefficient, the author refers to an 'optimal compression ratio' that seems to be a function of the probabilities of a given symbol being represented at any given position in the dataset. Am I correct in understanding this to mean:

Given a set of data and a set of probabilities describing the likelihood of any given member of the dataset being a given symbol, then there is no way to represent that data encoded in that manner in fewer bits than the calculated optimal compression ratio?

In other words: there's no way to encode something in few than 'optimal' bits so don't try?

Solution

Beware: The phrase "optimal compression ratio" is perhaps a bit misleading. It is intended to make you think of "the best compression ratio that is achievable", but there are some assumptions that it comes with -- you should really think of it as "the best compression ratio that is achievable, under a certain set of assumptions" (or "the best compression ratio that is achievable by a certain type of compression algorithm"). Since those assumptions often aren't literally true, there's no real guarantee that this is truly the smallest something can be compressed to.

In particular, one key assumption is that the symbols come from a memoryless iid source: i.e., there is a distribution on characters, and each character is sampled randomly from this distribution, independent of all previous characters. For a source that satisfies this assumption, we can calculate the best achievable compression ratio; it is the entropy.

Of course we know this isn't true in practice; in English text, if the previous letter was 'q', the next one is much more likely to be a 'u' than otherwise. So really you should think of the entropy as telling you "if you didn't take advantage of that type of knowledge, what is the best compression ratio you could achieve?".

Context

StackExchange Computer Science Q#68252, answer score: 6

Revisions (0)

No revisions yet.