patternMinor

What is a self-delimiting program?

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

turing-machines reference-request cs stackoverflow

programdelimitingwhatself

Problem

I came across the term self-delimiting program, referring to the paper "Laws of information (nongrowth) and aspects of the foundation of probability theory" by Levin, 1974.

Unfortunately this paper is not available afaik, and I couldn't find any definition anywhere. What is the definition? or what is a resource stating the definition and explaining the idea?

Solution

While a specific program could be self-delimiting, in this context what we usually mean is a self-delimiting encoding of programs. An encoding of programs is self-delimiting if it forms a prefix code, that is, no program is a prefix of another program. Intuitively, programs are self-delimiting, so there is no ambiguity in where a program ends.

Why do we care? Consider the chain rule in Kolmogorov complexity, which expresses the idea that if there are short programs that generate $x$ and $y$, then there is a short program that generates the concatenation $xy$. If programs are self-delimiting, then a program for generating $xy$ could look as follows:

Run the following two programs in sequence.

Program for generating $x$.

Program for generating $y$.

Denoting the Kolmogorov complexity of a string $w$ by $C(w)$, this shows that $$ C(xy) \leq C(x) + C(y) + O(1). $$
(The $O(1)$ takes care of the first bit.)

In contrast, if programs are not self-delimiting, then it is impossible to tell where the program for generating $x$ ends. Therefore the chain rule for non-self-delimiting Kolmogorov complexity takes the more complicated form $$ K(xy) \leq K(x) + K(y) + O(\log K(x) + 1), $$
where $O(\log K(x))$ is needed to delimit the first program.

Self-delimiting programs are also useful for defining the universal distribution, which is a distribution on programs useful for understanding average-case complexity. The probability that the program $P$ comes up is $2^{-|P|}$. Since the valid programs form a prefix code, Kraft's inequality states that $\sum_P 2^{-|P|} \leq 1$, so this is a (sub-)probability distribution.

So far the advantages of prefix-free Kolmogorov complexity. The disadvantage is that whereas $K(w) \leq |w|$, this no longer holds for $C(w)$; instead we only have $C(w) \leq |w| + O(\log |w|+1)$.

You can read a lot more on these issues in the excellent monograph of Li and Vitanyi on Kolmogorov complexity.

Context

StackExchange Computer Science Q#136984, answer score: 5

Revisions (0)

No revisions yet.