HiveBrain v1.2.0

compressed information = randomness?

Submitted by: @import:stackexchange-cs
Tags: randomness, compressed, information

Problem

Suppose I have a compressed file that cannot be compressed any further without losing information. We say that such a file is random, or pseudorandom.

So, if randomness means not comprehensible and not compressible, I don't understand why this file is, at the same time, information that my computer and I can understand.

This file could be a book that my computer can display and read aloud, and that I can read and summarize... so, is it really random?

Note: I understand that being able to summarize a text, or describe it in fewer words, does not mean that all the information in the book could be recovered from the summary, of course. But the book is still not random to me.

Note II: I understand randomness as something that cannot be reproduced by a smaller algorithm. I mean that a string is random when I cannot find a shorter string that is a program reproducing the first one.

Note III: I want to thank you all for your help.

Solution

Okay, what you are talking about can be explained using the concept of Kolmogorov complexity.

Let's understand Kolmogorov complexity and randomness.

Suppose you have the strings $A = HHHHH$ and $B = TTHTH$. Intuitively, $B$ seems more random than $A$; statistically, however, both strings have exactly the same probability of being produced by five fair coin tosses. This troubled researchers for some time, until Kolmogorov and Chaitin (independently) came up with a notion of randomness.
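To make that intuition concrete, here is a small sketch (my own illustration, not from the original post) using Python's `zlib` as a stand-in compressor, with the two kinds of string scaled up so the compressor has something to work with:

```python
import random
import zlib

# A structured string (like A = HHHHH, scaled up) versus an
# unstructured one drawn from simulated fair coin tosses.
structured = "H" * 1000
random.seed(0)  # arbitrary seed, for reproducibility
unstructured = "".join(random.choice("HT") for _ in range(1000))

# zlib finds the repetition in the first string and squeezes it down
# to a handful of bytes; the second string resists compression.
print(len(zlib.compress(structured.encode())))
print(len(zlib.compress(unstructured.encode())))
```

The exact byte counts depend on the compressor, but the structured string always comes out dramatically shorter than the coin-toss string.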

A string is said to be random if it cannot be compressed, that is, if it has no 'structure' in it.
Formally, for any word $x \in (\Sigma_{\text{bool}})^*$, the Kolmogorov complexity $K(x)$ of the word is the binary length of the shortest program generating it.

A word $w_n$ is said to be random if it is not compressible, i.e. $K(w_n) \geq |w_n| - c$ for some fixed constant $c$.

If you want to read more on this, a good starting point is the survey note on Kolmogorov complexity by Lance Fortnow.

Now, as I understand it, your question is: how can a word that is incompressible be 'information' when we use the same notion to define randomness?

This gets a bit philosophical... well, randomness is always a bit philosophical! Anyway, what we call/define as random is information without structure. The outcome of a sequence of unbiased coin tosses is random in exactly this sense: it should have no structure, and one should never be able to find any patterns or periodic repetitions in the string.
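One concrete kind of structure mentioned here, periodic repetition, is easy to test for. A small sketch (the function name is mine; the one-liner is the classic string-rotation trick):

```python
def has_periodic_repetition(s: str) -> bool:
    # s consists of two or more copies of some shorter block exactly
    # when s occurs inside (s + s) with its first and last characters
    # trimmed off.
    return len(s) > 1 and s in (s + s)[1:-1]

print(has_periodic_repetition("HTHTHT"))  # 'HT' repeated: structure
print(has_periodic_repetition("TTHTH"))   # no periodic structure
```

A periodic string is highly compressible (store the block plus a repeat count), so detecting a period immediately witnesses non-randomness; the converse does not hold, since a string can be structured without being periodic.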

Information, in Shannon's sense, is basically a numerical measure of the uncertainty of an experimental outcome.
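That measure is Shannon entropy. A minimal sketch (empirical, per-symbol, over a finite outcome string; my own illustration):

```python
import math
from collections import Counter

def shannon_entropy(outcomes: str) -> float:
    # Empirical entropy in bits per symbol: a numerical measure of
    # the uncertainty of the experiment that produced `outcomes`.
    counts = Counter(outcomes)
    n = len(outcomes)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

print(shannon_entropy("HHHHHHHH"))  # 0.0 bits: a rigged coin, no uncertainty
print(shannon_entropy("HTHTHHTT"))  # 1.0 bits: a fair coin, maximal uncertainty
```

Note that entropy only looks at symbol frequencies: `"HTHTHTHT"` also scores 1.0 bit per symbol despite being perfectly periodic, which is exactly the gap that Kolmogorov complexity closes.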

Now, let's use the K-complexity. Suppose we start writing down the outcomes of a coin toss. Without such a notion you have no metric by which to evaluate the randomness of the resulting string; randomness is really a property associated with information.
You can associate a certain degree of randomness with anything that is based on experimental outcomes.

The K-complexity is just a measure of the randomness in information.
For a completely 'random' string, $K(w_n) \approx |w_n|$ (up to an additive constant), while for a completely 'non-random' string, $K(w_n) = \delta + c$, where $\delta$ is some small quantity.
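Finally, coming back to the premise of the question: you can check empirically that a compressed file behaves like a random string, in that compressing it a second time gains nothing. A sketch (the sample text is an arbitrary placeholder, not the questioner's book):

```python
import zlib

# A highly structured 'book': the same sentence repeated many times.
book = ("It was the best of times, it was the worst of times. " * 200).encode()

once = zlib.compress(book, 9)    # first pass removes the structure
twice = zlib.compress(once, 9)   # second pass finds (almost) none left

print(len(book), len(once), len(twice))
```

The first pass shrinks the text enormously; the second pass cannot shrink its own output, because whatever structure the compressor can see has already been removed. That is precisely the sense in which the compressed file is 'random' while still carrying all the information of the book.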

Context

StackExchange Computer Science Q#14772, answer score: 8
