HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Solomonoff's theory of induction, Kolmogorov complexity and Bayesian Inference

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
theorybayesianinferencesolomonoffandcomplexitykolmogorovinduction

Problem

My motivations for asking this question are philosophical in nature. I'm by no means a computer scientist though, and I feel as though this question should be answered by someone who is since it's one thing to read about a subject second hand and another to understand it first hand. I'm an A-level student of philosophy, physics and mathematics (UK qualification) if that helps you formulate an answer at all.

The question relates to the problem of induction in that, we have little reason to believe the universe is uniform and apply that expectation to science and inference. However, I've seen the claim made that this guy Solomonoff created a theory of induction that uses a Bayesian framework as well as something called 'Kolomogrov complexity' as an objective measure of 'complexity' in a model or hypothesis. For example, if we observe over and over again that all A's are B's, we might be tempted to formulate the hypothesis:

"All A's are B's",

This would be afforded the maximum credence from a bayesian framework since it predicts that A's are B's with a probability of 1 and we're attaching ourselves to the likelihood lover's principle. However, you could also formulate the hypothesis:

"All A's are B's until a time t where they are C."

And there are infinitely many of these hypothesis since the 'C' can represent literally anything. This has a bayesian multiplier equal to the previous hypothesis so technically have an equal credence if we don't prefer one of these in our priors. And the only way (it would seem) to decide between the two is to see if hypothesis 2 shows it's prediction at time t. So we would otherwise have to always wait until time t to see if hypothesis 1 is correct.

My question can be split up into three parts:

1) Is "Kolmogrov complexity" generally considered a genuine formulation of an objective definition for complexity/simplicity?

2) Would the former hypothesis, with such an understanding of complexity be considered more 'simple'?

3) Ho

Solution

Mathematics and computer science doesn't have anything to say about whether simpler hypotheses are more likely. That's a question about reality, not about math / computer science.

What computer science can provide is the notion of Kolmogorov complexity. Kolmogorov complexity is one reasonable notion of simplicity: bit-strings with lower Kolmogorov complexity are "simpler" in some sense.

Now we could start with a model of reality, which says that nature randomly picks a process for generating data; the process is obtained by picking a Turing machine at random, with smaller (simpler) Turing machines more likely to be chosen than complex ones; and then nature runs the Turing machine and the data you observe is the output of that Turing machine. That's a sketch of a possible model of reality; basically, it's an assumption about how nature works.

That assumption then implies (if we fill in some details) a particular prior on hypotheses. A hypothesis amounts to a Turing machine. There will be multiple Turing machines that are all consistent with the observations (that could have produced those observations), and Bayes theorem + the prior will let you infer the posterior distribution on which of those hypotheses are most likely.

This is a possible way to obtain a prior, and it seems reasonable to me. Computer science can't tell you whether that model (that assumption) is a good model of reality. But it can help you work out the consequences of making that assumption.

Finally, you seem to be asking why simpler explanations are more often correct. Basically, why is Occam's razor useful? I don't know if there is any completely convincing answer to that.

One possible answer is the empirical answer: it seems to work well in practice. In other words, we often find in nature that very simple processes/models suffice to explain a wide range of natural observations. See, e.g., the unreasonable effectiveness of mathematics.

There's lots more that one can say about Occam's razor. See, e.g., https://en.wikipedia.org/wiki/Occam%27s_razor#Justifications and https://philosophy.stackexchange.com/questions/tagged/occams-razor?sort=votes. I think that gets beyond the scope of this site.

Context

StackExchange Computer Science Q#92144, answer score: 3

Revisions (0)

No revisions yet.