03-Apr-2011, 07:09 PM

Your question is already taken care of in the article under discussion. Here is the relevant excerpt:

'Gregory Chaitin is a pioneer of algorithmic information theory (AIT). To understand the essence of AIT, consider a very simple example. Take the set of all positive integers, and ask the question: How many bits of information are needed to specify all these integers? The answer is an absurdly large number. But the fact is that this set of data has very little information content. It has a structure which we can exploit to write an algorithm which can generate all the integers, and the number of bits of information needed to write the algorithm is indeed not large. So the algorithmic information content in this problem is small.

One can generalize and say that, in terms of computer algorithms, the best theory is that which requires the smallest computer program for calculating (and hence explaining) the observations. The more compact the theory, the smaller is the length of this computer program. Chaitin’s work has shown that the Ockham razor is not just a matter of philosophy; it has deep algorithmic-information underpinnings. If there are competing descriptions or theories of reality, the more compact one has a higher probability of being correct. Ockham’s razor cuts away all the flab. Let us see why.

In AIT, an important concept is that of algorithmic probability (AP). It is the probability that a random program of a given length fed into a computer will give a desired output, say the first million digits of π. Following Bennett and Chaitin’s pioneering work done in the 1970s, let us assume that the random program has been produced by a monkey. The AP in this case is the same as the probability that the monkey would type out the same bit string, i.e. the same computer program as, say, a Java program suitable for generating the first million digits of π. The probability that the monkey would press the first key on the keyboard correctly is 0.5. The probability that the first two keys would be pressed correctly is (0.5)² or 0.25. And so on. Thus the probability gets smaller and smaller very rapidly as the number of correctly sequenced bits increases. The longer the program, the less likely it is that the monkey will crank it out correctly. This means that the AP is the highest for the shortest programs or the most compact theories. The best theory has the smallest number of axioms.

In the present context, suppose we have a bit string representing a set of data, and we want to understand the mechanism responsible for the creation of that set of data. In other words, we want to discover the computer program (or the best theory), among many we could generate randomly, which is responsible for that set of data. The validation of Ockham’s philosophy comes from the fact that the shortest such program is the most plausible guess because it has the highest AP.'
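To make the excerpt's monkey-typing argument concrete, here is a small illustrative sketch (my own, not from the article): with fair coin flips for each bit, the chance of producing one fixed program of length L bits is 0.5 raised to the power L, so the algorithmic probability falls off exponentially with program length.

```python
# Sketch of the algorithmic-probability argument from the excerpt above.
# Assumption: each bit of the program is typed correctly with probability 0.5,
# independently, exactly as in the monkey example.

def algorithmic_probability(program_length_bits: int) -> float:
    """Probability that a random bit string matches one fixed program."""
    return 0.5 ** program_length_bits

# Two hypothetical programs producing the same output, one 10 bits long
# and one 20 bits long:
short_prog = algorithmic_probability(10)
long_prog = algorithmic_probability(20)

print(short_prog)              # 0.0009765625
print(long_prog)               # about 9.54e-07
print(short_prog / long_prog)  # the shorter program is 2**10 = 1024x more probable
```

The ratio makes Ockham's razor quantitative: every extra bit of program length halves the probability, so the most compact description dominates.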

For more details please see the book 'Thinking About Gödel and Turing' by Chaitin (2007). I can send you the PDF version of the book. Just send me an email.
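The excerpt's opening example about the positive integers can also be sketched in code. The comparison below is illustrative only (the function names are mine): writing out the first n integers costs a number of bits that grows with n, while the source of a generator program has a fixed size, independent of n, which is why the algorithmic information content of the set is small.

```python
# Illustration of the excerpt's opening example: explicit listing vs. a
# short generating program for the first n positive integers.

def literal_size_bits(n: int) -> int:
    """Bits needed to write out 1..n explicitly, one binary numeral each."""
    return sum(k.bit_length() for k in range(1, n + 1))

# A tiny generator program; its source length is constant, whatever n is.
generator_source = "def gen(n):\n    return list(range(1, n + 1))"
program_size_bits = 8 * len(generator_source)

print(literal_size_bits(1_000_000))  # millions of bits for the explicit list
print(program_size_bits)             # a few hundred bits, for any n
```

In Chaitin's terms, the generator's length (plus the bits needed to state n) is an upper bound on the algorithmic information content, and it is vastly smaller than the literal listing.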
