
We explain the basics of the theory of Kolmogorov complexity, also known as algorithmic information theory, and underline the main differences between plain Kolmogorov complexity and prefix Kolmogorov complexity. Then we introduce Per Martin-Löf's definition of randomness for both finite and infinite words, and show that it is equivalent to the notion of incompressibility defined via Kolmogorov complexity.

The contents of this research report

This research report (in French) is available from the LIP site in PostScript format. The version included here has been recompiled to produce a correct PDF and is missing all the traditional trappings of the LIP research reports (refer to the PostScript version if you want those).

This research report arose from the need to write a good summary of the basics of Kolmogorov complexity, and also because my PhD advisor (Bruno Durand) was setting up a lecture about computability that would include some discussion of Kolmogorov complexity. Some basics of coding theory are also covered. The reference book at the time was the Li-Vitányi book, which was somewhat confusing (there have been several editions since then).

Kolmogorov complexity is a brilliant idea introduced by Andrey N. Kolmogorov that captures the intuition behind "the intrinsic complexity of a string". It is quite obvious that a string of 100 times the letter "A" is somehow "simpler" than a string composed of 100 (seemingly) random letters. Several earlier attempts were made to formalize this intuition, but none withstood deep analysis. Kolmogorov's definition was the first one to prove robust under the various transformations it was put through.

The basic definition is that, given two words \(x\) and \(y\) and a Turing machine \(\psi\), the conditional complexity of \(x\) knowing \(y\) is \(\mathbf{KS}_{\psi}(x|y)=\min\{l(p) : \psi\langle p,y\rangle=x\}\), i.e., the length of a shortest program \(p\) that makes \(\psi\) produce \(x\) from \(y\). The most important theorem is the Solomonoff-Kolmogorov theorem (often called the invariance theorem):
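To make the definition concrete, here is a minimal sketch in Python, under an assumption of mine that is not in the report: an ad-hoc "machine" \(\psi\) that treats a program \(p\) as a Python expression over the variable y. Brute-forcing programs in order of increasing length then yields the complexity relative to this particular \(\psi\) only (the name toy_conditional_complexity and the tiny alphabet are illustrative choices):

    import itertools

    def toy_conditional_complexity(x, y, alphabet="'ay*", max_len=6):
        # Toy machine psi: a program p is a Python expression over the
        # variable y, and psi<p, y> = eval(p). Enumerating candidate
        # programs by increasing length yields KS_psi(x|y) for this
        # ad-hoc psi (and for it alone).
        for n in range(1, max_len + 1):
            for chars in itertools.product(alphabet, repeat=n):
                p = "".join(chars)
                try:
                    if eval(p, {"__builtins__": {}}, {"y": y}) == x:
                        return n, p  # shortest length, plus one witness program
                except Exception:
                    pass  # most candidate strings are not valid programs
        return None

    # With x = "aaaa" and y = 4, the search finds the 5-character program
    # 'a'*y rather than the 6-character literal 'aaaa': knowing y helps.
    print(toy_conditional_complexity("aaaa", 4))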

An optimal machine is a machine such that the complexity of any object is as small as with any other machine, up to an additive constant: \(\forall\psi''\in C,\ \exists c_{\psi''},\ \forall x,y,\ \mathbf{KS}_{\psi}(x|y)\leqslant \mathbf{KS}_{\psi''}(x|y)+c_{\psi''}\), where \(C\) is the class of machines of the computational system. The theorem states that in any (acceptable) computational system, such a machine \(\psi\) exists; the idea is that a universal machine can simulate any \(\psi''\) once a fixed-length description of \(\psi''\) is prepended to the program.
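A consequence worth spelling out: if \(\psi\) and \(\psi'\) are both optimal, applying the inequality in both directions gives \(|\mathbf{KS}_{\psi}(x|y)-\mathbf{KS}_{\psi'}(x|y)|\leqslant c\) for a constant \(c\) depending only on the two machines, never on \(x\) or \(y\). This is why the subscript is usually dropped and one simply writes \(\mathbf{KS}(x|y)\).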

Remark that Kolmogorov complexity defines randomness nicely, because it can be proved that a \(c\)-random string is exactly a \(c\)-incompressible string, i.e., one that is not \(c\)-compressible (\(c\)-compressible means that the Kolmogorov complexity is smaller than the length of the original object by at least \(c\)).
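Although \(\mathbf{KS}\) itself cannot be computed (see below), any real-world lossless compressor gives a computable upper bound on it, up to a machine-dependent constant. A minimal Python sketch of the intuition, using zlib purely as such a proxy (the contrast between the two sizes, not the exact byte counts, is the point):

    import os
    import zlib

    def upper_bound_on_complexity(data):
        # The compressed size is only an upper bound on the Kolmogorov
        # complexity (plus a constant for the decompressor); the true
        # value is uncomputable.
        return len(zlib.compress(data, 9))

    simple = b"A" * 100        # highly regular, hence very compressible
    random_ = os.urandom(100)  # c-incompressible with high probability

    print(upper_bound_on_complexity(simple))   # around a dozen bytes
    print(upper_bound_on_complexity(random_))  # about 100 bytes or more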

The second important theorem is that \(\mathbf{KS}_{\psi}(x|y)\) is a non-computable function, which means that there can be no such thing as a perfect lossless compressor in the Turing framework. For the sake of completeness, I should say that Gregory Chaitin (in the U.S.) had the same ideas at about the same time (and I know for a fact that the two currents of thought have been battling subterraneously for decades).
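The standard argument behind this theorem is a formalized Berry paradox. A sketch, where the oracle K is a hypothetical parameter (no such computable function can exist, which is exactly the point):

    import itertools

    def first_string_exceeding(n, K):
        # Enumerate binary strings in length-then-lexicographic order and
        # return the first x with K(x) > n. If K were computable, this
        # program plus the value n would describe x in about log2(n) bits,
        # so K(x) would fall far below n for large n -- a contradiction.
        for length in itertools.count(1):
            for bits in itertools.product("01", repeat=length):
                x = "".join(bits)
                if K(x) > n:
                    return x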


Citing this publication

@TechReport{dubacq98,
  author      = {Jean-Christophe Dubacq},
  title       = {Introduction {\`a} la th{\'e}orie algorithmique de l'information},
  institution = {{\'E}cole normale sup{\'e}rieure de {L}yon},
  year        = {1998},
  type        = {Rapport de recherche / Research Report},
  number      = {05},
  month       = may
}