Positive and Negative Randomness Paul Vitanyi CWI, University of Amsterdam Joint work with Kolya Vereshchagin
Non-Probabilistic Statistics
Classic Statistics--Recalled
Probabilistic Sufficient Statistic
Kolmogorov complexity K(x) = length of the shortest description of x. K(x|y) = length of the shortest description of x given y. A string x is random if K(x) ≥ |x|. K(x) − K(x|y) is the information y knows about x. Theorem (Symmetry of Information): K(x) − K(x|y) = K(y) − K(y|x).
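K(x) itself is uncomputable, but any real compressor gives a computable upper bound: the compressed length of x (plus a constant for the fixed decompressor) is the length of one particular description of x. A minimal Python sketch, with zlib standing in for the compressor (the choice of zlib is an illustration, not part of the theory):

```python
import os
import zlib

def K_upper(x: bytes) -> int:
    # Compressed length is the length of one particular description of x,
    # hence an upper bound on K(x) up to the constant-size decompressor.
    return len(zlib.compress(x, 9))

structured = b"ab" * 500      # highly regular: a short description exists
randomish = os.urandom(1000)  # random bytes: expected to be incompressible

print(K_upper(structured))    # far below 1000
print(K_upper(randomish))     # close to (or slightly above) 1000
```

The gap between the two outputs is the practical face of the random/nonrandom distinction on the slide: the structured string has a description much shorter than its length, the random one does not.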
Randomness Deficiency
Algorithmic Sufficient Statistic where model is a set
Algorithmic sufficient statistic where model is a total computable function. Data is a binary string x; the model is a total computable function p; prefix complexity K(p) is the size of the smallest TM computing p. Data-to-model code length: l_x(p) = min_d {|d| : p(d) = x}. x is typical for p if δ(x|p) = l_x(p) − K(x|p) is small. p is a sufficient statistic for x if K(p) + l_x(p) = K(x) + O(1) and p(d) = x for the d that achieves l_x(p). Theorem: If p is a sufficient statistic for x, then x is typical for p. p is a minimal sufficient statistic (sophistication) for x if K(p) is minimal.
Graph of the Structure Function. [Figure: h_x(α) plotted with log|S| on the vertical axis and α on the horizontal axis; the lower bound is the diagonal h_x(α) = K(x) − α.]
Minimum Description Length estimator; relations between estimators. Structure function: h_x(α) = min_S {log |S| : x in S and K(S) ≤ α}. MDL estimator: λ_x(α) = min_S {log |S| + K(S) : x in S and K(S) ≤ α}. Best-fit estimator: β_x(α) = min_S {δ(x|S) : x in S and K(S) ≤ α}.
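A toy instance of the two-part (MDL) code behind λ_x(α), using a hypothetical model class S_k = {strings of length n with exactly k ones}: the model cost is roughly log2(n+1) bits to name k, and the data-to-model cost is log|S_k| = log2 C(n,k). For a biased string this two-part code is far shorter than the n-bit literal code. A sketch under those assumptions:

```python
from math import comb, log2

def two_part_code_length(x: str) -> float:
    # Model S_k: all strings of length n with exactly k ones.
    # Model cost ~ log2(n+1) bits (naming k among 0..n);
    # data-to-model cost = log2 |S_k| = log2 C(n, k).
    n, k = len(x), x.count("1")
    return log2(n + 1) + log2(comb(n, k))

x = "1" * 90 + "0" * 10          # a biased string of length 100
print(two_part_code_length(x))   # about 50.6 bits, versus 100 bits literally
```

The set S_90 here is a sufficient statistic in miniature: it captures the "meaningful" regularity (the bias), leaving only the typical, random placement of the ones to be coded inside the model.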
Individual characteristics: more detail, especially for meaningful (nonrandom) data. We flip the graph so that log|S| is on the x-axis and K(S) is on the y-axis. This is essentially the rate-distortion graph for list (set) distortion.
Primogeniture of ML/MDL estimators • ML/MDL estimators can be approximated from above; • the best-fit estimator cannot be approximated, either from above or from below, to any precision; • but the approximable ML/MDL estimators yield the best-fitting models, even though we do not know the quantity of goodness-of-fit: ML/MDL estimators implicitly optimize goodness-of-fit.
Positive- and Negative Randomness, and Probabilistic Models
Precision of following a given function h(α). [Figure: data-to-model cost log|S| on the vertical axis, model cost α on the horizontal axis, with the curve h_x(α) tracking h(α).]
Logarithmic precision is sharp. Lemma: Most strings of length n have structure functions close to the diagonal h_x(α) = n − α; those are the strings of high complexity, K(x) > n. For strings of low complexity, say K(x) < n/2, the number of candidate functions is much greater than the number of strings, so there cannot be a string for every such function. But we show that there is a string for every approximate shape of function.
All degrees of negative randomness. Theorem: For every length n there are strings x whose minimal sufficient statistic has any complexity between 0 and n (up to a logarithmic term). Proof: All shapes of the structure function are possible, as long as it starts from n − k, decreases monotonically, and is 0 at k, for some k ≤ n (up to the precision of the previous slide).
Are there natural examples of negative randomness? Question: are there natural examples of strings with large negative randomness? Kolmogorov did not think they exist, but we know they are abundant. Maybe the information distance between strings x and y yields large negative randomness.
Information Distance (Li, Vitanyi 1996; Bennett, Gacs, Li, Vitanyi, Zurek 1998): D(x, y) = min {|p| : p(x) = y and p(y) = x}, where p is a binary program for a universal computer (Lisp, Java, C, universal Turing machine). Theorem: (i) D(x, y) = max {K(x|y), K(y|x)}, where K(x|y) is the length of the shortest binary program that outputs x on input y; (ii) D(x, y) ≤ D'(x, y) for any computable distance D' satisfying ∑_y 2^{−D'(x,y)} ≤ 1 for every x; (iii) D(x, y) is a metric.
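In practice the uncomputable D(x, y) is approximated with real compressors via the normalized compression distance (NCD) of Cilibrasi and Vitanyi. A minimal sketch, with zlib standing in for the ideal compressor:

```python
import os
import zlib

def C(s: bytes) -> int:
    # Compressed length: a computable stand-in for K(s).
    return len(zlib.compress(s, 9))

def ncd(x: bytes, y: bytes) -> float:
    # Normalized compression distance: a practical proxy for the
    # normalized information distance max{K(x|y), K(y|x)} / max{K(x), K(y)}.
    cx, cy = C(x), C(y)
    return (C(x + y) - min(cx, cy)) / max(cx, cy)

a = os.urandom(1000)
b = os.urandom(1000)
print(ncd(a, a))  # near 0: the second copy is fully predictable from the first
print(ncd(a, b))  # near 1: unrelated random strings share no information
```

The two printed values bracket the scale: similar objects score near 0, unrelated ones near 1, mirroring the max{K(x|y), K(y|x)} characterization in (i).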
Not between random strings • The information distance between random strings x and y of length n does not work. • If x, y satisfy K(x|y), K(y|x) > n, then p = x XOR y, where XOR is bitwise exclusive-or, serves as a program to translate x to y and y to x. But if x and y are positively random, it appears that p is so too.
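The XOR construction in the last bullet is easy to check concretely; a minimal sketch with illustrative 32-byte strings:

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    # Bitwise exclusive-or of two equal-length strings.
    return bytes(p ^ q for p, q in zip(a, b))

x = os.urandom(32)
y = os.urandom(32)

# One fixed "program" p translates in both directions:
# XOR-ing p with x recovers y, and XOR-ing p with y recovers x.
p = xor(x, y)
assert xor(p, x) == y
assert xor(p, y) == x
```

This shows the distance-achieving program exists and is short (|p| = n), but, as the slide notes, p inherits the randomness of x and y rather than exhibiting negative randomness.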
Selected Bibliography
N.K. Vereshchagin, P.M.B. Vitanyi, A theory of lossy compression of individual data, http://arxiv.org/abs/cs.IT/0411014, submitted.
P.D. Grunwald, P.M.B. Vitanyi, Shannon information and Kolmogorov complexity, IEEE Trans. Inform. Theory, submitted.
N.K. Vereshchagin, P.M.B. Vitanyi, Kolmogorov's structure functions and model selection, IEEE Trans. Inform. Theory, 50:12 (2004), 3265-3290.
P. Gacs, J. Tromp, P.M.B. Vitanyi, Algorithmic statistics, IEEE Trans. Inform. Theory, 47:6 (2001), 2443-2463.
Q. Gao, M. Li, P.M.B. Vitanyi, Applying MDL to learning best model granularity, Artificial Intelligence, 121:1-2 (2000), 1-29.
P.M.B. Vitanyi, M. Li, Minimum description length induction, Bayesianism, and Kolmogorov complexity, IEEE Trans. Inform. Theory, 46:2 (2000), 446-464.