Third Generation Machine Intelligence
Christopher M. Bishop, Microsoft Research, Cambridge
Microsoft Research Summer School 2009
First Generation: “Artificial Intelligence” (GOFAI)
“Within a generation ... the problem of creating ‘artificial intelligence’ will largely be solved” – Marvin Minsky (1967)
– Expert systems: rules devised by humans
– Combinatorial explosion
General theme: hand-crafted rules
Second Generation
– Neural networks, support vector machines
– Difficult to incorporate complex domain knowledge
General theme: black-box statistical models
Third Generation
General theme: deep integration of domain knowledge and statistical learning
Probabilistic graphical models
– Bayesian framework
– fast inference using local message-passing
Origins: Bayesian networks, decision theory, HMMs, Kalman filters, MRFs, mean field theory, ...
Bayesian Learning
– Consistent use of probability to quantify uncertainty
– Predictions involve marginalisation, e.g. the predictive distribution shown below
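The equation on this slide did not survive extraction; a standard instance of the marginalisation meant here (notation assumed) is the Bayesian predictive distribution, which averages the model's predictions over the posterior on the parameters w:

$$p(y \mid x, \mathcal{D}) = \int p(y \mid x, \mathbf{w}) \, p(\mathbf{w} \mid \mathcal{D}) \, \mathrm{d}\mathbf{w}$$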
Why is prior knowledge important?
[Figure: x–y plot of data with a query point marked “?”, illustrating prediction at an unseen input]
Probabilistic Graphical Models
Probability theory + graphs:
1. New insights into existing models
2. Framework for designing new models
3. Graph-based algorithms for calculation and computation (cf. Feynman diagrams in physics)
4. Efficient software implementation
– Directed graphs to specify the model
– Factor graphs for inference and learning
Directed Graphs
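As a sketch of what a directed graph specifies (standard notation, not from the slide): the joint distribution factorises into one conditional per node, conditioned on that node's parents pa_k:

$$p(x_1, \ldots, x_K) = \prod_{k=1}^{K} p(x_k \mid \mathrm{pa}_k)$$

For example, a fully connected three-node graph gives p(a, b, c) = p(a) p(b | a) p(c | a, b).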
Example: Time Series Modelling
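The graph for this example did not survive extraction, but the standard directed model for time series (covering both HMMs and Kalman filters, depending on whether the latent states z_n are discrete or Gaussian) factorises as:

$$p(x_1, \ldots, x_N, z_1, \ldots, z_N) = p(z_1) \prod_{n=2}^{N} p(z_n \mid z_{n-1}) \prod_{n=1}^{N} p(x_n \mid z_n)$$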
Manchester Asthma and Allergies Study
Chris Bishop, Iain Buchan, Markus Svensén, Vincent Tan, John Winn
Factor Graphs
From Directed Graph to Factor Graph
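A minimal sketch of the conversion (the standard construction, assumed here): each conditional in the directed factorisation becomes one factor node connected to the variables it mentions, so the joint becomes a product of factors over variable subsets:

$$p(\mathbf{x}) = \prod_s f_s(\mathbf{x}_s), \qquad \text{e.g.} \quad f_a(x_1) = p(x_1), \quad f_b(x_1, x_2, x_3) = p(x_3 \mid x_1, x_2)$$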
Local message-passing
Efficient inference by exploiting factorization, as in the chain example below:
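One way to see the saving (a standard chain example, assumed rather than recovered from the slide): for a chain of N discrete variables with K states each and pairwise factors ψ_n, the marginal of the last variable is obtained by pushing each sum inside the product,

$$p(x_N) \propto \sum_{x_{N-1}} \psi_{N-1}(x_{N-1}, x_N) \cdots \sum_{x_1} \psi_1(x_1, x_2),$$

which costs O(N K²) instead of the O(K^N) of naive summation over the full joint.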
Factor Trees: Separation
[Figure: factor tree over variables v, w, x, y, z with factors f1(v, w), f2(w, x), f3(x, y), f4(x, z)]
Messages: From Factors To Variables
[Figure: the same factor tree, showing messages sent from the factors f2(w, x), f3(x, y), f4(x, z) towards variable x]
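The standard sum-product form of this message (notation assumed): a factor f with neighbouring variables x, x_1, ..., x_M sends to x the factor summed over all its other variables, weighted by their incoming messages:

$$\mu_{f \to x}(x) = \sum_{x_1} \cdots \sum_{x_M} f(x, x_1, \ldots, x_M) \prod_{m=1}^{M} \mu_{x_m \to f}(x_m)$$

In the tree above, for instance, f3 sends μ_{f3→x}(x) = Σ_y f3(x, y) μ_{y→f3}(y).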
Messages: From Variables To Factors
[Figure: the same factor tree, showing messages sent from the variables towards their neighbouring factors]
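The corresponding sum-product rule (notation assumed): a variable x forwards to a factor f the product of the messages arriving from its other neighbouring factors:

$$\mu_{x \to f}(x) = \prod_{g \in \mathrm{ne}(x) \setminus \{f\}} \mu_{g \to x}(x)$$

To make both rules concrete, here is a minimal NumPy sketch, not the talk's implementation: the variable and factor names follow the tree above, while K = 3 states and the random factor tables are assumptions for illustration. It computes the marginal of x and checks it against brute-force enumeration.

import numpy as np

# Discrete sum-product on the factor tree f1(v,w), f2(w,x), f3(x,y), f4(x,z).
K = 3
rng = np.random.default_rng(0)
f1 = rng.random((K, K))  # table indexed [v, w]
f2 = rng.random((K, K))  # table indexed [w, x]
f3 = rng.random((K, K))  # table indexed [x, y]
f4 = rng.random((K, K))  # table indexed [x, z]

# Leaf variables v, y, z have no other neighbours, so they send messages of ones.
mu_v_f1 = np.ones(K)
mu_y_f3 = np.ones(K)
mu_z_f4 = np.ones(K)

# Factor-to-variable: sum out every variable except the recipient,
# weighting by the incoming variable-to-factor messages.
mu_f1_w = f1.T @ mu_v_f1   # sum_v f1(v, w) * mu_{v->f1}(v)
mu_w_f2 = mu_f1_w          # variable-to-factor: product over w's other neighbours (just f1)
mu_f2_x = f2.T @ mu_w_f2   # sum_w f2(w, x) * mu_{w->f2}(w)
mu_f3_x = f3 @ mu_y_f3     # sum_y f3(x, y) * mu_{y->f3}(y)
mu_f4_x = f4 @ mu_z_f4     # sum_z f4(x, z) * mu_{z->f4}(z)

# Marginal of x: product of all incoming messages, normalised.
p_x = mu_f2_x * mu_f3_x * mu_f4_x
p_x /= p_x.sum()

# Sanity check against brute-force enumeration of the full joint.
joint = np.einsum('vw,wx,xy,xz->vwxyz', f1, f2, f3, f4)
p_x_brute = joint.sum(axis=(0, 1, 3, 4))
p_x_brute /= p_x_brute.sum()
assert np.allclose(p_x, p_x_brute)
print("p(x) =", p_x)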
What if marginalisations are not tractable?
Approximate the true distribution with one of:
– Monte Carlo
– Variational Bayes
– Loopy belief propagation
– Expectation propagation
Illustration: Bayesian Ranking
Ralf Herbrich, Tom Minka, Thore Graepel
Two Player Match Outcome Model
[Figure: factor graph over player skills s1, s2 and match outcome y12]
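The published model behind this graph (Herbrich, Minka & Graepel, 2007; the intermediate performance variables p_i belong to that model even though they are not legible in the extracted slide): each skill has a Gaussian belief, each performance is a noisy draw around the skill, and the outcome thresholds the performance difference:

$$s_i \sim \mathcal{N}(\mu_i, \sigma_i^2), \qquad p_i \sim \mathcal{N}(s_i, \beta^2), \qquad y_{12} = \mathrm{sign}(p_1 - p_2)$$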
Two Team Match Outcome Model
[Figure: factor graph over player skills s1–s4, team performances t1, t2, and outcome y12]
Multiple Team Match Outcome Model
[Figure: factor graph over player skills s1–s4, team performances t1, t2, t3, and pairwise outcomes y12, y23]
Efficient Approximate Inference
[Figure: the multiple-team factor graph, annotated with Gaussian prior factors over the skills s1–s4 and ranking likelihood factors over the outcomes y12, y23]
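For the two-player, no-draw case this approximate inference collapses to a closed-form moment-matching update (Herbrich, Minka & Graepel, 2007). The sketch below is an illustrative Python reimplementation, not Microsoft's code; the defaults μ = 25, σ = 25/3, β = 25/6 are the commonly cited TrueSkill™ starting values.

import math

def _pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def _cdf(t):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def trueskill_update(mu_w, sigma_w, mu_l, sigma_l, beta=25.0 / 6.0):
    """Skill update after the first player (winner) beats the second (loser),
    ignoring draws. Moment-matching (EP-style) update; beta is the
    performance noise."""
    c = math.sqrt(2.0 * beta ** 2 + sigma_w ** 2 + sigma_l ** 2)
    t = (mu_w - mu_l) / c
    v = _pdf(t) / _cdf(t)   # additive correction from the truncated Gaussian
    w = v * (v + t)         # multiplicative correction for the variances
    mu_w_new = mu_w + sigma_w ** 2 / c * v
    mu_l_new = mu_l - sigma_l ** 2 / c * v
    sigma_w_new = sigma_w * math.sqrt(max(1.0 - sigma_w ** 2 / c ** 2 * w, 1e-12))
    sigma_l_new = sigma_l * math.sqrt(max(1.0 - sigma_l ** 2 / c ** 2 * w, 1e-12))
    return mu_w_new, sigma_w_new, mu_l_new, sigma_l_new

# Two new players with the standard prior; player 1 wins one game:
# the winner's mean rises, the loser's falls, and both uncertainties shrink.
print(trueskill_update(25.0, 25.0 / 3.0, 25.0, 25.0 / 3.0))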
Convergence
[Plot: Level (0–40) against Number of Games (0–400) for players char and SQLWildman, each tracked under TrueSkill™ and under Elo]
TrueSkill™
John Winn, Chris Bishop
research.microsoft.com/infernet
Tom Minka, John Winn, John Guiver, Anitha Kannan
Summary
New paradigm for machine intelligence built on:
– a Bayesian formulation
– probabilistic graphical models
– fast inference using local message-passing
Deep integration of domain knowledge and statistical learning
Large-scale application: TrueSkill™
Toolkit: Infer.NET
http://research.microsoft.com/~cmbishop