Computer vision: models, learning and inference. Chapter 9: Classification models. Please send errata to s.prince@cs.ucl.ac.uk

Structure: Logistic regression • Bayesian logistic regression • Non-linear logistic regression • Kernelization and Gaussian process classification • Incremental fitting, boosting and trees • Multi-class classification • Random classification trees • Non-probabilistic classification • Applications

Models for machine vision

Example application: Gender Classification

Type 1: Model Pr(w|x) (Discriminative)
How do we model Pr(w|x)?
– Choose an appropriate form for Pr(w)
– Make the parameters a function of x
– The function takes parameters θ that define its shape
Learning algorithm: learn the parameters θ from training data x, w
Inference algorithm: just evaluate Pr(w|x)

Logistic Regression
Consider the two-class problem.
• Choose a Bernoulli distribution over the world state w.
• Make the Bernoulli parameter λ a function of the data x: Pr(w|x) = Bern_w[ sig[a] ]
Model the activation with a linear function, a = φ0 + φᵀx, which creates a number between −∞ and ∞. The logistic sigmoid sig[a] = 1/(1 + exp[−a]) maps this to the range [0, 1].
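A minimal NumPy sketch of this model follows; the parameter values and the toy input are illustrative, not taken from the slides.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid: maps any real activation to the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))

def predict_prob(phi0, phi, x):
    """Pr(w = 1 | x): the Bernoulli parameter is sig[phi0 + phi^T x]."""
    a = phi0 + phi @ x        # linear activation, any real number
    return sigmoid(a)         # squashed into (0, 1)

# Illustrative usage with made-up parameters and data
phi0, phi = -1.0, np.array([2.0, -0.5])
x = np.array([0.8, 0.3])
print(predict_prob(phi0, phi, x))   # a number in (0, 1)
```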

Two parameters. Learning by standard methods (ML, MAP, Bayesian). Inference: just evaluate Pr(w|x).

Neater Notation
To make the notation easier to handle, we:
• Attach a 1 to the start of every data vector x
• Attach the offset φ0 to the start of the parameter vector φ
New model: Pr(w|x) = Bern_w[ sig[φᵀx] ]

Logistic regression

Maximum Likelihood
Likelihood: Pr(w|X, φ) = ∏_i sig[φᵀx_i]^{w_i} (1 − sig[φᵀx_i])^{1−w_i}
Take the logarithm: L = Σ_i [ w_i log sig[φᵀx_i] + (1 − w_i) log(1 − sig[φᵀx_i]) ]
Take the derivative: ∂L/∂φ = −Σ_i ( sig[φᵀx_i] − w_i ) x_i

Derivatives
Unfortunately, there is no closed-form solution: we cannot get an expression for φ in terms of x and w.
However, the function is convex, so we will use iterative non-linear optimization.

Optimization
Goal: find the minimum of a cost function (objective function) f[θ]. How can we find it?
Basic idea:
• Start with an initial estimate
• Take a series of small steps
• Make sure that each step decreases the cost
• When we can no longer improve, we must be at a minimum

Local Minima

Convexity
If a function is convex, then it has only a single minimum.
We can tell whether a function is convex by looking at its second derivatives.

Gradient Based Optimization
• Choose a search direction s based on the local properties of the function
• Perform an intensive search along the chosen direction; this is called a line search
• Then set the new estimate: θ[t+1] = θ[t] + λ·s, where λ is the distance found by the line search

Gradient Descent
Consider standing on a hillside:
• Look at the gradient where you are standing
• Find the steepest direction downhill
• Walk in that direction for some distance (line search)
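As a concrete sketch, here is steepest descent with a crude backtracking line search applied to the logistic-regression negative log likelihood; the step-size schedule and iteration count are illustrative choices, not prescribed by the slides.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cost(phi, X, w):
    """Negative log likelihood; X has a leading column of ones, w holds 0/1 labels."""
    p = np.clip(sigmoid(X @ phi), 1e-12, 1 - 1e-12)
    return -np.sum(w * np.log(p) + (1 - w) * np.log(1 - p))

def grad(phi, X, w):
    """Gradient of the negative log likelihood: sum_i (sig[phi^T x_i] - w_i) x_i."""
    return X.T @ (sigmoid(X @ phi) - w)

def gradient_descent(X, w, steps=200):
    phi = np.zeros(X.shape[1])
    for _ in range(steps):
        direction = -grad(phi, X, w)          # steepest descent direction
        lam, c0 = 1.0, cost(phi, X, w)
        # backtracking line search: halve the step until the cost decreases
        while cost(phi + lam * direction, X, w) >= c0 and lam > 1e-8:
            lam *= 0.5
        phi = phi + lam * direction
    return phi
```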

Finite differences
What if we can't compute the gradient? Compute a finite-difference approximation:
∂f/∂θ_j ≈ ( f[θ + h·e_j] − f[θ] ) / h
where e_j is the unit vector in the j-th direction and h is a small step.
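A small sketch of this approximation (the step size h is an illustrative choice):

```python
import numpy as np

def finite_difference_gradient(f, theta, h=1e-5):
    """Approximate df/dtheta_j ~ (f[theta + h*e_j] - f[theta]) / h for every j."""
    grad = np.zeros_like(theta, dtype=float)
    f0 = f(theta)
    for j in range(theta.size):
        e_j = np.zeros_like(theta, dtype=float)   # unit vector in the j-th direction
        e_j[j] = 1.0
        grad[j] = (f(theta + h * e_j) - f0) / h
    return grad

# Illustrative check on a quadratic, whose true gradient is 2*theta
print(finite_difference_gradient(lambda t: np.sum(t ** 2), np.array([1.0, -2.0])))
```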

Steepest Descent Problems (with a close-up view)

Second Derivatives
In higher dimensions, the second derivatives change how far we should move in each of the different directions: they change the best direction to move in.

Newton's Method
Approximate the function with a truncated Taylor expansion, take the derivative, set it to zero and re-arrange to get the update
θ[t+1] = θ[t] − (∂²f/∂θ²)⁻¹ ∂f/∂θ
Adding a line search over the step size λ:
θ[t+1] = θ[t] − λ (∂²f/∂θ²)⁻¹ ∂f/∂θ

Newton's Method
The matrix of second derivatives is called the Hessian. It is expensive to compute via finite differences. If the Hessian is positive definite everywhere, then the function is convex.

Newton vs. Steepest Descent

Optimization for Logistic Regression
Derivatives of the log likelihood:
∂L/∂φ = −Σ_i ( sig[a_i] − w_i ) x_i
∂²L/∂φ∂φᵀ = −Σ_i sig[a_i] (1 − sig[a_i]) x_i x_iᵀ
• The contribution of each data point depends on the difference between the actual class w_i and the predicted probability sig[a_i].
• Incorrectly classified points contribute more.
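Using these two derivatives, Newton's method for this model can be sketched as below; running a fixed number of full Newton steps with no line search is a simplifying assumption made for the example.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def newton_logistic(X, w, iterations=20):
    """Newton updates for logistic regression.
    Gradient of the negative log likelihood:  g = X^T (p - w)
    Hessian of the negative log likelihood:   H = X^T diag(p(1-p)) X
    """
    phi = np.zeros(X.shape[1])
    for _ in range(iterations):
        p = sigmoid(X @ phi)
        g = X.T @ (p - w)
        H = X.T @ (X * (p * (1 - p))[:, None])
        phi = phi - np.linalg.solve(H, g)     # full Newton step
    return phi
```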

Maximum likelihood fits

Structure: Logistic regression • Bayesian logistic regression • Non-linear logistic regression • Kernelization and Gaussian process classification • Incremental fitting, boosting and trees • Multi-class classification • Random classification trees • Non-probabilistic classification • Applications

Bayesian Logistic Regression
Likelihood: Pr(w|X, φ), as before.
Prior (not conjugate): a normal prior over the parameters, Pr(φ) = Norm_φ[0, σ²I].
Apply Bayes' rule:
• The prior is not conjugate to the likelihood
• There is no closed-form expression for the posterior
• Use the Laplace approximation

Laplace Approximation
Approximate the posterior distribution with a normal:
• Set the mean to the MAP estimate
• Set the covariance so that it matches the curvature (second derivatives) of the log posterior at the MAP estimate

Laplace Approximation
Find the MAP solution by optimizing the log posterior probability.
Approximate the posterior with a normal: Pr(φ|X, w) ≈ Norm_φ[μ, Σ]
where μ is the MAP estimate and Σ = −H⁻¹, with H the Hessian of the log posterior evaluated at μ.
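A sketch of this computation for Bayesian logistic regression; the normal prior Norm[0, σ_p²·I], its variance and the fixed iteration count are assumptions made for the example.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def laplace_approximation(X, w, sigma_p=10.0, iterations=50):
    """Return (mu, Sigma): mu is the MAP estimate of phi, Sigma is the inverse of
    the Hessian of the negative log posterior at mu (the Laplace covariance)."""
    D = X.shape[1]
    phi = np.zeros(D)
    for _ in range(iterations):                     # Newton steps to the MAP peak
        p = sigmoid(X @ phi)
        g = X.T @ (p - w) + phi / sigma_p ** 2      # gradient of -log posterior
        H = X.T @ (X * (p * (1 - p))[:, None]) + np.eye(D) / sigma_p ** 2
        phi = phi - np.linalg.solve(H, g)
    p = sigmoid(X @ phi)
    H = X.T @ (X * (p * (1 - p))[:, None]) + np.eye(D) / sigma_p ** 2
    return phi, np.linalg.inv(H)
```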

Laplace Approximation (figure panels: the prior, the actual posterior, and the approximated posterior)

Inference
Pr(w*|x*, X, w) = ∫ Pr(w*|x*, φ) Pr(φ|X, w) dφ
• The integral cannot be computed in closed form.
• We note that the prediction depends only on a linear projection of the parameters: a = φᵀx*.
• We re-express the integral in terms of this 1D projection. Using the transformation properties of normal distributions:
Pr(a|x*, X, w) ≈ Norm_a[ μᵀx*, x*ᵀΣx* ]

Approximation of Integral
Predict w* using x* and the μ and Σ from the Laplace approximation. With μ_a = μᵀx* and σ_a² = x*ᵀΣx*, a standard approximation to the resulting sigmoid-Gaussian integral is
Pr(w* = 1|x*) ≈ sig[ μ_a / √(1 + π σ_a²/8) ]
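A sketch of this predictive step; the probit-style approximation in the last line matches the formula above, which is an assumption about which approximation is used.

```python
import numpy as np

def bayesian_predict(x_star, mu, Sigma):
    """Approximate Pr(w* = 1 | x*) after the Laplace approximation."""
    mu_a = mu @ x_star                     # mean of the activation a = phi^T x*
    var_a = x_star @ Sigma @ x_star        # variance of the activation
    kappa = 1.0 / np.sqrt(1.0 + np.pi * var_a / 8.0)
    return 1.0 / (1.0 + np.exp(-kappa * mu_a))
```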

Bayesian Solution: less confident (less crisp) than ML in both the 1D and 2D examples.

ML vs. Bayesian Solution: the isoprobability contours are curved in the Bayesian solution.

Structure: Logistic regression • Bayesian logistic regression • Non-linear logistic regression • Kernelization and Gaussian process classification • Incremental fitting, boosting and trees • Multi-class classification • Random classification trees • Non-probabilistic classification • Applications

Non-linear logistic regression
Same idea as for regression:
• Apply a non-linear transformation to the data
• Build the model as usual

Non-linear logistic regression
Example transformations: non-linear basis functions z = f[x, ξ], such as the arctan and step functions used in the following slides.
Fit using non-linear optimization (also over the parameters ξ of the transformation functions).

Non-linear logistic regression in 1D: transformation of the 1D input to a 7D space.
a) The 7 arctan functions.
b) The functions multiplied by the ML weights φ.
c) The final activation: the sum of the functions in b).
d) Final probability: the activation passed through a logistic sigmoid.
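A sketch of this non-linear model; the (offset, slope) parameterization of each arctan function and the toy values are illustrative choices, not the ones used in the figure.

```python
import numpy as np

def arctan_basis(x, xis):
    """Transform a scalar input x into a vector of arctan responses.
    xis: list of (offset, slope) pairs, one per basis function."""
    return np.array([np.arctan(slope * (x - offset)) for offset, slope in xis])

def nonlinear_predict(x, xis, phi0, phi):
    """Pr(w = 1 | x): weighted sum of the basis responses, passed through a sigmoid."""
    a = phi0 + phi @ arctan_basis(x, xis)
    return 1.0 / (1.0 + np.exp(-a))

# Illustrative usage with 7 made-up basis functions
xis = [(offset, 3.0) for offset in np.linspace(-1, 1, 7)]
phi0, phi = 0.0, np.ones(7)
print(nonlinear_predict(0.2, xis, phi0, phi))
```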

Non-linear logistic regression in 2D: transformation of the 2D input to a 2D space using two arctan functions.

Structure: Logistic regression • Bayesian logistic regression • Non-linear logistic regression • Kernelization and Gaussian process classification • Incremental fitting, boosting and trees • Multi-class classification • Random classification trees • Non-probabilistic classification • Applications

Dual Logistic Regression
KEY IDEA: the parameter vector φ is just a vector in the data space, so it can be represented as a weighted sum of the data points: φ = Σ_i ψ_i x_i.
Now solve for the dual parameters Ψ: one parameter ψ_i per training example.

Maximum Likelihood
The derivatives depend only on inner products between data points, so we may employ the kernel trick.
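A sketch of dual (kernelized) logistic regression fitted by simple gradient ascent; the RBF kernel, its length scale, the learning rate and the step count are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0):
    """k(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 l^2))."""
    sq = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * length_scale ** 2))

def fit_kernel_logistic(X, w, length_scale=1.0, lr=0.1, steps=1000):
    """One dual parameter psi_i per training example; the activation for point i is
    a_i = sum_j psi_j k(x_i, x_j), so only kernel values (inner products) are needed."""
    K = rbf_kernel(X, X, length_scale)
    psi = np.zeros(X.shape[0])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(K @ psi)))    # Pr(w_i = 1) for each training point
        psi += lr * (K @ (w - p)) / len(w)      # gradient ascent on the log likelihood
    return psi

def predict_kernel_logistic(x_star, X, psi, length_scale=1.0):
    k = rbf_kernel(x_star[None, :], X, length_scale)[0]
    return 1.0 / (1.0 + np.exp(-(k @ psi)))
```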

Kernel Logistic Regression (figure: comparison of kernel choices)
• One choice does not interpolate well between samples: near-random classification over a large area.
• Another choice gives better modelling of the space between samples.
• A very smooth choice gives better interpolation where there is no data (top-left corner).

ML vs. Bayesian
• The Bayesian case is known as Gaussian process classification
• Very similar to ML, but always somewhat less confident

Relevance vector classification
Apply a sparse prior to the dual variables Ψ. As before, write this prior as a marginalization over hidden variables (one per dual parameter):

Relevance vector classification
Applying this sparse prior to the dual variables gives the likelihood. We now make two approximations:

Relevance vector classification
1) Use the Laplace approximation to describe the first two terms.

Relevance vector classification
Previous result:
2) Maximize over the hidden variables rather than integrating over them.
To solve, alternately update the hidden variables h and the mean and variance of the Laplace approximation. The updates for h are similar to those in the regression model.

Relevance vector classification
• Most hidden variables increase to large values
• This means the prior over the corresponding dual variables becomes very tight around zero
• The final solution depends on only a small number of examples
• This gives better generalization to new data

Structure: Logistic regression • Bayesian logistic regression • Non-linear logistic regression • Kernelization and Gaussian process classification • Incremental fitting & boosting • Multi-class classification • Random classification trees • Non-probabilistic classification • Applications

Sparsity through incremental fitting

Incremental Fitting
Previously we wrote the activation as a dot product, a = φᵀx.
Now write it as a weighted sum of non-linear functions of the data:
a = φ0 + Σ_k φk·f[x, ξk]

Incremental Fitting
Greedily add terms one at a time and learn them by ML; one dimension of the (non-linearly augmented) data is added at a time.
STAGE 1: Fit φ0, φ1, ξ1
STAGE 2: Fit φ0, φ2, ξ2 while keeping the remaining parameters constant
…
STAGE K: Fit φ0, φK, ξK while keeping the remaining parameters constant

Incremental Fitting
• The procedure is suboptimal, as we do not learn all the parameters simultaneously
• However, it has three important properties
Property 1: it creates sparse models
• The weights tend to decrease as we move through the sequence
• Each subsequent basis function has less influence on the model

Incremental Fitting
Property 2: it is practical for both high-dimensional data and a large number of training examples
• Plain logistic regression is suited to small data dimensions; the dual formulation is suited to a small number of training examples
• During training we do not need to hold all of the transformed vectors in memory
• At the k-th stage, only the k-th dimension of the transformed data is needed (plus the aggregated activation from the previous stages)

Incremental Fitting
Property 3: learning is inexpensive, as we optimize only a few parameters at each stage.

Incremental Fitting

Boosting
Incremental fitting with step functions:
• Each step function is a weak classifier: its output is 0 or 1, so it classifies the data on its own
• The model combines all the weak classifiers into a strong classifier
• This particular model is called logitboost
• We can't take the derivative with respect to the step parameters α, so we have to use exhaustive search over them
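A greedy sketch in this spirit: step-function weak classifiers on single features, with an exhaustive search over (feature, threshold) candidates and the weights fitted by gradient ascent while the earlier stages are held fixed. The candidate thresholds, learning rate and round count are illustrative choices.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def log_likelihood(a, w):
    p = np.clip(sigmoid(a), 1e-12, 1 - 1e-12)
    return np.sum(w * np.log(p) + (1 - w) * np.log(1 - p))

def fit_stage(a_prev, h, w, steps=50, lr=0.5):
    """Fit the offset and weight for one weak classifier h (0/1 outputs) by gradient
    ascent, holding the previously accumulated activation a_prev fixed."""
    phi0, phik = 0.0, 0.0
    for _ in range(steps):
        p = sigmoid(a_prev + phi0 + phik * h)
        phi0 += lr * np.mean(w - p)
        phik += lr * np.mean((w - p) * h)
    return phi0, phik

def boost(X, w, n_rounds=10, n_thresholds=10):
    """Greedy boosting with step-function weak classifiers (a logitboost-style sketch)."""
    N, D = X.shape
    a = np.zeros(N)                      # accumulated activation
    model = []                           # list of (feature, threshold, phi0, phik)
    for _ in range(n_rounds):
        best = None
        for j in range(D):
            for tau in np.quantile(X[:, j], np.linspace(0.1, 0.9, n_thresholds)):
                h = (X[:, j] > tau).astype(float)
                phi0, phik = fit_stage(a, h, w)
                ll = log_likelihood(a + phi0 + phik * h, w)
                if best is None or ll > best[0]:
                    best = (ll, j, tau, phi0, phik)
        _, j, tau, phi0, phik = best
        a = a + phi0 + phik * (X[:, j] > tau).astype(float)
        model.append((j, tau, phi0, phik))
    return model
```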

Boosting
• Incremental learning: learn α by exhaustive search, and use non-linear optimization to estimate φ0 and φk
• Alternative approach: perform just one step of Newton's method to estimate φ0 and φk, use this solution to choose α, then return and perform the full optimization for φ0 and φk

Boosting
• After each classifier is added, the relative importance of each data point changes: the contribution to the derivative depends on the difference between the actual label and the predicted label
• Points contribute more to the derivative if they are still misclassified
• The later classifiers therefore become increasingly specialized to the difficult examples

Boosting
• Does not interpolate smoothly. Arctan functions have superior generalization, and they can be fitted by continuous optimization
• Step functions are faster to evaluate, but arctan functions may also be approximated by lookup tables

Branching Logistic Regression
• A different way to define non-linear decision boundaries
• Partition the data space into distinct regions and apply a different classifier in each region
New activation: a = (1 − g[x, ω])·φ1ᵀx + g[x, ω]·φ2ᵀx, where φ1 and φ2 parameterize the two logistic regression models.
The term g[x, ω] is a gating function that defines the region:
• It returns a number between 0 and 1
• If 0, we get one logistic regression model
• If 1, we get a different logistic regression model
• Intermediate values give a weighted sum of the two models

Branching Logistic Regression
• The gating function depends on the data and induces a non-linear decision boundary
• The two linear models are specialized to different regions of the data space (they are also called experts)
• The gating function could take many forms, but an interesting possibility is to use a second logistic regression model: compute a linear function of the data and pass it through a logistic sigmoid, g[x, ω] = sig[ωᵀx]

Branching Logistic Regression (figure panels: the expert for the right-hand side, the expert for the left-hand side, and the gating function that determines which expert contributes to the decision)

Logistic Classification Trees
• An extension of this idea to create hierarchical trees

Logistic Classification Trees
• Learning may be done incrementally: begin at the top, then fit the left branch, then the right branch
• If each gating function produces a binary output (e.g., a Heaviside step), then each data point ends up at a single leaf
• If each branch performs a linear operation, then all of these may be aggregated into a single linear function at each leaf
• Since each data point receives specialized processing, the tree need not be deep

Structure: Logistic regression • Bayesian logistic regression • Non-linear logistic regression • Kernelization and Gaussian process classification • Incremental fitting, boosting and trees • Multi-class classification • Random classification trees • Non-probabilistic classification • Applications

Multiclass Logistic Regression
For multiclass recognition, choose a categorical distribution over w and then make the parameters of this distribution a function of x.
The softmax function maps real activations {a_k} to numbers between zero and one that sum to one:
λ_k = softmax_k[a_1, …, a_K] = exp[a_k] / Σ_j exp[a_j]
There is one parameter vector φ_k for each class: a_k = φ_kᵀx

Multiclass Logistic Regression
The softmax function maps activations, which can take any real value, to the parameters of a categorical distribution, which lie between 0 and 1 and sum to one.

Multiclass Logistic Regression
To learn the model, maximize the log likelihood. There is no closed-form solution, so learn with non-linear optimization, where the derivative with respect to each class's parameter vector is
∂L/∂φ_k = −Σ_i ( λ_ik − δ[w_i = k] ) x_i
and λ_ik is the softmax output for class k on example i.
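A sketch of multiclass logistic regression fitted by gradient ascent on this log likelihood; the learning rate and iteration count are illustrative choices.

```python
import numpy as np

def softmax(A):
    """Row-wise softmax: maps activations to a categorical distribution."""
    A = A - A.max(axis=1, keepdims=True)          # subtract max for numerical stability
    E = np.exp(A)
    return E / E.sum(axis=1, keepdims=True)

def fit_multiclass_logistic(X, w, n_classes, lr=0.1, steps=500):
    """Gradient ascent on the multiclass (softmax) log likelihood.
    X: N x D data with leading 1s, w: integer class labels in {0, ..., K-1}."""
    N, D = X.shape
    Phi = np.zeros((D, n_classes))                 # one parameter vector per class
    Y = np.eye(n_classes)[w]                       # one-hot targets delta[w_i = k]
    for _ in range(steps):
        P = softmax(X @ Phi)                       # predicted class probabilities
        Phi += lr * X.T @ (Y - P) / N              # gradient of the log likelihood
    return Phi
```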

Structure: Logistic regression • Bayesian logistic regression • Non-linear logistic regression • Kernelization and Gaussian process classification • Incremental fitting, boosting and trees • Multi-class classification • Random classification trees • Non-probabilistic classification • Applications

Random classification tree
• A binary tree for multiclass problems
• A randomly chosen function q[x] of the data at each split
• Choose the threshold τ to maximize the log probability of the data: the first term is the contribution of the data that passes down the left branch, the second term is the contribution of the data that passes down the right branch
• The data in each branch is evaluated under a categorical distribution

Random classification tree
For a given threshold, we can compute the categorical parameters in each branch in closed form: they are simply the empirical class frequencies of the data reaching that branch.
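A sketch of this split criterion: for each candidate threshold, the categorical parameters of the two branches are the empirical class frequencies (the closed-form ML estimates), and the score is the summed log probability of the data in the two branches. The grid of candidate thresholds is an illustrative choice.

```python
import numpy as np

def branch_log_prob(labels, n_classes):
    """Log probability of the labels under the ML categorical distribution,
    whose parameters are just the empirical class frequencies (closed form)."""
    if labels.size == 0:
        return 0.0
    counts = np.bincount(labels, minlength=n_classes)
    probs = counts / labels.size
    nz = counts > 0
    return np.sum(counts[nz] * np.log(probs[nz]))

def choose_split(values, labels, n_classes, n_candidates=20):
    """Pick the threshold tau on a (randomly chosen) scalar function of the data
    that maximizes the summed log probability of the left and right branches."""
    best_tau, best_score = None, -np.inf
    for tau in np.quantile(values, np.linspace(0.05, 0.95, n_candidates)):
        left, right = labels[values <= tau], labels[values > tau]
        score = branch_log_prob(left, n_classes) + branch_log_prob(right, n_classes)
        if score > best_score:
            best_tau, best_score = tau, score
    return best_tau, best_score
```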

Random classification tree
• The process is repeated recursively
• Very fast to train: most parameters are chosen randomly
• Can be trained on very large training sets: linear complexity in the number of examples

Random classification tree
Related models:
Fern:
• A tree where all of the functions at a given level are the same
• The threshold may be different at each branch
• Very efficient to implement
Forest:
• A collection of trees
• Average the results to get a more robust answer
• Similar to a 'Bayesian' approach: an average of models with different parameters

Structure: Logistic regression • Bayesian logistic regression • Non-linear logistic regression • Kernelization and Gaussian process classification • Incremental fitting, boosting and trees • Multi-class classification • Random classification trees • Non-probabilistic classification • Applications

Non-probabilistic classifiers
Most people use non-probabilistic classification methods such as neural networks, adaboost and support vector machines. This is largely for historical reasons.
Probabilistic approaches:
• Have no serious disadvantages
• Naturally produce estimates of uncertainty
• Are easily extensible to the multi-class case
• Are easily related to each other

Non-probabilistic classifiers
Multi-layer perceptron (neural network)
• Non-linear logistic regression with sigmoid functions
• Learning is known as back-propagation
• The transformed variable z is the hidden layer
Adaboost
• Very closely related to logitboost
• Performance is very similar
Support vector machines
• Similar to relevance vector classification, but the objective function is convex
• No certainty estimates
• Not easily extended to the multi-class case
• Produces solutions that are less sparse
• More restrictions on the kernel function

Structure: Logistic regression • Bayesian logistic regression • Non-linear logistic regression • Kernelization and Gaussian process classification • Incremental fitting, boosting and trees • Multi-class classification • Random classification trees • Non-probabilistic classification • Applications

Gender Classification
Incremental logistic regression with 300 arctan basis functions.
Results: 87.5% correct (humans: 95%)

Fast Face Detection (Viola and Jones 2001)
• Adaboost

Computing Haar Features
• Integral images
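A sketch of the integral-image trick: any rectangular sum (and hence a Haar-like feature, which is a difference of rectangular sums) can be computed with a handful of lookups. The two-rectangle feature at the end is an illustrative example, not the exact feature set used by Viola and Jones.

```python
import numpy as np

def integral_image(img):
    """ii[y, x] = sum of all pixels above and to the left of (y, x), inclusive."""
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, y0, x0, y1, x1):
    """Sum of pixels in the rectangle [y0..y1] x [x0..x1] using four lookups."""
    s = ii[y1, x1]
    if y0 > 0:
        s -= ii[y0 - 1, x1]
    if x0 > 0:
        s -= ii[y1, x0 - 1]
    if y0 > 0 and x0 > 0:
        s += ii[y0 - 1, x0 - 1]
    return s

# Illustrative two-rectangle Haar-like feature: left half minus right half
img = np.random.rand(24, 24)
ii = integral_image(img)
feature = box_sum(ii, 0, 0, 23, 11) - box_sum(ii, 0, 12, 23, 23)
print(feature)
```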

Pedestrian Detection

Semantic segmentation
• Logitboost with textons

Recovering surface layout
• Logitboost classification tree

Recovering body pose
• Depth images from Kinect
• Random forest
• Input: differences in depth between pairs of pixels
• Classification: which of 31 body parts is present at each pixel
• Proposals for joint positions