PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 1 INTRODUCTION

  • Slides: 64
Download presentation
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 1: INTRODUCTION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 1: INTRODUCTION

Example Handwritten Digit Recognition

Example Handwritten Digit Recognition

Polynomial Curve Fitting

Polynomial Curve Fitting

Sum-of-Squares Error Function

Sum-of-Squares Error Function

0 th Order Polynomial

0 th Order Polynomial

1 st Order Polynomial

1 st Order Polynomial

3 rd Order Polynomial

3 rd Order Polynomial

Over-fitting Root-Mean-Square (RMS) Error:

Over-fitting Root-Mean-Square (RMS) Error:

9 th Order Polynomial

9 th Order Polynomial

Polynomial Coefficients

Polynomial Coefficients

Data Set Size: 9 th Order Polynomial

Data Set Size: 9 th Order Polynomial

Data Set Size: 9 th Order Polynomial

Data Set Size: 9 th Order Polynomial

Regularization Penalize large coefficient values

Regularization Penalize large coefficient values

Regularization:

Regularization:

Regularization:

Regularization:

Regularization: vs.

Regularization: vs.

Polynomial Coefficients

Polynomial Coefficients

Probability Theory Apples and Oranges

Probability Theory Apples and Oranges

Probability Theory Marginal Probability Joint Probability Conditional Probability

Probability Theory Marginal Probability Joint Probability Conditional Probability

Probability Theory Sum Rule Product Rule

Probability Theory Sum Rule Product Rule

The Rules of Probability Sum Rule Product Rule

The Rules of Probability Sum Rule Product Rule

Bayes’ Theorem posterior likelihood × prior

Bayes’ Theorem posterior likelihood × prior

Probability Densities

Probability Densities

Transformed Densities

Transformed Densities

Expectations Conditional Expectation (discrete) Approximate Expectation (discrete and continuous)

Expectations Conditional Expectation (discrete) Approximate Expectation (discrete and continuous)

Variances and Covariances

Variances and Covariances

The Gaussian Distribution

The Gaussian Distribution

Gaussian Mean and Variance

Gaussian Mean and Variance

The Multivariate Gaussian

The Multivariate Gaussian

Central Limit Theorem The distribution of the sum of N i. i. d. random

Central Limit Theorem The distribution of the sum of N i. i. d. random variables becomes increasingly Gaussian as N grows. Example: N uniform [0, 1] random variables.

Geometry of the Multivariate Gaussian

Geometry of the Multivariate Gaussian

The Gaussian Distribution

The Gaussian Distribution

Moments of the Multivariate Gaussian (2)

Moments of the Multivariate Gaussian (2)

Central Limit Theorem The distribution of the sum of N i. i. d. random

Central Limit Theorem The distribution of the sum of N i. i. d. random variables becomes increasingly Gaussian as N grows. Example: N uniform [0, 1] random variables.

Gaussian Parameter Estimation Likelihood function

Gaussian Parameter Estimation Likelihood function

Maximum (Log) Likelihood

Maximum (Log) Likelihood

Properties of and

Properties of and

Curve Fitting Re-visited

Curve Fitting Re-visited

Maximum Likelihood Determine by minimizing sum-of-squares error, .

Maximum Likelihood Determine by minimizing sum-of-squares error, .

Predictive Distribution

Predictive Distribution

MAP: A Step towards Bayes Determine by minimizing regularized sum-of-squares error, . MAP =

MAP: A Step towards Bayes Determine by minimizing regularized sum-of-squares error, . MAP = Maximum Posterior

Bayesian Curve Fitting

Bayesian Curve Fitting

Bayesian Predictive Distribution

Bayesian Predictive Distribution

Model Selection Cross-Validation

Model Selection Cross-Validation

Curse of Dimensionality

Curse of Dimensionality

Curse of Dimensionality Polynomial curve fitting, M = 3 Gaussian Densities in higher dimensions

Curse of Dimensionality Polynomial curve fitting, M = 3 Gaussian Densities in higher dimensions

Decision Theory Inference step Determine either or . Decision step For given x, determine

Decision Theory Inference step Determine either or . Decision step For given x, determine optimal t.

Minimum Misclassification Rate

Minimum Misclassification Rate

Minimum Expected Loss Example: classify medical images as ‘cancer’ or ‘normal’

Minimum Expected Loss Example: classify medical images as ‘cancer’ or ‘normal’

Minimum Expected Loss

Minimum Expected Loss

Reject Option

Reject Option

Why Separate Inference and Decision? • • Minimizing risk (loss matrix may change over

Why Separate Inference and Decision? • • Minimizing risk (loss matrix may change over time) Reject option Unbalanced class priors Combining models

Decision Theory for Regression Inference step Determine . Decision step For given x, make

Decision Theory for Regression Inference step Determine . Decision step For given x, make optimal prediction, y(x), for t. Loss function:

The Squared Loss Function

The Squared Loss Function

Generative vs Discriminative Generative approach: Model Use Bayes’ theorem Discriminative approach: Model directly

Generative vs Discriminative Generative approach: Model Use Bayes’ theorem Discriminative approach: Model directly

Entropy Important quantity in • coding theory • statistical physics • machine learning

Entropy Important quantity in • coding theory • statistical physics • machine learning

Entropy Coding theory: x discrete with 8 possible states; how many bits to transmit

Entropy Coding theory: x discrete with 8 possible states; how many bits to transmit the state of x? All states equally likely

Entropy

Entropy

Entropy In how many ways can N identical objects be allocated M bins? Entropy

Entropy In how many ways can N identical objects be allocated M bins? Entropy maximized when

Entropy

Entropy

Differential Entropy Put bins of width ¢ along the real line Differential entropy maximized

Differential Entropy Put bins of width ¢ along the real line Differential entropy maximized (for fixed in which case ) when

Conditional Entropy

Conditional Entropy

The Kullback-Leibler Divergence

The Kullback-Leibler Divergence

Mutual Information

Mutual Information