ECE 5984: Introduction to Machine Learning

  • Slides: 36

ECE 5984: Introduction to Machine Learning
Topics: – (Finish) Regression – Model selection, Cross-validation – Error decomposition – Bias-Variance Tradeoff
Readings: Barber 17.1, 17.2
Dhruv Batra, Virginia Tech

Administrativia • HW 1 – Solutions available • Project Proposal – Due: Tue 02/24, 11:55 pm – <=2 pages, NIPS format – Show Igor’s proposal • HW 2 – Due: Friday 03/06, 11:55 pm – Implement linear regression, Naïve Bayes, Logistic Regression (C) Dhruv Batra 2

Recap of last time

Regression

(C) Dhruv Batra Slide Credit: Greg Shakhnarovich 5

But, why? • Why sum squared error??? • Gaussians, Watson, Gaussians…
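
One way to see where the sum of squared errors comes from is sketched here. It assumes (the slide itself does not state this) that each target is generated as y_i = w'x_i + ε_i with independent Gaussian noise ε_i ~ N(0, σ²). The symbols n, x_i, y_i are introduced only for this sketch.

```latex
\begin{aligned}
-\log P(y_1,\dots,y_n \mid x_1,\dots,x_n, w)
  &= -\sum_{i=1}^{n} \log \left[ \frac{1}{\sqrt{2\pi\sigma^2}}
      \exp\!\left(-\frac{(y_i - w^\top x_i)^2}{2\sigma^2}\right) \right] \\
  &= \frac{n}{2}\log(2\pi\sigma^2)
     + \frac{1}{2\sigma^2}\sum_{i=1}^{n} (y_i - w^\top x_i)^2 .
\end{aligned}
```

The first term does not depend on w, so under this assumption maximizing the likelihood over w is the same as minimizing the sum of squared errors.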

Is OLS Robust? • Demo – http://www.calpoly.edu/~srein/StatDemo/All.html • Bad things happen when the data does not come from your model! • How do we fix this?
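
A small sketch of the effect, separate from the linked demo: a single corrupted target is enough to pull a least-squares line far from the true one. The data, the size of the corruption, and the helper name `fit_ols` are all made up for illustration.

```python
import numpy as np

def fit_ols(x, y):
    """Least-squares fit of y = w0 + w1 * x; returns the array (w0, w1)."""
    X = np.column_stack([np.ones_like(x), x])  # prepend a bias column
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

# Hypothetical data: ten points lying exactly on the line y = 2x + 1.
x = np.arange(10, dtype=float)
y_clean = 2.0 * x + 1.0

# Same data with one target corrupted, so it no longer comes from the model.
y_outlier = y_clean.copy()
y_outlier[-1] += 100.0

w_clean = fit_ols(x, y_clean)
w_outlier = fit_ols(x, y_outlier)
print("clean fit   (w0, w1):", w_clean)
print("outlier fit (w0, w1):", w_outlier)
```

Because the loss is squared, the one large residual dominates the objective and the slope moves to accommodate it.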

Robust Linear Regression • y ~ Lap(w’x, b) • On paper
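
The derivation was done on paper in lecture and is not part of this transcript. A sketch of how it would typically go, assuming the standard Laplace density Lap(y | μ, b) = (1/2b) exp(−|y − μ|/b) and n independent examples (x_i, y_i):

```latex
\begin{aligned}
-\log P(y_1,\dots,y_n \mid x_1,\dots,x_n, w)
  &= -\sum_{i=1}^{n} \log \left[ \frac{1}{2b}
      \exp\!\left(-\frac{|y_i - w^\top x_i|}{b}\right) \right] \\
  &= n \log(2b) + \frac{1}{b}\sum_{i=1}^{n} |y_i - w^\top x_i| .
\end{aligned}
```

Maximum likelihood under this model minimizes the sum of absolute errors. A large residual is penalized linearly rather than quadratically, which is why a single outlier has less pull on the fit.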

Plan for Today • (Finish) Regression – Bayesian Regression – Different prior vs likelihood combination – Polynomial Regression • Error Decomposition – Bias-Variance – Cross-validation

Robustify via Prior • Ridge Regression • y ~ N(w’x, σ²) • w ~ N(0, t² I) • P(w | x, y) =
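
With the Gaussian likelihood and Gaussian prior above, the posterior over w is proportional to likelihood times prior, and its maximizer minimizes Σ (y_i − w’x_i)² + (σ²/t²)·||w||². The sketch below computes that MAP estimate in closed form. The function name `ridge_map`, the synthetic data, and the particular variances are assumptions made for illustration.

```python
import numpy as np

def ridge_map(X, y, sigma2, t2):
    """MAP weights for y ~ N(Xw, sigma2 * I) with prior w ~ N(0, t2 * I)."""
    lam = sigma2 / t2  # regularization strength implied by the two variances
    d = X.shape[1]
    # Solve (X'X + lam * I) w = X'y rather than forming an explicit inverse.
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Hypothetical data: 50 examples, 3 features, small Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.normal(scale=0.1, size=50)

w_wide = ridge_map(X, y, sigma2=0.01, t2=1e6)    # nearly flat prior
w_tight = ridge_map(X, y, sigma2=0.01, t2=1e-4)  # prior concentrated at 0
print("wide prior :", w_wide)
print("tight prior:", w_tight)
```

A very wide prior recovers ordinary least squares, while a tight prior shrinks the weights toward zero.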

Summary
Likelihood   Prior      Name
Gaussian     Uniform    Least Squares
Gaussian     Gaussian   Ridge Regression
Gaussian     Laplace    Lasso
Laplace      Uniform    Robust Regression
Student      Uniform    Robust Regression

Example • Demo – http://www.princeton.edu/~rkatzwer/PolynomialRegression/

What you need to know • Linear Regression – Model – Least Squares Objective – Connections to Max Likelihood with Gaussian Conditional – Robust regression with Laplacian Likelihood – Ridge Regression with priors – Polynomial and General Additive Regression

New Topic: Model Selection and Error Decomposition

Example for Regression • Demo – http://www.princeton.edu/~rkatzwer/PolynomialRegression/ • How do we pick the hypothesis class?

Model Selection • How do we pick the right model class? • Similar questions – How do I pick magic hyper-parameters? – How do I do feature selection?
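
One common answer to these questions is k-fold cross-validation, listed in the plan for this lecture. The sketch below uses it to pick a polynomial degree. The data, the choice of k = 5, the candidate degrees, and the helper name `kfold_mse` are all assumptions made for illustration.

```python
import numpy as np

def kfold_mse(x, y, degree, k=5, seed=0):
    """Average held-out squared error of a polynomial fit of the given degree."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    errors = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coeffs = np.polyfit(x[train], y[train], degree)  # fit on k-1 folds
        pred = np.polyval(coeffs, x[val])                # score on the held-out fold
        errors.append(np.mean((pred - y[val]) ** 2))
    return float(np.mean(errors))

# Hypothetical data: a cubic plus Gaussian noise.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 60)
y = x**3 - 0.5 * x + rng.normal(scale=0.05, size=x.size)

# The same seed gives every degree the same folds, so scores are comparable.
scores = {d: kfold_mse(x, y, d) for d in range(1, 10)}
best_degree = min(scores, key=scores.get)
print(scores)
print("selected degree:", best_degree)
```

The same loop works for any hyper-parameter or feature subset: swap the polynomial fit for the model being tuned and keep the fold structure.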

Errors • Expected Loss/Error • Training Loss/Error • Validation Loss/Error • Test Loss/Error • Reporting Training Error (instead of Test) is CHEATING • Optimizing parameters on Test Error is CHEATING
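
A minimal sketch of a protocol that respects both rules: fit parameters on the training split, choose the hyper-parameter on the validation split, and touch the test split once to report the result. The data, the 30/30/30 split, and the candidate degrees are assumptions for illustration.

```python
import numpy as np

def mse(coeffs, x, y):
    """Mean squared error of the polynomial with coefficients `coeffs`."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Hypothetical data: a noisy sine curve, drawn at random and cut into
# three disjoint splits.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=90)
y = np.sin(3.0 * x) + rng.normal(scale=0.1, size=x.size)
x_tr, x_va, x_te = x[:30], x[30:60], x[60:]
y_tr, y_va, y_te = y[:30], y[30:60], y[60:]

# Parameters are fit on the training split only.
fits = {d: np.polyfit(x_tr, y_tr, d) for d in range(1, 11)}
train_err = {d: mse(c, x_tr, y_tr) for d, c in fits.items()}
val_err = {d: mse(c, x_va, y_va) for d, c in fits.items()}

# The hyper-parameter (degree) is chosen on the validation split.
best = min(val_err, key=val_err.get)

# The test split is used once, only to report the final number.
test_err = mse(fits[best], x_te, y_te)
print("degree:", best, "train:", train_err[best],
      "val:", val_err[best], "test:", test_err)
```

Training error can only go down as the degree grows, so it says little about how the model does on new data. That is why reporting it in place of test error overstates performance.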

Typical Behavior

Overfitting • Overfitting: a learning algorithm overfits the training data if it outputs a solution w when there exists another solution w’ such that: (Slide Credit: Carlos Guestrin)
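
The condition that follows "such that" is not present in this transcript. The usual textbook form of this definition, given here as an assumption about what the slide showed, compares the two solutions on training error and on true (expected) error:

```latex
\mathrm{error}_{\mathrm{train}}(w) < \mathrm{error}_{\mathrm{train}}(w')
\quad\text{and}\quad
\mathrm{error}_{\mathrm{true}}(w') < \mathrm{error}_{\mathrm{true}}(w)
```

In words: w looks better than w’ on the training data, yet w’ is better on the underlying distribution.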

Error Decomposition • [Diagram, shown in three versions: Reality and the model class, with Modeling Error, Estimation Error, and Optimization Error marked; the last version also carries the label Higher-Order Potentials]

Error Decomposition • Approximation/Modeling Error – You approximated reality with model • Estimation Error – You tried to learn model with finite data • Optimization Error – You were lazy and couldn’t/didn’t optimize to completion • (Next time) Bayes Error – Reality just sucks
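
The four terms can be written as a telescoping sum. The notation below is not from the slides and is introduced only for this sketch: R(h) is the expected loss of predictor h, h* is the best possible predictor, h_F is the best predictor in the chosen model class, h_n is the best fit to the n training examples, and ĥ is what the optimizer actually returned.

```latex
R(\hat{h}) =
  \underbrace{R(\hat{h}) - R(h_n)}_{\text{optimization error}}
  + \underbrace{R(h_n) - R(h_{\mathcal{F}})}_{\text{estimation error}}
  + \underbrace{R(h_{\mathcal{F}}) - R(h^{*})}_{\text{approximation/modeling error}}
  + \underbrace{R(h^{*})}_{\text{Bayes error}}
```

A richer model class typically shrinks the approximation term but, for a fixed amount of data, tends to enlarge the estimation term.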