INTRODUCTION TO STATISTICAL LEARNING THEORY J Saketha Nath

  • Slides: 53
INTRODUCTION TO STATISTICAL LEARNING THEORY J. Saketha Nath (IIT Bombay)

What is SLT? “The goal of statistical learning theory is to study, in a statistical framework, the properties of learning algorithms” – [Bousquet et al., 04]

Supervised Learning Setting ■

Supervised Learning Setting ■ Well-defined, but un-realizable.

Supervised Learning Setting ■ How well can we approximate?

Skyline? ■ With high probability, the average loss (a.k.a. empirical risk) on a (large) training set is a good approximation of the risk.
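
The claim above can be checked numerically. The following is a minimal sketch, not taken from the slides: it assumes a single fixed hypothesis with 0-1 loss, so each per-example loss is a Bernoulli draw whose mean is the risk. The function names (`empirical_risk`, `fraction_exceeding`, `hoeffding_bound`) are hypothetical, and Hoeffding's inequality is used here only as one standard way to quantify "with high probability".

```python
import math
import random


def empirical_risk(losses):
    """Average loss over a sample (the empirical risk)."""
    return sum(losses) / len(losses)


def fraction_exceeding(true_risk, n, eps, trials, seed=0):
    """Fraction of simulated training sets of size n whose empirical risk
    deviates from the true risk by more than eps.

    Assumption: 0-1 loss of one fixed hypothesis, so every loss is an
    independent Bernoulli(true_risk) draw.
    """
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        losses = [1.0 if rng.random() < true_risk else 0.0 for _ in range(n)]
        if abs(empirical_risk(losses) - true_risk) > eps:
            bad += 1
    return bad / trials


def hoeffding_bound(n, eps):
    """Hoeffding's upper bound on P(|empirical risk - risk| > eps)
    for losses taking values in [0, 1]."""
    return 2.0 * math.exp(-2.0 * n * eps ** 2)
```

For example, comparing `fraction_exceeding(0.3, n, 0.1, 200)` with `hoeffding_bound(n, 0.1)` for growing `n` shows both quantities shrinking as the training set grows. Note that this covers one fixed hypothesis only; it says nothing yet about a hypothesis chosen after looking at the data.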

Some Definitions ■

Some Algorithms ■ [Vapnik, 92] https://www.coursera.org/course/ml [Robbins & Monro, 51]

Some Algorithms Focus of this talk ■ [Vapnik, 92] Summary of results [Robbins & Monro, 51]

ERM consistency: Sufficient conditions ■

Story so far … ■

Candidate for Problem Complexity

Candidate for Problem Complexity 1. Ensure (asymptotically) goes to zero. 2. Show concentration around mean for max. div.

Candidate for Problem Complexity MAXIMUM DISCREPANCY

Towards Rademacher Complexity

Rademacher Complexity
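
The slide's own definition did not survive extraction. As a hedged illustration, the sketch below uses one common convention for the empirical Rademacher complexity of a finite function class on a fixed sample: the expectation, over independent uniform random signs, of the maximum over the class of the sign-weighted sample average (no absolute value, normalised by 1/m). The slides may use a different normalisation. The function name `empirical_rademacher` and its parameters are hypothetical, and the expectation is approximated by Monte Carlo.

```python
import random


def empirical_rademacher(function_values, draws=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity.

    function_values: list of rows; row j holds f_j(z_1), ..., f_j(z_m),
    the values of one function of a finite class on a fixed sample.
    Assumed convention: E_sigma[ max_j (1/m) * sum_i sigma_i * f_j(z_i) ].
    """
    rng = random.Random(seed)
    m = len(function_values[0])
    total = 0.0
    for _ in range(draws):
        # One vector of independent Rademacher signs, uniform on {-1, +1}.
        sigma = [rng.choice((-1.0, 1.0)) for _ in range(m)]
        # Best correlation with the random signs achievable within the class.
        total += max(
            sum(s * v for s, v in zip(sigma, row)) / m
            for row in function_values
        )
    return total / draws
```

A class that realises every sign pattern on the sample can match any random labelling perfectly and gets the maximal value 1, whereas a class containing a single function gets a value close to 0. This is the sense in which the quantity measures how well a class can fit noise.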

Story so far … ■

■ Choose the model with the right trade-off using domain knowledge.

Relation with classical measures ■

Mean concentration: Observation ■

McDiarmid’s Inequality ■
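
The statement on the slide was lost in extraction. For reference, the standard one-sided form of the inequality is given below; the symbols $g$, $Z_i$, $c_i$, $m$ and $\epsilon$ are chosen here for illustration and may not match the notation used in the talk.

```latex
% Bounded differences: changing the i-th argument alone moves g by at most c_i.
\sup_{z_1,\dots,z_m,\,z_i'}
  \bigl| g(z_1,\dots,z_i,\dots,z_m) - g(z_1,\dots,z_i',\dots,z_m) \bigr|
  \le c_i , \qquad i = 1,\dots,m .

% Then, for independent Z_1,\dots,Z_m and every \epsilon > 0:
\Pr\Bigl[ g(Z_1,\dots,Z_m) - \mathbb{E}\, g(Z_1,\dots,Z_m) \ge \epsilon \Bigr]
  \le \exp\!\left( - \frac{2\epsilon^2}{\sum_{i=1}^{m} c_i^2} \right) .
```

When the loss takes values in $[0,1]$, replacing one training example changes an average over $m$ examples by at most $1/m$, so $c_i = 1/m$ and the right-hand side becomes $\exp(-2m\epsilon^2)$.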

Learning Bounds ■

Learning Bounds ■ Computable except this term!

Story so far … ■

Linear model with Lipschitz loss ■

Learnable Problems Shai Shalev-Shwartz et al., 2009

THANK YOU
