Machine Learning Performance Evaluation: Recognition Rate Estimate

J.-S. Roger Jang (張智星)
CSIE Dept., National Taiwan Univ.
http://mirlab.org/jang, jang@mirlab.org

Outline

Performance indices of a given classifier/model
• Accuracy (recognition rate)
• Computation load

Methods to estimate the recognition rate
• Inside test
• One-sided holdout test
• Two-sided holdout test
• M-fold cross validation
• Leave-one-out cross validation
• Quiz!

Synonyms

The following sets of synonyms will be used interchangeably:
• Classifier, model
• Recognition rate, accuracy

Performance Indices

Performance indices of a classifier
• Recognition rate
  - Requires an objective procedure to derive it
• Computation load
  - Design-time computation
  - Run-time computation

Our focus
• Recognition rate and the procedures to derive it
• The estimated accuracy depends on
  - Dataset
  - Model (type and complexity)

Methods for Deriving Recognition Rates

Methods to derive the recognition rates
• Inside test (resubstitution recognition rate)
• One-sided holdout test
• Two-sided holdout test
• M-fold cross validation
• Leave-one-out cross validation

Data partitioning
• Training set
• Training and test sets
• Training, validation, and test sets

Inside Test (1/2)

Dataset partitioning
• Use the whole dataset for both training and evaluation

Recognition rate
• Inside-test recognition rate
• Resubstitution accuracy

Inside Test (2/2)

Characteristics
• Too optimistic, since the RR tends to be higher than the true one
• For instance, 1-NNC always has an inside-test RR of 100%!
• Can be used as an upper bound of the true RR

Potential reasons for a low inside-test RR
• Bad features in the dataset
• Bad method for model construction, such as
  - Bad results from neural network training
  - Bad results from k-means clustering
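The 1-NNC observation above can be verified with a minimal sketch (the function names and the toy 1-D dataset below are illustrative, not from the slides): when the model is evaluated on the very data it was built from, every query point finds itself at distance zero.

```python
# Minimal sketch of the inside test with a 1-nearest-neighbor classifier.
# The dataset is a toy list of (feature, label) pairs -- illustrative only.

def predict_1nn(train, query):
    """Return the label of the training point closest to `query`."""
    return min(train, key=lambda p: abs(p[0] - query))[1]

data = [(1.0, "A"), (2.0, "A"), (8.0, "B"), (9.0, "B")]

# Inside test: construct and evaluate on the whole dataset.
hits = sum(predict_1nn(data, x) == y for x, y in data)
inside_rr = hits / len(data)
print(inside_rr)  # 1.0 -- the resubstitution RR of 1-NNC is always 100%
```

Each query is itself a training point, so its nearest neighbor is itself and the prediction is trivially correct, which is exactly why the inside test is only an upper bound on the true RR.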

One-sided Holdout Test (1/2)

Dataset partitioning
• Training set for model construction
• Test set for performance evaluation

Recognition rate
• Inside-test RR
• Outside-test RR

One-sided Holdout Test (2/2)

Characteristics
• Highly affected by how the data is partitioned
• Usually adopted when the design-time computation load is high
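A one-sided holdout test can be sketched as follows. The nearest-class-mean "model" and the fixed even/odd split are illustrative stand-ins (the slides do not prescribe a particular classifier); the point is that construction and evaluation use disjoint sets.

```python
# Hypothetical sketch of a one-sided holdout test with a nearest-mean model.

def class_means(train):
    """Model construction: mean feature value per class."""
    groups = {}
    for x, y in train:
        groups.setdefault(y, []).append(x)
    return {y: sum(v) / len(v) for y, v in groups.items()}

def predict(means, x):
    """Assign x to the class whose mean is closest."""
    return min(means, key=lambda y: abs(means[y] - x))

data = [(1.0, "A"), (2.0, "A"), (3.0, "A"), (7.0, "B"), (8.0, "B"), (9.0, "B")]
train, test = data[::2], data[1::2]        # one fixed partition

model = class_means(train)                 # construction on the training set
hits = sum(predict(model, x) == y for x, y in test)
outside_rr = hits / len(test)              # outside-test recognition rate
print(outside_rr)
```

Because a single fixed split is used, the resulting outside-test RR depends heavily on which points land in the test set, which is the "highly affected by partitioning" caveat above.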

Two-sided Holdout Test (1/3)

Dataset partitioning
• Training set for model construction
• Test set for performance evaluation
• Role reversal: the two sets then swap roles

Two-sided Holdout Test (2/3)

Two-sided holdout test (used in GMDH)
• Model A is constructed from data set A and evaluated on data set B, giving RRA
• Model B is constructed from data set B and evaluated on data set A, giving RRB
• Outside-test RR = (RRA + RRB)/2
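The role reversal above can be sketched directly: each half of the data serves once for construction and once for evaluation, and the two outside-test rates are averaged. The nearest-mean model and the toy data sets are illustrative assumptions.

```python
# Hypothetical sketch of the two-sided holdout test with a nearest-mean model.

def class_means(train):
    """Model construction: mean feature value per class."""
    groups = {}
    for x, y in train:
        groups.setdefault(y, []).append(x)
    return {y: sum(v) / len(v) for y, v in groups.items()}

def recog_rate(train, test):
    """Construct on `train`, evaluate on `test` (outside test)."""
    model = class_means(train)
    hits = sum(min(model, key=lambda c: abs(model[c] - x)) == y
               for x, y in test)
    return hits / len(test)

set_a = [(1.0, "A"), (2.0, "A"), (8.0, "B"), (9.0, "B")]
set_b = [(1.5, "A"), (2.5, "A"), (7.5, "B"), (8.5, "B")]

rr_a = recog_rate(set_a, set_b)   # model A: built on A, scored on B
rr_b = recog_rate(set_b, set_a)   # model B: built on B, scored on A
outside_rr = (rr_a + rr_b) / 2    # averaged outside-test RR
print(outside_rr)
```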

Two-sided Holdout Test (3/3)

Characteristics
• Better usage of the dataset
• Still highly affected by the partitioning
• Suitable for models/classifiers with a high design-time computation load

M-fold Cross Validation (1/3)

Data partitioning
• Partition the dataset into m folds
• One fold for test, the other folds for training
• Repeat m times

M-fold Cross Validation (2/3)

• The dataset is partitioned into m disjoint sets
• Model k is constructed from all folds except fold k, then evaluated on fold k (outside test)
• Repeating over k = 1, …, m yields m outside-test rates, which are averaged

M-fold Cross Validation (3/3)

Characteristics
• When m = 2 → two-sided holdout test
• When m = n → leave-one-out cross validation
• The value of m depends on the computation load imposed by the selected model/classifier
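The m-fold loop can be sketched end to end. The fold construction via slicing and the nearest-mean model are illustrative assumptions; any classifier could be plugged into the same loop.

```python
# Hypothetical sketch of m-fold cross validation with a nearest-mean model.

def class_means(train):
    """Model construction: mean feature value per class."""
    groups = {}
    for x, y in train:
        groups.setdefault(y, []).append(x)
    return {y: sum(v) / len(v) for y, v in groups.items()}

def m_fold_cv(data, m):
    """Average outside-test RR over m disjoint folds."""
    folds = [data[k::m] for k in range(m)]        # m disjoint subsets
    rates = []
    for k in range(m):
        test = folds[k]
        train = [p for j, f in enumerate(folds) if j != k for p in f]
        model = class_means(train)                # model k
        hits = sum(min(model, key=lambda c: abs(model[c] - x)) == y
                   for x, y in test)
        rates.append(hits / len(test))            # outside test on fold k
    return sum(rates) / m

data = [(1.0, "A"), (2.0, "A"), (3.0, "A"), (7.0, "B"), (8.0, "B"), (9.0, "B")]
print(m_fold_cv(data, 3))
# m = 2 reduces to the two-sided holdout test; m = len(data) is LOOCV.
```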

Leave-one-out Cross Validation (1/3)

Data partitioning
• The case m = n, where each fold Si = (xi, yi) holds a single input/output pair

Leave-one-out Cross Validation (2/3)

Leave-one-out CV
• The n input/output pairs are split into n folds of one pair each
• Model k is constructed from the other n − 1 pairs and evaluated on the held-out pair (outside test)
• Each single-pair evaluation gives a RR of either 0% or 100%!

Leave-one-out Cross Validation (3/3)

General method for LOOCV
• Perform model construction (as a black box) n times → slow!

To speed up LOOCV computation
• Precompute a common part that can be used repeatedly, such as
  - the global mean and covariance for QC

• More info on cross-validation is available on Wikipedia
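The speed-up idea can be illustrated with a nearest-mean model (an illustrative stand-in, simpler than the QC mentioned above): the per-class sums and counts are the "common part" computed once, and each LOOCV trial merely subtracts the held-out sample instead of rebuilding the model from scratch.

```python
# Hypothetical sketch of the LOOCV speed-up: precompute per-class sums and
# counts once, then adjust them per trial rather than retraining n times.

data = [(1.0, "A"), (2.0, "A"), (3.0, "A"), (7.0, "B"), (8.0, "B"), (9.0, "B")]

# Common part, computed once over all n pairs.
sums, counts = {}, {}
for x, y in data:
    sums[y] = sums.get(y, 0.0) + x
    counts[y] = counts.get(y, 0) + 1

hits = 0
for x, y in data:                       # leave (x, y) out
    # Class means with the held-out pair subtracted from its own class.
    means = {c: (sums[c] - (x if c == y else 0.0)) /
                (counts[c] - (1 if c == y else 0))
             for c in sums}
    pred = min(means, key=lambda c: abs(means[c] - x))
    hits += pred == y                   # each trial scores 0% or 100%
loocv_rr = hits / len(data)
print(loocv_rr)
```

Each trial costs O(number of classes) instead of a full pass over the data, which is the same spirit as reusing the global mean and covariance for QC.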

Applications and Misuse of CV

Applications of CV
• Input (feature) selection
• Model complexity determination
• Performance comparison among different models

Misuse of CV
• Do not try to boost the validation RR too much, or you run the risk of indirectly training on the left-out data!

Quiz

Explain the following terms for performance evaluation
• Inside test
• One-sided holdout test
• Two-sided holdout test
• M-fold cross validation
• Leave-one-out cross validation