Machine Learning Performance Evaluation: Recognition Rate Estimate





















Machine Learning Performance Evaluation: Recognition Rate Estimate
J.-S. Roger Jang (張智星)
CSIE Dept., National Taiwan Univ.
http://mirlab.org/jang
jang@mirlab.org

Outline
Performance indices of a given classifier/model
• Accuracy (recognition rate)
• Computation load
Methods to estimate the recognition rate
• Inside test
• One-sided holdout test
• Two-sided holdout test
• M-fold cross validation
• Leave-one-out cross validation
Quiz!

Synonyms
The following sets of synonyms will be used interchangeably:
• Classifier, model
• Recognition rate, accuracy

Performance Indices
Performance indices of a classifier
• Recognition rate
- Requires an objective procedure to derive it
• Computation load
- Design-time computation
- Run-time computation
Our focus
• Recognition rate and the procedures to derive it
• The estimated accuracy depends on
- Dataset
- Model (type and complexity)

Methods for Deriving Recognition Rates
Methods to derive the recognition rates
• Inside test (resubstitution recognition rate)
• One-sided holdout test
• Two-sided holdout test
• M-fold cross validation
• Leave-one-out cross validation
Data partitioning
• Training set only
• Training and test sets
• Training, validation, and test sets

Inside Test (1/2)
Dataset partitioning
• Use the whole dataset for both training and evaluation
Recognition rate
• Inside-test recognition rate
• Resubstitution accuracy

Inside Test (2/2)
Characteristics
• Too optimistic, since the inside-test RR tends to be higher than the true RR
• For instance, 1-NNC always has an inside-test RR of 100%!
• Can be used as an upper bound of the true RR
Potential reasons for a low inside-test RR
• Bad features in the dataset
• Bad method for model construction, such as
- Bad results from neural network training
- Bad results from k-means clustering
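To make the inside test concrete, here is a minimal Python sketch, assuming scikit-learn and its bundled Iris data; the dataset and the 1-NNC choice are illustrative, not from the slides. Since every training point is its own nearest neighbor, the resubstitution RR of 1-NNC comes out as 100%.

    # Inside test: train and evaluate on the same dataset.
    from sklearn.datasets import load_iris
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    model = KNeighborsClassifier(n_neighbors=1).fit(X, y)

    # Each point's nearest neighbor is itself, so this prints 100%.
    print(f"Inside-test RR: {model.score(X, y):.2%}")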

One-sided Holdout Test (1/2)
Dataset partitioning
• Training set for model construction
• Test set for performance evaluation
Recognition rates
• Inside-test RR (on the training set)
• Outside-test RR (on the test set)

One-sided Holdout Test (2/2)
Characteristics
• Highly affected by how the data is partitioned
• Usually adopted when the design-time computation load is high
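A minimal sketch of the one-sided holdout test under the same assumptions (scikit-learn, Iris); the 70/30 split ratio and the fixed random seed are illustrative choices:

    # One-sided holdout test: train on one part, evaluate on the held-out part.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    model = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
    print(f"Inside-test RR:  {model.score(X_train, y_train):.2%}")  # on the training set
    print(f"Outside-test RR: {model.score(X_test, y_test):.2%}")    # on the test set

Changing random_state changes both RRs, which is exactly the sensitivity to partitioning noted above.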

Two-sided Holdout Test (1/3)
Dataset partitioning
• Training set for model construction
• Test set for performance evaluation
• Role reversal: the two sets then swap roles

Two-sided Holdout Test (2/3)
Two-sided holdout test (used in GMDH):
• Dataset A → construction → Model A → evaluation on dataset B → RR_A
• Dataset B → construction → Model B → evaluation on dataset A → RR_B
Outside-test RR = (RR_A + RR_B)/2
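A minimal sketch of the scheme above, assuming scikit-learn and an even 50/50 split; the classifier choice is illustrative:

    # Two-sided holdout test: each half trains a model evaluated on the other half.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    X_a, X_b, y_a, y_b = train_test_split(X, y, test_size=0.5, random_state=0)

    rr_a = KNeighborsClassifier(n_neighbors=1).fit(X_a, y_a).score(X_b, y_b)  # Model A on set B
    rr_b = KNeighborsClassifier(n_neighbors=1).fit(X_b, y_b).score(X_a, y_a)  # Model B on set A
    print(f"Outside-test RR: {(rr_a + rr_b) / 2:.2%}")  # average of the two runs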

Two-sided Holdout Test (3/3)
Characteristics
• Better usage of the dataset
• Still highly affected by the partitioning
• Suitable for models/classifiers with a high design-time computation load

M-fold Cross Validation (1/3)
Data partitioning
• Partition the dataset into m disjoint folds
• Use one fold for test and the other m-1 folds for training
• Repeat m times, so each fold serves as the test set once

M-fold Cross Validation (2/3)
(Diagram: the dataset is split into m disjoint sets; for each k, model k is constructed from the other m-1 sets and evaluated on set k, giving an outside-test RR.)

M-fold Cross Validation (3/3)
Characteristics
• When m = 2: two-sided holdout test
• When m = n: leave-one-out cross validation
• The value of m depends on the computation load imposed by the selected model/classifier
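A minimal sketch of m-fold cross validation under the same assumptions (scikit-learn, Iris); m = 5 is an illustrative choice:

    # M-fold CV: each of the m folds serves as the test set exactly once.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=1), X, y, cv=5)
    print(f"Outside-test RR: {scores.mean():.2%} (per-fold: {scores})")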

Leave-one-out Cross Validation (1/3)
Data partitioning
• The special case m = n, where each fold S_i = (x_i, y_i) is a single input/output pair

Leave-one-out Cross Validation (2/3)
(Diagram: the dataset of n i/o pairs is split into n singleton sets; for each k, model k is constructed from the other n-1 pairs and evaluated on the remaining pair, so each single-pair test RR is either 0% or 100%!)

Leave-one-out Cross Validation (3/3)
General method for LOOCV
• Perform model construction (as a black box) n times, which is slow!
To speed up the LOOCV computation
• Construct a common part that can be reused across the n runs, such as
- Global mean and covariance for the quadratic classifier (QC)
More info on cross-validation is available on Wikipedia.
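A minimal sketch of LOOCV via the generic black-box route, assuming scikit-learn; note that each of the n single-pair evaluations scores either 0% or 100%:

    # LOOCV: n model constructions, each tested on the single left-out pair.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=1), X, y, cv=LeaveOneOut())
    print(f"LOOCV RR: {scores.mean():.2%} over {len(scores)} runs")

As the slide suggests, model-specific shortcuts can avoid the n full constructions; for 1-NNC, for instance, the pairwise distance matrix can be computed once and each left-out point classified by its nearest other point.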

Applications and Misuse of CV
Applications of CV
• Input (feature) selection
• Model complexity determination
• Performance comparison among different models
Misuse of CV
• Do not try to boost the validation RR too hard, or you run the risk of indirectly training on the left-out data!
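As one concrete application (not from the slides), here is a sketch of model complexity determination: picking the k of a k-NNC by its cross-validated RR, with the candidate values as illustrative assumptions:

    # Model complexity determination: keep the k with the best cross-validated RR.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    best_k = max(
        (1, 3, 5, 7, 9),
        key=lambda k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean(),
    )
    print(f"Selected k = {best_k}")

Note that repeating such a search over many knobs is exactly the misuse warned about above: the validation RR gradually stops being an honest estimate.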

Quiz
Explain the following terms for performance evaluation:
• Inside test
• One-sided holdout test
• Two-sided holdout test
• M-fold cross validation
• Leave-one-out cross validation