Classification of FDG-PET* Brain Data
* Fluorodeoxyglucose positron emission tomography

Deborah Mudali 1,*, Klaus L. Leenders 2, Michael Biehl 1, Jos B. T. M. Roerdink 1,3

1 Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen, NL
2 Department of Neurology, University Medical Center Groningen, NL
3 Neuroimaging Center, University Medical Center Groningen, NL
* Mbarara University of Science & Technology, Uganda

overview
- Prototype-based classification
  Learning Vector Quantization
  Generalized Matrix Relevance Learning (GMLVQ)
- Example application
  Classification of Parkinsonian Syndromes based on FDG-PET brain data
  Combination: PCA + GMLVQ
  comparison with DT, SVM
- Conclusion and Outlook
WSOM, Houston, 2016

Learning Vector Quantization
N-dimensional data, feature vectors
∙ identification of prototype vectors from labeled example data
∙ (dis-)similarity based classification (e.g. Euclidean distance)
competitive learning (Winner-Takes-All): LVQ1 [Kohonen, 1990, 1997]
• initialize prototype vectors for different classes
• present a single example
• identify the winner (closest prototype)
• move the winner
  - closer towards the data (same class)
  - away from the data (different class)
[figure: feature space]
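The four LVQ1 steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `lvq1_epoch`, the learning rate `lr`, the toy data and the prototype initialization are all assumptions made for the example.

```python
import numpy as np

def lvq1_epoch(X, y, prototypes, proto_labels, lr=0.05, rng=None):
    """One sweep of LVQ1 through the data in random order; returns updated prototypes."""
    rng = np.random.default_rng(0) if rng is None else rng
    W = prototypes.astype(float).copy()
    for i in rng.permutation(len(X)):
        # identify the winner: closest prototype in squared Euclidean distance
        dists = np.sum((W - X[i]) ** 2, axis=1)
        k = int(np.argmin(dists))
        # move the winner towards the example (same class) or away from it (different class)
        sign = 1.0 if proto_labels[k] == y[i] else -1.0
        W[k] += sign * lr * (X[i] - W[k])
    return W

# toy data: two Gaussian clouds in 2D (made up for illustration)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, (50, 2)), rng.normal(2.0, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# one prototype per class, initialized near the origin (hypothetical choice)
W = np.array([[-0.1, 0.0], [0.1, 0.0]])
labels = np.array([0, 1])
for _ in range(20):
    W = lvq1_epoch(X, y, W, labels, rng=rng)
```

On this toy problem each prototype drifts to the center of its own cloud. A fixed learning rate is used for brevity; practical schemes often decrease it during training.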

prototype based classifier
- represent data by one or several prototypes per class
- classify a query according to the label of the nearest prototype (or alternative voting schemes)
- local decision boundaries according to (e.g.) Euclidean distances
+ robustness to outliers, low storage needs and computational effort
+ parameterization in feature space, interpretability
- model selection: number of prototypes per class, etc.
? appropriate distance / (dis-)similarity measure
[figure: feature space]

Learning Vector Quantization
fixed distance measures:
- choice based on prior knowledge or preprocessing
- determine prototypes from example data by means of (iterative) learning schemes,
  e.g. heuristic LVQ1, cost function based Generalized LVQ
relevance learning, adaptive distances:
- employ parameterized distance measure
- update parameters in one training process with prototypes
- optimize adaptive, data driven dissimilarity
example: Matrix Relevance LVQ

Relevance Matrix LVQ [Schneider et al., 2009]
generalized quadratic distance in LVQ: d_Λ(w, x) = (x − w)ᵀ Λ (x − w)
relevance matrix Λ: quantifies importance of features and pairs of features
Λ_jj summarizes relevance of feature j (for equally scaled features)
training: optimize prototypes and Λ w.r.t. classification of examples
variants:
- global/local matrices (piecewise quadratic boundaries)
- diagonal relevances (single feature weights)
- rectangular (low-dim. representation)
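A small sketch of the generalized quadratic distance may help here. It assumes the usual parameterization Λ = ΩᵀΩ, which keeps Λ positive semi-definite, and a normalization in which the diagonal of Λ sums to one; the names `gmlvq_distance`, `relevance_matrix` and the numbers in `omega` are illustrative only.

```python
import numpy as np

def gmlvq_distance(x, w, omega):
    """(x - w)^T Lambda (x - w) with Lambda = omega^T omega, computed as |omega (x - w)|^2."""
    diff = omega @ (x - w)
    return float(diff @ diff)

def relevance_matrix(omega):
    """Lambda = omega^T omega, rescaled so that its diagonal sums to 1 (assumed normalization)."""
    lam = omega.T @ omega
    return lam / np.trace(lam)

# hypothetical 2x2 transformation; a rectangular omega (fewer rows than columns)
# would give the low-dimensional variant
omega = np.array([[1.0, 0.5],
                  [0.0, 0.2]])
omega /= np.sqrt(np.trace(omega.T @ omega))

lam = relevance_matrix(omega)
x = np.array([1.0, 2.0])
w = np.array([0.0, 0.0])
d = gmlvq_distance(x, w, omega)

# diagonal entries: relevance of single features; off-diagonal: pairs of features
print(np.diag(lam), d)
```

Working with Ω rather than Λ directly means the distance is non-negative by construction, so no constraint has to be enforced on Λ during training.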

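cost function based training
one example: Generalized LVQ [Sato & Yamada, 1995]
minimize E = Σ_m Φ(μ_m), with μ = (d_J − d_K) / (d_J + d_K) for each example
two winning prototypes: w_J (closest with the correct label) and w_K (closest with a different label), at distances d_J and d_K
- linear Φ: E favors large margin separation of classes
- sigmoidal Φ (linear for small arguments): E approximates number of misclassifications
- small / large [symbols not recoverable]: E favors class-typical prototypes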
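To make the role of the two winning prototypes concrete, here is a hedged sketch of the GLVQ cost with squared Euclidean distances. For each example it takes the closest prototype with the correct label (distance `d_J`) and the closest with any other label (distance `d_K`) and sums Φ((d_J − d_K)/(d_J + d_K)). The function names, the steepness `gamma` and the two-point data set are assumptions for illustration.

```python
import numpy as np

def glvq_cost(X, y, W, proto_labels, phi=lambda mu: mu):
    """Sum of phi(mu) over all examples; mu lies in [-1, 1] and is negative if classified correctly."""
    total = 0.0
    for x, label in zip(X, y):
        d = np.sum((W - x) ** 2, axis=1)
        same = proto_labels == label
        d_J = d[same].min()    # closest prototype with the correct label
        d_K = d[~same].min()   # closest prototype with a different label
        # note: undefined if x coincides with both winners (d_J + d_K == 0)
        total += phi((d_J - d_K) / (d_J + d_K))
    return total

def sigmoid(gamma):
    """Sigmoidal phi with steepness gamma (hypothetical parameter name)."""
    return lambda mu: 1.0 / (1.0 + np.exp(-gamma * mu))

X = np.array([[0.0, 0.0], [1.0, 0.0]])
y = np.array([0, 1])
W = np.array([[0.0, 0.0], [1.0, 0.0]])

good = np.array([0, 1])   # prototypes carry the right labels
bad = np.array([1, 0])    # labels swapped: both examples misclassified

e_linear_good = glvq_cost(X, y, W, good)
e_steep_good = glvq_cost(X, y, W, good, phi=sigmoid(50.0))
e_steep_bad = glvq_cost(X, y, W, bad, phi=sigmoid(50.0))
```

With a steep sigmoid the cost is close to 0 for the correct labeling and close to 2 for the swapped one, which illustrates why E then approximates the number of misclassifications.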

cost function based LVQ
"There is nothing objective about objective functions" (James McClelland)

classification of FDG-PET data
FDG-PET (Fluorodeoxyglucose positron emission tomography, 3D images)
- HC, Healthy controls: n=18
- PD, Parkinson’s Disease: n=20
- MSA, Multiple System Atrophy: n=21
- PSP, Progressive Supranuclear Palsy: n=17
[figure: glucose uptake per condition]
[http://glimpsproject.com]

work flow
Scaled Subprofile Model PCA (SSM PCA), based on a given group of subjects; processing chain, input to output:
- subjects 1…P
- log-transformed high-intensity voxels 1…N (N≈200000)
- Subject Residual Profile (SRP)
- SSM PCA
- Group Invariant Subprofile (GIS), subject scores 1…P
data and pre-processing: D. Mudali, L. K. Teune, R. J. Renken, K. L. Leenders, J. B. T. M. Roerdink. Computational and Mathematical Methods in Medicine, March 2015, Art. ID 136921, 10 p., and refs. therein
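The processing chain can be sketched as follows, under the assumption that the SRP is obtained by double centering the log-transformed data (removing each subject's mean over voxels, then the group mean profile) and that the PCA is done by SVD. Masking of high-intensity voxels is omitted, the random input merely stands in for real scans, and all function names are hypothetical.

```python
import numpy as np

def ssm_pca(data, n_components):
    """data: (P subjects, N voxels) of strictly positive intensities."""
    log_data = np.log(data)
    # remove each subject's mean over voxels, then the group mean profile
    row_centered = log_data - log_data.mean(axis=1, keepdims=True)
    group_mean = row_centered.mean(axis=0, keepdims=True)
    srp = row_centered - group_mean              # subject residual profiles
    # PCA via SVD of the P x N matrix (P is much smaller than N)
    _, _, Vt = np.linalg.svd(srp, full_matrices=False)
    gis = Vt[:n_components]                      # patterns, shape (n_components, N)
    scores = srp @ gis.T                         # subject scores, shape (P, n_components)
    return gis, scores, group_mean

def project_new_subject(x, gis, group_mean):
    """Scores of a novel subject, using patterns and group mean from the reference group."""
    lx = np.log(x)
    srp = lx - lx.mean() - group_mean.ravel()
    return gis @ srp

rng = np.random.default_rng(0)
data = rng.uniform(1.0, 2.0, size=(10, 500))     # synthetic stand-in: 10 subjects, 500 voxels
gis, scores, group_mean = ssm_pca(data, n_components=3)
s0 = project_new_subject(data[0], gis, group_mean)
```

The subject scores are the low-dimensional features that a classifier would then be trained on; a novel subject is projected with the stored patterns and group mean rather than by rerunning the PCA.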

work flow
Scaled Subprofile Model PCA (SSM PCA), based on a given group of subjects; processing chain, input to output:
- subjects 1…P, labels (condition)
- log-transformed high-intensity voxels 1…N (N≈200000)
- Subject Residual Profile (SRP)
- SSM PCA
- Group Invariant Subprofile (GIS), subject scores 1…P
- GMLVQ classifier: prototypes and distance
test: applied to novel subjects, labels (condition) ?

example: HC vs. PD
Healthy controls vs. Parkinson’s Disease
38 leave-one-out validation runs, averaged
[figure panels: prototypes; relevance matrix; ROC of leave-one-out prediction (w/o z-score transform.)]
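A leave-one-out run with an ROC summary could look like the sketch below. A nearest-class-mean rule is swapped in for GMLVQ to keep the example short, the data are synthetic (18 + 20 points to mirror the group sizes), and the helper names are assumptions. The point it illustrates is that every fitted quantity is recomputed from the training subjects in each of the 38 runs.

```python
import numpy as np

def leave_one_out_scores(X, y, fit, score):
    """Hold out each subject once; fit on the rest and score the held-out subject."""
    out = np.empty(len(X))
    for i in range(len(X)):
        train = np.arange(len(X)) != i
        model = fit(X[train], y[train])   # fitting uses training subjects only
        out[i] = score(model, X[i])
    return out

def auc(scores, y):
    """Area under the ROC curve: P(positive scores higher than negative), ties count 1/2."""
    pos, neg = scores[y == 1], scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg)))

# stand-in classifier (NOT GMLVQ): one class mean per class
def fit_means(X, y):
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def score_means(model, x):
    m0, m1 = model
    # larger value means closer to the class-1 mean
    return np.sum((x - m0) ** 2) - np.sum((x - m1) ** 2)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (18, 5)), rng.normal(1.5, 1.0, (20, 5))])
y = np.array([0] * 18 + [1] * 20)

s = leave_one_out_scores(X, y, fit_means, score_means)
print(auc(s, y))
```

In the full pipeline the PCA step would sit inside `fit` as well, so that the held-out subject never influences the features it is later classified with.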

example: HC vs. PSP
Healthy controls vs. Progressive Supranuclear Palsy
35 leave-one-out validation runs, averaged
[figure panels: prototypes; relevance matrix; ROC of leave-one-out prediction (w/o z-score transform.)]

performance comparison
- GMLVQ NPC accuracies
- Decision tree (C4.5) using all PC [Mudali et al. 2015]
[table: accuracy values not recovered]
Note: maximum margin perceptron, aka SVM with linear kernel (Matlab svmtrain), achieves performance similar to GMLVQ

four classes: HC / PD / MSA / PSP
leave-one-out confusion matrix for the four-class problem (1 vs 1)
[confusion matrix not recovered; labels "GM class acc." and "lin. class acc."; class accuracies in extraction order: 77.8, 65.0, 64.7, 76.2, 66.7, 60.0, 52.9, 89.0 %]

HC / PD / MSA / PSP
GMLVQ: visualization of training data set in terms of the leading eigenvectors of Λ
[figure: scatter plot with classes MSA, PD, HC, PSP]
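Such a visualization can be obtained by projecting the feature vectors onto the eigenvectors of Λ with the largest eigenvalues. The sketch below assumes a symmetric relevance matrix is already available; the matrix and data shown are made-up placeholders and `project_on_leading_eigenvectors` is a hypothetical helper name.

```python
import numpy as np

def project_on_leading_eigenvectors(X, lam, k=2):
    """Project rows of X onto the k eigenvectors of lam with the largest eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(lam)     # eigh: for symmetric matrices, ascending order
    order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest eigenvalues
    return X @ eigvecs[:, order], eigvals[order]

# placeholder relevance matrix (diagonal, trace 1) and three example points
lam = np.diag([0.7, 0.2, 0.1])
X = np.eye(3)
coords, leading = project_on_leading_eigenvectors(X, lam, k=2)
# coords[:, 0] and coords[:, 1] are the 2D plotting coordinates of each example
```

The sign of each eigenvector is arbitrary, so a plot may appear mirrored between runs without any change in the class structure it shows.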

diseases only: PD / MSA / PSP
leave-one-out confusion matrix for the three-class problem (1 vs 1)
[confusion matrices not recovered; one panel labeled "lin."]

diseases only: PD / MSA / PSP
GMLVQ: visualization of training data set in terms of the leading eigenvectors of Λ
[figure: scatter plot with classes PD, MSA, PSP]

discussion / conclusion
- detection and discrimination of Parkinsonian syndromes: GMLVQ classifier and SVM clearly outperform decision trees
- serious limitations:
  small data set
  leave-one-out validation
  over-fitting
- accuracy is not enough: can we obtain better insight into the classifiers?

outlook / work in progress
- larger data sets
- optimization of the number of PCs used as features
  shown to improve decision tree performance
  potential improvement for other classifiers
- understanding relevances in voxel-space
  relevant PC hint at discriminative between-patient variability
- PCA, recent example: diagnosis of rheumatoid arthritis based on cytokine expression [L. Yeo et al., Ann. of the Rheumatic Diseases, 2015]

links
Pre- and re-prints etc.: http://www.cs.rug.nl/~biehl/
Matlab code:
- Relevance and Matrix adaptation in Learning Vector Quantization (GRLVQ, GMLVQ and LiRaM LVQ): http://matlabserver.cs.rug.nl/gmlvqweb/
- A no-nonsense beginners’ tool for GMLVQ: http://www.cs.rug.nl/~biehl/gmlvq

Questions?