Automated Identification of Abnormal Adult EEGs
A Thesis Proposal by: Silvia López de Diego
Neural Engineering Data Consortium
College of Engineering, Temple University
Philadelphia, Pennsylvania, USA

Introduction
Electroencephalography (EEG)
• Electroencephalography (EEG) refers to the recording of electrical activity along the scalp.
• It is used to diagnose conditions such as sleep disorders and epilepsy.
• Because EEG is noninvasive and relatively cheap, it is still widely used despite the emergence of imaging technologies such as Magnetic Resonance Imaging (MRI).
S. López de Diego: Abnormal EEGs, December 8, 2016

Manual Interpretation of EEGs
• Manual interpretation of an EEG is performed by a board-certified neurologist. Earning this certification takes several years.
• Interrater agreement is low: the interpretation of an EEG depends in part on the training and subjective judgement of the examiner.
• Increasing interrater agreement for EEG interpretation is one of the advantages of an automated technique.
• Workflow, in order:
  • Patient Preparation: patients are prepared for the test.
  • EEG Recording: an EEG ranging from 22 minutes to several days is recorded.
  • EEG is Interpreted: certified physicians interpret the EEG.
  • Report is Produced: a report of findings (e.g., abnormality) is prepared.

Manual Interpretation of EEGs
• The EEG interpretation task can be broken down into:
  • Recognition of transients: events that include pathological and physiological waveforms, such as spike and sharp wave discharges.
  • Analysis of background: general characteristics present in all EEG recordings, which are usually examined when making a normal/abnormal classification.
[Figures: Example of EEG Background; Example of EEG Transient]

Normal EEG Characteristics
• The main characteristics of a normal EEG are the following:
  • Reactivity: response to certain physiological changes or provocations.
  • Alpha Rhythm: waves originating predominantly in the occipital lobe, between 8-13 Hz and 15 to 45 μV.
  • Mu Rhythm: central rhythm of alpha activity, commonly between 8-10 Hz, visible in 17% to 19% of adults.
  • Beta Activity: activity in the frequency bands of 18-25 Hz, 14-16 Hz and 35-40 Hz.
  • Theta Activity: traces of 6-7 Hz activity present in the frontal or frontocentral regions of the brain.
• Alpha rhythm: the normal/abnormal classification depends heavily on the frequency, presence or distortion of this feature. Its emergence during the closed-eyes period is known as the Posterior Dominant Rhythm (PDR). We decided to focus on this characteristic.

Abnormal EEG Classification

Automatic Abnormal EEG Classification
• A general method for classifying EEGs as normal or abnormal has not yet been explored.
• Previous studies have focused on very specific conditions, such as identifying athletes with residual functional deficits after a concussion.
• Most of these studies have not been conducted with clinical EEGs.
• This study proposes a general method for the classification of normal and abnormal EEGs, which would be more useful in a clinical setting, where patients are evaluated for a wide range of conditions.
• To do this, the study will focus on the automatic analysis of the background EEG.
[Diagram: Input → Features → Model → Output (Normal / Abnormal)]

Background
The System
[Diagram: Input → Features → Model → Output (Normal / Abnormal). Models: kNN and RF (Pilot Studies); HMM (Baseline System)]

Classification of Sequential Data
• EEGs, like speech signals, are the product of a physiological process that unfolds in time.
• Machine learning approaches that treat the observations as i.i.d. fail to exploit the sequential nature of the data.
• This, together with the success of Hidden Markov Models (HMMs) in speech recognition, motivated the selection of this model for the baseline system.

Hidden Markov Models (HMMs)

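A minimal sketch of how a pair of HMMs could be used for a normal/abnormal decision: score the observation sequence under each model with the forward algorithm and pick the more likely one. This is an illustration under simplifying assumptions, not the system described in these slides: the emissions here are discrete symbols rather than Gaussian mixtures, and the `MODELS` parameters and the function names are hypothetical.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm."""
    n_states = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        # Rescale at every step to avoid underflow on long sequences.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

def classify(obs, models):
    """Return the label of the model with the highest log-likelihood."""
    return max(models, key=lambda label: forward_log_likelihood(obs, *models[label]))

# Hypothetical two-state models over a two-symbol alphabet.
# Each entry is (start probabilities, transition matrix, emission matrix).
MODELS = {
    "normal": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.2, 0.8], [0.3, 0.7]]),
    "abnormal": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.7, 0.3]]),
}

print(classify([1, 1, 1, 0, 1, 1], MODELS))
```

Unlike a classifier that scores each frame independently, the transition matrix lets the score depend on the order of the observations, which is the property the previous slide motivates.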
HMMs and Deep Neural Networks (DNNs)
• Advances in computer hardware and in deep learning/machine learning algorithms have made it faster to train Deep Neural Networks (DNNs).
• There have been a series of breakthroughs in automatic speech recognition. Deep learning has surpassed the performance of HMMs in several speech recognition tasks, such as Switchboard, in which the error rate was decreased to 6.9%.
• With sufficient data, deep learning systems can significantly improve performance.
• Long Short Term

Corpus           Training Speech   SGMM WER   DNN WER
BABEL Pashto     10 hours          69.2%      67.6%
BABEL Pashto     80 hours          50.2%      42.3%
Fisher English   2000 hours        15.4%      10.3%

Experimental Setup
Data
• The data used was a demographically balanced subset of the TUH EEG Corpus, divided as follows:

Set          Normal    Abnormal
Training     82 EEGs   80 EEGs
Evaluation   51 EEGs   55 EEGs

Experimental Design
• The first 60 seconds of each EEG recording were used.
• Signal features were extracted:
  • MFCC-like features (8 cepstral coefficients)
  • Differential energy
  • First and second derivatives
• Vectors for the selected channel were concatenated into a supervector.
• PCA was used to reduce the dimensionality of the feature matrix.

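The derivative and supervector steps can be sketched as follows. This is an illustration only: the simple frame-to-frame difference is an assumption (the slides do not say how the derivatives were computed), the function names are hypothetical, and the PCA step is not shown. Under the further assumption that derivatives are taken over all base features, a 9-value frame (8 cepstral coefficients plus differential energy) would grow to 27 values.

```python
def deltas(frames):
    """First-order differences between consecutive frames.
    The first frame has no predecessor, so it gets zeros."""
    out = [[0.0] * len(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        out.append([c - p for p, c in zip(prev, cur)])
    return out

def add_derivatives(frames):
    """Append first and second derivatives to every frame."""
    d1 = deltas(frames)
    d2 = deltas(d1)
    return [f + a + b for f, a, b in zip(frames, d1, d2)]

def supervector(frames):
    """Concatenate all frames of one recording into a single flat vector."""
    return [x for frame in frames for x in frame]

# Hypothetical toy input: 3 frames of 2 base features each.
frames = [[1.0, 2.0], [2.0, 4.0], [4.0, 8.0]]
full = add_derivatives(frames)
vector = supervector(full)
print(len(full[0]), len(vector))
```

Each recording then contributes one row to the feature matrix that PCA reduces.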
Random Forest and the Number of Trees
• Systems with more than 20 trees perform comparably to each other.
• Taking both performance and classification time into account, 50 trees were used for the rest of the experiments.

kNN: Tuning the System
• The lowest k within the best operating interval was chosen.
• This point corresponds to k = 20.
• The best error rate achieved by the system is 41.79%, at a PCA dimension of 86.

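For reference, the role of k can be seen in a minimal kNN sketch: the k closest training vectors vote on the label. The Euclidean distance, the function name `knn_predict` and the toy data are assumptions made for illustration; the slides do not specify the distance metric used.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """train is a list of (vector, label) pairs.
    Returns the majority label among the k nearest vectors (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-dimensional feature vectors (e.g. after PCA).
train = [
    ([0.0, 0.0], "normal"), ([0.0, 1.0], "normal"), ([1.0, 0.0], "normal"),
    ([5.0, 5.0], "abnormal"), ([5.0, 6.0], "abnormal"), ([6.0, 5.0], "abnormal"),
]
print(knn_predict(train, [0.2, 0.1], k=3))
```

A small k follows local noise in the training set, while a large k smooths the decision; choosing the lowest k inside the best operating interval keeps the vote as local as the data allows.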
Channel Comparison
• The system was evaluated for the highlighted channels.
• Performance for the T5-O1 channel was better at all operating points with more than 20 PCA dimensions.
• This agrees with what neurologists report about their reliance on occipital channels when classifying EEGs.

Summary of Pilot Studies
Error rates for the systems described so far:

No.   System Description   Error
1     kNN (k = 20)         41.79%
3     RF (Ntrees = 50)     31.66%

Confusion matrix for kNN:

Ref/Hyp    Normal   Abnormal
Normal     50.49%   49.50%
Abnormal   34.00%   66.00%

GMM-HMM Experiments
• This set of experiments was conducted with the full set of features.
• The optimized system was then tested with the same feature input as the pilot experiments for comparison.
• The experiments can be summarized as follows:
  • Gaussian Mixture/HMM State Analysis
  • Signal Input Analysis
  • Channel Analysis

GMM-HMM Experiments
Gaussian Mixture/HMM State Analysis Results:
# Gaussian Mixtures: 1 1 1 2 2 2 3 3 3 4 4 4
# HMM States: 1 2 3
Correct Detection (%): 69.81% 65.09% 76.42% 80.19% 77.36% 76.42% 82.08% 83.02% 82.08% 64.15% 77.36%

GMM-HMM Experiments: GM/HMM Analysis
Signal Input Analysis Results:

Input (min)   #Gaussians/#HMM States   Correct Detection (%)
5             3/3                      80.19%
10            3/3                      83.02%
15            3/3                      80.19%
20            3/3                      79.25%
25            3/3                      76.42%

Channel Analysis Results:

#Gaussians/#HMM States   Channel   Correct Detection (%)
3/3                      Fp1-F7    80.19%
3/3                      T5-O1     83.02%
3/3                      F7-T3     80.19%
3/3                      C3-Cz     79.25%
3/3                      P3-O1     76.42%

Summary of Results
• The table below summarizes the results obtained with the systems implemented so far:

System Description                   Error (%)
kNN (k = 20)                         41.80%
RF (Nt = 50)                         31.70%
PCA-HMM (#GM = 3, #HMM States = 3)   25.64%
GMM-HMM (#GM = 3, #HMM States = 3)   16.98%

kNN Confusion Matrix:

Ref/Hyp    Normal   Abnormal
Normal     50.49%   49.50%
Abnormal   34.00%   66.00%

GMM-HMM Confusion Matrix:

Ref/Hyp    Normal   Abnormal
Normal     78.18%   21.82%
Abnormal   11.76%   88.24%

• The GMM-HMM baseline system showed a significant decrease in the false alarm rate in comparison with the kNN system (from 49.50% to 21.82%).
• The best GMM-HMM system will serve as a baseline for the normal/abnormal classification problem.

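The GMM-HMM error rate can be cross-checked against its confusion matrix with a short calculation. The counts below are not given in the slides; they are inferred from the row percentages (43/55, 12/55, 6/51, 45/51) and should be read as an assumption. The implied row totals of 55 and 51 match the two evaluation class sizes, although the Data slide lists 51 normal and 55 abnormal recordings, so the assignment of totals to rows is uncertain.

```python
# Counts inferred from the GMM-HMM row percentages (assumption, not from the slides).
# Outer key: reference label; inner key: hypothesis label.
confusion = {
    "Normal": {"Normal": 43, "Abnormal": 12},
    "Abnormal": {"Normal": 6, "Abnormal": 45},
}

def row_percentages(matrix):
    """Convert each row of counts to percentages of that row's total."""
    result = {}
    for ref, row in matrix.items():
        total = sum(row.values())
        result[ref] = {hyp: round(100.0 * n / total, 2) for hyp, n in row.items()}
    return result

def error_rate(matrix):
    """Percentage of recordings whose hypothesis differs from the reference."""
    total = sum(n for row in matrix.values() for n in row.values())
    errors = sum(n for ref, row in matrix.items() for hyp, n in row.items() if hyp != ref)
    return round(100.0 * errors / total, 2)

print(row_percentages(confusion))
print(error_rate(confusion))
```

With these counts, 18 of 106 recordings are misclassified, which agrees with the 16.98% error and the 83.02% correct detection reported for the 3/3 GMM-HMM system.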
Timeline of Future Work
GMM-HMM Experiments
December-January
• Set up a deep learning system for a second pass of processing after the GMM-HMM stage:
  • Implement and optimize a Stacked Denoising Autoencoder (SdA) system for the classification, and increase the number of channels taken into account for the classification decision.
• Expand and evaluate the normal/abnormal TUH database subset:
  • Write simple natural language processing (NLP) scripts to obtain EEG sessions that have been evaluated and classified by neurologists, and form a larger, demographically balanced subset of the data.
February
• Implement a long short-term memory (LSTM) system for the normal/abnormal classification of EEGs.
  • This system will be implemented with the Theano Python library for deep learning and evaluated on the expanded dataset.
• Evaluate the SdA implementation on the expanded dataset.
March-May
• Complete the writing of the thesis and work on publications.
• Defend the thesis.