UNCERTAINTY QUANTIFICATION WITH MONTE CARLO DROPOUT FOR SRF CAVITY AND FAULT CLASSIFICATION
LASITHA VIDYARATNE

OUTLINE
• SRF cavity and fault classification task
  § Problem definition
  § Data
  § DRL model
• Background
  § Monte-Carlo Dropout
• Cavity and fault classification with UQ
  § MC dropout implementation
• Preliminary results and observations
• Summary

INTRODUCTION
• Reduce RF related CEBAF machine downtime
  § Largest contributor to short machine downtime trips
  § Require manual inspection of large scale RF data
• Manual recognition of cavity and fault requires
  § Time and energy
  § Subject matter expertise
• Fast, automated recognition system
  § Saves time and energy
  § Helps minimize failures

DEFINING THE PROBLEM
• Data from 12 cryomodules in CEBAF
• 1 cryomodule = collection of 8 cavities
• 17 signals/cavity × 8 cavities = 136 signals
• Question #1 (Task #1): Which of the 8 cavities faulted first?
• Question #2 (Task #2): What kind of trip was it?

DATA ACQUISITION
• Waveform harvester captures RF time-series signals after a fault
  § 17 waveform signals for each cavity
    o Each 8,192 time points long
  § 94% of the recorded data precedes the fault event and 6% follows it
  § 8,192 samples × 0.2 ms/sample ≈ 1.64 seconds
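The window arithmetic on this slide can be checked with a short sketch. The three constants come from the slide; the function name `window_summary` and the returned field names are hypothetical, chosen only for illustration.

```python
# Figures taken from the slide.
N_SAMPLES = 8192            # time points per waveform
SAMPLE_PERIOD_MS = 0.2      # milliseconds per sample
PRE_FAULT_FRACTION = 0.94   # share of the record that precedes the fault


def window_summary(n_samples=N_SAMPLES,
                   period_ms=SAMPLE_PERIOD_MS,
                   pre_fraction=PRE_FAULT_FRACTION):
    """Return the record length and its pre/post-fault split, in seconds."""
    total_ms = n_samples * period_ms
    pre_ms = total_ms * pre_fraction
    post_ms = total_ms - pre_ms
    return {
        "total_s": total_ms / 1000.0,
        "pre_fault_s": pre_ms / 1000.0,
        "post_fault_s": post_ms / 1000.0,
    }


if __name__ == "__main__":
    for name, seconds in window_summary().items():
        print(f"{name}: {seconds:.4f}")
```

Under these figures the record is about 1.64 s long, with roughly 1.54 s before the fault and about 0.1 s after it.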

DATASET
• Waveforms from latest (Summer 2020) run
  § A total of 3791 events

DEEP RECURRENT LEARNING MODEL
• Bidirectional LSTM layers for temporal feature learning
• Simultaneous classification of cavity and fault: two-branch model
• Dropout in all layers
  § dropout rate = 0.5
[Figure: model architecture. Diagram labels: Input; Bidirectional LSTM Layers (64 each); Linear Feed-forward Layers; Fault ID; Cavity ID]

BACKGROUND: DROPOUT IN NEURAL NETWORKS
• Popular neural network regularization method
• Randomly disable some connections in the network for each training example
• Imposes a regularization based on dropout probability
  § Mitigates overfitting
  § Prevents memorizing data
[Figure: standard network vs. network after applying dropout. Image source: https://humboldt-wi.github.io/blog/research/information_systems_1819/uncertainty-and-credit-scoring/]
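A minimal sketch of the mechanism described above, assuming the common "inverted dropout" formulation in which unit activations (rather than individual connections) are zeroed with probability `p_drop` and the survivors are rescaled by `1 / (1 - p_drop)`. The function name and the example values are hypothetical.

```python
import random


def dropout(activations, p_drop, rng):
    """Zero each activation with probability p_drop; rescale the rest.

    Rescaling by 1 / (1 - p_drop) keeps the expected value of each
    activation unchanged (inverted dropout).
    """
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


if __name__ == "__main__":
    rng = random.Random(0)             # seeded only for repeatability
    hidden = [0.5, -1.2, 3.0, 0.7]     # made-up activations
    print(dropout(hidden, 0.5, rng))   # a new random mask on every call
```

Each call draws a fresh mask, which is what makes every training example see a slightly different network.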

MONTE CARLO DROPOUT FOR UNCERTAINTY COMPUTATION
• A common method is to obtain the predictive posterior distribution with Bayesian neural networks
• Inspect the variance of the predictive posterior distribution for a given input to find the uncertainty
• Computing the predictive distribution requires:
  § the likelihood
  § the (approximate) parametric posterior

MONTE CARLO DROPOUT FOR UNCERTAINTY COMPUTATION
• Learning the predictive distribution requires estimating an approximate parametric posterior distribution
• MC dropout helps by sampling from the parametric posterior (a different NN architecture by dropout for each run) [1]
• Sampling from the approximate parametric posterior enables MC integration of the model's likelihood
[1] Gal, Yarin, and Zoubin Ghahramani. "Dropout as a Bayesian approximation: Representing model uncertainty in deep learning." International Conference on Machine Learning. PMLR, 2016.
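The equations on these slides did not survive extraction. In standard Bayesian neural network notation, the quantities named in the bullets above are commonly written as below, following Gal and Ghahramani (2016). All symbols are assumptions rather than the deck's own notation: w for the network weights, D for the training data, (x*, y*) for a test input and its label, q_θ(w) for the dropout-induced approximate posterior, and T for the number of stochastic forward passes.

```latex
\begin{align}
  % predictive posterior: likelihood integrated over the weight posterior
  p(y^* \mid x^*, D) &= \int p(y^* \mid x^*, w)\, p(w \mid D)\, dw \\
  % the intractable weight posterior is replaced by an approximation
  p(w \mid D) &\approx q_\theta(w) \\
  % Monte Carlo integration with T dropout samples
  p(y^* \mid x^*, D) &\approx \frac{1}{T} \sum_{t=1}^{T} p(y^* \mid x^*, \hat{w}_t),
  \qquad \hat{w}_t \sim q_\theta(w)
\end{align}
```

Each sample of the weights corresponds to one dropout mask, so the sum in the last line is an average over T forward passes with dropout left active.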

MONTE CARLO DROPOUT FOR UNCERTAINTY COMPUTATION
• Image source: https://docs.aws.amazon.com/prescriptive-guidance/latest/ml-quantifying-uncertainty/mc-

MONTE CARLO DROPOUT IMPLEMENTATION
• Take a testing example from the testing set
• Replicate the example T times (1, 2, 3, …, T)
• Monte Carlo sampling: each replica is passed through the model with a different dropout configuration (Dropout Config 1, Dropout Config 2, Dropout Config 3, …, Dropout Config T)
• Compute classification output & uncertainty measures
  § Classification result
  § Model uncertainty
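The procedure above can be sketched in a few lines. This is an illustration under stated assumptions, not the deck's implementation: a toy linear classifier with dropout on its inputs stands in for the deep recurrent model, and the uncertainty measures shown (per-class variance across passes and entropy of the mean prediction) are common choices that may differ from the ones used in the talk. All names and weights are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_forward(x, weights, p_drop, rng):
    """One forward pass with a freshly sampled dropout mask on the inputs."""
    keep = 1.0 - p_drop
    dropped = [xi / keep if rng.random() < keep else 0.0 for xi in x]
    logits = [sum(w * xi for w, xi in zip(row, dropped)) for row in weights]
    return softmax(logits)


def mc_dropout_predict(x, weights, T=100, p_drop=0.5, seed=0):
    """Run T stochastic passes on the same example and summarize them.

    Returns (predicted label, mean class probabilities,
             per-class variance across passes, entropy of the mean).
    """
    rng = random.Random(seed)
    samples = [stochastic_forward(x, weights, p_drop, rng) for _ in range(T)]
    n_classes = len(weights)
    mean = [sum(s[c] for s in samples) / T for c in range(n_classes)]
    var = [sum((s[c] - mean[c]) ** 2 for s in samples) / T
           for c in range(n_classes)]
    entropy = -sum(p * math.log(p) for p in mean if p > 0.0)
    label = max(range(n_classes), key=mean.__getitem__)
    return label, mean, var, entropy


if __name__ == "__main__":
    toy_weights = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]]  # 2 classes, 3 inputs
    print(mc_dropout_predict([2.0, 0.5, 0.1], toy_weights, T=200))
```

With `p_drop=0.0` every pass is identical and the variance collapses to zero, which is a quick sanity check that the spread comes only from the dropout masks.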

UNCERTAINTY MEASURES

RESULTS
                                                                Cavity Classification                Fault Classification
Method                                                          Input Size        Test Accuracy (%)  Input Size        Test Accuracy (%)
Machine Learning (AR)                                           192 features      86.16              192 features      84.84
Machine Learning (tsfresh-minimal)                              256 features      87.6               256 features      86
Machine Learning (tsfresh-comprehensive)                        24,928 features   89.3               24,928 features   85.5
Machine Learning (tsfresh-comprehensive + feature selection)    23,293 features   89.6               23,293 features   86.2
Deep Recurrent Branched (Raw Time Series)                       136 × 256         85.1               136 × 256         83.5
Deep Recurrent Branched (MC Dropout)                            136 × 256         86.56              136 × 256         84.45

Note: Top-3 cavity classification accuracy: if the ground-truth (GT) class is in the top-3 probabilities, it is counted as correct.
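The top-3 criterion in the note on this slide can be expressed as a small function. This is a sketch with hypothetical names and made-up probabilities; it is not taken from the deck.

```python
def top_k_accuracy(probabilities, labels, k=3):
    """Fraction of examples whose true label is among the k most probable classes.

    probabilities: one list of class probabilities per example
    labels:        the ground-truth class index for each example
    """
    if len(probabilities) != len(labels) or not labels:
        raise ValueError("need one non-empty probability list per label")
    hits = 0
    for probs, label in zip(probabilities, labels):
        # class indices sorted from most to least probable
        ranked = sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


if __name__ == "__main__":
    # Made-up predictions for three examples over four classes.
    probs = [
        [0.10, 0.20, 0.30, 0.40],
        [0.40, 0.30, 0.20, 0.10],
        [0.25, 0.25, 0.30, 0.20],
    ]
    truth = [1, 0, 3]
    print(top_k_accuracy(probs, truth, k=3))
    print(top_k_accuracy(probs, truth, k=1))
```

Setting `k=1` recovers ordinary accuracy, so the same function covers both ways of scoring.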

CONFUSION MATRICES

UNCERTAINTY QUANTIFICATION: CAVITY CLASSIFICATION
[Figure; plot annotations: ~90%, ~38%]

UNCERTAINTY QUANTIFICATION: CAVITY CLASSIFICATION
[Figure; plot annotations: ~90%, ~41%]

UNCERTAINTY QUANTIFICATION: CAVITY CLASSIFICATION
[Figure; plot annotations: ~90%, ~78%, ~38%, ~20%]

UNCERTAINTY QUANTIFICATION: FAULT CLASSIFICATION
[Figure; plot annotations: ~90%, ~50%]

UNCERTAINTY QUANTIFICATION: FAULT CLASSIFICATION
[Figure; plot annotations: ~90%, ~48%]

UNCERTAINTY QUANTIFICATION: FAULT CLASSIFICATION
[Figure; plot annotations: ~90%, ~80%, ~45%, ~30%]

SUMMARY
• Preliminary analysis of model (epistemic) uncertainty quantification
  § Monte Carlo Dropout implementation
    o Applicable to existing DL models
    o Straightforward implementation
  § Different uncertainty measures
    o Setting uncertainty-based thresholds to filter decisions
• Future plans
  § Bayesian (recurrent) neural networks
  § Ensemble models
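One way to read "setting uncertainty-based thresholds to filter decisions" is to keep only the predictions whose uncertainty falls at or below a cutoff and report accuracy on what remains. The sketch below assumes that reading; the function name, the tuple layout, and the sample values are hypothetical and are not results from the talk.

```python
def filter_by_uncertainty(predictions, threshold):
    """Keep predictions with uncertainty <= threshold.

    predictions: list of (predicted_label, true_label, uncertainty) tuples
    Returns (coverage, accuracy): the fraction of predictions kept, and the
    accuracy on the kept ones (None when nothing is kept).
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    kept = [(p, t) for p, t, u in predictions if u <= threshold]
    coverage = len(kept) / len(predictions)
    accuracy = sum(p == t for p, t in kept) / len(kept) if kept else None
    return coverage, accuracy


if __name__ == "__main__":
    # Made-up (predicted, true, uncertainty) triples for illustration only.
    sample = [(0, 0, 0.05), (1, 1, 0.10), (2, 1, 0.60), (0, 0, 0.20), (3, 2, 0.90)]
    for cutoff in (0.3, 1.0):
        print(cutoff, filter_by_uncertainty(sample, cutoff))
```

Lowering the cutoff trades coverage for accuracy: fewer decisions are accepted automatically, and the rejected ones can be routed to a subject matter expert.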

THANK YOU!

UQ CAVITY CLASSIFICATION BOX PLOTS

UQ FAULT CLASSIFICATION BOX PLOTS

INTERESTING OBSERVATIONS
• 3 ms Quenches
  § DL correctly predicts 3 ms Quenches 74.2% (49/66) of the time
  § When predictions are correct (i.e. DL “first choice” = ground truth): [Figure panels: second choice, third choice]
  § When predictions are incorrect (i.e. DL “first choice” != ground truth): [Figure panel: first choice]

INTERESTING OBSERVATIONS
• 100 ms Quenches
  § DL correctly predicts 100 ms Quenches 83.6% (46/55) of the time
  § When predictions are correct (i.e. DL “first choice” = ground truth): [Figure panels: second choice, third choice]
  § When predictions are incorrect (i.e. DL “first choice” != ground truth): [Figure panel: first choice]

CLASSIFICATION ANALYSIS: FAULT
fault_label            Number of Examples   Accuracy
Single Cav Turn off    54                   70.40%
Microphonics           57                   89.47%
Quench_100ms           55                   87.27%
Controls Fault         120                  73.33%
E_Quench               94                   90.43%
Quench_3ms             66                   78.79%
Multi Cav turn off     173                  91.33%
Heat Riser Choke       127                  91.34%
Unknown                13                   38.46%

• When predictions are incorrect (i.e. DL “first choice” != ground truth), first choice:
  § Row labels (fault_label): Single Cav Turn off, Microphonics, Quench_100ms, Controls Fault, E_Quench, Quench_3ms, Multi Cav turn off, Heat Riser Choke
  § Column labels: Single Cav Turn off, Microphonics, Quench_100ms, Controls Fault, E_Quench, Quench_3ms, Multi Cav turn off, Heat Riser Choke, Unknown
  § Counts, flattened in slide order: 0 0 0 9 2 0 5 0 0 2 0 1 0 0 1 2 0 0 1 1 3 0 1 0 5 0 0 0 5 6 12 3 1 1 2 1 0 0 1 2 0 2 1 0 4 4 1 0 2 3 0 0 10 2 0 0 0 1 2 4 0 3 0 0 1
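The per-class counts and accuracies on this slide can be combined into an overall fault accuracy. The sketch below assumes that the nine counts and nine accuracies pair up with nine fault classes in slide order, with the garbled header "E_Quench_3 ms" read as two classes (E_Quench and Quench_3ms); that pairing is an interpretation of the flattened slide, and the function name is hypothetical.

```python
# (number of examples, accuracy in %) per fault class, in slide order.
# The split of "E_Quench_3 ms" into two classes is an assumption.
PER_CLASS = {
    "Single Cav Turn off": (54, 70.40),
    "Microphonics": (57, 89.47),
    "Quench_100ms": (55, 87.27),
    "Controls Fault": (120, 73.33),
    "E_Quench": (94, 90.43),
    "Quench_3ms": (66, 78.79),
    "Multi Cav turn off": (173, 91.33),
    "Heat Riser Choke": (127, 91.34),
    "Unknown": (13, 38.46),
}


def overall_accuracy(table):
    """Recover correct counts from rounded percentages and pool them."""
    total = sum(n for n, _ in table.values())
    correct = sum(round(n * acc / 100.0) for n, acc in table.values())
    return total, correct, 100.0 * correct / total


if __name__ == "__main__":
    total, correct, pct = overall_accuracy(PER_CLASS)
    print(f"{correct}/{total} correct = {pct:.2f}%")
```

Under this pairing the pooled figure is 641 of 759 examples, about 84.45%, which appears to agree with the 84.45% fault-classification accuracy listed for MC Dropout on the results slide.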

CLASSIFICATION ANALYSIS: FAULT
• When predictions are incorrect (i.e. DL “first choice” != ground truth), second choice:
  § Row labels (fault_label): Single Cav Turn off, Microphonics, Quench_100ms, Controls Fault, E_Quench, Quench_3ms, Multi Cav turn off, Heat Riser Choke, Unknown
  § Column labels: Single Cav Turn off, Microphonics, Quench_100ms, Controls Fault, E_Quench, Quench_3ms, Multi Cav turn off, Heat Riser Choke, Unknown
  § Counts, flattened in slide order: 9 0 0 3 0 2 1 0 1 1 2 0 1 1 0 0 1 2 3 0 0 1 0 0 0 7 3 0 14 0 2 3 0 0 3 2 3 0 1 0 0 2 3 0 0 1 6 1 0 1 2 1 2 1 6 0 0 0 1 1 2 1 4 2 1 2 0 1 1 0 0