Machine-learning-based search for Dark Matter using data from the ATLAS experiment at CERN: Feature selection and model interpretation
Feature selection • Features and combinations of them affect the ML model differently • Some features are more important than others when identifying signal events • The ideal combination optimizes the performance of the model • Systematic removal and addition of features • Sensitivity curve ( S / √(S + B) ) • Model interpretation methods
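A sensitivity curve of this kind can be sketched as a scan over cuts on the model output score. The sketch below assumes the common significance estimate S / √(S + B), where S and B are the weighted signal and background yields passing the cut. The function names, the toy scores, labels and weights are all hypothetical and are not taken from the analysis.

```python
import math

def significance(s, b):
    """Approximate significance S / sqrt(S + B); 0.0 if nothing passes the cut."""
    total = s + b
    return s / math.sqrt(total) if total > 0 else 0.0

def sensitivity_curve(scores, labels, weights, thresholds):
    """Return one significance value per threshold on the model score."""
    curve = []
    for t in thresholds:
        # Weighted yields of signal (label 1) and background (label 0) above the cut
        s = sum(w for x, y, w in zip(scores, labels, weights) if x >= t and y == 1)
        b = sum(w for x, y, w in zip(scores, labels, weights) if x >= t and y == 0)
        curve.append(significance(s, b))
    return curve

# Toy inputs (hypothetical, not ATLAS data)
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
weights = [1.0] * 6
thresholds = [0.0, 0.5, 0.75]

curve = sensitivity_curve(scores, labels, weights, thresholds)
best_threshold = thresholds[max(range(len(curve)), key=curve.__getitem__)]
print(curve, best_threshold)
```

Comparing such curves for models trained on different feature sets is one way to rank the feature combinations.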
Model interpretation • Machine learning algorithms are “black box” models • Output is often just an approximation score • Tools to better understand the inside of the black box • Discovery of bias and dependencies between features • What decisions does the model make?
Skater decision tree • Open-source Python library • Broad interpretation functionality, some open issues • Skater’s TreeSurrogate • Machine learning explaining machine learning • Visualizes approximated decisions of the model
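The surrogate idea ("machine learning explaining machine learning") can be illustrated without Skater. The sketch below is not Skater's API: it swaps in scikit-learn and fits a shallow decision tree to the predictions of a black-box classifier, then reports how often the two agree. The data, feature names and model choices are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data (hypothetical): the label depends on f0 and f1, f2 is pure noise
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
feature_names = ["f0", "f1", "f2"]

# The "black box" whose decisions we want to approximate
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)
black_box_pred = black_box.predict(X)

# The surrogate is trained on the black box's predictions, not on the true labels
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box_pred)

# Fidelity: fraction of events where surrogate and black box agree
fidelity = (surrogate.predict(X) == black_box_pred).mean()
rules = export_text(surrogate, feature_names=feature_names)
print(f"fidelity = {fidelity:.2f}")
print(rules)
```

The printed rules are only an approximation of the black box, so the fidelity should be checked before reading physics meaning into the splits.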
Feature importance • Visualization of the importance of the features • Based on the Skater decision tree • No “new” info • Simplifies the analysis process • Want a feature importance method independent from the decision tree
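One candidate for a feature importance method that does not depend on the surrogate tree is permutation importance: shuffle one feature at a time and measure how much the score drops. The sketch below uses scikit-learn's `permutation_importance` on toy data. The model, the data and the choice of this method are assumptions, not the method used in the analysis.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): only feature 0 carries signal
rng = np.random.default_rng(1)
X = rng.normal(size=(600, 3))
y = (X[:, 0] > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

model = RandomForestClassifier(n_estimators=50, random_state=1).fit(X_train, y_train)

# Shuffle each feature on held-out data and record the drop in accuracy
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=1)
ranking = np.argsort(result.importances_mean)[::-1]  # most important first
print(ranking, result.importances_mean)
```

Because the method only needs model predictions, it works for any classifier and can be compared directly with the tree-based ranking.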
Progress • The role of derived variables • Removed different combinations • All derived variables seem important • Some have greater impact on the model than others
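Removing combinations of derived variables can be sketched as a loop that retrains the model once per removed subset and records a score. Everything below is a toy assumption: the variable names (`d_prod`, `d_noise`), the logistic regression model and the AUC metric are hypothetical stand-ins, not the analysis setup.

```python
from itertools import combinations

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): d_prod is a useful derived variable, d_noise is not
rng = np.random.default_rng(2)
n = 800
x0, x1 = rng.normal(size=n), rng.normal(size=n)
columns = {"x0": x0, "x1": x1, "d_prod": x0 * x1, "d_noise": rng.normal(size=n)}
y = (x0 * x1 + 0.2 * rng.normal(size=n) > 0).astype(int)

def auc_without(removed):
    """Retrain without the given variables and return the test AUC."""
    kept = [name for name in columns if name not in removed]
    X = np.column_stack([columns[name] for name in kept])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)
    model = LogisticRegression().fit(X_tr, y_tr)
    return roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

# Try every combination of removed derived variables, including none
derived_names = ["d_prod", "d_noise"]
results = {}
for k in range(len(derived_names) + 1):
    for removed in combinations(derived_names, k):
        results[removed] = auc_without(removed)

for removed, auc in results.items():
    print(removed, round(auc, 3))
```

The size of the drop relative to the full feature set gives a rough measure of how much each combination matters to the model.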
[Figure: two panels, "With sumMT" and "Without sumMT"]
Plan • Implement more model interpretation • Feature importance independent from decision tree • Partial dependence plot • Repeat previous experiment with new MC data • Explore the effect of adding more features to the model
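A partial dependence plot shows the average model output as one feature is scanned over a grid while all other features keep their observed values. The sketch below computes the curve by hand for a toy classifier. The helper `partial_dependence_1d`, the data and the model are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def partial_dependence_1d(model, X, feature, grid):
    """Average predicted signal probability with one feature fixed to each grid value."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite the feature for every event
        averages.append(model.predict_proba(X_mod)[:, 1].mean())
    return np.array(averages)

# Toy data (hypothetical): feature 0 drives the label, feature 1 is noise
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 2))
y = (X[:, 0] > 0).astype(int)
model = GradientBoostingClassifier(random_state=3).fit(X, y)

grid = np.linspace(-2, 2, 9)
pd_signal = partial_dependence_1d(model, X, 0, grid)
pd_noise = partial_dependence_1d(model, X, 1, grid)
print(pd_signal.round(2))
print(pd_noise.round(2))
```

Plotting each curve against `grid` gives the partial dependence plot. A flat curve suggests the model barely uses that feature.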
- Slides: 8