Lightning Talk
Anomaly Detection in the Elasticsearch Service
Jennifer Andersson
15/08/2019

Introduction
Elasticsearch Service:
• Search & analytics engine
• 30 clusters & 160 use cases
Project goal: Detect service issues before they cause problems

Anomaly Detection & Degradation Prediction
[Figure: diagram with blocks labelled "Deep autoencoder" and "Classifier"]

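The slide names a deep autoencoder but does not say how anomalies are scored. A minimal sketch, assuming the common approach of scoring each input window by its reconstruction error and flagging scores above a threshold fitted on normal data. The function names, the mean-plus-k-sigma threshold, and `mean_reconstruct` (a toy stand-in for a trained autoencoder) are all hypothetical and not taken from the talk.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence

Vector = Sequence[float]


def reconstruction_error(x: Vector, x_hat: Vector) -> float:
    """Mean squared error between an input window and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("input and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(normal_errors: Sequence[float], k: float = 3.0) -> float:
    """Assumed rule: mean + k * std of errors measured on normal data."""
    return mean(normal_errors) + k * pstdev(normal_errors)


def flag_anomalies(
    windows: Sequence[Vector],
    reconstruct: Callable[[Vector], Vector],
    threshold: float,
) -> List[bool]:
    """True for every window whose reconstruction error exceeds the threshold."""
    return [reconstruction_error(w, reconstruct(w)) > threshold for w in windows]


def mean_reconstruct(w: Vector) -> List[float]:
    """Toy stand-in for an autoencoder: it can only reproduce the window mean."""
    m = mean(w)
    return [m] * len(w)


# Fit the threshold on windows assumed to be normal, then score new windows.
normal = [[1.0, 1.2, 0.8, 1.0], [1.0, 1.1, 0.9, 1.0], [1.0, 1.0, 1.0, 1.0]]
errors = [reconstruction_error(w, mean_reconstruct(w)) for w in normal]
threshold = fit_threshold(errors)

new_windows = [[1.0, 1.1, 0.9, 1.0], [1.0, 9.0, 1.0, 1.0]]
flags = flag_anomalies(new_windows, mean_reconstruct, threshold)
```

Here the second window contains a spike that the stand-in model cannot reproduce, so only that window is flagged.
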
Project Challenges
[Figure: Apache 300 return codes over time (two years)]
• Better preprocessing of the input data required
• Evolution of cluster characteristics over time requires frequent retraining
• Convergence issues make retraining hard (vanishing gradient problem)

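The vanishing gradient problem named in the last bullet can be illustrated with a toy calculation. This is a sketch only, not the talk's model: it assumes a stack of scalar sigmoid layers with a shared weight, and `input_gradient` is a hypothetical helper. Because the sigmoid's derivative never exceeds 0.25, the chain-rule product shrinks roughly geometrically with depth when the weight is 1.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(depth: int, x: float = 0.0, weight: float = 1.0) -> float:
    """d(output)/d(input) for `depth` stacked scalar sigmoid layers.

    Each layer computes a = sigmoid(weight * a_prev), whose derivative with
    respect to a_prev is weight * a * (1 - a). The chain rule multiplies
    these per-layer factors together.
    """
    grad = 1.0
    a = x
    for _ in range(depth):
        a = sigmoid(weight * a)
        grad *= weight * a * (1.0 - a)
    return grad


for depth in (1, 5, 10, 20):
    print(depth, input_gradient(depth))
```

With gradients this small, the earliest layers barely change during training, which is one way a deep network can fail to converge.
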
Data Preprocessing
[Figure: two panels, "Raw data" and "Preprocessed data"]

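Long Short-Term Memory Neural Networks
[Figure: unrolled LSTM chain with inputs X_{t-1}, X_t, X_{t+1} and hidden states h_{t-1}, h_t, h_{t+1}; the cell shows gate activations combined with pointwise × and + operations]
Source: Colah's blog post "Understanding LSTM Networks"
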
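One time step of the LSTM cell pictured on this slide can be sketched as follows. This is the standard textbook formulation (forget, input and output gates plus a candidate cell state), reduced to scalar inputs and states to stay dependency-free. The class names `Gate` and `LSTMCell` and all weight values are hypothetical, and nothing here reflects the talk's actual architecture or hyperparameters.

```python
import math
from dataclasses import dataclass
from typing import Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


@dataclass
class Gate:
    """Weights of one gate for a scalar input and a scalar hidden state."""
    w_x: float  # weight on the current input x_t
    w_h: float  # weight on the previous hidden state h_{t-1}
    b: float    # bias

    def pre_activation(self, x: float, h_prev: float) -> float:
        return self.w_x * x + self.w_h * h_prev + self.b


@dataclass
class LSTMCell:
    forget_gate: Gate
    input_gate: Gate
    candidate: Gate
    output_gate: Gate

    def step(self, x: float, h_prev: float, c_prev: float) -> Tuple[float, float]:
        """Return (h_t, c_t) given x_t, h_{t-1} and c_{t-1}."""
        f = sigmoid(self.forget_gate.pre_activation(x, h_prev))    # how much old state to keep
        i = sigmoid(self.input_gate.pre_activation(x, h_prev))     # how much new candidate to admit
        c_tilde = math.tanh(self.candidate.pre_activation(x, h_prev))
        c = f * c_prev + i * c_tilde                               # additive cell-state update
        o = sigmoid(self.output_gate.pre_activation(x, h_prev))    # how much state to expose
        h = o * math.tanh(c)
        return h, c


# Run the cell over a short sequence, carrying (h, c) between steps.
cell = LSTMCell(
    forget_gate=Gate(0.5, 0.1, 1.0),
    input_gate=Gate(0.3, 0.2, 0.0),
    candidate=Gate(0.8, 0.4, 0.0),
    output_gate=Gate(0.6, 0.1, 0.0),
)
h, c = 0.0, 0.0
for x in [0.1, 0.4, 0.2]:
    h, c = cell.step(x, h, c)
```

The cell state is updated additively (`f * c_prev + i * c_tilde`), so when the forget gate stays close to 1 the state, and the gradient flowing through it, is carried across time steps largely unchanged.
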
Model Evaluation
Ongoing work:
• Compare to
  § non-ML methods, e.g. moving average
  § other ML methods, e.g. (Extended) isolation forests
• Index anomaly scores to Elasticsearch and create a dashboard
Achievement: Much better convergence than the previous model
Future work:
• Classification of anomalous events
• Apply the method to other services

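The moving-average baseline mentioned under ongoing work could look roughly like this. It is a sketch under assumptions: the talk does not specify the baseline, so the rolling z-score formulation, the window size, the threshold of 3, and the name `rolling_zscores` are all illustrative choices.

```python
from collections import deque
from statistics import mean, pstdev
from typing import List, Optional, Sequence


def rolling_zscores(values: Sequence[float], window: int) -> List[Optional[float]]:
    """Score each point by its distance from the mean of the preceding
    `window` points, in units of their standard deviation.

    Points without a full window of history get None.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    history: deque = deque(maxlen=window)
    scores: List[Optional[float]] = []
    for v in values:
        if len(history) < window:
            scores.append(None)  # not enough history yet
        else:
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0:
                scores.append(abs(v - mu) / sigma)
            else:
                # Flat history: any deviation at all is infinitely surprising.
                scores.append(0.0 if v == mu else float("inf"))
        history.append(v)
    return scores


series = [10, 11, 9, 10, 11, 10, 50, 10]
scores = rolling_zscores(series, window=5)
anomalies = [i for i, s in enumerate(scores) if s is not None and s > 3.0]
```

Note that a spike stays in the window for the next `window` points and inflates the standard deviation there, which makes this baseline less sensitive right after an anomaly.
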
Thank you for listening
Contact me: Jennifer Andersson
jennifer.r.andersson@gmail.com