Anomaly Detection in Data Science
One-class Classification with Privileged Information for Malware Detection
Pavel Erofeev, IITP RAS, Airbus Group Russia
Find the Panda
Anomaly Detection: Hadlum vs Hadlum
◎ Mrs. Hadlum gave birth 349 days after Mr. Hadlum left for military service
◎ The average human pregnancy lasts 280 days (40 weeks)
◎ Statistically, 349 days is an outlier
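The arithmetic behind the Hadlum case can be sketched as a z-score. The 280-day mean comes from the slide; the 13-day standard deviation is an assumed, purely illustrative value that the talk does not give.

```python
# Z-score sketch for the Hadlum case.
MEAN_DAYS = 280.0         # average pregnancy length, from the slide
ASSUMED_STD_DAYS = 13.0   # assumption: illustrative spread, not from the talk


def z_score(observed_days, mean=MEAN_DAYS, std=ASSUMED_STD_DAYS):
    """Number of standard deviations the observation lies from the mean."""
    return (observed_days - mean) / std


excess_days = 349 - MEAN_DAYS   # days beyond the average pregnancy
z = z_score(349)                # larger |z| means a more extreme observation
print(f"excess = {excess_days:.0f} days, z = {z:.2f}")
```

Under any plausible spread the observation sits several standard deviations from the mean, which is what makes it "suspicious" in the sense of the slide.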
“An outlier is an observation which deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism.” Hawkins, 1980
Defining Anomaly Detection
◎ Digital representation: vectors describing observations
◎ Mixture of “nominal” and “abnormal” points
◎ Anomaly points are generated by a different generative process than the nominal points
Possible Settings in CS
◎ Supervised (known attacks)
  ○ Training data labeled with “nominal” or “anomaly”
◎ Clean (zero-day attacks)
  ○ Training data are all “nominal”; test data may be contaminated with “anomaly” points
◎ Unsupervised (unknown attacks)
  ○ Training data consist of a mixture of “nominal” and “anomaly” points
Real World Data Problems
◎ Data is multivariate
◎ There is usually more than one generating mechanism underlying the “normal” data
◎ Anomalies may represent a different class of objects, so there can be many of them
◎ The definition of what counts as an anomaly is domain specific
◎ Normality evolves in time
Anomaly Taxonomy: Point Anomaly
Anomaly Taxonomy: Contextual Anomaly
Anomaly Taxonomy: Causal Anomaly
Taxonomy
Imbalanced classification
■ Normal data: a lot of samples
■ Abnormal data: very few samples
■ Standard methods do not work as expected
■ Standard metrics do not apply
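Why standard metrics mislead on imbalanced data can be shown with a small sketch. The 990/10 class split and the "always predict normal" classifier are hypothetical, chosen only to make the effect visible.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) treating `positive` as the anomaly label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def recall(y_true, y_pred):
    """Fraction of true anomalies that were detected."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical data: 990 normal samples (0) and 10 anomalies (1).
y_true = [0] * 990 + [1] * 10
# A useless classifier that labels everything as normal.
y_pred = [0] * 1000

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy = {acc:.2f}, anomaly recall = {rec:.2f}")
```

The classifier looks excellent by accuracy while detecting no anomalies at all, so metrics focused on the rare class (recall, precision, or ranking-based scores) are more informative here.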
Imbalanced classification
◎ Weights for classes
  ○ Proved not to be helpful in most cases
◎ Resampling methods
  ○ Oversampling (Bootstrap, SMOTE, etc.)
  ○ Undersampling
◎ How to choose which method to use?
◎ How to choose the resampling parameter?
  ○ We compared several methods
  ○ We proposed a meta-model that on average gives the best results [Papanov, Erofeev, Burnaev, 2015]
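The two simplest resampling methods listed above can be sketched as follows. This is a minimal illustration, not the meta-model from the cited paper; the `ratio` argument (desired minority-to-majority size ratio) is a hypothetical way to express the resampling parameter.

```python
import random


def oversample(minority, majority_size, ratio, rng):
    """Bootstrap oversampling: add minority points drawn with replacement
    until len(result) / majority_size reaches `ratio`."""
    target = max(len(minority), round(ratio * majority_size))
    extra = [rng.choice(minority) for _ in range(target - len(minority))]
    return minority + extra


def undersample(majority, minority_size, ratio, rng):
    """Random undersampling: keep a subset of the majority, drawn without
    replacement, so that minority_size / len(result) reaches `ratio`."""
    target = min(len(majority), round(minority_size / ratio))
    return rng.sample(majority, target)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
majority = [("normal", i) for i in range(1000)]   # hypothetical data
minority = [("anomaly", i) for i in range(20)]    # hypothetical data

over = oversample(minority, len(majority), ratio=0.5, rng=rng)
under = undersample(majority, len(minority), ratio=0.5, rng=rng)
print(len(over), len(under))
```

SMOTE differs from bootstrap oversampling in that it synthesizes new minority points by interpolating between neighbouring minority samples instead of repeating existing ones.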
Statistics-based models
◎ Assumption about how normal data are generated (e.g. a Gaussian distribution)
◎ PCA is commonly used to extract the highest-variance combinations of features in the data
◎ PCA-based anomaly detection works well in highly correlated environments
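A minimal sketch of PCA-based anomaly scoring, assuming 2-D data and a single retained component: the score is the distance of a point from the first principal axis (its reconstruction error). The data and function names are hypothetical; real use would rely on a linear-algebra library and more components.

```python
import math


def principal_axis(points):
    """Mean and unit vector of the first principal component for 2-D points,
    using the closed-form eigenvector angle of a 2x2 covariance matrix."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))


def residual(point, mean, axis):
    """Reconstruction error: length of the component of (point - mean)
    that is perpendicular to the principal axis."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return abs(dx * axis[1] - dy * axis[0])


# Hypothetical nominal data: y is roughly 2x, i.e. highly correlated features.
nominal = [(float(x), 2.0 * x + 0.1 * (-1) ** x) for x in range(10)]
mean, axis = principal_axis(nominal)

max_nominal = max(residual(p, mean, axis) for p in nominal)
anomaly_score = residual((5.0, 0.0), mean, axis)  # breaks the correlation
print(f"max nominal residual = {max_nominal:.3f}, anomaly = {anomaly_score:.3f}")
```

The point (5, 0) has ordinary values in each coordinate taken separately; it stands out only because it violates the correlation structure, which is the situation where PCA-based detection helps.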
Density-based models
◎ SVM-based and nearest-neighbour-based
◎ How to choose the best kernel parameter?
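A nearest-neighbour density score can be sketched in a few lines: the distance to the k-th nearest reference point is large where the data are sparse. The grid of nominal points and the choice k = 3 are hypothetical.

```python
import math


def knn_score(x, reference, k):
    """Anomaly score: distance from x to its k-th nearest neighbour
    in the reference set (larger means lower local density)."""
    dists = sorted(math.dist(x, r) for r in reference)
    return dists[k - 1]


# Hypothetical nominal data: a 5 x 5 grid of points.
reference = [(float(i), float(j)) for i in range(5) for j in range(5)]

score_near = knn_score((2.2, 2.1), reference, k=3)   # inside the cloud
score_far = knn_score((10.0, 10.0), reference, k=3)  # far from the cloud
print(f"near = {score_near:.3f}, far = {score_far:.3f}")
```

When scoring a point that is itself part of the reference set, its zero distance to itself should be excluded, otherwise the score is biased downwards.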
One-class SVM with Privileged Information
Evgeny Burnaev, Dmitry Smolyakov
Skoltech, IITP RAS
One-Class SVM
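For reference, the standard ν-formulation of the one-class SVM (Schölkopf et al.) is given below. The notation is an assumption and may differ from the one used in the talk: φ is the feature map, n the number of training points, and ν ∈ (0, 1] bounds the fraction of training points treated as outliers.

```latex
\min_{w,\,\xi,\,\rho}\quad
  \frac{1}{2}\,\|w\|^{2} + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho
\qquad\text{s.t.}\quad
  \langle w, \phi(x_i)\rangle \ge \rho - \xi_i,\quad \xi_i \ge 0,\quad i = 1,\dots,n

f(x) = \operatorname{sign}\bigl(\langle w, \phi(x)\rangle - \rho\bigr)
```

Points with f(x) = +1 fall on the nominal side of the separating hyperplane; the slack variables ξ_i allow some training points to fall on the other side.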
One-Class SVM Kernel Trick
Kernel Trick
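The kernel trick can be checked numerically on a small case: for the homogeneous polynomial kernel of degree 2 in two dimensions, the kernel value equals an inner product under an explicit feature map. The vectors below are hypothetical.

```python
import math


def poly2_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2: K(x, y) = (x . y)^2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def phi(x):
    """Explicit feature map for the same kernel in 2-D:
    phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


x, y = (1.0, 2.0), (3.0, -1.0)
k_implicit = poly2_kernel(x, y)    # computed in the original 2-D space
k_explicit = dot(phi(x), phi(y))   # computed in the 3-D feature space
print(k_implicit, k_explicit)
```

The kernel gives the same value without ever constructing phi, which is what makes very high-dimensional (or infinite-dimensional, as with the RBF kernel) feature spaces tractable.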
Hyper-parameter Influence
Decision Functions
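A sketch of how a kernel one-class SVM decision function is evaluated, and how the RBF width parameter changes it. The support vectors, coefficients, offset, and gamma values are all hypothetical; in practice they come from solving the training problem.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def decision_function(x, support_vectors, alphas, rho, gamma):
    """f(x) = sum_i alpha_i * K(x_i, x) - rho; positive means nominal."""
    total = sum(a * rbf_kernel(sv, x, gamma)
                for sv, a in zip(support_vectors, alphas))
    return total - rho


# Hypothetical trained model (values chosen for illustration only).
support_vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
alphas = [0.4, 0.3, 0.3]
rho = 0.5

inside = decision_function((0.3, 0.3), support_vectors, alphas, rho, gamma=1.0)
outside = decision_function((4.0, 4.0), support_vectors, alphas, rho, gamma=1.0)
# Same point, much narrower kernel: the nominal region shrinks around
# the support vectors and the point is now flagged.
inside_narrow = decision_function((0.3, 0.3), support_vectors, alphas, rho,
                                  gamma=10.0)
print(f"{inside:.3f} {outside:.3f} {inside_narrow:.3f}")
```

With the model held fixed, a larger gamma makes each support vector's influence more local, so the boundary wraps the training points more tightly; a smaller gamma gives a smoother, wider nominal region.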
Learning with Privileged Info
Example: image classification with textual description
Learning with Privileged Info
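As background, the two-class SVM+ formulation of Vapnik and Vashist, the usual starting point for learning with privileged information, is sketched below. Presenting it here is an assumption about what the slides show; the one-class variant discussed in the talk adapts the same idea. Here x_i are the original features, x_i^* the privileged features available only at training time, and γ, C are regularization parameters.

```latex
\min_{w,\,b,\,w^{*},\,b^{*}}\quad
  \frac{1}{2}\,\|w\|^{2} + \frac{\gamma}{2}\,\|w^{*}\|^{2}
  + C\sum_{i=1}^{n}\bigl(\langle w^{*}, x_i^{*}\rangle + b^{*}\bigr)

\text{s.t.}\quad
  y_i\bigl(\langle w, x_i\rangle + b\bigr) \ge 1 - \bigl(\langle w^{*}, x_i^{*}\rangle + b^{*}\bigr),
  \qquad
  \langle w^{*}, x_i^{*}\rangle + b^{*} \ge 0,
  \qquad i = 1,\dots,n
```

The slack of each training point is modelled as a function of its privileged features, so the privileged space guides training while the final decision rule uses only the original features.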
Microsoft Malware Classification Challenge
Kaggle competition data (2015)
Problem Description
◎ 9 malware families
  ○ Ramnit, Lollipop, Kelihos_ver3, Vundo, Simda, Tracur, Kelihos_ver1, Obfuscator.ACY, Gatak
◎ Raw data
  ○ Hexadecimal representation of the raw binary content
  ○ Meta-data extracted from the binaries, including function calls, strings, etc.
Features
◎ Original features
  ○ Information from binary files such as
    ◉ Frequencies of bytes
    ◉ Number of different N-grams, etc.
◎ Privileged features
  ○ Information from the disassembled code such as
    ◉ Frequencies of commands
    ◉ Number of calls to external DLLs
  ○ Bytecode as an image
    ◉ Features based on image texture, which are commonly used for image classification
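The byte-level features listed above can be sketched as follows. This is an illustration under assumptions, not the authors' feature pipeline: the function names, the sample byte string, and the image width are hypothetical.

```python
from collections import Counter


def byte_frequencies(data):
    """Relative frequency of each of the 256 possible byte values."""
    if not data:
        return [0.0] * 256
    counts = Counter(data)
    return [counts.get(b, 0) / len(data) for b in range(256)]


def distinct_ngrams(data, n):
    """Number of different byte n-grams occurring in the data."""
    return len({data[i:i + n] for i in range(len(data) - n + 1)})


def to_image_rows(data, width):
    """Arrange bytes as rows of grayscale pixel values (0-255);
    an incomplete final row is dropped."""
    rows = len(data) // width
    return [list(data[r * width:(r + 1) * width]) for r in range(rows)]


# Hypothetical sample standing in for the raw content of one binary.
sample = bytes.fromhex("558bec558bec90")

freqs = byte_frequencies(sample)
n_bigrams = distinct_ngrams(sample, 2)
image = to_image_rows(sample, width=3)
print(freqs[0x55], n_bigrams, image)
```

Texture descriptors would then be computed on the resulting pixel grid, while command frequencies and DLL call counts require the disassembled code rather than the raw bytes.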
Features
Experimental Setup
Results
Thanks! Any questions?
pavel.erofeev@phystech.e