CS 685 Presentation
Data Mining for Network Intrusion Detection
Paul Dokas, Levent Ertoz, Vipin Kumar, Aleksandar Lazarevic, Jaideep Srivastava, Pang-Ning Tan
Computer Science Department, University of Minnesota
Presented by: Song.Yuan@uky.edu

Outline
  • Motivation
  • Related Work
  • Detection Models and Approaches
  • Experimental Evaluation
  • Conclusion

Motivation
  • Organizations are becoming increasingly vulnerable to potential cyber threats, e.g., network intrusions.
[Figure: cyber incidents reported to CERT/CC]

Motivation (cont.)
  • Intrusion Detection System (IDS)
      • collect signatures of known attacks
      • input attack signatures into IDS signature databases
      • extract features from various audit streams
      • compare these features with attack signatures
      • raise an alarm when a possible intrusion occurs
  • Limitations of traditional signature-based methods
      • manual update of signature database
      • inability to detect emerging cyber threats
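The signature-based workflow above can be sketched in a few lines. This is only an illustration, not the design of any real IDS: the signature names, the feature names (`syn_count`, `ack_count`, `distinct_ports`) and the thresholds are all hypothetical.

```python
# Hypothetical signature database: signature name -> rule over a feature record.
# Feature names and thresholds are invented for illustration only.
SIGNATURES = {
    "syn_flood": lambda f: f["syn_count"] > 100 and f["ack_count"] == 0,
    "port_scan": lambda f: f["distinct_ports"] > 50,
}


def match_signatures(features, signatures=SIGNATURES):
    """Return the names of all signatures that the feature record matches."""
    return sorted(name for name, rule in signatures.items() if rule(features))


# Features extracted from an audit stream (hypothetical values).
record = {"syn_count": 500, "ack_count": 0, "distinct_ports": 3}
alerts = match_signatures(record)  # raise the alarm for each matched signature
```

A record that matches no stored signature produces no alert, which is exactly the limitation listed on the slide: an emerging attack stays invisible until someone adds a signature for it by hand.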

Motivation (cont.)
Why data mining?
  • large volumes of network data
  • different data mining techniques: clustering, classification

Related Work
Data mining based intrusion detection techniques
  • anomaly detection
      • Build models of normal data
      • Detect any deviation from normal data
      • Flag deviation as suspect
      • Identify new types of intrusions as deviation from normal behavior
  • misuse detection
      • Label all instances in the data set (“normal” or “intrusion”)
      • Run learning algorithms over the labeled data to generate classification rules
      • Automatically retrain intrusion detection models on different input data
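As a minimal sketch of the anomaly detection steps above (build a model of normal data, then flag deviations), the following uses a single-feature mean and standard deviation profile. The profile, the z-score test, the threshold of 3 and the sample byte counts are assumptions chosen for illustration; the slides do not prescribe a specific model.

```python
from statistics import mean, pstdev


def fit_normal_profile(values):
    """Model of normal data: mean and population standard deviation."""
    return mean(values), pstdev(values)


def is_suspect(x, profile, threshold=3.0):
    """Flag x when it deviates from the normal profile by more than
    `threshold` standard deviations (threshold is an assumed value)."""
    mu, sigma = profile
    if sigma == 0:
        return x != mu
    return abs(x - mu) / sigma > threshold


# Hypothetical bytes-per-connection observed during normal operation.
normal_bytes = [480, 510, 495, 505, 500, 490, 520]
profile = fit_normal_profile(normal_bytes)
```

No labels are needed here, which is what allows this style of detector to flag previously unseen intrusions; misuse detection, by contrast, trains a classifier on labeled records.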

Related Work --- misuse detection
  • Classification Model
      • Bayesian classifier
      • Decision tree
      • Association rule
      • Support vector machine
      • Learning from rare class

Related Work --- anomaly detection
  • Anomaly Detection Model
      • Association rule
      • Neural network
      • Unsupervised SVM
      • Outlier detection

Detection Models
  • misuse detection
      • rare class prediction model
      • known intrusions and their variations
  • anomaly detection
      • outlier detection model
      • novel attacks whose nature is unknown

Learning from Rare Class
  • Problem: classification model for dataset with skewed class distribution (intrusion class << normal class)
  • Mining a needle in a haystack

Learning from Rare Class (cont.)
  • Novel classification algorithms
      • PN-rule
          • P-rule: covers most of the intrusive examples
          • N-rule: eliminates false alarms
      • SMOTEBoost
          • SMOTE (Synthetic Minority Over-sampling TEchnique)
          • Boosting
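To make the SMOTE step concrete, here is a rough sketch of how synthetic minority (intrusion) samples can be generated by interpolating between a minority sample and one of its nearest minority neighbors. It covers only the over-sampling part, not the boosting loop of SMOTEBoost; the function name, the parameters `k` and `seed`, and the sample points are assumptions, and the sketch expects at least two minority samples.

```python
import math
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Create n_synthetic points, each on the line segment between a random
    minority sample and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen sample (excluding itself).
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Hypothetical 2-D feature vectors of the rare (intrusion) class.
minority = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
new_points = smote(minority, n_synthetic=5)
```

Because every synthetic point lies between two real minority samples, the rare class is enlarged without simply duplicating records.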

Anomaly Detection
  • Novel attacks/intrusions: deviation from normal behavior
  • Outlier detection algorithm
      • Nearest neighbor approach
      • Distance based approach
      • Density based approach
      • Unsupervised support vector machines
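One simple reading of the nearest neighbor approach is to score each point by its distance to its k-th nearest neighbor, so isolated points get large scores. The sketch below assumes Euclidean distance, k=2 and made-up 2-D points; the slides do not fix any of these choices.

```python
import math


def kth_nn_distance(point, data, k):
    """Outlier score: distance from `point` to its k-th nearest neighbor."""
    dists = sorted(
        math.dist(point, other) for other in data if other is not point
    )
    return dists[k - 1]


# Four points forming a tight cluster plus one isolated point (hypothetical).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
scores = {p: kth_nn_distance(p, points, k=2) for p in points}
most_anomalous = max(scores, key=scores.get)
```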

Anomaly Detection
  • Density based approach (LOF)
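The local outlier factor (LOF) compares the local density around a point with the density around its neighbors; values near 1 indicate an inlier, larger values an outlier. The following is a small, unoptimized sketch of the standard LOF definition, assuming Euclidean distance and no duplicate points (duplicates would make a reachability sum zero). The function name and sample data are illustrative.

```python
import math


def lof_scores(points, k):
    """Local outlier factor of every point, using k nearest neighbors."""
    n = len(points)
    dist = [[math.dist(a, b) for b in points] for a in points]

    k_dist, neigh = [], []
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]          # distance to the k-th nearest neighbor
        k_dist.append(kd)
        neigh.append([j for d, j in others if d <= kd])  # ties included

    def lrd(i):
        # Local reachability density: inverse of the mean reachability
        # distance from point i to its neighbors.
        reach = [max(k_dist[j], dist[i][j]) for j in neigh[i]]
        return len(reach) / sum(reach)

    lrds = [lrd(i) for i in range(n)]
    # LOF: average neighbor density divided by the point's own density.
    return [
        sum(lrds[j] for j in neigh[i]) / (len(neigh[i]) * lrds[i])
        for i in range(n)
    ]


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
scores = lof_scores(points, k=2)
```

Unlike a plain distance threshold, the score is relative to the neighborhood, so a point near a sparse cluster is not penalized just because that cluster is sparse.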

Anomaly Detection
  • Identify normal behavior
  • Construct useful set of features
  • Define similarity function
  • Flag deviation as suspect

Experimental Evaluation
  • Public data set
      • DARPA 1998 Intrusion Detection Evaluation Data Set
          • prepared and managed by MIT Lincoln Lab
          • training data and test data
      • KDD Cup 1999 Data
          • the extension of DARPA’98
          • training data and test data
  • Real network data
      • Network data from University of Minnesota

Experimental Evaluation --- feature construction
Purpose: more informative data set from public data set
Method:
  • connection records
  • label connection records ‘normal’ or ‘intrusion’
  • features for each connection record
      • # of {packets, bytes}, {ACK, Re-Tx} packets, SYN/FIN, …
      • time-based features (DoS attacks)
      • connection-based features (PROBING attacks)
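A time-based feature can be sketched as a count, for each connection record, of earlier connections to the same destination host within a short time window; a burst of such connections is typical of DoS traffic. The record layout `(timestamp, dst_host)`, the function name, the 2-second default window and the sample addresses are assumptions for illustration.

```python
def same_host_counts(connections, window=2.0):
    """For each connection, count the earlier connections to the same
    destination host that started at most `window` seconds before it.

    `connections` is a list of (timestamp, dst_host), sorted by timestamp.
    """
    counts = []
    for i, (t, dst) in enumerate(connections):
        n = sum(
            1
            for t2, d2 in connections[:i]
            if d2 == dst and t - t2 <= window
        )
        counts.append(n)
    return counts


# Hypothetical connection records.
conns = [
    (0.0, "10.0.0.5"),
    (0.5, "10.0.0.5"),
    (1.0, "10.0.0.9"),
    (1.8, "10.0.0.5"),
    (5.0, "10.0.0.5"),
]
counts = same_host_counts(conns)
```

A connection-based feature follows the same pattern but looks back over the last N connections instead of the last few seconds, which helps catch slow probing that a time window would miss.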

Experimental Evaluation --- single connection attacks
[Figure: ROC curves for single connection attacks]
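For readers unfamiliar with ROC curves: each point pairs a false alarm rate with a detection rate at one score threshold. The sketch below shows how such points can be computed from anomaly scores and ground-truth labels. It is not the evaluation code behind the slides; the scores and labels are made up, and tied scores are not grouped.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points obtained by
    lowering the detection threshold one record at a time.

    labels: 1 for intrusion, 0 for normal. Assumes both classes are present.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    for _score, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points


# Hypothetical anomaly scores and ground-truth labels.
curve = roc_points([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```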

Experimental Evaluation --- bursty attacks
[Figure: ROC curves for bursty attacks]

Experimental Evaluation --- real network data
  • Why? Limitations of DARPA’98 data set
  • How? Detect network intrusion in the live network traffic
  • Result? Successfully identify some novel intrusions (top ranked outliers)

Conclusion
  • promising intrusion detection models
  • performance of algorithm (on-line detection)
  • new classification and anomaly detection algorithms

Thanks! Questions?