CS 685 Presentation Data Mining for Network Intrusion
- Slides: 21
Data Mining for Network Intrusion Detection
Paul Dokas, Levent Ertoz, Vipin Kumar, Aleksandar Lazarevic, Jaideep Srivastava, Pang-Ning Tan
Computer Science Department, University of Minnesota
Presented by: Song.Yuan@uky.edu
Outline
• Motivation
• Related Work
• Detection Models and Approaches
• Experimental Evaluation
• Conclusion
Motivation
• Organizations are becoming increasingly vulnerable to potential cyber threats, e.g., network intrusions.
[Figure: cyber incidents reported to CERT/CC]
Motivation (cont.)
• Intrusion Detection System (IDS)
  • collect signatures of known attacks
  • input attack signatures into IDS signature databases
  • extract features from various audit streams
  • compare these features with attack signatures
  • raise the alarm when a possible intrusion happens
• Limitations of traditional signature-based methods
  • manual update of signature database
  • inability to detect emerging cyber threats
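The signature-based workflow listed above can be illustrated with a minimal sketch. The `Signature` fields, the event dictionary keys, and both example signatures are hypothetical and chosen only for illustration; a real IDS uses a much richer rule language. The sketch shows why an attack with no stored signature raises no alarm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    # Hypothetical signature format: protocol, destination port, payload pattern.
    name: str
    protocol: str
    dst_port: int
    payload_substring: bytes


def match_signatures(event, signatures):
    """Return the names of all signatures that the audit event matches."""
    return [
        sig.name
        for sig in signatures
        if event["protocol"] == sig.protocol
        and event["dst_port"] == sig.dst_port
        and sig.payload_substring in event["payload"]
    ]


# Hypothetical signature database (manually maintained in a real deployment).
SIGNATURES = [
    Signature("example-traversal", "tcp", 80, b"../.."),
    Signature("example-long-login", "tcp", 21, b"USER " + b"A" * 64),
]

# One event that matches a stored signature, one that matches none.
known = {"protocol": "tcp", "dst_port": 80, "payload": b"GET /../../etc/passwd"}
novel = {"protocol": "tcp", "dst_port": 8080, "payload": b"\x90\x90\x90"}

alerts_known = match_signatures(known, SIGNATURES)  # alarm raised
alerts_novel = match_signatures(novel, SIGNATURES)  # no alarm: signature missing
```

The second event goes unreported only because nobody has written a signature for it yet, which is the limitation the slide points to.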
Motivation (cont.)
Why data mining?
• large volumes of network data
• different data mining techniques: clustering, classification
Related Work
Data mining based intrusion detection techniques
• anomaly detection
  • Build models of normal data
  • Detect any deviation from normal data
  • Flag deviation as suspect
  • Identify new types of intrusions as deviation from normal behavior
• misuse detection
  • Label all instances in the data set ("normal" or "intrusion")
  • Run learning algorithms over the labeled data to generate classification rules
  • Automatically retrain intrusion detection models on different input data
Related Work --- misuse detection
• Classification Model
  • Bayesian classifier
  • Decision tree
  • Association rule
  • Support vector machine
  • Learning from rare class
Related Work --- anomaly detection
• Anomaly Detection Model
  • Association rule
  • Neural network
  • Unsupervised SVM
  • Outlier detection
Detection Models
• misuse detection
  • rare class prediction model
  • known intrusions and their variations
• anomaly detection
  • outlier detection model
  • novel attacks whose nature is unknown
Learning from Rare Class
• Problem: classification model for dataset with skewed class distribution
  • intrusion class << normal class
  • Mining needle in a haystack
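A small worked example may help show why a skewed class distribution is a problem. The numbers below (10 intrusions among 1000 records) are made up for illustration, and the helper names are hypothetical. A classifier that labels everything "normal" reaches 99% accuracy while detecting no intrusions, which is why recall and precision on the rare class are the more informative measures.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for the chosen positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Return accuracy plus recall and precision on the intrusion class (label 1)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Illustrative data: 10 intrusions (1) hidden among 990 normal records (0).
y_true = [1] * 10 + [0] * 990
always_normal = [0] * 1000  # trivial classifier that never raises an alarm

report = summarize(y_true, always_normal)
print(report)
```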
Learning from Rare Class (cont.)
• Novel classification algorithms
  • PN-rule
    • P-rules: cover most of the intrusive examples
    • N-rules: eliminate false alarms
  • SMOTEBoost
    • SMOTE (Synthetic Minority Over-sampling TEchnique)
    • Boosting
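The oversampling half of SMOTEBoost can be sketched as follows. This is a simplified illustration of the SMOTE idea only, not the presenters' implementation: it picks minority samples at random rather than iterating over each one, it omits the boosting step entirely, and the function name, parameters, and sample data are assumptions. Each synthetic example is placed on the line segment between a minority example and one of its `k` nearest minority neighbors.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic minority examples by interpolating between neighbors.

    minority: list of equal-length numeric tuples (at least 2 examples).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        x = minority[i]
        # The k nearest minority-class neighbors of x, excluding x itself.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(x, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbor)))
    return synthetic


# Hypothetical intrusion-class feature vectors (two features each).
intrusions = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 3.0)]
new_points = smote(intrusions, n_synthetic=6, k=2)
```

Because every synthetic point is an interpolation of two real minority examples, it stays inside the region the minority class already occupies instead of duplicating existing records.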
Anomaly Detection
• Novel attacks/intrusions: deviation from normal behavior
• Outlier detection algorithm
  • Nearest neighbor approach
  • Distance based approach
  • Density based approach
  • Unsupervised support vector machines
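As a minimal sketch of the nearest neighbor approach, one common formulation scores each point by its distance to its k-th nearest neighbor, so isolated points receive large scores. The slides do not state which exact variant was used, so the choice of `k`, the Euclidean distance, and the sample data below are assumptions.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical connection feature vectors: a tight cluster plus one far point.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_outlier_scores(data, k=2)
most_suspicious = max(range(len(data)), key=lambda i: scores[i])
```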
Anomaly Detection
• Density based approach (LOF, Local Outlier Factor)
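The LOF score can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: Euclidean distance, exactly `k` neighbors per point (ties at the k-th distance are cut arbitrarily), and no duplicate points. LOF compares the local density around a point with the density around its neighbors; values near 1 indicate an inlier, and values well above 1 indicate an outlier.

```python
import math


def lof_scores(points, k=3):
    """Compute a Local Outlier Factor for every point (simplified, no ties)."""
    n = len(points)
    # neighbors[i]: (distance, index) pairs for the k nearest neighbors of i.
    neighbors = []
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neighbors.append(dists[:k])
    # k-distance: distance from each point to its k-th nearest neighbor.
    k_dist = [nb[-1][0] for nb in neighbors]
    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean ratio of the neighbors' density to the point's own density.
    return [
        sum(lrd[j] for _, j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


# Hypothetical data: a 3x3 grid of "normal" points and one distant point.
grid = [(float(x), float(y)) for x in range(3) for y in range(3)]
data = grid + [(10.0, 10.0)]
scores = lof_scores(data, k=3)
```

On this data the grid points score close to 1, while the distant point scores far higher because its neighbors are much denser than it is.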
Anomaly Detection
• Identify normal behavior
• Construct a useful set of features
• Define similarity function
• Flag deviation as suspect
Experimental Evaluation
• Public data set
  • DARPA 1998 Intrusion Detection Evaluation Data Set
    • prepared and managed by MIT Lincoln Lab
    • training data and test data
  • KDD Cup 1999 Data
    • the extension of DARPA'98
    • training data and test data
• Real network data
  • Network data from University of Minnesota
Experimental Evaluation --- feature construction
Purpose: more informative data set from public data set
Method:
• connection records
• label connection records 'normal' or 'intrusion'
• features for each connection record
  • # of {packets, bytes}, {ACK, Re-Tx} packets, SYN/FIN, …
  • time-based features (DoS attacks)
  • connection-based features (PROBING attacks)
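One plausible way to derive time-based and connection-based features from connection records is sketched below. The slides do not define the exact features, so the record keys (`time`, `src`, `dst`), the two feature names, and the window sizes are all assumptions. The time-based feature counts earlier connections to the same destination within the last few seconds (a burst typical of DoS), while the connection-based feature counts earlier connections from the same source among the last N records (which can catch slow probing that a short time window misses).

```python
def add_window_features(connections, window_seconds=5.0, window_conns=3):
    """Add two hypothetical window features to time-sorted connection records.

    connections: list of dicts with keys 'time', 'src', 'dst', sorted by 'time'.
    """
    enriched = []
    for i, conn in enumerate(connections):
        earlier = connections[:i]
        # Time-based window: earlier records within the last window_seconds.
        in_time = [p for p in earlier if conn["time"] - p["time"] <= window_seconds]
        # Connection-based window: the last window_conns earlier records.
        in_conns = earlier[max(0, i - window_conns):]
        enriched.append({
            **conn,
            "same_dst_in_T": sum(p["dst"] == conn["dst"] for p in in_time),
            "same_src_in_N": sum(p["src"] == conn["src"] for p in in_conns),
        })
    return enriched


# Hypothetical connection records.
records = [
    {"time": 0.0, "src": "a", "dst": "v"},
    {"time": 1.0, "src": "b", "dst": "v"},
    {"time": 2.0, "src": "c", "dst": "v"},
    {"time": 60.0, "src": "a", "dst": "w"},
]
features = add_window_features(records)
```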
Experimental Evaluation --- single connection attacks
[Figure: ROC curves for single connection attacks]
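For readers unfamiliar with ROC curves, the sketch below shows how the points of such a curve are computed from outlier scores and ground-truth labels. The scores and labels are made up for illustration and are not the presentation's results; the sketch also assumes distinct scores (no ties) and at least one record of each class. Each point pairs the false alarm rate with the detection rate obtained when every record scoring at or above a given threshold is flagged.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs.

    labels: 1 = intrusion, 0 = normal; a higher score means more suspicious.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Lower the threshold one record at a time, from most to least suspicious.
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


# Illustrative scores only.
example_scores = [0.9, 0.8, 0.3, 0.1]
example_labels = [1, 0, 1, 0]
curve = roc_points(example_scores, example_labels)
```

A curve that rises toward the top-left corner indicates a detector that catches most intrusions at a low false alarm rate.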
Experimental Evaluation --- bursty attacks
[Figure: ROC curves for bursty attacks]
Experimental Evaluation --- real network data
• Why? Limitations of DARPA'98 data set
• How? Detect network intrusion in the live network traffic
• Result? Successfully identify some novel intrusions (top ranked outliers)
Conclusion
• promising intrusion detection models
• performance of algorithm (on-line detection)
• new classification and anomaly detection algorithms
Thanks! Questions?