Masquerade Detection
Mark Stamp

- Slides: 34
Masquerade Detection
- Masquerader: someone who makes unauthorized use of a computer
- How to detect a masquerader?
- Here, we consider…
  - Anomaly-based intrusion detection (IDS)
  - Detection is based on UNIX commands
- Lots and lots of prior work on this problem
- We attempt to apply profile hidden Markov models (PHMMs)
  - For comparison, we also implement other techniques (HMM and N-gram)

Schonlau Data Set
- Schonlau et al. collected a large data set
- Contains UNIX commands for 50 users
  - 50 files, one for each user
  - Each file has 15k commands: 5k from the user plus 10k for masquerade test data
  - Test data: 100 blocks, 100 commands each
- Data set includes a map file
  - 100 rows (test blocks), 50 columns (users)
  - 0 if block is user data, 1 if masquerade data

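The layout described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes the first 5k commands in each file are the user's own data, with the 10k test commands following in order.

```python
def split_user_file(commands, train_size=5000, block_size=100):
    """Split one user's command list into training data and test blocks.

    Assumption: the first `train_size` commands are the user's own data
    and the rest is test data, cut into consecutive blocks.
    """
    train = commands[:train_size]
    test = commands[train_size:]
    blocks = [test[i:i + block_size] for i in range(0, len(test), block_size)]
    return train, blocks


def masquerade_blocks(map_rows, user_index):
    """Return indices of test blocks marked 1 (masquerade) for one user.

    `map_rows` is the map file as a list of rows (one per test block),
    each row holding one 0/1 entry per user.
    """
    return [i for i, row in enumerate(map_rows) if row[user_index] == 1]
```

With a 15k-command file this gives 5000 training commands and 100 test blocks of 100 commands each, matching the counts on the slide.
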
Schonlau Data Set
- Map file structure [figure]
- This data set has been used for many studies
  - Approximately 50 published papers

Previous Work
- Approaches to masquerade detection
  - Information theoretic
  - Text mining
  - Hidden Markov models (HMM)
  - Naïve Bayes
  - Sequences and bioinformatics
  - Support vector machines (SVM)
  - Other approaches
- We briefly look at each of these

Information Theoretic
- Original work by Schonlau included a compression technique
- Based on theory (hope?) that legitimate commands compress more than attack commands
- Results were disappointing
- Some additional recent work
  - Still not competitive with best approaches

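To make the compression idea concrete, here is a minimal sketch. It is not Schonlau's actual method: it simply uses Python's `zlib` and scores a test block by how many extra compressed bytes it costs when appended to the training data. The function names are hypothetical.

```python
import zlib


def compressed_size(commands):
    """Size in bytes of the zlib-compressed, space-joined command list."""
    return len(zlib.compress(" ".join(commands).encode("utf-8")))


def compression_score(train, block):
    """Extra compressed bytes needed once `block` is appended to `train`.

    A block that resembles the training data should add few bytes;
    an unfamiliar block should add more.
    """
    return compressed_size(train + block) - compressed_size(train)
```

A higher score would then be treated as more suspicious, with the threshold chosen per user.
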
Text Mining
- A few papers in this area
- One approach extracts repetitive sequences from training data
- Another paper uses principal component analysis (PCA)
  - Method of “exploratory data analysis”
  - Good results on Schonlau data set
  - But high cost during training phase

Hidden Markov Models
- Several authors have used HMMs
  - One of the best known approaches
- We have implemented an HMM detector
- We do sensitivity analysis on the parameters
  - In particular, determine optimal N (number of hidden states)
- We also use HMMs for comparison with our PHMM results

Naïve Bayes
- In simplest form, relies only on command frequencies
  - That is, no sequence info is used
- Several papers analyze this approach
  - Among the simplest approaches
  - And, results are good

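A frequency-only scorer of this kind might look like the sketch below. The additive smoothing constant `alpha` and the function names are assumptions made for illustration; the published naïve Bayes detectors differ in their details.

```python
import math
from collections import Counter


def train_frequencies(train, vocabulary, alpha=0.01):
    """Estimate smoothed command probabilities from one user's training data.

    `vocabulary` must contain every command that can appear in a test block.
    `alpha` is an assumed additive smoothing constant.
    """
    counts = Counter(train)
    total = len(train) + alpha * len(vocabulary)
    return {cmd: (counts[cmd] + alpha) / total for cmd in vocabulary}


def block_log_likelihood(block, probs):
    """Sum of log-probabilities of the commands; command order is ignored."""
    return sum(math.log(probs[cmd]) for cmd in block)
```

Because the score is a sum over commands, any reordering of a block gives the same value, which is exactly the "no sequence info" property noted above.
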
Sequences
- In a sense, this is the opposite extreme from naïve Bayes
  - Naïve Bayes only considers frequency stats
  - Sequence/bioinformatics focused on sequence-related information
- Schonlau’s original work included elementary sequence-based analysis

Bioinformatics
- We are aware of only one previous paper that uses a bioinformatics approach
  - Uses Smith-Waterman algorithm to create local alignments
  - Alignments then used directly for detection
- In contrast, we do pairwise alignments, MSA, PHMM
  - PHMM is used for scoring (forward algorithm)
  - Our scoring is much more efficient
  - Also, our results are at least as strong

Support Vector Machines
- Support vector machines (SVM)
  - Machine learning technique
  - Separate data points (i.e., classify) based on hyperplanes in high dimensional space
  - Original data mapped to higher dimension, where separation is likely easier
- SVMs maximize separation
  - And have low computational costs
  - Used for classification and regression analysis

SVMs & Masquerade Detection
- SVMs have been applied to masquerade detection problem
- Results are good
  - Comparable to naïve Bayes
- Recent work using SVMs focused on improved efficiency

Other Approaches
- The following have also been studied
  - Detect using low frequency commands
  - Detect using high frequency commands
  - Hybrid Bayes “one step Markov”
    - Natural to consider hybrid approaches
  - Multistep Markov
    - Markov process of order greater than 1
- None of these has been particularly successful

Other Approaches (Continued)
- Non-negative matrix factorization (NMF)
  - At least 2 papers on this topic
  - Appears to be competitive
- Other hybrids that attempt to combine several approaches
  - So far, no significant improvement over individual techniques

HMMs
- See previous presentation

HMM for Masquerade Detection
- Using the Schonlau data set we…
  - Train HMM for each user
  - Set thresholds
  - Test the models and plot results
- Note that this has been done before
- Here, we perform sensitivity analysis
  - That is, we test different numbers of hidden states, N
- Also use it for comparison with PHMM

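Scoring a test block against a trained HMM is typically done with the forward algorithm. The sketch below is an assumed, minimal version with per-step scaling to avoid underflow; training (for example with Baum-Welch) is not shown, and the parameter names `pi`, `A`, and `B` follow the usual HMM notation rather than anything fixed by these slides.

```python
import math


def hmm_log_likelihood(obs, pi, A, B):
    """Log P(obs | model) via the scaled forward algorithm.

    obs: list of observation indices (commands mapped to integers)
    pi:  initial state distribution, length N
    A:   N x N state transition matrix, A[i][j] = P(state j | state i)
    B:   N x M observation matrix, B[i][k] = P(symbol k | state i)
    """
    if not obs:
        return 0.0
    n = len(pi)
    # Initialisation: alpha_0(i) = pi_i * b_i(O_0)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: alpha_t(j) = sum_i alpha_{t-1}(i) * a_ij * b_j(O_t)
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under the model
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob
```

Dividing the result by `len(obs)` gives a per-command score, which is then compared against the per-user threshold.
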
HMM Experiments
- Plotted as “ROC” curves
  - Closer to origin is better
- Useful region
  - That is, false positives below 5%
  - The shaded region

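Since "closer to origin is better", the curves presumably plot false positive rate against false negative rate; that reading is an assumption, as the figure is not reproduced here. Under it, the points on such a curve could be computed as follows (function names are hypothetical).

```python
def error_rates(user_scores, attack_scores, threshold):
    """Return (false_positive_rate, false_negative_rate) at one threshold.

    Assumes higher scores mean "more like the legitimate user", so a block
    is flagged as a masquerade when its score falls below the threshold.
    """
    false_positives = sum(1 for s in user_scores if s < threshold)
    false_negatives = sum(1 for s in attack_scores if s >= threshold)
    return (false_positives / len(user_scores),
            false_negatives / len(attack_scores))


def error_curve(user_scores, attack_scores):
    """Error-rate pairs for every distinct score used as a threshold."""
    thresholds = sorted(set(user_scores) | set(attack_scores))
    return [error_rates(user_scores, attack_scores, t) for t in thresholds]
```

The useful region then corresponds to the points whose first component is below 0.05.
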
HMM Conclusion
- Number of hidden states does not matter
- So, use N=2
  - Since most efficient

PHMM
- See previous presentation

PHMM Experiments
- A problem with Schonlau data…
- For given user, 5000 commands
  - No begin/end session markers
- So, must split it up to obtain multiple sequences
  - But where to split sequence?
  - And what about tradeoff between number of sequences and length of each sequence?
  - That is, how to decide length/number?

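The number/length tradeoff is easy to see in code. This sketch assumes the simplest possible policy, cutting the training data into equal-length consecutive pieces; the helper name is hypothetical, and any leftover commands at the end are dropped.

```python
def split_into_sequences(commands, num_sequences):
    """Cut a command list into `num_sequences` equal-length consecutive pieces.

    Commands left over after the last full piece are discarded.
    """
    length = len(commands) // num_sequences
    return [commands[i * length:(i + 1) * length] for i in range(num_sequences)]
```

For 5000 commands, 5 sequences gives 1000 commands each, while 10 sequences gives 500 each: more sequences to align, but less information in each one.
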
PHMM Experiments
- Experiments done for following cases: see next slide…

PHMM Experiments
- Tested various numbers of sequences
- Best results
  - 5 sequences, 1k commands per sequence
  - This case is shown on next slide

PHMM Comparison
- Compare PHMM to “weighted N-gram” and HMM
- HMM is best
  - PHMM is competitive

PHMM Detector
- PHMM at disadvantage on Schonlau data
  - PHMM uses positional information
  - Such info not available for Schonlau data
  - We have to guess the positions for PHMM
- How to get fairer comparison between HMM and PHMM?
  - We need different data set
- Only option is simulated data set

Simulated Data
- We generate simulated data as follows
  - Using Schonlau data, construct Markov chain for each user
  - Use resulting Markov chain to generate sequences representing user behavior
  - Restrict “begin” to more common commands
- What’s the point?
  - Simulated seqs have sensible begin and end

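The generation steps above can be sketched as follows. This is an assumed, simplified version: it uses a first-order chain, picks the starting command from the `top_k` most frequent commands, and stops at a fixed length, none of which is specified on the slides. The function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def build_markov_chain(commands):
    """Count first-order transitions: transitions[a][b] = times b follows a."""
    transitions = defaultdict(Counter)
    for current, nxt in zip(commands, commands[1:]):
        transitions[current][nxt] += 1
    return transitions


def generate_sequence(commands, transitions, length, top_k=5, rng=None):
    """Generate a simulated command sequence of the given length.

    The first command is drawn from the `top_k` most common commands
    (an assumed way to restrict "begin" to common commands).
    """
    if length <= 0:
        return []
    rng = rng or random.Random()
    common = [cmd for cmd, _ in Counter(commands).most_common(top_k)]
    current = rng.choice(common)
    sequence = [current]
    while len(sequence) < length:
        successors = transitions.get(current)
        if successors:
            choices, weights = zip(*successors.items())
            current = rng.choices(choices, weights=weights)[0]
        else:
            # Dead end (command only seen last in training): restart.
            current = rng.choice(common)
        sequence.append(current)
    return sequence
```

Passing a seeded `random.Random` instance as `rng` makes the generated data reproducible across experiments.
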
Simulated Data
- Training data and user data for scoring generated using Markov chain
- Attack data taken from Schonlau data
- How much data to generate?
- For the first test, we generate same amount of simulated data as is in Schonlau set
  - That is, 5k commands per user

Detection with Simulated Data
- PHMM vs HMM
  - Round 2
- It’s close, but HMM still wins!

Limited Training Data
- What if less training data is available?
- In a real application, initially, training data is limited
  - Can’t detect attacks until sufficient training data has been accumulated
  - So, the less data required, the better
- Experiments using simulated data, limited training data
  - Used 200 to 800 commands for training

Limited Training Data
- PHMM vs HMM
  - Round 3
- With 400 or fewer commands, PHMM wins big!

Conclusion
- PHMM is competitive with best approaches
- PHMM likely to do better, given better training data (begin/end info)
- PHMM much better than HMM when limited training data available
  - Of practical importance
  - Why does it make sense that PHMM would do better with limited training data?

Conclusion
- Given current state of research…
- Optimal masquerade detection approach
  - Initially, collect small training set
  - Train PHMM and use for detection
  - If no attack, then continue to collect data
  - When sufficient data available, train HMM
  - From then on, use HMM for detection

Future Work
- Collect better real data set!!!
  - Many problems/limitations with Schonlau data
- Improved data set could be basis for lots and lots of research
  - Directly compare PHMM/bioinformatics approaches with previous work (HMM, naïve Bayes, SVM, etc.)
  - Consider hybrid techniques
  - Other techniques?

References
- L. Huang and M. Stamp, Masquerade detection using profile hidden Markov models, to appear in Computers and Security
- M. Schonlau, Masquerading user data