
Masquerade Detection
Mark Stamp

Masquerade Detection
- Masquerader: someone who makes unauthorized use of a computer
- How to detect a masquerader?
- Here, we consider…
  - Anomaly-based intrusion detection (IDS)
  - Detection is based on UNIX commands
- Lots and lots of prior work on this problem
- We attempt to apply PHMMs
  - For comparison, we also implement other techniques (HMM and N-gram)

Schonlau Data Set
- Schonlau, et al., collected large data set
- Contains UNIX commands for 50 users
  - 50 files, one for each user
  - Each file has 15k commands, 5k from user plus 10k for masquerade test data
  - Test data: 100 blocks, 100 commands each
- Dataset includes map file
  - 100 rows (test blocks), 50 columns (users)
  - 0 if block is user data, 1 if masquerade data
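The layout described on this slide can be sketched in Python. This is a minimal illustration, assuming one user's commands are already loaded into a list (one command per entry) and the map file is already parsed into rows of integers. The names `split_user_commands` and `masquerade_blocks` are hypothetical and are not part of the data set.

```python
TRAIN_LEN = 5000   # first 5k commands come from the legitimate user
BLOCK_LEN = 100    # test data is handled in blocks of 100 commands

def split_user_commands(commands):
    """Split one user's 15k commands into training data and test blocks."""
    train = commands[:TRAIN_LEN]
    test = commands[TRAIN_LEN:]
    blocks = [test[i:i + BLOCK_LEN] for i in range(0, len(test), BLOCK_LEN)]
    return train, blocks

def masquerade_blocks(map_rows, user_index):
    """Return indices of test blocks marked 1 (masquerade) for one user.

    map_rows: 100 rows (test blocks), each a list of 50 ints (users).
    """
    return [b for b, row in enumerate(map_rows) if row[user_index] == 1]
```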

Schonlau Data Set
- Map file structure
- This data set used for many studies
  - Approximately 50 published papers

Previous Work
- Approaches to masquerade detection
  - Information theoretic
  - Text mining
  - Hidden Markov models (HMM)
  - Naïve Bayes
  - Sequences and bioinformatics
  - Support vector machines (SVM)
  - Other approaches
- We briefly look at each of these

Information Theoretic
- Original work by Schonlau included a compression technique
- Based on theory (hope?) that legitimate commands compress more than attack
- Results were disappointing
- Some additional recent work
  - Still not competitive with best approaches

Text Mining
- A few papers in this area
- One approach extracts repetitive sequences from training data
- Another paper uses principal component analysis (PCA)
  - Method of “exploratory data analysis”
  - Good results on Schonlau data set
  - But high cost during training phase

Hidden Markov Models
- Several authors have used HMMs
  - One of the best known approaches
- We have implemented HMM detector
- We do sensitivity analysis on the parameters
  - In particular, determine optimal N (number of hidden states)
- We also use HMMs for comparison with our PHMM results

Naïve Bayes
- In simplest form, relies only on command frequencies
  - That is, no sequence info is used
- Several papers analyze this approach
  - Among the simplest approaches
  - And, results are good
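A frequency-only score of this kind can be sketched as follows. This is an illustrative one-class variant, not the method of any specific paper: the smoothing constant `alpha` and the function names are assumptions, and `vocab` is assumed to contain every command that can appear in a block.

```python
import math
from collections import Counter

def train_frequencies(train_commands, vocab, alpha=0.01):
    """Estimate smoothed per-command probabilities from training data."""
    counts = Counter(train_commands)
    total = len(train_commands) + alpha * len(vocab)
    return {c: (counts[c] + alpha) / total for c in vocab}

def block_score(block, probs):
    """Average log-probability of a block; command order is ignored."""
    return sum(math.log(probs[c]) for c in block) / len(block)
```

Because the score is a sum over individual commands, reordering a block leaves it unchanged, which is the sense in which no sequence information is used.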

Sequences
- In a sense, this is the opposite extreme from naïve Bayes
  - Naïve Bayes only considers frequency stats
  - Sequence/bioinformatics focused on sequence-related information
- Schonlau’s original work included elementary sequence-based analysis

Bioinformatics
- We are aware of only one previous paper that uses bioinformatics approach
  - Use Smith-Waterman algorithm to create local alignments
  - Alignments then used directly for detection
- In contrast, we do pairwise alignments, MSA, and a PHMM, which is used for scoring (forward algorithm)
  - Our scoring is much more efficient
  - Also, our results are at least as strong
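For readers unfamiliar with Smith-Waterman, the following is a minimal sketch of the local alignment score applied to two command sequences. The match, mismatch, and gap values are illustrative assumptions, not the parameters used in the paper mentioned above, and only the score is computed (no traceback).

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two command sequences."""
    prev = [0] * (len(b) + 1)      # previous row of the DP table
    best = 0
    for x in a:
        curr = [0]                 # column 0 is always 0
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Local alignment: a cell never drops below 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```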

Support Vector Machines
- Support vector machines (SVM)
  - Machine learning technique
  - Separate data points (i.e., classify) based on hyperplanes in high dimensional space
  - Original data mapped to higher dimension, where separation is likely easier
- SVMs maximize separation
  - And have low computational costs
  - Used for classification and regression analysis

SVMs & Masquerade Detection
- SVMs have been applied to masquerade detection problem
- Results are good
  - Comparable to naïve Bayes
- Recent work using SVMs focused on improved efficiency

Other Approaches
- The following have also been studied
  - Detect using low frequency commands
  - Detect using high frequency commands
  - Hybrid Bayes “one step Markov”
    - Natural to consider hybrid approaches
  - Multistep Markov
    - Markov process of order greater than 1
- None of these particularly successful

Other Approaches (Continued)
- Non-negative matrix factorization (NMF)
  - At least 2 papers on this topic
  - Appears to be competitive
- Other hybrids that attempt to combine several approaches
  - So far, no significant improvement over individual techniques

HMMs
- See previous presentation

HMM for Masquerade Detection
- Using the Schonlau data set we…
  - Train HMM for each user
  - Set thresholds
  - Test the models and plot results
- Note that this has been done before
- Here, we perform sensitivity analysis
  - That is, we test different number of hidden states, N
- Also use it for comparison with PHMM
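The scoring and threshold steps can be sketched with the scaled forward algorithm. This is a minimal illustration under stated assumptions: training is not shown, so the model parameters `pi`, `A`, `B` are taken as given; commands are assumed to be mapped to integer indices; emission probabilities are assumed nonzero; and normalizing the score per command is a choice made here, not one stated in the slides.

```python
import math

def forward_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | model).

    obs: list of observation indices; pi[i]: initial probabilities;
    A[i][j]: transition probabilities; B[i][k]: emission probabilities.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)             # rescale to avoid underflow
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob

def is_masquerade(block, model, threshold):
    """Flag a block whose per-command log-likelihood is below threshold."""
    pi, A, B = model
    return forward_log_likelihood(block, pi, A, B) / len(block) < threshold
```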

HMM Experiments
- Plotted as “ROC” curves
  - Closer to origin is better
- Useful region
  - That is, false positives below 5%
  - The shaded region
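One way to produce the points of such a curve is sketched below. It assumes the axes are false positive rate and false negative rate, which is consistent with "closer to origin is better" but is not stated explicitly on the slide. It also assumes a lower score means a more suspicious block. The function names are hypothetical.

```python
def error_rates(user_scores, masq_scores, threshold):
    """Return (false_positive_rate, false_negative_rate) at one threshold.

    A block is flagged as masquerade when its score is below threshold.
    """
    fp = sum(1 for s in user_scores if s < threshold) / len(user_scores)
    fn = sum(1 for s in masq_scores if s >= threshold) / len(masq_scores)
    return fp, fn

def curve(user_scores, masq_scores):
    """Sweep the threshold over every observed score."""
    thresholds = sorted(set(user_scores) | set(masq_scores))
    return [error_rates(user_scores, masq_scores, t) for t in thresholds]

def useful_region(points, max_fp=0.05):
    """Keep only points with false positive rate below max_fp."""
    return [p for p in points if p[0] < max_fp]
```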

HMM Conclusion
- Number of hidden states does not matter
- So, use N=2
  - Since most efficient

PHMM
- See previous presentation

PHMM Experiments
- A problem with Schonlau data…
- For given user, 5000 commands
  - No begin/end session markers
- So, must split it up to obtain multiple sequences
  - But where to split sequence?
  - And what about tradeoff between number of sequences and length of each sequence?
  - That is, how to decide length/number???
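The simplest way to split is into equal-length, non-overlapping pieces, sketched below. This is only one possible answer to the question the slide raises; the function name is hypothetical, and any leftover commands that do not fill a whole sequence are dropped.

```python
def split_sequences(commands, num_sequences):
    """Cut training data into equal-length, non-overlapping sequences."""
    length = len(commands) // num_sequences
    return [commands[i * length:(i + 1) * length]
            for i in range(num_sequences)]
```

For example, `split_sequences(train, 5)` on 5000 training commands gives 5 sequences of 1000 commands each, while `split_sequences(train, 50)` gives 50 sequences of 100 commands each.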

PHMM Experiments
- Done for following cases: see next slide…

PHMM Experiments
- Tests various numbers of sequences
- Best results
  - 5 sequences, 1k commands each seq.
  - This case in next slide

PHMM Comparison
- Compare PHMM to “weighted N-gram” and HMM
- HMM is best
  - PHMM is competitive

PHMM Detector
- PHMM at disadvantage on Schonlau data
  - PHMM uses positional information
  - Such info not available for Schonlau data
  - We have to guess the positions for PHMM
- How to get fairer comparison between HMM and PHMM?
  - We need different data set
- Only option is simulated data set

Simulated Data
- We generate simulated data as follows
  - Using Schonlau data, construct Markov chain for each user
  - Use resulting Markov chain to generate sequences representing user behavior
  - Restrict “begin” to more common commands
- What’s the point?
  - Simulated seqs have sensible begin and end
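The generation procedure can be sketched as a first-order Markov chain over commands. This is a minimal illustration, not the authors' generator: the order of the chain, the restart rule for commands with no recorded successor, and how the "more common" start commands are chosen are all assumptions made here.

```python
import random
from collections import Counter, defaultdict

def build_chain(commands):
    """Count first-order transitions between consecutive commands."""
    chain = defaultdict(Counter)
    for cur, nxt in zip(commands, commands[1:]):
        chain[cur][nxt] += 1
    return chain

def generate(chain, start_commands, length, rng):
    """Generate one simulated sequence, beginning at a common command."""
    cur = rng.choice(start_commands)
    seq = [cur]
    while len(seq) < length:
        successors = chain.get(cur)
        if not successors:             # dead end: restart at a common command
            cur = rng.choice(start_commands)
        else:
            cmds = list(successors)
            weights = [successors[c] for c in cmds]
            cur = rng.choices(cmds, weights=weights)[0]
        seq.append(cur)
    return seq
```

One possible choice for `start_commands` is the few most frequent commands in the user's training data, for example `[c for c, _ in Counter(train).most_common(3)]`.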

Simulated Data
- Training data and user data for scoring generated using Markov chain
- Attack data taken from Schonlau data
- How much data to generate?
- First test, we generate same amount of simulated data as is in Schonlau set
  - That is, 5k commands per user

Detection with Simulated Data
- PHMM vs HMM
  - Round 2
- It’s close, but HMM still wins!

Limited Training Data
- What if less training data is available?
- In a real application, initially, training data is limited
  - Can’t detect attacks until sufficient training data has been accumulated
  - So, the less data required, the better
- Experiments, using simulated data, limited training data
  - Used 200 to 800 commands for training

Limited Training Data
- PHMM vs HMM
  - Round 3
- With 400 or fewer commands, PHMM wins big!

Conclusion
- PHMM is competitive with best approaches
- PHMM likely to do better, given better training data (begin/end info)
- PHMM much better than HMM when limited training data available
  - Of practical importance
  - Why does it make sense that PHMM would do better with limited training data?

Conclusion
- Given current state of research…
- Optimal masquerade detection approach
  - Initially, collect small training set
  - Train PHMM and use for detection
  - No attack, then continue to collect data
  - When sufficient data available, train HMM
  - From then on, use HMM for detection

Future Work
- Collect better real data set!!!
  - Many problems/limitations with Schonlau data
- Improved data set could be basis for lots and lots of research
  - Directly compare PHMM/bioinformatics approaches with previous work (HMM, naïve Bayes, SVM, etc.)
  - Consider hybrid techniques
  - Other techniques?

References
- Masquerade detection using profile hidden Markov models, L. Huang and M. Stamp, to appear in Computers and Security
- Masquerading user data, M. Schonlau