PHMM Applications
Mark Stamp

Applications
• We consider 2 applications of PHMMs to problems in information security
  o Masquerade detection
  o Malware detection
• Both show some strengths of PHMMs
• Both are somewhat unique
• PHMMs not always a first choice…

PHMM for Masquerade Detection
Lin Huang
Mark Stamp

Masquerader?
• Masquerader makes unauthorized use of another user’s account
  o Masquerader tries to evade detection by pretending to be the other user
• Can we detect masquerader?
  o Intrusion Detection System (IDS)
• We consider special case where such an IDS is based on UNIX commands

Schonlau Dataset
• Collection of UNIX commands, 50 users
  o 5k training commands per user, plus…
  o 10k “attack” commands per user
• Also, a key to tell which blocks are attack and which belong to same user
  o Nominally, 100 blocks, 100 commands each
• No real session start/end info provided
  o This could be an issue…

Previous Work
• Lots of papers use “Schonlau dataset”
• Types of methods that have been used
  o Information theoretic
  o Text mining
  o Hidden Markov Model
  o Naïve Bayes
  o Sequences and bioinformatics
  o SVM
  o Other

Information Theoretic
• Schonlau originally used compression-based scheme
  o The theory is that commands by same user should compress more
  o By subsequent standard, poor results
• Some other similar work, but…
  o … no strong results based on compression
• Compression for malware detection?
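The compression idea above can be sketched roughly as follows. This is only an illustration, not Schonlau's actual scheme: it assumes zlib as the compressor, and the names `compressed_size` and `compression_score` are hypothetical.

```python
import zlib


def compressed_size(commands):
    """Size in bytes of the zlib-compressed, newline-joined command list."""
    data = "\n".join(commands).encode("utf-8")
    return len(zlib.compress(data, 9))


def compression_score(training, block):
    """Extra bytes needed to compress `block` when appended to `training`.

    Assumption for this sketch: a lower score suggests the block
    resembles the user's training commands.
    """
    return compressed_size(training + block) - compressed_size(training)
```

A block would then be flagged as a possible masquerade when its score exceeds some threshold, which would have to be chosen empirically.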

Text Mining
• Look for repetitive sequences
  o Can be used to detect particular user
  o Almost like a signature
• PCA has also been used here
  o Repetitive sequences, i.e., patterns
  o PCA can find such structure
  o Training cost considered high
• Other ways to do “text mining”?

Hidden Markov Model
• Need we say more?
• HMM is one of the most popular detection strategies in this field
  o Results are good
  o Serves as benchmark in many (most) studies of other techniques
• We implement HMM detector and compare to PHMM

Naïve Bayes
• Naïve Bayes (NB) relies on frequencies
  o No sequential info used
  o Very simple
  o Efficient training & scoring
• Discuss naïve Bayes in later chapter
• Close connection between HMM and NB
  o So, not too surprising that this works
  o But, surprising that it works so well
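A frequency-only scorer of this kind might look like the sketch below. The function names and the add-one smoothing parameter `alpha` are assumptions made for illustration, and the vocabulary is assumed to contain every command that will be scored.

```python
import math
from collections import Counter


def train_nb(commands, vocabulary, alpha=1.0):
    """Estimate smoothed log-probabilities of each command for one user."""
    counts = Counter(commands)
    total = len(commands) + alpha * len(vocabulary)
    return {c: math.log((counts[c] + alpha) / total) for c in vocabulary}


def score_nb(log_probs, block):
    """Average log-probability of a block; command order is ignored."""
    return sum(log_probs[c] for c in block) / len(block)
```

Because the score is a sum over commands, reordering a block does not change it, which is the sense in which no sequential info is used.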

Sequences and Bioinformatics
• n-gram approaches very popular
  o Like HMM, also used as benchmark
• Sequence alignment has been used
  o Based on Smith-Waterman algorithm
  o Like constructing MSA in PHMM
  o Closest previous work to PHMM
• We’ll compare our PHMM results to both n-gram and HMM
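For reference, a minimal Smith-Waterman local alignment score over command sequences could be written as below. The scoring values (`match`, `mismatch`, `gap`) are arbitrary assumptions for this sketch and are not taken from the cited masquerade detection work.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b (linear gap cost)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

A high score between a test block and a user's training commands indicates a long shared (possibly gapped) run of commands.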

Support Vector Machines
• Several previous studies use SVM
  o SVM has nice geometric interpretation
  o SVMs very popular in machine learning
• For masquerade detection, SVM results are about same as NB
• Claimed that SVM is more efficient, as compared to naïve Bayes
  o But, naïve Bayes is very efficient…

Other
• Frequent and/or infrequent commands
  o Neither seems to perform well
• “Hybrid Bayes one step Markov” and “hybrid multistep Markov”
  o Nice names, but not so good results
• “Non-negative matrix factorization”
  o Good results
• Ensemble (combination) approaches
  o Seem to offer slight improvement

Experimental Results
• Again, we compare HMM and n-grams to several PHMM models
  o All are tested on Schonlau dataset
  o Then we generate a simulated dataset
  o All tested again on simulated data
• Why simulated data?
  o Schonlau data has limitations wrt PHMM
  o This will be explained later…

HMM & n-Gram ROC Curves
• First, compare HMM and n-grams

HMM and n-Gram AUC
• For ROC curves on previous slide…

Training PHMM
• How many sequences to use?
  o More sequences, better for E matrix…
  o …but worse for gaps
• Length of each sequence?
  o For Schonlau dataset, we have 5k training commands per user
• Where to begin/end sequences?
  o No good answers for Schonlau dataset

PHMM Sequences
• Note that all 5k commands used in each case
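One way to use all training commands for any chosen number of sequences is sketched below. It assumes contiguous, equal-length sequences, which the slides do not state explicitly; `split_into_sequences` is a hypothetical helper name.

```python
def split_into_sequences(commands, num_sequences):
    """Split commands into contiguous, equal-length sequences.

    Assumption for this sketch: the number of sequences evenly divides
    the number of commands, so every command is used exactly once.
    """
    if not commands or num_sequences <= 0:
        raise ValueError("need a non-empty command list and a positive count")
    if len(commands) % num_sequences != 0:
        raise ValueError("num_sequences must evenly divide the command count")
    length = len(commands) // num_sequences
    return [commands[i:i + length] for i in range(0, len(commands), length)]
```

For example, 5000 commands split into 10 sequences gives sequences of 500 commands each, which would then be aligned to build the MSA.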

PHMM ROC Curves
• ROC curves for each PHMM case
• Any trend?

PHMM AUC
• AUC for each PHMM case
  o 5, 10, and 20 sequences are best cases

HMM, n-Gram, and PHMM
• Again, for Schonlau dataset
• Which method is better?

HMM vs PHMM
• HMM and PHMM give similar results on Schonlau dataset
• Surprising that PHMM does so well
  o Why? No begin/end sequence info!
• What if we had “better” sequences?
  o PHMM could certainly do better and maybe much, much better
• But how to get a better dataset?

Simulated Dataset
• Generate Markov model for each user
  o Based on monograph & digraph stats
  o Like matrices π and A of an HMM
• Now we can generate sequences
  o Use matrix π to select initial element
  o Then use matrix A to generate sequence
• HMM must do well on this data (why?)
• PHMM might do well… or not…
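The generation procedure above can be sketched as follows. This is a minimal illustration under stated assumptions: π is estimated from single-command (monograph) frequencies, A from adjacent-pair (digraph) frequencies, and a command with no observed successor falls back to π. The function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def fit_markov(commands):
    """Estimate pi (monograph stats) and A (digraph stats) for one user."""
    total = len(commands)
    pi = {c: n / total for c, n in Counter(commands).items()}
    pair_counts = defaultdict(Counter)
    for prev, nxt in zip(commands, commands[1:]):
        pair_counts[prev][nxt] += 1
    A = {}
    for prev, row in pair_counts.items():
        row_total = sum(row.values())
        A[prev] = {nxt: n / row_total for nxt, n in row.items()}
    return pi, A


def generate(pi, A, length, rng=None):
    """Generate a command sequence: first element from pi, the rest from A."""
    if length <= 0:
        return []
    rng = rng or random.Random()

    def draw(dist):
        symbols = list(dist)
        return rng.choices(symbols, weights=[dist[s] for s in symbols])[0]

    seq = [draw(pi)]
    while len(seq) < length:
        row = A.get(seq[-1])
        # Assumption: fall back to pi if the last command has no successor.
        seq.append(draw(row if row else pi))
    return seq
```

Since the data comes from a first-order Markov chain, the generated sequences carry exactly the kind of structure that a Markov-based model is built to capture.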

ROC Curves Simulated Data
• HMM vs PHMM
• Based on 5k training commands

AUC for Simulated Data
• Again, based on 5k training commands

Real World Problem
• Masquerade detection in real world
• At first, we have little training data
  o Can’t protect user until we train a model
  o So, we want to train as soon as possible
• Minimum training data needed to obtain a useful model?
• We compare HMM and PHMM with 200, 400, and 800 training commands

Limited Training Data
• Simulated data
• HMM vs PHMM
• Big difference when very little training data available

Limited Training Data
• PHMM most impressive with very little data (especially wrt AUC_0.1)
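For readers unfamiliar with the partial AUC, the sketch below computes the area under an ROC curve up to a chosen false positive rate. It assumes higher scores mean "more attack-like" and reports the un-normalized area; the slides do not say whether the reported values are normalized, and the function names are hypothetical.

```python
def roc_points(benign_scores, attack_scores):
    """ROC points (FPR, TPR), assuming higher scores are more attack-like."""
    thresholds = sorted(set(benign_scores) | set(attack_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        fpr = sum(s >= t for s in benign_scores) / len(benign_scores)
        tpr = sum(s >= t for s in attack_scores) / len(attack_scores)
        points.append((fpr, tpr))
    return points


def auc(points, max_fpr=1.0):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:  # clip the final segment at max_fpr
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

Calling `auc(points, max_fpr=0.1)` restricts attention to the low false positive region, which is the region that matters most for a practical IDS.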

Limited Training Data
• Same results as previous slide

Optimal Masquerade Detection Strategy?
• Obtain 200 commands, train PHMM
• Use this PHMM model until a reliable set of 800+ commands is available
• Then train HMM on 800+ commands
• Use HMM from then on
• Gives us a reliable model with limited data, and best model with more data

Another PHMM Advantage?
• PHMM might be better when attacker hijacks ongoing session
• Masquerader mimics average behavior
  o This is what is modeled by HMM
• Harder to mimic sequential behavior
  o As modeled by PHMM
  o Depends on position in the sequence
• This should be investigated further…

PHMM for Malware Detection
Swapna Vemparala
Mark Stamp

Malware Detection
• In previous work, PHMM tested for metamorphic detection
  o Based on extracted opcodes
• Results were generally not impressive
• MSA has many gaps and PHMM is weak
• Code transposition causes problems
  o And code transposition common in malware
• Opcode sequence not strong wrt PHMM

Malware Detection 2.0
• Here, again apply PHMM to malware
• But what to use as features???
• Want feature(s) where…
  o Sequence/order is critical
  o And, difficult for malware writer to modify sequential information
• What feature(s) to use?
  o (Static) opcodes not good in PHMM

Software Birthmarks
• Birthmark is inherent feature of code
  o In contrast to a watermark
• We consider both static and dynamic birthmarks
• Static: collected without executing
• Dynamic: execution/emulation
• Examples of each?
• Advantages/disadvantages of each?

This Research
• Consider opcodes
  o Static feature, extracted by disassembly
• Also consider API calls
  o Dynamic, use Buster Sandbox Analyzer
• Compare HMM and PHMM for both
  o Then 3 cases for each malware family…
  o Static and dynamic HMM
  o Dynamic PHMM

Data
• Malware: data from Malicia Project
• Benign: set of 20 Windows applications

HMM & Opcode Sequences
• Scatterplots and ROC curves for Security Shield

HMM Results
• Results for all families, static and dynamic birthmarks

PHMM
• Dynamic birthmarks, i.e., API calls

Results
• Static and dynamic HMM
• And dynamic PHMM

Bottom Line
• In these cases, dynamic data gives better results
  o API calls better than (static) opcodes
• HMM does very well on API calls…
• …but PHMM can do even better
• Sequential info matters in API calls!
• Is PHMM really worth it?

References
• Masquerade detection
  o L. Huang and M. Stamp, Masquerade detection using profile hidden Markov models, Computers & Security, 30(8):732-747, November 2011
• Malware detection
  o S. Vemparala, et al., Malware detection using dynamic birthmarks, 2nd International Workshop on Security & Privacy Analytics (IWSPA 2016), co-located with ACM CODASPY 2016, March 9-11, 2016