RHMD: Evasion-Resilient Hardware Malware Detectors
Khaled N. Khasawneh*, Nael Abu-Ghazaleh*, Dmitry Ponomarev**, Lei Yu**
* University of California, Riverside; ** Binghamton University
MICRO 2017 – Boston, USA, October 2017

Malware is Everywhere! Over 250,000 malware samples registered every day!

Traditional Software Malware Detection
• Static malware detection
  – Search for signatures in the executable
  – Can detect all known malware with no false alarms
  – Can be evaded by new malware and polymorphic malware
• Dynamic malware detection
  – Monitors the behavior of the program
  – Can detect unknown malware
  – Very high overhead, limiting use in practice

Hardware Malware Detectors (HMDs)
• Use Machine Learning: detect malware as computational anomaly
• Use low-level features collected from the hardware
• Can be always-on without adding performance overhead
• Many research papers, including ISCA’13, HPCA’15 and MICRO’16

Paper Contributions
Can malware evade HMDs?
• Reverse-engineer HMDs
• Develop evasive malware
• Evade detection after re-training
If yes: can we make HMDs robust to evasion? Yes! Using RHMDs
1 - Provably harder to reverse-engineer
2 - Robust to evasion

REVERSE ENGINEERING

How to Reverse Engineer HMDs?
• Challenges:
  – We don’t know the detection period
  – We don’t know the features used
  – We don’t know the detection algorithm
• Approach:
  1. Train different classifiers
  2. Derive specific parameters as an optimization problem

Reverse Engineering HMDs
[Figure: the attacker sends its own training data to the victim HMD, a black box, and records the 0/1 outputs as labels. The data and labels are then used to train a model, which becomes the reverse-engineered HMD.]
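
The flow in this figure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `victim_hmd` is a hypothetical stand-in for the black box, the feature vectors are synthetic, and the surrogate is a small hand-written logistic regression.

```python
import math
import random

def victim_hmd(x):
    # Hypothetical black box: the attacker only ever sees the 0/1 output.
    hidden_w, hidden_b = [2.0, -1.5, 0.5], -0.2
    return 1 if sum(w * v for w, v in zip(hidden_w, x)) + hidden_b > 0 else 0

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-z))

def train_lr(data, labels, epochs=200, lr=0.5):
    # Plain stochastic gradient descent on the logistic loss.
    w = [0.0] * len(data[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(model, x):
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def agreement(model, data):
    # Fraction of inputs on which the surrogate matches the victim's output.
    return sum(predict(model, x) == victim_hmd(x) for x in data) / len(data)

rng = random.Random(0)
attacker_data = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(400)]
labels = [victim_hmd(x) for x in attacker_data]   # query the black box
surrogate = train_lr(attacker_data, labels)       # reverse-engineered HMD
held_out = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
print(f"agreement with victim: {agreement(surrogate, held_out):.2f}")
```

Agreement on held-out inputs is one reasonable way to judge how closely the surrogate mimics the victim; the deck itself does not specify the metric.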

We Can Guess Detector Parameters!
• Victim HMD parameters:
  - 10K detection period
  - Instructions feature vector
• Guessing detection period [Figure: results for LR (Logistic Regression), DT (Decision Tree), SVM (Support Vector Machines)]
• Guessing feature vector [Figure: results for LR, DT, SVM]
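
One way to picture the period-guessing step is a search over candidate periods, keeping the one whose surrogate best reproduces the victim's verdicts. The sketch below is a toy under several assumptions that are not in the slides: the true period is 100 instead of 10K, the victim flags windows by a single made-up rule on memory operations ("M"), the attacker can read the verdict in force at any instruction, and the surrogate is a one-feature threshold rather than LR, DT, or SVM.

```python
import random

TRUE_PERIOD = 100  # hidden from the attacker; toy stand-in for the 10K period

def victim_stream(trace):
    # Hypothetical black box: one verdict per TRUE_PERIOD instructions,
    # attributed to every instruction of the window it covers.
    stream = []
    for i in range(0, len(trace), TRUE_PERIOD):
        window = trace[i:i + TRUE_PERIOD]
        verdict = 1 if 10 * window.count("M") > 3 * len(window) else 0
        stream.extend([verdict] * len(window))
    return stream

def surrogate_accuracy(trace, stream, period):
    # Attacker side: cut the trace at a candidate period, label each window
    # with the victim verdict observed at its midpoint, then fit the best
    # single-threshold rule on the count of "M" and report its accuracy.
    feats, labels = [], []
    for i in range(0, len(trace) - period + 1, period):
        feats.append(trace[i:i + period].count("M"))
        labels.append(stream[i + period // 2])
    best = 0.0
    for t in [-1] + sorted(set(feats)):
        hits = sum((f > t) == (y == 1) for f, y in zip(feats, labels))
        best = max(best, hits / len(labels))
    return best

rng = random.Random(1)
trace = rng.choices("MAB", weights=[3, 4, 3], k=20000)  # synthetic opcode classes
stream = victim_stream(trace)
scores = {p: surrogate_accuracy(trace, stream, p) for p in (25, 50, 100, 200, 400)}
guess = max(scores, key=scores.get)
print(scores, "-> guessed period:", guess)
```

The candidate that lines up with the victim's windows lets the surrogate fit the labels exactly, while misaligned candidates blur the features and score lower. The same search could in principle be repeated over candidate feature vectors.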

Reverse Engineering Effectiveness
[Figures: Logistic Regression; Neural Networks]
Current generation of HMDs can be reverse-engineered

EVADING HMDS

How to Create Evasive Malware?
• Challenges:
  - We don’t have malware source code
  - We can’t decompile malware because it’s obfuscated
• Our approach: PIN, Dynamic Control Flow Graph

What Should We Add to Evade?
• Logistic Regression (LR)
  – LR is defined by a weight vector θ
  – Add instructions whose weights are negative
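
A rough sketch of this idea follows. The weights in `theta`, the bias, the instruction categories, and the window contents are all invented for illustration, and the injected instructions are assumed not to change what the malware does.

```python
import math

# Hypothetical weights θ of a reverse-engineered LR detector, one per
# instruction category. All values are made up for illustration.
theta = {"load": 1.8, "store": 1.1, "branch": 0.4, "add": -0.9, "nop": -2.5}
bias = -0.6

def lr_score(counts):
    # Features are the per-category instruction frequencies in one window.
    total = sum(counts.values())
    z = bias + sum(theta[op] * n / total for op, n in counts.items())
    return 1.0 / (1.0 + math.exp(-z))

def evade(counts, threshold=0.5, step=10, max_added=100_000):
    # Pad the window with the most negatively weighted category until the
    # detector's score drops below its decision threshold.
    counts = dict(counts)
    pad_op = min(theta, key=theta.get)
    added = 0
    while lr_score(counts) >= threshold and added < max_added:
        counts[pad_op] = counts.get(pad_op, 0) + step
        added += step
    return counts, added

malware_window = {"load": 500, "store": 300, "branch": 150, "add": 50}
evasive_window, added = evade(malware_window)
print(f"before: {lr_score(malware_window):.3f}  "
      f"after: {lr_score(evasive_window):.3f}  injected: {added}")
```

The original instruction counts are left untouched; only padding is added, which is why the number of injected instructions translates directly into runtime overhead.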

What Should We Add to Evade?
• Neural Network (NN)
  – Collapse the description of the NN into a single vector
  – Add instructions whose weights are negative
The current generation of HMDs is vulnerable to evasion attacks!
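
The slides do not say how the collapse is done. One plausible reading, assumed here, is a linear approximation that ignores the activation function and multiplies the hidden-layer weights by the output-layer weights, giving one effective weight per input feature. The network shape and all numbers below are hypothetical.

```python
# Hypothetical one-hidden-layer NN: W1[h][j] is the weight from input
# feature j to hidden unit h, and w2[h] is the weight from hidden unit h
# to the output. Values are invented for illustration.
features = ["load", "store", "branch", "add", "nop"]
W1 = [
    [0.9, 0.4, 0.1, -0.3, -0.8],
    [0.5, 0.7, -0.2, -0.1, -0.6],
    [-0.4, 0.2, 0.3, 0.6, 0.1],
]
w2 = [1.2, 0.8, -0.5]

def collapse(W1, w2):
    # Assumed linearization: effective weight of input j is sum_h w2[h] * W1[h][j].
    n_inputs = len(W1[0])
    return [sum(w2[h] * W1[h][j] for h in range(len(w2))) for j in range(n_inputs)]

collapsed = collapse(W1, w2)

# Categories with negative effective weight, most negative first.
candidates = sorted((w, f) for w, f in zip(collapsed, features) if w < 0)
for weight, name in candidates:
    print(f"{name}: {weight:+.2f}")
```

Once the network is reduced to a single vector, the same padding strategy used for LR applies: inject the categories at the top of `candidates`.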

DOES RE-TRAINING HELP?

Can we Retrain with Samples of Evasive Malware?
• Linear Model – Logistic Regression
• Non-Linear Model – Neural Network

Explaining Retraining Performance Linear Model (LR)

Explaining Retraining Performance Non-Linear Model (NN)

What if we Keep Retraining? Re-training is not a general solution

CAN WE BUILD DETECTORS THAT RESIST EVASION?

Overview of RHMDs
[Figure: an RHMD contains a pool of diverse HMDs (HMD 1, HMD 2, ..., HMD n). The input feeds the pool and a selector produces the output. A timeline of committed instructions, starting at 0, is divided into detection periods, each with its own features vector.]
• Diversify by different:
  1 - Features
  2 - Detection periods
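
The structure in this overview can be sketched as a pool of base detectors plus a randomized selector. Everything concrete below is an assumption for illustration: the `BaseHMD` and `RHMD` names, the two toy detectors, their tiny detection periods, and the uniform random choice made at the start of each period.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class BaseHMD:
    name: str
    period: int                                # committed instructions per decision
    classify: Callable[[Dict[str, int]], int]  # feature counts -> 0 (benign) / 1 (malware)

class RHMD:
    def __init__(self, pool, seed=None):
        self.pool = list(pool)
        self.rng = random.Random(seed)

    def run(self, trace):
        """Yield (start_index, detector_name, verdict) for each detection period."""
        pos = 0
        while pos < len(trace):
            hmd = self.rng.choice(self.pool)   # selector: pick a detector at random
            counts = {}
            for op in trace[pos:pos + hmd.period]:
                counts[op] = counts.get(op, 0) + 1
            yield pos, hmd.name, hmd.classify(counts)
            pos += hmd.period                  # the next period starts where this one ends

# Two hypothetical base detectors that differ in features and detection period.
pool = [
    BaseHMD("mem-5", 5,
            lambda c: int(2 * (c.get("load", 0) + c.get("store", 0)) > sum(c.values()))),
    BaseHMD("branch-10", 10,
            lambda c: int(4 * c.get("branch", 0) > sum(c.values()))),
]
trace = ["load", "store", "add", "branch"] * 25
decisions = list(RHMD(pool, seed=7).run(trace))
for start, name, verdict in decisions[:5]:
    print(start, name, verdict)
```

Because both the feature set and the window length change unpredictably from one period to the next, an attacker querying the detector sees labels produced by a mixture of models rather than by one fixed model.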

Reverse Engineer RHMDs Randomizing the features (a) Two feature vectors (b) Three feature vectors

Reverse Engineer RHMDs Randomizing the features and detection period (a) Two feature vectors and two periods (b) Three feature vectors and two periods

RHMD is Resilient to Evasion

Hardware Overhead
• FPGA prototype on open core (AO486)
• RHMD with three detectors:
  – Area increase: 1.72%
  – Power increase: 0.78%

Conclusion
• Current generation of HMDs is vulnerable to evasion
  – Developed a methodology to reverse-engineer and evade detectors
• Explored re-training HMDs
  – Benefit is limited
• Developed a new class of Evasion-Resilient HMDs
  – Robust to evasion
  – Low overhead

Thank you! Questions?

Can’t Just Randomly Add Instructions

Evasion Overhead
