
Protomatching Network Traffic for High Throughput Network Intrusion Detection

Shai Rubin (Microsoft, Security Analysis Services)
Somesh Jha and Barton P. Miller (University of Wisconsin, Comp. Sciences)
Presented by Zhaosheng Zhu

Signature evolution
• Informally, a signature is usually defined as “a characteristic pattern of the attack”.
• Setting: an attacker sends traffic over the network; the NIDS checks it against a signature database.
• GET <URL>/cmd.exe HTTP/1.1\n (sent by the attacker)
  – “cmd.exe” is the attack pattern, so the signature database stores “cmd.exe”.
• “Be aware of the ‘cmd.exe’ attack” (sent by Shai, not by an attacker)
  – “cmd.exe” is the attack pattern, but only if it is part of a URL,
• POST <URL>/cmd.exe HTTP/1.1\n
  – and the HTTP method is GET,
• GET <URL>/CMD.exe HTTP/1.1\n
  – and the match takes upper- and lower-case characters into account,
• GET <URL>/%43MD.exe HTTP/1.1\n
  – and the match takes HTTP encodings into account.
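The progression above can be replayed against a plain substring check. This is only an illustrative sketch: the "/scripts" path is made up, and the attack/benign labels simply follow the rules listed on the slides.

```python
def naive_match(data: str) -> bool:
    """Traditional signature: report any traffic containing the raw pattern."""
    return "cmd.exe" in data

# (traffic, is it really the attack?) following the progression on the slides.
# "/scripts" is a hypothetical path used only for illustration.
samples = [
    ("GET /scripts/cmd.exe HTTP/1.1\n", True),
    ("Be aware of the cmd.exe attack", False),     # pattern outside a URL
    ("POST /scripts/cmd.exe HTTP/1.1\n", False),   # wrong HTTP method
    ("GET /scripts/CMD.exe HTTP/1.1\n", True),     # upper-case variant
    ("GET /scripts/%43MD.exe HTTP/1.1\n", True),   # %43 is an encoded 'C'
]

false_positives = [t for t, attack in samples if naive_match(t) and not attack]
false_negatives = [t for t, attack in samples if not naive_match(t) and attack]

print("false positives:", false_positives)
print("false negatives:", false_negatives)
```

The raw pattern flags the two benign samples and misses the two disguised attacks, which is the gap the later refinements are meant to close.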

Problem in This Talk
• What we specify: a traditional signature (e.g., “cmd.exe”) that exposes:
  – false negatives
  – false positives
• What we enforce: a signature that inherently fits the attack.
• Goal: Develop a signature that is cheaper to enforce.
[Diagram: TCP streams, the attack, and a traditional signature (“cmd.exe”)]

Contributions
• Conceptual: Protomatching signature
• Practical: Superset Protomatcher
• Real world impact: 25% improvement in Snort performance

Protomatching Signature
• It is a regular expression with two properties:
  – First, it ensures that the characteristic pattern of an attack appears in the context that is necessary for the attack to succeed.
  – Second, it matches both normalized and encoded versions of an attack.

Superset protomatcher
• It recognizes a superset of the traffic matched by a full-coverage protomatcher.
• Three properties:
  – A superset protomatcher consumes less memory than a full-coverage protomatcher.
  – Traffic that matches the superset protomatcher may not match any NIDS signature.
  – Traffic that does not match the superset protomatcher also does not match any signature in the NIDS database.

Related work
• Protocol analysis and traffic normalization
  – Modern NIDS are based on the analyze-normalize-match (ANM) methodology.
  – Ptacek and Newsham were the first to recognize that a NIDS that does not perform normalization is susceptible to evasion.
  – The problem of alternate encodings is particularly painful for HTTP traffic.

Related Work II
• Fast pattern matching for NIDS
  – Previous work does not solve the encodings problem and does not consider protocol analysis in the matching algorithm.
  – Researchers have proposed using regular expression matching.
  – To match regular expressions, Sommer and Paxson used a DFA. However, they performed matching on already-normalized traffic.

Related Work III
• Dealing with high-speed links
  – To deal with high-speed links, researchers have suggested a distributed NIDS that balances the network traffic such that each sensor monitors a different portion of the protected network.
  – Our work focuses on the performance of a single sensor. It could perform even better when combined with a cooperating distributed design.

Analyze-normalize-match (ANM) approach
• First, a NIDS encodes its signatures in a normalized form.
• During runtime, the NIDS parses the traffic according to the protocol the attack uses and normalizes the traffic.
• Last, the NIDS matches the normalized traffic against its normalized signatures.
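The three ANM phases can be sketched as three separate functions. This is a minimal illustration under stated assumptions, not how Snort implements them: the function names are hypothetical, only the HTTP request line is parsed, and normalization is limited to %XX decoding (via the standard library's `urllib.parse.unquote`) plus case folding.

```python
from urllib.parse import unquote

# Signatures are stored in normalized (decoded, upper-case) form.
SIGNATURES = ["CMD.EXE"]

def analyze(raw: str):
    """Protocol analysis: split an HTTP request line into its fields."""
    parts = raw.split()
    if len(parts) != 3:
        return None  # not a well-formed request line
    method, url, version = parts
    return {"method": method, "url": url, "version": version}

def normalize(fields: dict) -> dict:
    """Normalization: undo %XX encoding and fold case in the URL."""
    return {**fields, "url": unquote(fields["url"]).upper()}

def match(fields: dict) -> bool:
    """Matching: search the normalized URL of a GET request for each signature."""
    return fields["method"] == "GET" and any(s in fields["url"] for s in SIGNATURES)

def anm(raw: str) -> bool:
    fields = analyze(raw)
    return fields is not None and match(normalize(fields))

print(anm("GET /scripts/%43MD.exe HTTP/1.1\n"))
print(anm("POST /scripts/cmd.exe HTTP/1.1\n"))
```

Note that every request, benign or malicious, passes through all three functions, each of which walks the input again.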

Current conversion and signature matching
Input: GET <…>/%43MD.exe HTTP/1.1\n
1. Protocol analysis: Method = GET, URL = <…>/%43MD.exe, Version = HTTP/1.1
2. Normalization: URL = CMD.EXE
3. String matching against Sig = CMD.EXE: No → benign, Yes → malicious
• Naively, each phase requires traversing the input.
• In practice (e.g., Snort), two traversals:
  – Protocol analysis + normalization
  – Matching
• Notice that all traffic, benign and malicious, requires all three phases.

Protomatching
Input: GET <…>/%43MD.exe HTTP/1.1\n
• Goal: a single traversal of the input
• Protomatching = Protocol analysis + Normalization + Matching
• The separate phases (protocol analysis: Method = GET, URL = <…>/%43MD.exe, Version = HTTP/1.1; normalization: URL = CMD.EXE; pattern matching) are replaced by one matcher whose signature is still to be determined (Sig = ??).
• Output: No → benign, Yes → malicious

Protomatching
Input: GET <…>/%43MD.exe HTTP/1.1\n
• Sig = regular expression
• Single pass implies: use a deterministic finite state machine
• The one machine covers protocol analysis (Method = GET, URL = <…>/%43MD.exe, Version = HTTP/1.1), normalization (URL = CMD.EXE), and pattern matching (Sig = CMD.EXE).
• Output: No → benign, Yes → malicious

Converting a traditional signature into a protomatching signature
1. Let S be a traditional signature
2. Expand S to conform to the protocol specification

Traditional signature
• .*[c|C][m|M][d|D].[e|E][x|X][e|E]
• 8 states
• size = 8*256 = 2048 bytes
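The state count and table size can be checked with a small table-driven matcher. This is a sketch under assumptions: the table is built with the textbook KMP automaton construction (the slides do not say how their DFA was built), the "." is treated as a literal dot, and each transition is stored in one byte.

```python
def build_dfa(pattern: bytes) -> list:
    """Return one 256-entry row per state for '.*pattern' (ASCII case-insensitive)."""
    pattern = pattern.lower()
    n = len(pattern)
    fold = bytes(range(256)).lower()  # fold[b] is byte b with ASCII letters lower-cased
    rows = [[0] * 256 for _ in range(n + 1)]
    for b in range(256):
        if fold[b] == pattern[0]:
            rows[0][b] = 1
    restart = 0  # state reached by the longest proper suffix seen so far
    for j in range(1, n):
        rows[j] = list(rows[restart])      # on mismatch, behave like the restart state
        for b in range(256):
            if fold[b] == pattern[j]:
                rows[j][b] = j + 1         # on match, advance
        restart = rows[restart][pattern[j]]
    rows[n] = [n] * 256                    # accepting state is absorbing
    return [bytes(r) for r in rows]

def matches(dfa: list, data: bytes) -> bool:
    """Single pass over the input: one table lookup per byte."""
    state = 0
    for b in data:
        state = dfa[state][b]
    return state == len(dfa) - 1

dfa = build_dfa(b"cmd.exe")
print(len(dfa), "states,", sum(len(row) for row in dfa), "bytes")
print(matches(dfa, b"GET /scripts/CMD.exe HTTP/1.1\n"))
print(matches(dfa, b"GET /scripts/%43MD.exe HTTP/1.1\n"))
```

Seven pattern characters give eight states, and eight rows of 256 one-byte entries give the 2048 bytes quoted on the slide. The encoded request is not matched, since this signature knows nothing about encodings yet.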

Add a little bit of context
• .*“GET”.*[c|C][m|M][d|D].[e|E][x|X][e|E]
• 12 states
• size = 12*256 = 3072 bytes

And even more context
• (.*\n\n)*“GET”[SP]+(PN)*[c|C][m|M][d|D].[e|E][x|X][e|E]
• 18 states
• size = 18*256 = 4608 bytes
• SP denotes white space characters, and PN denotes characters that can appear in a URL according to the HTTP specification (e.g., ‘\n’ cannot appear in a URL).
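The effect of adding context can be tried out with Python's `re` module. The three patterns below are loose renderings of the slide signatures, not exact translations: SP is assumed to be space or tab, PN is approximated by "any non-whitespace character", the dot before "exe" is treated as a literal dot, and the request with a `Referer` header is a made-up example.

```python
import re

TAIL = r"[cC][mM][dD]\.[eE][xX][eE]"

bare = re.compile(TAIL)                                     # pattern only
with_get = re.compile(r"GET.*" + TAIL, re.S)                # "GET" anywhere before it
with_context = re.compile(r"(?:\A|\n\n)GET[ \t]+\S*" + TAIL)  # pattern inside the URL

attack = "GET /scripts/CMD.exe HTTP/1.1\n"
# Hypothetical benign request: the pattern appears in a header, not in the URL.
benign = "GET /a.html HTTP/1.1\nReferer: /cmd.exe\n\n"

for name, sig in [("bare", bare), ("with_get", with_get), ("with_context", with_context)]:
    print(name, bool(sig.search(attack)), bool(sig.search(benign)))
```

All three flag the attack, but only the most contextual one leaves the benign request alone, at the price of a larger automaton.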

Converting a traditional signature into a protomatching signature
1. Let S be a traditional signature
2. Expand S to conform to the protocol specification, obtaining S’
3. Expand S’ to account for all possible encodings, obtaining S’’

Representing encodings
• The character c can be represented as: C, c, %43, %63, %U0043, %U0063, %u0043, %u0063
• Replace every instance of the small machine (which matches the plain character) with the large machine (which matches all of its encodings)
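The substitution of a "large machine" for each character can be sketched as a regular-expression expansion. The helper names `encodings` and `expand` are hypothetical, and the sketch only emits upper-case hex digits (so it would miss a form such as %6d); it is meant to mirror the list above, not to be exhaustive.

```python
import re

def encodings(ch: str) -> list:
    """All listed representations of one character: both cases, %XX, %UXXXX, %uXXXX."""
    forms = []
    for c in sorted({ch.upper(), ch.lower()}):
        code = ord(c)
        forms += [c, f"%{code:02X}", f"%U{code:04X}", f"%u{code:04X}"]
    return forms

def expand(pattern: str) -> re.Pattern:
    """Replace every character of the pattern with an alternation over its encodings."""
    groups = ("(?:" + "|".join(re.escape(f) for f in encodings(ch)) + ")" for ch in pattern)
    return re.compile("".join(groups))

print(encodings("c"))
sig = expand("cmd.exe")
print(bool(sig.search("GET /scripts/%43MD.exe HTTP/1.1\n")))
print(bool(sig.search("GET /scripts/%u0063%6Dd.exe HTTP/1.1\n")))
```

Each character becomes an eight-way alternative (four-way for characters without case), which is where the growth in automaton size comes from.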

And even more context
• (.*\n\n)*“GET”[SP]+(PN)*[c|C][m|M][d|D].[e|E][x|X][e|E]
• 18 states
• size = 18*256 = 4608 bytes

• .*\n\n“GET”[SP]+(PN)*[c-C][m-M][d-D].[e-E][x-X][e-E] and hex encoding and U-encoding
• 53 states
• size = 53*256 = 13,568 bytes

Building a protomatcher
1. Let S be a traditional signature
2. Expand S to conform to the protocol specification, obtaining S’
3. Expand S’ to account for all possible encodings, obtaining S’’
4. Perform 1-3 for every traditional signature in your database, obtaining S1’’, S2’’, …, Sn’’
5. Build the protomatcher: an FSM that identifies S1’’, S2’’, …, Sn’’
Problem: we increased each signature by a factor of 7 (at least). A full protomatcher does not fit into 2 GB (or 4 GB) of memory.

Superset protomatching signature
• Assumption: the majority of the benign traffic is not only benign, but also not even similar to malicious traffic.
• For example, most benign traffic not only does not contain “cmd.exe”, but also does not contain “cmd.”
• Note that if a request does not contain “cmd.”, then it also does not contain “cmd.exe”.
• “cmd.” is a superset signature because it matches the attack and more.
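The filtering idea can be sketched as a two-phase check. This is an illustration only: plain substring tests stand in for the real matchers, the function names are hypothetical, and the sample URLs are made up.

```python
def superset_match(url: str) -> bool:
    """Phase 1: cheap check against the trimmed signature "cmd."."""
    return "cmd." in url.lower()

def full_match(url: str) -> bool:
    """Phase 2: stand-in for the full signature check."""
    return "cmd.exe" in url.lower()

def inspect(url: str):
    """Return (is_malicious, phase that made the decision)."""
    if not superset_match(url):
        return False, 1          # most benign traffic stops here
    return full_match(url), 2    # only look-alikes pay for the second phase

urls = ["/index.html", "/docs/cmd.txt", "/scripts/cmd.exe"]
for u in urls:
    print(u, inspect(u))
```

The first phase can produce false alarms ("/docs/cmd.txt") but never misses anything the full signature would catch, so the combined verdict equals that of the full check.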

Full protomatching signature for cmd.exe
• .*\n\n“GET”[SP]+(PN)*[c-C][m-M][d-D].[e-E][x-X][e-E] and hex encoding and U-encoding
• 53 states
• size = 53*256 = 13,568 bytes

Superset protomatching signature for cmd.exe
• .*\n\n“GET”[SP]+(PN)*[c-C][m-M][d-D].[e-E][x-X][e-E] and hex encoding and U-encoding
• 29 states
• size = 29*256 = 7,424 bytes

Building a superset protomatcher
1. Let S be a traditional signature
2. Trim S into a superset signature (e.g., “cmd.exe” into “cmd.”), obtaining S’
3. Expand S’ to conform to the protocol specification, obtaining S’’
4. Expand S’’ to account for all possible encodings, obtaining S’’’
5. Perform 1-4 for every traditional signature in your database, obtaining S1’’’, S2’’’, …, Sn’’’
6. Build the protomatcher: an FSM that identifies S1’’’, S2’’’, …, Sn’’’

Superset Protomatching
Input: GET <…>/%43MD.exe HTTP/1.1\n
• Sig = superset protomatching signature
• Superset protomatcher: match a superset protomatching signature
  – No → benign
  – Yes → protocol analysis (Method = GET, URL = <…>/%43MD.exe, Version = HTTP/1.1), normalization (URL = CMD.EXE), and pattern matching against Sig = CMD.EXE: No → benign, Yes → malicious

Implementation
• Implemented a compiler that converts a traditional signature into a protomatching signature
• The compiler also builds the protomatcher
• Incorporated the protomatcher into Snort
• Used traditional Snort as the second phase of a superset protomatcher

Two ways to implement a protomatcher
• Using a deterministic FSM. This is what the examples so far use.
• Using a hierarchical FSM. It has two parts: a matcher and a normalizer.
  – The matcher is responsible for protocol analysis and pattern matching.
  – The normalizer is responsible for processing multiple encodings.
  – Unlike ANM, which first normalizes the whole HTTP request, it uses the normalizer only when necessary.
  – This can help reduce the memory needed.
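The split between matcher and normalizer can be sketched as follows. This is an assumption-laden illustration, not the authors' design: the matcher here does only pattern matching (protocol analysis is omitted), it uses a KMP failure table, and it calls the normalizer only when it meets a '%' character. The names `decode_next`, `failure`, and `match` are hypothetical.

```python
import string

HEX = set(string.hexdigits)

def decode_next(data: str, i: int):
    """Normalizer: decode the %XX or %uXXXX token starting at data[i] (a '%')."""
    if data[i + 1:i + 2] in ("u", "U"):
        digits, end = data[i + 2:i + 6], i + 6
        width = 4
    else:
        digits, end = data[i + 1:i + 3], i + 3
        width = 2
    if len(digits) == width and all(d in HEX for d in digits):
        return chr(int(digits, 16)), end
    return data[i], i + 1  # not a valid escape: treat '%' literally

def failure(pattern: str) -> list:
    """KMP failure table: length of the longest proper border of each prefix."""
    table, k = [0] * len(pattern), 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table

def match(data: str, pattern: str = "cmd.exe"):
    """Matcher: single pass; returns (found, number of normalizer calls)."""
    table, state, i, calls = failure(pattern), 0, 0, 0
    while i < len(data):
        if data[i] == "%":
            ch, i = decode_next(data, i)
            calls += 1
        else:
            ch, i = data[i], i + 1
        ch = ch.lower()
        while state and ch != pattern[state]:
            state = table[state - 1]
        if ch == pattern[state]:
            state += 1
        if state == len(pattern):
            return True, calls
    return False, calls

print(match("GET /scripts/%43MD.exe HTTP/1.1\n"))
print(match("GET /index.html HTTP/1.1\n"))
```

Requests without any encoded characters never reach the normalizer, which is the point of invoking it only when necessary.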

Performance improvement
[Figure: ApPPT, the Average per Packet Processing Time (cycles)]

Comparison between protomatchers’ memory size
[Figure: memory size of the protomatchers]

Sensitivity to Cache Poisoning Attack
• We assumed that the attack would have a larger effect on a protomatcher-based Snort than on vanilla Snort.
• But the result contradicts the assumption. There might be two reasons for this result:
  – First, the attack was ineffective in increasing the number of cache misses. This means that a more sophisticated cache poisoning attack is needed.
  – Second, the attack was effective, but cache performance is only a minor component of the ApPPT.

Conclusion
• Optimizing for the common case is a known method
• In this talk we presented a technique that uses this method to improve matching efficiency
• Our technique is based on formal methods
• These methods enable automation, and therefore efficiency, and facilitate accuracy

Discussion on shortcomings
• Failure due to cache-poisoning attacks
• Converting a protomatching signature to a superset signature has to be done manually. Better methods?