Design of a System for Real-Time Worm Detection















Design of a System for Real-Time Worm Detection
Bharath Madhusudan, John Lockwood
Department of Computer Science and Engineering, Washington University, St. Louis
© 2004 IEEE
Presented by Stephen Karg, November 14, 2005

Contributions
The Problems:
1. Many IDSs have limited effectiveness because they can only filter known worms.
2. Dark-space scan detection can’t defend against hit-list worms.
Proposed Solutions:
1. Monitor network traffic to automatically detect new worms in real time.
2. Analyze packet content, not headers, to derive a signature for the new worm.

Their Goals
• Low reaction time
• High throughput
• Low cost
• Low false-positive rate
• Robust to simple countermeasures

System Properties
• Designed to work in tandem with a signature-based IDS. Frequently occurring content = new signature.
• Hardware-based system to keep pace with high-volume traffic (Gigabit Ethernet).
• Centralized monitoring.
• Computationally intensive, hence the need for a H/W-based system.

General Algorithm
1. Hash over a sliding 10-byte window of the packet-content data stream (header data stripped).
   • So there are multiple hashes over each payload (gets around basic metamorphism, shuffling blocks, etc.).
2. On-chip vector of counters* for each hash value. 3-stage pipeline: 1 read/increment/write per clock cycle.
3. If the threshold count is exceeded, the offending signature is hashed to off-chip SRAM.
4. Iff a 2nd signature is hashed to the same SRAM bucket (and matches the first), an alert is thrown.
   • This last step reduces false positives.
* Counters are 8-bit and periodically reduced by the average count (called timeouts).
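The four steps above can be sketched in software as follows. This is a minimal illustration, not the authors' hardware design: the CRC-32 hash, the counter-vector size, the SRAM bucket count, and the threshold are assumed values chosen for the example, and `Detector` is a hypothetical name.

```python
import zlib

WINDOW = 10           # bytes per sliding window, as on the slide
NUM_COUNTERS = 512    # assumed size of the on-chip counter vector
SRAM_BUCKETS = 4096   # assumed number of off-chip SRAM buckets
THRESHOLD = 3         # assumed count that marks a window as suspicious


class Detector:
    """Software stand-in for the counter vector plus SRAM confirmation stage."""

    def __init__(self):
        self.counters = [0] * NUM_COUNTERS
        self.sram = {}  # bucket index -> last suspicious window stored there

    def process(self, payload: bytes) -> list:
        """Hash every 10-byte window of one payload; return confirmed alerts."""
        alerts = []
        for i in range(len(payload) - WINDOW + 1):
            chunk = payload[i:i + WINDOW]
            h = zlib.crc32(chunk)  # assumed hash function
            idx = h % NUM_COUNTERS
            # Step 2: 8-bit saturating counter per hash value.
            self.counters[idx] = min(self.counters[idx] + 1, 255)
            if self.counters[idx] > THRESHOLD:
                # Step 3: the offending window is hashed to an SRAM bucket.
                bucket = h % SRAM_BUCKETS
                if self.sram.get(bucket) == chunk:
                    # Step 4: a second, matching arrival raises the alert.
                    alerts.append(chunk)
                else:
                    self.sram[bucket] = chunk
        return alerts

    def timeout(self):
        """Periodic reduction of every counter by the average count."""
        avg = sum(self.counters) // len(self.counters)
        self.counters = [max(c - avg, 0) for c in self.counters]
```

Feeding the same byte string inside several otherwise different payloads should push its windows past the threshold, store them, and raise an alert on the next sighting. Calling `timeout()` between batches keeps slowly accumulating benign counts from reaching the threshold.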

Design Considerations
1. Throughput:
   – Steps 1 & 2 implemented in parallel using multiple window/vector pairs. Counters aggregated.
2. Benign Strings:
   – False-positive potential with regularly occurring strings (e.g. the first several bytes of an HTTP request).
   – Sysadmin can reconfigure to ignore them.

Design Considerations (cont.)
3. False-Positives:
   Potential Counter-Attack: Flood IDS with packet(s) repeating the same string.
   Solution:
   • Count any given signature only once per window of size T (not the same window as before; a larger one).
   • Bloom filter used (prior research).
     1. False positives can be kept low using a proven formula.
     2. Signatures over the window are stored compactly and queried efficiently with dual-ported on-chip memory.
4. Threshold vs. timeout relationship
   • Reduces to a well-studied problem in hashing; the false-positive rate can again be calculated and minimized.
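To make the "count once per window T" idea concrete, here is a small Bloom-filter sketch. It is an assumption-laden illustration: the SHA-256-derived hash positions, the filter size, and the names `BloomFilter` and `add_if_new` are hypothetical, and the false-positive expression is the standard Bloom-filter approximation, which may or may not be the exact formula the paper relies on.

```python
import hashlib
import math


class BloomFilter:
    """Remembers which signatures were already counted in the current window T."""

    def __init__(self, m_bits: int, k_hashes: int):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive k bit positions by salting one hash (an assumed construction).
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add_if_new(self, item: bytes) -> bool:
        """Insert item; return True only if it was not already (apparently) present."""
        new = False
        for pos in self._positions(item):
            byte, bit = divmod(pos, 8)
            if not self.bits[byte] & (1 << bit):
                new = True
                self.bits[byte] |= 1 << bit
        return new


def false_positive_rate(m_bits: int, k_hashes: int, n_items: int) -> float:
    """Standard approximation: (1 - e^(-k*n/m))^k."""
    return (1 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes
```

A counter would be incremented only when `add_if_new` returns `True`, so a flood of identical strings within one window contributes a single count. Clearing the filter at the end of each window T starts the next window fresh.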

Performance Evaluation
• “Normal” packet stream uses a 2-day trace of UC Berkeley FTP server traffic.
  – What about other types of traffic? Notably SMTP.
• Worm-like data inserted into the above stream.
  – Does the stream reflect epidemic behavior? Worms are detected, but are they detected in time?
  – Perhaps reaction/containment is out of scope here.
• Would have liked to see performance on a sandboxed subnet with real traffic and real worms.

Evaluation Results
• Detecting larger worms is more difficult.

  Signature Length (bytes) | Concentration in Trace Data
  500                      | 1%
  1000                     | 2%
  5000                     | 3%
  20000                    | 7%
  50000                    | 11%

  – If worm size exceeds the number of buckets/counters, all of them will be incremented as it passes, so nothing stands out.
  – Prototype has 64 x 512 counters (each w/10 B window, ~276 KB).
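The saturation effect can be estimated with a simple balls-in-bins model. This sketch assumes every 10-byte window hashes uniformly and independently into the 64 x 512 counters, which is an idealization and not something the slides state; `expected_fraction_touched` is a hypothetical helper name.

```python
def expected_fraction_touched(payload_len: int, num_counters: int,
                              window: int = 10) -> float:
    """Expected share of counters incremented at least once by one payload."""
    windows = max(payload_len - window + 1, 0)
    # Each window misses a given counter with probability (1 - 1/N).
    return 1.0 - (1.0 - 1.0 / num_counters) ** windows


NUM_COUNTERS = 64 * 512  # prototype size quoted on the slide

for length in (500, 1000, 5000, 20000, 50000):
    share = expected_fraction_touched(length, NUM_COUNTERS)
    print(f"{length:>6} bytes touches about {share:.1%} of counters")
```

Under this model a short signature raises only a handful of counters, while a very long one raises a large share of them at once, which is consistent with the slide's point that large worms produce no stand-out counter.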

Evaluation Results (cont.)
• Memory collisions decrease with use of more dual-ported memory blocks.
  – Not surprising, but tests show hardware requirements (and diminishing returns).
  • 64 blocks, 0.02 collision rate.
  – Also shows empirical collision rate to be consistently below theoretical calculations.

Functional prototype
– 64 Block RAMs
– Calculates 4 hash values per clock cycle.
– Targeted to run on the FPX platform with FPGA hardware.
– Circuit implementation runs at 91.5 MHz.
– Introduces a pipeline delay of 70 ns into the datapath.
– Allows processing at OC-48 line speeds.
– Conclusion: real-time performance.
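A quick back-of-the-envelope check of the OC-48 claim, using the figures on this slide. It assumes each of the 4 hash values per cycle corresponds to advancing the window by one byte, which the slides do not state explicitly; the OC-48 rate of 2488.32 Mbit/s is the standard SONET line rate.

```python
CLOCK_HZ = 91.5e6          # circuit clock from the slide
HASHES_PER_CYCLE = 4       # hash values computed per clock cycle
OC48_BPS = 2488.32e6       # standard OC-48 line rate in bits per second


def scan_rate_bps(clock_hz: float, hashes_per_cycle: int,
                  bytes_per_hash: int = 1) -> float:
    """Bits of payload consumed per second.

    bytes_per_hash=1 encodes the assumption that each hash value
    covers one new byte position of the sliding window.
    """
    return clock_hz * hashes_per_cycle * bytes_per_hash * 8


rate = scan_rate_bps(CLOCK_HZ, HASHES_PER_CYCLE)
print(f"scan rate: {rate / 1e9:.3f} Gbit/s, OC-48: {OC48_BPS / 1e9:.3f} Gbit/s")
```

Under that assumption the scan rate comes out above the OC-48 line rate, which matches the slide's conclusion that the prototype keeps up in real time.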

Conclusions
• A move towards more automated NIDS.
  – Yes, remove the slow humans from the equation.
  – Performance is impressive considering a speed-of-light adversary.
• Exploit parallelism afforded by hardware to scan a much larger amount of traffic than traditional software implementations of a similar algorithm.
  – But do we need to add the H/W requirement & cost?
  – Does every packet need to be seen to spot a trend?
  – Could software use sampling to produce the same results? Or will it fall too far behind growth in bandwidth?

Conclusions (cont.)
• Argue it is much easier to deploy and maintain a centralized NIDS than a host-based system.
  – Sure, but as effective? (Wu’s presentation)
• System robust to “simple” countermeasures.
  – Perhaps the paper’s greatest weakness. Only the simplest metamorphism is defended against (block reordering, some nop insertion).
  – Instruction replacement: UNDETECTED
  – Instruction reordering: UNDETECTED
  – Polymorphic decryptor engines: UNDETECTED
  – Or just pad w/garbage until 277 KB long!

Questions? Thanks.