Very Fast Containment of Scanning Worms Written By

Very Fast Containment of Scanning Worms Written By: Nicholas Weaver, Stuart Staniford, Vern Paxson Presentation By: Nathan Johnson A. K. A Space Monkey and Jeff Janies

Worms • Malicious, self propagating programs • Types: – Scanning – picking “random” addresses and attempting to infect – Topological – attempt and discover topology and then infect – Meta Sever – Domain controller attacks – Passive – Sniff other traffic and infect them – Hit list – worm already knows targets to infect – Social – E-mail worms and human stupidity 2

Scanning Worms Cont. – Scanning • Linear – probe the entire address space • Fully random – randomly select address spaces • Bias toward local addresses – random searches within the current domain before propagation 3

Examples • Linear –horizontal and vertical – Blaster • Random – Code Red I (version 2) • Bias towards local – Code Red II and Nimda/Nimba/README. EXE • Permutation Scan – Theoretical 4

How do we Contain them? • Shut the network down – Crude, self-inflicted DOS – Not infected, but not affective – Achieves most attackers goals • Break network into small cells – – Each cell is autonomous Block infected cells connections to healthy cells Still have functionality of most of the network compartmentalized response 5

How do we find a worm? • Scanning worms make many connection attempts. – They do not connect nearly as much as they attempt. • Not always the same host – Sometimes the same system is infected many times – Infected systems may not stay active in propagation 6

Detection with Containment • Cooperation between cells • Sustained scanning threshold • Epidemic threshold – Depends on: – Sensitivity of the containment response devices – The density of the vulnerable machines on the network – The degree to which the worm is able to target its efforts in to the correct network, and even into the current cell 7

Threshold Random Walk (TRW) • Uses an oracle to determine success of connection – Successful connections drives random walk upwards – Failed connections drives random walk downwards • Benign traffic has higher probability of success • Requires fewer connections to detect malicious activity (around 4 or 5 connections) 8

Comparisons between Algorithms 9

Simplified TRW • Advantages – Can be done in hardware or software – Transparent to user – False positives do not increase • Disadvantages – False negatives increase – Stealth worm techniques can avoid detection • Tracks connection establishment rather than using an oracle 10

Hardware Difficulties • Memory access time – On 1 Gigabit connection 8 accesses (DRAM) • 4 in each direction – On 10 Gigabit connections 0 accesses (DRAM) • Must use SRAM 11

Hardware Difficulties (cont) • Memory size – SRAM currently only holds 10 s of megabytes – DRAM is in the Gigabyte range – Must keep memory size small so that both are options 12

Solutions • Use multiple memory banks – Two accesses simultaneously – Cost goes up • Restrict memory size to 16 MB – Approximate network state – For this method of detection this is all that is needed – This method uses only 5 MB for caches 13

Approximation Cache • A cache for which collisions cause imperfections • Simple lookup in bounded space • Structured to avoid false positives • Collisions cause aggregation – Can only cause false negative 14

Attacking the Cache • Predicting the hash – Create collisions to evict or combine data to cause false positives or negatives • Flooding the Cache – Massive amounts of normal data to mask the true attack 15

Block Cipher • Principle – 32 bit block cipher – Permute an N bit value into an index – Use K bits for index and N-K bits for tag • Application – Uses Serpent S-boxes – Requires only 8 levels of logic – Can be implemented on FPGA or ASIC 16

Approximation of TRW • Track connections with the approximation cache • Track success and failure of connection to: – New address – New port at old address – Old port at old address (if entry timed out) • Track everything that you can 17

Structure • Connection table (1 MB) – Stores age and established direction (in-to-out or out-to-in) – Indexed by hash of inside IP, outside IP, and inside port number (in TCP) • Address cache (4 MB) – Stores information about external addresses – Address is encrypted with 32 -bit cipher – Count = Hits - Misses 18

The Structure 19

Variables • Threshold (T) – The constant being compared to the count • Cmin , Cmax - The minimum/maximum values the count can obtain – Legitimate hosts can go bad – Bad hosts can become good • Dmiss , Dconn – The maintenance parameters – Misses are cumulative but not over all time – Need to remove idle connections 20

Operation (from the outside) • Established Connection’s packet – Reduce age in connection table to 0 • Packet from outside – if has corresponding connection request from inside, address’s count = count -1 – Otherwise, external address’s count = count +1 21

Operations (from the inside) • Establishment connection from the other side – External Address’s count = count -2 – Must compensate for the previous charge to the outside address 22

Operations (ultimate goal) • If count is greater than a predefined threshold, it is blocked. – Only already existing connections are maintained • Dropped unless session already exists – TCP RST, RST+ACK, SYN+ACK, FIN+ACK 23

Evaluation • 6000 hosts connected to the internet • 50 -100 Mbps 8 -15 K packets/sec • In a day: – 20 M external connection attempts – 2 M internally initiated connection attempts • Main trace: – 72 minutes – 44 M packets, 48052 external hosts, and 131 K internal addresses 24

Evaluation • Threshold of 5 – 470 alerts – No false positives – These are only the ones between 5 and 19 25

Evaluation • Maximize sensitivity – – Cmin = -5, Dmiss = infinity – Mis-configurations showed up – These are the lowest Max counts 26

Cooperation between Cells • Every containment device knows the number of blocks others have in effect • Each cell computes its own threshold using this knowledge – Reduces T by where θ controls how aggressively to reduce T and X is the number of other blocks in place – Additionally each cell must increase 27

Affect of Theta 28

Inter-cell Communication • Tests performed under the assumption that cell communication is instantaneous in comparison to worm propagation • Slow communications may allow a worm to propagate before any threshold modifications can take place • Possible solutions: – Using a broadcast address – Caching recently contacted addresses 29

Inadvertent False Positives • Artifacts of the detection routines – Potentially more severe – In testing, does not appear to be a problem with the algorithm used in this paper • “Benign” scanning 30

Malicious False Positives • Attacker can “frame” another through packet forging – Internal addresses preventions • Use MAC address and switch features to prevent spoofing or changing MAC addresses. • Setup HTTP proxies and mail filters to filter malicious content – External addresses may still be spoofed and blocked 31

Malicious False Negatives • Occurs when a worm is able to continue despite the active scan-containment • Worm continues to infect the network without being noticed 32

Avoiding Detection • Propagate via a different means – Topological, meta-server, passive, hit-list, etc • Operate Below scanning threshold • Scan for liveliness on white-listed port – Imperfect, but lowers failure rate • Obtain multiple network addresses – Lowers epidemic threshold by a factor of K if the attacker can obtain K network addresses 33

Attacking Cooperation • Outrace containment • Flood containment coordination channels – Cells should have reserved communication bandwidth to prevent this • Cooperative Collapse – High false positives lowering thresholds which in turn increases the false positives – Attacker can amplify this effect by causing scanning within the cells 34

Added Risks using Simplified TRW • Exploiting approximation caches’ hash and permutation functions – Hash countermeasure: Block-cipher based – Hide scanning in a flood of spoofed packets • Pollutes connection cache with half-open connections • Not very feasible due to level of resources required • Could spread as well using slow, distributed scan • Two-sided evasion technique 35

Two-sided Evasion • Requires two computers – One on each side of the containment device • Uses the accomplice machine to provide a valid connection to balance out the scanning 36

Two-sided Countermeasures • Perform only horizontal scans – Advantages: Greatly limits evasion potential – Disadvantages: Cannot detect vertical scans • Split per-address count into two counts – Scanning internal network and on the Internet – Still allows for Internet scanning, but protects internal network • Use two containment implementations – Doubles required resources – Provides protection from general scanning and scanning for evasive techniques 37

Weaknesses • Assume instantaneous communication time between cell – Does not account for bandwidth consumption that occurs in worm attacks • Assume accurate communication between cells • Does not account for the existence of P 2 P networks 38

Contributions • Provides a mechanism for detection and containment – Used in hardware/software • Provides granularity of network – Containment is not limited to an entire subnet • Cooperation between granular units enhances containment and improves containment time 39

References • “Worst-Case Worm”, Paxson, Weaver • “How to 0 wn the Internet in Your Spare Time”, Staniford, Paxson, Weaver • “Fast Portscan Detection Using Sequential Hypothesis Testing”, Jung, Paxson, Berger, and Balakrishnan 40