Botfarm Development
Dynamic Malware Containment
Vern Paxson & Christian Kreibich
UC Berkeley Team
Nov. 20, 2009

Role of the Botfarm
[What are we doing?]
• Controlled habitat for large-scale botnet experimentation
  - Support safe yet faithful analysis
• Tight control over communications
  - Internally: isolation between bots (& infrastructure)
  - Externally: prevent malicious traffic, preserve C&C
• Habitat for wide range of malware
  - Provide suitable platforms despite anti-VM techniques etc.
  - Create illusion of unconstrained operation
• Enable long-term experimentation & operation
  - Behavioral analysis, botnet infiltration
  - Automated C&C analysis, C&C rewriting

Critical issue: Containment
[Who cares?]
• Containment is vital
  - Must prevent attacks or abusive traffic from impinging on outside world
  - Ethical, legal & operational motivations
• Containment is hard
  - Program behavior not known ahead of time
  - Behavior may change over time
  - Botnet infiltration & analysis requires (unknown!) C&C to be allowed
  - Botmaster trump card: require successful attacks before bot can “join the gang”

State of the Art
[How is it done today?]
• Focus on static, packet-level containment
  - Inflexible: need dynamic, per-bot policies
  - Unsafe: need deep app-level control
• Focus on network traffic
  - Incomplete: program inspection required for context
• Insufficiently granular
  - E.g., “Allow HTTP, redirect SMTP” may leak attacks
  - Infiltration requires precise C&C rewriting & synthesis
• Single-experiment setup
  - Production botfarm requires mutually isolated “subfarms”

Botfarm Development Efforts
[What's new? Payoffs?]
• Separate policy & mechanism
  - Containment servers implement per-bot containment
  - Focus on app-level functionality
• Automate
  - Full bot lifecycle management enables scalability
    o Infection / activity monitoring / cleanup / re-infection
• Diversify the habitat
  - Virtual machines (VMware, QEMU, Xen), raw iron
  - Binary analysis & tracing platforms
  - Integration of external communication modules
    o Address diversity, Tor, GRE tunnels to remote components

Containment Servers
[What's new? What difference will it make?]
• Flexible, policy-controlled, transparent app-layer proxies
  - Separate containment decisions (policy) from packet-level forwarding (mechanism)
  - Provide flexible containment decisions
    o Drop, Rate-limit, Redirect, Reflect, Rewrite
  - Internal servers can require containment too
    o E.g., remote SMTP banner-grabbing for SMTP sinks
  - Enable lifecycle management
    o Auto-infection, activity monitoring, recovery/restart
• Scalability via multiple servers per subfarm
• Realization: shimming protocol
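The policy/mechanism split on this slide can be sketched roughly as follows. This is a minimal illustration, not the botfarm's actual code: the class names (`Flow`, `Decider`, `RustockDecider`, `ContainmentServer`), the default-deny fallback, and the C&C address `192.0.2.10` (a documentation-range address) are all assumptions made for the example. Only the verdict names come from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    # Verdict names as listed on the slides.
    FORWARD = "forward"
    DROP = "drop"
    LIMIT = "limit"
    REDIRECT = "redirect"
    REFLECT = "reflect"
    REWRITE = "rewrite"


@dataclass(frozen=True)
class Flow:
    vlan: int          # identifies the inmate (one bot per VLAN, assumed)
    dst_ip: str
    dst_port: int


class Decider:
    """Policy: maps a flow to a verdict. The base policy denies everything."""

    def decide(self, flow: Flow) -> Verdict:
        return Verdict.DROP


class RustockDecider(Decider):
    """Hypothetical per-bot policy: sink all SMTP, filter known C&C."""

    CNC_HOSTS = {"192.0.2.10"}  # placeholder address, not a real C&C host

    def decide(self, flow: Flow) -> Verdict:
        if flow.dst_port == 25:
            return Verdict.REFLECT      # full SMTP containment
        if flow.dst_port == 80 and flow.dst_ip in self.CNC_HOSTS:
            return Verdict.REWRITE      # C&C filtering
        return Verdict.DROP


class ContainmentServer:
    """Mechanism: finds the decider for a flow's VLAN and asks it."""

    def __init__(self) -> None:
        self._deciders: dict[int, Decider] = {}

    def assign(self, vlan: int, decider: Decider) -> None:
        self._deciders[vlan] = decider

    def handle(self, flow: Flow) -> Verdict:
        # Unknown VLANs fall back to the default-deny base policy.
        return self._deciders.get(flow.vlan, Decider()).decide(flow)
```

Note that the HTTP rule keys on destination host as well as port, which is the kind of granularity a plain "allow HTTP" rule lacks.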

Botfarm Architecture
[What's new?]

Subfarms
[What's new?]

[What's new?]
[Figure: shimming protocol message exchange. Labels shown: SYN, SYN’, SYN’-ACK, Request Shim, Response Shim. Shim fields shown: VLAN ID, Sender IP/port, Destination IP/port, Verdict (FORWARD, LIMIT, DROP, REDIRECT, REFLECT, REWRITE).]
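One way to picture the two shim messages is as small fixed-size records. The sketch below is an assumption throughout: the slides name the fields but not their wire layout, nor which shim carries which field, so the byte layout, the IPv4-only addressing, and the split (VLAN ID in the request, verdict in the response) are illustrative guesses.

```python
import socket
import struct
from dataclasses import dataclass

# Verdict names from the slides; the numeric codes are invented here.
VERDICTS = ("FORWARD", "LIMIT", "DROP", "REDIRECT", "REFLECT", "REWRITE")

# Hypothetical layouts, network byte order, no padding.
_REQ = struct.Struct("!H4sH4sH")    # VLAN, src IP, src port, dst IP, dst port
_RESP = struct.Struct("!B4sH4sH")   # verdict, src IP, src port, dst IP, dst port


@dataclass(frozen=True)
class RequestShim:
    vlan: int
    src: tuple  # (ip, port) of the sender
    dst: tuple  # (ip, port) the inmate tried to reach

    def pack(self) -> bytes:
        return _REQ.pack(self.vlan,
                         socket.inet_aton(self.src[0]), self.src[1],
                         socket.inet_aton(self.dst[0]), self.dst[1])

    @classmethod
    def unpack(cls, data: bytes) -> "RequestShim":
        vlan, sip, sport, dip, dport = _REQ.unpack(data)
        return cls(vlan, (socket.inet_ntoa(sip), sport),
                   (socket.inet_ntoa(dip), dport))


@dataclass(frozen=True)
class ResponseShim:
    verdict: str
    src: tuple
    dst: tuple  # may differ from the request, e.g. after REDIRECT

    def pack(self) -> bytes:
        return _RESP.pack(VERDICTS.index(self.verdict),
                          socket.inet_aton(self.src[0]), self.src[1],
                          socket.inet_aton(self.dst[0]), self.dst[1])

    @classmethod
    def unpack(cls, data: bytes) -> "ResponseShim":
        code, sip, sport, dip, dport = _RESP.unpack(data)
        return cls(VERDICTS[code], (socket.inet_ntoa(sip), sport),
                   (socket.inet_ntoa(dip), dport))
```

A round trip such as `RequestShim.unpack(shim.pack()) == shim` holds for any valid IPv4 endpoints under this layout.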

Lifecycle Automation
[What's new?]
• Prototype of subfarm configuration specifications
• E.g., 10 Rustock bots w/ idleness monitoring & SMTP sink

    [VLAN 10-19]
    Decider = Rustock
    Infection = bins/rustock.exe
    Trigger = *:25/tcp < 1/20min -> reboot

    [Autoinfect]
    Address = 10.9.8.7
    Port = 6543

    [SmtpBannerSink]
    Address = 10.3.1.4
    Port = 2525
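Because the prototype specification is INI-like, it can be read with Python's standard `configparser`. The sketch below is an assumption about how such a file might be consumed: the function name `parse_subfarm`, the returned structure, and the reading of the trigger as "fewer than 1 flow to port 25/tcp per 20 minutes leads to a reboot" are interpretations for illustration, not the botfarm's actual parser or semantics.

```python
import configparser
import re

CONFIG_TEXT = """\
[VLAN 10-19]
Decider = Rustock
Infection = bins/rustock.exe
Trigger = *:25/tcp < 1/20min -> reboot

[Autoinfect]
Address = 10.9.8.7
Port = 6543

[SmtpBannerSink]
Address = 10.3.1.4
Port = 2525
"""

# Assumed trigger grammar: <host>:<port>/<proto> < <count>/<minutes>min -> <action>
TRIGGER_RE = re.compile(
    r"(?P<host>\S+):(?P<port>\d+)/(?P<proto>\w+)\s*<\s*"
    r"(?P<count>\d+)/(?P<minutes>\d+)\s*min\s*->\s*(?P<action>\w+)"
)


def parse_subfarm(text):
    """Split the config into VLAN-range subfarm entries and named services."""
    # Only '=' is a delimiter, so the ':' inside the trigger is left alone.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    subfarms, services = [], {}
    for name in parser.sections():
        section = parser[name]
        vlan_range = re.fullmatch(r"VLAN (\d+)\s*-\s*(\d+)", name)
        if vlan_range:
            first, last = int(vlan_range[1]), int(vlan_range[2])
            trigger = TRIGGER_RE.fullmatch(section["Trigger"])
            subfarms.append({
                "vlans": list(range(first, last + 1)),
                "decider": section["Decider"],
                "infection": section["Infection"],
                "trigger": trigger.groupdict() if trigger else None,
            })
        else:
            services[name] = (section["Address"], section.getint("Port"))
    return subfarms, services
```

Calling `parse_subfarm(CONFIG_TEXT)` yields one subfarm entry covering VLANs 10 through 19 (the ten Rustock inmates) plus the two service endpoints.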

Monitoring and Reporting
[What's new?]
• Continuous tracking of inmate behavior (via Bro)

    Subfarm 1 [Containment server VLAN 11]
    ------------------------------------

    Gheg [x.y.0.164/10.3.9.247, VLAN 15]
    -----------------------------------
    FORWARD - permitted port
      #flows  target           port
          38  89.107.104.110   https
    REFLECT - full SMTP containment
      #flows  target           port
        3433  *.*              smtp

    Rustock [x.y.0.130/10.3.128.81, VLAN 165]
    -----------------------------------
    REFLECT - full SMTP containment
      #flows  target           port
       15776  *.*              smtp
    REWRITE - C&C filtering
      #flows  target           port
          38  174.139.29.114   http
           2  72.247.242.201   http

• Vision: fully navigable reports w/ drill-down & historical archive
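A report of this shape amounts to counting flows per inmate, verdict, target, and port. The following is a minimal sketch of that aggregation, assuming flow records arrive as simple tuples; the record format and the function names `summarize` and `render` are hypothetical, and the real reports are produced from Bro logs by tooling not shown on the slides.

```python
from collections import Counter, defaultdict


def summarize(records):
    """Count flows per (inmate, verdict, target, port).

    `records` is an iterable of (inmate, verdict, target, port) tuples,
    one per observed flow (an assumed input format).
    """
    report = defaultdict(lambda: defaultdict(list))
    for (inmate, verdict, target, port), n in Counter(records).items():
        report[inmate][verdict].append((n, target, port))
    return report


def render(report):
    """Render the per-inmate summary as indented plain text."""
    lines = []
    for inmate in sorted(report):
        lines.append(inmate)
        for verdict in sorted(report[inmate]):
            lines.append(f"  {verdict}")
            lines.append(f"    {'#flows':>6}  {'target':<16} port")
            # Busiest targets first.
            for n, target, port in sorted(report[inmate][verdict], reverse=True):
                lines.append(f"    {n:>6}  {target:<16} {port}")
    return "\n".join(lines)
```

Keeping the aggregation separate from the rendering leaves room for the drill-down and archival views named in the vision bullet.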

Overarching Challenge
[Risks, challenges]
• Starting with specimens of a new botnet …
  - … successfully instantiate in controlled environment …
  - … extract C&C functionality/structure …
  - … engage with working botnet …
  - … analyze its structural vulnerabilities …
  - … determine efficacy of corresponding attacks aimed at intelligence/repurposing/disruption
• Achieve such analysis …
  - … safely …
  - … with high fidelity at scale …
  - … much more readily than today’s one-off approaches achieve

Pending Issues
[Risks, challenges]
• Bot “quickening”
  - Getting bots to run in the first place / analyzing why an instance fails to functionally activate
• VM detection?
  - Compare w/ raw-iron run (diff syscall activity)
  - Binary analysis to find VM detection logic
• Containment restriction?
  - Network-based analysis of attempted connections
• Mundane environmental deficiency?
  - Need host-side tracing / analysis of exit path
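The "diff syscall activity" idea can be sketched as locating the first point where a VM run departs from a raw-iron run of the same sample. The example assumes each trace is already available as an ordered list of call names; trace collection, the function name `first_divergence`, and the use of `difflib` as the alignment method are illustrative choices, not something the slides specify.

```python
from difflib import SequenceMatcher


def first_divergence(raw_trace, vm_trace):
    """Return (index, raw_calls, vm_calls) at the first point where the
    two traces differ, or None if they are identical.

    `index` is a position in `raw_trace`; the two lists hold the calls
    each run made in place of one another at that point.
    """
    matcher = SequenceMatcher(None, raw_trace, vm_trace, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            return i1, raw_trace[i1:i2], vm_trace[j1:j2]
    return None
```

The calls just before the divergence point are then candidates for where VM-detection logic ran, which narrows the region that binary analysis has to cover.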

Pending Issues, cont’d
[Risks, challenges]
• Generation of containment policies
  - Currently, burdensome manual process
  - Semi-automation via
    o Matching observed activity against previous containment templates
    o Generalizing activity across multiple bot runs
• Dynamic containment via speculative execution
  - Snapshot VM at point of outbound connection
  - Reflect to internal server for analysis
  - If vetted okay, recover to snapshot point & allow out
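The three speculative-execution steps can be sketched as a small control loop. Everything here is a toy model under stated assumptions: `ToyVM` stands in for a real hypervisor's snapshot/restore interface, peers are plain lists that collect whatever the bot sends, and `vet` is a caller-supplied predicate. None of these names come from the slides.

```python
import copy


class ToyVM:
    """Stand-in for an inmate VM; real snapshots would come from the hypervisor."""

    def __init__(self, outbound: bytes):
        self.outbound = outbound            # what the bot sends on connect
        self.state = {"connections": 0}

    def snapshot(self):
        return copy.deepcopy(self.state)

    def restore(self, snap):
        self.state = copy.deepcopy(snap)

    def connect(self, peer: list) -> bytes:
        self.state["connections"] += 1
        peer.append(self.outbound)
        return self.outbound


def speculative_connect(vm, vet, internal_sink, outside):
    """Snapshot, reflect to an internal sink, vet, then roll back.

    Returns True if the traffic was vetted and replayed to `outside`.
    """
    snap = vm.snapshot()                    # 1. snapshot at outbound connection
    transcript = vm.connect(internal_sink)  # 2. reflect to internal server
    allowed = vet(transcript)
    vm.restore(snap)                        # 3. recover to the snapshot point
    if allowed:
        vm.connect(outside)                 #    and allow the connection out
    return allowed
```

Because the VM is rolled back before the real connection, the bot observes a single connection attempt regardless of the verdict, under the assumption that its behavior is deterministic across the replay.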

Pending Issues, cont’d
[Risks, challenges]
• Network-level C&C analysis and synthesis
  - Inference of (non-encrypted) C&C structure
  - Drive generation of C&C parsers from binary analysis
    o Along with C&C templates for mimicry
• Construction of faux C&C servers to drive contained experiments of botnet as complex distributed system
• Safely explore large-scale effects/dynamics of C&C disruption