Static Analysis for Security: A Case Study in the Automation of Code Auditing
Omer Tripp
November 9th, 2009

Agenda
  • Motivation
  • Solution space
  • Security violations
  • Taint analysis
  • Demo
  • Conclusion

Some Statistics
  • The average number of bugs per KLOC is 15 [1]
  • Developers find 6 defects per hour in code reviews [2]

Some Math
  • There are 30 MLOC in eBay’s codebase
    – ~450 K bugs (30,000 KLOC × 15 bugs per KLOC)
    – ~75 K hours to find (at 6 defects per hour)
  • There are 50 MLOC in Windows Server 2003
    – ~750 K bugs
    – ~125 K hours to find

Some More Statistics
  • Heavy-weight static-analysis techniques process ~1 K LOC per second
  • Light-weight static-analysis techniques process ~5 K LOC per second
  • Human reviewers can only (effectively) digest 300 LOC per hour ≈ 0.08 LOC per second [3]
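To make the gap concrete, the rates above can be turned into wall-clock time for a 30 MLOC codebase (the eBay figure quoted earlier). This is only a back-of-the-envelope sketch; the class and method names are made up for illustration.

```java
// Rough review-time estimate using only the rates quoted on the slides.
// ReviewTime and hours() are hypothetical names, not part of any tool.
class ReviewTime {
    /** Hours needed to process `loc` lines at a rate of `locPerSecond`. */
    static double hours(double loc, double locPerSecond) {
        return loc / locPerSecond / 3600.0;
    }

    public static void main(String[] args) {
        double codebase = 30_000_000;        // 30 MLOC
        double human = 300.0 / 3600.0;       // 300 LOC per hour, expressed in LOC per second
        System.out.printf("heavy-weight analysis: %.1f h%n", hours(codebase, 1_000));
        System.out.printf("light-weight analysis: %.1f h%n", hours(codebase, 5_000));
        System.out.printf("human review:          %.0f h%n", hours(codebase, human));
    }
}
```

Under these assumptions the heavy-weight analysis needs roughly 8.3 hours, the light-weight analysis roughly 1.7 hours, and a single human reviewer about 100,000 hours.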

Bottom Line
  • Manual auditing is problematic:
    – Too costly!
    – Doesn’t fit into the SDLC
    – Results influenced by subjective considerations
  • Sometimes it’s also impossible:
    – Third-party component packaged as a binary
    – Human auditing leaks IP
    – No in-house experts

What Can Automation Do?
  • Wide range of applications, including:
    – Run-time errors (e.g., NPE, unhandled exceptions)
    – Security analysis
    – Performance analysis
    – Liveness properties
    – Synchronization problems
    – Quality issues
    – Refactoring
    – …

Static-analysis Tools

Dynamic-analysis Tools

Software Security
  • Integrity
    – Untrusted inputs flowing into security-sensitive areas
  • Confidentiality
    – Private information flowing into public areas
  • DoS
    – Overwhelming the system
    – Causing crashes

Example Integrity Violations
  • Cross-site scripting (XSS)
  • SQL injection (SQLi)

Example Confidentiality Violations
  • Error leakage
  • Insufficient anonymity

Denial of Service
  • Classic DoS/DDoS
  • Through an integrity problem

Code Examples

XSS:

    public partial class Customize : System.Web.UI.Page
    {
        …
        protected void Page_Load(object sender, System.EventArgs e)
        {
            …
            // Source: untrusted query-string parameter
            string langParam = Request.QueryString["lang"];
            …
            if (langParam != "") { lang = langParam; }
            …
            // Sink: the value is written into the page unchecked
            langLabel.Text = lang;
            …
        }
    }

SQLi:

    public partial class Transfer : System.Web.UI.Page
    {
        …
        protected void Page_Load(object sender, System.EventArgs e)
        {
            …
            // Source: untrusted cookie value
            string thisUser = Request.Cookies["amUserId"].Value;
            GetAccounts(thisUser);
            …
        }
        …
        private void GetAccounts(string userId)
        {
            …
            // Sink: the value is concatenated into the SQL query unchecked
            string query = "SELECT accountid, acct_type From accounts WHERE userid = " + userId;
            …
            myAccount = new OleDbDataAdapter(query, myConnection);
            …
        }
    }

Taint Analysis
  • The problem of finding flows from unchecked or poorly checked inputs to security-sensitive operations
  • Can be solved as a graph-reachability problem
  • Captures the vast majority of integrity/confidentiality problems
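The reachability view can be sketched in a few lines of Java. The sketch below assumes the flow graph is already given as named nodes and edges (here hand-built from the Transfer page shown earlier); the class name, the method names and the notion of a "sanitizer" node that blocks propagation are illustrative assumptions, not the API of any particular tool.

```java
import java.util.*;

// Minimal sketch: taint analysis as reachability over a hand-built flow graph.
// All names here are hypothetical; a real analysis derives the graph from code.
class TaintReachability {
    private final Map<String, List<String>> edges = new HashMap<>();

    void addFlow(String from, String to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    /** Shortest flow from source to sink that avoids sanitizers, or an empty list if none exists. */
    List<String> findFlow(String source, String sink, Set<String> sanitizers) {
        Map<String, String> parent = new HashMap<>();   // also serves as the visited set
        Deque<String> queue = new ArrayDeque<>();
        parent.put(source, null);
        queue.add(source);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (node.equals(sink)) {
                // Walk the parent links back to the source to build a witness path.
                LinkedList<String> path = new LinkedList<>();
                for (String n = node; n != null; n = parent.get(n)) path.addFirst(n);
                return path;
            }
            for (String next : edges.getOrDefault(node, List.of())) {
                if (sanitizers.contains(next) || parent.containsKey(next)) continue;
                parent.put(next, node);
                queue.add(next);
            }
        }
        return List.of();
    }

    public static void main(String[] args) {
        TaintReachability g = new TaintReachability();
        g.addFlow("Request.Cookies", "thisUser");
        g.addFlow("thisUser", "userId");
        g.addFlow("userId", "query");
        g.addFlow("query", "OleDbDataAdapter");
        System.out.println(g.findFlow("Request.Cookies", "OleDbDataAdapter", Set.of()));
    }
}
```

Running `main` prints the witness path `[Request.Cookies, thisUser, userId, query, OleDbDataAdapter]`. Returning a path rather than a yes/no answer is a design choice: a reviewer usually needs the flow itself to confirm a finding.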

Bird’s-eye View
  • Build an index of all relevant entities (type hierarchy, methods, etc.)
  • Represent the program as a call graph
  • Track control and data flow on top of the call graph
  • Solve a reachability problem on top of the propagation graph (modulo some enhancements)

Taint Analysis Based on Program Slicing [4, 5]
  • Run the following algorithm:
    – Use statements defining untrusted inputs as the slicing criterion
    – Find the set S of all statements that are (control-) and data-flow dependent on the slicing criterion
    – For each s in S such that s is a security-sensitive operation, report all flows from statements in the slicing criterion to s
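The three steps above can be sketched as follows. The sketch assumes the dependence graph has already been computed and is supplied as edges between numbered statements; building it from real code is the hard part and is not shown. `SliceTaint` and its methods are hypothetical names, and slicing once per source (so that each reported sink is attributed to a specific input) is a simplification chosen for this example.

```java
import java.util.*;

// Sketch of slicing-based taint analysis over a hand-written dependence graph.
// Statement ids and dependence edges are hypothetical inputs.
class SliceTaint {
    // dependents.get(s) = statements that are data- or control-dependent on s
    private final Map<Integer, Set<Integer>> dependents = new HashMap<>();

    void addDependence(int from, int to) {
        dependents.computeIfAbsent(from, k -> new TreeSet<>()).add(to);
    }

    /** Forward slice: every statement transitively dependent on the criterion statement. */
    Set<Integer> forwardSlice(int criterion) {
        Set<Integer> slice = new TreeSet<>();
        Deque<Integer> work = new ArrayDeque<>(List.of(criterion));
        while (!work.isEmpty()) {
            for (int next : dependents.getOrDefault(work.pop(), Set.of())) {
                if (slice.add(next)) work.push(next);   // visit each statement once
            }
        }
        return slice;
    }

    /** For each untrusted-input statement, the security-sensitive statements in its slice. */
    Map<Integer, Set<Integer>> report(Set<Integer> sources, Set<Integer> sinks) {
        Map<Integer, Set<Integer>> flows = new TreeMap<>();
        for (int source : sources) {
            Set<Integer> hit = forwardSlice(source);
            hit.retainAll(sinks);                       // keep only security-sensitive statements
            if (!hit.isEmpty()) flows.put(source, hit);
        }
        return flows;
    }
}
```

For instance, with dependences 1→3, 2→3, 3→4 and 4→5, statement 1 as the only source and statement 5 as the only sink, `report` returns a single entry mapping 1 to {5}.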

Taint Analysis Based on a Storeless Abstraction

    X x = req.getParameter();        { x }
    Y y = new Y();
    y.f = x;                         { x, y.f }
    Z z = y.f;                       { x, y.f, z }
    resp.getWriter().write(z);
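One way to read the slide is that the analysis state is simply a set of tainted access paths, with no model of the heap. The sketch below replays the slide's statements under that reading. The class and its methods are hypothetical, and it makes simplifying assumptions that the slides do not state: straight-line code only, no aliasing, exact-match lookup of paths, and no removal of a path when it is overwritten with a clean value.

```java
import java.util.*;

// Sketch of a storeless taint abstraction: the state is a set of tainted
// access paths such as "x" or "y.f". Names and simplifications are assumptions.
class AccessPathTaint {
    private final Set<String> tainted = new TreeSet<>();

    /** Models `lhs = <untrusted input>`. */
    void source(String lhs) {
        tainted.add(lhs);
    }

    /** Models `lhs = rhs`: lhs becomes tainted if rhs is tainted. */
    void assign(String lhs, String rhs) {
        if (tainted.contains(rhs)) tainted.add(lhs);
    }

    /** Used at a sink to check whether the argument is tainted. */
    boolean isTainted(String path) {
        return tainted.contains(path);
    }

    /** Copy of the current set of tainted access paths. */
    Set<String> state() {
        return new TreeSet<>(tainted);
    }

    public static void main(String[] args) {
        AccessPathTaint a = new AccessPathTaint();
        a.source("x");            // X x = req.getParameter();
        a.assign("y.f", "x");     // y.f = x;
        a.assign("z", "y.f");     // Z z = y.f;
        System.out.println(a.state());
        System.out.println("sink argument tainted: " + a.isTainted("z"));
    }
}
```

Running `main` prints `[x, y.f, z]`, matching the final set on the slide, followed by `sink argument tainted: true` for the `write(z)` call.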

Challenges
  • The infamous precision-scalability tradeoff
  • External resources
    – Configuration files
    – Framework-specific configurations
  • Beyond graph reachability…
  • SDLC-induced use cases

Precision versus Scalability
  • Modular analysis
  • Demand-driven strategies

External Resources
  • Synthetic models
  • Sometimes ignorance is bliss…

Beyond Graph Reachability
  • PQL [6]
  • String analysis [7]

SDLC-induced Use Cases
  • Incremental analysis
  • Parallelization on multi-core build servers

DEMO

The Remaining 8 Yards
  • Instead of killing n birds with 1 stone, use n stones to kill 1 bird (like humans)
  • How do we keep up with changes in technology?
  • How do we tailor the analysis to the needs of different users?
  • Useful heuristics often resist formal definition

References
[1] S. McConnell. Code Complete: A Practical Handbook of Software Construction
[2] W. S. Humphrey. Acquiring Quality Software. In CrossTalk, 18-12
[3] Code Review at Cisco Systems
[4] O. Tripp et al. TAJ: Effective Taint Analysis of Web Applications
[5] C. Hammer and G. Snelting. Flow-sensitive, Context-sensitive, and Object-sensitive Information-flow Control Based on Program Dependence Graphs
[6] B. Livshits and M. Lam. Finding Application Errors and Security Flaws Using PQL: a Program Query Language
[7] M. Christodorescu et al. String Analysis for x86 Binaries