Mining Windows Kernel API Rules
Jinlin Yang, jinlin@cs.virginia.edu
CS 696, 09/28/2005

My Background
• Bounded exhaustive testing, 09/2001-01/2004
  – D. Coppit, J. Yang, S. Khurshid, W. Le, and K. Sullivan. Software Assurance by Bounded Exhaustive Testing. IEEE Transactions on Software Engineering, April 2005.
  – K. Sullivan, J. Yang, D. Coppit, S. Khurshid, and D. Jackson. Software Assurance by Bounded Exhaustive Testing. ISSTA '04.
• Temporal properties inference, 01/2004-present
  – J. Yang and D. Evans. Dynamically Inferring Temporal Properties. PASTE '04.
  – J. Yang and D. Evans. Automatically Inferring Temporal Properties for Program Evolution. ISSRE '04.
  – J. Yang and D. Evans. Automatically Discovering Temporal Properties for Program Verification. Submitted to FMSD.
  – J. Yang, D. Evans, D. Bhardwaj, T. Bhat, and M. Das. Terracotta: Mining Temporal API Rules from Imperfect Traces. Submitted to ICSE '06.

Overview
• Problem: the lack of specifications is a major obstacle to defect detection
• Solution: automatically infer specifications from execution traces
• Benefits: better understanding of legacy code and the opportunity to find more defects
  – Experiments on finding kernel API rules
  – Found one previously unknown bug in Windows
  – Found interesting properties that should have been checked

Problem
• Defect detection techniques
• Generic properties
  – E.g. pointer and buffer usage
  – PREfix [Bush et al., SP&E 00], PREfast
  – Very effective
• Application-specific properties
  – E.g. lock/unlock, resource creation/deletion
  – SLAM/SDV [Ball et al., SPIN 01], ESP [Das et al., PLDI 02]
• Where do we get such properties?

My Approach
[Figure: inference pipeline. Boxes: Program, Instrumentation, Instrumented Program, Test Suite, Running, Execution Traces, Property Templates, Inference, Inferred Properties, Post-processing, Report]
J. Yang and D. Evans. Dynamically Inferring Temporal Properties. PASTE '04.

An Example
• Alternating template: (PS)*, P≠S. P and S are placeholders
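
The Alternating template can be illustrated with a minimal Python sketch. It is not Terracotta's inference engine: the function names and the toy trace are hypothetical, and it tries every ordered pair of events by brute force rather than using the linear-time algorithm the slides mention. Each candidate pair is substituted for P and S, the trace is projected onto those two events, and the projection is matched against (PS)*.

```python
import re
from itertools import permutations


def satisfies_alternating(trace, p, s):
    """Return True if trace, projected onto events p and s, matches (PS)*."""
    if p == s:
        raise ValueError("P and S must be distinct events")
    # Drop every event other than p and s, then encode the rest as P/S letters.
    projected = "".join("P" if e == p else "S" for e in trace if e in (p, s))
    return re.fullmatch(r"(PS)*", projected) is not None


def infer_alternating(trace):
    """Brute-force inference: try every ordered pair of distinct events."""
    events = sorted(set(trace))
    return [(p, s) for p, s in permutations(events, 2)
            if satisfies_alternating(trace, p, s)]


# Hypothetical trace: 'read' is unrelated noise between lock and unlock.
example_trace = ["lock", "read", "unlock", "lock", "unlock"]
inferred = infer_alternating(example_trace)  # [("lock", "unlock")]
```

Projecting before matching is what lets unrelated events sit between P and S without breaking the pattern.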

Implementation
• Terracotta
  – Inference engine
  – Context-aware trace analysis
  – Heuristics for prioritizing and presenting properties
• Running time is linear in the length of the trace and in the number of distinct events
• More information: http://www.cs.virginia.edu/terracotta

Lessons
• Missing interesting properties
  – The original algorithm requires 100% satisfaction
• The real world is never perfect
  – Traces collected by sampling
  – Object information unavailable
  – Imperfect programs
• Can we develop better inference to handle this?
• Too much noise in the results
  – Interesting properties are buried among uninteresting ones
• Can we develop heuristics to select the interesting ones?

Refinement of Inference
• How to detect interesting properties in the face of imperfect traces?
• Example
  – PS PS PS PPP
  – The dominant behavior is that P and S alternate
  – 10 subtraces, 90% satisfy Alternating
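
The approximation above can be sketched in Python as a satisfaction ratio over subtraces. This is only an illustration under assumptions: the function names are hypothetical, the trace is assumed to be already split into subtraces (the slides do not say how), and the 0.90 default mirrors the threshold reported in the kernel results.

```python
import re


def satisfaction_ratio(subtraces, p, s):
    """Fraction of subtraces whose projection onto p and s matches (PS)*."""
    if not subtraces:
        return 0.0
    satisfied = 0
    for sub in subtraces:
        projected = "".join("P" if e == p else "S" for e in sub if e in (p, s))
        if re.fullmatch(r"(PS)*", projected):
            satisfied += 1
    return satisfied / len(subtraces)


def is_dominant(subtraces, p, s, threshold=0.90):
    """Accept the property when enough subtraces satisfy it."""
    return satisfaction_ratio(subtraces, p, s) >= threshold


# Ten hypothetical subtraces: nine alternate cleanly, one is the faulty PPP.
subtraces = [["P", "S"]] * 9 + [["P", "P", "P"]]
ratio = satisfaction_ratio(subtraces, "P", "S")  # 0.9
```

A strict matcher would reject the property because of the single PPP subtrace; the ratio keeps it and still exposes the one violating subtrace for inspection.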

Refinement of Inference (2)
• How to pick out interesting properties?

  Case 1:
    void KeSetTimer() {                    /* A */
        ...
        KeSetTimerEx();                    /* B */
        ...
    }

  Case 2:
    void x() {
        ExAcquireFastMutexUnsafe(&m);      /* C */
        ...
        ExReleaseFastMutexUnsafe(&m);      /* D */
    }

• Which one is more likely to be interesting?
  – Heuristic: C→D is often more interesting
  – Compute the call graph for Windows binaries
  – Keep A→B only if B is not reachable from A
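
The call-graph heuristic can be sketched in Python as a reachability filter. This is a hedged illustration, not the tool's code: the call graph is a hand-written dictionary standing in for one extracted from a binary, and `reachable` and `filter_by_call_graph` are hypothetical names.

```python
from collections import deque


def reachable(call_graph, src, dst):
    """Breadth-first search: can src reach dst through one or more calls?"""
    seen = {src}
    queue = deque([src])
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, ()):
            if callee == dst:
                return True
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False


def filter_by_call_graph(properties, call_graph):
    """Keep a property (a, b) only if b is not reachable from a."""
    return [(a, b) for a, b in properties if not reachable(call_graph, a, b)]


# Toy call graph mirroring Case 1 and Case 2 on the slide (an assumption).
call_graph = {
    "KeSetTimer": ["KeSetTimerEx"],
    "x": ["ExAcquireFastMutexUnsafe", "ExReleaseFastMutexUnsafe"],
}
candidates = [
    ("KeSetTimer", "KeSetTimerEx"),
    ("ExAcquireFastMutexUnsafe", "ExReleaseFastMutexUnsafe"),
]
kept = filter_by_call_graph(candidates, call_graph)
```

In Case 1 the ordering is a side effect of one function wrapping the other, so the pair is dropped. In Case 2 neither function calls the other, so the ordering reflects a rule the caller must follow and the pair is kept.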

Refinement of Inference (3)
• Heuristic: the more similar the names of two events are, the more likely the property is interesting
• Relative edit distance between A and B
  – Partition A and B into words
  – A has wA words, B has wB words, and they share w common words
• For example:
  – Ke Acquire In Stack Queued Spin Lock
    Ke Release In Stack Queued Spin Lock
  – Similarity = 85.7%
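
The similarity formula itself did not survive in this transcript. The Python sketch below therefore assumes similarity = 2w / (wA + wB), which reproduces the 85.7% of the example (w = 6, wA = wB = 7) but may differ from the formula on the original slide. The word-splitting rule (split before each capital letter) and the function names are assumptions as well.

```python
import re
from collections import Counter


def split_words(name):
    """Split a CamelCase identifier into words, e.g. 'KeSetTimer' -> Ke, Set, Timer."""
    return re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", name)


def name_similarity(a, b):
    """Assumed formula: 2 * w / (wA + wB), with w counted as a multiset overlap."""
    words_a, words_b = split_words(a), split_words(b)
    total = len(words_a) + len(words_b)
    if total == 0:
        return 0.0
    common = sum((Counter(words_a) & Counter(words_b)).values())
    return 2 * common / total


sim = name_similarity("KeAcquireInStackQueuedSpinLock",
                      "KeReleaseInStackQueuedSpinLock")
# 2 * 6 / (7 + 7) = 0.857..., i.e. the 85.7% on the slide
```

Names that share no words score 0 and identical names score 1, so a fixed cutoff can separate related pairs such as acquire/release from unrelated ones.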

Results: Kernel
• Approximation
  – PAL threshold = 0.90
  – 7611 properties
• Call-graph and edit distance based reduction
  – Use the call graph of ntoskrnl.exe, edit dist > 0.5
  – 142 properties, a 53-fold reduction!
  – Small enough for manual inspection
• 56 apparently interesting properties (40%)
  – Locking discipline
  – Resource allocation and deletion

Results: Kernel (2)
• Found interesting properties that should be checked
  – Several types of kernel SpinLock
  – The Static Driver Verifier should have checked them
• ESP found one previously unknown bug in ntfs.sys
  – Double-acquire of a FastMutex
  – Confirmed and fixed by the responsible developers
Static Driver Verifier: Finding Bugs in Device Drivers at Compile-Time. WinHEC, April 2004.
M. Das, S. Lerner, and M. Seigle. ESP: Path-Sensitive Program Verification in Polynomial Time. PLDI '02.

Summary of Experiments
• We inferred interesting rules about kernel APIs!
  – SDV already encodes some properties
    http://download.microsoft.com/download/5/b/5/5b5bec17-ea71-4653-9539-204a672f11cf/SDV-intro.doc
  – We inferred undocumented ones too
• Inference scales well to realistic traces
• Approximation is effective at tolerating imperfect traces and detecting dominant patterns
• Call-graph and edit distance based reduction is very effective
• Checking the inferred properties with a defect detection tool is promising
• Other experiments: Vulcan APIs, Daisy file system

Conclusion
• Constructing interesting properties is important and difficult
• Automatic inference from execution traces is lightweight and effective
• Practical value
  – Helps developers understand legacy code
  – Lets us leverage sophisticated static analysis tools to find application-specific defects

Q&A
• For more information
  – jinlin@cs.virginia.edu
  – http://www.cs.virginia.edu/terracotta
• Great collaborators
  – UVa: David Evans, Ed Mitchell
  – Microsoft: Stephen Adams, Deepali Bhardwaj, Thirumalesh Bhat, Manuvir Das, Damian Hasse, Marne Staples, Rick Vicik, Jason Yang, Zhe Yang