Decompilers

Resources
• Cifuentes Thesis: Reverse Compilation Techniques
• I will be posting other resources on the website

What is a Decompiler?

Phases of Decompilation
1. Loading
2. Disassembly
3. Lifting
4. Dataflow Analysis
5. Type Inference
6. Codegen
7. Name Recovery???

Loading
• You’ve done this!

Disassembly
• You’ve done this too!

Lifting
• Intermediate Representation (IR)
• Concise description of instruction semantics
• Three-argument IR
• Examples: BAP, VEX
• Abstract across architectures
• Common abstractions?
• Representing memory? Registers?
• Organized in basic blocks
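
To make the idea of a three-argument IR concrete, here is a minimal sketch in Python. The tuple layout `(dst, op, lhs, rhs)`, the `lift` function, and the tiny instruction subset are all assumptions made for illustration; they are not the BAP or VEX formats. The flag model is deliberately simplified (only ZF and SF; CF, OF, and the rest are omitted).

```python
# Hypothetical three-address IR: each statement is (dst, op, lhs, rhs).
# Registers and flags are strings; constants are ints.
ARITH = {"add": "+", "sub": "-", "xor": "^"}

def lift(mnemonic, dst, src):
    """Lift one two-operand x86-style instruction into IR statements."""
    if mnemonic == "mov":
        return [(dst, "copy", src, None)]
    if mnemonic in ARITH:
        return [
            (dst, ARITH[mnemonic], dst, src),  # the arithmetic itself
            ("ZF", "==", dst, 0),              # zero flag, now an explicit statement
            ("SF", "<", dst, 0),               # sign flag, modeled as a signed compare
        ]
    raise ValueError(f"unsupported instruction: {mnemonic}")

for stmt in lift("add", "eax", "ebx"):
    print(stmt)
```

Note how a single `add` becomes three statements once its side effects on the flags are written out. That expansion is why the later cleanup passes matter.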

Data Flow Analysis
• Constant Propagation
• Calling Convention Analysis
• Condition Code Propagation
• Register Copy Elimination
• Stack Frame Analysis
• Dead Register & Condition Code Elimination
• Variable Recovery

Constant Propagation
- resolving indirection
- deobfuscation
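
As a rough illustration of how constant propagation resolves indirection, the sketch below folds constants through a single straight-line block so that an indirect `call eax` ends up with a concrete target. The IR tuples `(dst, op, a, b)`, the `"copy"` and `"call"` op names, and the function name are assumptions for this example. A real pass works across the whole control flow graph and models fixed-width wraparound, which this sketch does not.

```python
import operator

FOLD = {"+": operator.add, "-": operator.sub, "^": operator.xor}

def propagate_constants(block):
    """block: list of (dst, op, a, b); operands are ints (constants),
    strings (registers), or None. Returns a rewritten block."""
    known = {}  # register -> constant value currently held
    out = []
    for dst, op, a, b in block:
        # Substitute registers whose value is a known constant.
        a = known.get(a, a) if isinstance(a, str) else a
        b = known.get(b, b) if isinstance(b, str) else b
        if op == "copy" and isinstance(a, int):
            known[dst] = a
        elif op in FOLD and isinstance(a, int) and isinstance(b, int):
            # Both operands are constants: fold (no 32-bit wraparound here).
            a, op, b = FOLD[op](a, b), "copy", None
            known[dst] = a
        else:
            known.pop(dst, None)  # dst now holds an unknown value
        out.append((dst, op, a, b))
    return out

block = [
    ("eax", "copy", 0x400000, None),
    ("eax", "+", "eax", 0x1000),
    (None, "call", "eax", None),   # indirect call through eax
]
resolved = propagate_constants(block)
```

After the pass, the last statement carries the constant `0x401000` instead of the register, so the call target can be followed statically.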

Variable Recovery
Stack layout (from the slide figure): ra, var_10, var_x, var_y, var_40, var_44, …
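
One simple way to picture variable recovery is as a mapping from the stack offsets a function touches to named slots. The sketch below assumes offsets are measured relative to the return-address slot (offset 0 is `ra`, negative offsets are locals, positive offsets are arguments). That convention, the `arg_` prefix, and the function name are assumptions for illustration; only the `var_10` naming style comes from the slide.

```python
def recover_stack_variables(offsets):
    """Map each accessed frame offset to a variable name.

    Assumes offset 0 is the return-address slot, negative offsets are
    locals, and positive offsets are incoming arguments.
    """
    names = {}
    for off in sorted(set(offsets)):
        if off == 0:
            names[off] = "ra"
        elif off < 0:
            names[off] = f"var_{-off:x}"   # e.g. -0x10 -> var_10
        else:
            names[off] = f"arg_{off:x}"
    return names

frame = recover_stack_variables([-0x10, -0x40, -0x44, 0, -0x10])
```

A real analysis also has to infer the size of each slot and decide when neighboring offsets belong to one array or struct, which this sketch ignores.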

Register Copy Elimination
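
A minimal sketch of the idea, assuming the same hypothetical `(dst, op, a, b)` IR tuples with a `"copy"` op: uses of a copied register are rewritten to read the original source, as long as neither register has been overwritten in between. This version only rewrites uses within one block; it leaves the now-redundant copy statements in place for a separate dead-code pass to remove.

```python
def propagate_copies(block):
    """Rewrite uses of copied registers to their source register."""
    copies = {}  # register -> register it currently mirrors
    out = []
    for dst, op, a, b in block:
        a = copies.get(a, a)
        b = copies.get(b, b)
        # dst is about to be overwritten: forget every copy that involves it.
        copies = {k: v for k, v in copies.items() if k != dst and v != dst}
        if op == "copy" and isinstance(a, str) and a != dst:
            copies[dst] = a
        out.append((dst, op, a, b))
    return out

block = [
    ("ebx", "copy", "eax", None),
    ("ecx", "+", "ebx", 4),      # can read eax directly
    ("eax", "copy", 0, None),    # eax overwritten: ebx no longer mirrors it
    ("edx", "+", "ebx", 1),      # must keep reading ebx
]
rewritten = propagate_copies(block)
```

The third statement shows why invalidation matters: once `eax` changes, substituting it for `ebx` would be wrong.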

Dead Condition Code Elimination
• Important when lifted to IR: lifting makes every flag update explicit, and most of those updates are never read
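
The sketch below shows one way this could work on a single block: walk the statements backwards, track which names are still live, and drop any flag definition whose value is overwritten or never read. The IR tuple shape `(dst, op, a, b)`, the `FLAGS` set, and the `"jz"` op are assumptions for illustration. A real pass computes liveness across the whole control flow graph rather than taking `live_out` as a parameter.

```python
FLAGS = {"ZF", "SF", "CF", "OF"}

def eliminate_dead_flags(block, live_out=frozenset()):
    """Remove flag definitions that are never read before being redefined."""
    live = set(live_out)   # names whose current value is still needed
    kept = []
    for dst, op, a, b in reversed(block):
        if dst in FLAGS and dst not in live:
            continue       # dead flag update: nothing reads it
        live.discard(dst)  # this statement defines dst
        live.update(x for x in (a, b) if isinstance(x, str))
        kept.append((dst, op, a, b))
    kept.reverse()
    return kept

block = [
    ("eax", "+", "eax", "ebx"),
    ("ZF", "==", "eax", 0),      # dead: ZF is redefined before any read
    ("eax", "-", "eax", "ecx"),
    ("ZF", "==", "eax", 0),      # live: read by the jump below
    (None, "jz", "ZF", None),
]
cleaned = eliminate_dead_flags(block)
```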

Condition Code Propagation
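
As a rough sketch, condition code propagation can be pictured as merging a flag-setting compare with the conditional jump that consumes it, yielding one high-level condition. The instruction tuples, the `JCC` table, and the `("if", cond, "goto", target)` output shape are assumptions for this example. The pass is deliberately conservative: any instruction between the `cmp` and the jump blocks the merge, even one that does not touch the flags. The table covers only the signed jumps; the unsigned ones (`jb`, `ja`, and so on) would need unsigned comparisons.

```python
JCC = {"je": "==", "jne": "!=", "jl": "<", "jle": "<=", "jg": ">", "jge": ">="}

def propagate_condition_codes(instrs):
    """Fuse an adjacent `cmp a, b` + `jcc target` pair into one `if` node."""
    out = []
    pending = None  # operands of a cmp not yet consumed by a jump
    for mnem, *ops in instrs:
        if mnem == "cmp":
            if pending is not None:
                out.append(("cmp", *pending))  # earlier cmp was never consumed
            pending = tuple(ops)
        elif mnem in JCC and pending is not None:
            lhs, rhs = pending
            out.append(("if", f"{lhs} {JCC[mnem]} {rhs}", "goto", ops[0]))
            pending = None
        else:
            if pending is not None:
                out.append(("cmp", *pending))  # keep it; we could not merge
                pending = None
            out.append((mnem, *ops))
    if pending is not None:
        out.append(("cmp", *pending))
    return out

fused = propagate_condition_codes([("cmp", "eax", "ebx"), ("jl", "loc_1")])
```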

Stack Frame Analysis

Control Flow Structuring

    if (…) {
        …
        while (…) {
            …
            goto lbl
        }
    } else if (…) {
        …
    } else {
        if (…) {
            lbl:
            for (…) {
                …
            }
        }
        …
    }

Control Flow Structuring
• Basic Pattern Matching
• Cifuentes Thesis
• Phoenix
• No More Goto
• Focus of next week
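
To give a flavor of basic pattern matching on a control flow graph, here is a small sketch that recognizes two shapes at a branch node: an if-then-else "diamond" and an if-then "triangle". The CFG encoding (a dict from node name to successor list) and all function names are assumptions for illustration. This is not the algorithm from the Cifuentes thesis, Phoenix, or No More Goto, which handle loops, nesting, and irreducible graphs.

```python
def predecessors(cfg):
    """Invert a successor map into a predecessor map."""
    preds = {n: [] for n in cfg}
    for node, succs in cfg.items():
        for s in succs:
            preds.setdefault(s, []).append(node)
    return preds

def match_conditional(cfg, node):
    """Return a description of the if/else region rooted at node, or None."""
    succs = cfg.get(node, [])
    if len(succs) != 2:
        return None
    preds = predecessors(cfg)
    then_b, else_b = succs
    then_succ, else_succ = cfg.get(then_b, []), cfg.get(else_b, [])
    # Diamond: both arms are entered only from node and meet at one join.
    if (preds[then_b] == [node] and preds[else_b] == [node]
            and len(then_succ) == 1 and then_succ == else_succ):
        return {"kind": "if-else", "then": then_b, "else": else_b,
                "join": then_succ[0]}
    # Triangle: the then-arm falls through to the other successor.
    if preds[then_b] == [node] and then_succ == [else_b]:
        return {"kind": "if-then", "then": then_b, "join": else_b}
    return None

diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
triangle = {"A": ["B", "C"], "B": ["C"], "C": []}
```

Graphs containing a `goto` into the middle of another region, like the example on the earlier slide, fail these simple patterns. That failure is what motivates the more advanced structuring approaches listed above.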

Type Inference

Codegen
• Now we have functions, IR, etc. But how do we get C?
• Translate semantics of IR and control flow into pseudocode
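
One way to picture this last step: once control flow has been structured into a tree, code generation is a recursive walk that prints each node as C-like text. The node shapes used below (`"assign"`, `"if"`, `"while"`) and the `emit` function are assumptions for illustration, not the output format of any particular decompiler.

```python
def emit(nodes, indent=0):
    """Render a list of structured nodes as lines of C-like pseudocode."""
    pad = "    " * indent
    lines = []
    for node in nodes:
        kind = node[0]
        if kind == "assign":
            _, dst, expr = node
            lines.append(f"{pad}{dst} = {expr};")
        elif kind == "while":
            _, cond, body = node
            lines.append(f"{pad}while ({cond}) {{")
            lines.extend(emit(body, indent + 1))
            lines.append(f"{pad}}}")
        elif kind == "if":
            _, cond, then_body, else_body = node
            lines.append(f"{pad}if ({cond}) {{")
            lines.extend(emit(then_body, indent + 1))
            if else_body:
                lines.append(f"{pad}}} else {{")
                lines.extend(emit(else_body, indent + 1))
            lines.append(f"{pad}}}")
        else:
            raise ValueError(f"unknown node kind: {kind}")
    return lines

tree = [
    ("assign", "var_10", "0"),
    ("while", "var_10 < 3", [("assign", "var_10", "var_10 + 1")]),
]
print("\n".join(emit(tree)))
```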

Future Directions: Name Recovery
- Now we have variables, but they don’t mean much!
- Can we automate recovering variable names?
- DEEP LEARNING!!!
- DIRE, Debin