SMU SRG reading by Tey Chee Meng: Automatic Patch-Based Exploit Generation is Possible: Techniques and Implications by David Brumley, Pongsin Poosankam, Dawn Song, Jiang Zheng
What the paper is trying to achieve
Given 2 binaries
• Program P (unpatched)
    if (input % 2 == 0) { s = input + 2; } else { s = input + 3; }
    ptr = realloc (ptr, s);
    /* use of ptr */
• Program P' (patched)
    if (input % 2 == 0) { s = input + 2; } else { s = input + 3; }
    if (s <= input) { /* exit with error */ }
    ptr = realloc (ptr, s);
    /* use of ptr */
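The two fragments can be modelled as small C functions to see why the missing check matters. This is only a sketch: it assumes input and s are 32-bit unsigned values, and the function names (compute_size, p_requested_size, p_prime_accepts) are made up for illustration, not taken from the paper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size computation shared by P and P'. Unsigned 32-bit addition wraps mod 2^32. */
static uint32_t compute_size(uint32_t input)
{
    return (input % 2 == 0) ? input + 2u : input + 3u;
}

/* Model of unpatched P: returns the size it would pass to realloc, unchecked. */
uint32_t p_requested_size(uint32_t input)
{
    return compute_size(input);
}

/* Model of patched P': returns false when the added check (s <= input)
 * rejects the input, i.e. when the size computation wrapped around. */
bool p_prime_accepts(uint32_t input)
{
    uint32_t s = compute_size(input);
    return !(s <= input);
}
```

For input = 4294967294 (2^32 - 2), p_requested_size returns 0, so P would request a zero-byte buffer for a huge input, while p_prime_accepts returns false.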
Create an 'exploit'
• Exploit as defined by the paper:
  – input that crashes P
  – input that causes information leakage
  – input that hijacks control flow
• Note: 'exploit' as defined by the paper is not the same as 'exploit' as used in the security community, which assumes
  – something usable
  – something that bypasses all countermeasures
• Halvar Flake used the term "vulnerability trigger"
How it was done
Step 1: Compare the binary differences
• Program P
    if (input % 2 == 0) { s = input + 2; } else { s = input + 3; }
    ptr = realloc (ptr, s);
    /* use of ptr */
• Program P'
    if (input % 2 == 0) { s = input + 2; } else { s = input + 3; }
    if (s <= input) { /* exit with error */ }
    ptr = realloc (ptr, s);
    /* use of ptr */
Step 2: Determine which is the vulnerable point
• Concerned with input sanitisation that is missing in P but added in P'
• Where there are many changes, use heuristics:
  – minimal change => likely to be added input sanitisation
  – lots of changes => maybe a new feature
• Program P'
    if (input % 2 == 0) { s = input + 2; } else { s = input + 3; }
    if (s <= input) { /* exit with error */ }    <- vul point
    ptr = realloc (ptr, s);
    /* use of ptr */
Step 3: Determine path(s) to the vulnerable point
• Path 1:
  – start point
  – (input % 2 == 0) is true
  – s = input + 2
  – (s <= input) is true
  – vulnerable point
• Path 2:
  – start point
  – (input % 2 == 0) is false
  – s = input + 3
  – (s <= input) is true
  – vulnerable point
• Not individual paths, but a graph of many paths
• Program P'
    if (input % 2 == 0) { s = input + 2; } else { s = input + 3; }
    if (s <= input) { /* exit with error */ }
    ptr = realloc (ptr, s);
    /* use of ptr */
Step 3: Determine path(s) to the vulnerable point
• Single paths can be found via dynamic tracing, i.e. by monitoring the sequence of steps executed on normal input
• Control flow graphs (CFG) are determined via static analysis
• Combination:
  – find a single path dynamically
  – choose any step in the path
  – determine statically the partial CFG from that step to the vulnerable point
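The "graph of many paths" idea can be illustrated on the P' fragment with a tiny hand-built CFG. This is a minimal sketch: the node numbering and the count_paths helper are hypothetical, the graph is written by hand rather than recovered from a binary, and the depth-first count only works because this CFG is acyclic.

```c
#include <assert.h>

/* Hypothetical node numbering for the P' fragment. */
enum { BRANCH, ADD2, ADD3, CHECK, VULN_POINT, REALLOC, N_NODES };

/* edges[a][b] != 0 means control can flow from node a to node b. */
static const int edges[N_NODES][N_NODES] = {
    [BRANCH] = { [ADD2] = 1, [ADD3] = 1 },
    [ADD2]   = { [CHECK] = 1 },
    [ADD3]   = { [CHECK] = 1 },
    [CHECK]  = { [VULN_POINT] = 1, [REALLOC] = 1 },
};

/* Count distinct paths from `from` to `to` by depth-first search.
 * Terminates only on an acyclic graph such as this one. */
unsigned count_paths(int from, int to)
{
    assert(from >= 0 && from < N_NODES && to >= 0 && to < N_NODES);
    if (from == to)
        return 1;
    unsigned total = 0;
    for (int next = 0; next < N_NODES; next++)
        if (edges[from][next])
            total += count_paths(next, to);
    return total;
}
```

count_paths(BRANCH, VULN_POINT) gives 2, matching Path 1 and Path 2. Starting from a later step, as in the combined approach, shrinks the graph: count_paths(ADD2, VULN_POINT) gives 1.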
Step 4: Generate constraint formula
• From the start point to the vulnerable point, the sequence of conditions that are met in P', but not in P:
  – (input % 2 == 0) is true
  – s = input + 2
  – (s <= input) is true
• Constraint formula:
  – (input % 2 == 0) is true AND s = input + 2 AND (s <= input) is true
• Possible to generate a constraint formula over a CFG
• Program P'
    if (input % 2 == 0) { s = input + 2; } else { s = input + 3; }
    if (s <= input) { /* exit with error */ }
    ptr = realloc (ptr, s);
    /* use of ptr */
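The formula for each path can be written as a C predicate, and the formula over the CFG fragment as their disjunction. This is a sketch under the assumption that input and s are 32-bit unsigned; the function names are hypothetical, and a real tool would emit the formula in a solver's input language rather than as C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Path 1: (input % 2 == 0) AND s = input + 2 AND (s <= input). */
bool path1_formula(uint32_t input)
{
    uint32_t s = input + 2u;            /* wraps mod 2^32 */
    return (input % 2 == 0) && (s <= input);
}

/* Path 2: NOT (input % 2 == 0) AND s = input + 3 AND (s <= input). */
bool path2_formula(uint32_t input)
{
    uint32_t s = input + 3u;            /* wraps mod 2^32 */
    return (input % 2 != 0) && (s <= input);
}

/* Formula over the CFG fragment: an input reaches the vulnerable point
 * if it satisfies the formula of at least one path. */
bool cfg_formula(uint32_t input)
{
    return path1_formula(input) || path2_formula(input);
}
```

Any input for which cfg_formula returns true is a candidate 'exploit' in the paper's sense.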
Step 5: Give constraint formula to solver for solution
• NP-hard problem => the larger the constraint formula, the longer it takes to solve (exponential time in the worst case)
• Solution of example constraint formula:
  – (input % 2 == 0) is true AND (s <= input) is true
  – where s = input + 2
  – addition is mod 2^32
  – possible answer: input = 2^32 - 2
• Polymorphic exploit: solve the new constraint formula:
  – (input % 2 == 0) is true AND (s <= input) is true AND (input != solutions_we_already_know)
  – where s = input + 2
  – addition is mod 2^32
  – on this path 2^32 - 2 is the only solution, so further variants come from Path 2 (input = 2^32 - 3 or 2^32 - 1)
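For a formula this small, a brute-force scan can stand in for the solver. The sketch below is not how a real constraint solver works; it only shows the inputs and outputs of this step, including the "exclude known solutions" trick for polymorphic variants. The names formula, is_known and find_exploit are hypothetical, and 32-bit unsigned arithmetic is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Constraint formula over both paths of the example. */
static bool formula(uint32_t input)
{
    uint32_t s = (input % 2 == 0) ? input + 2u : input + 3u;
    return s <= input;
}

/* True if v is one of the n already-known solutions. */
static bool is_known(uint32_t v, const uint32_t *known, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (known[i] == v)
            return true;
    return false;
}

/* Scan [lo, hi] for an input that satisfies the formula and is not already
 * known. On success stores it in *out and returns true. */
bool find_exploit(uint32_t lo, uint32_t hi,
                  const uint32_t *known, size_t n_known, uint32_t *out)
{
    assert(out != NULL);
    if (lo > hi)
        return false;
    for (uint32_t v = lo; ; v++) {
        if (formula(v) && !is_known(v, known, n_known)) {
            *out = v;
            return true;
        }
        if (v == hi)        /* stop here so v++ never wraps past UINT32_MAX */
            return false;
    }
}
```

Calling find_exploit repeatedly on the range 0xFFFFFFF0 to 0xFFFFFFFF, adding each result to known, yields 2^32 - 3, 2^32 - 2 and 2^32 - 1, after which it returns false. The scan cost grows with the size of the input space, which is why realistic formulas need a real solver.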
Step 6: Verify the 'exploit'
• There exist engines (e.g. TEMU) that can verify certain security policies, e.g. whether a return address on the stack is overwritten
• Verification:
  – run the software under the engine with the specified policy
  – feed the 'exploit' input
  – examine the results of the engine
  – if negative, and other paths exist, try other paths
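The verification loop can be mimicked with a toy policy monitor. This is not TEMU and does not run a real binary: it simulates unpatched P inside a small fixed arena, places a guard byte right after the s-byte buffer, and assumes (hypothetically) that the "use of ptr" writes input bytes. The policy is "the guard byte must not be overwritten". All names and the arena size are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ARENA = 64, CANARY = 0xA5 };

/* Simulate unpatched P on one input and report whether the policy is violated. */
bool violates_policy(uint32_t input)
{
    unsigned char arena[ARENA];
    uint32_t s = (input % 2 == 0) ? input + 2u : input + 3u; /* P's size computation */

    if (s >= ARENA)
        return false;               /* outside the range this toy model covers */

    memset(arena, 0, sizeof arena);
    arena[s] = CANARY;              /* guard byte just past the s-byte buffer */

    /* Assumed "use of ptr": write input bytes. The write is capped at the
     * arena size so the simulation itself stays in bounds. */
    uint32_t n = input < ARENA ? input : ARENA;
    memset(arena, 0x41, n);

    return arena[s] != CANARY;
}

/* Feed candidates in order; return the index of the first one that is
 * confirmed by the policy, or -1 if every candidate comes back negative. */
int first_verified(const uint32_t *candidates, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (violates_policy(candidates[i]))
            return (int)i;
    return -1;
}
```

With candidates {10, 4294967294}, first_verified returns 1: the first input is negative, so the next candidate is tried, mirroring the "try other paths" step.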
3rd-party comments (Robert Graham, Halvar Flake)
• The 'exploit' stated in the paper is not the same as the exploit meant by others
• The technique is able to generate input that triggers a vulnerability
• Not yet a usable exploit that can:
  – defeat security mechanisms (chk_esp(), safe_unlink())
  – steal info (for information leakage) or deliver the equivalent of shellcode (for control-flow hijacking)
• Useful, but not yet ready to generate the equivalent of a worm; the paper overstated the impact
• Practical cases may involve constraints too large for the solver
• The automated part is the least time-consuming step in developing a usable exploit
My comments
• In the output of the binary difference, which change is the relevant one?
• For the GDI vulnerability test case:
  – vulnerable procedure: GetEvent()
  – static analysis start point: CopyMetaFileW()
  – remember that the solver cannot solve large constraints quickly, or it may run out of memory
  – how to automate finding a suitable start point for the static case?
Conclusion
• Novel approach
• Overstated claims