
SMU SRG reading by Tey Chee Meng: Automatic Patch-Based Exploit Generation is Possible: Techniques and Implications by David Brumley, Pongsin Poosankam, Dawn Song, Jiang Zheng

What the paper is trying to achieve

Given 2 binaries
• Program P
    if (input % 2 == 0) {
        s = input + 2;
    } else {
        s = input + 3;
    }
    ptr = realloc(ptr, s);
    /* use of ptr */
• Program P'
    if (input % 2 == 0) {
        s = input + 2;
    } else {
        s = input + 3;
    }
    if (s <= input) {
        /* exit with error */
    }
    ptr = realloc(ptr, s);
    /* use of ptr */
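To make the difference between the two programs concrete, here is a minimal sketch of the size computation. It assumes `input` and `s` are unsigned 32-bit values (the slides do not state the types), and the function names `compute_size` and `p_prime_rejects` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size computation shared by P and P'.
   Assumption: unsigned 32-bit arithmetic, so the addition wraps mod 2^32. */
uint32_t compute_size(uint32_t input)
{
    if (input % 2 == 0)
        return (uint32_t)(input + 2u);
    return (uint32_t)(input + 3u);
}

/* The check that P' adds and P lacks: a size that is not larger than
   the input can only come from wraparound, so P' exits with an error. */
bool p_prime_rejects(uint32_t input)
{
    return compute_size(input) <= input;
}
```

For ordinary inputs such as 10, `compute_size` returns 12 and P' proceeds to `realloc`. For an input of 0xFFFFFFFE the size wraps to 0: P would call `realloc(ptr, 0)` and then use the buffer, while P' rejects the input.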

Create an 'exploit'
• Exploit as defined by the paper:
  – input that crashes P
  – input causing information leakage
  – input that hijacks control flow
• Note: 'exploit' as defined by the paper is not the same as 'exploit' as used in the security community, which assumes:
  – something usable
  – something that bypasses all countermeasures
• Halvar Flake used the term "vulnerability trigger"

How it was done

Step 1: Compare the binary differences
• Program P
    if (input % 2 == 0) {
        s = input + 2;
    } else {
        s = input + 3;
    }
    ptr = realloc(ptr, s);
    /* use of ptr */
• Program P'
    if (input % 2 == 0) {
        s = input + 2;
    } else {
        s = input + 3;
    }
    if (s <= input) {
        /* exit with error */
    }
    ptr = realloc(ptr, s);
    /* use of ptr */

Step 2: Determine which is the vulnerable point
• Concerned with input sanitisation that is missing in P but added in P'
• Where there are many changes, use heuristics:
  – minimal change => likely to be added input sanitisation
  – lots of changes => maybe a new feature
• Program P'
    if (input % 2 == 0) {
        s = input + 2;
    } else {
        s = input + 3;
    }
    if (s <= input) {            /* <- vul point */
        /* exit with error */
    }
    ptr = realloc(ptr, s);
    /* use of ptr */

Step 3: Determine path(s) to the vulnerable point
• Path 1:
  – start point
  – (input % 2 == 0) is true
  – s = input + 2
  – (s <= input) is true
  – vulnerable point

Step 3: Determine path(s) to the vulnerable point
• Path 2:
  – start point
  – (input % 2 == 0) is false
  – s = input + 3
  – (s <= input) is true
  – vulnerable point

Step 3: Determine path(s) to the vulnerable point
• Not individual paths, but a graph of many paths: in P', both branches of the (input % 2 == 0) test rejoin at the (s <= input) check
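The idea of a graph of paths can be sketched with a tiny control flow graph for P'. The node names and the adjacency matrix below are a hypothetical encoding chosen for illustration, not the paper's representation; `count_paths` simply counts the distinct routes between two nodes.

```c
/* Hypothetical node numbering for the control flow graph of P'. */
enum { START, THEN_BRANCH, ELSE_BRANCH, CHECK, VUL_POINT, REALLOC, NODE_COUNT };

/* edges[u][v] != 0 means control can flow from node u to node v. */
static const int edges[NODE_COUNT][NODE_COUNT] = {
    /* START       */ {0, 1, 1, 0, 0, 0},
    /* THEN_BRANCH */ {0, 0, 0, 1, 0, 0},
    /* ELSE_BRANCH */ {0, 0, 0, 1, 0, 0},
    /* CHECK       */ {0, 0, 0, 0, 1, 1},
    /* VUL_POINT   */ {0, 0, 0, 0, 0, 0},
    /* REALLOC     */ {0, 0, 0, 0, 0, 0},
};

/* Count distinct paths from `from` to `to` by depth-first search.
   This graph is acyclic, so the recursion terminates without a visited set. */
int count_paths(int from, int to)
{
    if (from == to)
        return 1;
    int total = 0;
    for (int next = 0; next < NODE_COUNT; next++) {
        if (edges[from][next])
            total += count_paths(next, to);
    }
    return total;
}
```

Here `count_paths(START, VUL_POINT)` is 2, matching Path 1 and Path 2. In a real binary the number of paths grows quickly with the number of branches, which is why a single formula over the graph is attractive.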

Step 3: Determine path(s) to the vulnerable point
• Single paths can be found via dynamic tracing, i.e. monitoring the sequence of steps executed upon normal input
• Control flow graphs (CFG) are determined via static analysis
• Combination:
  – find a single path dynamically
  – choose any step in the path
  – determine statically the partial CFG from that step to the vulnerable point

Step 4: Generate constraint formula
• From the start point to the vulnerable point, collect the conditions that must hold for the input to fail the check in P' (a check that P does not perform):
  – (input % 2 == 0) is true
  – s = input + 2
  – (s <= input) is true
• Constraint formula:
  – (input % 2 == 0) is true AND (s <= input) is true AND s = input + 2
• Possible to generate a constraint formula over a CFG
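As a rough illustration, the per-path formulas and the formula over the whole graph can be written as C predicates. This sketch assumes unsigned 32-bit arithmetic; the function names are hypothetical, and a real tool would emit a formula for a solver rather than executable code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Path 1: (input % 2 == 0) AND s = input + 2 AND (s <= input). */
bool path1_formula(uint32_t input)
{
    uint32_t s = (uint32_t)(input + 2u);
    return (input % 2 == 0) && (s <= input);
}

/* Path 2: (input % 2 == 0) is false AND s = input + 3 AND (s <= input). */
bool path2_formula(uint32_t input)
{
    uint32_t s = (uint32_t)(input + 3u);
    return (input % 2 != 0) && (s <= input);
}

/* Formula over the CFG: an input is a candidate if it satisfies
   the formula of at least one path to the vulnerable point. */
bool cfg_formula(uint32_t input)
{
    return path1_formula(input) || path2_formula(input);
}
```

The CFG formula is the disjunction of the path formulas, so one solver query covers both paths instead of one query per path.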

Step 5: Give constraint formula to solver for solution
• NP-hard problem => the larger the constraint formula, the longer it takes to solve (exponential time in the worst case)
• Solution of example constraint formula:
  – (input % 2 == 0) is true AND (s <= input) is true
  – where s = input + 2
  – addition is mod 2^32
  – possible answer: input = 2^32 - 2
• Polymorphic exploit: solve the new constraint formula:
  – (input % 2 == 0) is true AND (s <= input) is true AND (input != solutions_we_already_know)
  – where s = input + 2
  – addition is mod 2^32
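The solving step, including the exclusion of known solutions for polymorphic variants, can be sketched with an exhaustive search standing in for a real constraint solver. The word width `bits` is a parameter introduced here so the search can be tried on a small model (for example 16 bits); `formula_holds` and `solve` are hypothetical names, and the formula covers both paths of the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Formula over both paths, with addition mod 2^bits (1 <= bits <= 32). */
bool formula_holds(uint64_t input, unsigned bits)
{
    uint64_t mask = (1ULL << bits) - 1;
    uint64_t s = (input + ((input % 2 == 0) ? 2u : 3u)) & mask;
    return s <= input;
}

/* Exhaustive stand-in for a solver: find the smallest input that satisfies
   the formula and is not among the `known` solutions. Returns false if no
   such input exists. Cost is 2^bits evaluations, so it only scales to toy sizes. */
bool solve(unsigned bits, const uint64_t *known, size_t known_count,
           uint64_t *solution)
{
    uint64_t limit = 1ULL << bits;
    for (uint64_t input = 0; input < limit; input++) {
        if (!formula_holds(input, bits))
            continue;
        bool excluded = false;
        for (size_t i = 0; i < known_count; i++) {
            if (known[i] == input)
                excluded = true;
        }
        if (!excluded) {
            *solution = input;
            return true;
        }
    }
    return false;
}
```

With `bits` set to 16, repeated calls that feed each solution back into `known` yield 0xFFFD, 0xFFFE and 0xFFFF, after which no solution remains. Of these, only 0xFFFE comes from the even path, mirroring the 32-bit case where 2^32 - 2 is the only solution of the path 1 formula, so further variants have to come from the other path.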

Step 6: Verify the 'exploit'
• There exist engines (e.g. TEMU) that can verify certain security policies, e.g. whether a return address on the stack is overwritten
• Verification:
  – Run the software under the engine with the specified policy
  – Feed the 'exploit' input
  – Examine the results of the engine
  – If negative, and other paths exist, try other paths
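The verification loop can be sketched as follows. This is not how TEMU works internally; it is a toy monitor under two loudly stated assumptions: the "use of ptr" in P touches `input` bytes, and the policy is that no access may go beyond the allocated size. All names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy policy monitor for P (assumption: P later uses `input` bytes of ptr).
   Returns true if the policy "no access beyond the allocation" is violated. */
bool violates_policy(uint32_t input)
{
    uint32_t allocated = (input % 2 == 0) ? (uint32_t)(input + 2u)
                                          : (uint32_t)(input + 3u);
    uint32_t used = input;
    return used > allocated;
}

/* Feed candidate inputs (for example one per path) to the monitor in turn.
   Returns the index of the first candidate that is verified, or -1 if
   every candidate turns out negative. */
int verify_candidates(const uint32_t *candidates, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (violates_policy(candidates[i]))
            return (int)i;
    }
    return -1;
}
```

Given the candidates 4 and 0xFFFFFFFE, the first is negative (6 bytes allocated, 4 used) and the second is verified (0 bytes allocated), so the loop moves on exactly as in the last bullet above.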

3rd-party comments (Robert Graham, Halvar Flake)
• 'Exploit' as stated in the paper is not the same as the exploit meant by others
• Able to generate input that triggers a vulnerability
• Not yet a usable exploit that can:
  – defeat security mechanisms (chk_esp(), safe_unlink())
  – steal information (for info leakage) or run the equivalent of shellcode (for control-flow hijack)
• Useful, but not yet ready for generating the equivalent of a worm; the paper overstates the impact
• Practical cases may involve large constraints beyond the capability of the solver
• The automated part is the least time-consuming of the steps in developing usable exploits

My comments
• Of the many outputs of the binary difference, which one is relevant?
• For the GDI vulnerability test case:
  – vulnerable procedure: GetEvent()
  – static analysis start point: CopyMetaFileW()
  – remember that the solver cannot solve large constraints quickly, or it may run out of memory
  – how to automate finding a suitable start point for the static case?

Conclusion
• Novel approach
• Overstated claims