Satisfiability Modulo Theories: An Appetizer
SBMF 2009, Gramado
Leonardo de Moura, Microsoft Research

Symbolic Reasoning
Verification/analysis tools need some form of symbolic reasoning.

Symbolic Reasoning
Logic is “The Calculus of Computer Science” (Z. Manna).
High computational complexity:
  Undecidable (FOL + LA)
  Semi-decidable (first-order logic)
  NEXPTIME-complete (EPR)
  PSPACE-complete (QBF)
  NP-complete (propositional logic)
  P-time (equality)

Applications
  Test case generation
  Verifying compilers
  Predicate abstraction
  Invariant generation
  Type checking
  Model-based testing

Some Applications @ Microsoft
  HAVOC, Hyper-V, Terminator T-2, VCC, NModel, Vigilante, Spec Explorer, SAGE, F7

Test case generation
We want a trace where the loop is executed twice.

Program:
  unsigned GCD(x, y) {
    requires(y > 0);
    while (true) {
      unsigned m = x % y;
      if (m == 0) return y;
      x = y;
      y = m;
    }
  }

Path constraint in SSA form:
  (y0 > 0) and
  (m0 = x0 % y0) and
  not (m0 = 0) and
  (x1 = y0) and
  (y1 = m0) and
  (m1 = x1 % y1) and
  (m1 = 0)

Model returned by the solver:
  x0 = 2, y0 = 4, m0 = 2, x1 = 4, y1 = 2, m1 = 0
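As a rough illustration of what the solver is asked here, the sketch below encodes the same path condition as a Python predicate and enumerates small inputs instead of calling an SMT solver. The function names and the search bound are assumptions made for this example; a real tool would hand the SSA constraints to a solver such as Z3.

```python
from itertools import product

def two_iteration_path(x0, y0):
    """Path condition for a trace that enters the loop body exactly twice."""
    if not y0 > 0:             # requires(y > 0)
        return False
    m0 = x0 % y0
    if m0 == 0:                # not (m0 = 0): the first iteration must not return
        return False
    x1, y1 = y0, m0            # x1 = y0, y1 = m0
    m1 = x1 % y1
    return m1 == 0             # m1 = 0: the second iteration returns

def find_inputs(bound=5):
    """Brute-force stand-in for the solver (assumption: a tiny search space is enough)."""
    return [(x, y) for x, y in product(range(bound), repeat=2)
            if two_iteration_path(x, y)]

solutions = find_inputs()
```

The pair (2, 4) from the slide is among the inputs this search finds; any other pair in `solutions` drives the program down the same path.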

Type checking
Signature: div : int, { x : int | x ≠ 0 } → int   (the second argument type is a subtype of int)
Call site: if a ≥ 1 and a ≤ b then return div(a, b)
Verification condition: a ≥ 1 and a ≤ b implies b ≠ 0

Satisfiability Modulo Theories (SMT)
Is formula F satisfiable modulo theory T?
SMT solvers have specialized algorithms for T.

Satisfiability Modulo Theories (SMT)
b + 2 = c and f(read(write(a, b, 3), c-2)) ≠ f(c-b+1)
  Arithmetic: b + 2 = c, c-2, c-b+1
  Array theory: read(write(a, b, 3), c-2)
  Uninterpreted functions: f(...)
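The example formula is unsatisfiable, and each theory contributes one step of the argument. The derivation below is a reconstruction of that reasoning, not text from the slides:

```latex
\begin{align*}
b + 2 = c \;&\Rightarrow\; c - 2 = b
  && \text{(arithmetic)} \\
\mathit{read}(\mathit{write}(a, b, 3),\, c - 2) \;&=\; \mathit{read}(\mathit{write}(a, b, 3),\, b) \;=\; 3
  && \text{(array theory)} \\
b + 2 = c \;&\Rightarrow\; c - b + 1 = 3
  && \text{(arithmetic)} \\
f(3) \;&\neq\; f(3)
  && \text{(contradiction, by congruence of } f\text{)}
\end{align*}
```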

Theories
A theory is a set of sentences.
Alternative definition: a theory is a class of structures.

SMT@Microsoft: Solver
Z3 is a new solver developed at Microsoft Research.
Development/research driven by internal customers.
Free for academic research.
Interfaces: C/C++, .NET, OCaml, text.
http://research.microsoft.com/projects/z3

Ground formulas
For most SMT solvers, F is a set of ground formulas.
Many applications:
  Bounded model checking
  Test-case generation

Deciding Equality
a = b, b = c, d = e, b = s, d = t, a ≠ e, a ≠ s

Merge equivalence classes one equality at a time:
  start:   {a} {b} {c} {d} {e} {s} {t}
  a = b:   {a, b} {c} {d} {e} {s} {t}
  b = c:   {a, b, c} {d} {e} {s} {t}
  d = e:   {a, b, c} {d, e} {s} {t}
  b = s:   {a, b, c, s} {d, e} {t}
  d = t:   {a, b, c, s} {d, e, t}

a ≠ e holds (different classes), but a ≠ s fails (same class): Unsatisfiable.

Deciding Equality
a = b, b = c, d = e, b = s, d = t, a ≠ e
Classes: {a, b, c, s} {d, e, t}
Model: |M| = { 0, 1 }, M(a) = M(b) = M(c) = M(s) = 0, M(d) = M(e) = M(t) = 1
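The merging of equivalence classes can be sketched with a union-find structure. This is a minimal illustration; the class and function names are chosen for this example and do not come from any particular solver.

```python
class UnionFind:
    """Disjoint sets over hashable items, created lazily on first use."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        root_x, root_y = self.find(x), self.find(y)
        if root_x != root_y:
            self.parent[root_x] = root_y

def equality_sat(equalities, disequalities):
    """Merge all equalities, then check that no disequality joins one class."""
    uf = UnionFind()
    for s, t in equalities:
        uf.union(s, t)
    return all(uf.find(s) != uf.find(t) for s, t in disequalities)

eqs = [("a", "b"), ("b", "c"), ("d", "e"), ("b", "s"), ("d", "t")]
with_a_neq_s = equality_sat(eqs, [("a", "e"), ("a", "s")])   # the unsatisfiable slide
without_a_neq_s = equality_sat(eqs, [("a", "e")])            # the slide with a model
```

When the check succeeds, assigning one distinct value per class gives a model, as in the two-element model shown above.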

Deciding Equality + (uninterpreted) Functions
a = b, b = c, d = e, b = s, d = t, f(a, g(d)) ≠ f(b, g(e))

Congruence rule: x1 = y1, …, xn = yn implies f(x1, …, xn) = f(y1, …, yn)

  start:                {a, b, c, s} {d, e, t} {g(d)} {g(e)} {f(a, g(d))} {f(b, g(e))}
  d = e:                {a, b, c, s} {d, e, t} {g(d), g(e)} {f(a, g(d))} {f(b, g(e))}
  a = b, g(d) = g(e):   {a, b, c, s} {d, e, t} {g(d), g(e)} {f(a, g(d)), f(b, g(e))}

f(a, g(d)) and f(b, g(e)) end up in the same class: Unsatisfiable.

Deciding Equality + (uninterpreted) Functions
(Fully shared) DAGs for representing terms.
Union-find data structure + congruence closure.
O(n log n)
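The congruence rule can be sketched as a fixpoint loop on top of union-find. The version below is deliberately naive: it compares all pairs of applications in every round, so it does not achieve the O(n log n) bound, which needs shared DAGs and use lists. The term encoding (constants as strings, applications as tuples headed by the function symbol) is an assumption of this example.

```python
def subterms(term, acc):
    """Collect term and all of its subterms into the set acc."""
    acc.add(term)
    if isinstance(term, tuple):
        for arg in term[1:]:
            subterms(arg, acc)

def congruence_sat(equalities, disequalities):
    terms = set()
    for s, t in equalities + disequalities:
        subterms(s, terms)
        subterms(t, terms)
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t

    def union(s, t):
        parent[find(s)] = find(t)

    for s, t in equalities:
        union(s, t)

    apps = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:                      # repeat until no congruence applies
        changed = False
        for s in apps:
            for t in apps:
                same_symbol = s[0] == t[0] and len(s) == len(t)
                if (same_symbol and find(s) != find(t)
                        and all(find(x) == find(y)
                                for x, y in zip(s[1:], t[1:]))):
                    union(s, t)         # arguments equal, so the applications are equal
                    changed = True
    return all(find(s) != find(t) for s, t in disequalities)

eqs = [("a", "b"), ("b", "c"), ("d", "e"), ("b", "s"), ("d", "t")]
lhs = ("f", "a", ("g", "d"))            # f(a, g(d))
rhs = ("f", "b", ("g", "e"))            # f(b, g(e))
slide_is_sat = congruence_sat(eqs, [(lhs, rhs)])
```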

Deciding Polynomial Equations (over C)
x^2 y - 1 = 0, xy^2 - y = 0, xz - z + 1 = 0
Tool: Gröbner basis

Deciding Polynomial Equations (over C)
Polynomial ideals: an algebraic generalization of zeroness.
  0 ∈ I
  p ∈ I, q ∈ I implies p + q ∈ I
  p ∈ I implies pq ∈ I (for every polynomial q)

Deciding Polynomial Equations (over C)
The ideal generated by a finite collection of polynomials P = { p1, …, pn } is defined as:
  I(P) = { p1 q1 + … + pn qn | q1, …, qn are polynomials }
P is called a basis for I(P).
Intuition: for all s ∈ I(P), p1 = 0, …, pn = 0 implies s = 0.

Deciding Polynomial Equations (over C)
Hilbert’s Weak Nullstellensatz:
  p1 = 0, …, pn = 0 is unsatisfiable over C
  iff I({p1, …, pn}) contains all polynomials
  iff 1 ∈ I({p1, …, pn})

Deciding Polynomial Equations (over C)
1st key idea: polynomials as rewrite rules.
  xy^2 - y = 0 becomes xy^2 → y
The rewriting system is terminating, but it is not confluent. With the rules xy^2 → y and x^2 y → 1:
  x^2 y^2 → xy   (using xy^2 → y)
  x^2 y^2 → y    (using x^2 y → 1)

Deciding Polynomial Equations (over C)
2nd key idea: completion.
The two results xy and y of rewriting x^2 y^2 must be equal, so add the polynomial xy - y = 0, i.e. the rule xy → y.

Deciding Polynomial Equations (over C)
x^2 y - 1 = 0, xy^2 - y = 0, xz - z + 1 = 0
  x^2 y → 1,  xy^2 → y,  xz → z - 1,  xy → y
  xy → 1,     xy^2 → y,  xz → z - 1,  xy → y
  y → 1,      x → 1,     1 = 0,       xy → y
The system derives 1 = 0, so the equations are unsatisfiable.
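By the Weak Nullstellensatz, unsatisfiability means that 1 can be written as a polynomial combination of the three input polynomials. The sketch below spells out one such combination, mirroring the completion steps, and spot-checks it on a grid of integer points. The intermediate names p4 to p7 are introduced only for this example, and evaluating at sample points is a sanity check, not a proof.

```python
from itertools import product

# The three input polynomials.
def p1(x, y, z): return x * x * y - 1
def p2(x, y, z): return x * y * y - y
def p3(x, y, z): return x * z - z + 1

# Each step is a polynomial combination of earlier ones, so it stays in the ideal.
def p4(x, y, z): return y * p1(x, y, z) - x * p2(x, y, z)   # simplifies to xy - y
def p5(x, y, z): return p1(x, y, z) - x * p4(x, y, z)       # simplifies to xy - 1
def p6(x, y, z): return p5(x, y, z) - p4(x, y, z)           # simplifies to y - 1
def p7(x, y, z): return p5(x, y, z) - x * p6(x, y, z)       # simplifies to x - 1
def one(x, y, z): return p3(x, y, z) - z * p7(x, y, z)      # simplifies to 1

grid = list(product(range(-3, 4), repeat=3))
certificate_ok = all(one(x, y, z) == 1 for x, y, z in grid)
```

Since `one` is built only from p1, p2, p3 by adding and multiplying with polynomials, it lies in the ideal; being the constant 1, it shows the three equations have no common solution.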

Combining Solvers
In practice, we need a combination of theory solvers.
  Nelson-Oppen combination method.
  Reduction techniques.
  Model-based theory combination.

SAT (propositional checkers): DPLL
State: M | F, where M is a partial model and F is a set of clauses.

DPLL: Guessing (case-splitting)
  p | p ∨ q, ¬q ∨ r   ⟹   p, ¬q | p ∨ q, ¬q ∨ r

DPLL: Deducing
  p | ¬p ∨ q, ¬p ∨ s   ⟹   p, s | ¬p ∨ q, ¬p ∨ s

DPLL: Backtracking
  p, ¬s, q | p ∨ q, s ∨ q, ¬p ∨ ¬q   ⟹   p, s | p ∨ q, s ∨ q, ¬p ∨ ¬q

Modern DPLL
  Efficient indexing (two-watch literal)
  Non-chronological backtracking (backjumping)
  Lemma learning
  …
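The three basic DPLL moves (guessing, deducing, backtracking) fit in a short recursive procedure. The sketch below is the textbook version without any of the modern refinements listed above; the encoding of literals as nonzero integers (negative for negation) is an assumption borrowed from the DIMACS convention.

```python
def unit_propagate(clauses, assignment):
    """Deduce forced literals; return False if some clause is falsified."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue                       # clause already satisfied
            free = [lit for lit in clause if -lit not in assignment]
            if not free:
                return False                   # every literal is false: conflict
            if len(free) == 1:
                assignment.add(free[0])        # unit clause forces this literal
                changed = True
    return True

def dpll(clauses, assignment=frozenset()):
    """Return a set of true literals satisfying all clauses, or None."""
    assignment = set(assignment)
    if not unit_propagate(clauses, assignment):
        return None
    variables = {abs(lit) for clause in clauses for lit in clause}
    unassigned = [v for v in sorted(variables)
                  if v not in assignment and -v not in assignment]
    if not unassigned:
        return assignment
    v = unassigned[0]
    for guess in (v, -v):                      # guess; on failure, backtrack and flip
        result = dpll(clauses, assignment | {guess})
        if result is not None:
            return result
    return None

# Clauses of the backtracking slide with p = 1, q = 2, s = 3.
slide_clauses = [[1, 2], [3, 2], [-1, -2]]
model = dpll(slide_clauses)
```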

Solvers = DPLL + Decision Procedures
Efficient decision procedures for conjunctions of ground literals.
  a=b, a<5 | ¬(a=b) ∨ f(a)=f(b), a<5 ∨ a>10

Theory Conflicts
  a=b, a>0, c>0, a+c<0 | F
The literals a>0, c>0, a+c<0 are inconsistent in arithmetic, so the solver backtracks.

Naïve recipe?
SMT solver = DPLL + decision procedure.
Standard question: why don’t you use CPLEX for handling linear arithmetic?

Efficient SMT solvers
Decision procedures must be:
  Incremental & backtracking
  Theory propagation
    a=b, a<5 | … a<6 ∨ f(a)=a   ⟹   a=b, a<5, a<6 | … a<6 ∨ f(a)=a
  Precise (theory) lemma learning
    a=b, a>0, c>0, a+c<0 | F
    Learn clause: ¬(a=b) ∨ ¬(a>0) ∨ ¬(c>0) ∨ ¬(a+c<0)   Imprecise!
    Precise clause: ¬(a>0) ∨ ¬(c>0) ∨ ¬(a+c<0)

SMT x SAT
For some theories, SMT can be reduced to SAT.
SMT works at a higher level of abstraction:
  bvmul32(a, b) = bvmul32(b, a)

SMT x First-order provers
F and T are given together to a first-order theorem prover.
T may not have a finite axiomatization.

Test case generation

Test case generation
Test (correctness + usability) is 95% of the deal:
  Dev/Test is 1:1 in products.
  Developers are responsible for unit tests.
Tools:
  Annotations and static analysis (SAL + ESP)
  File fuzzing
  Unit test case generation

Security is critical
Security bugs can be very expensive:
  Cost of each MS Security Bulletin: $600K to $millions.
  Cost due to worms: $billions.
  The real victim is the customer.
Most security exploits are initiated via files or packets.
  Ex: Internet Explorer parses dozens of file formats.
Security testing: hunting for million-dollar bugs
  Write A/V
  Read A/V
  Null pointer dereference
  Division by zero

Hunting for Security Bugs
Two main techniques used by “black hats”:
  Code inspection (of binaries).
  Black box fuzz testing.
Black box fuzz testing:
  A form of black box random testing.
  Randomly fuzz (= modify) a well-formed input.
  Grammar-based fuzzing: rules to encode how to fuzz.
Heavily used in security testing.
  At MS: several internal tools.
  Conceptually simple yet effective in practice.

Directed Automated Random Testing (DART)
  1. Run the test on the current input (initially a seed) and monitor the execution path.
  2. Record the path condition of that execution among the known paths.
  3. Build a constraint system for a path not yet covered and solve it.
  4. Use the solution as a new test input and repeat.

DARTish projects at Microsoft
  PEX: implements DART for .NET.
  SAGE: implements DART for x86 binaries.
  YOGI: implements DART to check the feasibility of program paths generated statically.
  Vigilante: partially implements DART to dynamically generate worm filters.

What is Pex?
  Test input generator.
  Pex starts from parameterized unit tests.
  Generated tests are emitted as traditional unit tests.

ArrayList: The Spec

ArrayList: AddItem Test

  class ArrayListTest {
    [PexMethod]
    void AddItem(int c, object item) {
      var list = new ArrayList(c);
      list.Add(item);
      Assert(list[0] == item);
    }
  }

  class ArrayList {
    object[] items;
    int count;

    ArrayList(int capacity) {
      if (capacity < 0) throw ...;
      items = new object[capacity];
    }

    void Add(object item) {
      if (count == items.Length)
        ResizeArray();
      items[this.count++] = item;
    }
    ...

ArrayList: Exploring AddItem with Pex
Pex runs AddItem on concrete inputs (c, item), records the branch conditions observed on each run, then negates one of them and asks the SMT solver for inputs that reach the uncovered branch.

  Constraints to solve      Inputs        Observed constraints
  (none, starting input)    (0, null)     !(c<0) && 0==c
  !(c<0) && 0!=c            (1, null)     !(c<0) && 0!=c
  c<0                       (-1, null)    c<0

Run 1, (0, null): c < 0 is false in the constructor; count == items.Length is 0 == c, which is true; the assertion list[0] == item becomes item == item. This is a tautology, i.e. a constraint that is always true, regardless of the chosen values. We can ignore such constraints.
Picking the next branch to cover: negate 0==c and solve !(c<0) && 0!=c with the SMT solver, which yields (1, null).
Run 2, (1, null): c < 0 is false and 0 == c is false.
Pick new branch: solve c<0, which yields (-1, null).
Run 3, (-1, null): c < 0 is true, so the constructor throws.
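The exploration of AddItem can be imitated in a few lines of Python. This is only a sketch of the idea, not how Pex is implemented: the branch conditions are recorded by hand in `run_add_item`, and `solve` is a brute-force stand-in for the SMT solver over a small, hand-picked domain. All names are hypothetical.

```python
def run_add_item(c):
    """Concrete run of AddItem(c, item); returns the observed (condition, outcome) pairs."""
    path = []
    negative = c < 0
    path.append(("c<0", negative))
    if negative:
        return path                        # the constructor throws
    path.append(("0==c", 0 == c))          # count == items.Length, with count = 0
    return path

PREDICATES = {"c<0": lambda c: c < 0, "0==c": lambda c: 0 == c}

def solve(constraints, domain=(0, 1, -1, 2, -2)):
    """Stand-in for the SMT solver (assumption: brute force over a small domain)."""
    for c in domain:
        if all(PREDICATES[name](c) == wanted for name, wanted in constraints):
            return c
    return None

def explore(seed=0):
    """Return one concrete input per distinct execution path."""
    seen, inputs, worklist = set(), [], [seed]
    while worklist:
        c = worklist.pop(0)
        path = tuple(run_add_item(c))
        if path in seen:
            continue
        seen.add(path)
        inputs.append(c)
        # Flip each branch in turn, keeping the constraints observed before it.
        for i, (name, outcome) in enumerate(path):
            candidate = solve(list(path[:i]) + [(name, not outcome)])
            if candidate is not None:
                worklist.append(candidate)
    return inputs

covering_inputs = explore()
```

With this domain ordering the search ends up with the same three values of c as the walkthrough (0, 1 and -1), though it may visit them in a different order.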

PEX ↔ Z3
  Rich combination: linear arithmetic, bitvectors, arrays, free functions.
  Models: the model is used as test inputs.
  Quantifiers: used to model custom theories (e.g., the .NET type system).
  API: huge number of small problems; a textual interface is too inefficient.

SAGE
  Apply DART to large applications (not units).
  Start with well-formed input (not random).
  Combine with generational search (not DFS):
    Negate each constraint in a path constraint, one by one.
    Generate many children for each parent run.
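The "negate one by one" step of generational search can be sketched as follows. The representation of a path constraint as a list of (condition, outcome) pairs and the byte conditions in the usage example are assumptions made for illustration; they are not SAGE's data structures.

```python
def children(path_constraint):
    """For each position i, keep constraints 0..i-1 and negate constraint i."""
    result = []
    for i, (condition, outcome) in enumerate(path_constraint):
        result.append(path_constraint[:i] + [(condition, not outcome)])
    return result

# Hypothetical parent run on an all-zero file: three byte comparisons, all false.
parent = [("byte0 == 'R'", False), ("byte1 == 'I'", False), ("byte2 == 'F'", False)]
kids = children(parent)
```

Each child constraint system is handed to the solver; every satisfiable one yields a new input, so a single parent run can produce many children rather than the single one that depth-first search would give.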

Zero to Crash in 10 Generations
Starting with 100 zero bytes, SAGE generates a crashing test for the Media 1 parser.
Each slide shows a hex dump of the input file (offsets 00000000h to 00000060h); the ASCII column shows how the file evolves:
  Generation 0 (seed file):  all bytes are 00
  Generation 1:   RIFF
  Generation 2:   RIFF ... ***
  Generation 3:   RIFF= ... ***
  Generation 4:   RIFF= ... *** ... strh
  Generation 5:   RIFF= ... *** ... strh ... vids
  Generation 6:   RIFF= ... *** ... strh ... vids ... strf
  Generation 7:   RIFF= ... *** ... strh ... vids ... strf ... (
  Generation 8:   as generation 7, with the additional non-zero bytes C9 9D E4 4E
  Generation 9:   as generation 7, with the additional byte 01
  Generation 10 (CRASH):  RIFF= ... *** ... strh ... vids ... strf²uv:(

SAGE (cont.)
SAGE is very effective at finding bugs.
  Works on large applications.
  Fully automated.
  Easy to deploy (x86 analysis, any language).
  Used in various groups inside Microsoft.
  Powered by Z3.

SAGE ↔ Z3
Formulas are usually big conjunctions.
SAGE uses only the bitvector and array theories.
The pre-processing step has a huge performance impact:
  Eliminate variables.
  Simplify formulas.
  Early unsat detection.

Verifying Compilers
Annotated program → verification condition F
Annotations: pre/post conditions, invariants, and other annotations.

Annotations: Example
  class C {
    private int a, z;
    invariant z > 0
    public void M()
      requires a != 0
    {
      z = 100/a;
    }
  }

States and execution traces
State: Cartesian product of variables, e.g. (x: int, y: int, z: bool)
Execution trace, one of:
  Nonempty finite sequence of states
  Infinite sequence of states
  Nonempty finite sequence of states followed by a special error state

Command language
  x := E        (e.g. x := x + 1, x := 10)
  havoc x
  assert P
  assume P
  S ; T
  S [] T

Reasoning about execution traces
Hoare triple { P } S { Q } says that every terminating execution trace of S that starts in a state satisfying P does not go wrong, and terminates in a state satisfying Q.
Given S and Q, what is the weakest P' satisfying { P' } S { Q }?
P' is called the weakest precondition of S with respect to Q, written wp(S, Q).
To check { P } S { Q }, check P ⇒ P'.

Weakest preconditions
  wp( x := E, Q )    =  Q[ E / x ]
  wp( havoc x, Q )   =  (∀x • Q)
  wp( assert P, Q )  =  P ∧ Q
  wp( assume P, Q )  =  P ⇒ Q
  wp( S ; T, Q )     =  wp( S, wp( T, Q ))
  wp( S [] T, Q )    =  wp( S, Q ) ∧ wp( T, Q )
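The six wp equations translate almost directly into code if predicates are represented as Python functions on states. The sketch below makes two simplifying assumptions that are not in the slides: states are dictionaries, and havoc quantifies over a small finite domain instead of all integers. Commands are encoded as tagged tuples chosen for this example, with "choice" standing for the nondeterministic choice between two commands.

```python
DOMAIN = range(-3, 4)   # assumption: havoc ranges over a small finite domain

def wp(cmd, post):
    """Weakest precondition of cmd with respect to post, as a predicate on states."""
    kind = cmd[0]
    if kind == "assign":                 # wp(x := E, Q) = Q[E / x]
        _, var, expr = cmd
        return lambda s: post({**s, var: expr(s)})
    if kind == "havoc":                  # wp(havoc x, Q) = forall x. Q
        _, var = cmd
        return lambda s: all(post({**s, var: v}) for v in DOMAIN)
    if kind == "assert":                 # wp(assert P, Q) = P and Q
        _, pred = cmd
        return lambda s: pred(s) and post(s)
    if kind == "assume":                 # wp(assume P, Q) = P implies Q
        _, pred = cmd
        return lambda s: (not pred(s)) or post(s)
    if kind == "seq":                    # wp(S ; T, Q) = wp(S, wp(T, Q))
        _, first, second = cmd
        return wp(first, wp(second, post))
    if kind == "choice":                 # wp(S choice T, Q) = wp(S, Q) and wp(T, Q)
        _, left, right = cmd
        pre_left, pre_right = wp(left, post), wp(right, post)
        return lambda s: pre_left(s) and pre_right(s)
    raise ValueError(f"unknown command: {kind}")

# Body of M from the annotation example: assert a != 0; z := 100 / a
body = ("seq",
        ("assert", lambda s: s["a"] != 0),
        ("assign", "z", lambda s: 100 // s["a"]))
pre = wp(body, lambda s: s["z"] > 0)     # postcondition: the invariant z > 0
```

For example, `pre({"a": 5, "z": 1})` holds, while `pre({"a": 0, "z": 1})` fails at the assert, before the division is ever evaluated.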

Structured if statement
  if E then S else T end  =  assume E; S  []  assume ¬E; T

While loop with loop invariant
  while E invariant J do S end
  =  assert J;                                   check that the loop invariant holds initially
     havoc x; assume J;                          “fast forward” to an arbitrary iteration of the loop
     (  assume E; S; assert J; assume false      check that the loop invariant is maintained by the loop body
     [] assume ¬E
     )
where x denotes the assignment targets of S.

Verification conditions: Structure
  Axioms (non-ground)
  Control & data flow: BIG and-or tree (ground)

Hypervisor: A Manhattan Project
The hypervisor sits between the hardware and the operating system.
  Meta OS: small layer of software between hardware and OS.
  Mini: 60K lines of non-trivial concurrent systems C code.
  Critical: must provide functional resource abstraction.
  Trusted: a verification grand challenge.

Hypervisor: Some Statistics
  VCs have several MB.
  Thousands of non-ground clauses.
  Developers are willing to wait at most 5 min per VC.

Challenge: annotation burden
Partial solutions:
  Automatic generation of loop invariants.
  Houdini-style automatic annotation generation.

Challenge
Quantifiers, quantifiers, …
  Modeling the runtime
    ∀ h, o, f: IsHeap(h) ∧ o ≠ null ∧ read(h, o, alloc) = t ⇒ read(h, o, f) = null ∨ read(h, read(h, o, f), alloc) = t
  Frame axioms
    ∀ o, f: o ≠ null ∧ read(h0, o, alloc) = t ⇒ read(h1, o, f) = read(h0, o, f) ∨ (o, f) ∈ M
  User-provided assertions
    ∀ i, j: i ≤ j ⇒ read(a, i) ≤ read(b, j)
  Theories
    ∀ x: p(x, x)
    ∀ x, y, z: p(x, y), p(y, z) ⇒ p(x, z)
    ∀ x, y: p(x, y), p(y, x) ⇒ x = y
Solver must be fast in satisfiable instances.
We want to find bugs!

Bad news
There is no sound and refutationally complete procedure for linear integer arithmetic + free function symbols.

Many Approaches
  Heuristic quantifier instantiation
  Combining SMT with saturation provers
  Complete quantifier instantiation
  Decidable fragments
  Model-based quantifier instantiation

Challenge: modeling runtime
Is the axiomatization of the runtime consistent?
False implies everything.
Partial solution: SMT + saturation provers.
Found many bugs using this approach.

Challenge: Robustness
Standard complaint: “I made a small modification in my spec, and Z3 is timing out.”
This also happens with SAT solvers (NP-complete).
In our case, the problems are undecidable.
Partial solution: parallelization.

Parallel Z3
Joint work with Y. Hamadi (MSRC) and C. Wintersteiger.
Multi-core & multi-node (HPC).
Different strategies (Strategy 1 to Strategy 5) run in parallel.
They collaborate by exchanging lemmas.

Conclusion
Logic as a platform.
Most verification/analysis tools need symbolic reasoning.
SMT is a hot area.
Many applications & challenges.
http://research.microsoft.com/projects/z3
Thank You!

E-matching & Quantifier instantiation
SMT solvers use heuristic quantifier instantiation.
E-matching (matching modulo equalities).
Example:
  ∀x: f(g(x)) = x   { f(g(x)) }   (the term in braces is the trigger)
  a = g(b),
  b = c,
  f(a) ≠ c
The trigger f(g(x)) matches the ground term f(a) modulo a = g(b), with x = b, which gives the instance f(g(b)) = b.
Equalities and ground terms come from the partial model M.
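Matching modulo equalities can be sketched on top of the same union-find idea: when a pattern f(...) is matched against a term, every member of that term's equivalence class is tried. The code below is a small backtracking matcher written for this example only. It does not close the classes under congruence, and it has none of the indexing that makes E-matching efficient in practice. Pattern variables are strings starting with "?", constants are plain strings, and applications are tuples; all of these conventions are assumptions.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def subterms(term, acc):
    acc.add(term)
    if isinstance(term, tuple):
        for arg in term[1:]:
            subterms(arg, acc)

def ematch(pattern, ground_terms, equalities):
    """Yield substitutions under which pattern equals some ground term, modulo equalities."""
    terms = set()
    for t in ground_terms:
        subterms(t, terms)
    for s, t in equalities:
        subterms(s, terms)
        subterms(t, terms)
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t

    for s, t in equalities:
        parent[find(s)] = find(t)

    def match(pat, term, subst):
        if is_var(pat):
            if pat not in subst:
                yield {**subst, pat: term}          # bind the variable
            elif find(subst[pat]) == find(term):
                yield subst                         # consistent with earlier binding
        elif isinstance(pat, str):                  # constant in the pattern
            if pat in parent and find(pat) == find(term):
                yield subst
        else:
            for u in terms:                         # try every member of term's class
                if (isinstance(u, tuple) and find(u) == find(term)
                        and u[0] == pat[0] and len(u) == len(pat)):
                    yield from match_args(pat[1:], u[1:], subst)

    def match_args(pats, args, subst):
        if not pats:
            yield subst
            return
        for extended in match(pats[0], args[0], subst):
            yield from match_args(pats[1:], args[1:], extended)

    for t in terms:
        if find(t) == t:                            # one attempt per equivalence class
            yield from match(pattern, t, {})

trigger = ("f", ("g", "?x"))
matches = list(ematch(trigger, [("f", "a"), "c"], [("a", ("g", "b")), ("b", "c")]))
```

On the slide's example the only match binds ?x to b, giving the instance f(g(b)) = b. Together with a = g(b) and b = c this yields f(a) = c, which contradicts f(a) ≠ c.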

E-matching: why do we use it?
  Integrates smoothly with DPLL.
  Efficient for most VCs.
  Decides useful theories:
    Arrays
    Partial orders
    …

Efficient E-matching
E-matching is NP-hard.
In practice:
  Problem                   Indexing technique
  Fast retrieval            E-matching code trees
  Incremental E-matching    Inverted path index

E-matching code trees
Trigger: f(x1, g(x1, a), h(x2), b)
A compiler translates the trigger into instructions:
  1. init(f, 2)
  2. check(r4, b, 3)
  3. bind(r2, g, r5, 4)
  4. compare(r1, r5, 5)
  5. check(r6, a, 6)
  6. bind(r3, h, r7, 7)
  7. yield(r1, r7)
Similar triggers share several instructions.
Combine code sequences in a code tree.

DPLL(Γ)
Tight integration: DPLL + saturation solver.
  Axioms (non-ground): saturation solver
  BIG and-or tree (ground): SMT

DPLL(Γ)
DPLL(Γ) is parametric in the inference rules it uses.
Examples:
  Resolution
  Superposition calculus
  …

DPLL(Γ)
  DPLL + theories passes ground literals to the saturation solver.
  The saturation solver passes ground clauses back to DPLL + theories.