http://k-framework.org
Specify and Verify Your Language using K
Grigore Rosu
University of Illinois at Urbana-Champaign
Joint project between the FSL group at UIUC (USA) and the FMSE group at UAIC (Romania)

K Team
• UIUC, USA
  – Grigore Rosu (started K in 2003)
  – Cansu Erdogan
  – Patrick Meredith
  – Eric Mikida
  – Brandon Moore
  – Daejun Park
  – Andrei Stefanescu
  Former members
  – Kyle Blocher
  – Peter Dinges
  – Chucky Ellison
  – Dwight Guth
  – Mike Ilseman
  – David Lazar
  – Traian Serbanuta
• UAIC, Iasi, Romania
  – Dorel Lucanu
  – Traian Serbanuta
  – Andrei Arusoae
  – Denis Bogdanas
  – Stefan Ciobaca
  – Gheorghe Grigoras
  – Radu Mereuta
  Former members
  – Irina Asavoae
  – Mihai Asavoae
  – Emilian Necula
  – Raluca Necula

Vision and Objective
[Diagram: Formal Language Definition (Syntax and Semantics), with the tools Parser, Interpreter, Compiler, (semantic) Debugger, Symbolic execution, Model checker, Deductive program verifier, Test-case generation]

Current State-of-the-Art in PL Design, Implementation and Analysis
Consider some programming language, L
• Formal semantics of L?
  – Typically skipped: considered expensive and useless
• Implementations for L
  – Based on some ad hoc understanding of what L is
• Model checkers for L
  – Based on some ad hoc encodings/models of L
• Program verifiers for L
  – Based on some other ad hoc encodings/models of L
• …

Example of C Program
• What should the following program evaluate to?
    int main(void) {
      int x = 0;
      return (x = 1) + (x = 2);
    }
• According to the C “standard”, it is undefined
  GCC 4, MSVC: it returns 4
  GCC 3, ICC, Clang: it returns 3
  By April 2011, both Frama-C (with its Jessie verification plugin) and Havoc "prove" it returns 4
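The two answers can be reproduced without relying on undefined behavior. The sketch below is a Python model (not C) of two hypothetical evaluation strategies for `(x = 1) + (x = 2)`: in the first, each assignment yields the value it stores; in the second, both stores happen first and each operand is then read back from `x`. The function names are made up for illustration and say nothing about how any particular compiler is implemented.

```python
def sum_of_assigned_values():
    """Each assignment expression yields the value it just stored."""
    env = {"x": 0}
    env["x"] = 1
    left = 1               # value of (x = 1)
    env["x"] = 2
    right = 2              # value of (x = 2)
    return left + right


def sum_after_both_stores():
    """Both stores are performed first; each operand is then read back from x."""
    env = {"x": 0}
    env["x"] = 1
    env["x"] = 2
    left = env["x"]        # x is read after both stores
    right = env["x"]
    return left + right


print(sum_of_assigned_values())  # 3
print(sum_after_both_stores())   # 4
```

Both strategies are permitted for an undefined program, which is why a verifier that "proves" a single answer has silently committed to one of them.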

A Formal Semantics Manifesto
• Programming languages must have formal semantics!
  – And analysis/verification tools should build on them
    • Otherwise they are ad hoc and likely wrong
• Informal manuals are not sufficient
  – Manuals typically have a formal syntax of the language (in an appendix)
  – Why not a formal semantics appendix as well?

Motivation and Goal
We want a semantic framework which makes it easy, fun and useful to define programming languages and to reason about programs!

Our Approach
[Diagram: Formal Language Definition (Syntax and Semantics), with the tools Parser, Interpreter, Compiler, (semantic) Debugger, Symbolic execution, Model checker, Deductive program verifier, Test-case generation]

Formal Language Definition (Syntax and Semantics)
If one needs a Ph.D. to define a language, then we have already failed.

Complete K Definition of KernelC
• Syntax declared using annotated BNF …
• Configuration given as a nested cell structure. Leaves can be sets, multisets, lists, maps, or syntax
• Semantic rules given contextually
    <k> X = V => V …</k> <env>… X |-> (_ => V) …</env>
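To make the assignment rule concrete, here is a minimal Python sketch of the single rewrite step it describes. The configuration is modeled as a dictionary with a `k` cell (a list of pending computation items) and an `env` cell (a map). This encoding and the names `Assign` and `step_assign` are assumptions for illustration, not how the K tool represents configurations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assign:
    """The computation item `X = V`, where V is already a value."""
    name: str
    value: int


def step_assign(config):
    """Apply `<k> X = V => V ...</k> <env>... X |-> (_ => V) ...</env>` once.

    Returns the rewritten configuration, or None if the rule does not match.
    """
    k, env = config["k"], config["env"]
    # The rule matches only when `X = V` is at the front of the k cell
    # and X already has a binding in the environment.
    if not k or not isinstance(k[0], Assign) or k[0].name not in env:
        return None
    head, rest = k[0], k[1:]
    new_env = dict(env)                # the rest of env ("...") is untouched
    new_env[head.name] = head.value    # X |-> (_ => V)
    return {"k": [head.value] + rest, "env": new_env}  # X = V => V


before = {"k": [Assign("x", 5), "rest-of-program"], "env": {"x": 0, "y": 7}}
after = step_assign(before)
print(after)
```

Only the parts of the configuration that the rule mentions are read or changed; everything covered by "…" is carried over as-is, which is what "given contextually" amounts to in this model.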

Underlying Semantics
• Best explained in terms of graph rewriting [ICGT’12]
  – Double pushout gives true concurrency in the presence of configuration sharing
• Also by translation to rewrite logic
  – Generic translation of graph rewriting (slow)
  – Eliminating sharing (fast but loses concurrency)
    • Currently how K is implemented
• Most users are not aware of K’s complex semantics; they don’t need it in order to use K

Implementation (Java)
• Front-end: two SDF-based parsers are generated, one for programs (concrete syntax) and one for the semantics (concrete + abstract + K syntax)
  – One can also use custom parsers for programs
• Back-ends: Maude+Z3 (most features); LaTeX; Java+Z3 (prototype); Coq, ACL2 (in progress)

K Demo
• Using Kweb, an online interface to K
  – http://kframework.org

K Scales
Besides smaller and paradigmatic teaching languages, several larger languages were defined
• Java: 1.4 by Chen & Farzan, and 7 by Bogdanas
• Verilog: by Meredith and Katelman
• Python: by Guth
• C: by Ellison
etc.

K Configuration and Definition of C
[Figure: the K configuration of C; labels: "Heap", "75 Cells!"]
… plus ~1200 rules …

K Semantics are testable!
[Diagram: Formal Language Definition (Syntax and Semantics), with the tools Parser, Interpreter, (semantic) Debugger]

Testing the K definition of C
• Tested on thousands of C programs (several benchmarks, including the gcc torture test, code from the obfuscated C competition, etc.)
  – Passed 99.2% so far!
  – GCC 4.1.2 passes 99%, ICC 99.4%, Clang 98.3% (no opt.)
• The most complete formal C semantics [POPL’12]

Comparisons of C Semantics

[Diagram: Formal Language Definition (Syntax and Semantics), with the tool Model checker]

Model Checking C Programs
• Detects bugs in finite-state C programs; e.g., races
• Not discussed here

[Diagram: Formal Language Definition (Syntax and Semantics), with the tools Deductive program verifier and Symbolic execution]

State-of-the-Art
Many different program logics for “state” properties: FOL, HOL, Separation logic…
• Redefine the language using a different semantic approach (Hoare/separation/dynamic logic)
• Very language specific, error-prone; e.g.: …

State-of-the-Art
• Thus, these semantics need to be proved sound, sometimes also relatively complete, w.r.t. a trusted, operational semantics of the language
• Verification tools are developed using them
• So we have an inherent gap between the trusted, operational semantics and the semantics currently used for program verification

Our Proposal
• Use directly the trusted operational semantics!
  – Has been done before (ACL2), but proofs are low-level (induction on the transition system) and language-specific
• We propose a language-independent proof system
  – Takes operational semantics as axioms
  – Derives reachability properties
  – Is sound and relatively complete
[Diagram: Formal Language Definition (Syntax and Semantics), with the tools Deductive program verifier and Symbolic execution]

Need a means to specify static and dynamic program properties
[Diagram: Formal Language Definition (Syntax and Semantics), with the tools Deductive program verifier and Symbolic execution]

Matching Logic for Static Properties
http://matching-logic.org
• Logic for specifying static properties about program configurations and reasoning with them
  – Generalizes separation logic
• Key insight:
  – Configuration terms with variables are allowed to be used as predicates, called patterns!
  – Semantically, their satisfaction means matching
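The key insight can be illustrated with a small matcher. In the sketch below, configurations are nested tuples, pattern variables are `Var` objects, and a concrete configuration satisfies a pattern when some binding of the variables makes the pattern equal to it and an optional constraint on the bindings holds. This is a deliberate simplification (purely syntactic matching plus a side constraint), and `Var`, `match` and `satisfies` are hypothetical names, not part of any matching logic tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A pattern variable."""
    name: str


def match(pattern, term, subst=None):
    """Return a substitution making `pattern` equal to `term`, or None."""
    subst = dict(subst or {})
    if isinstance(pattern, Var):
        if pattern.name in subst:
            # A repeated variable must be bound to the same term everywhere.
            return subst if subst[pattern.name] == term else None
        subst[pattern.name] = term
        return subst
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term):
            return None
        for p, t in zip(pattern, term):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None


def satisfies(term, pattern, constraint=lambda s: True):
    """A configuration satisfies a pattern if it matches and the constraint holds."""
    subst = match(pattern, term)
    return subst is not None and constraint(subst)


config = ("cfg", ("env", ("x", 3)), ("out", (1, 2, 3)))
pattern = ("cfg", ("env", ("x", Var("V"))), ("out", Var("A")))

print(satisfies(config, pattern))                              # True
print(satisfies(config, pattern, lambda s: len(s["A"]) > 1))   # True
print(satisfies(config, pattern, lambda s: s["V"] > 10))       # False
```

A ground configuration used as a pattern has no variables, so it is satisfied by exactly one configuration: itself.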

Examples of Patterns
• x points to sequence A with |A| > 1, and the reversed sequence rev(A) has been output
• untrusted() can only be called from trusted()

More Formally: Configurations
• For concreteness, assume configurations having the following syntax: … (matching logic works with any configurations)
• Examples of concrete (ground) configurations: …

More Formally: Patterns
• Concrete configurations are already patterns, but very simple ones: ground patterns
• Example of a more complex pattern: …
• Thus, patterns generalize both terms and [FOL]

More Formally: Reasoning
• We can now prove (using [FOL] reasoning) properties about configurations, such as …

Matching Logic vs. Separation Logic
• Matching logic achieves separation through matching at the structural (term) level, not through special logical connectives (*).
• Separation logic = Matching logic [heap] [OOPSLA’12]
• Matching logic realizes separation at all levels of the configuration, not only in the heap
  – the heap was only 1 of the 75 cells in C’s definition

Need a means to specify static and dynamic program properties
[Diagram: Formal Language Definition (Syntax and Semantics), with the tools Deductive program verifier and Symbolic execution]

Reachability Rules for Dynamic Properties
• “Rewrite” rules over matching logic patterns: … (generalize to conditional rules)
• Since patterns generalize terms, matching logic reachability rules capture term rewriting rules
• Moreover, they deal naturally with side conditions: … turn into …

Expressivity of Reachability Rules
• Capture operational semantics rules: …
• Capture Hoare triples: …

Reachability Logic
• Language-independent proof system for deriving sequents of the form …, where A (axioms) and C (circularities) are sets of reachability rules
• Intuitively: symbolic execution with operational semantics + reasoning with cyclic behaviors
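One way to read a reachability property is as a claim about executions: starting from a configuration that matches a source pattern, the run eventually hits a configuration that matches a target pattern (or, under partial correctness, does not terminate). The sketch below checks such a claim by plain concrete execution of a toy transition system. It is not the proof system, which works symbolically and handles loops through circularities; `step`, `reaches` and the toy semantics are all made up for illustration.

```python
def step(state):
    """One transition of a toy semantics that sums n, n-1, ..., 1 into s."""
    n, s = state
    if n > 0:
        return (n - 1, s + n)
    return None  # no rule applies: the state is terminal


def reaches(start, target, max_steps=10_000):
    """Return True if the run from `start` hits a state satisfying `target`
    within `max_steps` transitions."""
    state = start
    for _ in range(max_steps):
        if target(state):
            return True
        state = step(state)
        if state is None:
            return False  # terminated without reaching the target
    return False


# Claim: from (n, 0) the system reaches (0, n * (n + 1) / 2), for each tested n.
claim_holds = all(
    reaches((n, 0), lambda st, n=n: st == (0, n * (n + 1) // 2))
    for n in range(50)
)
print(claim_holds)  # True
```

Testing finitely many values of n is exactly what a proof avoids: a symbolic n covers all of them at once, and the loop is then handled by a circularity rather than by unrolling.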

Proof System for Reachability
• Proves any reachability property of any language, including anything that Hoare logic can (proofs of comparable size) [FM’12]
• Sound (partially correct) and relatively complete [ICALP’12], [OOPSLA’12], [LICS’13]

Traditional Verification vs. Our Approach
• Traditional proof systems: language-specific
• Our proof system: language-independent

MatchC Demo?

Example – Swapping Values
• What is the K semantics of the swap function?
• Let $ be its body
    rule <k> $ => return; …</k>
         <heap>… x |-> (a => b), y |-> (b => a) …</heap>
      if x ≠ y
    rule <k> $ => return; …</k>
         <heap>… x |-> a …</heap>
      if x = y
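The two rules can be read as a specification of the heap before and after the call, one for distinct pointers and one for aliased pointers. The sketch below models the heap as a Python dictionary from locations to values and checks a hypothetical `swap` against whichever rule applies. The body of `swap` is an assumption, since the function's code is not part of this transcript; it is the usual swap through a temporary.

```python
def swap(heap, x, y):
    """Swap the values stored at locations x and y, in place."""
    tmp = heap[x]
    heap[x] = heap[y]
    heap[y] = tmp


def check_swap_spec(heap, x, y):
    """Run swap and check the resulting heap against the applicable rule."""
    before = dict(heap)
    swap(heap, x, y)
    # Frame: the "..." in <heap> says every other location is unchanged.
    frame_ok = heap.keys() == before.keys() and all(
        heap[loc] == before[loc] for loc in before if loc not in (x, y)
    )
    if x != y:
        # x |-> (a => b), y |-> (b => a)
        return frame_ok and heap[x] == before[y] and heap[y] == before[x]
    # Aliased case: x |-> a is left as it is.
    return frame_ok and heap[x] == before[x]


print(check_swap_spec({10: "a", 20: "b", 30: "c"}, 10, 20))  # True
print(check_swap_spec({10: "a", 30: "c"}, 10, 10))           # True
```

The aliased case needs its own rule because a heap pattern with two bindings, one for x and one for y, can only match two different locations.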

Example – Reversing a list
• What is the K semantics of the reverse function?
• Let $ be its body
    rule <k> $ => return p; </k>
         <heap>… list(x, A) => list(p, rev(A)) …</heap>
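The rule says that a heap fragment holding the list A at x is rewritten into one holding rev(A) at the returned pointer p, with the rest of the heap untouched. The sketch below models this in Python: the heap maps locations to `(value, next)` pairs, `list_at` recovers the sequence A for which list(p, A) holds, and `reverse` is a hypothetical in-place pointer reversal. The function body, the `NULL = 0` convention and the node layout are assumptions, since the original code is not part of this transcript.

```python
NULL = 0


def list_at(heap, p):
    """Return the sequence A such that list(p, A) holds in the heap."""
    seq = []
    while p != NULL:
        value, nxt = heap[p]
        seq.append(value)
        p = nxt
    return seq


def reverse(heap, x):
    """Reverse the list starting at x in place; return the new head p."""
    p = NULL
    while x != NULL:
        value, nxt = heap[x]
        heap[x] = (value, p)   # point this node at the already reversed prefix
        p, x = x, nxt
    return p


heap = {1: ("a", 2), 2: ("b", 3), 3: ("c", NULL), 9: ("unrelated", NULL)}
A = list_at(heap, 1)
p = reverse(heap, 1)
print(p, list_at(heap, p))  # 3 ['c', 'b', 'a']
```

Location 9 plays the role of the "…" in the heap cell: it is not part of the list, so the rule requires it to be left alone.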

Conclusion: It can be done!
[Diagram: Formal Language Definition (Syntax and Semantics), with the tools Parser, Interpreter, Compiler, (semantic) Debugger, Symbolic execution, Model checker, Deductive program verifier, Test-case generation]