Secure Compiler Seminar 11/7
Survey: Modular Development of Certified Program Verifiers with a Proof Assistant
Toshihiro YOSHINO (D1, Yonezawa Lab.) <tossy-2@yl.is.s.u-tokyo.ac.jp>

Today’s Paper
● A. Chlipala (UC Berkeley). Modular Development of Certified Program Verifiers with a Proof Assistant. ICFP ’06.
  ○ The implementation can be downloaded from the web site below:
    ⇒ http://proofos.sourceforge.net/

Overview of the Paper
● Case study in developing a certified program verifier with Coq
  ○ Verifies memory safety of x86 machine code
  ○ Its soundness is machine-checked
  ○ Modular development with reusable functors
    ● A new verifier based on another type system can be created at low cost

Constructing Certified Verifiers
● Design and implement with Coq
  ○ Use the “extraction” feature of Coq to obtain a working verifier
● A verifier can be formalized as:
    verifier : ∀ p : program, [[safe (load p)]]
  ○ load: program -> state loads a program
    ● The type program represents a binary file format
  ○ safe: state -> Prop is the safety property we wish to verify for programs
  ○ [[P]] is notation for poption P
    ● The counterpart of option (OCaml) or Maybe (Haskell) for the domain Prop

Constructing Certified Verifiers
● Abstraction refinement in multiple stages
  ○ Each stage (component) is a functor which transforms target states into source states
    ● Later components reason at higher levels of abstraction
  ○ Use Coq’s module system to implement this modular design

Formalization of the x86 Instruction Set
● PCC-style formalization
  ○ Subset of the x86 instruction set + an ERROR instruction
    ● mov, jcc, …
  ○ Safety ≡ ERROR is unreachable
    ● In combination with assertions, many properties can be proven
  ○ Can be formalized coinductively
    ● Copes with infinite derivations

Types and Extraction in Coq
● Basically, Coq manipulates terms of a dependently typed lambda calculus
  ○ A proposition is represented as a type, and its proof as a term of that type
    ● Well known as the Curry-Howard isomorphism
  ○ A proof step corresponds to constructing part of a term of the goal’s type
    ● Given a goal, refine it interactively into subgoals, and eliminate holes
    ● Rules used for these steps are called tactics

Types and Extraction in Coq
● Program extraction from Coq code
  ○ In short, extraction erases terms of sorts other than Set
  ○ Brief example: isEven

    Coq source:

      Definition isEven : forall (n : nat), poption (even n).
        refine (fix isEven (n : nat) : poption (even n) :=
          match n return … with
          | O => PSome _ _
          | S (S n) => …
          | _ => PNone _
          end); auto.
      Qed.

    Extracted OCaml code:

      let rec isEven (n : nat) =
        match n with
        | O -> true
        | S (S n) -> isEven n
        | _ -> false
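The extracted isEven above depends on a nat type that the slide does not show. The following is a minimal runnable sketch of the same function, assuming a unary nat type like Coq’s; the names is_even and nat_of_int are illustrative and do not come from the paper.

```ocaml
(* Unary natural numbers, mirroring Coq's nat (assumed definition). *)
type nat = O | S of nat

(* After extraction the proof carried by poption is erased,
   so only a boolean answer remains. *)
let rec is_even (n : nat) : bool =
  match n with
  | O -> true
  | S (S m) -> is_even m
  | _ -> false

(* Hypothetical helper: build a nat from a machine integer. *)
let rec nat_of_int (k : int) : nat =
  if k <= 0 then O else S (nat_of_int (k - 1))

let () =
  Printf.printf "%b %b\n" (is_even (nat_of_int 4)) (is_even (nat_of_int 7))
```

The Coq version returns evidence of even n in the PSome case; this sketch shows only what is left once that evidence has been erased.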

poption: “option” for the Domain “Prop”
● Two constructors: PNone and PSome
  ○ PSome is given a proof of P
    ● Literally, PSome means “P holds and I have a proof of it” and PNone means “I am not sure”
  ○ Can be used as a failure monad
    ● PNone >>= _ = PNone
    ● PSome p >>= f = f p
  ○ In extraction, PSome corresponds to true, and PNone to false
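The failure-monad reading of poption can be sketched in OCaml as it might look after extraction. This assumes, as the slide states, that PSome becomes true and PNone becomes false; the operator definition and check_all are illustrative and are not taken from the paper’s code.

```ocaml
(* Assumption: an extracted poption is just a bool
   (true = PSome, false = PNone). *)
type poption = bool

(* Bind for the failure monad:
     PNone   >>= _ = PNone
     PSome p >>= f = f p
   The proof p has been erased, so f takes unit instead. *)
let ( >>= ) (m : poption) (f : unit -> poption) : poption =
  if m then f () else false

(* Hypothetical example: three checks chained with bind.
   The chain stops at the first check that fails. *)
let check_all (x : int) : poption =
  (x >= 0) >>= fun () ->
  (x < 100) >>= fun () ->
  x mod 2 = 0
```

Here check_all 42 succeeds, while check_all 200 fails at the second check without evaluating the third.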

soption
● soption extends poption with a parameter
  ○ A proposition about a term of domain T (of sort Set)
  ○ soption, too, can be used as a failure monad
● In the paper’s theoretical part, it is written as {{ x : T | P }}
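A rough OCaml picture of soption as a failure monad follows. It assumes that extraction keeps the witness x : T and drops the proof of P, which would leave an ordinary option; lookup and add_regs are hypothetical helpers for illustration.

```ocaml
(* Assumption: an extracted soption {{ x : T | P }} keeps the witness
   and drops the proof, i.e. it behaves like 'a option. *)
let ( >>= ) (m : 'a option) (f : 'a -> 'b option) : 'b option =
  match m with
  | None -> None
  | Some x -> f x

(* Hypothetical register file: an association list from names to values. *)
let lookup (regs : (string * int) list) (r : string) : int option =
  List.assoc_opt r regs

(* Adds two registers; fails if either one is unknown. *)
let add_regs (regs : (string * int) list) (r1 : string) (r2 : string)
    : int option =
  lookup regs r1 >>= fun a ->
  lookup regs r2 >>= fun b ->
  Some (a + b)
```

Unlike the boolean poption, each successful step here hands a value to the next one.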

Coq’s Module System
● Used to build reusable verification components
  ○ Frequent pattern:

      Module Type MACHINE.
        Parameter mstate : Set.
        Parameter minitState : mstate -> Prop.
        …
      End MACHINE.

      Module M86 <: MACHINE.
        Definition mstate := state.
        Definition minstr := instr.
        …
      End M86.

      Record state : Set := {
        stRegs32 : regs32;
        …
      }.

      Inductive instr : Set :=
        | Arith : …
        | ….

      Inductive exec : … := ….
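The same pattern (a signature, a module implementing it, and a functor reusable over any implementation) can be sketched with OCaml modules. This is only an analogy to the Coq code: the step field, the Toy machine and the Run functor are assumptions made for the example, and Prop is approximated by bool so that the code runs.

```ocaml
(* Analogue of Module Type MACHINE. *)
module type MACHINE = sig
  type mstate
  val minit_state : mstate -> bool   (* Prop in Coq; bool here *)
  val step : mstate -> mstate        (* assumed field, not on the slide *)
end

(* Hypothetical toy machine: the state is a single program counter. *)
module Toy : MACHINE with type mstate = int = struct
  type mstate = int
  let minit_state s = (s = 0)
  let step s = s + 1
end

(* A reusable component: a functor that works for any MACHINE. *)
module Run (M : MACHINE) = struct
  let rec run_n (n : int) (s : M.mstate) : M.mstate =
    if n <= 0 then s else run_n (n - 1) (M.step s)
end

module ToyRun = Run (Toy)
```

Swapping Toy for another module that satisfies MACHINE requires no change to Run, which is the kind of reuse the slide refers to.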

Module ModelCheck
● Provides fundamental methods of model checking
  ○ Methods to prove theorems about infinite-state systems through exhaustive exploration
    ● The model is refined in each of the following stages
[Figure: stages ordered from Abstract to Concrete]

Module ModelCheck: Introduced Elements
● absState: a set of abstract states
  ○ An abstract state is managed together with “hypotheses”, states that are known to be safe
    ● Hypotheses are used, for example, to formalize the return pointer of a function
● A relation describes the correspondence between machine states and abstract states
  ○ The context (Γ) is deleted when extracting a verifier
● init is a set (actually a list) of initial states
  ○ It must be a set because one real machine state may correspond to multiple abstract states
  ○ There must be some element of init that has no hypotheses

Module ModelCheck: Introduced Elements
● step describes an execution step
  ○ Executes an instruction from the specified state
    ● soption is used because the execution may get stuck
  ○ Progress and Preservation must hold
[Figure: diagrams illustrating Progress and Preservation]
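The roles of init and step can be illustrated with a small worklist exploration. This is a toy sketch under strong assumptions, not the ModelCheck module itself: abstract states are plain integers, hypotheses are omitted, the transition table is invented, and the soption result of step is approximated by OCaml’s option.

```ocaml
(* Hypothetical abstraction: an abstract state is just an integer label. *)
type abs_state = int

(* step returns the successor abstract states, or None when the
   verifier cannot show that execution continues safely from s. *)
let step (s : abs_state) : abs_state list option =
  match s with
  | 0 -> Some [1; 2]
  | 1 -> Some [0]
  | 2 -> Some [2]       (* idles forever *)
  | _ -> None           (* unknown state: verification fails *)

(* Exhaustively explore everything reachable from init.
   Returns true only if step succeeds on every reachable state. *)
let check (init : abs_state list) : bool =
  let rec go visited = function
    | [] -> true
    | s :: rest when List.mem s visited -> go visited rest
    | s :: rest ->
      (match step s with
       | None -> false
       | Some next -> go (s :: visited) (next @ rest))
  in
  go [] init
```

Exploration terminates because each abstract state is expanded at most once. With a finite set of abstract states, this is how an infinite-state machine can still be covered exhaustively.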

Module ModelCheck: The Concept Illustrated
[Figure. Labels: Initial states; State space of a real machine (MACHINE: input to the module); absState; step]

Module Reduction
● Translates x86 machine language into a simpler RISC-style instruction set (SAL)
  ○ x86 machine language is too complex and not suitable for verification purposes
    ● One instruction may perform several basic operations
    ● The same basic operations show up in the workings of many instructions
● The Reduction module also provides a model-checking layer for SAL programs

Module Reduction: SAL (Simplified Assembly Language)
● Named after the language used in Proof-Carrying Code [Necula 1997]
● RISC-style instruction set
  ○ Arithmetic is extended to allow expressions with parentheses and infix operators
● Additional temporary registers TMPi
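To make the idea of reduction concrete, here is a small sketch that expands two x86-like instructions into sequences of simpler assignments. The data types and the exact expansion of push are assumptions chosen for illustration and do not reproduce the SAL definition from the paper.

```ocaml
(* Hypothetical registers, including a temporary register TMPi. *)
type reg = EAX | ESP | TMP of int

(* Expressions with nested arithmetic, as SAL allows. *)
type exp =
  | Reg of reg
  | Const of int
  | Sub of exp * exp

(* Simplified instructions: each performs one basic operation. *)
type sal =
  | Assign of reg * exp     (* r := e *)
  | Store of exp * exp      (* mem[e1] := e2 *)

(* A tiny subset of x86-like instructions. *)
type x86 =
  | Mov of reg * reg        (* mov dst, src *)
  | Push of reg             (* push src *)

(* Reduce one complex instruction to a list of basic operations. *)
let reduce (i : x86) : sal list =
  match i with
  | Mov (dst, src) -> [Assign (dst, Reg src)]
  | Push r ->
    [ Assign (TMP 0, Sub (Reg ESP, Const 4));  (* compute the new stack top *)
      Store (Reg (TMP 0), Reg r);              (* write the pushed value *)
      Assign (ESP, Reg (TMP 0)) ]              (* commit the stack pointer *)
```

A single push turns into three basic operations, and the first of them uses the same kind of arithmetic assignment as many other instructions would.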

Module FixedCode
● Ensures that the code region is not overwritten by the code itself
  ○ To simplify the verification framework
● The definition takes the form of a ModelCheck
  ○ An additional check is performed only on stores to memory
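The extra check on stores can be sketched as a wrapper around an underlying step function. All names, the instruction type and the address range below are hypothetical; the sketch only shows the shape of such a layer.

```ocaml
(* Hypothetical instruction type: a store to a known address, or anything else. *)
type instr = Store of int | Nop

(* Assumed code region [code_start, code_end). *)
let code_start = 0x1000
let code_end = 0x2000

let in_code (a : int) : bool = code_start <= a && a < code_end

(* Wrap an inner step function: reject stores into the code region,
   and delegate every other instruction unchanged. *)
let fixed_code_step (inner : 's -> instr -> 's option) (s : 's) (i : instr)
    : 's option =
  match i with
  | Store a when in_code a -> None
  | _ -> inner s i
```

Only the Store case adds any work, which matches the slide’s point that the additional check is needed only when writing to memory.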

Module TypeSystem
● Support for a standard approach to type systems
  ○ A set of types is introduced and typing rules for values are described
  ○ A subtype relation is also introduced
    ● The definition in the figure suffices because Coq takes care of that part
  ○ Each register is associated with a type

Module TypeSystem
● viewShift represents a shift in the view of types
  ○ Occurs at places where a program crosses an abstraction boundary
    ● For example, at function calls, when the stack frame changes
    ● Introducing an existential is also a kind of view shift

Module WeakUpdate
● Introduces a type system with weak update
  ○ Each memory cell has an associated type, and this type does not change during a run
    ● A cell can be overwritten only with a value of its type
● Dynamic memory management is out of scope
  ○ In real settings, memory is frequently reclaimed and reused
    ● Garbage collector or malloc/free
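The weak-update discipline can be illustrated with a fixed heap typing and a check on stores. The types, addresses and function names below are invented for this example and are not the definitions used in the WeakUpdate module.

```ocaml
(* Hypothetical types and values. *)
type ty = Int | Ptr of ty
type value = VInt of int | VPtr of int   (* VPtr a: pointer to address a *)

(* Fixed heap typing: the type of each cell never changes during a run. *)
let heap_type (addr : int) : ty option =
  match addr with
  | 100 -> Some Int
  | 104 -> Some (Ptr Int)
  | _ -> None

(* A pointer has type Ptr t when the cell it points to has type t. *)
let has_type (v : value) (t : ty) : bool =
  match v, t with
  | VInt _, Int -> true
  | VPtr a, Ptr t' -> heap_type a = Some t'
  | _ -> false

(* A store is allowed only if the value has the cell's fixed type. *)
let store_ok (addr : int) (v : value) : bool =
  match heap_type addr with
  | Some t -> has_type v t
  | None -> false
```

Because heap_type never changes, checking each store in isolation is enough. Allowing a cell to change its type would require tracking aliases, which is exactly what weak update avoids.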

The Rest of the Modules
● Module StackTypes
  ○ Keeps track of the types of stack slots
● Module SimpleFlags
  ○ Keeps track of flag values
    ● On x86 as well, no single instruction performs a conditional test and a jump at once
  ○ Crucial for ensuring that a pointer is valid (not null) or for checking array bounds
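Why flag tracking matters for null checks can be shown with a toy state that remembers which register the last test examined. Everything below (the types, the instruction set, and the simplification that Other clobbers flags but leaves registers alone) is an assumption for illustration, not the SimpleFlags implementation.

```ocaml
(* Hypothetical pointer types. *)
type ty = MaybeNull | NonNull

(* What is known about the zero flag. *)
type flags =
  | Unknown
  | ZeroTestOf of string    (* ZF is set iff the named register is 0 *)

type state = { regs : (string * ty) list; flags : flags }

type instr =
  | Test of string          (* test r, r : sets ZF iff r = 0 *)
  | Jz                      (* jump if ZF is set *)
  | Other                   (* assumed to clobber flags only *)

let set_reg r t regs = (r, t) :: List.remove_assoc r regs

(* State after the instruction when execution falls through
   (for Jz: the jump was not taken, so ZF was clear). *)
let fallthrough (s : state) (i : instr) : state =
  match i with
  | Test r -> { s with flags = ZeroTestOf r }
  | Jz ->
    (match s.flags with
     | ZeroTestOf r -> { s with regs = set_reg r NonNull s.regs }
     | Unknown -> s)
  | Other -> { s with flags = Unknown }
```

The test and the jump are separate instructions, so the verifier must carry the flag information from one to the other. If an unrelated instruction intervenes, that knowledge is lost and the register stays MaybeNull.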

Case Study: A Verifier for Algebraic Datatypes
● Implemented the library and a sample verifier with Coq
  ○ http://proofos.sourceforge.net/
  ○ Approx. 20K (+α) LoC
    ● The main implementation consists of only 600 LoC
    ● 7,000 LoC for implementing library components
    ● 10,000 for generic utilities
      • 1,000 for bitvectors and fixed-precision arithmetic
      • 1,000 for a subset of x86 machine code
  ○ Auxiliary library from an OCaml implementation (not counted here)
    ● x86 binary parsing, etc.

Related Work
● Foundational PCC [Appel 2001]
  ○ Reduces the TCB and also improves the flexibility of PCC by building the system on a logical framework
  ○ However, efficiency is sacrificed for generality
    ● Theoretical issues seem to take priority over pragmatics
● Epigram [McBride, McKinna 2004], ATS [Chen, Xi 2005], RSP [Westbrook et al. 2005] and GADTs [Sheard 2004]
  ○ Incorporate dependent types into programming languages
  ○ But the foundations of Coq’s implementation and metatheory are simpler than theirs

Summary (of the Paper)
● Designed a structure for modular certified verifiers
  ○ Components are reusable functors
  ○ Pipeline-style design
● Implemented library components with Coq
  ○ As a case study, a memory-safety verifier for x86 machine code was constructed

Relevance to My Research
● I have been studying a framework for building verifiers for low-level languages
  ○ First, formalize the common language ADL
  ○ Verification is done on the translated program (in ADL)
● Trying to prove the correctness of the translation
  ○ Currently ongoing, with Coq

Relevance to My Research
● The two projects take very similar approaches
  ○ ADL and SAL are both designed on minimalist criteria
  ○ Verification logic is built on top of the common language’s semantics
    ● To achieve high portability and flexibility
● From this viewpoint, my project is covered by his… (x_x)
  ○ Correctness of the translation is also proven with Coq in proofos
  ○ Thinking positively, my direction was not so wrong

Relevance to My Research
● Comparison of the two projects…

                    proofos [Chlipala 06]    L3Cover [Yoshino 06]
  Common language   SAL                      ADL
  Implementation    Coq                      Java
  Parametrization   ML-style modules         OO-style (inheritance)