Extensible Untrusted Code Verification
Robert Schneck, with George Necula and Bor-Yuh Evan Chang
May 14, 2003, OSQ Retreat
Flexibility for Code Producers
• A host receives code from an untrusted agent
• Before executing the code, the host wants to verify certain properties (e.g. memory safety)
• The host does not want to restrict the code producer...
  – ... to a particular type system
  – ... to particular software conventions
• Then how do we verify the code?
[Diagram: untrusted code on one side, the trusted host on the other, a "?" in between]
An Untrusted Verifier
• The code producer supplies the verifier along with the code
• Too hard to prove correctness of the verifier...
[Diagram: code and verifier, both untrusted, with a "?"]
An Untrusted Verifier
• The code producer supplies the verifier along with the code
• Too hard to prove correctness of the verifier...
• Embed the untrusted verifier as an extension in a trusted framework (the Open Verifier)
[Diagram: untrusted code and verifier; the verifier becomes an extension of OpenVer]
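The Open Verifier
[Diagram: the trusted Core passes each state s to the trusted Decoder and to the untrusted Extension, which arrives with the code; next states E flow back to the Core]
• Decoder, given state s, answers:
  – the instruction at state s is safe if P holds
  – proceed to next states D
• Extension, given state s, answers:
  – a proof of P
  – proceed to next states E, and a proof that E covers D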
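The exchange on this slide can be sketched as a worklist loop. This is a minimal illustration, not the actual OpenVer core: the function names, the toy program-counter states, and the string "proofs" are all hypothetical, and "E covers D" is modeled here simply as set inclusion where the real framework would check a logical proof.

```python
def verify(initial, decode, extend, check_safe, check_covers):
    """Return True if every state reached from `initial` is justified."""
    seen, work = set(), [initial]
    while work:
        s = work.pop()
        if s in seen:                        # state already handled once
            continue
        seen.add(s)
        p, d = decode(s)                     # trusted: safety condition P, next states D
        proof_p, e, proof_cover = extend(s)  # untrusted: proofs and next states E
        if not check_safe(s, p, proof_p):    # the proof of P must check
            return False
        if not check_covers(e, d, proof_cover):  # the proof that E covers D must check
            return False
        work.extend(e)                       # continue from E, not from D
    return True


# Toy instance (hypothetical): states are program counters 1..3, and 3 halts.
def toy_decode(pc):
    return "True", ([pc + 1] if pc < 3 else [])

def honest_extension(pc):
    _, d = toy_decode(pc)
    return "trivial", list(d), "by-inclusion"

def cheating_extension(pc):
    return "trivial", [], "by-inclusion"     # silently drops successor states

def toy_check_safe(state, p, proof):
    return p == "True" and proof == "trivial"

def toy_check_covers(e, d, proof):
    # Stand-in for proof checking: "E covers D" is modeled as D being a subset of E.
    return proof == "by-inclusion" and set(d) <= set(e)
```

The point of the design is that nothing the extension returns is believed without a check: a cheating extension that omits a successor state fails the coverage check and the code is rejected.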
The Decoder
• The decoder is the canonical symbolic evaluator
• Examples of decoding a state (pc = 5 ∧ A), by the instruction at 5:
  – r1 ← r2 + 1: local safety True; next state pc = 6 ∧ r1 = r2 + 1 ∧ ∃x. A[r1 ↦ x]
  – r1 ← read r2: local safety addr r2; next state pc = 6 ∧ r1 = (sel m r2) ∧ ∃x. A[r1 ↦ x]
  – jump F: local safety True; next state pc = F ∧ A
  – if Q then jump F: local safety True; next states pc = F ∧ A ∧ Q, pc = 6 ∧ A ∧ ¬Q
  – jump *ra: local safety True; next state pc = ra ∧ A
• The decoder only handles hardware conventions
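The decoding examples above can be sketched as a small symbolic evaluator. This is an illustrative sketch under stated assumptions: formulas are nested tuples, the instruction encoding (`add1`, `read`, `jump`, `if`, `jump_ind`) is invented for the example, fresh names `x0`, `x1`, ... are assumed not to clash with program variables, and, as on the slide, the destination register is assumed to differ from the source register.

```python
import itertools

TRUE = ("true",)
_fresh = itertools.count()

def subst(f, var, new):
    """Replace variable `var` by `new` in formula `f` (operator names are kept)."""
    if f == var:
        return new
    if isinstance(f, tuple) and f:
        return (f[0],) + tuple(subst(g, var, new) for g in f[1:])
    return f

def decode(pc, A, instr):
    """Decode the state (pc, A); return (local safety, list of next states)."""
    op = instr[0]
    if op in ("add1", "read"):               # r1 <- r2 + 1   or   r1 <- read r2
        _, r1, r2 = instr
        x = f"x{next(_fresh)}"                # fresh name for the old value of r1
        value = ("add", r2, 1) if op == "add1" else ("sel", "m", r2)
        safety = TRUE if op == "add1" else ("addr", r2)
        post = ("and", ("eq", r1, value), ("exists", x, subst(A, r1, x)))
        return safety, [(pc + 1, post)]
    if op == "jump":                          # jump F
        return TRUE, [(instr[1], A)]
    if op == "if":                            # if Q then jump F
        _, Q, F = instr
        return TRUE, [(F, ("and", A, Q)), (pc + 1, ("and", A, ("not", Q)))]
    if op == "jump_ind":                      # jump *ra: the next pc is symbolic
        return TRUE, [(instr[1], A)]
    raise ValueError(f"unknown instruction: {instr!r}")
```

Note that the only safety condition produced is `addr r2` for a memory read; nothing here knows about lists, stacks, or any other software convention.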
Soundness and Trustworthiness
• We have proven the soundness of the algorithm used by the core of the Open Verifier
• The trusted code base (core, decoder, proof checker) could be small and simple, and thus easy to trust
• We need to ensure the extension is memory safe
  – Use the extension to verify itself; this one time, run it in a separate address space
• What about the extensions?
  – What does it take to write an extension?
  – How much can extensions do?
A Type System of Lists
• Code producer uses accessible memory for lists
  – "1" is a list (the empty list)
  – any even address containing a list is a list
  – nothing else is a list
[Diagram: example list memory; values shown: 16, 16, 20, 4, 10, 1; variables s, b, a and the constant 1]
• Consider the program:
  1. store a ← 1
  2. s ← read b
  3. if odd(s) then jump 5
  4. store s ← a
  5. halt
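To make the list definition and the five-line program concrete, here is a small sketch that runs the program on an explicit memory. Everything in it is an assumption made for illustration: memory is a Python dict from accessible addresses to contents, `UnsafeAccess` and the sample memory are invented, and `is_list` takes the most literal inductive reading of the slide, under which a cyclic chain is not a list.

```python
class UnsafeAccess(Exception):
    """Raised when the program touches an address that is not accessible."""

def is_list(mem, v, seen=None):
    """1 is a list; an even accessible address containing a list is a list."""
    seen = set() if seen is None else seen
    if v == 1:
        return True
    if v % 2 != 0 or v not in mem or v in seen:   # odd, inaccessible, or cyclic
        return False
    seen.add(v)
    return is_list(mem, mem[v], seen)

def run(mem, a, b):
    """Execute the five-line program on a copy of `mem` and return the result."""
    mem = dict(mem)
    if a not in mem:
        raise UnsafeAccess(a)
    mem[a] = 1                 # 1. store a <- 1
    if b not in mem:
        raise UnsafeAccess(b)
    s = mem[b]                 # 2. s <- read b
    if s % 2 == 0:             # 3. if odd(s) then jump 5
        if s not in mem:
            raise UnsafeAccess(s)
        mem[s] = a             # 4. store s <- a
    return mem                 # 5. halt

# Hypothetical memory: 16 -> 20 -> 4 -> empty, and 10 -> empty.
sample = {16: 20, 20: 4, 4: 1, 10: 1}
result = run(sample, a=10, b=16)
```

On this particular memory the run is safe and 16 is still a list afterwards (16, then 20, then 10, then the empty list). The sketch checks one execution only; the verifier's job is to establish safety for every memory satisfying the typing assumptions.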
Informal Proof Obligations
  1. store a ← 1
  2. s ← read b
  3. if odd(s) then jump 5
  4. store s ← a
  5. halt
• Proof obligations (informal): "a and b are accessible addresses, and if the contents of b after storing 1 at a is even then it is an accessible address"
  – Too low level
• Code producer would prefer instead: "a and b are non-empty lists"
  – Simpler and easier to prove (using the definition of lists)
• How can the code producer achieve this effect?
Typing Rules
• Rules (premises ⟹ conclusion):
  – list 1
  – nelist a ⟹ list a
  – list a ∧ even a ⟹ nelist a
  – nelist a ⟹ addr a
  – nelist a ∧ inv m ⟹ list (sel m a)
  – nelist a ∧ list b ∧ inv m ⟹ inv (upd m a b)
• Program:
  1. store a ← 1
  2. s ← read b
  3. if odd(s) then jump 5
  4. store s ← a
  5. halt
• Step at line 1:
  – Initial state: pc = 1 ∧ nelist a ∧ nelist b ∧ inv m
  – Decoder local safety: addr a
  – Decoder next state: pc = 2 ∧ nelist a ∧ nelist b ∧ ∃m'. inv m' ∧ m = (upd m' a 1)
  – Extension next state: pc = 2 ∧ nelist a ∧ nelist b ∧ inv m
Typing Rules
• Rules (premises ⟹ conclusion):
  – list 1
  – nelist a ⟹ list a
  – list a ∧ even a ⟹ nelist a
  – nelist a ⟹ addr a
  – nelist a ∧ inv m ⟹ list (sel m a)
  – nelist a ∧ list b ∧ inv m ⟹ inv (upd m a b)
• Program:
  1. store a ← 1
  2. s ← read b
  3. if odd(s) then jump 5
  4. store s ← a
  5. halt
• Step at line 2:
  – Initial state: pc = 2 ∧ nelist a ∧ nelist b ∧ inv m
  – Decoder local safety: addr b
  – Decoder next state: pc = 3 ∧ nelist a ∧ nelist b ∧ inv m ∧ s = (sel m b)
  – Extension next state: pc = 3 ∧ nelist a ∧ nelist b ∧ inv m ∧ list s
Producing Proofs
• Using the typing lemmas can be fully automated
  – We use a Prolog interpreter where the Prolog program consists of the typing rules:
      nelist a :- list a, even a
  – Each individual program is then handled automatically
• Proving the typing rules is hard
  – They become lemmas to be proven by hand in Coq
      Definition nelist [a: val] := (addr a) /\ (even a).
      Definition list [a: val] := (a = 1) \/ (nelist a).
      Lemma rule : (a: val)(list a) -> (even a) -> (nelist a).
      Proof. ...
  – But they only need to be proven once
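The automated half can be illustrated with a hand-written backward chainer over the typing rules. This is a sketch, not the Prolog interpreter from the talk: the rule set is the one read off the typing-rule slides and the Coq definitions, terms are encoded as tuples invented for this example, and a depth bound stands in for proper loop detection (the rules `list a :- nelist a` and `nelist a :- list a, even a` call each other).

```python
def prove(goal, facts, depth=8):
    """Backward-chain from `goal` to the given `facts` using the typing rules."""
    if goal in facts:
        return True
    if depth == 0:                      # crude guard against rule cycles
        return False
    pred, x = goal

    def sub(g):
        return prove(g, facts, depth - 1)

    if pred == "list":
        if x == 1:                      # list 1
            return True
        if isinstance(x, tuple) and x[0] == "sel":
            _, m, a = x                 # list (sel m a) :- nelist a, inv m
            if sub(("nelist", a)) and sub(("inv", m)):
                return True
        return sub(("nelist", x))       # list a :- nelist a
    if pred == "nelist":                # nelist a :- list a, even a
        return sub(("list", x)) and sub(("even", x))
    if pred == "addr":                  # addr a :- nelist a
        return sub(("nelist", x))
    if pred == "inv" and isinstance(x, tuple) and x[0] == "upd":
        _, m, a, b = x                  # inv (upd m a b) :- nelist a, list b, inv m
        return sub(("nelist", a)) and sub(("list", b)) and sub(("inv", m))
    return False

# Facts of the initial state: nelist a, nelist b, inv m.
initial = {("nelist", "a"), ("nelist", "b"), ("inv", "m")}
# Facts at line 4, after the extension has learned list s and the branch gave even s.
at_line_4 = initial | {("list", "s"), ("even", "s")}
```

With these facts, the decoder's safety condition `addr a` for line 1 follows from `nelist a`, the invariant survives the store because `inv (upd m a 1)` is derivable, and `addr s` becomes provable only once `list s` and `even s` are known at line 4.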
How to Construct an Extension
[Diagram: Proofs of Lemmas, Theorem Prover, OpenVer Wrapper, Type Checker]
• Instantiate all the typing predicates and rules as definitions and lemmas
• Use the typing rules in an automated theorem prover
• Package the type-checker state into a logical predicate; this requires recognizing invariants which may be implicit in the type-checker
• Could simply import an existing type checker (e.g. a bytecode verifier)
What can extensions do?
• Software conventions of stacks and function calls
  – The program had better use the stack safely
• A low-level extension to prove run-time functions
  – allocator, garbage collector
• Working on an extension for the simple object-oriented language Cool
Experience so far...
• We have built a prototype implementation
  – 5500 lines of ML in the trusted framework
    • 1000 lines to parse MIPS assembly code
    • 2000 lines in the logic and proof checker
    • 120 lines in the decoder
• An extension for the lists example
  – which also handles the stack and allocation
    • 1000 lines in the "standard" type checker
    • 900 lines to package for the OpenVer and tie it all together
    • 600 lines for the Prolog interpreter
    • 250 lines for the Prolog rules
    • ??? lines of Coq proof
      – compare 4000 lines of Coq script for Java native-code
Extensible Untrusted Code Verification
Robert Schneck, with George Necula and Bor-Yuh Evan Chang
May 13, 2003, OSQ Retreat
The extension produces different next states...
1. generalization
  – Only need to know that memory satisfies an invariant, not its exact contents
2. re-use of states
  – loop invariants
  – A program to effect y ← y0 + x0 (for positive x)
      1. while (x > 0) do {
      2.   x ← x - 1
      3.   y ← y + 1
      4. }
  – Invariant on line 1: (x ≥ 0 ∧ x + y = x0 + y0)
3. "indirect" states
  – The decoder can't handle a state without a literal pc
  – An indirect jump could implement function return, method dispatch, switch statement, exception handling...
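The loop-invariant example can be checked on concrete runs with a short sketch. The function name is made up for illustration, and an assertion at the loop head is only a runtime check of the invariant on one execution, not the proof an extension would supply.

```python
def add_by_counting(x0, y0):
    """Compute y0 + x0 by moving one unit from x to y per iteration."""
    x, y = x0, y0
    while True:
        # Invariant on line 1: x >= 0 and x + y = x0 + y0
        assert x >= 0 and x + y == x0 + y0
        if not x > 0:          # 1. while (x > 0) do {
            break
        x = x - 1              # 2.   x <- x - 1
        y = y + 1              # 3.   y <- y + 1
    return y                   # on exit x = 0, so y = x0 + y0
```

Because every trip around the loop returns to a state satisfying the same invariant, the verifier can re-use the state for line 1 instead of producing a new one per iteration, which is what lets the scan terminate.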