

Secure Compiler Seminar 4/11
Visions toward a Secure Compiler
Toshihiro YOSHINO <tossy-2@yl.is.s.u-tokyo.ac.jp> (D1, Yonezawa Lab.)

Talk Agenda
- Brief Introduction about TAL and PCC
- Introduction of my Master Thesis
- Visions toward a Secure Compiler

Brief Introduction about TAL and PCC


Background
- Program verification = mathematically assuring that a program has certain properties
  - Useful for security
    - Memory access safety, information flow analysis, …
- Verifying low-level code directly reduces the TCB
  - TCB: Trusted Computing Base
  - High-level code must be compiled after it is verified ⇒ we must trust the compiler
  - Assemblers are much simpler than compilers

Current Techniques and Problems
- Code signing
  - Based on public key cryptography
  - Can prove the genuineness of code
  - Cannot prove safety by itself
- Signature matching
  - Uses a dictionary of malicious patterns and matches target programs against it
  - Employed in many antivirus systems
  - A pass does NOT mean the code is safe
    - Often unable to detect very new viruses

Proof-Carrying Code [Necula et al. 1997]
- Technique for safe execution of untrusted code
  - Code consumer does not need to trust the producer
- Code is distributed with the proof of its safety
  - Producer creates a proof
  - Consumer verifies the proof against its security policy

Proof-Carrying Code [Necula et al. 1997] (continued)
- Low cost for the consumer
  - Consumer has only to verify the proof
    - For example, by typechecking
- Tamper-proof
  - Code that passes the check does NOT do harm even if it has been modified
    - If a modification makes the code fail the check, the code will not run, so it is safe
    - Otherwise the code still obeys the consumer's security policy

Typed Assembly Language [Morrisett et al. 1999]
- Extends a conventional assembly language with static type checking
  - An instance of Proof-Carrying Code
- By type checking, it can guarantee
  - Memory access safety
    - Program never accesses outside the memory area allocated for it
  - Interface consistency
    - Type agreement of arguments / return values of functions, etc.

TAL System Illustrated
[Figure: diagram with the labels "TAL System", "Type Checker", "Code with type information", "Assembler", "Linker", "Code Consumer"]

A Brief Example of TAL Program

    fact: {eax: B4}
        movl %eax, %ecx
        movl $1, %eax
    loop: {eax: B4, ecx: B4}
        mull %ecx
        decl %ecx
        cmpl $0, %ecx
        jg   loop
    end:  {eax: B4}

- Type information: the {…} annotations at each label (used to typecheck the program)
- Program code: the instructions (same as conventional assembly languages)
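To make the control flow of the example concrete, here is a minimal Python sketch that steps through the same instructions by hand. It is not part of TAL and says nothing about the type checker; the name `fact_sim` and the 32-bit masking (taken from the `B4` annotations) are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF  # assumption: B4 registers are 32 bits wide


def fact_sim(eax: int) -> int:
    """Simulate the `fact` routine from the slide, one instruction per line."""
    ecx = eax                           # movl %eax, %ecx
    eax = 1                             # movl $1, %eax
    while True:                         # loop:
        eax = (eax * ecx) & MASK32      # mull %ecx (low 32 bits only)
        ecx = (ecx - 1) & MASK32        # decl %ecx
        # cmpl $0, %ecx ; jg loop  (jg is a signed comparison)
        signed_ecx = ecx - (1 << 32) if ecx & 0x80000000 else ecx
        if signed_ecx <= 0:
            break
    return eax                          # end: {eax: B4}


print(fact_sim(5))  # 120
```

Note that the loop body runs at least once, so an input of 0 yields 0 in this simulation rather than 1.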

Related Work: TALK, TOS [Maeda, 2005]
- TALK: TAL for Kernel
  - Morrisett et al. use a garbage collector for memory management in TAL
  - For an OS, GC cannot be assumed
    - Must implement memory management (malloc/free)
- TOS: Typed Operating System
  - An experimental OS written in TALK

Introduction of My Master Thesis


My Work for Master Thesis
- "A Framework Using a Common Language to Build Program Verifiers for Low-Level Languages"
  - To help developers of program verifiers
  - To be a common basis for verification of low-level programs
    - Such as assembly and machine languages

Motivation: Verifiers are Hard to Develop
Especially in low-level languages…
- Complex semantics
  - Semantics of each instruction is complex
  - There are many instructions in a language
- Low portability
  - Low-level languages heavily depend on the underlying architecture
  - Accordingly, the entire verifier also depends on the underlying architecture

Our Idea
- Split a verifier into three parts
  1. Design a common language,
  2. Translate the target program into that language, and
  3. Verify the translated program
  - These parts are explicitly independent of each other
  - Thus we can replace them easily

Our Idea
[Figure: the verifier. Target Program → Translator (2) → Translated Program → Verification Logic (3), defined over the Semantics of Common Language (1) → Result (Success / Fail)]
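The three-part split can be sketched as plain function composition. The Python below is only an illustration under stated assumptions: a "program" is modelled as a list of strings, and `make_verifier`, `translate_arch_a`, `translate_arch_b` and `no_error_command` are hypothetical names, not part of the framework described in the talk.

```python
from typing import Callable, List

# Hypothetical stand-ins: a program is just a list of instruction strings.
SourceProgram = List[str]
CommonProgram = List[str]   # part (1): the shared common-language representation

Translator = Callable[[SourceProgram], CommonProgram]    # part (2)
VerificationLogic = Callable[[CommonProgram], bool]      # part (3)


def make_verifier(translate: Translator,
                  logic: VerificationLogic) -> Callable[[SourceProgram], bool]:
    """Compose a translator and a verification logic into one verifier."""
    def verify(program: SourceProgram) -> bool:
        return logic(translate(program))
    return verify


# Toy translators for two made-up source syntaxes.
def translate_arch_a(program: SourceProgram) -> CommonProgram:
    return [line.lower() for line in program]


def translate_arch_b(program: SourceProgram) -> CommonProgram:
    return [line.strip().lower() for line in program]


# Toy verification logic: reject any program containing the `error` command.
def no_error_command(program: CommonProgram) -> bool:
    return "error" not in program


# The same logic is reused; only the translator changes.
verify_a = make_verifier(translate_arch_a, no_error_command)
verify_b = make_verifier(translate_arch_b, no_error_command)
print(verify_a(["NOP", "NOP"]), verify_b(["  nop", " ERROR "]))  # True False
```

Because the verification logic only ever sees the common representation, supporting another source language means writing one more translator and nothing else.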

How Do We Solve the Problems?
- Coping with complex semantics
  - Only translators need to care about the semantics of the source language
  - A translator is reusable
    - Once the description is done, we can reuse it
- Improving portability
  - Verification logic is also reusable
    - Once implemented, it can be used for other architectures simply by replacing translators

How Do We Solve the Problems?
[Figure: the same verifier with only the translator replaced. Target Program in Another Language → Translator → Translated Program → Verification Logic, defined over the Semantics of Common Language → Result (Success / Fail)]

Overview of the Work
- Designed a framework to build program verifiers
  - Designed a common language, ADL
  - Discussed the correctness of translators
  - Proved that the properties assured are preserved throughout translation
  - Implemented the framework in Java

ADL: A Common Language
[Figure: the verifier diagram again (Target Program, Translator, Translated Program, Verification Logic, Semantics of Common Language, Result: Success / Fail)]

ADL: A Common Language
Design Concept
- ADL: Architecture Description Language
- From observation of many architectures
  - Data is stored in registers and memory, and is manipulated according to the program
  - Jumps alone are sufficient for control flow
- Expressiveness
  - Arithmetic, logical operations, …
  - C-like expressions
- Conservative semantics
  - No need to describe ill-behaved programs
  - To simplify the semantics

ADL: A Common Language
Overview of the Language
- Imperative language which manipulates registers and memory
  - 5 kinds of commands
    - nop, error, assignment, goto, if-then-else
  - More like C than assembly
    - Infix operators, parenthesized formulae
    - Conditional execution on an arbitrary condition using the if command
  - Only goto modifies control flow
    - Unconditional branch
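The five command kinds are few enough that a toy interpreter fits in a few lines. The following Python sketch is an assumption-laden illustration, not the ADL implementation from the thesis: registers are the only state (memory is omitted), expressions are modelled as Python functions, and the class names and the `run` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

State = Dict[str, int]            # register name -> value (memory omitted)
Expr = Callable[[State], int]     # expressions modelled as Python functions


@dataclass
class Nop:
    pass


@dataclass
class Error:
    pass


@dataclass
class Assign:
    target: str
    value: Expr


@dataclass
class Goto:
    label: str


@dataclass
class IfThenElse:
    cond: Callable[[State], bool]
    then_cmd: "Command"
    else_cmd: "Command"


Command = Union[Nop, Error, Assign, Goto, IfThenElse]


def run(program: List[Command], labels: Dict[str, int],
        state: State, max_steps: int = 1000) -> State:
    """Execute commands until control runs off the end of the program."""
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            return state
        cmd = program[pc]
        pc += 1
        while isinstance(cmd, IfThenElse):      # pick the branch that is taken
            cmd = cmd.then_cmd if cmd.cond(state) else cmd.else_cmd
        if isinstance(cmd, Error):
            raise RuntimeError("error command reached")
        if isinstance(cmd, Assign):
            state[cmd.target] = cmd.value(state)
        elif isinstance(cmd, Goto):             # the only command that changes pc
            pc = labels[cmd.label]
    raise RuntimeError("step limit exceeded")


# Toy program: %eax = %ecx + (%ecx - 1) + ... + 1
program = [
    Assign("%eax", lambda s: 0),
    IfThenElse(lambda s: s["%ecx"] == 0, Goto("end"), Nop()),   # label: loop
    Assign("%eax", lambda s: s["%eax"] + s["%ecx"]),
    Assign("%ecx", lambda s: s["%ecx"] - 1),
    Goto("loop"),
]
labels = {"loop": 1, "end": 5}
result = run(program, labels, {"%ecx": 4})
print(result["%eax"])  # 10
```

Conditional branching needs no dedicated command here: it is an if-then-else whose branches are goto and nop, which matches the slide's point that only goto modifies control flow.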

ADL: A Common Language
A Brief Example

ADL:

    data: ...
    main: %ebx = &data;
          %eax = 0;
          goto &lp;
    lp:   %eax = %eax + *[4](%ebx);
          %ebx = *[4](%ebx + 4);
          if %ebx == &null then goto &end else goto &lp;
    end:  goto &end;

x86:

    data: ...
    main: movl $data, %ebx
          movl $0, %eax
    lp:   addl 0(%ebx), %eax
          movl 4(%ebx), %ebx
          cmpl $0, %ebx
          je   end
          jmp  lp
    end:  jmp  end
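Read line by line, the example walks a linked list and sums its elements. The Python sketch below mirrors the ADL version under several assumptions that are not in the slide: memory is a dictionary from integer addresses to 4-byte cells, the null pointer is address 0 (as the x86 `cmpl $0` suggests), and the contents of `data`, which the slide elides, are invented for illustration. Real ADL keeps pointers distinct from integers; plain integers are used here only for brevity.

```python
NULL = 0  # assumption: the null pointer is address 0


def sum_list(memory: dict, data_addr: int) -> int:
    """Mirror of the ADL program; each line is annotated with its ADL command."""
    ebx = data_addr                  # %ebx = &data;
    eax = 0                          # %eax = 0;
    while True:                      # lp:
        eax = eax + memory[ebx]      # %eax = %eax + *[4](%ebx);
        ebx = memory[ebx + 4]        # %ebx = *[4](%ebx + 4);
        if ebx == NULL:              # if %ebx == &null then goto &end
            return eax               # (the slide's `end` loops forever; we return)
        # else goto &lp;


# Hypothetical contents for `data`: the list 10 -> 20 -> 30.
# Each node is a value at offset 0 and a next pointer at offset 4.
memory = {
    100: 10, 104: 108,
    108: 20, 112: 116,
    116: 30, 120: NULL,
}
print(sum_list(memory, 100))  # 60
```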

ADL: A Common Language
Restrictions
- ADL has a few restrictions by design
  - Code and data are completely separated
  - We assume NOTHING about the memory layout of a program
- To simplify the semantics
- Some programs cannot be expressed
  - However, most decent programs can be written even under these restrictions
  - To be discussed in the following slides

ADL: A Common Language > Restrictions
Separation of Code and Data
- Do not treat code as data
  - ADL programs cannot read / write code
- We cannot express programs which use dynamic code generation
  - But the patterns of the generated code are fixed in many cases ⇒ another solution is possible
    - For example, prepare a function for each pattern of code

ADL: A Common Language > Restrictions
Not Assume Memory Layout
- Casting is prohibited
  - ADL distinguishes integers and pointers
    - In real architectures, pointers are not distinguished from integers
- Pointer arithmetic is restricted
  - Only pointer + integer and pointer - pointer are defined
    - Other operations return 'undetermined'
  - Sufficient for array/structure operations and offset calculation
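One way to picture the restricted arithmetic is a small value domain with three kinds of values. The Python sketch below is a guess at the shape of such a domain, not ADL's actual definition: the `Ptr` representation (a block name plus an offset), the rule that pointer - pointer is only meaningful within one block, and the treatment of integer + pointer as undetermined are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Int:
    value: int


@dataclass(frozen=True)
class Ptr:
    block: str    # hypothetical: name of the memory block pointed into
    offset: int


@dataclass(frozen=True)
class Undetermined:
    pass


Value = Union[Int, Ptr, Undetermined]


def add(a: Value, b: Value) -> Value:
    if isinstance(a, Int) and isinstance(b, Int):
        return Int(a.value + b.value)
    if isinstance(a, Ptr) and isinstance(b, Int):       # pointer + integer
        return Ptr(a.block, a.offset + b.value)
    return Undetermined()                               # e.g. pointer + pointer


def sub(a: Value, b: Value) -> Value:
    if isinstance(a, Int) and isinstance(b, Int):
        return Int(a.value - b.value)
    if isinstance(a, Ptr) and isinstance(b, Ptr):       # pointer - pointer
        # Assumption: the difference is only defined within a single block.
        if a.block == b.block:
            return Int(a.offset - b.offset)
    return Undetermined()


field = add(Ptr("data", 0), Int(4))       # offset calculation for a struct field
print(field)                              # Ptr(block='data', offset=4)
print(sub(field, Ptr("data", 0)))         # Int(value=4)
print(add(field, field))                  # Undetermined()
```

Since no operation turns an `Int` into a `Ptr`, casting an integer to a pointer cannot be expressed in this domain at all.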

Program Translator
[Figure: the verifier diagram again (Target Program, Translator, Translated Program, Verification Logic, Semantics of Common Language, Result: Success / Fail)]

Program Translator
- Translates low-level programs into ADL
- We must assure that program translators are correct
  - Otherwise, we cannot trust the entire verifier
  - Correctness is defined in the following discussion

Program Translator
What Is Correctness of Program Translation?
- Instruction = function over machine states
- Correctness = the correspondence between the states of the two machines is preserved by translation
[Figure: the original program takes State to State'; the translated program takes the corresponding State to the corresponding State']

Program Translator
How to Confirm Correctness of Translation
- Any program results in corresponding states for any input ⇒ correctness
  - Total inspection is NOT realistic
  - A theorem prover would be useful
    - Automatic proving is future work
    - But how do we confirm the correctness of the description of the source language?
- At this time, we take an empirical approach
  - Test several cases using an interpreter
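The empirical approach amounts to differential testing: run an instruction and its translation from the same state and check that the resulting states still correspond. The Python sketch below shows the idea for a single instruction. Everything in it is hypothetical: `x86_addl` stands in for a reference semantics, `adl_translation` for the interpreted ADL output of a translator, and `corresponds` uses plain equality as the state correspondence.

```python
import random

MASK32 = 0xFFFFFFFF


def x86_addl(state: dict, src: str, dst: str) -> dict:
    """Hypothetical reference semantics of `addl src, dst` (flags ignored)."""
    new = dict(state)
    new[dst] = (state[dst] + state[src]) & MASK32
    return new


def adl_translation(state: dict, src: str, dst: str) -> dict:
    """Hypothetical result of interpreting the ADL command `dst = dst + src`."""
    new = dict(state)
    new[dst] = (state[dst] + state[src]) % (1 << 32)
    return new


def corresponds(x86_state: dict, adl_state: dict) -> bool:
    """State correspondence; here simply equality of all registers."""
    return x86_state == adl_state


def test_translation(trials: int = 1000, seed: int = 0) -> bool:
    """Compare both semantics on random inputs; True if no mismatch is found."""
    rng = random.Random(seed)
    for _ in range(trials):
        state = {"%eax": rng.getrandbits(32), "%ebx": rng.getrandbits(32)}
        if not corresponds(x86_addl(state, "%ebx", "%eax"),
                           adl_translation(state, "%ebx", "%eax")):
            return False
    return True


print(test_translation())  # True
```

A passing run only shows that no counterexample was found among the sampled inputs, which is why the slide lists a theorem prover as the stronger alternative.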

Verification Logic
[Figure: the verifier diagram again (Target Program, Translator, Translated Program, Verification Logic, Semantics of Common Language, Result: Success / Fail)]

Verification Logic
- Verifies the properties of translated programs
  - Function that takes a program and returns success or fail
  - Soundness must be assured
    - This is the task of the creator of a verification logic
    - We do not discuss it any further here
- Definition: soundness of a verification logic
  - Verification logic V: State → Bool
  - The set {S | V(S)} is closed under step execution
    - If V(S), execution never falls into the error state, and
    - If V(S) and S→T (→ means step execution), then V(T)
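On a finite state space, the two soundness conditions can be checked by brute force, which makes the definition easy to experiment with. The Python sketch below does this for a toy transition system. The state names, the `ERROR` marker and the function `is_sound` are assumptions for illustration; real machine state spaces are far too large for this kind of enumeration.

```python
from typing import Callable, Dict, Hashable, Iterable, List

ERROR = "error"  # hypothetical name for the error state


def is_sound(V: Callable[[Hashable], bool],
             states: Iterable[Hashable],
             step: Callable[[Hashable], List[Hashable]]) -> bool:
    """Check that {S | V(S)} excludes the error state and is closed under step."""
    if V(ERROR):
        return False                  # an accepted state must not be the error state
    for s in states:
        if V(s) and not all(V(t) for t in step(s)):
            return False              # some accepted S steps to a rejected T
    return True


# Toy transition system: 0 -> 1 -> 2 -> 0 is a safe cycle; 3 -> 4 -> error.
transitions: Dict[Hashable, List[Hashable]] = {
    0: [1], 1: [2], 2: [0],
    3: [4], 4: [ERROR], ERROR: [],
}


def step(s: Hashable) -> List[Hashable]:
    return transitions[s]


def accept_cycle(s: Hashable) -> bool:
    return s in {0, 1, 2}


def accept_too_much(s: Hashable) -> bool:
    return s in {0, 1, 2, 3, 4}


print(is_sound(accept_cycle, transitions, step))       # True
print(is_sound(accept_too_much, transitions, step))    # False
```

Closure plus rejection of the error state is enough for the first condition: if every step from an accepted state lands on an accepted state, execution starting from an accepted state can never reach the error state.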

Verification Logic
Soundness of Verification Logic
[Figure: within the set of machine states, the subset of states S such that V(S). Soundness = if V(S) ∧ S→T then V(T)]

Verification Logic
Program Translation and Verification
We proved the following theorem:
- If the program translator is correct, and
- the verification logic is sound, then
⇒ Verification of the original program and verification of the translated program are equivalent
  - A closed subset can be defined on the states of the translation source language

Implementation
- Framework
  - ADL data structures
  - ADL interpreter
    - Used to confirm the correctness of translators
  - Translator and verification logic interfaces
  - Translation rule compiler
    - Compiles translation rules into a Java implementation of a translator
- And for proof of concept,
  - Translators from Intel x86 and SPARC
  - A simple type checker

Related Works: Foundational TAL [Crary, 2003]
- TAL type checker is still large
  - TALx86 type checker consists of approx. 23k LoC in OCaml (!)
- TCB is reduced by using a logical framework
  - Designed a language called TALT on the Twelf logical framework [Pfenning et al., 1999]
  - Proved GC safety of TALT by machine
- Correspondence between TALT and realistic architectures is not discussed
- TALT type system is fixed
  - Our work allows replacement of verification logics

Future Work
- Automatically confirm the correctness of translation
  - Automatic testing
    - Cooperating with emulators or debuggers
  - Or, build a model and use a theorem prover
- Support dynamic memory allocation
  - Currently all memory must be allocated statically
- Support concurrent programs
  - Concurrency is not taken into consideration
  - To apply the framework to OSes, etc., concurrency plays an important role

Visions toward a Secure Compiler


What Is a Secure Compiler?
- A compiler which produces certified code
  - For example, TAL code as output
  - Like the Popcorn compiler in TALx86
    - Safe dialect of C → TALx86
- A compiler which assures correct compilation (optionally)
  - Like the credible compiler [Rinard, 1999]
  - Reduces TCB

Motivation
- Infrastructure has been built
  - TALK, TOS [Maeda, 2005]
  - Verifier framework [Yoshino, 2006]
- Next we have to build a house on it!
  - Most people do not want to write low-level code directly ⇒ Secure Compiler

Toward Secure World
If we built a secure compiler…
- Memory-error-free systems
  - Prevent memory-error-based attacks
    - OS kernel, core libraries, network servers…
- Writing secure code
  - Vulnerable code will result in verification failure
  - So code security will be improved
- The rest is to be discovered…

Tasks to Do
- Determine what properties to assure
  - Memory access safety?
  - Information flow?
- Design the verification logic
  - Must be mechanically checkable
  - Use the verifier framework?
- Design the language
  - Target: TAL-based? ADL?
    - ADL can be used as a certified language
    - Register allocation is done, so a simple mapping will be possible…
  - Source: ???