Formalising Java Safety An overview Pieter H Hartel

Formalising Java Safety – An overview
Pieter H. Hartel, phh@ecs.soton.ac.uk
2003.10.02 (Thu), 박숙영
HPCC Lab 1

Contents
o Introduction
o Methodology
o Java Semantics
o The compiler
o Java extensions
o Small footprint devices
o Conclusions

Introduction 1/2
o Java is a safe programming language
  - Type safe and memory safe
o The two main features
  - Java does not offer pointer arithmetic
    - Java offers references to objects
    - Unused objects are automatically garbage collected
  - Java is a strongly typed language
    - Java performs runtime checks to avoid array index errors
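The runtime check on array indices mentioned on this slide can be illustrated with a minimal sketch. The class and method names (`BoundsCheckDemo`, `readIsRejected`) are hypothetical and only serve the illustration; the behaviour shown (an out-of-range index raises `ArrayIndexOutOfBoundsException` instead of reading adjacent memory) is standard Java.

```java
final class BoundsCheckDemo {
    /** Returns true if the runtime rejects the read data[index]. */
    static boolean readIsRejected(int[] data, int index) {
        try {
            int unused = data[index]; // the JVM checks 0 <= index < data.length here
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            // The access never touches memory outside the array.
            return true;
        }
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30};
        System.out.println("index 1 rejected: " + readIsRejected(data, 1));
        System.out.println("index 3 rejected: " + readIsRejected(data, 3));
    }
}
```

Because there is no pointer arithmetic, an index expression is the only way to move through an array, so this single check is enough to keep array accesses memory safe.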

Introduction 2/2
o Class loader
  - Accepting and loading JVM programs into the Java runtime environment
o Byte code verifier
  - Another type checker operating on the JVM byte codes
o Both do their work before execution of the code from a newly loaded class starts

Methodology 1/4
o Formal specifications
  - The semantics of Java
  - The semantics of the JVM language
  - The Java to JVM compiler
  - The runtime support, that is parts of the Java API, including all java.* classes

Methodology 2/4
o The methodology to build these specifications
  - Construct clear and concise formal specifications of the relevant components
  - Validate the specifications by animating them, and by stating and proving relevant properties of the components
  - Refine the specifications into implementations
  - Create all specifications in machine-readable form

Methodology 3/4
o Principal difficulties
  - Multi-threading, exception handling, object orientation and garbage collection
  - Careful consideration
    - Ambiguous, inconsistent, incomplete
  - Reference implementation is complex

Methodology 4/4
o Popular assumptions
  - Unlimited memory
  - Individual storage locations can hold all primitive data types
  - Individual JVM program locations can hold all byte code instructions

Java and JVM language features
o IM: Imperative core consisting of basic data, expressions and statements
o OO: Object orientation, i.e. objects, classes, interfaces, and arrays
o TY: The Java type system, or byte code verification in the JVM
o CL: Class loading
o EH: Exception handling
o MT: Multi-threading, monitors, synchronisation
o GC: Garbage collection

Java Semantics
o Table 1

Object Orientation
o Alves-Foss and Lam [1]
  - Denotational semantics of most of Java
  - Detail on the various basic data types in Java
  - Better understanding

The type system 1/2
o Based on simple subtyping
o One novel feature
  - Java offers interfaces as a way of creating multiple inheritance
o Drossopoulou and Eisenbach [24]
  - Static semantics and dynamic semantics of a relatively small subset of Java
o Drossopoulou et al. [23]
  - Extend their subset to include exception handling
o Syme [55]
  - DECLARE system, gives proofs
  - Uncovered 40 errors made during the translation
  - Found two non-trivial errors in the hand-written proofs of Drossopoulou and Eisenbach

The type system 2/2
o Nipkow and von Oheimb [45]
  - Prove type soundness of a similar subset to Drossopoulou et al.
    - Use Isabelle/HOL to machine-check the proofs from the outset
    - Higher degree of confidence in the correctness of the specifications and the proofs
  - Not able to validate the specifications
    - Due to the lack of support for generating executable semantics [58]
o Glesner and Zimmermann [26]
  - Specify the type system for a small fragment of Java

Class Loader
o Wragg et al. [62]
  - Offer a model of class loading for a relatively small subset of Java to study one of Java's more experimental features (binary compatibility)

Multi-threading
o Borger and Schulte [10] and Cenciarelli et al. [13]
  - Multi-threading at the Java level
  - The study of the issues left open by the official Sun documentation

The compiler
o Diehl [21]
  - Compilation schemes from a subset of Java that excludes exception handling, multi-threading and garbage collection to the corresponding subset of the JVM
  - Operational semantics of this JVM subset
o Rose [50]
  - Natural semantics of a subset of Java
  - Static type systems for both (Java, JVM)
  - A specification of the compiler for the subsets

The Abstract State Machine approach 1/3
o Borger and Schulte
  - Working on formal specifications of Java, the JVM and the compiler
  - Based on the Abstract State Machine formalism
    - Full semantic account in Gurevich [29]
  - Specify a modular semantics of a subset of the JVM [11] and a subset of Java [10]
    - Modular approach
    - The two subsets do not entirely coincide

The Abstract State Machine approach 2/3
o [7]
  - Reducing the subsets of Java and the JVM to omit multi-threading, class loading and arrays
  - Main result
    - Informal theorem stating the correctness of the compiler
o Two papers revisit exception handling and object initialisation
  - [8]: On problems with the initialisation of objects
  - [9]: Exception handling mechanism of Java, the JVM, and the compiler
    - Main result: formulation of the correctness of compiling exception handling, with a full proof

The Abstract State Machine approach 3/3
o Stark [53]
  - The specification of Java and the JVM from Borger and Schulte [11, 10]
  - Presents a compiler from the imperative core of Java
  - Gives a correctness proof of the compiler
o A forthcoming book [6]
  - More complete specification of Java, the JVM, the compiler, the byte code
o [9]
  - Mechanical checking of the specification
o Wallace [60]
  - Includes multi-threading, exception handling
  - Excludes class loading and garbage collection

Java extensions
o The safety of Java programs
  - By using program verification techniques
  - Fewer design and implementation problems
o Smart cards

Model checking 1/3
o Demartini et al. [18], Havelund et al. [31]
  - How core features of Java can be mapped onto the Promela language of the SPIN model checker
  - Multi-threading and objects (Havelund et al. model exceptions)
  - Objects are represented using Promela's arrays (one array element per instance of the class)
    - The resulting models quickly grow too large to model check effectively
  - Only check for safety properties (assertions, deadlock)
    - Do not provide support for the checking of liveness properties

Model checking 2/3
o One of the most useful features of the SPIN model checker
  - Its ability to display scenarios leading to problems (deadlock)
  - Demartini et al.
    - Relate these scenarios back to the original Java sources
    - More user-friendly than the approach of Havelund et al.

Model checking 3/3
o Jensen et al. [33]
  - Use model checking to verify properties of Java programs, more abstract approach
  - Static analysis techniques
    - To reduce a Java program to a control flow graph
      - Method calls, method returns, assertions
        - Defines the state transitions of the abstract Java program
o Example [38]
  - How the system can be used to model Java's sandbox
  - The stack inspection introduced by Java 2

Theorem proving 1/3
o Detlefs et al., Modula-3 [20], Java [52]
  - Requires the programmer to annotate programs with pre- and post-conditions
  - The compiler is able to generate and prove the verification conditions
  - The system of Detlefs et al.
    - Does not require the programmer to annotate programs with loop invariants and variants
    - Derives loop invariants automatically
    - Assumes that loops are executed at most once
  - Powerful
    - The type checker < the system < full verification

Theorem proving 2/3
o The LOOP project of Jacobs et al.
  - Full verification of Java programs
  - Use a tool based on denotational semantics to translate Java into the higher-order logic of widely used theorem provers (PVS [32], Isabelle/HOL [57])
  - Properties
    - Termination of a method
    - Invariants on the fields of a class

Theorem proving 3/3
o Poetzsch-Heffter and Muller [47]
  - An operational/axiomatic semantics of a subset of Java
  - Prove the soundness of the axiomatic semantics with respect to the operational semantics
  - Embedded in HOL
    - Mechanical checking of the soundness proof would be feasible
o Moore [39]
  - A new version of a small subset of Cohen's specification [15] of the JVM
  - How the ACL2 theorem prover is capable

Controlling type casts
o Java's lack of parametric polymorphism
  - Requires programmers to insert type casts in their programs
  - Example
    - When storing an object of class MyObject
      - One must remember to cast the raw object back into the user class MyObject when retrieving the information
    - Erroneous type casts cause unexpected runtime exceptions
o Pizza [46] and Generic Java [12]
  - Automatically insert the required type casts
  - Generic Java
    - No cast inserted by the compiler will fail
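The cast problem on this slide can be sketched as follows. The first method uses a raw container in the style that was the only option when the slides were written; the second uses the generics that later entered the language (Java 5), which descend from Generic Java. `CastDemo`, its nested `MyObject` and both method names are hypothetical illustration names.

```java
import java.util.ArrayList;
import java.util.List;

final class CastDemo {
    static final class MyObject {
        final String name;
        MyObject(String name) { this.name = name; }
    }

    /** Raw container: retrieval needs a programmer-written cast that can fail at run time. */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawRetrievalFails() {
        List raw = new ArrayList();
        raw.add("not a MyObject");               // accepted: the container stores plain Objects
        try {
            MyObject o = (MyObject) raw.get(0);  // erroneous cast written by hand
            return false;
        } catch (ClassCastException e) {
            return true;                         // the unexpected runtime exception
        }
    }

    /** Generic container: the compiler inserts the cast and rejects ill-typed insertions. */
    static String genericRetrieval() {
        List<MyObject> typed = new ArrayList<>();
        typed.add(new MyObject("card"));
        // typed.add("not a MyObject");          // would not compile
        return typed.get(0).name;                // no explicit cast needed
    }

    public static void main(String[] args) {
        System.out.println("raw retrieval failed: " + rawRetrievalFails());
        System.out.println("generic retrieval: " + genericRetrieval());
    }
}
```

In the generic version the only casts are those the compiler inserts itself, which is the setting in which the "no cast inserted by the compiler will fail" guarantee is stated.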

Controlling execution time
o Java safety would be able to guarantee that computations terminate (within certain bounds)
  - Denial-of-service attacks would be prevented
o Execution time is one of the most difficult resources to control

Code certification 1/2
o Necula and Lee [40]: proof-carrying code (PCC)
  - Automatic verification technique (assembly-level programs)
  - The producer
    - Expresses a safety property in terms of pre- and post-conditions on the program
    - Annotates the program with loop invariants etc.
    - Generates a proof of the safety property (by hand or using a mechanical proof assistant)
  - The consumer
    - Receives the code and the proof
    - Mechanically checks that the proof is consistent with the program
      - The program satisfies the safety property
    - Does not need to trust the producer
    - Relies only on a small trusted infrastructure (type checker)

Code certification 2/2
o The problems of the PCC approaches
  - The size of a proof: exponential in the size of the program [42]
    - The amount of redundancy
  - Necula and Lee [41]
    - Reduce a proof of size n to a proof of size √n by avoiding some redundancy
o Program verification requires special skills
  - To formulate properties
  - To discover appropriate loop invariants
  - To drive mechanical theorem provers etc.
o It is essential that tools are automatic, or at least require as little programmer intervention as possible

Small footprint devices
o Small footprint devices
  - Mobile phones, PDAs, K Virtual Machine: 128 KB of RAM
  - Smart card
    - A few hundred bytes of RAM and a dozen or so KB of EEPROM
    - Java Card VM (JCVM)
    - 3 disadvantages
      - The full potential and flexibility of client-server software development cannot be realised
      - Java applets running on the smallest embedded controllers cannot be verified appropriately before they are run
      - The freedom of code migration is restricted
    - Based on the Split VM concept
      - Pushes part of the byte code verification from the loading to the compilation/linking phase
      - JVM byte code ☞ JCVM format
        - Verifies the byte code, optimises it, and prepares the code for loading into the device

Byte code compression
o Clausen et al. [14]
  - Retain JVM byte codes
  - Propose to compress them for the benefit of embedded systems
  - The compression technique
    - Commonly occurring sequences of instructions
      - A new 'macro' instruction
    - 30% loading time increase ☞ 30% space saving
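The idea of replacing a commonly occurring instruction sequence by a single 'macro' instruction can be sketched as below. This is a toy over instruction mnemonics, not the scheme of Clausen et al.: the class `MacroCompressionDemo`, the mnemonic strings and the macro name `macro_0` are assumptions made for the illustration, and the choice of which sequence to replace is left to the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MacroCompressionDemo {
    /** Replaces each non-overlapping occurrence of pattern in code by the single macro instruction. */
    static List<String> compress(List<String> code, List<String> pattern, String macro) {
        if (pattern.isEmpty()) {
            throw new IllegalArgumentException("pattern must not be empty");
        }
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < code.size()) {
            int end = i + pattern.size();
            if (end <= code.size() && code.subList(i, end).equals(pattern)) {
                out.add(macro);        // one instruction stands for the whole sequence
                i = end;
            } else {
                out.add(code.get(i));
                i++;
            }
        }
        return out;
    }

    /** Expands the macro again, as a loader or interpreter would have to do. */
    static List<String> expand(List<String> code, List<String> pattern, String macro) {
        List<String> out = new ArrayList<>();
        for (String op : code) {
            if (op.equals(macro)) {
                out.addAll(pattern);
            } else {
                out.add(op);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList(
                "aload_0", "getfield", "iload_1", "aload_0", "getfield", "ireturn");
        List<String> pattern = Arrays.asList("aload_0", "getfield");
        List<String> packed = compress(code, pattern, "macro_0");
        System.out.println(packed);
        System.out.println(expand(packed, pattern, "macro_0").equals(code));
    }
}
```

The expansion step is where the extra loading time on the slide would come from: space is saved in the stored code, and paid for when the macros are resolved.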

Class file conversion 1/3
o Hartel et al. [30]: the Java Secure Processor (JSP)
  - Provide a complete specification of an early version of the JCVM
  - Excludes multi-threading, garbage collection and exception handling
  - Validated using the letos tool
o Methodological point [56]
  - Earlier JSP
    - The full JVM ☞ cutting back unwanted features
  - Newer KVM
    - Scratch ☞ adding features as required
o The developers of the picoPERC version of the JVM [44]
  - Offer a core VM (64 KB)
  - Provide tools to add further functionality to the core VM

Class file conversion 2/3
o Lanet and Requet [35]
  - B-method
  - To study one particular aspect of the conversion from JVM to JCVM code
  - Their results include
    1. A specification of the constraints imposed by the byte code verifier for a small subset of the JVM
    2. A specification of the semantics of this subset of the JVM byte codes
    3. A specification of the semantics of the corresponding subset of the JCVM byte codes
    4. A proof that the specification of the JCVM subset is a data refinement of the JVM subset

Class file conversion 3/3
o Denney and Jensen [19]
  - Complementary to the aspect studied by Lanet and Requet
  - The conversion of JVM class files to JCVM class files by a 'tokenisation'
    - Replaces names in the class files
      - Reducing the size of the class files
      - Speeding up the loading process
  - Use the Coq theorem prover to mechanically check their proofs
  - Use an elegant method to parameterise their operational semantics over name resolution
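A minimal sketch of what 'tokenisation' means in principle: each symbolic name is replaced by a small integer token, so the converted file stores numbers instead of strings and the loader can index a table instead of comparing names. The class `TokenisationDemo`, the sample names and the first-come numbering are assumptions for illustration; the actual token assignment rules of the JCVM converter are not described on the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TokenisationDemo {
    // Maps each name to its token; tokens are handed out in order of first appearance.
    private final Map<String, Integer> tokens = new LinkedHashMap<>();

    /** Returns the token for name, allocating the next free token on first use. */
    int tokenFor(String name) {
        Integer token = tokens.get(name);
        if (token == null) {
            token = tokens.size();
            tokens.put(name, token);
        }
        return token;
    }

    /** Replaces every name in a sequence of symbolic references by its token. */
    List<Integer> tokenise(List<String> names) {
        List<Integer> out = new ArrayList<>();
        for (String name : names) {
            out.add(tokenFor(name));
        }
        return out;
    }

    public static void main(String[] args) {
        TokenisationDemo table = new TokenisationDemo();
        List<String> refs = Arrays.asList(
                "java/lang/Object", "MyApplet.process", "java/lang/Object");
        System.out.println(table.tokenise(refs));
    }
}
```

Repeated names map to the same token, which is where both the size reduction and the faster loading on the slide come from.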

Byte code verification revisited 1/2
o Split VM concept
  - Off-line verification: signing the results digitally (signature)
o Posegga and Vogt [49, 48]
  - Use a model checker (SMV) to perform off-line byte code verification for smart cards
o Posegga et al. [27]
  - Propose to implement a tiny proof checker on a smart card

Byte code verification revisited 2/2
o Rose and Rose [51]
  - Use Necula and Lee's proof-carrying code (PCC) method to 'split' the byte code verifier
    1. The verification
       - To reconstruct the types associated with all local variables and stack locations of JVM code
    2. The certification
       - To check, based on the reconstructed types, that each instruction is correctly typed
  - Advantage
    1. The certification process is simple
    2. Only the certification needs to be trusted, not the verification
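The split into an untrusted verification phase and a trusted certification phase can be sketched on a toy stack machine. Everything here is an assumption made for illustration: the class `SplitVerifierDemo`, the three-instruction language, and the encoding of a stack type as a string of `I` (int) and `R` (reference). It is not the algorithm of Rose and Rose, and because the toy has no branches it does not show the main saving of the approach, namely that the checker never has to iterate to a fixed point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SplitVerifierDemo {
    /** Typing rule for one instruction: returns the stack type afterwards, or null if ill-typed. */
    static String step(String op, String stack) {
        switch (op) {
            case "iconst":      return stack + "I";
            case "aconst_null": return stack + "R";
            case "iadd":        // needs two ints on top, leaves one int
                return stack.endsWith("II") ? stack.substring(0, stack.length() - 1) : null;
            default:            return null;
        }
    }

    /** Verification (untrusted, off-device): reconstructs the stack type at every program point. */
    static List<String> reconstruct(List<String> code) {
        List<String> types = new ArrayList<>();
        String stack = "";
        types.add(stack);
        for (String op : code) {
            stack = step(op, stack);
            if (stack == null) {
                return null;   // no typing exists, so no certificate can be produced
            }
            types.add(stack);
        }
        return types;
    }

    /** Certification (trusted, on-device): checks each instruction against the supplied types. */
    static boolean certify(List<String> code, List<String> types) {
        if (types == null || types.size() != code.size() + 1 || !types.get(0).isEmpty()) {
            return false;
        }
        for (int i = 0; i < code.size(); i++) {
            String next = step(code.get(i), types.get(i));
            if (next == null || !next.equals(types.get(i + 1))) {
                return false;  // a wrong or forged certificate is rejected
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList("iconst", "iconst", "iadd");
        List<String> types = reconstruct(code);
        System.out.println(types + " certified: " + certify(code, types));
    }
}
```

Only `certify` and `step` would need to be trusted: a bug in `reconstruct` can at worst produce a certificate that `certify` rejects.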

Conclusions
o On modelling garbage collection and the Java API
o On building more appropriate theories for programming language semantics modelling
o On simplifying and modularising the individual components of Java implementations
o On reducing the size of the trusted computing base, so that flaws are less likely to compromise the security of the system as a whole
o On considering formal specification, validation and provably correct implementation as a whole, rather than in separation
o On presenting clear and concise formalisations of systems, which are accessible to the designers and implementors of these systems
o On using machine-readable specifications