Type Checking in Cool Alex Aiken (Modified by Mooly Sagiv)

Outline • What is type checking • Simple type rules • SELF_TYPE • Implementation

Types • What is a type? – The notion varies from language to language • Consensus – A set of values – A set of operations on those values • Classes – One instantiation of the modern notion of types

Why do we need type systems? • Consider assembly code – add $r1, $r2, $r3 • What are the types of $r1, $r2, $r3?

Types and Operations • Certain operations are legal for values of each type – It does not make sense to add a function pointer and an integer in C – It does make sense to add two integers – But both have the same assembly language implementation!

Type Systems • A language’s type system specifies which operations are valid for which types • The goal of type checking is to ensure that operations are used with the correct types – Enforces intended interpretation of values because nothing else will!

Type Checking Overview • Three kinds of languages – Statically typed: (Almost) all checking of types is done as part of compilation • Semantic Analysis • C, Java, Cool, ML – Dynamically typed: Almost all checking of types is done as part of program execution • Code generation • Scheme – Untyped • No type checking (Machine Code)

Type Wars • Competing views on static vs. dynamic typing • Static typing proponents say: – Static checking catches many programming errors – Proves properties of your code – Avoids the overhead of runtime type checks • Dynamic typing proponents say: – Static type systems are restrictive – Rapid prototyping is difficult with type systems – Static typing complicates the programming language and the compiler – Compiler optimizations can hide costs

Type Wars (cont.) • In practice, most code is written in statically typed languages with escape mechanisms – Unsafe casts in C, Java – union in C • It is debatable whether this compromise represents the best or worst of both worlds

Types Outline • Type concepts in Cool • Notation for type rules – Logical rules of inference • Cool type rules • General properties of type systems

Cool Types • The types are – Class Names – SELF_TYPE • The user declares types for identifiers • The semantic analysis infers types for expressions – Every expression has a unique type

Type Checking and Type Inference • Type checking – The process of verifying fully typed programs • Type inference – The process of filling in missing type information • The two are different, but the terms are often used interchangeably

Rules of Inference • We have seen two examples of formal notation specifying parts of the compiler – Regular expressions – Context-free grammars • Appropriate formalisms for static type checking – Syntax-directed translations – Attribute grammars – Logical rules of inference

Why Rules of Inference? • Inference rules have the form: – “If Hypothesis is true, then Conclusion is true” • Type checking computes via reasoning – “If \(e_1\) and \(e_2\) have certain types, then \(e_1 + e_2\) has a certain type” • Rules of inference are a compact notation for “If-Then” statements

From English to an inference rule • [Easy to read with practice] • Start with a simplified system and gradually add features • Building blocks – Symbol \(\wedge\) is ‘and’ – Symbol \(\Rightarrow\) is ‘if-then’ – \(x : T\) is ‘x has type T’

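From English to an inference rule (2) • If \(e_1\) has type Int and \(e_2\) has type Int, then \(e_1 + e_2\) has type Int • (\(e_1\) has type Int \(\wedge\) \(e_2\) has type Int) \(\Rightarrow\) \(e_1 + e_2\) has type Int • \((e_1 : \text{Int} \wedge e_2 : \text{Int}) \Rightarrow e_1 + e_2 : \text{Int}\)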

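From English to an inference rule (3) • The statement \((e_1 : \text{Int} \wedge e_2 : \text{Int}) \Rightarrow e_1 + e_2 : \text{Int}\) is a special case of \(\text{Hypothesis}_1 \wedge \dots \wedge \text{Hypothesis}_n \Rightarrow \text{Conclusion}\) • This is an inference rule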

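Notation for Inference Rules • By tradition inference rules are written
\[ \frac{\vdash \text{Hypothesis}_1 \quad \dots \quad \vdash \text{Hypothesis}_n}{\vdash \text{Conclusion}} \]
• Cool type rules have hypotheses and conclusions of the form \(\vdash e : T\) • \(\vdash\) means “it is provable that...”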

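Two Rules
\[ \frac{i \text{ is an integer}}{\vdash i : \text{Int}}\ [\text{Int}] \qquad \frac{\vdash e_1 : \text{Int} \quad \vdash e_2 : \text{Int}}{\vdash e_1 + e_2 : \text{Int}}\ [\text{Add}] \]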

Type Rules (cont.) • These rules give templates describing how to type integers and + expressions • By filling in the templates, we can produce complete typings for expressions

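Example: 1 + 2
\[ \frac{\dfrac{1 \text{ is an integer}}{\vdash 1 : \text{Int}}\ [\text{Int}] \qquad \dfrac{2 \text{ is an integer}}{\vdash 2 : \text{Int}}\ [\text{Int}]}{\vdash 1 + 2 : \text{Int}}\ [\text{Add}] \]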
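The derivation above can be mirrored by a small recursive function. The following is a minimal sketch, assuming a hypothetical two-node AST (`IntLit`, `Add`) and a function named `type_of`; none of these names come from the Cool compiler itself.

```python
from dataclasses import dataclass


@dataclass
class IntLit:
    """An integer literal i (hypothetical AST node)."""
    value: int


@dataclass
class Add:
    """An addition e1 + e2 (hypothetical AST node)."""
    left: object
    right: object


def type_of(e):
    """Return the type proved for e, or raise TypeError if no rule applies."""
    if isinstance(e, IntLit):
        # Rule [Int]: i is an integer, so i : Int.
        return "Int"
    if isinstance(e, Add):
        # Rule [Add]: prove both hypotheses first, then conclude Int.
        t1 = type_of(e.left)
        t2 = type_of(e.right)
        if t1 == "Int" and t2 == "Int":
            return "Int"
        raise TypeError(f"cannot add {t1} and {t2}")
    raise TypeError(f"no type rule for {e!r}")


result = type_of(Add(IntLit(1), IntLit(2)))
```

Each recursive call corresponds to one hypothesis of the rule being applied, so the call tree has the same shape as the proof tree.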

Soundness • A type system is sound if – whenever \(\vdash e : T\) – then e evaluates to a value of type T • We only want sound rules – But some sound rules are better than others:
\[ \frac{1 \text{ is an integer}}{\vdash 1 : \text{Object}}\ [\text{Strange}] \]

Type Checking Proofs • Type checking proves facts e: T – Proof is on structure of the AST – Proof has the shape of the AST – One type rule is used for each AST node • If the type rule used for a node e: – Hypotheses are the proofs of e’s subexpressions – Conclusion is the type of e • Bottom up pass over the AST

Rules for Constants
\[ \frac{}{\vdash \text{false} : \text{Bool}}\ [\text{Bool}] \qquad \frac{s \text{ is a string constant}}{\vdash s : \text{String}}\ [\text{String}] \]

Rules for New • new T produces an object of type T – Ignore SELF_TYPE for now...
\[ \frac{}{\vdash \text{new } T : T}\ [\text{New}] \]

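Two More Rules
\[ \frac{\vdash e : \text{Bool}}{\vdash \text{not } e : \text{Bool}}\ [\text{Not}] \qquad \frac{\vdash e_1 : \text{Bool} \quad \vdash e_2 : T}{\vdash \text{while } e_1 \text{ loop } e_2 \text{ pool} : \text{Object}}\ [\text{Loop}] \]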

A Problem • What is the type of a variable reference?
\[ \frac{x \text{ is an identifier}}{\vdash x : \ ?}\ [\text{Var}] \]
• The local, structural rule does not carry enough information to give x a type

A Solution • Put more information in the rules • A type environment gives types for free variables – A type environment is a function from ObjectIdentifiers to Types – Symbol table – A variable is free in an expression if it is not defined within the expression

Type Environments • Let O be a function from ObjectIdentifiers to Types • The sentence \(O \vdash e : T\) is read: under the assumption that variables have the types given by O, it is provable that e has the type T

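Modified Rules
\[ \frac{i \text{ is an integer}}{O \vdash i : \text{Int}}\ [\text{Int}] \qquad \frac{O \vdash e_1 : \text{Int} \quad O \vdash e_2 : \text{Int}}{O \vdash e_1 + e_2 : \text{Int}}\ [\text{Add}] \]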

New Rules
\[ \frac{O(x) = T}{O \vdash x : T}\ [\text{Var}] \]

Let Rule
\[ \frac{O[T_0/x] \vdash e_1 : T_1}{O \vdash \text{let } x : T_0 \text{ in } e_1 : T_1}\ [\text{Let-No-Init}] \]
• \(O[T/y]\) means O modified to return T on argument y • Enforces variable scope

Notes • The type environment gives types to free identifiers in the current scope • The type environment is passed down the AST from the root to the leaves • Types are computed up the AST from the leaves towards the root
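The [Var] and [Let-No-Init] rules can be sketched with a dictionary standing in for the type environment O. This is an illustration only: the node classes `Var`, `IntLit`, `Let` and the function `type_of` are hypothetical names, and a real implementation would more likely use a scoped symbol table than dictionary copies.

```python
from dataclasses import dataclass


@dataclass
class IntLit:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class Let:
    """let name : declared in body (no initializer)."""
    name: str
    declared: str
    body: object


def type_of(env, e):
    """env plays the role of O: a mapping from identifiers to types."""
    if isinstance(e, IntLit):
        return "Int"
    if isinstance(e, Var):
        # Rule [Var]: O(x) = T gives x : T.
        if e.name not in env:
            raise TypeError(f"unbound identifier {e.name}")
        return env[e.name]
    if isinstance(e, Let):
        # Rule [Let-No-Init]: check the body under O[T0/x].
        # Building a new dict leaves the outer environment untouched,
        # which is what enforces the scope of x.
        inner = {**env, e.name: e.declared}
        return type_of(inner, e.body)
    raise TypeError(f"no type rule for {e!r}")


outer = {"x": "String"}
shadowed = type_of(outer, Let("x", "Int", Var("x")))
```

The environment flows down through the `env` argument and the type flows back up through the return value, matching the two directions described in the notes above.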

Let Rule with Initialization • Weak rule:
\[ \frac{O \vdash e_0 : T_0 \qquad O[T_0/x] \vdash e_1 : T_1}{O \vdash \text{let } x : T_0 \leftarrow e_0 \text{ in } e_1 : T_1}\ [\text{Let-Init}] \]

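Subtyping • Define a relation \(\le\) on classes – \(X \le X\) – \(X \le Y\) if X inherits from Y – \(X \le Z\) if \(X \le Y\) and \(Y \le Z\)
\[ \frac{O \vdash e_0 : T \qquad T \le T_0 \qquad O[T_0/x] \vdash e_1 : T_1}{O \vdash \text{let } x : T_0 \leftarrow e_0 \text{ in } e_1 : T_1}\ [\text{Let-Init}] \]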

Assignment • Both Let-Init rules are sound, but more programs type check with the second one • More uses of subtyping:
\[ \frac{O(\text{Id}) = T_0 \qquad O \vdash e_1 : T_1 \qquad T_1 \le T_0}{O \vdash \text{Id} \leftarrow e_1 : T_1}\ [\text{Assign}] \]
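The subtype relation and the [Assign] rule can be sketched as follows. The class hierarchy in `PARENT` and the function names `conforms` and `check_assign` are assumptions made for this example.

```python
# Hypothetical inheritance table: each class maps to its parent.
PARENT = {"Object": None, "Int": "Object", "A": "Object", "B": "A"}


def conforms(x, y, parent=PARENT):
    """Return True if x <= y.

    Walking up the parent chain from x covers all three clauses of the
    definition: reflexivity (x == y at the start), direct inheritance,
    and transitivity (several steps up the chain).
    """
    while x is not None:
        if x == y:
            return True
        x = parent[x]
    return False


def check_assign(env, name, rhs_type, parent=PARENT):
    """Rule [Assign]: O(Id) = T0, e1 : T1, T1 <= T0 gives Id <- e1 : T1."""
    t0 = env[name]
    if not conforms(rhs_type, t0, parent):
        raise TypeError(f"{rhs_type} does not conform to {t0}")
    # The assignment expression has the type of the right-hand side.
    return rhs_type


assigned = check_assign({"x": "A"}, "x", "B")
```

Note that the result type is the type of the right-hand side (`T1`), not the declared type of the identifier.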

Initialized Attributes • Let \(O_C(x) = T\) for all attributes x: T in class C • Attribute initialization is similar to let, except for the scope of names
\[ \frac{O_C(\text{Id}) = T_0 \qquad O_C \vdash e_1 : T_1 \qquad T_1 \le T_0}{O_C \vdash \text{Id} : T_0 \leftarrow e_1 ;}\ [\text{Attr-Init}] \]

If-Then-Else • Consider: if \(e_0\) then \(e_1\) else \(e_2\) fi • The result can be either \(e_1\) or \(e_2\) • The type is either \(e_1\)’s type or \(e_2\)’s type • The best we can do is the smallest supertype that is larger than both the type of \(e_1\) and the type of \(e_2\)

Least Upper Bounds • lub(X, Y), the least upper bound of X and Y, is Z if – \(X \le Z \wedge Y \le Z\) (Z is an upper bound) – \(X \le Z' \wedge Y \le Z' \Rightarrow Z \le Z'\) (Z is least among upper bounds) • In Cool, the least upper bound of two types is their least common ancestor in the inheritance tree
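Because Cool has single inheritance, the least common ancestor can be found by walking the two ancestor chains. The sketch below assumes a hypothetical `parent` table and the helper names `ancestors` and `lub`.

```python
def ancestors(t, parent):
    """Return [t, parent(t), ..., Object], nearest ancestor first."""
    chain = []
    while t is not None:
        chain.append(t)
        t = parent[t]
    return chain


def lub(x, y, parent):
    """Least upper bound of x and y: their least common ancestor."""
    above_x = set(ancestors(x, parent))
    # Walk upward from y; the first class that is also above x is the
    # lowest class that both x and y conform to.
    for t in ancestors(y, parent):
        if t in above_x:
            return t
    raise ValueError("classes do not share a root")


# Hypothetical hierarchy: B and C inherit from A; D inherits from Object.
parent = {"Object": None, "A": "Object", "B": "A", "C": "A", "D": "Object"}
branch_type = lub("B", "C", parent)
```

For a conditional whose branches have types B and C in this hierarchy, the rule would give the whole expression type A. A case expression with more than two branches can fold `lub` over the branch types pairwise.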

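If-Then-Else Revisited
\[ \frac{O \vdash e_0 : \text{Bool} \qquad O \vdash e_1 : T_1 \qquad O \vdash e_2 : T_2}{O \vdash \text{if } e_0 \text{ then } e_1 \text{ else } e_2 \text{ fi} : \text{lub}(T_1, T_2)}\ [\text{If-Then-Else}] \]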

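Case • The rule for case expressions takes lub over all branches
\[ \frac{O \vdash e_0 : T_0 \qquad O[T_1/x_1] \vdash e_1 : T_1' \quad \dots \quad O[T_n/x_n] \vdash e_n : T_n'}{O \vdash \text{case } e_0 \text{ of } x_1 : T_1 \Rightarrow e_1; \ \dots; \ x_n : T_n \Rightarrow e_n; \text{ esac} : \text{lub}(T_1', \dots, T_n')}\ [\text{Case}] \]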

Method Dispatch • There is a problem with type checking method calls
\[ \frac{O \vdash e_0 : T_0 \qquad O \vdash e_1 : T_1 \quad \dots \quad O \vdash e_n : T_n}{O \vdash e_0.f(e_1, \dots, e_n) : \ ?}\ [\text{Dispatch}] \]
• We need information about the formal parameters of f and its return type

Notes on dispatch • In Cool, the method and object identifiers live in different name spaces – A method foo and an object foo can coexist in the same scope – In the type rules this is reflected by a separate mapping M for method signatures • \(M(C, f) = (T_1, \dots, T_n, T_{n+1})\) means that in class C there is a method \(f(x_1 : T_1, \dots, x_n : T_n) : T_{n+1}\)

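The Dispatch Rule Revisited
\[ \frac{\begin{array}{c} O, M \vdash e_0 : T_0 \\ O, M \vdash e_1 : T_1 \quad \dots \quad O, M \vdash e_n : T_n \\ M(T_0, f) = (T_1', \dots, T_n', T_{n+1}') \\ T_i \le T_i' \text{ for } 1 \le i \le n \end{array}}{O, M \vdash e_0.f(e_1, \dots, e_n) : T_{n+1}'}\ [\text{Dispatch}] \]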
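The dispatch rule can be sketched as a lookup in a method environment followed by a conformance check on each argument. In this illustration the tables `PARENT` and `METHODS` and the function `check_dispatch` are hypothetical, and `METHODS` is assumed to already contain an entry for every method a class defines or inherits.

```python
PARENT = {"Object": None, "Int": "Object", "Count": "Object", "Stock": "Count"}

# M(C, f) = (T1, ..., Tn, Tn+1): formal parameter types, then return type.
METHODS = {
    ("Count", "add"): ("Int", "Count"),
    ("Stock", "add"): ("Int", "Count"),      # inherited entry
    ("Stock", "merge"): ("Count", "Stock"),
}


def conforms(x, y, parent=PARENT):
    """Return True if x <= y in the inheritance tree."""
    while x is not None:
        if x == y:
            return True
        x = parent[x]
    return False


def check_dispatch(recv_type, method, arg_types, methods=METHODS):
    """Rule [Dispatch], given the already-computed types T0 and T1..Tn."""
    sig = methods.get((recv_type, method))
    if sig is None:
        raise TypeError(f"class {recv_type} has no method {method}")
    *formals, ret = sig
    if len(formals) != len(arg_types):
        raise TypeError(f"{method} expects {len(formals)} argument(s)")
    for actual, formal in zip(arg_types, formals):
        # Each actual type Ti must conform to the formal type Ti'.
        if not conforms(actual, formal):
            raise TypeError(f"{actual} does not conform to {formal}")
    return ret


call_type = check_dispatch("Stock", "merge", ["Stock"])
```

The call type checks because the actual argument type Stock conforms to the formal type Count, and the result is the declared return type.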

Static Dispatch • A variation of normal dispatch • The method is found in the class explicitly named by the programmer • The inferred type of the dispatch expression must conform to the specified type

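Static Dispatch
\[ \frac{\begin{array}{c} O, M \vdash e_0 : T_0 \\ O, M \vdash e_1 : T_1 \quad \dots \quad O, M \vdash e_n : T_n \\ T_0 \le T \\ M(T, f) = (T_1', \dots, T_n', T_{n+1}') \\ T_i \le T_i' \text{ for } 1 \le i \le n \end{array}}{O, M \vdash e_0@T.f(e_1, \dots, e_n) : T_{n+1}'}\ [\text{StaticDispatch}] \]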

The Method Environment • The method environment must be added to all rules • In most cases, M is passed but not actually used
\[ \frac{O, M \vdash e_1 : \text{Int} \qquad O, M \vdash e_2 : \text{Int}}{O, M \vdash e_1 + e_2 : \text{Int}}\ [\text{Add}] \]

More Environments • For some cases involving SELF_TYPE, we need to know the class in which an expression appears • The full environment for Cool – A mapping O giving types to object identifiers – A mapping M giving types to methods – The current class C

Sentences • The form of a sentence in the logic is \(O, M, C \vdash e : T\)
\[ \frac{O, M, C \vdash e_1 : \text{Int} \qquad O, M, C \vdash e_2 : \text{Int}}{O, M, C \vdash e_1 + e_2 : \text{Int}}\ [\text{Add}] \]

Effectiveness of Static Type Systems • Static type systems detect common errors • But some correct programs are disallowed – Some argue for dynamic checking instead – Others argue for more expressive type systems • But more expressive type systems are more complex

Dynamic and Static Types • The dynamic type of an object is the class C that is used in the “new C” expression that creates it – A runtime notion – Even languages that are statically typed have dynamic types • The static type of an expression captures all the dynamic types that the expression could have – A compile-time notion

Dynamic and Static Types • In early type systems the set of static types corresponds directly to the dynamic types • Soundness theorem: for all expressions E, dynamic_type(E) = static_type(E) • Gets more complicated in advanced type systems

Dynamic and Static Types in Cool
    class A { ... }
    class B inherits A { ... }
    class Main {
      x : A <- new A;
      ...
      x <- new B;
      ...
    }
• A variable of static type A can hold a value of static type B if \(B \le A\)

Dynamic and Static Types • Soundness of the Cool type system: – \(\forall E.\ \text{dynamic\_type}(E) \le \text{static\_type}(E)\) • Why is this OK? – All operations that can be used on an object of type C can also be used on an object of type \(C' \le C\) – Subclasses only add behavior (attributes or methods) – Methods can be redefined but with the same type!

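An Example
    class Count {
      i : Int <- 0;
      inc() : Count {
        {
          i <- i + 1;
          self;
        }
      };
    };
    class Stock inherits Count {
      name : String;
    };
    class Main {
      a : Stock <- (new Stock).inc();
      ... a.name ...
    };
• (new Stock).inc() has dynamic type Stock but static type Count, so the initialization of a does not type check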

SELF_TYPE to the Rescue • We will extend the type system • Insight – inc returns “self” – The return value has the same type as “self” – That type can be Count or any subtype of Count • Introduce the keyword SELF_TYPE to use for the return value of such functions – Need to modify the typing rules

An Example (revisited)
    class Count {
      i : Int <- 0;
      inc() : SELF_TYPE {
        {
          i <- i + 1;
          self;
        }
      };
    };
    class Stock inherits Count {
      name : String;
    };
    class Main {
      a : Stock <- (new Stock).inc();
      ... a.name ...
    };

Type Systems • The rules in this lecture are Cool-specific • A lot of theory about type systems • General themes – Type rules are defined on the structure of expressions – Types of variables are modeled by an environment

One Pass Type Checking • COOL type checking can be implemented in a single traversal over the AST • Type environment is passed down the tree – From parent to child • Types are passed up the tree – From child to parent

Implementing Type Systems
\[ \frac{O, M, C \vdash e_1 : \text{Int} \qquad O, M, C \vdash e_2 : \text{Int}}{O, M, C \vdash e_1 + e_2 : \text{Int}}\ [\text{Add}] \]
    TypeCheck(Environment, e1 + e2) {
      T1 = TypeCheck(Environment, e1);
      T2 = TypeCheck(Environment, e2);
      check T1 == T2 == Int;
      return Int;
    }
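A runnable version of the pseudocode above might look like the following. It is only a sketch: the `Env` record bundling O, M and C, the tuple-based AST encoding, and the name `type_check` are all assumptions made for this example rather than part of any Cool compiler.

```python
from dataclasses import dataclass, field


@dataclass
class Env:
    """The full environment (O, M, C) that is passed down the AST."""
    o: dict                                  # O: object identifiers -> types
    m: dict = field(default_factory=dict)    # M: (class, method) -> signature
    c: str = "Main"                          # C: the current class


def type_check(env, e):
    """Single traversal: env goes down to children, types come back up.

    Expressions are encoded as tuples (an assumption for this sketch):
    ("int", 3), ("var", "x"), ("add", e1, e2).
    """
    kind = e[0]
    if kind == "int":
        return "Int"
    if kind == "var":
        if e[1] not in env.o:
            raise TypeError(f"unbound identifier {e[1]}")
        return env.o[e[1]]
    if kind == "add":
        # M and C are carried inside env but not used by this rule.
        t1 = type_check(env, e[1])
        t2 = type_check(env, e[2])
        if not (t1 == "Int" and t2 == "Int"):
            raise TypeError(f"cannot add {t1} and {t2}")
        return "Int"
    raise TypeError(f"no type rule for {kind}")


env = Env(o={"x": "Int", "s": "String"})
total_type = type_check(env, ("add", ("var", "x"), ("int", 1)))
```

Spelling the check out as `t1 == "Int" and t2 == "Int"` avoids relying on the chained comparison in the pseudocode, which some languages would evaluate differently.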