
Types
Antonio Cisternino, Giuseppe Attardi
Università di Pisa




Types
- Computer hardware is capable of interpreting bits in memory in several different ways
- A type limits the set of operations that may be performed on a value belonging to that type
- The hardware usually doesn't enforce the notion of type, though it provides operations for numbers and pointers
- Programming languages tend to associate types with values in order to enforce error checking

Type System
- A type system consists of:
  - A mechanism for defining types and associating them with certain language constructs
  - A set of rules for:
    - type equivalence: when the types of two values are the same
    - type compatibility: when a value of a given type can be used in a given context
    - type inference: the type of an expression given the types of its constituents

Type checking
- Type checking is the process of ensuring that a program obeys the language's type compatibility rules
- A language is strongly typed (or type safe) if it prohibits, in a way that the language implementation can enforce, performing an operation on an object that does not support it
- A language is statically typed if it is strongly typed and type checking can be performed at compile time

Programming Languages and type checking
- Assembly: no type checking
- C: static type checking; not entirely strongly typed (union, interoperability of pointers and arrays)
- Pascal: static type checking; not entirely strongly typed (untagged variant records)
- C++: static type checking; not entirely strongly typed (as C); dynamic type checking (virtual methods)
- Java: static type checking; dynamic type checking (virtual methods, upcasting)
- Lisp: dynamic type checking
- Prolog: dynamic type checking; strongly typed
- ML: static type checking; strongly typed

Different views for types
- Denotational:
  - types are sets of values (domains)
  - Application: semantics
- Constructive:
  - Built-in types
  - Composite types (application of type constructors)
- Abstraction-based:
  - A type is an interface consisting of a set of operations

Language types
- boolean
- int, long, float, double (signed/unsigned)
- character (1 byte, 2 bytes)
- Enumeration
- Subrange (n1..n2)
- Composite types:
  - struct
  - union
  - arrays
  - pointers
  - list

Type Conversions and Casts
- Consider the following definitions:

    int add(int i, int j);
    int add2(int i, double j);

- And the following calls:

    add(2, 3);        // Exact
    add(2, (int)3.0); // Explicit cast
    add2(2, 3);       // Implicit cast

Memory Layout
- Primitive types on 32-bit architectures require from 1 to 8 bytes
- Composite types are represented by chaining constituent values together
- For performance reasons compilers often employ padding to align fields to addresses that are multiples of 4 bytes


Memory layout example

    struct element {
        char name[2];
        int atomic_number;
        double atomic_weight;
        char metallic;
    };

[Figure: memory layout in rows of 4 bytes/32 bits, in order: name, atomic_number, atomic_weight, metallic]

Optimizing Memory Layout
- C requires that the fields of a struct be placed in the same order as in the declaration (essential for working with pointers!)
- Not all languages behave like this: for instance ML doesn't specify any order
- If the compiler is free to reorder fields, holes can be minimized (in the example by packing metallic with name, saving 4 bytes)
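
The saving described above can be checked with a small C++ sketch. `Element` follows the slide's declaration order; `PackedElement` and `paddingBytes` are hypothetical names introduced here for illustration. Exact sizes depend on the compiler and target, so the sketch only compares the two layouts rather than assuming specific byte counts.

```cpp
#include <cstddef>

// Field order as declared on the slide.
struct Element {
    char   name[2];
    int    atomic_number;
    double atomic_weight;
    char   metallic;
};

// Hypothetical reordering: metallic is placed right after name,
// so it can fill bytes that would otherwise be padding.
struct PackedElement {
    char   name[2];
    char   metallic;
    int    atomic_number;
    double atomic_weight;
};

// Padding = total struct size minus the sum of the field sizes.
constexpr std::size_t paddingBytes(std::size_t totalSize) {
    return totalSize - (2 * sizeof(char) + sizeof(char) + sizeof(int) + sizeof(double));
}
```

Comparing `paddingBytes(sizeof(Element))` with `paddingBytes(sizeof(PackedElement))` shows how many bytes the reordering recovers on a given platform; `offsetof(Element, atomic_number)` shows where the first hole ends.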

Union
- Union types allow sharing the same memory area among different types
- The size of the value is the maximum of the sizes of the constituents

    union u {
        struct element e;
        int number;
    };

[Figure: memory layout in rows of 4 bytes/32 bits; number shares the same area as name, atomic_number, atomic_weight and metallic]
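
A minimal C++ sketch of the same union, assuming the `element` struct from the memory layout example (renamed `Element` here). `storeNumber` is a hypothetical helper added for illustration; it writes and reads the same member, which is the only portable way to use a union.

```cpp
#include <cstddef>

struct Element {
    char   name[2];
    int    atomic_number;
    double atomic_weight;
    char   metallic;
};

// Both members start at the same address; the union is at least
// as large as its largest member.
union U {
    Element e;
    int     number;
};

// Write through one member and read the same member back.
inline int storeNumber(U& u, int n) {
    u.number = n;
    return u.number;
}
```

Taking the addresses of `u.e` and `u.number` shows that they coincide, which is what "sharing the same memory area" means in practice.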

Abstract Data Types
- According to the abstraction-based view of types, a type is an interface
- An ADT defines a set of values and the operations allowed on it
- In their evolution programming languages have included mechanisms to define ADTs
- Defining an ADT requires the ability to encapsulate values and operations

Example: a C list

    #include <stdlib.h>

    struct node {
        int val;
        struct node *next;
    };

    /* Returns the node following l. */
    struct node* next(struct node* l) { return l->next; }

    /* Initializes an already allocated node. */
    struct node* initNode(struct node* l, int v) {
        l->val = v;
        l->next = NULL;
        return l;
    }

    /* Walks to the last node and links a new one after it. */
    void append(struct node* l, int v) {
        struct node* p = l;
        while (p->next) p = p->next;
        p->next = initNode((struct node*)malloc(sizeof(struct node)), v);
    }

ADT, Modules and Classes
- C doesn't provide any mechanism to hide the structure of data types
- A program can access the next pointer without using the next function
- The notion of module has been introduced to define data types and restrict access to their definition
- An evolution of the module is the class: values and operations are tied together (with the addition of inheritance)

Class type
- Class is a type constructor like struct and array
- A class combines other types, like structs do
- A class definition also contains methods, which are the operations allowed on the data
- The inheritance relation is introduced
- Two special operations provide control over initialization and finalization of objects


The Node Type in Java

    class Node {
        int val;
        Node m_next;
        Node(int v) { val = v; }
        Node next() { return m_next; }
        void append(int v) {
            Node n = this;
            while (n.m_next != null) n = n.m_next;
            n.m_next = new Node(v);
        }
    }

Inheritance
- If class A inherits from class B (A <: B), then wherever an object of class B is expected an object of class A can be used instead
- Inheritance expresses the idea of adding features to an existing type (both methods and attributes)
- Inheritance can be single or multiple

Example

    class A {
        int i;
        int j;
        int foo() { return i + j; }
    }
    class B : A {
        int k;
        int foo() { return k + super.foo(); }
    }

Questions
- Consider the following:

    A a = new A();
    A b = new B();
    Console.WriteLine(a.foo());
    Console.WriteLine(b.foo());

- Which version of foo is invoked in the second print?
- What is the layout of class B?

Upcasting
- Late binding happens because we convert a reference to an object of class B into a reference of its super-class A (upcasting):

    B b = new B();
    A a = b;

- The runtime does not need to convert the object: it only uses the part inherited from A
- This is different from the following implicit cast, where the data is modified in the assignment:

    int i = 10;
    long l = i;

Downcasting
- Once we have a reference of the superclass we may want to convert it back:

    A a = new B();
    B b = (B)a;

- During a downcast it is necessary to explicitly indicate which class is the target: a class may be the ancestor of many sub-classes
- Again this transformation informs the compiler that the referenced object is of type B without changing the object in any way

Upcasting, downcasting
- We have shown upcasting and downcasting as expressed in languages such as C++, C# and Java, though the problem is common to OO languages
- Note that the upcast can be verified at compile time whereas the downcast cannot
- Whether a downcast is actually checked at runtime depends on the language:
  - in Java casts are checked at runtime
  - C++ simply changes the interpretation of an expression at compile time without any attempt to check it at runtime (unless dynamic_cast is used)
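
The difference between a checked and an unchecked downcast can be sketched in C++ as follows. The classes `A` and `B` and the three helper functions are hypothetical names for illustration; `dynamic_cast` requires the base class to have at least one virtual function.

```cpp
struct A {
    virtual ~A() = default;  // makes A polymorphic so dynamic_cast can be used
    int ai = 1;
};
struct B : A {
    int bi = 2;
};

// Upcast: implicit, verified at compile time.
inline A* upcast(B* b) { return b; }

// Checked downcast: yields nullptr if the object is not actually a B.
inline B* checkedDowncast(A* a) { return dynamic_cast<B*>(a); }

// Unchecked downcast: a compile-time reinterpretation only.
// Using the result is undefined behaviour if the object is not a B.
inline B* uncheckedDowncast(A* a) { return static_cast<B*>(a); }
```

Calling `checkedDowncast` on a pointer to a plain `A` returns `nullptr`, which is the C++ counterpart of the exception Java throws on a failed cast.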

Late Binding
- The output of the example depends on the language: the second output may be the result of invoking A::foo() or B::foo()
- In Java the behavior would result in the invocation of B::foo
- In C++ A::foo would be invoked (unless foo is declared virtual)
- The mechanism which associates the method B::foo() to b.foo() is called late binding

Late Binding
- In the example the compiler cannot determine statically the exact type of the object referenced by b because of upcasting
- To allow the invocation of the method of the exact type rather than the one known at compile time it is necessary to pay an overhead at runtime
- Programming languages allow the programmer to specify whether to apply late binding in a method invocation
- In Java the keyword final is used to indicate that a method cannot be overridden in subclasses: thus the JVM may avoid late binding
- In C++ only methods declared as virtual are considered for late binding
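
A short C++ sketch of the two binding modes, under the assumption that `foo` is left non-virtual and `bar` is declared virtual. The helpers `callFoo` and `callBar` are hypothetical; they only see the static type `A`.

```cpp
#include <string>

struct A {
    virtual ~A() = default;
    std::string foo() const { return "A::foo"; }          // non-virtual: bound at compile time
    virtual std::string bar() const { return "A::bar"; }  // virtual: bound at runtime
};

struct B : A {
    std::string foo() const { return "B::foo"; }           // hides A::foo, does not override it
    std::string bar() const override { return "B::bar"; }  // overrides A::bar
};

// Both helpers receive a reference whose static type is A.
inline std::string callFoo(const A& a) { return a.foo(); }
inline std::string callBar(const A& a) { return a.bar(); }
```

Passing a `B` to `callFoo` still runs `A::foo`, while `callBar` runs `B::bar`: only the virtual method goes through late binding.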

Late Binding
- With inheritance it is possible to treat objects in a generic way
- The benefit is evident: it is possible to write generic operations manipulating objects of types inheriting from a common ancestor
- OOP languages usually support late binding of methods: which method should be invoked is determined at runtime
- This mechanism involves a small runtime overhead: at runtime the type of an object must be determined in order to invoke its methods

Example (Java)

    class A {
        final void foo() {…}
        void baz() {…}
        void bar() {…}
    }
    class B extends A {
        // Suppose it's possible!
        final void foo() {…}
        void bar() {…}
    }

    A a = new A();
    B b = new B();
    A c = b;

    a.foo(); // A::foo()
    a.baz(); // A::baz()
    a.bar(); // A::bar()
    b.foo(); // B::foo()
    b.bar(); // B::bar()
    c.foo(); // A::foo()
    c.bar(); // B::bar()

Abstract classes
- Sometimes it is necessary to model a set S of objects which can be partitioned into subsets $(A_0, \ldots, A_n)$ such that their union covers S:
  - $\forall x \in S,\ \exists A_i \subseteq S : x \in A_i$
- If we use classes to model each set it is natural that
  - $\forall A \subseteq S,\ A <: S$
- Each object is an instance of a subclass of S and no object is an instance of S
- S is useful because it abstracts the commonalities among its subclasses, making it possible to express generic properties of its objects

Example
- We want to manipulate documents with different formats
- The set of documents can be partitioned by type: doc, pdf, txt, and so on
- For each document type we introduce a class that inherits from a class Doc that represents the document
- In the class Doc we may store properties common to all documents (title, location, …)
- Each class is responsible for reading the document content
- It doesn't make sense to have an instance of Doc, though it is useful to scan a list of documents to read

Abstract methods
- Often when a class is abstract some of its methods cannot be defined
- Consider the method read() in the previous example
- In class Doc there is no reasonable implementation for it
- We leave it abstract so that through late binding the appropriate implementation will be called

Syntax
- Abstract classes can be declared using the abstract keyword in Java or C#:

    abstract class Doc { … }

- C++ assumes a class is abstract if it contains an abstract method: it is impossible to instantiate an abstract class, since it will lack that method
- A virtual method is abstract in C++ if it is declared with = 0 in place of a body:

    virtual string Read() = 0;

- In Java and C# abstract methods are annotated with abstract and no body is provided:

    abstract String Read();
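
The document example can be sketched in C++ as follows. `Doc` and `read` come from the slides; `TxtDoc`, `PdfDoc`, `title` and `readAll` are hypothetical names, and the strings returned by `read` are placeholders rather than real file parsing.

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Abstract class: holds what all documents share, leaves read() undefined.
class Doc {
public:
    explicit Doc(std::string title) : title_(std::move(title)) {}
    virtual ~Doc() = default;
    virtual std::string read() const = 0;  // pure virtual (abstract) method
    const std::string& title() const { return title_; }
private:
    std::string title_;
};

class TxtDoc : public Doc {
public:
    using Doc::Doc;
    std::string read() const override { return "txt:" + title(); }
};

class PdfDoc : public Doc {
public:
    using Doc::Doc;
    std::string read() const override { return "pdf:" + title(); }
};

// Generic operation: scans a list of documents through the Doc interface.
inline std::string readAll(const std::vector<std::unique_ptr<Doc>>& docs) {
    std::string out;
    for (const auto& d : docs) out += d->read() + ";";
    return out;
}

static_assert(std::is_abstract<Doc>::value, "Doc cannot be instantiated");
```

`readAll` never names a concrete document type: late binding selects the right `read` for each element.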

Inheritance
- Inheritance is a relation among classes
- Often systems impose some restrictions on the inheritance relation for convenience
- We say that class A is an interface if all its members are abstract, it has no fields, and it inherits only from one or more interfaces
- Inheritance can be:
  - Single: $A <: B \Rightarrow (\forall C.\ A <: C \Rightarrow C = B)$
  - Mix-in: $S = \{B \mid A <: B\},\ \exists!\, B \in S.\ \lnot\,\mathrm{interface}(B)$
  - Multiple (no restriction)

Multiple inheritance
- Why should systems impose restrictions on inheritance?
- Multiple inheritance introduces both conceptual and implementation issues
- The crucial problem, in its simplest form, is the following:
  - B <: A, C <: A
  - D <: B, D <: C
- In the presence of a common ancestor there are two options:
  - The instance part from A is shared between B and C
  - The instance part from A is duplicated
- This situation is not infrequent: in C++ ios :> istream, ios :> ostream and iostream <: istream, iostream <: ostream
- The problem in sharing the ancestor A is that B and C may change the inherited state in a way that may lead to conflicts
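
Both options for the common ancestor exist in C++, and the following sketch contrasts them. The class names (`B1`, `C1`, `D1` for the duplicated case, `B2`, `C2`, `D2` for the shared case) and the helper `ancestorIsShared` are hypothetical; sharing is obtained with virtual inheritance.

```cpp
struct A {
    int state = 0;
};

// Duplicated ancestor: plain inheritance gives D1 two separate A parts.
struct B1 : A {};
struct C1 : A {};
struct D1 : B1, C1 {};

// Shared ancestor: virtual inheritance gives D2 a single A part.
struct B2 : virtual A {};
struct C2 : virtual A {};
struct D2 : B2, C2 {};

// True when the A reached through the B side and the A reached
// through the C side are the same sub-object.
inline bool ancestorIsShared(D1& d) {
    return &static_cast<B1&>(d).state == &static_cast<C1&>(d).state;
}
inline bool ancestorIsShared(D2& d) {
    return &static_cast<B2&>(d).state == &static_cast<C2&>(d).state;
}
```

In the duplicated case the expression `d.state` is ambiguous and must be qualified by going through `B1` or `C1`; in the shared case a write through one path is visible through the other, which is the source of the conflicts mentioned above.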

Java and Mix-in inheritance
- Both single and mix-in inheritance fix the common ancestor problem
- Single inheritance, though, can be somewhat restrictive
- Mix-in inheritance has become popular with Java and represents an intermediate solution
- Classes are partitioned into two sets: interfaces and normal classes
- Interfaces constrain the elements of the class to be only abstract methods: no instance variables are allowed
- A class inherits instance variables from only one of its ancestors, avoiding the diamond problem of multiple inheritance

Implementing Single and Mix-in inheritance
- Consists only in combining the state of a class and its super-classes
- Note that upcasting and downcasting come for free: the pointer at the base of the instance can be seen both as a pointer to an instance of A or B

[Figure: instance layouts for A, B <: A, C <: B <: A and D <: C <: B <: A]

Implementing multiple inheritance
- With multiple inheritance, casting becomes more complex than reinterpreting a pointer!

[Figure: instance layouts for A, B <: A, C <: A and D <: B, D <: C; the instance of D contains a B part and a C part, each with its own A]
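
Why a cast is more than a reinterpretation can be observed in C++ by measuring where each base part sits inside the derived object. This is a sketch with hypothetical field names and helpers (`offsetOfB`, `offsetOfC`); the actual offsets are chosen by the compiler.

```cpp
#include <cstddef>

struct A { int a = 0; };
struct B : A { int b = 1; };
struct C : A { int c = 2; };
struct D : B, C { int d = 3; };

// Distance in bytes between the start of D and its B part.
inline std::ptrdiff_t offsetOfB(D& d) {
    return reinterpret_cast<char*>(static_cast<B*>(&d)) - reinterpret_cast<char*>(&d);
}

// Distance in bytes between the start of D and its C part.
inline std::ptrdiff_t offsetOfC(D& d) {
    return reinterpret_cast<char*>(static_cast<C*>(&d)) - reinterpret_cast<char*>(&d);
}
```

The two base parts cannot both start at the same address, so at least one upcast has to adjust the pointer, and the matching downcast has to undo the adjustment.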

Late binding
- How to identify which method to invoke?
- Solution: use a v-table for each class that has polymorphic methods
- Each virtual method is assigned a slot in the table pointing to the method code
- Invoking the method involves looking up in the table at a specific offset to retrieve the address to use in the call instruction
- Each instance holds a pointer to the v-table
- Thus late binding incurs an overhead both in time (2 indirections) and space (one pointer per object)
- The overhead is small and often worth the benefits
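
The mechanism can be made explicit by building a v-table by hand. The following C++ sketch is an illustration under simplifying assumptions, not how any particular compiler lays out its tables; all names (`VTable`, `Object`, `A_foo`, `B_foo`, `callFoo`, ...) and the values returned are hypothetical.

```cpp
struct Object;

// One slot per virtual method, at a fixed position.
struct VTable {
    int (*foo)(const Object*);  // slot 0
    int (*f)(const Object*);    // slot 1
};

// Every instance starts with a pointer to the v-table of its class.
struct Object {
    const VTable* vptr;
    int ai;
};

inline int A_foo(const Object* self) { return self->ai; }
inline int A_f(const Object*) { return 100; }
inline int B_foo(const Object* self) { return self->ai * 2; }

const VTable A_vtable = { &A_foo, &A_f };
const VTable B_vtable = { &B_foo, &A_f };  // B overrides foo and inherits f

// A virtual call: first indirection through vptr, second through the slot.
inline int callFoo(const Object* o) { return o->vptr->foo(o); }
inline int callF(const Object* o) { return o->vptr->f(o); }
```

`callFoo` contains no test on the object's class: the object itself carries the pointer that selects the right code.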

Late binding: an example (Java)

    class A {
        void foo() {…}
        void f() {…}
        int ai;
    }
    class B extends A {
        void foo() {…}
        void g() {…}
        int bi;
    }

    A a = new A();
    a.foo();
    a.f();

    B b = new B();
    b.foo();
    b.g();
    b.f();

    A c = b;
    c.foo();
    c.f();

[Figure: A's v-table holds foo, f; B's v-table holds foo, f, g. Object a holds V-pointer, ai; the object referenced by b and c holds V-pointer, ai, bi]

Overriding and Overloading

    class A {
        void foo() {…}
        void f() {…}
        int ai;
    }
    class B extends A {
        void foo(int i) {…}
        void g() {…}
        int bi;
    }

    A a = new A();
    a.foo();
    a.f();

    B b = new B();
    b.foo();
    b.g();
    b.f();

    A c = b;
    c.foo(3);
    c.f();

[Figure: A's v-table holds foo(), f; B's v-table holds foo(), f, foo(int), g. Object a holds V-pointer, ai; the object referenced by b and c holds V-pointer, ai, bi]

JVM invokevirtual
- A call like:

    x.equals("test")

- is translated into:

    aload_1        ; push local variable 1 (x) onto the operand stack
    ldc "test"     ; push string "test" onto the operand stack
    invokevirtual java.lang.Object.equals(Ljava.lang.Object;)Z

  where java.lang.Object.equals(Ljava.lang.Object;)Z is a method specification
- When invokevirtual is executed, the JVM looks at the method specification and determines its number of arguments
- From the object reference it retrieves the class and searches its list of methods for one matching the method descriptor
- If none is found, it searches the superclass

Invokevirtual optimization
- The Java compiler can arrange every subclass method table (mtable) in the same way as its superclass, ensuring that each method is located at the same offset
- The bytecode can be modified after first execution, by replacing the instruction with:

    invokevirtual_quick mtable-offset

- Even when called on objects of different types, the method offset will be the same

Virtual Method in Interface
- The optimization does not work for interfaces:

    interface Incrementable {
        public void incr();
    }
    class Counter implements Incrementable {
        public void incr() {…}
    }
    class Timer implements Incrementable {
        public void decr() {…}
        public void incr() {…}
    }

    Incrementable i;
    i.incr();

- The compiler cannot guarantee that method incr() is at the same offset in every implementing class

Runtime type information
- Execution environments may use the v-table pointer as a means of knowing the exact type of an object at runtime
- This is what happens in C++ with RTTI, in the .NET CLR and in the JVM
- Thus the cost of having exact runtime type information is allocating the v-pointer in all objects
- C++ leaves the choice to the programmer: without RTTI no v-pointer is allocated in classes without virtual methods
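
In C++ the effect of having or lacking a v-pointer is visible through `typeid`. This is a minimal sketch; the classes `Plain` and `PlainChild` and the two helper functions are hypothetical names used for illustration.

```cpp
#include <typeinfo>

// Polymorphic hierarchy: objects carry runtime type information.
struct A { virtual ~A() = default; };
struct B : A {};

// Non-polymorphic hierarchy: no virtual methods, so no v-pointer.
struct Plain {};
struct PlainChild : Plain {};

// For a polymorphic type, typeid reports the exact runtime type.
inline bool isExactlyB(const A& a) { return typeid(a) == typeid(B); }

// For a non-polymorphic type, typeid only reports the static type (Plain).
inline bool plainIsChild(const Plain& p) { return typeid(p) == typeid(PlainChild); }
```

`plainIsChild` returns false even when given a `PlainChild`, because the object holds nothing from which its exact type could be recovered.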

Overloading
- Overloading is the mechanism that a language may provide to bind more than one object to a name
- Consider the following class:

    class A {
        void foo() {…}
        void foo(int i) {…}
    }

- The name foo is overloaded and it identifies two methods

Method overloading
- Overloading is mostly used for methods because the compiler may infer which version of the method should be invoked by looking at argument types
- Behind the scenes the compiler generates a name for the method which includes the types in its signature (not the return type!)
- This process is known as name mangling
- In the previous example the name foo_v may be associated with the first method and foo_i with the second
- When the method is invoked the compiler looks at the types of the arguments used in the call and chooses the appropriate version of the method
- Sometimes implicit conversions may be involved and the resolution process may find more than one equally good method: in this case the call is considered ambiguous and a compilation error is raised
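
Overload resolution, including an ambiguous case, can be sketched in C++. The class `A` extends the slide's example with a hypothetical `foo(double)`; the free function `bar` is also hypothetical and exists only to show an ambiguity.

```cpp
#include <string>

struct A {
    std::string foo() const { return "foo()"; }
    std::string foo(int) const { return "foo(int)"; }
    std::string foo(double) const { return "foo(double)"; }
};

inline std::string bar(long) { return "bar(long)"; }
inline std::string bar(double) { return "bar(double)"; }

// bar(1) would not compile: the argument is an int, and converting it to
// long or to double are equally good choices, so the call is ambiguous.
```

The version is chosen entirely from the static types of the arguments: `foo(1)` selects `foo(int)`, `foo(1.5)` selects `foo(double)`, and `foo('c')` selects `foo(int)` because promoting a char to int is preferred over converting it to double.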

Operator overloading
- The syntax for operators such as + and - is different from the invocation of the functions they represent
- C++ and other languages (e.g. C#) allow overloading these operators in the same way as ordinary functions and methods
- Conceptually each invocation of + is reinterpreted as a function invocation, so the standard overloading process applies
- Example (C++):

    c = a + b; // c.operator=(operator+(a, b))
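
A minimal C++ sketch of the reinterpretation, using a hypothetical two-field type `Vec2`. The operator is an ordinary function whose name happens to be `operator+`.

```cpp
struct Vec2 {
    int x;
    int y;
};

// Overload of + for Vec2: component-wise sum.
inline Vec2 operator+(const Vec2& a, const Vec2& b) {
    return Vec2{a.x + b.x, a.y + b.y};
}

// The infix form and the explicit function call are the same invocation.
inline Vec2 sumInfix(const Vec2& a, const Vec2& b) { return a + b; }
inline Vec2 sumExplicit(const Vec2& a, const Vec2& b) { return operator+(a, b); }
```

Because `operator+` is resolved like any other overloaded function, adding a second overload for a different operand type follows the same rules as method overloading.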

Late binding: only on first argument

    class A {
        void foo(A a) {…}
        void f() {…}
        int ai;
    }
    class B extends A {
        void foo(B b) {…}
        void g() {…}
        int bi;
    }

    A a = new A();
    a.foo();
    a.f();

    B b = new B();
    b.foo();
    b.g();
    b.f();

    A c = b;
    c.foo(c);
    c.f();

[Figure: A's v-table holds foo(), f; B's v-table holds foo(A), f, foo(B), g. Object a holds V-pointer, ai; the object referenced by b and c holds V-pointer, ai, bi]
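
The same situation can be reproduced in C++ to see which method runs. This is a sketch with hypothetical return strings and a hypothetical helper `viaBase`; as on the slide, B adds an overload `foo(B)` and does not override `foo(A)`.

```cpp
#include <string>

struct A {
    virtual ~A() = default;
    virtual std::string foo(const A&) const { return "A::foo(A)"; }
};

struct B : A {
    using A::foo;  // keep A::foo(A) visible next to the new overload
    virtual std::string foo(const B&) const { return "B::foo(B)"; }
};

// Receiver and argument are both seen with static type A.
inline std::string viaBase(const A& receiver, const A& arg) {
    return receiver.foo(arg);
}
```

Even when both the receiver and the argument are really `B` objects, `viaBase` runs `A::foo(A)`: the overload is picked at compile time from the static type of the argument, and late binding then only looks up that one slot using the runtime type of the receiver.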