Type Systems

Overview
• Introduction to Type Systems
• Type Systems: Static vs. Dynamic
• Specifying Types: Manifest vs. Implicit
• Checking Types: Nominal vs. Structural
• Type Safety and Soundness

Basic Concepts
• A type system is a way of classifying entities in a program (expressions, variables, etc.) by the kinds of values they represent, in order to prevent undesirable program states
• The classification assigned to an entity is its type
• Most programming languages include some kind of type system

[C++] int abc;
• In C++, int is a type. Variable abc has type int
[C++] cout << 123 + 456;
• Literals 123 and 456 also represent values of type int in C++, although this does not need to be explicitly stated
• In C++, applying the binary + operator to two values of type int results in another value of type int
• Therefore, the expression 123 + 456 also has type int

• In the past, programming languages often had a fixed set of types
• In contrast, modern programming languages typically have an extensible type system: one that allows programmers to define new types. For example, in C++, a class is a type
[C++]
    class Zebra { // New type named "Zebra"
    …
• Type checking means checking and enforcing the restrictions associated with a type system

How Types are Used - Determining Legal Values
• Types are used to determine which values an entity may take on
• Being of type int, the variable abc declared earlier may be set to 42, which also has type int
[C++] abc = 42;
• Being of type int also makes some values illegal for abc
[C++]
    vector<int> v;
    abc = v; // ILLEGAL!
• We cannot set a variable of type int to a value of type vector<int>. The above code contains a type error. When a C++ compiler flags a type error, it prevents the program from compiling successfully

How Types are Used - Determining Legal Operations
• Types are used to determine which operations are legal
• Since variable abc is of type int, the binary + operator may be used with it
[C++] cout << abc + abc;
• Some operations are forbidden for type int. For example, there is no unary * operator for this type. A C++ compiler would flag a type error in the following code
[C++] cout << *abc; // ILLEGAL!
• A type whose values can be manipulated with the ease and facility of a type like int in C++ is said to be first-class

First-Class Type
• A first-class type is one for which
    • new values may be created from existing values at runtime
    • values may be passed as arguments to functions and returned from functions
    • values may be stored in containers
• In some programming languages, such as Haskell, Scheme, Python, and Lua, function types are first-class
• Such programming languages are said to have first-class functions
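The three properties of a first-class type can be sketched with Python functions. This is only an illustration: the names make_scaler, apply_twice, and scalers are hypothetical and do not come from the slides.

```python
def make_scaler(factor):
    # Create a new function value at runtime from an existing value.
    def scale(x):
        return factor * x
    return scale  # a function returned from a function


def apply_twice(f, x):
    # A function passed as an argument to another function.
    return f(f(x))


double = make_scaler(2)
triple = make_scaler(3)

# Functions stored in a container.
scalers = [double, triple]
results = [apply_twice(f, 5) for f in scalers]
print(results)
```

Each of the three bullets corresponds to one step: make_scaler builds and returns a function, apply_twice receives one, and the list scalers stores two.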

How Types are Used - Determining Which Operation
• Third, types are used to determine which of multiple possible operations to perform
[C++]
    template <typename T, typename U>
    T addEm(T a, U b) {
        return a + b;
    }
• We can call addEm with two arguments of type int or with two arguments of type string. The operation performed by the + operator is determined by the argument types: addition for int arguments, concatenation for string arguments

• Python similarly uses the binary + operator for both numeric addition and string concatenation. The following Python function, like its C++ counterpart, may be called with two integers or with two strings
[Python]
    def addEm(a, b):
        return a + b
• An important application of this third use of types is to provide a single interface for entities of different types. This is called polymorphism. Both versions of addEm are polymorphic functions
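A short usage sketch of the Python addEm may make the polymorphism concrete. The function body is the one from the slide; the list call is an extra illustration that the slides do not mention.

```python
def addEm(a, b):
    # Same body as the slide's addEm; the meaning of + depends on the operand types.
    return a + b


int_sum = addEm(2, 3)          # numeric addition
str_cat = addEm("ab", "cd")    # string concatenation
list_cat = addEm([1], [2, 3])  # list concatenation (extra illustration)

print(int_sum, str_cat, list_cat)
```

One function definition serves all three calls, which is the "single interface for entities of different types" described above.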

Type Systems: Static vs. Dynamic
• A type system can be characterized as static or dynamic
• In a static type system, types are determined and checked before program execution
• This is typically done by a compiler
• Type errors flagged during static type checking generally prevent a program from being executed
• Programming languages with static type systems include C, C++, Java, Haskell, Objective-C, Go, Rust, and Swift

• In a dynamic type system, types are determined and checked during program execution
• Types are tracked by attaching to each value a tag indicating its type
• Type errors in a particular portion of code are flagged only when that code actually executes
• Programming languages with dynamic type systems include Python, Lua, JavaScript, Ruby, Scheme, and PHP
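The two ideas above, tags attached to values and errors flagged only when code runs, can be observed directly in Python. This is a minimal sketch; describe and broken are hypothetical names introduced for the illustration.

```python
def describe(value):
    # type() reports the tag that the running program attaches to the value.
    return type(value).__name__


tags = [describe(v) for v in (42, "bye", 4.7, [1, 2])]


def broken():
    # This body contains a type error, but defining the function is harmless:
    # the error is flagged only if this line actually executes.
    return 1 / "bye"


try:
    broken()
    outcome = "no error"
except TypeError:
    outcome = "TypeError raised when broken() ran"

print(tags)
print(outcome)
```

Had broken() never been called, the program would have finished without any type error being reported.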

• Static typing and dynamic typing are two very different things
• They are handled at different times, are implemented very differently, give different information, and allow for the solution of different problems
• Static and dynamic typing are so different that some people prefer not to use the same word for both
• They typically reserve the term “type” for use with a static type system, referring to the categories in a dynamic “type” system as tags
• The purpose of a type system is to prevent undesirable program states: it prevents the execution of operations that are incorrect for a type, because it is undesirable that such operations execute

Consequences of Static & Dynamic Typing
• Now we consider some of the ways that static vs. dynamic type systems affect programming
• Our examples will use
    • C++, which is statically typed
    • Python & Lua, which are dynamically typed

Compilation vs. Runtime Errors
• In C++, Python, and Lua, it is illegal to divide a number by a string such as "bye" (Lua will coerce a numeric string like "10" to a number in arithmetic, but "bye" is not numeric)
• A C++ compiler will flag a type error in the following code
[C++]
    cout << "Hello" << endl;
    cout << 1 / "bye" << endl; // ILLEGAL!
• The static type checking in C++ means that the above code will not compile. An executable will not be created, much less executed

• The following Python and Lua code will result in a type error being flagged
[Python]
    print("Hello")
    print(1 / "bye")  # ILLEGAL!
[Lua]
    io.write("Hello\n")
    io.write(1 / "bye" .. "\n")  -- ILLEGAL!
• However, because of the dynamic type checking in Python and Lua, the programs above can still compile successfully and begin execution
• The type error will not be flagged until execution reaches the second statement. So both of the above programs will print “Hello”, and then the type error will be flagged

• In some dynamically typed programming languages, type errors raise exceptions that can be caught and handled
[Python]
    def ff(x):
        try:
            print(1 / x)  # MAYBE illegal, depending on type of x
        except TypeError:
            print("TYPE ERROR")
• Above, if a type error is flagged in print(1 / x), then we catch the resulting exception and print a message
• If we do ff(2), then “0.5” will be printed
• If we do ff("bye"), then “TYPE ERROR” will be printed. In both cases, execution will continue

Typing of Both Variables and Values vs. Only Values
• In a static type system, types are generally applied to both variables and values
[C++] int x = 42;
• Above, “x” is an identifier naming a variable of type int, and 42 is a value of type int

• In a dynamic type system, types are represented by tags attached to values. So generally only values have types in a dynamic type system
[Python OR Lua]
    x = 42
    x = "bye"
• The above presents no problem in a dynamically typed programming language like Python or Lua. Variable x does not have a type. The value 42 does have a type. And the value "bye" has a different type. But in both cases, x is merely a reference to the value

• A bit more subtly, in dynamically typed programming languages, container items typically do not have types; only their values do. So there is generally no problem with a container holding values of different types. Here are Python and Lua lists containing a number, a string, and a boolean
[Python] mylist = [ 123, "hello", True ]
[Lua] mylist = { 123, "hello", true }
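That types belong to values rather than to variables or container slots can be checked by inspecting the tags in Python. This is a small sketch using the x and mylist examples above; first_tag, second_tag, and item_tags are names introduced here.

```python
x = 42
first_tag = type(x).__name__   # tag of the value x currently refers to
x = "bye"
second_tag = type(x).__name__  # same variable, now referring to a differently tagged value

mylist = [123, "hello", True]
item_tags = [type(item).__name__ for item in mylist]

print(first_tag, second_tag, item_tags)
```

The variable x is never given a type of its own; only the values it refers to carry one.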

Manifest & Implicit Typing
• When we specify the type of an entity by explicitly stating it, we are doing manifest typing
• The typing of variables and functions in C, C++, and Java is mostly manifest
[C++]
    double sq47(double n) {
        double result = 4.7 * n;
        return result;
    }
• Above, the types of n, sq47, and result are explicitly stated. Such an explicit specification of a type is a type annotation

• When types are not specified explicitly, we have implicit typing
• The typing in Python and Lua is mostly implicit. Here is a Python function that is more or less equivalent to the above C++ function
[Python]
    def sq47(n):
        result = 4.7 * n
        return result
• In the above code, there are no type annotations at all

• In dynamically typed programming languages, typing is usually mostly implicit
• It is therefore tempting to conflate manifest typing with static typing; however, the two are not the same
• For example, here is function sq47 in Haskell, which has a static type system
[Haskell]
    sq47 n = result
      where result = 4.7 * n

• Again, there are no type annotations. However, identifiers sq47, n, and result in the above code still have types
• This is because a Haskell compiler performs type inference, determining types from the way entities are used in the code
• Haskell types are said to be inferred. However, while type annotations are mostly not required in Haskell, they are still allowed

• Since 2011, C++ standards have allowed for the increasing use of type inference in C++. For example, the following is legal under the 2014 C++ Standard
[C++14]
    auto sq47(double n) {
        auto result = 4.7 * n;
        return result;
    }
• While the type of parameter n is explicitly specified above, the other two type annotations are no longer required

Explicit & Implicit Type Conversions
• A type conversion takes a value of one type and returns an equivalent, or at least similar, value of another type
• When we specify in our code that a type conversion is to be done, we are doing an explicit type conversion; other conversions are implicit

• For example, C++ does implicit type conversion from int to double
[C++]
    int n = 32;
    auto z = sq47(n);
• Function sq47 takes a parameter of type double, while n has type int. This mismatch is dealt with via an implicit type conversion: a double version of the value of n is computed, and this is passed to function sq47
• We could also do the conversion explicitly, as below
[C++]
    int n = 32;
    auto z = sq47(static_cast<double>(n));
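The explicit/implicit distinction also shows up in a dynamically typed language. The following Python sketch is an illustration added here, not part of the slides; the variable names are hypothetical. Python widens an int operand to float in mixed arithmetic, while float(...) and int(...) request a conversion explicitly.

```python
n = 32

as_float = float(n)    # explicit conversion: int -> float
from_text = int("42")  # explicit conversion: str -> int
mixed = 4.7 * n        # implicit conversion: n is widened to float for the multiplication

print(as_float, from_text, type(mixed).__name__)
```

Note that Python performs no implicit conversion between str and numbers: 4.7 * "32" raises a TypeError rather than converting the string.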

Type Checking: Nominal vs. Structural
• Various standards can be applied when type checking is done. Consider the following C++ types A and B
[C++]
    struct A { int h; int m; };
    struct B { int h; int m; };
• Should types A and B be considered the same, for type-checking purposes? For example, should we allow the following function gg to be called with an argument of type B?

[C++] void gg(A x) { ...
• It might seem that there is no possible reason to distinguish between A and B. But here is one: they are different types. The programmer made them distinct, and perhaps we should honor that decision

• Type checking by this standard is nominal typing (“nominal” because we check whether a type has the right name)
• C++ checks ordinary function parameters using nominal typing. And indeed, if we try to call the above function gg with an argument of type B, a C++ compiler will flag a type error
• There are looser ways to apply nominal typing
• For example, in the context of C++ inheritance, when a function takes a parameter of base-class pointer type, the type system allows a derived-class pointer to be passed as an argument

[C++]
    class Derived : public Base { ... };
    void hh(Base * bp);
    Derived * derp;
    hh(derp); // Legal, even though different type

Structural Typing
• Another possible standard for type checking is structural typing, which considers two types to be interchangeable if they have the same structure and support the same operations
• Under structural typing, the types A and B defined above would be considered the same
• Structural typing may also be applied in a relatively loose manner. Perhaps the loosest variation on structural typing allows an argument to be passed to a function as long as every operation that the function actually uses is defined for the argument
• This is duck typing. (The name comes from the Duck Test: “If it looks like a duck, swims like a duck, and quacks like a duck, then it’s a duck.”)

• C++ checks template-parameter types using duck typing. The following function template ggt can be called with arguments of type A or B
[C++]
    template <typename T>
    void ggt(T x) {
        cout << x.h << " " << x.m << endl;
    }
• The addEm functions discussed earlier are examples of duck typing

[C++]
    template <typename T, typename U>
    T addEm(T a, U b) {
        return a + b;
    }
[Python]
    def addEm(a, b):
        return a + b
• We have noted that C++ template-parameter types are checked using duck typing. Python, Lua, and some other dynamic languages check all function parameter types using duck typing. Both versions of addEm above can be called with arguments of any type, as long as all the operations used are defined for those types
• In particular, both versions of addEm may be called with two integer arguments or with two string arguments
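The contrast between a nominal check and duck typing can be sketched in Python using analogues of the types A and B and the functions gg and ggt. Python does not perform the nominal check itself; gg_nominal below is a hypothetical function that imitates one by hand with isinstance, purely for comparison.

```python
class A:
    def __init__(self, h, m):
        self.h = h
        self.m = m


class B:
    # Same structure as A, but a distinct type with a different name.
    def __init__(self, h, m):
        self.h = h
        self.m = m


def gg_nominal(x):
    # Hand-written nominal check: only values whose type is A (or a subclass) pass.
    if not isinstance(x, A):
        raise TypeError("gg_nominal expects an A")
    return f"{x.h} {x.m}"


def ggt(x):
    # Duck typing: any value with attributes h and m is accepted.
    return f"{x.h} {x.m}"


duck_results = [ggt(A(1, 30)), ggt(B(2, 45))]

try:
    gg_nominal(B(2, 45))
    nominal_outcome = "accepted"
except TypeError:
    nominal_outcome = "rejected"

print(duck_results, nominal_outcome)
```

The isinstance test also accepts subclasses of A, which mirrors the looser form of nominal typing described for C++ inheritance.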

No Type Checking
• An alternative to the nominal and structural versions of type checking is no type checking at all, as in Forth (as defined in the 1994 ANSI standard)
• Forth distinguishes between integer and floating-point values, so it arguably has a notion of type. However, these two types are dealt with using different syntax. There is no need to check whether an integer parameter is actually an integer; Forth provides no facilities for passing a floating-point value in its place
• Thus, while Forth has types, it has no type checking
• This idea was once common in programming-language design
• Virtually all modern programming languages include some form of type checking

Type Safety
• A programming language or programming-language construct is type-safe if it forbids operations that are incorrect for the types on which they operate
• Some programming languages/constructs may discourage incorrect operations or make them difficult, without completely forbidding them. We may thus compare the level of type safety offered by two programming languages/constructs

• For example, the C++ printf function, inherited from the C Standard Library, is not type-safe
• This function takes an arbitrary number of parameters. The first should be a format string containing references to the other parameters
[C++] printf("I am %d years old.", age);
• The above inserts the value of variable age in place of the %d in the format string, on the assumption that age has type int. However, the C++ Standard specifies that the type of age is not checked. It could be a floating-point value, a pointer, or a struct; the code would then compile, and a type error would slip by unnoticed

• In contrast, C++ stream I/O is type-safe
[C++] cout << "I am " << age << " years old.";
• When the above code is compiled, the correct output function is chosen based on the type of the variable age

Soundness
• A static type system is sound if it guarantees that operations that are incorrect for a type will not be performed; otherwise it is unsound
• Haskell has a sound type system. The type system of C (and thus C++) is unsound, since there is always a way of treating a value of one type as if it had a different type. This might appear to be a criticism
• However, the type system of C was deliberately designed to be unsound. Being able to interpret a value in memory in arbitrary ways makes C useful for low-level systems programming
• There does not seem to be any equivalent of the notion of soundness in the world of dynamic typing. However, we can still talk about whether a dynamic type system strictly enforces type safety
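As one data point on that last bullet, Python enforces type safety fairly strictly for the + operator: it refuses to add an integer and a string rather than reinterpreting either operand. This sketch only illustrates that single case; try_add is a hypothetical helper introduced here.

```python
def try_add(a, b):
    # Attempt a + b and report a type error instead of letting it propagate.
    try:
        return a + b
    except TypeError as err:
        return f"TypeError: {err}"


ok = try_add(1, 2)
rejected = try_add(1, "2")

print(ok)
print(rejected)
```

The rejected call never produces a numeric or string result; the incorrect operation is simply not performed.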