CPS 506 Comparative Programming Languages Type Systems Semantics






































































- Slides: 74

CPS 506 Comparative Programming Languages Type Systems, Semantics and Data Types

Type Systems
• A completely defined language has a defined syntax, semantics, and type system
• Type: a set of values and operations
  – int
    • Values = Z (the integers)
    • Operations = {+, -, *, /, mod}
  – Boolean
    • Values = {true, false}
    • Operations = {AND, OR, NOT, XOR}

Type Systems
• Type System
  – A system of types and their associated variables and objects in a program
  – Formalizes the definition of data types and their usage in a programming language
  – A bridge between syntax and semantics
    • Types checked at compile time: a part of syntax analysis
    • Types checked at run time: a part of semantics

Type Systems (con’t)
• Statically Typed: each variable is associated with a single type for its whole lifetime at run time
  – Declaration can be explicit or implicit
  – Example: C and Java
  – Type rules are defined on the abstract syntax (static semantics)

Type Systems (con’t)
• Dynamically Typed: the type of a variable can change at run time
  – Example: LISP, JavaScript, PHP
  – JavaScript example:
      List = [10.2, 3.5]
      …
      List = 47
  – Less reliable, difficult to debug
  – More flexible
  – Fast compilation
  – Slow execution (type checking at run time)
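The rebinding shown for JavaScript can be reproduced in Python, which is also dynamically typed. This is a minimal sketch; the names `items` and `total_length` are made up for illustration. It shows a name changing type at run time and a type error surfacing only when the offending call actually executes.

```python
# Hypothetical names; mirrors the slide's JavaScript example in Python.
items = [10.2, 3.5]
type_before = type(items)   # the name is bound to a list
items = 47                  # the same name is now bound to an int
type_after = type(items)

def total_length(value):
    # len() is only defined for sized objects; the check happens when this runs.
    return len(value)

try:
    total_length(items)
    error_seen = False
except TypeError:
    # Detected at run time, not before the program starts.
    error_seen = True
```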

Type Systems (con’t)
• Type Error: an operation applied at run time to a variable for which it is not well defined
  – Example: union in C
      union flexType { int i; float f; };
      union flexType u;
      float x;
      …
      u.i = 10;
      x = u.f;
      …
  – Another example in C?

Type Systems (con’t)
• Strongly Typed: all type errors are detected, either at compile time or at run time
  – More reliable
  – Example: Java is nearly strongly typed, but C is not: x + 1 is accepted regardless of the type of x
  – Coercion (implicit type conversion) rules have an effect on strong typing
• Weak typing example
      x = 2; y = "5"; print x + y
    Visual Basic: 7
    JavaScript: "25"
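For contrast, Python refuses to coerce between int and str in `+`, so the programmer has to pick one of the two readings explicitly. The sketch below is only an illustration of that behavior; the variable names are chosen here and are not from the slides.

```python
x = 2
y = "5"

# Python does not coerce int and str for +, so this raises TypeError.
try:
    x + y
    mixed_allowed = True
except TypeError:
    mixed_allowed = False

# Explicit conversions reproduce the two results quoted on the slide.
numeric_result = x + int(y)   # the Visual Basic reading
string_result = str(x) + y    # the JavaScript reading
```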

Type Systems (con’t)
• Type Safe: a language in which no type error can occur
  – Strongly typed implies type safe
  – Example: Java, Haskell, and ML

Type Binding
• Binding: the process of associating an attribute (name, location, value, or type) with an object
• Example
    int i;    Identifier i is bound to the integer type and to a location specified by the underlying compiler
    i = 10;   Identifier i is bound to value 10, or value 10 is bound to a location

Type Binding (con’t)
• Binding time
  – Language definition time
    • Java: integers are bound to int, and real numbers are bound to float
  – Language implementation time
    • Binding real values to the IEEE 754 standard
  – Program writing time
    • Declaration of variables
  – Compile/Load time
    • Binding static objects to the stack or to fixed memory
    • Executable code is assigned to a memory block
  – Run time
    • Values are bound to variables

Type Binding (con’t)
• Early Binding
  – An element is bound to a property as early as possible
  – The earlier the binding, the more efficient the language
• Late Binding
  – Delay binding until the last possible time
  – The later the binding, the more flexible the language
  – Supports overloading and overriding in object-oriented languages
  – C++ example?
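Late binding through overriding can be sketched in Python, where every method call is bound at run time. The classes `Shape` and `Square` and the function `describe` are hypothetical names used only for this illustration; a C++ version would rely on virtual functions instead.

```python
class Shape:
    def area(self):
        return 0.0

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        # Overrides Shape.area
        return self.side * self.side

def describe(shape):
    # Which area() runs is decided at run time from the object's class,
    # not from anything known when describe() was written.
    return shape.area()
```

Calling `describe(Shape())` and `describe(Square(3))` runs two different method bodies through the same call site.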

Type Checking
• Type checking is the activity of ensuring that the operands of an operator are of compatible types
• A compatible type is one that is either legal for the operator, or is allowed under language rules to be implicitly converted, by compiler-generated code, to a legal type
• If all type bindings are static, nearly all type checking can be static
• If type bindings are dynamic, type checking must be dynamic

Type Conversion
• A narrowing conversion is one that converts an object to a type that cannot include all of the values of the original type, e.g., float to int
• A widening conversion is one in which an object is converted to a type that can include at least approximations to all of the values of the original type, e.g., int to float
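Both directions can be observed in Python. The sketch assumes the usual IEEE 754 double-precision representation of `float`; the specific values are chosen here for illustration. Narrowing drops the fractional part, and widening a large integer yields only an approximation.

```python
# Narrowing: float to int discards the fractional part.
narrowed = int(3.99)

# Widening: int to float keeps the magnitude but not always the exact value.
# 2**53 + 1 has no exact double-precision representation (IEEE 754 assumed).
big = 2 ** 53 + 1
widened = float(big)
round_trip_exact = int(widened) == big
```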

Type Conversion (con’t)
• Implicit type conversion (Coercion)
  – Decreases the ability to detect type errors
  – In most languages, all numeric types are coerced in expressions, using widening conversions
  – Ada has no implicit conversion

Type Conversion (con’t)
  – C
      double d; long l; int i;
      …
      d = i; l = i;
      if (d == l) d = 2 * l;
  – Java
      int x; double d;
      x = 5;
      d = x + 2;

Type Conversion (con’t)
• Explicit type conversion (Casting)
  – ( type-name ) cast-expression
    • C
        double d = 3.14; int i = (int) d;
    • Java
        boolean t = true; byte b = (byte) (t ? 1 : 0);
    • Ada (similar to a function call)
        3 * Integer(2.0)
        2.0 + Float(2)

Semantic Domains
• Semantic Domain
  – A set with well-defined properties and operations
  – Environment
    • A set of pairs <variable, location>
  – Memory
    • A set of pairs <location, value>
• State
  – Product of the environment and its memory
      σ = {<Var1, Val1>, <Var2, Val2>, …, <Varn, Valn>}

Semantic Domains (con’t)
• Three ways to define the meaning of a program
  – Operational Semantics
    • A program is interpreted as a set of sequences of computational steps
    • A set of execution rules of the form Premise -> Conclusion
        σ(x) => 4 and σ(y) => 2 -> σ(x+y) => 6

Semantic Domains (con’t)
• Three ways to define the meaning of a program
  – Operational Semantics (con’t)
    • Usage
      – Language manuals and textbooks
      – Teaching programming languages
    • Structural: defines program behavior in terms of the behavior of its parts
    • Natural: defines program behavior in terms of its overall effects, and not from its single steps
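An execution rule such as "σ(x) => 4 and σ(y) => 2 -> σ(x+y) => 6" can be read as one case of a small evaluator over a state σ. The sketch below is only an illustration: the tuple encoding of expressions and the name `evaluate` are assumptions made here, not notation from the course.

```python
def evaluate(expr, sigma):
    """Evaluate expr in state sigma.

    expr is an int literal, a variable name (str),
    or a tuple ('+', left, right). This encoding is hypothetical.
    """
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        # Look the variable up in the state.
        return sigma[expr]
    op, left, right = expr
    if op == "+":
        # Premises: evaluate both operands; conclusion: their sum.
        return evaluate(left, sigma) + evaluate(right, sigma)
    raise ValueError("unsupported operator: " + op)

sigma = {"x": 4, "y": 2}
result = evaluate(("+", "x", "y"), sigma)
```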

Semantic Domains (con’t)
  – Axiomatic Semantics
    • The program does what it is supposed to do
    • Agreement of the program result and the specification
    • Formal verification of a program using logic expressions (assertions)
    • Hoare triple
        {Pre-condition} s {Post-condition}
    • Example
        {a = 2} b = a; {b = 2}
    • Weakest pre-condition
        {?} a = b+1; {a > 1}
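A Hoare triple can be spot-checked by running the statement on every state in a small finite range. This is a sketch under stated assumptions: the helper `holds` and the state encoding are invented here, and a passing check over a finite range is evidence, not a proof. For the second triple, substituting b+1 for a in the post-condition gives b+1 > 1, that is b > 0, as the candidate; the check also shows that the weaker b >= 0 does not work.

```python
def holds(pre, stmt, post, states):
    """True if every state satisfying pre satisfies post after stmt runs."""
    for state in states:
        if pre(state):
            after = stmt(dict(state))  # run on a copy of the state
            if not post(after):
                return False
    return True

# All states with a and b in -5..5 (a small, arbitrary test range).
states = [{"a": a, "b": b} for a in range(-5, 6) for b in range(-5, 6)]

def copy_a_to_b(s):
    s["b"] = s["a"]          # b = a;
    return s

def assign_a(s):
    s["a"] = s["b"] + 1      # a = b+1;
    return s

# {a = 2} b = a; {b = 2}
triple_ok = holds(lambda s: s["a"] == 2, copy_a_to_b,
                  lambda s: s["b"] == 2, states)

# {b > 0} a = b+1; {a > 1} passes, while {b >= 0} does not (b = 0 gives a = 1).
candidate_ok = holds(lambda s: s["b"] > 0, assign_a,
                     lambda s: s["a"] > 1, states)
weaker_fails = not holds(lambda s: s["b"] >= 0, assign_a,
                         lambda s: s["a"] > 1, states)
```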

Semantic Domains (con’t)
  – Axiomatic Semantics (con’t)
    • Axioms
      – Rule of Consequence
      – Rule of Conjunction
      – Rule of Assignment (s : b = a)
      – Rule of Sequence
      – Rule of Condition (s : if c then a else b)

Semantic Domains (con’t)
  – Axiomatic Semantics (con’t)
    • Axioms
      – Rule of Loop (s : while c do b end)
      – I is the loop invariant
      – The loop invariant is true before the loop, at the bottom of the loop in each iteration, and when the loop terminates
      – Find the loop invariant to prove the correctness of the loop
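The three places where the invariant must hold can be made visible with assertions. The loop below is a made-up example (computing 2 to the power n for n >= 0), with the invariant p == 2 ** i; it is a sketch for illustration, not a loop taken from the slides.

```python
def power_of_two(n):
    """Compute 2**n for n >= 0 by repeated doubling."""
    i, p = 0, 1
    assert p == 2 ** i            # invariant holds before the loop
    while i < n:
        p = p * 2
        i = i + 1
        assert p == 2 ** i        # invariant holds at the bottom of each iteration
    # On exit the loop condition is false, so i == n; with the invariant
    # this gives the post-condition p == 2 ** n.
    assert i == n and p == 2 ** n
    return p
```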

Semantic Domains (con’t)
  – Denotational Semantics
    • Defines the meaning of a statement as a state-transforming mathematical function
    • A state of a program indicates the current values of the active objects
    • Example
      – Denotational semantics of integer arithmetic expressions
        » Production rules:
            Number ::= N D | D
            Digit ::= 0 | 1 | … | 9
            Expression ::= E1 + E2 | E1 - E2 | E1 * E2 | E1 / E2 | (E) | N

Semantic Domains (con’t)
  – Denotational Semantics (con’t)
    – Semantic domain:
        Integer = {…, -1, 0, 1, …}
    – Semantic functions:
        Value: Number => Number
        Digit: Digit => Number
        Expr: Expression => Integer
    – Auxiliary functions:
        plus: Number × Number => Number
        …
    – Semantic equations:
        Expr[[E1 + E2]] = plus(Expr[[E1]], Expr[[E2]])
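The semantic functions can be written down almost directly as recursive Python functions. In this sketch the encoding is an assumption made for illustration: numerals are digit strings, compound expressions are tuples, and `times` is an extra auxiliary function added next to `plus`.

```python
def digit(d):
    # Digit: maps the symbol '0'..'9' to the number it denotes.
    return "0123456789".index(d)

def value(numeral):
    # Number ::= N D | D
    if len(numeral) == 1:
        return digit(numeral)
    return 10 * value(numeral[:-1]) + digit(numeral[-1])

def plus(a, b):
    return a + b

def times(a, b):
    return a * b

def expr(e):
    # Expr: a numeral string, or a tuple (op, E1, E2). Hypothetical encoding.
    if isinstance(e, str):
        return value(e)
    op, e1, e2 = e
    if op == "+":
        return plus(expr(e1), expr(e2))    # Expr[[E1 + E2]]
    if op == "*":
        return times(expr(e1), expr(e2))   # Expr[[E1 * E2]]
    raise ValueError("unsupported operator: " + op)

meaning = expr(("+", "12", ("*", "3", "4")))
```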

Data Types
• Elements of a data type
  – Set of possible values
  – Set of operations
  – Internal representation
  – External representation
• Type information
  – Implicit
    • 5 is implicitly integer
    • I is integer, implicitly, in Fortran
  – Explicit
    • Using variable or function declaration

Data Types (con’t)
• Data type classifications
  – Built-in
    • Included in the language definition
  – Primitive
  – Composite
  – Recursive
  – User-defined
    • Data types defined by users
    • Declared and defined before usage

Primitive Data Types
• Unstructured and indivisible entities
• Integer, Real, Boolean, Char
• Depend on the language's application domain
  – COBOL: fixed-length strings and fixed-point numbers
  – SNOBOL: strings of varying length
  – Scheme: integer, rational, real, complex

Primitive Data Types (con’t)
• Example
  – C
    • int, float, char
  – Java
    • int, float, char, boolean
  – Pascal
    • Integer, Char, Real, Longint
  – ML
    • bool, real, int, word, char
  – Scheme
    • integer?, real?, boolean?, char?

Primitive Data Types (con’t)
• Integer
  – Almost always an exact reflection of the hardware, so the mapping is trivial
  – There may be as many as eight different integer types in a language
  – Java’s signed integer sizes: byte, short, int, long

Primitive Data Types (con’t)
• Float
  – Models real numbers, but only as approximations
  – Languages for scientific use support at least two floating-point types (e.g., float and double); sometimes more
  – Usually exactly like the hardware, but not always
  – IEEE Floating-Point Standard 754

Primitive Data Types (con’t)
• Complex
  – Some languages support a complex type, e.g., C99, Fortran, and Python
  – Each value consists of two floats, the real part and the imaginary part
  – Literal form (in Python): (7 + 3j), where 7 is the real part and 3 is the imaginary part

Primitive Data Types (con’t)
• Decimal
  – For business applications (money)
    • Essential to COBOL
    • C# offers a decimal data type
  – Stores a fixed number of decimal digits, in coded form (BCD, Binary-Coded Decimal)
  – Advantage: accuracy
  – Disadvantages: limited range, wastes memory

Primitive Data Types (con’t)
• Boolean
  – Simplest of all
  – Range of values: two elements, one for "true" and one for "false"
  – Could be implemented as bits, but often as bytes

Primitive Data Types (con’t)
• Character
  – Stored as numeric codings
  – Most commonly used coding: ASCII
  – An alternative, 16-bit coding: Unicode (UCS-2) (Universal Character Set)
    • Includes characters from most natural languages
    • Originally used in Java
    • C# and JavaScript also support Unicode
  – 32-bit Unicode (UCS-4)
    • Supported by Fortran, starting with 2003

Composite Data Types
• Structured or compound types
• Array, String, Enumeration, Pointer, Record, List, Function
• Homogeneous, like Array
• Heterogeneous, like Record
• Fixed size, like Array
• Dynamic size, like Linked List
• Inside the core or as a separate library
![Composite Data Types cont Example C Array Pointer Struct enum Composite Data Types (con’t) • Example –C • Array ([]), Pointer (*), Struct, enum](https://slidetodoc.com/presentation_image_h/7509949c6940b35506d7a759d694fae2/image-36.jpg)
Composite Data Types (con’t)
• Example
  – C
    • Array ([]), Pointer (*), Struct, enum
  – Java
    • String, Array
  – Pascal
    • Record, Array, Pointer (^)

Composite Data Types (con’t)
• String
  – C and C++
    • Not primitive
    • Use char arrays and a library of functions that provide operations
  – SNOBOL4 (a string manipulation language)
    • Primitive
    • Many operations, including elaborate pattern matching
  – Fortran and Python
    • Primitive type with assignment and several operations
  – Java
    • Primitive via the String class
  – Perl, JavaScript, Ruby, and PHP
    • Provide built-in pattern matching, using regular expressions

Composite Data Types (con’t)
• String length options
  – Static: COBOL, Java’s String class
  – Limited dynamic length: C and C++
    • In these languages, a special character is used to indicate the end of a string’s characters, rather than maintaining the length
  – Dynamic (no maximum): SNOBOL4, Perl, JavaScript
  – Ada supports all three string length options

Composite Data Types (con’t)
• String Implementation
  – Static length: compile-time descriptor
  – Limited dynamic length: may need a run-time descriptor for length (but not in C and C++)
  – Dynamic length: needs a run-time descriptor; allocation/de-allocation is the biggest implementation problem

Composite Data Types (con’t)
• Enumeration
  – All possible values, which are named constants, are provided in the definition
  – C# example
      enum days {mon, tue, wed, thu, fri, sat, sun};
  – Design issues
    • Is an enumeration constant allowed to appear in more than one type definition, and if so, how is the type of an occurrence of that constant checked?
    • Are enumeration values coerced to integer?
    • Is any other type coerced to an enumeration type?

Composite Data Types (con’t)
• Enumeration (con’t)
  – Aid to readability, e.g., no need to code a color as a number
      enum Colors {Red, Blue, Green, Yellow};
  – Aid to reliability, e.g., the compiler can check:
    • Operations (don’t allow colors to be added)
    • No enumeration variable can be assigned a value outside its defined range
  – Ada, C#, and Java 5.0 provide better support for enumeration than C++ because enumeration type variables in these languages are not coerced into integer types

Composite Data Types (con’t)
• Sub-range Types
  – An ordered contiguous subsequence of an ordinal type
    • Example: 12..18 is a sub-range of integer type
  – Ada’s design
      type Days is (mon, tue, wed, thu, fri, sat, sun);
      subtype Weekdays is Days range mon..fri;
      subtype Index is Integer range 1..100;
      Day1: Days;
      Day2: Weekdays;
      Day2 := Day1;

Composite Data Types (con’t)
• Enumeration and Sub-range implementation
  – Enumeration types are implemented as integers
  – Sub-range types are implemented like the parent types, with code inserted (by the compiler) to restrict assignments to sub-range variables

Composite Data Types (con’t)
• Array
  – An array is an aggregate of homogeneous data elements in which an individual element is identified by its position in the aggregate, relative to the first element
  – A heterogeneous array is one in which the elements need not be of the same type
    • Supported by Perl, Python, JavaScript, and Ruby

Composite Data Types (con’t)
• Array Index Type
  – FORTRAN, C: integer only
  – Ada: integer or enumeration (includes Boolean and char)
  – Java: integer types only
  – Index range checking
    • C, C++, Perl, and Fortran do not specify range checking
    • Java, ML, C# specify range checking
    • In Ada, the default is to require range checking, but it can be turned off
![Composite Data Types cont Array Initialization Cbased languages int list Composite Data Types (con’t) • Array Initialization – C-based languages int list [] =](https://slidetodoc.com/presentation_image_h/7509949c6940b35506d7a759d694fae2/image-46.jpg)
Composite Data Types (con’t)
• Array Initialization
  – C-based languages
      int list [] = {1, 3, 5, 7};
      char *names [] = {"Mike", "Fred", "Mary Lou"};
  – Ada
      List : array (1..5) of Integer := (1 => 17, 3 => 34, others => 0);
  – Python list comprehensions
      list = [x ** 2 for x in range(12) if x % 3 == 0]
    puts [0, 9, 36, 81] in list

Composite Data Types (con’t)
• Array Operations
  – APL provides the most powerful array processing operations for vectors and matrices, as well as unary operators (for example, to reverse column elements)
  – Ada allows array assignment and also concatenation
  – Python provides array assignment, but it only changes references. Python also supports array concatenation and element membership operations

Composite Data Types (con’t)
• Array Operations (con’t)
  – Ruby also provides array concatenation
  – Fortran provides operations that are called elemental because they apply between pairs of array elements
  – For example, the + operator between two arrays results in an array of the sums of the element pairs of the two arrays

Composite Data Types (con’t)
• Rectangular and Jagged Arrays
  – A rectangular array is a multi-dimensional array in which all of the rows have the same number of elements and all columns have the same number of elements
  – A jagged matrix has rows with varying numbers of elements
    • Possible when multi-dimensional arrays actually appear as arrays of arrays
  – C, C++, and Java support jagged arrays
  – Fortran, Ada, and C# support rectangular arrays (C# also supports jagged arrays)

Composite Data Types (con’t)
• Slices
  – A slice is some substructure of an array; nothing more than a referencing mechanism
  – Slices are only useful in languages that have array operations
  – Fortran 95
      Integer, Dimension (10) :: Vector
      Integer, Dimension (3, 3) :: Mat
      Integer, Dimension (3, 3, 4) :: Cube
    Vector (3:6) is a four-element array
  – Ruby supports slices with the slice method
      list.slice(2, 2) returns the third and fourth elements of list

Composite Data Types (con’t) 51

Composite Data Types (con’t)
• Array Access
  – The access function maps subscript expressions to an address in the array
  – Access function for single-dimensional arrays:
      address(list[k]) = address(list[lower_bound]) + ((k - lower_bound) * element_size)
  – Two common ways to lay out multi-dimensional arrays:
    • Row major order (by rows): used in most languages
    • Column major order (by columns): used in Fortran
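The access function is plain arithmetic, so it can be sketched as ordinary functions. The two-dimensional versions below are an assumption added for illustration: they use zero-based subscripts and hypothetical parameter names, and differ only in whether a whole row or a whole column is skipped per step.

```python
def address_1d(base, k, lower_bound, element_size):
    # address(list[k]) = address(list[lower_bound]) + (k - lower_bound) * element_size
    return base + (k - lower_bound) * element_size

def address_row_major(base, i, j, n_cols, element_size):
    # Zero-based subscripts: skip i full rows, then j elements.
    return base + (i * n_cols + j) * element_size

def address_column_major(base, i, j, n_rows, element_size):
    # Zero-based subscripts: skip j full columns, then i elements.
    return base + (j * n_rows + i) * element_size
```

For a 3 x 4 array of 4-byte elements starting at address 1000, element (1, 2) sits at 1024 in row major order and at 1028 in column major order.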

Composite Data Types (con’t)
• Record
  – A record is a possibly heterogeneous aggregate of data elements in which the individual elements are identified by names
  – COBOL uses level numbers to show nested records; others use recursive definition
      01 EMP-REC.
         02 EMP-NAME.
            05 FIRST PIC X(20).
            05 MID PIC X(10).
            05 LAST PIC X(20).
         02 HOURLY-RATE PIC 99V99.

Composite Data Types (con’t)
• Record (con’t)
  – Ada
      type Emp_Rec_Type is record
        First: String (1..20);
        Mid: String (1..10);
        Last: String (1..20);
        Hourly_Rate: Float;
      end record;
      Emp_Rec: Emp_Rec_Type;

Composite Data Types (con’t)
• Record (con’t)
  – Pascal
      MonthType = (Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec);
      DateType = record
        Month : MonthType;
        Day : 1..31;
        Year : 1900..2000;
      end;
![Composite Data Types cont Record cont C struct studenttype char name20 int Composite Data Types (con’t) • Record (con’t) –C struct student_type { char name[20]; int](https://slidetodoc.com/presentation_image_h/7509949c6940b35506d7a759d694fae2/image-56.jpg)
Composite Data Types (con’t)
• Record (con’t)
  – C
      struct student_type {
        char name[20];
        int ID;
      };

Composite Data Types (con’t)
• Record (con’t)
  – Java: before version 16, Java has no record type; a record is defined using a class
      class Person {
        String name;
        int id_number;
        Date birthday;
        int age;
      }

Composite Data Types (con’t)
• Pointer and Reference Types
  – A pointer type variable has a range of values that consists of memory addresses and a special value, nil
  – Provide the power of indirect addressing
  – Provide a way to manage dynamic memory
  – A pointer can be used to access a location in the area where storage is dynamically created (usually called a heap)

Composite Data Types (con’t)
• Pointer Design Issues
  – What are the scope and lifetime of a pointer variable?
  – Are pointers restricted as to the type of value to which they can point?
  – Are pointers used for dynamic storage management, indirect addressing, or both?
  – Should the language support pointer types, reference types, or both?

Composite Data Types (con’t)
• Pointer Operations
  – Two fundamental operations: assignment and dereferencing
  – Assignment is used to set a pointer variable’s value to some useful address
  – Dereferencing yields the value stored at the location represented by the pointer’s value
    • Dereferencing can be explicit or implicit
    • C++ uses an explicit operation via *
        j = *ptr
      sets j to the value located at ptr

Composite Data Types (con’t)
• Pointer Illustration
  – The assignment operation j = *ptr

Composite Data Types (con’t)
• Pointer Problems
  – Dangling pointers (dangerous)
    • A pointer points to a heap-dynamic variable that has been de-allocated
  – Lost heap-dynamic variable
    • An allocated heap-dynamic variable that is no longer accessible to the user program (often called garbage)
      – Pointer p1 is set to point to a newly created heap-dynamic variable
      – Pointer p1 is later set to point to another newly created heap-dynamic variable
    • The process of losing heap-dynamic variables is called memory leakage

Composite Data Types (con’t)
• Pointer Problems (con’t)
  – Ada
    • Some dangling pointers are disallowed because dynamic objects can be automatically de-allocated at the end of the pointer's type scope
  – C, C++
    • Extremely flexible but must be used with care
    • Pointers can point at any variable regardless of when or where it was allocated
    • Used for dynamic storage management and addressing

Composite Data Types (con’t)
• Pointer Problems (con’t)
  – C, C++
    • Pointer arithmetic is possible
    • Explicit dereferencing and address-of operators
    • Domain type need not be fixed (void *)
      void * can point to any type and can be type checked (cannot be de-referenced)
![Composite Data Types cont Pointer Arithmetics in C C float stuff100 float p Composite Data Types (con’t) • Pointer Arithmetics in C, C++ float stuff[100]; float *p;](https://slidetodoc.com/presentation_image_h/7509949c6940b35506d7a759d694fae2/image-65.jpg)
Composite Data Types (con’t)
• Pointer Arithmetic in C, C++
      float stuff[100];
      float *p;
      p = stuff;
    *(p+5) is equivalent to stuff[5] and p[5]
    *(p+i) is equivalent to stuff[i] and p[i]

Composite Data Types (con’t)
• Reference Types
  – C++ includes a special kind of pointer type called a reference type that is used primarily for formal parameters
    • Advantages of both pass-by-reference and pass-by-value
  – Java extends C++’s reference variables and allows them to replace pointers entirely
    • References are references to objects, rather than being addresses
  – C# includes both the references of Java and the pointers of C++

Composite Data Types (con’t)
• Heap Management
  – A very complex run-time process
  – Single-size cells vs. variable-size cells
  – Two approaches to reclaim garbage
    • Reference counters (eager approach): reclamation is gradual
    • Mark-sweep (lazy approach): reclamation occurs when the list of available space becomes empty

Composite Data Types (con’t)
• Heap Management (con’t)
  – Reference counters
    • Maintain a counter in every cell that stores the number of pointers currently pointing at the cell
    • Disadvantages: space required, execution time required, complications for cells connected circularly
    • Advantage: it is intrinsically incremental, so significant delays in the application execution are avoided
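The counter bookkeeping, and the trouble with circularly connected cells, can be simulated in a few lines. This is a toy model under stated assumptions: `Cell`, `point_to`, `release`, and the `freed` log are names invented here, each cell holds at most one pointer, and "freeing" just records the cell's name.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.count = 0      # number of pointers currently pointing at this cell
        self.target = None  # the single cell this one points at, if any

freed = []  # names of reclaimed cells, in order of reclamation

def release(cell):
    # One pointer to cell has gone away.
    cell.count -= 1
    if cell.count == 0:
        freed.append(cell.name)
        if cell.target is not None:
            child, cell.target = cell.target, None
            release(child)  # reclamation cascades eagerly

def point_to(cell, new_target):
    # Redirect cell's pointer, updating both counters.
    old = cell.target
    if new_target is not None:
        new_target.count += 1
    cell.target = new_target
    if old is not None:
        release(old)

root = Cell("root")  # stands in for a program variable

# A chain root -> a -> b is reclaimed as soon as root lets go.
a, b = Cell("a"), Cell("b")
point_to(root, a)
point_to(a, b)
point_to(root, None)

# A cycle c <-> d keeps its counters above zero and is never reclaimed.
c, d = Cell("c"), Cell("d")
point_to(root, c)
point_to(c, d)
point_to(d, c)
point_to(root, None)
```

After the last call neither c nor d is reachable from root, yet each still has a count of 1, so neither appears in `freed`.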

Composite Data Types (con’t)
• Heap Management (con’t)
  – Mark-Sweep
    • The run-time system allocates storage cells as requested and disconnects pointers from cells as necessary; mark-sweep then begins
    • Every heap cell has an extra bit used by the collection algorithm
    • All cells are initially set to garbage
    • All pointers are traced into the heap, and reachable cells are marked as not garbage
    • All garbage cells are returned to the list of available cells
    • Disadvantages: in its original form, it was done too infrequently. When done, it caused significant delays in application execution. Contemporary mark-sweep algorithms avoid this by doing it more often; this is called incremental mark-sweep
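The mark and sweep phases listed above can be sketched over a toy heap. The representation is an assumption made for illustration: the heap is a dictionary from a cell name to the names it points at, and `mark_sweep` returns the set of garbage cells instead of relinking them into a free list.

```python
def mark_sweep(heap, roots):
    """Return the names of unreachable (garbage) cells.

    heap maps each cell name to the list of cell names it points at;
    roots lists the cells pointed at from outside the heap.
    """
    # Every cell starts out marked as garbage (extra bit = False).
    marked = {name: False for name in heap}

    # Mark phase: trace pointers from the roots into the heap.
    stack = list(roots)
    while stack:
        name = stack.pop()
        if not marked[name]:
            marked[name] = True
            stack.extend(heap[name])

    # Sweep phase: whatever is still unmarked is garbage.
    return {name for name, is_live in marked.items() if not is_live}

# a -> b is reachable from the root; c and d only point at each other.
heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
garbage = mark_sweep(heap, roots=["a"])
```

Because marking starts from the roots, cells that are connected circularly but unreachable end up unmarked and are collected.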

Recursive Data Types
• Recursive or circular data types
• Type composed from objects of the same type
• Example
  – Linked list in C and Pascal
  – ML
      datatype intlist = nil | cons of int * intlist
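The same shape can be written in Python as a class whose field refers to the class itself. This is a sketch with hypothetical names (`Cons`, `length`, `total`), where `None` plays the role of nil; functions over the type follow the recursion in its definition.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Cons:
    head: int
    tail: Optional["Cons"]  # None plays the role of nil

def length(lst):
    # One case per alternative of the type: nil, or cons of head and tail.
    return 0 if lst is None else 1 + length(lst.tail)

def total(lst):
    return 0 if lst is None else lst.head + total(lst.tail)

numbers = Cons(5, Cons(10, None))  # a two-element list
```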

Exercises
1. Determine which of the following programming languages are statically typed and which are not (explain by example):
  – Ada
  – Perl
  – Python
  – Haskell
  – Prolog
  – Fortran
  – Ruby

Exercises
2. Give another example of a type error in C.
3. Show two examples of early and late binding in a language.
4. Is there any programming language which does not allow implicit type conversion, say int to float?
5. Which type of coercion is not safe?
6. Compute the weakest pre-condition of
     {?} a = b * -1; {a > 10}

Exercises
2. Using an example, show the rule of consequence in axiomatic semantics.
3. Find the loop invariant of the following while loop.
     i = 1; s = 0;
     while (i <= 10) {
       s = s + i;
       i = i + 1;
     }

Exercises
7. Which programming languages, other than Ada and the different versions of C, support pointers?
8. What are the rules of call-by-value and call-by-reference in Pascal? Give examples.
9. Name two programming languages which have automatic garbage collection. What are the negative and positive effects of this operation in a language?