UNIT II Variables: Names, Bindings, Type Checking and Scope 1 IFETCE/ME CSE/I YEAR/II SEM/CP 7203/PPL/UNIT 2/PPT/VER 1. 0
Introduction This chapter introduces the fundamental semantic issues of variables. – It covers the nature of names and special words in programming languages, attributes of variables, and the concepts of binding and binding times. – It investigates type checking, strong typing and type compatibility rules. – At the end it discusses named constants and variable initialization techniques.
Names • Design issues: Maximum length? Are connector characters allowed? Are names case sensitive? Are special words reserved words or keywords? • Length: FORTRAN I: maximum 6; COBOL: maximum 30; FORTRAN 90 and ANSI C: maximum 31; Ada: no limit, and all are significant; C++: no limit, but implementors often impose one • Connectors: Pascal, Modula-2, and FORTRAN 77 don't allow them; others do
Case sensitivity • Foo = foo? • The first languages had only upper case • Case sensitivity was probably introduced by Unix and hence C • Disadvantage: poor readability, since names that look alike to a human are different; worse in Modula-2 because predefined names are mixed case (e.g. WriteCard) • Advantages: larger namespace, ability to use case to signify classes of variables (e.g., make constants uppercase) • C, C++, Java, and Modula-2 names are case sensitive, but the names in many other languages are not
Special words Def: A keyword is a word that is special only in certain contexts – Disadvantage: poor readability – Advantage: flexibility Def: A reserved word is a special word that cannot be used as a user-defined name
Variables • A variable is an abstraction of a memory cell • Variables can be characterized as a 6-tuple of attributes: Name: identifier; Address: memory location(s); Value: particular value at a moment; Type: range of possible values; Lifetime: when the variable exists (is bound to storage); Scope: where in the program it can be accessed
Variables • Name - not all variables have them (examples?) • Address - the memory address with which it is associated • A variable may have different addresses at different times during execution • A variable may have different addresses at different places in a program • If two (or more) variable names can be used to access the same memory location, they are called aliases • Aliases are harmful to readability, but they are useful under certain circumstances
Aliases • How aliases can be created: pointers, reference variables, Pascal variant records, C and C++ unions, and FORTRAN EQUIVALENCE (and through parameters, discussed in Chapter 8) • Some of the original justifications for aliases are no longer valid, e.g. memory reuse in FORTRAN; dynamic allocation can replace them
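As a small illustration of aliasing, the following Python sketch binds two names to the same list object. The names `scores`, `backup` and `independent` are made up for the example and do not come from the slides.

```python
# Two names bound to the same list object are aliases.
scores = [90, 85, 70]
backup = scores            # no copy is made: backup aliases scores
backup[0] = 0              # a write through one name ...
print(scores[0])           # ... is visible through the other
print(scores is backup)    # identity test: both names denote one object

# An explicit copy creates an independent object instead of an alias.
independent = list(scores)
independent[1] = -1
print(scores[1])           # unchanged by the write to the copy
```

The readability cost mentioned above shows up in the third line: nothing in `backup[0] = 0` hints that `scores` changes as well.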
Variables: Type and Value • Type - determines the range of values of variables and the set of operations that are defined for values of that type; in the case of floating point, type also determines the precision • Value - the contents of the location with which the variable is associated • Abstract memory cell - the physical cell or collection of cells associated with a variable
lvalue and rvalue Are the two occurrences of "a" in this expression the same? a := a + 1; In a sense, • The a on the left of the assignment refers to the location of the variable whose name is a; • The a on the right of the assignment refers to the value of the variable whose name is a. We sometimes speak of a variable's lvalue and rvalue • The lvalue of a variable is its address • The rvalue of a variable is its value
Binding Def: A binding is an association, such as between an attribute and an entity, or between an operation and a symbol. Def: Binding time is the time at which a binding takes place. Possible binding times: – Language design time, e.g., bind operator symbols to operations – Language implementation time, e.g., bind floating point type to a representation – Compile time, e.g., bind a variable to a type in C or Java – Link time – Load time, e.g., bind a FORTRAN 77 variable (or a C static variable) to a memory cell – Run time, e.g., bind a nonstatic local variable to a memory cell
Type Bindings • Def: A binding is static if it occurs before run time and remains unchanged throughout program execution. • Def: A binding is dynamic if it occurs during execution or can change during execution of the program. • Type binding issues • How is a type specified? • When does the binding take place? • If static, type may be specified by either an explicit or an implicit declaration
Variable Declarations Def: An explicit declaration is a program statement used for declaring the types of variables. Def: An implicit declaration is a default mechanism for specifying types of variables (the first appearance of the variable in the program) – E.g.: in Perl, variables of type scalar, array and hash begin with a $, @ or %, respectively. – E.g.: in Fortran, variables beginning with I-N are assumed to be of type integer. – E.g.: ML (and other languages) use sophisticated type inference mechanisms. Advantages: writability, convenience. Disadvantages: reliability
Dynamic Type Binding • The type of a variable can change during the course of the program and, in general, is re-determined on every assignment. • Usually associated with languages first implemented via an interpreter rather than a compiler. • Specified through an assignment statement, e.g. in APL: LIST <- 2 4 6 8 followed by LIST <- 17.3 23.5 • Advantages: • Flexibility • Obviates the need for "polymorphic" types • Development of generic functions (e.g. sort) • Disadvantages: • High cost (dynamic type checking and interpretation) • Type error detection by the compiler is difficult
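Python also binds types dynamically, so the behaviour described above can be sketched directly. The variable `items` and the helper `total_length` are illustrative names only.

```python
# The name `items` is rebound to values of different types;
# its type is whatever the most recent assignment gave it.
items = [2, 4, 6, 8]
first_type = type(items).__name__
items = 17.3
second_type = type(items).__name__
print(first_type, second_type)

# The cost: a type error surfaces only when the statement actually runs.
def total_length(value):
    return len(value)      # fails at run time if value has no length

try:
    total_length(items)    # items is a float at this point
    caught = False
except TypeError:
    caught = True
print(caught)
```

No compiler pass rejects the call to `total_length`; the mismatch is found only by executing it, which is the error-detection disadvantage listed on the slide.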
Type Inferencing • Type inferencing is used in some programming languages, including ML, Miranda, and Haskell. • Types are determined from the context of the reference, rather than just by an assignment statement. • Legal:
    fun circumf(r) = 3.14159 * r;    // infer r is real
    fun times10(x) = 10 * x;         // infer x is integer
• Illegal (in early ML; Standard ML resolves the overloaded * to int by default):
    fun square(x) = x * x;           // can't deduce anything
• Fixed:
    fun square(x) : int = x * x;     // use explicit declaration
Storage Bindings and Lifetime • Storage Bindings • Allocation - getting a cell from some pool of available cells • Deallocation - putting a cell back into the pool • Def: The lifetime of a variable is the time during which it is bound to a particular memory cell • Categories of variables by lifetimes • Static • Stack dynamic • Explicit heap dynamic • Implicit heap dynamic
Static Variables • Static variables are bound to memory cells before execution begins and remain bound to the same memory cell throughout execution. • Examples: • all FORTRAN 77 variables • C static variables • Advantages: efficiency (direct addressing), history-sensitive subprogram support • Disadvantages: lack of flexibility, no recursion!
Stack-Dynamic Variables • Stack-dynamic variables - storage bindings are created for variables when their declaration statements are elaborated. • If scalar, all attributes except address are statically bound – e.g. local variables in Pascal and C subprograms • Advantages: – allows recursion – conserves storage • Disadvantages: – Overhead of allocation and deallocation – Subprograms cannot be history sensitive – Inefficient references (indirect addressing)
Explicit Heap-Dynamic Variables • Explicit heap-dynamic variables are allocated and deallocated by explicit directives, specified by the programmer, which take effect during execution • Referenced only through pointers or references • e.g. dynamic objects in C++ (via new and delete), all objects in Java • Advantage: provides for dynamic storage management • Disadvantage: inefficient and unreliable • Example:
    int *intnode;
    ...
    intnode = new int;
    ...
    delete intnode;
Implicit Heap-Dynamic Variables • Implicit heap-dynamic variables - allocation and deallocation are caused by assignment statements, and types are not determined until assignment, e.g. all variables in APL • Advantage: – flexibility • Disadvantages: – Inefficient, because all attributes are dynamic – Loss of error detection
Type Checking • Generalize the concept of operands and operators to include subprograms and assignments • Type checking is the activity of ensuring that the operands of an operator are of compatible types • A compatible type is one that is either legal for the operator, or is allowed under language rules to be implicitly converted, by compiler-generated code, to a legal type. • This automatic conversion is called a coercion. • A type error is the application of an operator to an operand of an inappropriate type • Note: If all type bindings are static, nearly all checking can be static. If type bindings are dynamic, type checking must be dynamic
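The difference between a coercion and a type error can be seen in a few lines of Python, where the checking happens dynamically. This is only a sketch; the variable names are illustrative.

```python
# int + float: the int operand is implicitly converted (a coercion).
result = 2 + 3.5
print(type(result).__name__)

# str + int: no coercion is defined for this pair, so the check fails
# when the expression is evaluated.
try:
    "total: " + 5
    type_error_raised = False
except TypeError:
    type_error_raised = True
print(type_error_raised)

# An explicit conversion, written by the programmer, is not a coercion.
message = "total: " + str(5)
print(message)
```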
Strong Typing A programming language is strongly typed if • type errors are always detected • There is strict enforcement of type rules with no exceptions. • All types are known at compile time, i.e. are statically bound. • With variables that can store values of more than one type, incorrect type usage can be detected at run-time. • Strong typing catches more errors at compile time than weak typing, resulting in fewer run-time exceptions.
Which languages have strong typing? • Fortran 77 isn't, because it doesn't check parameters and because of variable EQUIVALENCE statements. • Ada, Java, and Haskell are strongly typed. • Pascal is (almost) strongly typed, but variant records undermine it. • C and C++ are sometimes described as strongly typed, but are perhaps better described as weakly typed because parameter type checking can be avoided (how?) and unions are not type checked • Coercion rules strongly affect strong typing; they can weaken it considerably (C++ versus Ada)
Type Compatibility • Type compatibility by name means that two variables have compatible types if they are in either the same declaration or in declarations that use the same type name • Easy to implement but highly restrictive: • Subranges of integer types aren't compatible with integer types • Formal parameters must be the same type as their corresponding actual parameters (Pascal) • Type compatibility by structure means that two variables have compatible types if their types have identical structures • More flexible, but harder to implement
Type Compatibility Consider the problem of two structured types. Suppose they are circularly defined • Are two record types compatible if they are structurally the same but use different field names? • Are two array types compatible if they are the same except that the subscripts are different? (e.g. [1..10] and [-5..4]) • Are two enumeration types compatible if their components are spelled differently? With structural type compatibility, you cannot differentiate between types of the same structure (e.g. different units of speed, both float)
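The "units of speed" problem can be made concrete with a toy checker. The sketch below is not how any particular compiler works; `Kph`, `Mph`, `name_compatible` and `structure_compatible` are hypothetical names, and the structural rule chosen here (same field types in the same order, field names ignored) is just one possible answer to the questions above.

```python
from dataclasses import dataclass, fields

@dataclass
class Kph:          # speed in kilometres per hour
    value: float

@dataclass
class Mph:          # speed in miles per hour
    value: float

def name_compatible(a, b):
    """Name compatibility: both values must have the same named type."""
    return type(a) is type(b)

def structure_compatible(a, b):
    """Structural compatibility in this sketch: the same field types in
    the same order; field names are deliberately ignored."""
    shape_a = [f.type for f in fields(a)]
    shape_b = [f.type for f in fields(b)]
    return shape_a == shape_b

k, m = Kph(100.0), Mph(62.0)
print(name_compatible(k, m))        # the two named types differ
print(structure_compatible(k, m))   # but both are "one float"
```

Under the structural rule the two speeds are interchangeable, which is exactly the loss of distinction the slide warns about.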
Type Compatibility: Language examples • Pascal: usually structure, but in some cases name is used (formal parameters) • C: structure, except for records • Ada: restricted form of name – Derived types allow types with the same structure to be different – Anonymous types are all unique, even in: A, B : array (1..10) of INTEGER;
Variable Scope • The scope of a variable is the range of statements in a program over which it's visible • Typical cases: • Explicitly declared => local variables • Explicitly passed to a subprogram => parameters • The nonlocal variables of a program unit are those that are visible but not declared there. • Global variables => visible everywhere. • The scope rules of a language determine how references to names are associated with variables. • The two major schemes are static scoping and dynamic scoping
Static Scope • Also known as "lexical scope" • Based on program text and can be determined prior to execution (e.g., at compile time) • To connect a name reference to a variable, you (or the compiler) must find the declaration • Search process: search declarations, first locally, then in increasingly larger enclosing scopes, until one is found for the given name • Enclosing static scopes (to a specific scope) are called its static ancestors; the nearest static ancestor is called a static parent
Blocks • A block is a section of code in which local variables are allocated/deallocated at the start/end of the block. • Provides a method of creating static scopes inside program units • Introduced by ALGOL 60 and found in most PLs. • Variables can be hidden from a unit by having a "closer" variable with the same name • C++ and Ada allow access to these "hidden" variables
Examples of Blocks
C and C++:
    for (...) { int index; ... }
Ada:
    declare LCL : FLOAT; begin ... end
Common Lisp:
    (let ((a 1) (b foo) (c)) (setq a (* a a)) (bar a b c))
Static scoping example • MAIN calls A and B • A calls C and D • B calls A and E [Figure: nesting of the procedures MAIN, A, B, C, D and E, with the corresponding call graphs]
Evaluation of Static Scoping Suppose the spec is changed so that D must now access some data in B. Solutions: 1. Put D in B (but then C can no longer call it and D cannot access A's variables) 2. Move the data from B that D needs to MAIN (but then all procedures can access them). Same problem for procedure access! Overall: static scoping often encourages many globals
Dynamic Scope • Based on calling sequences of program units, not their textual layout (temporal versus spatial) • References to variables are connected to declarations by searching back through the chain of subprogram calls that forced execution to this point • Used in APL, Snobol and LISP – Note that these languages were all (initially) implemented as interpreters rather than compilers. • Consensus is that dynamic scoping leads to programs that are difficult to read and maintain. – Lisp switched to static scoping as its default circa 1980, though dynamic scoping is still available as an option.
Static vs. dynamic scope
    Define MAIN
        declare x
        Define SUB1
            declare x
            ...
            call SUB2
            ...
        Define SUB2
            ...
            reference x
            ...
        call SUB1
        ...
MAIN calls SUB1, SUB1 calls SUB2, SUB2 uses x • Static scoping - reference to x is to MAIN's x • Dynamic scoping - reference to x is to SUB1's x
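The MAIN/SUB1/SUB2 example can be replayed in Python. Python itself is statically scoped, so the first half uses ordinary nested functions; the second half only simulates dynamic scoping with an explicit stack of frames. The helpers `call_stack`, `lookup` and `call` are assumptions made for this sketch, not features of any language mentioned on the slides.

```python
# Static (lexical) scoping: Python resolves x from the program text.
def main_static():
    x = "MAIN's x"
    def sub2():
        return x               # nearest enclosing declaration is MAIN's
    def sub1():
        x = "SUB1's x"         # local to sub1; not visible inside sub2
        return sub2()
    return sub1()

# Dynamic scoping, simulated with an explicit stack of call frames.
call_stack = []                # each entry: dict of one call's locals

def lookup(name):
    # Search from the most recent call back toward the oldest one.
    for frame in reversed(call_stack):
        if name in frame:
            return frame[name]
    raise NameError(name)

def call(local_vars, body):
    call_stack.append(local_vars)
    try:
        return body()
    finally:
        call_stack.pop()

def sub2_dynamic():
    return call({}, lambda: lookup("x"))

def sub1_dynamic():
    return call({"x": "SUB1's x"}, sub2_dynamic)

def main_dynamic():
    return call({"x": "MAIN's x"}, sub1_dynamic)

print(main_static())    # MAIN's x
print(main_dynamic())   # SUB1's x
```

The same reference to x yields a different variable depending only on who called SUB2, which is why the slides describe dynamic scoping as temporal rather than spatial.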
Evaluation of Dynamic Scoping • Advantage: convenience • Disadvantage: poor readability
Scope vs. Lifetime • While these two issues seem related, they can differ • In Pascal, the scope of a local variable and the lifetime of a local variable seem the same • In C/C++, a local variable in a function might be declared static; its lifetime then extends over the entire execution of the program, so even while it is inaccessible (outside the function) it is still in memory
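Python has no `static` locals, but a closure gives a comparable picture of a variable whose lifetime outlasts the region where it is in scope. This is an analogy rather than an equivalent of the C feature; `make_counter`, `next_value` and `count` are illustrative names.

```python
def make_counter():
    count = 0                  # scope: only code inside make_counter
    def next_value():
        nonlocal count         # refers to the enclosing count
        count += 1
        return count
    return next_value

counter = make_counter()       # make_counter has returned ...
counter()
counter()
third = counter()              # ... yet count is still alive and updated
print(third)
```

Outside `make_counter` the name `count` cannot be referenced, but its storage persists between calls, so the function is history sensitive in the sense used on the earlier slides.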
Referencing Environments • The referencing environment of a statement is the collection of all names that are visible in the statement • In a static-scoped language, that is the local variables plus all of the visible variables in all of the enclosing scopes. • A subprogram is active if its execution has begun but has not yet terminated • In a dynamic-scoped language, the referencing environment is the local variables plus all visible variables in all active subprograms.
Named Constants • A named constant is a variable that is bound to a value only when it is bound to storage. • The value of a named constant can't be changed while the program is running. • The binding of values to named constants can be either static (called manifest constants) or dynamic • Languages: Pascal: literals only; Modula-2 and FORTRAN 90: constant-valued expressions; Ada, C++, and Java: expressions of any kind • Advantages: increased readability and modifiability without loss of efficiency
Example in Pascal
Without a named constant:
    procedure example;
    type a1 = array [1..100] of integer;
         a2 = array [1..100] of real;
    ...
    begin
      ...
      for i := 1 to 100 do begin ... end;
      ...
      for j := 1 to 100 do begin ... end;
      ...
      avg := sum div 100;
      ...
With a named constant:
    procedure example;
    const MAX = 100;
    type a1 = array [1..MAX] of integer;
         a2 = array [1..MAX] of real;
    ...
    begin
      ...
      for i := 1 to MAX do begin ... end;
      ...
      for j := 1 to MAX do begin ... end;
      ...
      avg := sum div MAX;
      ...
Variable Initialization • For convenience, variable initialization can occur prior to execution • FORTRAN: INTEGER SUM followed by DATA SUM /0/ • Ada: Sum : Integer := 0; • ALGOL 68: int first := 10; • Java: int num = 5; • LISP: (let (x y (z 10) (sum 0)) ...)
DATA TYPES This chapter introduces the concept of a data type and discusses: – Characteristics of the common primitive data types – Character strings – User-defined data types – Design of enumerations and sub-range data types – Design of structured data types including arrays, records, unions and set types – Pointers and heap management
Data Types • Every PL needs a variety of data types in order to better model/match the world • More data types make programming easier, but too many data types might be confusing • Which data types are most common? Which data types are necessary? Which data types are uncommon yet useful? • How are data types implemented in the PL?
Evolution of Data Types • FORTRAN I (1956) - INTEGER, REAL, arrays • Ada (1983) - User can create a unique type for every category of variables in the problem space and have the system enforce the types • Def: A descriptor is the collection of the attributes of a variable • Design issues for all data types: 1. What is the syntax of references to variables? 2. What operations are defined and how are they specified?
Primitive Data Types These types are supported directly in the hardware of the machine and not defined in terms of other types: – Integer: Short Int, Integer, Long Int (etc.) – Floating Point: Real, Double Precision; stored in 3 parts: sign bit, exponent and mantissa (see Fig 5.1, page 199) – Decimal: BCD (1 digit per 1/2 byte); used in business languages with a set decimal point for dollars and cents – Boolean: (TRUE/FALSE, 1/0, T/NIL) – Character: using EBCDIC, ASCII, UNICODE, etc.
Floating Point • Model real numbers, but only as approximations • Languages for scientific use support at least two floating-point types; sometimes more • Usually exactly like the hardware, but not always; some languages allow accuracy specs in code, e.g. (Ada)
    type SPEED is digits 7 range 0.0..1000.0;
    type VOLTAGE is delta 0.1 range -12.0..24.0;
• IEEE Floating Point Standard 754 • Single precision: 32-bit representation with 1-bit sign, 8-bit exponent, 23-bit mantissa • Double precision: 64-bit representation with 1-bit sign, 11-bit exponent, 52-bit mantissa
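The single-precision layout listed above (1 sign bit, 8 exponent bits, 23 mantissa bits) can be inspected with Python's standard `struct` module. The function name `decode_single` is made up for this sketch.

```python
import struct

def decode_single(value):
    """Split a number, stored as IEEE 754 single precision, into fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 bits of mantissa
    return sign, exponent, fraction

# -0.15625 is -1.25 * 2**-3, so it is exactly representable.
sign, exponent, fraction = decode_single(-0.15625)
print(sign, exponent - 127, hex(fraction))

# Most decimal fractions are only approximated.
print(0.1 + 0.2 == 0.3)
```

The last line prints False, a direct consequence of the "only as approximations" point on the slide.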
Decimal and Boolean • Decimal – For business applications (money) – Store a fixed number of decimal digits (coded) – Advantage: accuracy – Disadvantages: limited range, wastes memory • Boolean – Could be implemented as bits, but often as bytes – Advantage: readability
Character Strings • Characters are another primitive data type, one that maps easily onto integers. • We've evolved through several basic encodings for characters: – 60s – 70s: EBCDIC (Extended Binary Coded Decimal Interchange Code) - uses eight bits to represent characters – 70s – 00s: ASCII (American Standard Code for Information Interchange) - uses seven bits to represent 128 possible "characters" – 90s onward: Unicode - originally used 16 bits to represent ~64K different characters, and has since been extended beyond 16 bits. Needed as computers become less Eurocentric, to represent the full range of non-roman alphabets and pictographs.
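The mapping from characters to integers, and the size differences between the encodings, can be checked with Python's built-in codecs. This is a small sketch; `cp037` is one of several EBCDIC code pages shipped with Python, chosen here only as a representative.

```python
# A character is an integer code under some encoding.
print(ord("A"))                        # 65 in ASCII and in Unicode
ebcdic_a = "A".encode("cp037")         # one EBCDIC code page
print(ebcdic_a.hex())                  # a different integer for the same letter

# 16 bits are enough for many characters ...
snowman = "\u2603"
print(len(snowman.encode("utf-16-be")))   # 2 bytes

# ... but not for all of them: this one needs a surrogate pair.
clef = "\U0001D11E"
print(ord(clef) > 0xFFFF)
print(len(clef.encode("utf-16-be")))      # 4 bytes
```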
Character String Types Values are sequences of characters. Design issues: • Is it a primitive type or just a special kind of array? • Is the length of objects static or dynamic? Typical string operations: • Assignment • Comparison (=, >, etc.) • Catenation • Substring reference • Pattern matching
Character Strings • Should a string be a primitive or be definable as an array of chars? – In Pascal, C/C++, Ada, strings are not primitives but can "act" as primitives if specified as "packed" arrays (i.e. direct assignment, <, =, > comparisons, etc.). – In Java, strings are objects and have methods to support string operations (e.g. length, compareTo) • Should strings have static or dynamic length? • Can be accessed using indices (like arrays)
String examples • SNOBOL - had elaborate pattern matching • FORTRAN 77/90, COBOL, Ada - static length strings • PL/I, Pascal - variable length with static fixed size strings • SNOBOL, LISP - dynamic lengths • Java - objects which are immutable (to change the length, you have to create a new string object); + is the only overloaded operator for strings (concatenation), with no overloading for <, >, etc.
String Examples • Some languages, e.g. Snobol, Perl and Tcl, have extensive built-in support for strings and operations on strings. • SNOBOL4 (a string manipulation language) – Primitive data type with many operations, including elaborate pattern matching • Perl – Patterns are defined in terms of regular expressions, providing a very powerful facility! /[A-Za-z][A-Za-z\d]+/ • Java - String class (not arrays of char)
String Length Options • Static - FORTRAN 77, Ada, COBOL, e.g. (FORTRAN 90) CHARACTER (LEN = 15) NAME • Limited Dynamic Length - C and C++; actual length is indicated by a null character • Dynamic - SNOBOL4, Perl
Character String Types Evaluation • Aid to writability • As a primitive type with static length, they are inexpensive to provide; why not have them? • Dynamic length is nice, but is it worth the expense? Implementation: • Static length - compile-time descriptor • Limited dynamic length - may need a run-time descriptor for length (but not in C and C++) • Dynamic length - need run-time descriptor; allocation/deallocation is the biggest implementation problem
User-Defined Ordinal Types • An ordinal type is one in which the range of possible values can be easily associated with the set of positive integers • Enumeration types - the user enumerates all of the possible values, which are given symbolic constants • Can be used in for-loops, case statements, etc. • Operations on ordinals in Pascal, for example, include PRED, SUCC, ORD • Usually cannot be input or output easily • Mainly used for abstraction/readability
Examples • Pascal - cannot reuse constants; they can be used for array subscripts, for variables, case selectors; NO input or output; can be compared • Ada - constants can be reused (overloaded literals); disambiguate with context or type_name'(one of them); can be used as in Pascal; can be input and output • C and C++ - like Pascal, except they can be input and output as integers • Java - originally did not include an enumeration type (enum was added in Java 5)
Ada Example • Some PLs allow a symbolic constant to appear in more than one type; Standard Pascal does not • Ada is one of the few languages that allow a symbol to name a value in more than one enumerated type:
    type letters is ('A', 'B', 'C', ..., 'Z');
    type vowels is ('A', 'E', 'I', 'O', 'U');
• Making the following ambiguous:
    for letter in 'A' .. 'O' loop
• So Ada allows (requires) one to qualify the literals:
    for letter in vowels'('A') .. vowels'('U') loop
Pascal Example Pascal was one of the first widely used languages to have good facilities for enumerated data types.
    type colortype = (red, orange, yellow, green, blue, indigo, violet);
    var aColor : colortype;
    ...
    aColor := blue;
    ...
    if aColor > green ...
    for aColor := red to violet do ...;
    ...
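For comparison, the same color type can be sketched with Python's standard `enum` module. `IntEnum` is used so that the members behave as ordinals; the class name `Color` and the explicit numbering are choices made for this example.

```python
from enum import IntEnum

class Color(IntEnum):
    RED = 0
    ORANGE = 1
    YELLOW = 2
    GREEN = 3
    BLUE = 4
    INDIGO = 5
    VIOLET = 6

a_color = Color.BLUE
print(a_color > Color.GREEN)           # ordinal comparison
print(int(a_color))                    # position, like Pascal's ORD
print(Color(a_color - 1).name)         # predecessor, like PRED
print([c.name for c in Color][:3])     # members iterate in order
```

As in Pascal, the symbolic names carry the meaning while small integers do the work underneath.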
Subrange Type • Limits a large type to a contiguous subsequence of values within the larger range, providing additional flexibility in programming and readability/abstraction • Available in Ada, Pascal, Modula-2 (C and C++ have no subrange types) • Pascal example:
    type upperCase = 'A'..'Z';
         lowerCase = 'a'..'z';
         index = 1..100;
• Ada example – Subtypes are not new types, just constrained existing types (so they are compatible); can be used as in Pascal, plus case constants, e.g.
    subtype POS_TYPE is INTEGER range 0..INTEGER'LAST;
Ordinal Types Implementation • Implementation is straightforward: enumeration types are implemented as non-negative integers • Subrange types are the parent types with code inserted (by the compiler) to restrict assignments to subrange variables
Evaluation of Enumeration Types • Aid to efficiency – e.g., compiler can select and use a compact, efficient representation (e.g., small integers) • Aid to readability – e.g., no need to code a color as a number • Aid to maintainability – e.g., adding a new color doesn't require updating hard-coded constants • Aid to reliability – e.g., compiler can check operations and ranges of values
Array Types • An array is an aggregate of homogeneous data elements in which an individual element is identified by its position in the aggregate, relative to the first element. • Design issues include: – What types are legal for subscripts? – When are subscript ranges bound? – When does array allocation take place? – How many subscripts are allowed? – Can arrays be initialized at allocation time? – Are array slices allowed?
Array Indices • An index maps into the array to find the specific element desired: map(arrayName, indexValue) → array element • Usually placed inside [ ] (Pascal, Modula-2, C, Java) or ( ) (FORTRAN, PL/I, Ada) marks – if the same marks are used for parameters then this weakens readability and can introduce ambiguity • Two types in an array definition – type of value being stored in array cells – type of index used • Lower bound - implicit in C, Java and early FORTRAN
Subscript Bindings and Array Categories Subscript types: • FORTRAN, C - int only • Pascal - any ordinal type (int, boolean, char, enum) • Ada - int or enum (includes boolean and char) • Java - integer types only
Array Categories Four categories of arrays based on subscript binding and binding to storage: 1. Static - range of subscripts and storage bindings are static – e.g. FORTRAN 77, some arrays in Ada – Advantage: execution efficiency (no allocation or deallocation) 2. Fixed stack dynamic - range of subscripts is statically bound, but storage is bound at elaboration time – e.g. Pascal locals and C locals that are not static – Advantage: space efficiency
3. Stack-dynamic - range and storage are dynamic, but fixed from then on for the variable's lifetime, e.g. Ada declare blocks:
    declare
      STUFF : array (1..N) of FLOAT;
    begin
      ...
    end;
Advantage: flexibility - size need not be known until the array is about to be used
Array Categories 4. Heap-dynamic - subscript range and storage bindings are dynamic and not fixed, e.g. (FORTRAN 90)
    INTEGER, ALLOCATABLE, ARRAY (:,:) :: MAT
        (declares MAT to be a dynamic 2-dim array)
    ALLOCATE (MAT (10, NUMBER_OF_COLS))
        (allocates MAT to have 10 rows and NUMBER_OF_COLS columns)
    DEALLOCATE (MAT)
        (deallocates MAT's storage)
- In APL and Perl, arrays grow and shrink as needed - In Java, all arrays are objects (heap-dynamic)
Array dimensions • Some languages limit the number of dimensions that an array can have • FORTRAN I - limited to 3 dimensions • FORTRAN IV and onward - up to 7 dimensions • C/C++, Java - limited to 1, but arrays can be nested (i.e. an array element is an array), allowing for any number of dimensions • Most other languages have no restrictions
Array Initialization • FORTRAN 77 - initialization at the time storage is allocated:
    INTEGER LIST(3)
    DATA LIST /0, 5, 5/
• C - length of array is implicit, based on length of initialization list:
    int stuff[] = {2, 4, 6, 8};
    char name[] = "Maryland";
    char *names[] = {"maryland", "virginia", "delaware"};
• C/C++, Java - have optional initializations • Pascal, Modula-2 - don't have array initializations (Turbo Pascal does)
Array Operations • Operations that apply to an array as a unit (as opposed to a single array element) • Most languages have direct assignment of one array to another (A := B) if both arrays are equivalent • FORTRAN: allows array addition A+B • Ada: array concatenation A&B • FORTRAN 90: library of array operations including matrix multiplication, transpose • APL: includes operations for vectors and matrices (transpose, reverse, etc.)
Array Operations in Java • In Java, arrays are objects (sometimes called aggregate types) • The declaration of an array may omit the size, as in: – int[] array1; – array1 is a reference that does not yet refer to any array (a field of this type defaults to null) – at a later point, the array may get memory allocated to it, e.g. array1 = new int[100]; • Array operations other than element access (array1[2]) go through members of the array object, such as the length field (array1.length)
Slices A slice is some substructure of an array; nothing more than a referencing mechanism 1. FORTRAN 90 example:
    INTEGER MAT (1:4, 1:4)
    INTEGER CUBE (1:4, 1:4, 1:4)
    MAT(1:4, 1)          - the first column of MAT
    MAT(2, 1:4)          - the second row of MAT
    CUBE(1:3, 1:3, 2:3)  - a 3 x 3 x 2 subarray
2. Ada example (single-dimensioned arrays only): LIST(4..10)
Arrays Implementation of Arrays • The access function maps subscript expressions to an address in the array • Storage is in row major order (by rows) or column major order (by columns) Associative Arrays An associative array is an unordered collection of data elements that are indexed by an equal number of values called keys Design Issues: 1. What is the form of references to elements? 2. Is the size static or dynamic?
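The access function for a two-dimensional array can be sketched as follows. This is only an illustration: the function names, the base address 1000 and the 4-byte element size are assumptions chosen for the example, not part of the slides.

```python
def row_major_address(base, i, j, n_cols, elem_size, lb_row=0, lb_col=0):
    """Address of element [i][j] when whole rows are stored one after another."""
    return base + ((i - lb_row) * n_cols + (j - lb_col)) * elem_size


def column_major_address(base, i, j, n_rows, elem_size, lb_row=0, lb_col=0):
    """Address of element [i][j] when whole columns are stored one after another."""
    return base + ((j - lb_col) * n_rows + (i - lb_row)) * elem_size


# A 3 x 4 array of 4-byte elements at the (hypothetical) base address 1000.
print(row_major_address(1000, 1, 2, n_cols=4, elem_size=4))     # 1024
print(column_major_address(1000, 1, 2, n_rows=3, elem_size=4))  # 1028
```

The same element lands at different addresses under the two layouts, which is why the storage order matters when arrays are passed between languages.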
Perl's Associative Arrays • Perl has a primitive datatype for hash tables, aka "associative arrays" • Elements are indexed not by consecutive integers but by arbitrary keys • %ages refers to an associative array and @people to a regular array • Note the use of { } for associative arrays and [ ] for regular arrays
    %ages = ("Bill Clinton" => 53, "Hillary" => 51, "Socks" => "27 in cat years");
    $ages{"Hillary"} = 52;
    @people = ("Bill Clinton", "Hillary", "Socks");
    $ages{"Bill Clinton"};   # returns 53
    $people[1];              # returns "Hillary"
• keys(X), values(X) and each(X)
    foreach $person (keys(%ages)) { print "I know the age of $person\n"; }
    foreach $age (values(%ages)) { print "Somebody is $age\n"; }
    while (($person, $age) = each(%ages)) { print "$person is $age\n"; }
Records A record is a possibly heterogeneous aggregate of data elements in which the individual elements are identified by names Design Issues: 1. What is the form of references? 2. What unit operations are defined?
Record Field References • Record definition syntax: COBOL uses level numbers to show nested records; others use the familiar dot notation
    field_name OF rec_name_1 OF ... OF rec_name_n
    rec_name_1.rec_name_2. ... .rec_name_n.field_name
• Fully qualified references must include all record names • Elliptical references allow leaving out record names as long as the reference is unambiguous • With clause in Pascal and Modula-2:
    with employee.address do
    begin
      street := '422 North Charles St.';
      city := 'Baltimore';
      zip := 21250
    end;
Record Operations 1. Assignment • Pascal, Ada, and C allow it if the types are identical – In Ada, the RHS can be an aggregate constant 2. Initialization • Allowed in Ada, using an aggregate constant 3. Comparison • In Ada, = and /=; one operand can be an aggregate constant 4. MOVE CORRESPONDING (COBOL) (in PL/I this was called assignment by name) • Moves all fields in the source record to fields with the same names in the destination record: MOVE CORRESPONDING INPUT-RECORD TO OUTPUT-RECORD
Records and Arrays Comparing records and arrays 1. Access to array elements is much slower than access to record fields, because subscripts are dynamic (field names are static) 2. Dynamic subscripts could be used with record field access, but it would disallow type checking and it would be much slower
Union Types A union is a type whose variables are allowed to store values of different types at different times during execution Design Issues for unions: 1. What kind of type checking, if any, must be done? 2. Should unions be integrated with records? 3. Is a variant tag or discriminant required?
Examples: Unions 1. FORTRAN - with EQUIVALENCE 2. Algol 68 - discriminated unions • Use a hidden tag to maintain the current type • Tag is implicitly set by assignment • References are legal only in conformity clauses:
    union (int, real) ir1;
    int count;
    real sum;
    ...
    case ir1 in
      (int intval): count := intval,
      (real realval): sum := realval
    esac
• This runtime type selection is a safe method of accessing union objects
Pascal Union Types Problem with Pascal's design: type checking is ineffective. Reasons: 1. The user can create inconsistent unions (because the tag can be individually assigned):
    var blurb : intreal;
        x : real;
    blurb.tag := true;    { it is an integer }
    blurb.blint := 47;    { ok }
    blurb.tag := false;   { it is a real }
    x := blurb.blreal;    { assigns an integer to a real }
2. The tag is optional! Without a tag, only the declaration and the second and last assignments are required to cause trouble
Pascal Union Types Pascal has record variants, which support both discriminated and nondiscriminated unions, e.g.
    type shape = (circle, triangle, rectangle);
         colors = (red, green, blue);
         figure = record
           filled : boolean;
           color : colors;
           case form : shape of
             circle : (diameter : real);
             triangle : (leftside : integer; rightside : integer; angle : real);
             rectangle : (side1 : integer; side2 : integer)
         end;
Pascal Union Types
    case myfigure.form of
      circle : writeln('It is a circle; its diameter is ', myfigure.diameter);
      triangle : begin
        writeln('It is a triangle');
        writeln(' its sides are: ', myfigure.leftside, myfigure.rightside);
        writeln(' the angle between the sides is: ', myfigure.angle)
      end;
      rectangle : begin
        writeln('It is a rectangle');
        writeln(' its sides are: ', myfigure.side1, myfigure.side2)
      end
    end
Pascal Union Types But Pascal allowed for problems because: – The user could explicitly set the record variant tag: myfigure.form := triangle – The variant tag is optional. We could have defined a figure as:
    type figure = record
      ...
      case shape of
        circle : (diameter : real);
        ...
    end
Pascal's variant records introduce potential type problems, but are also a loophole which allows you to do, for example, pointer arithmetic.
Ada Union Types Ada only has discriminated unions. These are safer than union types in Pascal and Modula-2 because: – The tag must be present – It is impossible for the user to create an inconsistent union (the tag cannot be assigned by itself; all assignments to the union must include the tag value)
Union Types 5. C and C++ have only free unions (no tags) • Not part of their records • No type checking of references 6. Java has neither records nor unions, but aggregate types can be created with classes, as in C++ Evaluation - potentially unsafe in most languages (not Ada)
Set Types • A set is a type whose variables can store unordered collections of distinct values from some ordinal type • Design Issue: – What is the maximum number of elements in any set base type? • Usually implemented as a bit vector – Allows for very efficient implementation of basic set operations (e.g., membership check, intersection, union)
Sets in Pascal • The language definition sets no maximum size; the limit is implementation dependent and usually a function of the hardware word size (e.g., 64, 96, ...) • Result: code is not portable, and writability is poor if the maximum is too small • Set operations: union (+), intersection (*), difference (-), =, <>, superset (>=), subset (<=), in
    type colors = (red, blue, green, yellow, orange, white, black);
         colorset = set of colors;
    var s1, s2 : colorset;
    ...
    s1 := [red, blue, yellow, white];
    s2 := [black, blue];
Examples 2. Modula-2 and Modula-3 • Additional operations: INCL, EXCL, / (symmetric set difference: elements in one but not both operands) 3. Ada - does not include sets, but defines in as a set membership operator for all enumeration types 4. Java includes a class for set operations
Evaluation • If a language does not have sets, they must be simulated, either with enumerated types or with arrays • Arrays are more flexible than sets, but have much slower operations Implementation • Sets are usually stored as bit strings, and the set operations are implemented with logical operations
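The bit-string representation can be sketched as follows, reusing the colors from the Pascal example. The helper names make_set and members are made up for this illustration; a Python int stands in for the machine word.

```python
COLORS = ["red", "blue", "green", "yellow", "orange", "white", "black"]
# One bit position per value of the base type.
BIT = {name: 1 << position for position, name in enumerate(COLORS)}


def make_set(*names):
    """Build the bit string that has a 1 for every named element."""
    bits = 0
    for name in names:
        bits |= BIT[name]
    return bits


def members(bits):
    """List the elements whose bit is set, in base-type order."""
    return [name for name in COLORS if bits & BIT[name]]


s1 = make_set("red", "blue", "yellow", "white")
s2 = make_set("black", "blue")

union = s1 | s2                           # logical OR
intersection = s1 & s2                    # logical AND
difference = s1 & ~s2                     # AND with the complement
yellow_in_s1 = bool(s1 & BIT["yellow"])   # membership is a single bit test

print(members(union))         # ['red', 'blue', 'yellow', 'white', 'black']
print(members(intersection))  # ['blue']
print(members(difference))    # ['red', 'yellow', 'white']
```

Each set operation is one logical instruction on the whole word, which is where the efficiency claim comes from.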
Pointers A pointer type is a type in which the range of values consists of memory addresses and a special value, nil (or null) Uses: 1. Addressing flexibility 2. Dynamic storage management Design Issues: • What is the scope and lifetime of pointer variables? • What is the lifetime of heap-dynamic variables? • Are pointers restricted to pointing at a particular type? • Are pointers used for dynamic storage management, indirect addressing, or both? • Should a language support pointer types, reference types, or both?
Fundamental Pointer Operations • Assignment of an address to a pointer • References (explicit versus implicit dereferencing)
Problems with pointers 1. Dangling pointers (dangerous) • A pointer points to a heap-dynamic variable that has been deallocated • Creating one: a. Allocate a heap-dynamic variable and set a pointer to point at it b. Set a second pointer to the value of the first pointer c. Deallocate the heap-dynamic variable, using the first pointer; the second pointer is now dangling
Problems with pointers 2. Lost heap-dynamic variables (wasteful) • A heap-dynamic variable that is no longer referenced by any program pointer • Creating one: a. Pointer p1 is set to point to a newly created heap-dynamic variable b. p1 is later set to point to another newly created heap-dynamic variable, so the first one can no longer be reached • The process of losing heap-dynamic variables is called memory leakage
Problems with Pointers 1. Pascal: used for dynamic storage management only • Explicit dereferencing • Dangling pointers are possible (dispose) • Dangling objects are also possible 2. Ada: a little better than Pascal and Modula-2 • Some dangling pointers are disallowed because dynamic objects can be automatically deallocated at the end of the pointer's scope • All pointers are initialized to null • Similar dangling object problem (but it rarely happens)
Pointer Problems: C and C++ • Used for dynamic storage management and addressing • Explicit dereferencing and address-of operator • Can do address arithmetic in restricted forms • Domain type need not be fixed (void *)
    float stuff[100];
    float *p;
    p = stuff;
*(p+5) is equivalent to stuff[5] and p[5]; *(p+i) is equivalent to stuff[i] and p[i] • void * can point to any type and can be type checked, but cannot be dereferenced
Pointer Problems: Fortran 90 • Can point to heap and non-heap variables • Implicit dereferencing • Special assignment operator for non-dereferenced references:
    REAL, POINTER :: ptr    (POINTER is an attribute)
    ptr => target           (where target is either a pointer or a non-pointer with the TARGET attribute)
• The TARGET attribute is assigned in the declaration, e.g.
    INTEGER, TARGET :: NODE
Pointers 5. C++ Reference Types • Constant pointers that are implicitly dereferenced • Used for parameters • Advantages of both pass-by-reference and pass-by-value 6. Java - Only references • No pointer arithmetic • Can only point at objects (which are all on the heap) • No explicit deallocator (garbage collection is used) • Means there can be no dangling references • Dereferencing is always implicit
Memory Management • Memory management: identify unused, dynamically allocated memory cells and return them to the heap • Approaches – Manual: explicit allocation and deallocation (C, C++) – Automatic: • Reference counters (Modula-2, Adobe Photoshop) • Garbage collection (Lisp, Java) • Problems with the manual approach: – Requires programmer effort – Programmer failures lead to space leaks and dangling references/sharing – Proper explicit memory management is difficult and has been estimated to account for up to 40% of development time
Reference Counting • Idea: keep track of how many references there are to each cell in memory. If this number drops to 0, the cell is garbage • Store garbage cells in a free list; allocate from this list • Advantages – Immediacy: resources can be freed directly – Immediate reuse of memory is possible • Disadvantages – Can't handle cyclic data structures – Bad locality properties – Large overhead for pointer manipulation
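A toy model of reference counting, including the cyclic case it cannot handle, might look like this. The Cell class, the helper functions and the single outgoing reference per cell are simplifying assumptions made for the sketch; it is not how any particular run-time system is implemented.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.count = 0    # number of references currently pointing at this cell
        self.ref = None   # at most one outgoing reference, to keep the model small


free_list = []


def add_ref(cell):
    cell.count += 1


def drop_ref(cell):
    """Remove one reference; a cell whose count reaches 0 is garbage."""
    cell.count -= 1
    if cell.count == 0:
        free_list.append(cell.name)
        if cell.ref is not None:
            target, cell.ref = cell.ref, None
            drop_ref(target)  # freeing a cell releases what it referred to


def point(source, target):
    """Make source.ref refer to target, keeping both counts up to date."""
    add_ref(target)
    if source.ref is not None:
        drop_ref(source.ref)
    source.ref = target


# Acyclic chain: variable -> x -> y. Dropping the variable frees both cells.
x, y = Cell("x"), Cell("y")
add_ref(x)
point(x, y)
drop_ref(x)
print(free_list)          # ['x', 'y']

# Cycle: variable -> a -> b -> a. Dropping the variable frees nothing.
a, b = Cell("a"), Cell("b")
add_ref(a)
point(a, b)
point(b, a)
drop_ref(a)
print(a.count, b.count)   # 1 1
print(free_list)          # ['x', 'y']
```

After the last step nothing in the program can reach a or b, yet each still has a count of 1, so neither is ever put on the free list.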
Garbage Collection (GC) • GC is a process by which dynamically allocated storage is reclaimed during the execution of a program • Usually refers to automatic periodic storage reclamation by the garbage collector (part of the run-time system), as opposed to explicit code to free specific blocks of memory • Usually triggered during memory allocation when available free memory falls below a threshold. Normal execution is suspended and GC is run • Major GC algorithms: – Mark and sweep – Copying – Incremental garbage collection algorithms
Mark and Sweep • Oldest and simplest algorithm • Has two phases: mark and sweep • Collection algorithms in general: when the program runs out of memory, stop the program, do garbage collection and resume the program • Here: keep free memory in a free pool. When an allocation encounters an empty free pool, do garbage collection • Mark: starting from the program's pointers, go through live memory and mark all live (reachable) cells • Sweep: go through the whole memory and put a reference to every non-live (unmarked) cell into the free pool
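The two phases can be sketched over a toy heap. Here the heap is modeled as a dictionary from cell names to the cells they reference, and roots stands for the program's own pointers; both are assumptions made for the illustration.

```python
def mark(heap, roots):
    """Mark phase: collect every cell reachable from the roots."""
    marked = set()
    stack = list(roots)
    while stack:
        cell = stack.pop()
        if cell not in marked:
            marked.add(cell)
            stack.extend(heap[cell])  # follow the cell's outgoing references
    return marked


def sweep(heap, marked, free_pool):
    """Sweep phase: visit the whole heap and reclaim every unmarked cell."""
    for cell in sorted(heap):
        if cell not in marked:
            free_pool.append(cell)
            del heap[cell]


# Toy heap: A -> B is live; C <-> D is an unreachable cycle; E is unreachable.
heap = {"A": ["B"], "B": [], "C": ["D"], "D": ["C"], "E": []}
roots = ["A"]
free_pool = []

live = mark(heap, roots)
sweep(heap, live, free_pool)

print(sorted(live))   # ['A', 'B']
print(free_pool)      # ['C', 'D', 'E']
```

Unlike reference counting, the unreachable cycle C <-> D is reclaimed, because liveness is decided by reachability rather than by counts.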
Evaluation of pointers • Dangling pointers and dangling objects are problems, as is heap management • Pointers are like gotos: they widen the range of cells that can be accessed by a variable • Pointers are necessary, so we can't design a language without them
Expressions and Assignment Statements • Expressions are the fundamental means of specifying computations in a PL – While variables are the means for specifying storage • Primary use of expressions: assignment – Main purpose: change the value of a variable – Essence of all imperative PLs: expressions change contents of variables (computations change states) • To understand expression evaluation, we need to know the orders of operator and operand evaluation – Dictated by associativity and precedence rules
Arithmetic Expressions • Arithmetic expressions consist of operators, operands, parentheses, and function calls – Unary, binary, ternary (e.g., _ ? _ : _) operators • Implementation involves: – Fetching operands, usually from memory – Executing arithmetic operations on the operands • Design issues for arithmetic expressions – Operator precedence/associativity rules? – Order of operand evaluation and their side effects? – Operator overloading? – Type mixing in expressions?
Operator Precedence Rules • Define the order in which "adjacent" operators of different precedence levels are evaluated – Based on the hierarchy of operator priorities • Typical precedence levels (highest first) – parentheses – unary operators – ** (exponentiation, if the language supports it) – *, / – +, -
Operator Associativity Rule • Defines the order in which adjacent operators with the same precedence level are evaluated • Typical associativity rules – Left to right, except **, which is right to left – Sometimes unary operators associate right to left • Precedence and associativity rules can be overridden with parentheses
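These rules can be checked directly in a language that has an exponentiation operator. The sketch below uses Python; note as a caveat that Python places ** above unary minus, which differs from the "typical" ordering listed above.

```python
print(2 + 3 * 4)      # 14: * has higher precedence than +
print((2 + 3) * 4)    # 20: parentheses override precedence
print(10 - 4 - 3)     # 3: binary - associates left to right, (10 - 4) - 3
print(2 ** 3 ** 2)    # 512: ** associates right to left, 2 ** (3 ** 2)
print((2 ** 3) ** 2)  # 64: parentheses override associativity
print(-2 ** 2)        # -4: in Python, ** binds tighter than unary minus
```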
Conditional Expressions • Conditional expressions use the ternary operator ?: – C-based languages (e.g., C, C++), e.g.,
    average = (count == 0) ? 0 : sum / count;
– Evaluates as if written like
    if (count == 0) average = 0;
    else average = sum / count;
Operand Evaluation Order • How are operands in expressions evaluated? – Variables: fetch the value from memory – Constants: sometimes fetched from memory; sometimes part of the machine language instruction – Parenthesized expressions: evaluate all inside operands and operators first – Operands on the two sides of an operator: evaluation order is usually irrelevant, except when evaluating an operand may cause side effects, e.g., b = a + foo(&a);
Side Effects in Expressions • Functional side effects: a function changes a two-way parameter or a non-local variable – i.e., it changes state "external" to the function • Problem with functional side effects: – When a function referenced in an expression alters another operand of the expression, the order in which the operands are evaluated makes a difference:
    a = 10;           /* assume foo changes its parameter */
    b = a + foo(&a);
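The effect of evaluation order can be made concrete with a small sketch. It assumes a hypothetical foo that returns 10 and sets its argument to 20, and it uses a one-element list to stand in for the two-way parameter. Python evaluates operands left to right, so the two orders are obtained by writing the operands in each order.

```python
def foo(cell):
    """Hypothetical function: returns 10 and, as a side effect, sets cell[0] to 20."""
    cell[0] = 20
    return 10


a = [10]
b_left_first = a[0] + foo(a)    # a is fetched (10) before foo runs

a = [10]
b_right_first = foo(a) + a[0]   # foo runs first, so a is already 20 when fetched

print(b_left_first)    # 20
print(b_right_first)   # 30
```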
Functional Side Effects • Functions in pure mathematics do not have side effects, i.e., y = f(x) – The input, x, determines the output, y; there is no state • The same holds for pure functional programming languages • Side effects occur because of the von Neumann architecture and the associated imperative PLs and computation model (state machines) – Memory/processor, variables/expressions, state/state change
Functional Side Effects • Solution 1: define the language by disallowing functional side effects – No two-way parameters in functions – No non-local references in functions – Disadvantage: inflexibility of one-way parameters and lack of non-local references • Solution 2: write the language definition to demand that operand evaluation order be fixed – Disadvantage: limits some compiler optimizations – Java requires that operands appear to be evaluated in left-to-right order
Overloaded Operators
    int a, b;
    float x, y;
    ...
    b = a + 3;
    y = x + 3.0;
• We wish to use the same operator '+' to operate on integers and floating-point numbers – Let the compiler make the proper translation, e.g., ADD vs FADD – How about '+' to operate on two array variables?
Overloaded Operators • Use of an operator for more than one purpose is called operator overloading • Some are common (e.g., + for int and float) • Some are troublesome (e.g., * in C and C++, which is both multiplication and dereference) – Loss of compiler error detection (omission of an operand should be a detectable error) – Some loss of readability • C++/C# allow user-defined overloaded operators – Users can define nonsense operations – Readability may suffer even when the operators make sense, e.g., the reader needs to check the operand types to know which operation is meant
Type Conversions
    int a, b;
    float x, y;
    a = y;
    x = b;
    b = y + a;
• How should data be converted for assignment? • What data format should the compiler use during evaluation of the expressions?
Type Conversions • A narrowing conversion is one that converts an object to a type that cannot include all of the values of the original type, e.g., float to int – Not always safe • A widening conversion is one in which an object is converted to a type that can include at least approximations to all of the values of the original type, e.g., int to float – Usually safe but may lose accuracy
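Both kinds of conversion can be observed in a short sketch. It uses Python, whose float is a 64-bit IEEE double and whose int is unbounded, so the concrete numbers below are specific to that setting.

```python
# Narrowing: float -> int drops the fractional part (truncation toward zero).
print(int(3.99))    # 3
print(int(-3.99))   # -3

# Widening: int -> float keeps the magnitude but only about 53 bits of precision.
big = 2 ** 53 + 1             # 9007199254740993
widened = float(big)
print(int(widened))           # 9007199254740992
print(int(widened) == big)    # False: the conversion lost accuracy
```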
Type Conversions: Mixed Mode • A mixed-mode expression is one that has operands of different types – Needs type conversion, implicit or explicit • Implicit type conversion by the compiler: coercion – Disadvantage: decreases the type error detection ability of the compiler – In most languages, all numeric types are coerced in expressions, using widening conversions – In Ada, there are virtually no coercions in expressions, to minimize errors due to mixed-mode expressions
Type Conversions • Explicit type conversion by the programmer: called casting in C-based languages, e.g., – C: (int) angle – Ada: Float(Sum) • Causes of errors in expressions – Inherent limitations of arithmetic, e.g., division by zero – Limitations of computer arithmetic, e.g., overflow • Such errors are often ignored by the run-time system
Relational Expressions • Expressions using relational operators and operands of various types; evaluate to Boolean – Relational operators: compare the values of 2 operands – Operator symbols vary among languages; for inequality: !=, /=, ~=, .NE., <>, #
Boolean Expressions • Expressions using Boolean operators and Boolean operands; evaluate to Boolean – Boolean operands: Boolean variables, Boolean constants, relational expressions – Example operators:
    FORTRAN 77   FORTRAN 90   C     Ada
    .AND.        and          &&    and
    .OR.         or           ||    or
    .NOT.        not          !     not
No Boolean Type in C • C89 has no Boolean type: it uses the int type, with 0 for false and nonzero for true – Relational and logical expressions evaluate to 0 for false and 1 for true • One odd characteristic of C's expressions: a < b < c is a legal expression, but the result is not what you might expect: – The left operator is evaluated, producing 0 or 1 – That result is then compared with the third operand (i.e., c)
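The C behaviour can be imitated in Python by parenthesizing the first comparison and treating its Boolean result as 0 or 1. The function c_style_chain is a made-up helper for this illustration; Python's own a < b < c is a chained comparison and behaves differently, which makes the contrast visible.

```python
def c_style_chain(a, b, c):
    """Evaluate a < b < c the way C does: (a < b) yields 0 or 1, then compare with c."""
    return int(int(a < b) < c)


print(c_style_chain(5, 3, 2))     # 1: (5 < 3) is 0, and 0 < 2 holds
print(5 < 3 < 2)                  # False: Python chains the comparisons

print(c_style_chain(-3, -2, -1))  # 0: (-3 < -2) is 1, and 1 < -1 fails
print(-3 < -2 < -1)               # True
```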
Operator Precedence in C
    Highest   postfix ++, --
              unary +, -, prefix ++, --, !
              *, /, %
              binary +, -
              <, >, <=, >=
              ==, !=
              &&
    Lowest    ||
Short Circuit Evaluation • An expression in which the result is determined without evaluating all operands and/or operators, e.g. (13*a) * (b/13 - 1) – If a is zero, there is no need to evaluate (b/13 - 1) • Problem with non-short-circuit evaluation:
    index = 0;
    while ((index < listlen) && (LIST[index] != key))
        index = index + 1;
– When index == listlen, LIST[index] causes an indexing problem (the valid subscripts of LIST run from 0 to listlen-1)
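The search loop above can be tried with and without short-circuit evaluation. In this sketch Python's and is the short-circuit operator, while & applied to two Booleans always evaluates both operands; the function names are made up for the example.

```python
def find_short_circuit(items, key):
    index = 0
    # 'and' stops as soon as the bounds test fails, so items[index] is never
    # evaluated with index == len(items).
    while index < len(items) and items[index] != key:
        index += 1
    return index


def find_eager(items, key):
    index = 0
    # '&' evaluates both operands every time, including items[index]
    # when index has run off the end of the list.
    while (index < len(items)) & (items[index] != key):
        index += 1
    return index


data = [4, 8, 15]
print(find_short_circuit(data, 8))    # 1
print(find_short_circuit(data, 99))   # 3: key absent, loop ends cleanly
print(find_eager(data, 8))            # 1

try:
    find_eager(data, 99)
except IndexError:
    print("eager version indexed past the end of the list")
```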
Short Circuit Evaluation • C, C++, and Java: use short-circuit evaluation for the usual Boolean operators (&& and ||), but also provide bitwise Boolean operators that are not short circuit (& and |) • Ada: the programmer can specify either (short-circuit evaluation is specified with and then and or else) • Short-circuit evaluation exposes the potential problem of side effects in expressions, e.g., in (a > b) || (b++ / 3), b is incremented only when a <= b
Assignment Statements • The general syntax: <target_var> <assign_operator> <expression> • The assignment operator – = in FORTRAN, BASIC, and the C-based languages – := in the ALGOLs, Pascal, and Ada • = can be bad when it is also overloaded as the relational operator for equality (that's why the C-based languages use == as the relational operator)
Conditional Targets • Conditional targets (Perl):
    ($flag ? $total : $subtotal) = 0
– Which is equivalent to
    if ($flag) { $total = 0 } else { $subtotal = 0 }
Compound Assignment Operators • A shorthand method of specifying a commonly needed form of assignment • Introduced in ALGOL 68; adopted by C • Example: a = a + b is written as a += b
Unary Assignment Operators • Unary assignment operators in C-based languages combine increment and decrement operations with assignment • Examples: – sum = ++count (count is incremented, then assigned to sum) – sum = count++ (count is assigned to sum, then incremented) – count++ (count is incremented) – -count++ (parsed as -(count++): the value is the negation of the old count, and count is incremented)
Assignment as an Expression • In C, C++, and Java, the assignment statement produces a result and can be used as an operand:
    while ((ch = getchar()) != EOF) { ... }
– ch = getchar() is carried out; the result is used as the conditional value for the while statement – Gives expressions a side effect: a = b + (c = d / b) - 1 – Allows multiple-target assignment: sum = count = 0; – Easy to confuse: if (x = y) and if (x == y) • Perl and Ruby support list assignments, e.g.,
    ($first, $second, $third) = (20, 30, 40);
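For comparison, the same three ideas can be sketched in Python (3.8 or later for the := operator). The in-memory stream below is a stand-in for getchar() and is an assumption of the example.

```python
import io

# Assignment used as an operand: ':=' assigns and yields the assigned value.
stream = io.StringIO("abc")   # stand-in for standard input
chars = []
while (ch := stream.read(1)) != "":
    chars.append(ch)
print(chars)                  # ['a', 'b', 'c']

# Multiple-target assignment.
total = count = 0
print(total, count)           # 0 0

# List (tuple) assignment.
first, second, third = 20, 30, 40
print(first, second, third)   # 20 30 40
```

Python keeps = as a statement and uses a distinct := for the expression form, which avoids the if (x = y) versus if (x == y) confusion noted above.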
Mixed-Mode Assignment • Assignment statements can also be mixed-mode • In Fortran, C, and C++, any numeric type value can be assigned to any numeric type variable • In Java, only widening assignment coercions are done • In Ada, there is no assignment coercion
Statement-Level Control Structures Controlling Program Flows • Computations in imperative-language programs – Evaluating expressions: reading variables, executing operations – Assigning resulting values to variables – Selecting among alternative control flow paths – Causing repeated execution • A control structure is a control statement and the statements whose execution it controls • Most programming languages follow a single thread of control (or scheduling)
Selection Statements • A selection statement chooses between two or more paths of execution • Two general categories: – Two-way selectors – Multiple-way selectors
Two-Way Selection Statements • General form:
    if control_expression
      then clause
      else clause
• Control expression: – In C89, C99, Python, and C++, the control expression can be arithmetic – In languages such as Ada, Java, Ruby, and C#, the control expression must be Boolean
Then and Else Clauses • In contemporary languages, then and else clauses can be single or compound statements – In Perl, all clauses must be delimited by braces (they must be compound even if there is only 1 statement) – Python uses indentation to define clauses:
    if x > y :
        x = y
        print("case 1")
Nesting Selectors • Consider the following Java code:
    if (sum == 0)
        if (count == 0)
            result = 0;
    else
        result = 1;
• Which if gets the else? (dangling else) • Java's static semantics rule: else matches with the nearest if
Nesting Selectors (cont.) • To force an alternative semantics, compound statements may be used:
    if (sum == 0) {
        if (count == 0)
            result = 0;
    }
    else
        result = 1;
• The above solution is used in C, C++, and C# • Perl requires all then and else clauses to be compound, which avoids the above problem
Nesting Selectors (cont.) • The problem can also be solved by alternative means of forming compound statements, e.g., using a special word end in Ruby. The else belongs to the inner if:
    if sum == 0 then
      if count == 0 then
        result = 0
      else
        result = 1
      end
    end
The else belongs to the outer if:
    if sum == 0 then
      if count == 0 then
        result = 0
      end
    else
      result = 1
    end
Multiple-Way Selection Statements • Allow the selection of one of any number of statements or statement groups • Switch in C, C++, Java:
    switch (expression) {
      case const_expr_1: stmt_1;
      ...
      case const_expr_n: stmt_n;
      [default: stmt_n+1]
    }
Switch in C, C++, Java • Design choices for C's switch statement – The control expression can be only an integer type – Selectable segments can be statement sequences, blocks, or compound statements – Any number of segments can be executed in one execution of the construct (there is no implicit branch at the end of selectable segments); break is used for exiting the switch, so a missing break is a reliability problem – The default clause is for unrepresented values (if there is no default and no case matches, the whole statement does nothing)
Switch in C, C++, Java • Case labels may appear almost anywhere inside the switch body, so code like the following is legal in C:
    switch (x)
      default:
        if (prime(x))
          case 2: case 3: case 5: case 7:
            process_prime(x);
        else
          case 4: case 6: case 8: case 9: case 10:
            process_composite(x);
Multiple-Way Selection in C# • It has a static semantics rule that disallows the implicit execution of more than one segment – Each selectable segment must end with an unconditional branch (goto or break) • The control expression and the case constants can be strings
    switch (value) {
      case -1: Negatives++; break;
      case 0: Zeros++; goto case 1;
      case 1: Positives++; break;
      default: Console.WriteLine("!!!\n"); break;
    }
Multiple-Way Selection in Ada • Ada:
    case expression is
      when choice list => stmt_sequence;
      ...
      when choice list => stmt_sequence;
      [when others => stmt_sequence;]
    end case;
• More reliable than C's switch – Once a stmt_sequence execution is completed, control is passed to the first statement after the case statement
Multiple-Way Selection Using if • Multiple selectors can appear as direct extensions to two-way selectors, using else-if clauses, for example in Python:
    if count < 10 :
        bag1 = True
    elif count < 100 :
        bag2 = True
    elif count < 1000 :
        bag3 = True
• More readable than deeply nested two-way selectors • Can compare ranges
Iterative Statements • The repeated execution of a statement or compound statement is accomplished either by iteration or recursion • Counter-controlled loops: – A counting iterative statement has a loop variable and a means of specifying the loop parameters: initial, terminal, and stepsize values – Design Issues: • What are the type and scope of the loop variable? • Should it be legal for the loop variable or loop parameters to be changed in the loop body?
Iterative Statements: C-based
    for ([expr_1] ; [expr_2] ; [expr_3]) statement
• The expressions can be whole statements or statement sequences, separated by commas – The value of a multiple-statement expression is the value of the last statement in the expression – If the second expression is absent, it is an infinite loop • Design choices: – There is no explicit loop variable, so the loop need not count – Everything can be changed in the loop – The 1st expression is evaluated once, the others with each iteration
Iterative Statements: C-based
    for (count1 = 0, count2 = 1.0;
         count1 <= 10 && count2 <= 100.0;
         sum = ++count1 + count2, count2 *= 2);
• C++ differs from earlier C in two ways:
  – The control expression can also be Boolean
  – The initial expression can include variable definitions (scope is from the definition to the end of the loop body)
• Java and C#
  – Differ from C++ in that the control expression must be Boolean
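The evaluation order of the three expressions can be made explicit by rewriting the loop on this slide as a pretest loop. The following Python sketch is a hand translation, not part of the original slides; `run_loop` is a hypothetical name, and the variable `sum` is renamed `total` to avoid shadowing Python's built-in.

```python
def run_loop():
    """Hand translation of the C for loop into an equivalent while loop."""
    # expr_1: evaluated once, before the loop starts
    count1, count2 = 0, 1.0
    total = None
    # expr_2: tested before every iteration
    while count1 <= 10 and count2 <= 100.0:
        # (the loop body is empty in the original statement)
        # expr_3: evaluated after every iteration, left to right
        count1 += 1                # ++count1
        total = count1 + count2    # sum = ++count1 + count2
        count2 *= 2                # count2 *= 2
    return count1, count2, total


print(run_loop())  # (7, 128.0, 71.0)
```

The loop stops because `count2` reaches 128.0 and fails the second half of the control expression, well before `count1` reaches 10.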
Logically-Controlled Loops
• Repetition control is based on a Boolean expression
• C and C++ have both pretest and posttest forms, and the control expression can be arithmetic:
    while (ctrl_expr)
        loop body
    do
        loop body
    while (ctrl_expr)
• Java is like C, except that the control expression must be Boolean (and the body can only be entered at the beginning, since Java has no goto)
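The practical difference between the two forms is that a posttest loop always executes its body at least once. Python has only the pretest form, so the sketch below emulates the posttest form with `while True` and a `break` at the bottom. Both function names are hypothetical.

```python
def pretest_runs(i, limit):
    """Count body executions of: while (i < limit) { i++; }"""
    runs = 0
    while i < limit:          # condition tested before the body
        runs += 1
        i += 1
    return runs


def posttest_runs(i, limit):
    """Count body executions of: do { i++; } while (i < limit);"""
    runs = 0
    while True:
        runs += 1
        i += 1
        if not (i < limit):   # condition tested after the body
            break
    return runs


print(pretest_runs(5, 3), posttest_runs(5, 3))  # 0 1
```

When the condition is true on entry, both forms run the body the same number of times; they differ only when it is false at the start.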
User-Located Loop Control
• Programmers decide a location for loop control (other than the top or bottom of the loop)
• Simple design for single loops (e.g., break)
• C, C++, Python, Ruby, and C# have unconditional unlabeled exits (break) and an unlabeled control statement, continue, that skips the remainder of the current iteration but does not exit the loop
• Java and Perl have unconditional labeled exits (break in Java, last in Perl) and labeled versions of continue
User-Located Loop Control
• In Java:
    outerLoop:
    for (row = 0; row < numRows; row++)
        for (col = 0; col < numCols; col++) {
            sum += mat[row][col];
            if (sum > 1000.0)
                break outerLoop;
        }
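Languages with only an unlabeled break need a workaround to leave two loops at once. One common approach, sketched here in Python, is to place the nested loops in a function and use `return` as the multi-level exit. The function name `sum_until_exceeds` and its `limit` parameter are hypothetical.

```python
def sum_until_exceeds(mat, limit=1000.0):
    """Add matrix elements row by row, stopping once the sum exceeds limit."""
    total = 0.0
    for row in mat:
        for value in row:
            total += value
            if total > limit:
                # Leaves both loops at once, like 'break outerLoop' in Java.
                return total
    return total


print(sum_until_exceeds([[500.0, 600.0, 1.0], [2.0]]))  # 1100.0
```

A plain `break` at the same position would exit only the inner loop, and the outer loop would go on to the next row.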
Iteration Based on Data Structures
• The number of elements in a data structure controls loop iteration
• The control mechanism is a call to an iterator function that returns the next element in the data structure in some chosen order, if there is one; otherwise the loop is terminated
• C's for statement can be used to build a user-defined iterator:
    for (p = root; p != NULL; p = traverse(p)) {}
Iteration Based on Data Structures
• PHP:
    reset($list);
    print("1st: " . current($list) . " ");
    while ($current_value = next($list))
        print("next: " . $current_value . " ");
• Java 5.0 (uses for, although it is called foreach)
  – For arrays and any other class that implements the Iterable interface, e.g., ArrayList
    for (String myElement : myList) { … }
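The same idea can be sketched in Python, where a class becomes usable in a for loop by defining `__iter__`, much as a Java class does by implementing Iterable. The `Node` and `LinkedList` classes below are hypothetical, written only for this illustration; the generator plays the role of the iterator function that hands back the next element or ends the loop.

```python
class Node:
    """One cell of a singly linked list."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node


class LinkedList:
    """A minimal linked list that supports data-structure-based iteration."""

    def __init__(self, values):
        self.root = None
        # Build back to front so the list keeps the given order.
        for value in reversed(list(values)):
            self.root = Node(value, self.root)

    def __iter__(self):
        # Same traversal as the C-style loop: start at the root,
        # stop at the null link, advance one node per step.
        p = self.root
        while p is not None:
            yield p.value
            p = p.next_node


for element in LinkedList(["a", "b", "c"]):
    print(element)
```

The loop header names only the collection; the traversal order and the termination test live inside the data structure.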
Unconditional Branching
• Transfers execution control to a specified place in the program, e.g., goto
• Major concern: readability
  – Some languages do not support a goto statement (e.g., Java)
  – C# offers a goto statement (it can be used in switch statements)
• Loop exit statements are restricted and somewhat disguised gotos
Guarded Commands
• Designed by Dijkstra
• Purpose: to support a new programming methodology that supports verification (correctness) during development
• Basis for two linguistic mechanisms for concurrent programming (in CSP and Ada)
• Basic idea: if the order of evaluation is not important, the program should not specify one
Selection Guarded Command
• Form
    if <Boolean exp> -> <statement>
    [] <Boolean exp> -> <statement>
    ...
    [] <Boolean exp> -> <statement>
    fi
• Semantics:
  – Evaluate all Boolean expressions
  – If more than one is true, choose one nondeterministically
  – If none is true, it is a runtime error
  – Program correctness cannot depend on which statement is chosen
Selection Guarded Command
    if x >= y -> max := x
    [] y >= x -> max := y
    fi
• Compare with the following code:
    if (x >= y)
        max = x;
    else
        max = y;
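The semantics of the selection guarded command can be approximated in Python. This is only a sketch: `guarded_if` is a hypothetical helper, and `random.choice` stands in for true nondeterminism. Each alternative is a pair of zero-argument callables, a guard and an action.

```python
import random


def guarded_if(*guarded_commands):
    """Run one action whose guard is true; fail if no guard is true."""
    # Evaluate all guards first, as the semantics require.
    enabled = [action for guard, action in guarded_commands if guard()]
    if not enabled:
        raise RuntimeError("no guard is true")
    # More than one guard may hold; pick any of the enabled actions.
    return random.choice(enabled)()


def maximum(x, y):
    # if x >= y -> max := x [] y >= x -> max := y fi
    return guarded_if(
        (lambda: x >= y, lambda: x),
        (lambda: y >= x, lambda: y),
    )


print(maximum(3, 7), maximum(4, 4))  # 7 4
```

When `x == y` both guards are true and either action may run, yet the result is the same. That is the sense in which correctness does not depend on the choice, whereas the if-else version fixes an order that the problem does not require.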
Loop Guarded Command
• Form
    do <Boolean> -> <statement>
    [] <Boolean> -> <statement>
    ...
    [] <Boolean> -> <statement>
    od
• Semantics: for each iteration
  – Evaluate all Boolean expressions
  – If more than one is true, choose one nondeterministically; then start the loop again
  – If none is true, exit the loop
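The loop form can be sketched the same way. The example below is not from the slides: it sorts four values with three guards, each of which swaps one adjacent out-of-order pair. `guarded_do` and `sort_four` are hypothetical names, and `random.choice` again stands in for the nondeterministic choice.

```python
import random


def guarded_do(state, *guarded_commands):
    """Repeat: run any action whose guard holds; exit when none holds."""
    while True:
        # Evaluate all guards against the current state.
        enabled = [action for guard, action in guarded_commands if guard(state)]
        if not enabled:
            return state          # no guard is true: exit the loop
        state = random.choice(enabled)(state)


def sort_four(q1, q2, q3, q4):
    # do q1 > q2 -> swap(q1, q2)
    # [] q2 > q3 -> swap(q2, q3)
    # [] q3 > q4 -> swap(q3, q4)
    # od
    def out_of_order(i):
        return lambda q: q[i] > q[i + 1]

    def swap(i):
        def action(q):
            q = list(q)
            q[i], q[i + 1] = q[i + 1], q[i]
            return tuple(q)
        return action

    commands = [(out_of_order(i), swap(i)) for i in range(3)]
    return guarded_do((q1, q2, q3, q4), *commands)


print(sort_four(4, 3, 2, 1))  # (1, 2, 3, 4)
```

Several guards are often true at once, and different runs may perform the swaps in different orders. The loop still terminates, because every swap removes one inversion, and it can only exit when all four values are in order.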