
Introduction
§ This chapter introduces the fundamental semantic issues of variables
§ Imperative languages are abstractions of the von Neumann architecture
  • Memory
  • Processor
§ Variables are characterized by attributes
  • Type: to design a type, one must consider scope, lifetime, type checking, initialization, and type compatibility
Tamkang University, Department of Information Management, 侯永昌

Names
§ Design issues for names:
  • Maximum length?
  • Are connector characters allowed?
  • Are names case sensitive?
  • Are special words reserved words or keywords?

Names
§ Length
  • If names are too short, they cannot be connotative (the meaning cannot be read from the name)
  • Language examples:
    • FORTRAN I: maximum 6
    • COBOL: maximum 30
    • FORTRAN 90 and ANSI C: maximum 31
    • Ada and Java: no limit, and all characters are significant
    • C++: no limit, but implementors often impose one
  • The limit depends on how the symbol table is stored and processed

Names
§ Connectors: are connector characters allowed?
  • Pascal, Modula-2, and FORTRAN 77 do not allow them
  • Other languages do
§ Connectors can make names clearer
§ Example: S, SUMSQR, SUMOFSQUARE, SUM_OF_SQUARE. Which one is clearest?

Names
§ Case sensitivity: are uppercase and lowercase letters in names distinct?
  • C, C++, and Java names are case sensitive
  • The names in other languages are not
  • Disadvantage of case sensitivity: it hinders readability and writability (names that look alike are different)
  • It is worse in C++ and Java because predefined names are mixed case (e.g. IndexOutOfBoundsException)

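Names
§ Special words: does the language have keywords or reserved words?
§ A keyword is a word that is special only in certain contexts, that is, it has a special meaning only at particular positions in a program. Example in Fortran:
    REAL APPLE      in a declaration, REAL introduces a type declaration
    REAL = 3.4      in an executable statement, REAL is a variable being assigned
  • Disadvantage: poor readability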

Names
§ A reserved word is a special word that cannot be used as a user-defined name; the user cannot use it as a variable name
§ In ALGOL, reserved words are written in boldface
§ In general, keywords and reserved words aid readability; they are used to delimit or separate statement clauses

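Names
§ From the standpoint of language design, reserved words are better than keywords
§ Example:
    INTEGER REAL      declares the variable REAL as an integer
    REAL INTEGER      declares the variable INTEGER as a real
    REAL = 3
    INTEGER = 5.2
  Using a keyword as a variable name harms readability and invites misunderstanding, so it should be avoided whenever possible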
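The difference between the two kinds of special words can be tried out in Python, which is used here only for illustration and is not one of the languages on the slides. In Python, `for` is a reserved word, whereas `match` is special only in certain contexts (in Python 3.10 and later) and can still be used as a variable name. A minimal sketch:

```python
import keyword

# "for" is reserved: it cannot be used as a variable name.
try:
    compile("for = 3", "<example>", "exec")
    for_is_assignable = True
except SyntaxError:
    for_is_assignable = False

# "match" is special only in certain contexts, so this assignment is legal.
namespace = {}
exec(compile("match = 3", "<example>", "exec"), namespace)

print(keyword.iskeyword("for"), for_is_assignable)      # True False
print(keyword.iskeyword("match"), namespace["match"])   # False 3
```

As with the Fortran example, a reader who sees `match = 3` has to work out from context that `match` is an ordinary variable here.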

Variables
§ A variable is an abstraction of a memory cell. This was a major advance in programming: absolute addresses are no longer used, which makes programs easier to read, write, and maintain
§ Variables can be characterized as a sextuple of attributes: (name, address, value, type, lifetime, and scope)
§ Name: not all variables have one (anonymous variables)

Variables
§ Address: the memory address with which the variable is associated (also called its l-value)
  • A variable may have different addresses at different times during execution, for example a local variable X in a recursive program
  • A variable may have different addresses at different places in a program, for example a variable sum declared in both procedure P1 and procedure P2

Variables
§ Aliases: two variable names are aliases if they can be used to access the same memory location
§ Aliases are harmful to readability (program readers must remember all of them)

How Aliases Can Be Created
§ Pointers and reference variables. Example: new(p); q := p; p and q are aliases
§ C and C++ unions: discussed in Chapter 6
§ Procedure parameters. Example: procedure SUB(var X, Y : integer); with the call SUB(A, A), X and Y are aliases
§ Explicit declaration. Example: EQUIVALENCE (NAME1, NAME2, …) in Fortran; with EQUIVALENCE (A, B(5), D(1)), all three are aliases
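The pointer and parameter cases can be reproduced in Python, used here only as an illustration language. In this sketch the names `p`, `q`, and `sub` mirror the slide, while the list contents are made up for the example.

```python
# Two names bound to the same list object are aliases, like p and q above.
p = [0, 0, 0]
q = p                 # no copy is made: q refers to the same object as p
q[0] = 99             # a change made through q ...
print(p[0])           # ... is visible through p: prints 99

def sub(x, y):
    """Increment through x, then read through y (hypothetical helper)."""
    x[0] += 1
    return y[0]

a = [10]
print(sub(a, a))      # x and y are aliases inside sub: prints 11
print(p is q)         # True: one object, two names
```

A reader of `sub` alone cannot tell whether changing `x` also changes `y`, which is the readability problem that aliases cause.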

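Why Aliases?
§ To share memory, especially in the days when memory was small; today dynamic allocation is generally used to reuse memory
§ To tie together data structures of different dimensions, which is convenient and speeds up computation
§ Example: EQUIVALENCE (MAT(1,1), COL1(1)), (MAT(1,2), COL2(1)), …
  To process one column of MAT, the array COL1 or COL2 can be used instead, because one-dimensional indexing is much faster than two-dimensional indexing
§ Avoid aliases wherever possible: they harm readability, and some of the original justifications for aliases are no longer valid, e.g. memory reuse in FORTRAN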
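The effect of tying a one-dimensional array to one column of a matrix can be imitated in Python with a `memoryview` over an `array`. This is only an analogy to EQUIVALENCE: the names `mat` and `col2` and the sample values are assumptions made for the sketch.

```python
from array import array

ROWS = 3
# MAT stored in column-major order (as Fortran does): column 1, then column 2.
mat = array("d", [11.0, 21.0, 31.0, 12.0, 22.0, 32.0])

# col2 is a view onto the storage of MAT's second column, not a copy.
col2 = memoryview(mat)[ROWS:2 * ROWS]

col2[0] = -1.0                          # write through the one-dimensional alias
print(mat[(2 - 1) * ROWS + (1 - 1)])    # MAT(1,2) read by 2-D indexing: -1.0
print(col2.tolist())                    # [-1.0, 22.0, 32.0]
```

The view needs a single subscript, while access through `mat` needs the row and column arithmetic, which is the speed argument made above.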

Variables
§ Type: determines the range of values of variables and the set of operations that are defined for values of that type; in the case of floating point, type also determines the precision
§ Example:
  • 2-byte integer: -32768 to 32767
  • 4-byte real: 2.9387*10^-39 to 1.7014*10^38, with a precision of about 6 decimal digits

Round Off Error
§ Example: 0.2897*10^0, 0.4976*10^0, 0.2488*10^1, 0.7259*10^1, 0.1638*10^2, 0.6249*10^2, 0.2162*10^3, 0.5233*10^3, 0.1403*10^4, 0.5291*10^4. Assuming 4 significant digits, adding from smallest to largest gives 0.7523*10^4, adding from largest to smallest gives 0.7520*10^4, and the true value is 0.75229043*10^4
§ Example: 0.9964*10^1 + 0.9803*10^1 = 1.9767*10^1 = 0.1976*10^2
§ Example: 0.2631*10^2 - 0.1976*10^2 = 0.0655*10^2 = 0.6550*10^1
§ Example: 0.3472*10^5 - 0.3471*10^5 = 0.0001*10^5 = 0.1000*10^2
§ Example: 0.3472*10^6 + 0.4437*10^2 = 0.3472*10^6 + 0.00004437*10^6 = 0.3472*10^6
§ Note how the number of significant digits changes
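The first example can be replayed with Python's `decimal` module by limiting the working precision to 4 significant digits. The sketch assumes that each partial sum is rounded (the module's default rounding mode); the slide does not say how excess digits are dropped. The helper name `limited_sum` is made up for the example.

```python
from decimal import Decimal, localcontext

values = [Decimal(s) for s in (
    "0.2897", "0.4976", "2.488", "7.259", "16.38",
    "62.49", "216.2", "523.3", "1403", "5291",
)]

def limited_sum(numbers, digits=4):
    """Add the numbers left to right, rounding every partial sum to `digits`."""
    with localcontext() as ctx:
        ctx.prec = digits
        total = Decimal(0)
        for number in numbers:
            total = total + number
        return total

small_to_large = limited_sum(sorted(values))
large_to_small = limited_sum(sorted(values, reverse=True))
exact = sum(values)     # the default 28-digit precision is exact for these values

print(small_to_large, large_to_small, exact)   # 7523 7520 7522.9043
```

Adding the small values first lets them accumulate before they meet the large ones; added last, each of them is mostly lost against a partial sum that is already four digits wide.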

Variables
§ Abstract memory cell: the physical cell or collection of cells associated with a variable
§ The l-value of a variable is its address
§ The r-value of a variable is its value

The Concept of Binding
§ A binding is an association, such as between an attribute and an entity, or between an operation and a symbol
§ Binding time is the time at which a binding takes place

Possible binding times
§ Language design time -- e.g., bind operator symbols to operations
§ Language implementation time -- e.g., bind floating point type to a representation
§ Compile time -- e.g., bind a variable to a type in C or Java
§ Load time -- e.g., bind a FORTRAN 77 variable to a memory cell (or a C static variable)
§ Run time -- e.g., bind a nonstatic local variable to a memory cell

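Possible binding times
§ Example:
    int count;
    count = count + 10;
§ The data type of count is integer: decided at compile time
§ The set of possible values of count is -32768 to 32767: decided at compiler design (implementation) time
§ The value of count: decided at run time
§ + denotes integer addition: decided at compile time
§ The literal 10 means ten, not two: decided at compiler design time
§ The bit string used to represent 10, for example 1010: decided at compiler implementation time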

The Concept of Binding
§ A binding is static if it first occurs before run time (at compile time) and remains unchanged throughout program execution
§ A binding is dynamic if it first occurs during execution or can change during execution of the program

The Concept of Binding
§ Design issues of type bindings
  • How is a type specified?
  • When does the binding take place?
§ If static, the type may be specified by either an explicit or an implicit declaration

The Concept of Binding
§ An explicit declaration is a program statement used for declaring the types of variables; every variable must be declared before use
§ An implicit declaration is a default mechanism for specifying types of variables (the first appearance of the variable in the program); that is, a variable that uses the default type need not be declared

The Concept of Binding
§ FORTRAN, PL/I, BASIC, and Perl provide implicit declarations
  • Example in Fortran: by default, names beginning with I through N are INTEGER; all others are REAL
  • Example in Perl: $apple is a scalar, @apple is an array, %apple is a hash structure
§ Advantage: convenience, which aids writability
§ Disadvantage: writing errors (such as misspelled names) cannot be detected, which harms reliability
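The Fortran default rule is simple enough to write down as a function. The sketch below is in Python (chosen only for illustration); the function name `implicit_fortran_type` and the sample variable names are made up.

```python
def implicit_fortran_type(name: str) -> str:
    """Return the default type Fortran assigns to an undeclared name."""
    first = name[0].upper()
    return "INTEGER" if "I" <= first <= "N" else "REAL"

print(implicit_fortran_type("COUNT"))   # REAL
print(implicit_fortran_type("KOUNT"))   # INTEGER
# A misspelled name is silently accepted as a new variable:
print(implicit_fortran_type("KONUT"))   # INTEGER
```

The last call shows the reliability problem: a typing mistake simply yields another implicitly declared variable instead of an error message.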

The Concept of Binding
§ Dynamic type binding (JavaScript and PHP)
§ The binding between a variable and its data type is established not by a declaration but by an assignment statement
§ Example in JavaScript:
    list = [2, 4.33, 6, 8];     list is an array of length 4
    list = 17.3;                list is now a floating-point scalar
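Python binds types dynamically in the same way, so the example carries over almost unchanged. In this sketch the variable is called `lst` and the two `*_binding` names are introduced only to record what was observed.

```python
lst = [2, 4.33, 6, 8]          # lst is bound to a list of length 4
first_binding = (type(lst).__name__, len(lst))

lst = 17.3                      # the same name is now bound to a float
second_binding = type(lst).__name__

print(first_binding, second_binding)   # ('list', 4) float
```

The call to `type` reads the tag that the run-time system keeps with every value, which is where the cost of dynamic type binding comes from.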

The Concept of Binding
§ Advantage: flexibility (generic program units)
§ Disadvantages:
  • High cost (dynamic type checking and interpretation): every variable needs a tag that records its data type, and checking must be done at run time, which lengthens execution time
  • Type error detection by the compiler is difficult: type checking cannot be done at compile time, so a type mismatch is discovered only at run time

The Concept of Binding
§ Consequently, languages with dynamic type binding are usually interpreted rather than compiled, while compiled languages usually use static type binding and therefore execute more efficiently

Storage Bindings & Lifetime
§ Lifetime: the lifetime of a variable begins when the variable is bound to a memory location and lasts until that binding ends
§ Allocation: getting a cell from some pool of available cells, that is, giving a memory location to a variable
§ Deallocation: putting a cell back into the pool, that is, the variable returns its memory location to the available space

Categories of variables by lifetimes
§ Static -- bound to memory cells before execution begins and remain bound to the same memory cells throughout execution
§ Example: all FORTRAN 77 variables, C static variables

Categories of variables by lifetimes
§ Advantages:
  • Suited to direct addressing: efficiency
  • Convenient for global variables
  • Support for history-sensitive subprograms
  • No run-time overhead for allocation and deallocation
§ Disadvantage: lack of flexibility (recursion is not allowed)

Categories of variables by lifetimes
§ Stack-dynamic -- storage bindings are created for variables when their declaration statements are elaborated, and the variables remain bound until execution of the enclosing subprogram or block terminates
§ If scalar, all attributes except address are statically bound
§ Example: local variables in C subprograms and Java methods

Categories of variables by lifetimes
§ Advantages:
  • Allow recursion
  • Share memory space
§ Disadvantages:
  • Overhead of allocation and deallocation
  • Subprograms cannot be history sensitive
  • Inefficient references (indirect addressing)
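The trade-off between history sensitivity and recursion can be illustrated in Python. Python has no static local variables, so this sketch uses a function attribute as a stand-in for one; that substitution, and all function names below, are assumptions made for the example.

```python
def count_calls_static():
    """Keeps its counter in storage that outlives each call (static-like)."""
    count_calls_static.calls += 1
    return count_calls_static.calls

count_calls_static.calls = 0     # initialized once, before any call

def count_calls_local():
    """Its counter is a local variable, created afresh on every call."""
    calls = 0
    calls += 1
    return calls

def factorial(n):
    """Recursion works because every activation gets its own n."""
    return 1 if n <= 1 else n * factorial(n - 1)

print([count_calls_static() for _ in range(3)])   # [1, 2, 3]
print([count_calls_local() for _ in range(3)])    # [1, 1, 1]
print(factorial(5))                               # 120
```

The first function remembers earlier calls, and the second cannot. If `n` in `factorial` lived in one fixed cell, each recursive call would overwrite the caller's value.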

Categories of variables by lifetimes
§ Explicit heap-dynamic -- allocated and deallocated by explicit directives, specified by the programmer, which take effect during execution
§ These nameless objects are referenced only through pointers or references
§ Example: dynamic objects in C++ (via new and delete), all objects in Java

Categories of variables by lifetimes
§ Example:
    int *intnode;
    intnode = new int;    /* allocate an int cell */
    delete intnode;       /* deallocate the cell to which intnode points */
§ The heap-dynamic variable is bound to type int at compile time, but it is bound to storage only at run time and is referenced through the pointer intnode

Categories of variables by lifetimes
§ Advantages:
  • Provide for dynamic storage management
  • Especially well suited to trees and linked lists
§ Disadvantages:
  • Inefficient: high cost for referencing, allocating, and deallocating memory
  • Unreliable: can produce dangling references and garbage (discussed in Chapter 6)

Categories of variables by lifetimes
§ Implicit heap-dynamic -- allocation and deallocation caused by assignment statements
§ Example: all variables in APL; all strings and arrays in Perl and JavaScript
§ Advantage: flexibility
§ Disadvantages:
  • Inefficient, because all attributes are dynamic: run-time overhead
  • Loss of error detection: errors cannot be detected at compile time

Type Checking
§ Generalize the concept of operands and operators to include subprograms and assignments
§ Type checking is the activity of ensuring that the operands of an operator are of compatible types
§ A compatible type is one that is either legal for the operator, or is allowed under language rules to be implicitly converted, by compiler-generated code, to a legal type. This automatic conversion is called a coercion
§ A type error is the application of an operator to an operand of an inappropriate type
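Both cases, a compatible operand that is converted automatically and an incompatible one that is rejected, can be observed in Python. This is only a run-time analogue: Python converts and checks during execution rather than through compiler-generated code, and the variable names are made up for the sketch.

```python
mixed = 1 + 2.5            # the int operand is converted to float
print(type(mixed).__name__, mixed)   # float 3.5

try:
    "total: " + 1          # no implicit conversion is defined for str + int
    outcome = "no error"
except TypeError:
    outcome = "type error"
print(outcome)             # type error
```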

Type Checking
§ If all type bindings are static, nearly all type checking can be static: at compile time
§ If type bindings are dynamic, type checking must be dynamic: at run time
§ A programming language is strongly typed if type errors are always detected
§ Although static type checking is generally preferred wherever possible, many properties can only be checked at run time

Type Checking
§ Example: V : array [1..100] of integer
  Checking the attributes of V[I], such as type and name, can be done at compile time, but checking whether I lies between 1 and 100 has to wait until run time
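The run-time part of that check can be sketched as a small Python class. `RangedArray` is a hypothetical helper that imitates a Pascal array with declared bounds; it is not a library type.

```python
class RangedArray:
    """Integer array indexed low..high (hypothetical helper, Pascal-style)."""

    def __init__(self, low, high):
        self.low, self.high = low, high
        self._cells = [0] * (high - low + 1)

    def _offset(self, index):
        # This test depends on the run-time value of the index.
        if not self.low <= index <= self.high:
            raise IndexError(f"index {index} outside {self.low}..{self.high}")
        return index - self.low

    def __getitem__(self, index):
        return self._cells[self._offset(index)]

    def __setitem__(self, index, value):
        self._cells[self._offset(index)] = value

v = RangedArray(1, 100)
v[100] = 7
print(v[100])          # 7
try:
    v[101] = 7
except IndexError as error:
    print(error)       # index 101 outside 1..100
```

The bounds 1 and 100 are known when the array is declared, but the value of the subscript is known only when the statement executes.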

Strong Typing
§ The type of every variable must be determined at compile time (static binding)
§ At compile time, the types of the variables used in every operation must be checked for agreement
§ If a variable (storage location) is allowed to hold values of two or more data types at different times, a static or dynamic check must be made before it is used
§ Advantage: allows the detection of the misuses of variables that result in type errors
§ Disadvantage: reduced flexibility

Strong Typing Examples
§ FORTRAN 77 is not strongly typed: parameters, EQUIVALENCE
§ Pascal is not: variant records
§ C and C++ are not: parameter type checking can be avoided; unions are not type checked
§ Ada is, almost (UNCHECKED_CONVERSION is a loophole); Java is similar
§ Coercion rules strongly affect strong typing -- they can weaken it considerably (C++ versus Ada)
§ Although Java has only half as many assignment coercions as C++, its strong typing is still far less effective than that of Ada

Type Compatibility
§ Our concern is primarily for structured types
§ Example:
    type T1 = array [1..100] of integer;
         T2 = array [1..100] of integer;
    var  A : T1;
         B, C : T1;
         D : T2;
         E, F : array [1..100] of integer;
         G : array [1..100] of integer;
  Are the following statements legal, or are they in error?
    A[i] := B[i];
    A[i] := D[i] + C[i];
    B[i] := C[i-1] + 3;
    E[i] := F[i];
    F[i] := G[i]

Type Compatibility
§ Name type compatibility means that two variables have compatible types if they are in either the same declaration or in declarations that use the same type name
§ A, B, and C have the same data type, which differs from that of D
§ For unnamed data types (anonymous types), name equivalence raises some further considerations:
  • In Ada: E, F, and G all have different types
  • In Pascal: E and F have the same type, but G does not

Name Type Compatibility
§ Advantage: easy to implement
§ Disadvantage: highly restrictive
  • Subranges of integer types are not compatible with integer types; a subrange is treated as a new type
      type Indextype is 1..100;
      count : Integer;
      index : Indextype;
      count := index + 3;    (?)
  • Formal parameters must be the same type as their corresponding actual parameters. In Pascal, if a structured data type is passed as a parameter from one procedure to another, the type cannot be defined twice, so its type definition has to be made a global data type

Type Compatibility
§ Structure type compatibility means that two variables have compatible types if their types have identical structures
§ Advantage: more flexible
§ Disadvantage: harder to implement
§ In the earlier example (variables A through G), all the variables have the same type
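The two rules can be compared with a toy model of the declarations of A through G. The sketch is in Python and everything in it (`ArrayType`, `name_compatible`, `structure_compatible`) is hypothetical; the name rule shown treats every anonymous type as distinct, which corresponds to the Ada interpretation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ArrayType:
    """A toy type descriptor: an optional name plus the array's structure."""
    name: Optional[str]      # None models an anonymous type
    low: int
    high: int
    element: str

def name_compatible(t1, t2):
    """Same type name; anonymous types are all distinct (Ada-like rule)."""
    return t1.name is not None and t1.name == t2.name

def structure_compatible(t1, t2):
    """Same structure, whatever the names."""
    return (t1.low, t1.high, t1.element) == (t2.low, t2.high, t2.element)

T1 = ArrayType("T1", 1, 100, "integer")
T2 = ArrayType("T2", 1, 100, "integer")
anon_EF = ArrayType(None, 1, 100, "integer")   # type of E and F
anon_G = ArrayType(None, 1, 100, "integer")    # type of G

types = {"A": T1, "B": T1, "C": T1, "D": T2,
         "E": anon_EF, "F": anon_EF, "G": anon_G}

print(name_compatible(types["A"], types["B"]))        # True
print(name_compatible(types["A"], types["D"]))        # False
print(name_compatible(types["E"], types["G"]))        # False
print(structure_compatible(types["A"], types["G"]))   # True
```

The name rule is a single string comparison, whereas the structure rule has to compare every component, which hints at why the former is easier to implement.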

Problems with Structure Type Compatibility
§ Are two record types compatible if they are structurally the same but use different field names?
§ Are two array types compatible if they are the same except that the subscripts are different? (e.g. [1..10] and [0..9])
§ Are two enumeration types compatible if their components are spelled differently?
§ With structural type compatibility, you cannot differentiate between types of the same structure (e.g. different units of speed, both float)

Type Compatibility Examples
§ Pascal: usually structure, but in some cases name is used (formal parameters)
§ C: structure, except for records (name)
§ C++: name
§ Ada: restricted form of name
  • Derived types allow types with the same structure to be different
  • Anonymous types are all unique, even in: A, B : array (1..10) of INTEGER;

Type Compatibility
§ Which one is better?
  • Name equivalence is safer, since it is more restrictive
  • Name equivalence is simpler to implement

Scope
§ The scope of a variable is the range of statements over which it is visible
§ Local variables: variables defined (declared) inside the program unit
§ Nonlocal variables: variables that are visible (usable) in the program unit but are not defined (declared) in it
§ The scope rules of a language determine how references to names are associated with variables

Static Scope rules
§ Based on program text
§ To connect a name reference to a variable, you (or the compiler) must find the declaration
§ Search process: search declarations, first locally, then in increasingly larger enclosing scopes, until one is found for the given name
§ Enclosing static scopes (to a specific scope) are called its static ancestors; the nearest static ancestor is called a static parent

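Static Scope rules
§ Suppose the compiler encounters a variable X in procedure A. It determines the attributes of X (e.g., type, value, …) as follows:
  • First look for the declaration of X in procedure A (local variable)
  • If it is not found, look for the declaration of X in the procedure that declares procedure A, its static parent (global variable)
  • If it is still not found, continue with the static parent's parent, and so on. If it is never found, X is either predefined or an error
§ Variables declared in an inner procedure are hidden from the outer procedures
§ A redefined variable always hides a variable with the same name in its static ancestors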

Scope
§ Example:
    procedure BIG;
      var X : integer;
      procedure SUB1;
        var X : real;
        begin
          SUB2;
          X := …;     { this X is the X declared in SUB1, so it is real; the X declared in BIG is hidden }
        end;
      procedure SUB2;
        begin
          X := …;     { this X is the X declared in BIG, so it is integer }
        end;
      begin
        SUB1;
      end;
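Python is statically scoped, so the BIG/SUB1/SUB2 structure can be transcribed and run. In this sketch the functions return the values they see instead of assigning to X, and the values 1 and 2.5 are made up to stand for an integer and a real.

```python
def big():
    x = 1                      # BIG's x (an int)

    def sub2():
        return x               # resolved from the program text: BIG's x

    def sub1():
        x = 2.5                # SUB1's own x hides BIG's x
        return x, sub2()       # the call to sub2 does not see SUB1's x

    return sub1()

print(big())    # (2.5, 1)
```

Although `sub2` is called from `sub1`, it finds the `x` of its static parent `big`, because the search follows the nesting of the text and not the chain of calls.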

Static Scope rules
§ Variables can be hidden from a unit by having a "closer" variable with the same name
§ C++ and Ada allow access to these "hidden" variables by prefixing the variable with the scope name of the ancestor
  • In Ada: unit.name, e.g. BIG.X
  • In C++: class_name::name, e.g. BIG::X

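Static Scope rules
§ For languages with predefined names (e.g., SIN, MAXINT, …), the scoping rule is somewhat more complicated:
  • In a language with reserved words: first look for the definition in the list of predefined names; only if it is not found there does the static scope rule apply, starting from the local variables
  • In a language with keywords: first use the static scope rule to look for the definition; only if it is not found is the list of predefined names consulted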

Blocks
§ Starting with ALGOL 60, many languages (e.g., Ada, PL/I, …) allow a group of statements to form a block of its own, with its own local variables
§ Examples:
  • C and C++: for (...) { int index; ... }
  • Ada: declare LCL : FLOAT; begin ... end

Static Scope Example
§ Example: assume MAIN calls A and B, A calls C and D, and B calls A and E
[Figure: static nesting structure of the program units MAIN, A, B, C, D, and E]

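Disadvantages of Static Scope
[Figure: two call graphs over MAIN, A, B, C, D, and E: the desired calls and the potential calls]
§ The set of potential procedure calls is far more complex than the desired one
§ Too much data access: any variable declared in an enclosing unit can be used by the units nested inside it, unless it is hidden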

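Static Scope
§ Discussion:
  • With static scoping, the range over which a variable can be used is determined at compile time, which favors efficiency
  • Static scoping encourages the use of global variables and reduces the level of nesting in the program structure, so the program no longer faithfully reflects the conceptual design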

Dynamic Scope
§ Based on calling sequences of program units, not their textual layout (temporal versus spatial)
§ References to variables are connected to declarations by searching back through the chain of subprogram calls that forced execution to this point

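Dynamic Scope rules
§ Dynamic scoping is determined by the calling sequence, not by the nesting relationship between units, so it is decided only at run time
§ At run time, when a variable X is encountered in procedure A, the attributes of X are determined as follows:
  • First look for the declaration of X in procedure A (local variable)
  • If it is not found, follow the calling sequence to the caller and look for the declaration of X there (global variable)
  • If no declaration of X is found even in the MAIN program, a run-time error results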

Scope Example
    MAIN
      - declaration: var x : integer;
      SUB1
        - declaration: var x : real;
        ...
        call SUB2
        ...
      SUB2
        ...
        - reference to x
        ...
      ...
      call SUB1
      ...
§ MAIN calls SUB1, SUB1 calls SUB2, and SUB2 uses x
§ Dynamic scoping: SUB2 was called by SUB1, so the x in SUB2 is the x of SUB1, and x is real
§ Static scoping: the x in SUB2 is the x of MAIN, and x is integer

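Evaluation of Dynamic Scoping
§ Every local variable of a caller is visible to (usable by) the callee, which makes data exchange between program units more convenient. For the same reason misuse cannot be prevented, so reliability is poorer
§ Static type checking is impossible for nonlocal variables, so efficiency is poorer
§ The type of a nonlocal variable may differ from one call to the next, and the calling sequence must be traced to understand what a nonlocal variable means, so readability is poorer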

Scope Example
    MAIN
      - declaration: var x : integer;
      SUB1
        - declaration: var x : real;
        ...
        call SUB2
        ...
      SUB2
        ...
        - reference to x
        ...
      ...
      call SUB1
      call SUB2
§ MAIN calls SUB1, SUB1 calls SUB2, and SUB2 uses x
§ When SUB1 calls SUB2: the x in SUB2 is the x of SUB1, and x is real
§ When MAIN calls SUB2: the x in SUB2 is the x of MAIN, and x is integer
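Python itself is statically scoped, but the dynamic lookup rule can be simulated with an explicit stack of activation frames. In this sketch `call_stack`, `lookup`, and `call` are hypothetical helpers, and the values 1 and 2.5 stand for the integer and real declarations of x.

```python
call_stack = []      # one dict of local variables per active subprogram

def lookup(name):
    """Search the most recent activation first, then its callers."""
    for frame in reversed(call_stack):
        if name in frame:
            return frame[name]
    raise NameError(name)       # reached MAIN without finding a declaration

def call(subprogram, local_vars):
    """Hypothetical helper: push a frame, run the subprogram, pop the frame."""
    call_stack.append(local_vars)
    try:
        return subprogram()
    finally:
        call_stack.pop()

def sub2():
    return type(lookup("x")).__name__        # reference to x

def sub1():
    return call(sub2, {})                    # SUB1 calls SUB2

def main():
    via_sub1 = call(sub1, {"x": 2.5})        # SUB1 declares x : real
    direct = call(sub2, {})                  # MAIN calls SUB2 directly
    return via_sub1, direct

print(call(main, {"x": 1}))                  # MAIN declares x : integer
# prints ('float', 'int')
```

The same reference in `sub2` yields a different variable, and a different type, depending on who called it, which is why static type checking of nonlocal variables is not possible under dynamic scoping.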

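Scope and Lifetime
§ Scope and lifetime are sometimes closely related, but are different concepts
§ Scope: extends from the point of declaration to the end statement; it is a spatial (textual) concept
§ Lifetime: extends from the start of execution until execution reaches the end statement; it is a temporal concept
§ Pascal uses dynamic allocation: a block of memory is provided when execution begins and reclaimed when it ends, so in Pascal the two notions appear similar
§ Fortran, however, uses static allocation: although its scoping is also static and local to the subprogram, the lifetime extends over the entire program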

Scope and Lifetime
§ Calls between subprograms also cause scope and lifetime to differ
§ Example:
    void printheader() { ... }
    void compute() {
      int sum;
      ...
      printheader();
    }
§ The scope of sum is confined to compute and does not include printheader, but the lifetime of sum extends over the time during which printheader executes
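The same situation can be run in Python, used here only for illustration. The variable is named `sum_` to avoid shadowing Python's built-in `sum`, and the value 42 is made up for the sketch.

```python
def printheader():
    try:
        return sum_          # not visible here: scope is limited to compute
    except NameError:
        return "sum_ is not visible in printheader"

def compute():
    sum_ = 42                # sum_ comes into existence here ...
    message = printheader()  # ... and stays alive while printheader runs
    return message, sum_     # its value is intact after the call returns

print(compute())    # ('sum_ is not visible in printheader', 42)
```

During the call to `printheader` the variable still exists and keeps its value, yet it cannot be named there: it is alive but out of scope.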

Referencing Environments
§ The referencing environment of a statement is the collection of all names that are visible in the statement
§ In a static-scoped language, it is the local variables plus all of the visible variables in all of the enclosing scopes
§ A subprogram is active if its execution has begun but has not yet terminated
§ In a dynamic-scoped language, the referencing environment is the local variables plus all visible variables in all active subprograms

Named Constants
§ A named constant is a variable that is bound to a value only when it is bound to storage; its value cannot be changed by an assignment or input statement
§ Used to parameterize programs
§ Example:
    int[] intList = new int[100];
    String[] strList = new String[100];
    for (index = 0; index < 100; index++) {…}
    average = sum / 100;

Named Constants
§ Example:
    const int len = 100;
    int[] intList = new int[len];
    String[] strList = new String[len];
    for (index = 0; index < len; index++) {…}
    average = sum / len;
§ Advantages:
  • Better readability: for example, using PI instead of 3.14159
  • Better modifiability: the program is easier to change

Named Constants
§ The binding of values to named constants can be either static (called manifest constants) or dynamic
§ Example in C++: const int result = 2 * width + 1;
§ Languages:
  • Pascal: literals only
  • FORTRAN 90: constant-valued expressions
  • Ada, C++, and Java: expressions of any kind

Variable Initialization
§ Initialization: the binding of a variable to a value at the time it is bound to storage; the value can still be changed during the variable's lifetime
§ Initialization is often done on the declaration statement

Variable Initialization Examples
§ In Java:
    int sum = 0;
§ In Fortran, with the DATA statement:
    DATA A, B, C /10.5, 23.4, 11.2/
§ In Ada:
    OLD_COUNT : integer := 0;                      initialization
    COUNT : integer := OLD_COUNT + 1;              initialization
    LIMIT : constant integer := OLD_COUNT + 1;     named constant

Variable Initialization Examples
§ In ALGOL 68:
    int first := 10;     initialization
    int second = 10;     named constant
  This violates the principle that constructs that look similar should have the same meaning
§ In Pascal: initial values cannot be specified

Variable Initialization
§ Static variable: initialized only once
§ Dynamic variable: initialized each time it is dynamically allocated