Introduction
§ This chapter introduces the fundamental semantic issues of variables
§ Imperative languages are abstractions of the von Neumann architecture
  • Memory
  • Processor
§ Variables are characterized by attributes
  • Type: to design one, you must consider scope, lifetime, type checking, initialization, and type compatibility
淡江大學資訊管理系 侯永昌 (Department of Information Management, Tamkang University)
Names
§ Design issues for names:
  • Maximum length?
  • Are connector characters allowed?
  • Are names case sensitive?
  • Are special words reserved words or keywords?
Names
§ Length
  • If names are too short, they cannot be connotative (the meaning cannot be read off the name)
  • The limit depends on how the symbol table is stored and processed
  • Language examples:
    - FORTRAN I: maximum 6
    - COBOL: maximum 30
    - FORTRAN 90 and ANSI C: maximum 31
    - Ada and Java: no limit, and all characters are significant
    - C++: no limit, but implementors often impose one
Names
§ Connectors: are connector characters allowed?
  • Pascal, Modula-2, and FORTRAN 77 do not allow them
  • Others do
§ Connectors can make names clearer
§ Example: S, SUMSQR, SUMOFSQUARE, SUM_OF_SQUARE. Which one is clearest?
Names
§ Case sensitivity: are uppercase and lowercase letters distinct in names?
  • C, C++, and Java names are case sensitive
  • Names in many other languages (e.g., Pascal, Ada, and Fortran) are not
  • Disadvantage of case sensitivity: it hinders readability and writability (names that look alike are different)
  • It is worse in C++ and Java because predefined names are mixed case (e.g., IndexOutOfBoundsException)
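Names
§ Special words: does the language have keywords or reserved words?
§ A keyword is a word that is special only in certain contexts, i.e., it has a special meaning only at particular positions in the program
§ Example, in Fortran:
    REAL APPLE     (in a declaration, REAL declares a variable)
    REAL = 3.4     (in a statement, REAL is a variable being assigned)
  • Disadvantage: poor readability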
Names
§ A reserved word is a special word that cannot be used as a user-defined name; the user cannot use it as a variable name
§ In ALGOL, reserved words are shown in boldface
§ In general, keywords and reserved words aid readability; they are used to delimit or separate statement clauses
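Names
§ From a language-design point of view, reserved words are better than keywords
§ Example:
    INTEGER REAL       (declares the variable REAL as an integer)
    REAL INTEGER       (declares the variable INTEGER as a real)
    REAL = 3
    INTEGER = 5.2
§ Using a keyword as a variable name hurts readability and invites misunderstanding, so avoid it whenever possible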
Variables
§ A variable is an abstraction of a memory cell. This was a major advance in programming: absolute addresses are no longer used, which makes programs easier to read, write, and maintain
§ Variables can be characterized as a sextuple of attributes: name, address, value, type, lifetime, and scope
§ Name: not all variables have one (some are anonymous)
Variables
§ Address: the memory address with which the variable is associated (also called its l-value)
  • A variable may have different addresses at different times during execution, e.g., a local variable X in a recursive program
  • A variable may have different addresses at different places in a program, e.g., a variable named sum declared in both procedure P1 and procedure P2
Variables
§ Aliases: if two variable names can be used to access the same memory location, the names are aliases
§ Aliases are harmful to readability (program readers must remember all of them)
How Aliases Can Be Created
§ Pointers and reference variables
    Example: new(p); q := p;   (p and q are aliases)
§ C and C++ unions: discussed in Chapter 6
§ Procedure parameters
    Example: Procedure SUB (var X, Y : integer); the call SUB (A, A) makes X and Y aliases
§ Explicit declaration
    Example: EQUIVALENCE (NAME1, NAME2, …) in Fortran; after EQUIVALENCE (A, B(5), D(1)), all three are aliases
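Two of the mechanisms above, assignment of a reference and passing the same variable for two parameters, can be sketched as follows. Python is used only for illustration (the slide's own examples are Pascal and Fortran), and the names `p`, `q`, `a`, and `sub` are hypothetical.

```python
def sub(x, y):
    """Append to x, then report the length of y.

    If the caller passes the same list for both parameters,
    x and y are aliases and the change made through x shows up in y.
    """
    x.append(99)
    return len(y)


# Assignment of a reference: p and q now name the same list object.
p = [1, 2, 3]
q = p
q[0] = 42                                # the write through q is visible through p

a = [1, 2, 3]
same = sub(a, a)                         # x and y alias one list
different = sub([1, 2, 3], [1, 2, 3])    # two distinct lists, no aliasing
```

A reader of `sub` alone cannot tell whether changing `x` also changes `y`, which is the readability cost of aliasing.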
Why Aliases?
§ To share memory, especially in the days when memory was small; today dynamic allocation is the usual way to reuse memory
§ To overlay data structures of different dimensions, which is convenient and can speed up computation
§ Example: EQUIVALENCE (MAT(1,1), COL1(1)), (MAT(1,2), COL2(1)), …
  To process one column of MAT, the array COL1 or COL2 can be used instead, because one-dimensional subscripting is much faster than two-dimensional subscripting
§ Avoid aliases where possible: they harm readability, and some of the original justifications for aliases are no longer valid (e.g., memory reuse in FORTRAN)
Variables
§ Type: determines the range of values of variables and the set of operations that are defined for values of that type; in the case of floating point, type also determines the precision
§ Example: a 2-byte integer ranges from -32768 to 32767; a 4-byte real ranges from 2.9387 × 10^-39 to 1.7014 × 10^38, with a precision of about 6 decimal digits
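Round Off Error
§ Example: 0.2897 × 10^0, 0.4976 × 10^0, 0.2488 × 10^1, 0.7259 × 10^1, 0.1638 × 10^2, 0.6249 × 10^2, 0.2162 × 10^3, 0.5233 × 10^3, 0.1403 × 10^4, 0.5291 × 10^4, assuming 4 significant digits
  • Adding from smallest to largest gives 0.7523 × 10^4; adding from largest to smallest gives 0.7520 × 10^4; the true value is 0.75229043 × 10^4
§ Example: 0.9964 × 10^1 + 0.9803 × 10^1 = 1.9767 × 10^1, stored as 0.1976 × 10^2 (the fifth digit is dropped)
§ Example: 0.2631 × 10^2 - 0.1976 × 10^2 = 0.0655 × 10^2 = 0.6550 × 10^1
§ Example: 0.3472 × 10^5 - 0.3471 × 10^5 = 0.0001 × 10^5 = 0.1000 × 10^2
§ Example: 0.3472 × 10^6 + 0.4437 × 10^2 = 0.3472 × 10^6 + 0.00004437 × 10^6 = 0.3472 × 10^6
§ Note how the number of significant digits changes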
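The order-of-summation effect can be reproduced with Python's `decimal` module, shown here as an illustrative sketch. It assumes every partial sum is rounded (not chopped) to four significant digits, which is what the context precision does; the helper name `sum_with_digits` is hypothetical.

```python
from decimal import Decimal, localcontext

# The ten values from the slide, written without the scaled exponents.
VALUES = [Decimal(s) for s in (
    "0.2897", "0.4976", "2.488", "7.259", "16.38",
    "62.49", "216.2", "523.3", "1403", "5291",
)]


def sum_with_digits(values, digits):
    """Add values left to right, rounding every partial sum to `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        total = Decimal(0)
        for v in values:
            total = total + v
        return total


small_to_large = sum_with_digits(sorted(VALUES), 4)
large_to_small = sum_with_digits(sorted(VALUES, reverse=True), 4)
exact = sum(VALUES)   # the default 28-digit context is exact for these inputs
```

Adding the small values first lets them accumulate before they meet the large ones, so fewer low-order digits are lost.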
Variables
§ Abstract memory cell: the physical cell or collection of cells associated with a variable
§ The l-value of a variable is its address
§ The r-value of a variable is its value
The Concept of Binding
§ A binding is an association, such as between an attribute and an entity, or between an operation and a symbol
§ Binding time is the time at which a binding takes place
Possible binding times
§ Language design time, e.g., bind operator symbols to operations
§ Language implementation time, e.g., bind the floating point type to a representation
§ Compile time, e.g., bind a variable to a type in C or Java
§ Load time, e.g., bind a FORTRAN 77 variable (or a C static variable) to a memory cell
§ Run time, e.g., bind a nonstatic local variable to a memory cell
Possible binding times
§ Example:
    int count;
    count = count + 10;
§ The data type of count is integer: bound at compile time
§ The set of possible values of count is -32768 to 32767: bound at compiler design (implementation) time
§ The value of count: bound at run time
§ + denotes integer addition: bound at compile time
§ 10 means ten rather than two: bound at compiler design time
§ The bit string used to represent 10, e.g., 1010: bound at compiler implementation time
The Concept of Binding
§ A binding is static if it first occurs before run time (e.g., at compile time) and remains unchanged throughout program execution
§ A binding is dynamic if it first occurs during execution or can change during execution of the program
The Concept of Binding
§ Design issues of type bindings
  • How is a type specified?
  • When does the binding take place?
§ If static, the type may be specified by either an explicit or an implicit declaration
The Concept of Binding
§ An explicit declaration is a program statement used for declaring the types of variables; every variable must be declared in advance
§ An implicit declaration is a default mechanism for specifying the types of variables (triggered by the first appearance of the variable in the program); a variable that takes the default type needs no declaration
The Concept of Binding
§ FORTRAN, PL/I, BASIC, and Perl provide implicit declarations
    Example, in Fortran: names beginning with I through N default to INTEGER; all others default to REAL
    Example, in Perl: $apple is a scalar, @apple is an array, %apple is a hash structure
§ Advantage: convenience, which helps writability
§ Disadvantage: writing errors (such as misspelled names) go undetected, which harms reliability
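The Fortran default rule is simple enough to state as a function. This is an illustrative sketch in Python, not part of any Fortran compiler; the name `implicit_fortran_type` is hypothetical.

```python
def implicit_fortran_type(name):
    """Return the default type Fortran gives an undeclared name.

    Names whose first letter is I through N default to INTEGER;
    all other names default to REAL.
    """
    first = name[0].upper()
    return "INTEGER" if "I" <= first <= "N" else "REAL"
```

The reliability problem follows directly: a misspelling such as KOUNT for COUNT is not rejected, it silently becomes a new INTEGER variable.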
The Concept of Binding
§ Dynamic type binding (JavaScript and PHP)
§ A variable is bound to a data type by an assignment statement rather than by a declaration
§ Example, in JavaScript:
    list = [2, 4.33, 6, 8];    (list is an array of length 4)
    list = 17.3;               (list is now a scalar number)
The Concept of Binding
§ Advantage: flexibility (generic program units)
§ Disadvantages:
  • High cost (dynamic type checking and interpretation): every variable needs a tag that records its data type, and checking is done at run time, which lengthens execution time
  • Type error detection by the compiler is difficult: type checking cannot be done at compile time, so a type mismatch is discovered only at run time
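Both points can be seen in a dynamically typed language. The sketch below uses Python purely as an illustration (the slide names JavaScript and PHP); `total_length` is a hypothetical helper.

```python
def total_length(items):
    """Sum the lengths of the items; fails at run time if an item has no length."""
    return sum(len(item) for item in items)


# The same name is bound to values of different types by assignment.
value = [2, 4.33, 6, 8]
first_type = type(value).__name__
value = 17.3
second_type = type(value).__name__

# Nothing rejects the bad call before execution; the mismatch surfaces as a
# run-time TypeError when len() is applied to an int.
try:
    total_length(["ab", "cde", 7])
    error = None
except TypeError as exc:
    error = type(exc).__name__
```

`total_length` is generic in the sense of the slide: it works for any sequence of objects that have a length, with no type declared anywhere.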
The Concept of Binding
§ For this reason, dynamic type binding is mostly found in interpreted languages rather than compiled ones, while compiled languages mostly use static type binding and therefore execute more efficiently
Storage Bindings & Lifetime
§ Lifetime: the lifetime of a variable begins when the variable is bound to a memory location and lasts until that binding ends
§ Allocation: getting a cell from some pool of available cells, i.e., giving a memory location to a variable
§ Deallocation: putting a cell back into the pool, i.e., the variable returns its memory location to the available space
Categories of variables by lifetimes
§ Static: bound to memory cells before execution begins and remaining bound to the same memory cells throughout execution
§ Examples: all FORTRAN 77 variables, C static variables
Categories of variables by lifetimes
§ Advantages (static):
  • Suited to direct addressing: efficiency
  • Global variables are easy to use: convenience
  • Support for history-sensitive subprograms
  • No run-time overhead for allocation and deallocation
§ Disadvantage: lack of flexibility (recursion is not allowed)
Categories of variables by lifetimes
§ Stack-dynamic: storage bindings are created for variables when their declaration statements are elaborated, and the variables remain bound until execution of the subprogram or block that declares them terminates
§ If scalar, all attributes except address are statically bound
§ Examples: local variables in C subprograms and Java methods
Categories of variables by lifetimes
§ Advantages (stack-dynamic):
  • Allows recursion
  • Subprograms share memory space
§ Disadvantages:
  • Overhead of allocation and deallocation
  • Subprograms cannot be history sensitive
  • Inefficient references (indirect addressing)
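The trade-off between history sensitivity and recursion can be sketched as follows. Python has no static local variables, so as an assumption of this sketch a variable in an enclosing function stands in for one (it is not statically allocated in the C sense); ordinary locals stand in for stack-dynamic variables. All names are hypothetical.

```python
def make_counter():
    """Return a function whose call count survives between calls.

    `calls` lives in the enclosing scope, so it plays the role of a
    static variable: created once and retained across activations.
    """
    calls = 0

    def counter():
        nonlocal calls
        calls += 1
        return calls

    return counter


def fresh_counter():
    """A local variable is created anew on every call, so no history is kept."""
    calls = 0
    calls += 1
    return calls


def factorial(n):
    """Each recursive activation gets its own `n`, which is what recursion requires."""
    return 1 if n <= 1 else n * factorial(n - 1)


history = make_counter()
history_results = [history(), history(), history()]
fresh_results = [fresh_counter(), fresh_counter(), fresh_counter()]
```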
Categories of variables by lifetimes
§ Explicit heap-dynamic: allocated and deallocated by explicit directives, specified by the programmer, which take effect during execution
§ These nameless objects are referenced only through pointers or references
    Examples: dynamic objects in C++ (via new and delete), all objects in Java
Categories of variables by lifetimes
§ Example:
    int *intnode;
    intnode = new int;    /* allocate an int cell */
    delete intnode;       /* deallocate the cell to which intnode points */
§ The type of the heap-dynamic variable (int) is bound at compile time, but its storage is bound at run time, and it is referenced only through the pointer intnode
Categories of variables by lifetimes
§ Advantages (explicit heap-dynamic):
  • Provides for dynamic storage management
  • Especially well suited to trees and linked lists
§ Disadvantages:
  • Inefficient: high cost of referencing, allocating, and deallocating memory
  • Unreliable: can produce dangling references and garbage (discussed in Chapter 6)
Categories of variables by lifetimes
§ Implicit heap-dynamic: allocation and deallocation are caused by assignment statements
§ Examples: all variables in APL; all strings and arrays in Perl and JavaScript
§ Advantage: flexibility
§ Disadvantages:
  • Inefficient, because all attributes are dynamic: run-time overhead
  • Loss of error detection: errors cannot be detected at compile time
Type Checking
§ Generalize the concept of operands and operators to include subprograms and assignments
§ Type checking is the activity of ensuring that the operands of an operator are of compatible types
§ A compatible type is one that is either legal for the operator or is allowed under language rules to be implicitly converted, by compiler-generated code, to a legal type. This automatic conversion is called a coercion
§ A type error is the application of an operator to an operand of an inappropriate type
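Coercion and type error can be contrasted in a few lines. The sketch is in Python for illustration only; there the conversion and the check happen at run time rather than in compiler-generated code, and `add` is a hypothetical helper.

```python
def add(a, b):
    """Apply + and report the result together with its type name."""
    result = a + b
    return result, type(result).__name__


coerced = add(1, 2.5)        # the int operand is converted to float

try:
    add("1", 2)              # no implicit conversion exists between str and int
    outcome = "no error"
except TypeError:
    outcome = "type error"
```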
Type Checking
§ If all type bindings are static, nearly all type checking can be static, i.e., done at compile time
§ If type bindings are dynamic, type checking must be dynamic, i.e., done at run time
§ A programming language is strongly typed if type errors are always detected
§ Although static type checking is generally preferred wherever possible, many properties can only be checked at run time
Type Checking
§ Example: V : array [1..100] of integer
  The attributes of V[I], such as its type and name, can be checked at compile time, but whether I lies between 1 and 100 cannot be checked until run time
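The run-time part of that check can be sketched as a small class. This is an illustration in Python, not how a Pascal compiler implements it; `BoundedArray` and `_offset` are hypothetical names.

```python
class BoundedArray:
    """Integer array indexed from `low` to `high`, with the subscript checked on every access."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self._cells = [0] * (high - low + 1)

    def _offset(self, index):
        # The value of the subscript is known only while the program runs,
        # so this test cannot be moved to compile time in general.
        if not self.low <= index <= self.high:
            raise IndexError(f"subscript {index} outside {self.low}..{self.high}")
        return index - self.low

    def __getitem__(self, index):
        return self._cells[self._offset(index)]

    def __setitem__(self, index, value):
        self._cells[self._offset(index)] = value
```

With `v = BoundedArray(1, 100)`, the accesses `v[1]` and `v[100]` succeed, while `v[0]` and `v[101]` raise `IndexError`.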
Strong Typing
§ The type of every variable must be determined at compile time (static binding)
§ At compile time, the types of the variables used in every operation must be checked for agreement
§ If a variable (storage location) may hold values of two or more data types at different times, it must be checked, statically or dynamically, before it is used
§ Advantage: allows the detection of misuses of variables that result in type errors
§ Disadvantage: reduced flexibility
Strong Typing Examples
§ FORTRAN 77 is not: parameters, EQUIVALENCE
§ Pascal is not: variant records
§ C and C++ are not: parameter type checking can be avoided; unions are not type checked
§ Ada is, almost (UNCHECKED_CONVERSION is a loophole); Java is similar
§ Coercion rules strongly affect strong typing; they can weaken it considerably (C++ versus Ada)
§ Although Java has only half as many assignment coercions as C++, its strong typing is still far less effective than that of Ada
Type Compatibility
§ Our concern is primarily for structured types
§ Example:
    type T1 = array [1..100] of integer;
         T2 = array [1..100] of integer;
    var  A : T1;
         B, C : T1;
         D : T2;
         E, F : array [1..100] of integer;
         G : array [1..100] of integer;
§ Are the following statements legal, or do they contain type errors?
    A[i] := B[i];
    A[i] := D[i] + C[i];
    B[i] := C[i-1] + 3;
    E[i] := F[i];
    F[i] := G[i]
Type Compatibility
§ Name type compatibility means that two variables have compatible types if they are in either the same declaration or in declarations that use the same type name
§ A, B, and C have the same type, which differs from the type of D
§ For anonymous types, name equivalence is treated differently in different languages:
    In Ada: E, F, and G all have different types
    In Pascal: E and F have the same type, which differs from the type of G
Name Type Compatibility
§ Advantage: easy to implement
§ Disadvantage: highly restrictive
  • Subranges of integer types are not compatible with integer types; a subrange is treated as a new type
      type Indextype is 1..100;
      count : Integer;
      index : Indextype;
      count := index + 3;   (?)
  • Formal parameters must be the same type as their corresponding actual parameters. In Pascal, if a structured data type is passed as a parameter from one procedure to another, the type cannot be defined twice, so its definition has to be made a global data type
Type Compatibility
§ Structure type compatibility means that two variables have compatible types if their types have identical structures
§ Advantage: more flexible
§ Disadvantage: harder to implement
§ In the earlier example (the declarations of T1, T2, and A through G), all the variables have the same type
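The two rules can be compared on the T1/T2 declarations with a toy type descriptor. This is a sketch in Python under the assumption that a type definition is modeled as one object; `ArrayType`, `name_compatible`, and `structure_compatible` are hypothetical names, not part of any compiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayType:
    """Descriptor for `array [low..high] of element`."""
    low: int
    high: int
    element: str


def name_compatible(t1, t2):
    """Name compatibility: both variables must use the very same type definition."""
    return t1 is t2


def structure_compatible(t1, t2):
    """Structure compatibility: the two definitions must have identical structure."""
    return t1 == t2


# type T1 = array [1..100] of integer;  T2 = array [1..100] of integer;
T1 = ArrayType(1, 100, "integer")
T2 = ArrayType(1, 100, "integer")

# var A : T1;  B, C : T1;  D : T2;
types = {"A": T1, "B": T1, "C": T1, "D": T2}
```

Under the name rule A and D are incompatible even though their definitions are written identically; under the structure rule they are compatible, but so would be any two unrelated types that happen to share a layout.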
Problems with Structure Type Compatibility
§ Are two record types compatible if they are structurally the same but use different field names?
§ Are two array types compatible if they are the same except that the subscripts are different (e.g., [1..10] and [0..9])?
§ Are two enumeration types compatible if their components are spelled differently?
§ With structural type compatibility, you cannot differentiate between types of the same structure (e.g., different units of speed, both float)
Type Compatibility Examples
§ Pascal: usually structure, but in some cases name is used (formal parameters)
§ C: structure, except for records (name)
§ C++: name
§ Ada: restricted form of name
  • Derived types allow types with the same structure to be different
  • Anonymous types are all unique, even in: A, B : array (1..10) of INTEGER;
Type Compatibility
§ Which one is better?
  • Name equivalence is safer, since it is more restrictive
  • Name equivalence is simpler to implement
Scope
§ The scope of a variable is the range of statements over which it is visible
§ Local variables: variables declared inside the program unit
§ Nonlocal variables: variables that are visible (usable) in the program unit but are not declared in it
§ The scope rules of a language determine how references to names are associated with variables
Static Scope rules
§ Based on program text
§ To connect a name reference to a variable, you (or the compiler) must find the declaration
§ Search process: search declarations, first locally, then in increasingly larger enclosing scopes, until one is found for the given name
§ Enclosing static scopes (to a specific scope) are called its static ancestors; the nearest static ancestor is called its static parent
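Static Scope rules
§ Suppose the compiler meets a variable X in procedure A. It determines the attributes of X (e.g., type, value, …) as follows:
  • First look for a declaration of X in procedure A (local variable)
  • If none is found, look in the procedure that declares procedure A, its static parent (global variable)
  • If still none is found, continue to the static parent's parent, and so on. If no declaration is ever found, X is either predefined or an error
  • Variables declared in an inner procedure are hidden from the outer procedures
  • A redefined variable always hides a variable with the same name in its static ancestors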
Scope
§ Example:
    Procedure BIG;
      Var X : integer;
      Procedure SUB1;
        Var X : real;
        begin
          SUB2;
          X := …;    { this X is the one declared in SUB1, so it is real; the X declared in BIG is hidden }
        end;
      Procedure SUB2;
        begin
          X := …;    { this X is the one declared in BIG, so it is integer }
        end;
    begin
      SUB1;
    end;
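The same nesting can be written with Python's nested functions, which are also statically scoped. This is an illustrative sketch: the integer and real declarations are modeled by the values 1 and 1.0, and the function names mirror the slide.

```python
def big():
    """Mirror of procedure BIG: returns the type names of the x seen in sub1 and in sub2."""
    x = 1                      # BIG's x, an integer

    def sub1():
        x = 1.0                # sub1's own x hides BIG's x
        seen_in_sub2 = sub2()  # the caller's x plays no part in sub2's lookup
        return type(x).__name__, seen_in_sub2

    def sub2():
        # No local x here: the reference resolves in the enclosing text, i.e. in big.
        return type(x).__name__

    return sub1()
```

Calling `big()` yields `("float", "int")`: even though sub2 is called from sub1, its x is the one declared in the textually enclosing unit.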
Static Scope rules
§ Variables can be hidden from a unit by having a "closer" variable with the same name
§ C++ and Ada allow access to these "hidden" variables by prefixing the variable with the scope name of the ancestor
  • In Ada: unit.name, e.g., BIG.X
  • In C++: class_name::name, e.g., BIG::X
Static Scope rules
§ For languages with predefined names (e.g., SIN, MAXINT, …), the scoping rule is somewhat more complicated:
  • In a language with reserved words: look for the name in the predefined name list first; only if it is not found there, apply the static scope rule (starting from the local variables)
  • In a language with keywords: apply the static scope rule first; only if no declaration is found, look for the name in the predefined name list
Blocks
§ Since ALGOL 60, many languages (e.g., Ada, PL/I, …) have allowed a compound statement to form a block with its own local variables
§ Examples:
    C and C++:  for (...) { int index; }
    Ada:        declare LCL : FLOAT; begin end
Static Scope Example
§ Example: assume MAIN calls A and B, A calls C and D, and B calls A and E
[Figure: static nesting structure of the program units MAIN, A, B, C, D, and E]
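Disadvantages of Static Scope
[Figure: two call graphs over MAIN, A, B, C, D, and E, one showing the desired calls and one showing the potential calls]
§ The set of potential procedure calls is too complex
§ Too much data access: any variable declared in an enclosing unit can be used by the inner units unless it is hidden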
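Static Scope
§ Discussion:
  • With static scoping, the range over which a variable can be used is determined at compile time, which favors efficiency
  • Static scoping encourages the use of global variables and reduces the levels of nesting in the program structure, so the program no longer faithfully reflects the conceptual design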
Dynamic Scope
§ Based on calling sequences of program units, not their textual layout (temporal versus spatial)
§ References to variables are connected to declarations by searching back through the chain of subprogram calls that forced execution to this point
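Dynamic Scope rules
§ Dynamic scoping is determined by the calling sequence rather than by how the units are nested in one another, so it is resolved only at run time
§ At run time, when a variable X is met in procedure A, its attributes are determined as follows:
  • First look for a declaration of X in procedure A (local variable)
  • If none is found, follow the calling sequence to the caller and look for a declaration of X there (global variable)
  • If the search reaches the MAIN program without finding a declaration of X, the result is a run-time error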
Scope Example
    MAIN
      - Var x : integer;
      SUB1
        - Var x : real
        ...
        call SUB2
        ...
      SUB2
        - reference to x
        ...
      call SUB1
      ...
§ MAIN calls SUB1, SUB1 calls SUB2, and SUB2 uses x
§ Dynamic scoping: SUB1 calls SUB2, so the x in SUB2 is the x declared in SUB1, and x is real
§ Static scoping: the x in SUB2 is the x declared in MAIN, and x is integer
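Evaluation of Dynamic Scoping
§ Every local variable of a caller is visible to (usable by) the callee, which makes exchanging data between program units more convenient. For the same reason misuse cannot be prevented, so reliability is poorer
§ Nonlocal variables cannot be statically type checked, so efficiency is poorer
§ The type of a nonlocal variable may differ from one call to the next, and the calling sequence must be traced to understand what a nonlocal variable means, so readability is poorer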
Scope Example
    MAIN
      - Var x : integer;
      SUB1
        - Var x : real
        ...
        call SUB2
        ...
      SUB2
        - reference to x
        ...
      call SUB1
      call SUB2
§ MAIN calls SUB1, SUB1 calls SUB2, and SUB2 uses x
§ Under dynamic scoping, when SUB1 calls SUB2, the x in SUB2 is the x declared in SUB1, and x is real
§ Under dynamic scoping, when MAIN calls SUB2, the x in SUB2 is the x declared in MAIN, and x is integer
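Python itself is statically scoped, so dynamic scoping has to be simulated. The sketch below keeps an explicit stack with one dictionary of declarations per active subprogram and searches it from the callee back toward MAIN; the class `DynamicScope` and its methods are hypothetical names, and the declared types are modeled as the strings "integer" and "real".

```python
class DynamicScope:
    """Toy interpreter state: one dictionary of local declarations per active subprogram."""

    def __init__(self):
        self.frames = []       # the most recently called subprogram is last

    def call(self, declarations, body):
        """Activate a subprogram with its local declarations, run it, then deactivate it."""
        self.frames.append(declarations)
        try:
            return body()
        finally:
            self.frames.pop()

    def lookup(self, name):
        """Search the callee first, then its caller, and so on back to MAIN."""
        for frame in reversed(self.frames):
            if name in frame:
                return frame[name]
        raise NameError(f"{name} is not declared in any active subprogram")


scope = DynamicScope()


def sub2():
    return scope.call({}, lambda: scope.lookup("x"))


def sub1():
    return scope.call({"x": "real"}, sub2)


def main():
    via_sub1 = sub1()          # MAIN calls SUB1, SUB1 calls SUB2
    direct = sub2()            # MAIN calls SUB2
    return via_sub1, direct


result = scope.call({"x": "integer"}, main)
```

The same reference in `sub2` resolves to two different declarations depending on the call path, which is why the calling sequence must be traced to read such a program.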
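Scope and Lifetime
§ Scope and lifetime are sometimes closely related, but they are different concepts
§ Scope: runs from the point of declaration in the program text to the end statement; it is a spatial concept
§ Lifetime: runs from the start of execution of the unit until execution reaches its end statement; it is a temporal concept
§ Pascal uses dynamic allocation: when a unit starts executing, the OS gives it a block of memory, and when the unit finishes, the OS takes that memory back, so the two notions look similar
§ Fortran, however, uses static allocation, so although its scoping is also static and local to the subprogram, the lifetime extends over the whole program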
Scope and Lifetime
§ Calls between subprograms also make scope and lifetime differ
§ Example:
    void printheader ( ) { }
    void compute ( ) {
      int sum;
      printheader ( );
    }
§ The scope of sum is the body of compute and does not include printheader, but the lifetime of sum extends over the time during which printheader executes
Referencing Environments
§ The referencing environment of a statement is the collection of all names that are visible in the statement
§ In a static-scoped language, it is the local variables plus all of the visible variables in all of the enclosing scopes
§ A subprogram is active if its execution has begun but has not yet terminated
§ In a dynamic-scoped language, the referencing environment is the local variables plus all visible variables in all active subprograms
Named Constants
§ A named constant is a variable that is bound to a value only when it is bound to storage; its value cannot be changed by an assignment or input statement
§ Used to parameterize programs
§ Example:
    int[] intList = new int [100];
    String[] strList = new String [100];
    for (index = 0; index < 100; index++) {…}
    average = sum / 100;
Named Constants
§ Example:
    final int len = 100;
    int[] intList = new int [len];
    String[] strList = new String [len];
    for (index = 0; index < len; index++) {…}
    average = sum / len;
§ Advantages:
  • Better readability, e.g., writing PI rather than 3.14159
  • Better modifiability: the program is easier to change
Named Constants
§ The binding of values to named constants can be either static (called manifest constants) or dynamic
§ Example, in C++: const int result = 2 * width + 1;
§ Languages:
  • Pascal: literals only
  • FORTRAN 90: constant-valued expressions
  • Ada, C++, and Java: expressions of any kind
Variable Initialization
§ Initialization: the binding of a variable to a value at the time it is bound to storage; unlike a named constant, its value can still be changed during its lifetime
§ Initialization is often done on the declaration statement
Variable Initialization Examples
§ In Java:
    int sum = 0;
§ In Fortran, with the DATA statement:
    DATA A, B, C /10.5, 23.4, 11.2/
§ In Ada:
    OLD_COUNT : integer := 0;                       (initialization)
    COUNT : integer := OLD_COUNT + 1;               (initialization)
    LIMIT : constant integer := OLD_COUNT + 1;      (named constant)
Variable Initialization Examples
§ In ALGOL 68:
    int first := 10;     (initialization)
    int second = 10;     (named constant)
  This violates the principle that constructs that look alike should mean the same thing
§ In Pascal: initial values cannot be specified
Variable Initialization
§ Static variables: initialized only once
§ Dynamic variables: initialized every time they are dynamically allocated