Topic 3 - Binding Time and Symbol Tables
Dr. William A. Maniatty, Assistant Prof., Dept. of Computer Science, University at Albany
CSI 511 Programming Languages and Systems Concepts, Fall 2002
Monday Wednesday 2:30-3:50, LI 99

Introduction to Binding
Binding refers to associating an entity with a value, such as:
- Variable name with address
- Result of expression with ephemeral storage
- Constant with its value
- Separately compiled function with address

Binding Time
- Binding time refers to when the association between an entity and its value is made.

Design Binding Times
There are extra binding times available to programming language designers.
- Language Design Time: choose fundamental primitives, reserved words, etc.
- Compiler/Interpreter Implementation Time: how to internally represent language constructs.
- Programming Time: Language users

Object - What does it mean?
The word "object" has many meanings in programming languages.
- Object Module: a compiled (but not linked) module of a program.
- Object (OOP sense): an instance of a class in object-oriented programming.
- Object (programming language sense): the entities which are bound to values.
Use the programming language for

Binding Time Design Issues
Late binding of objects is typical of:
- Interpreters
- Dynamic type systems
Care needs to be taken to avoid ambiguity when binding:
- Name space collisions
- Polymorphism (overloading)

Object Attributes
Objects have many attributes:
- Lifetime (persistence)
- Type
- Scope
- Value/Address
A language should:
- Precisely specify attributes
- Be orthogonal (separate controls)

Object Persistence vs. Lifetime
Persistence: persistent objects last longer than the process that created them.
- Examples: files, databases.
- Memory for nonpersistent objects is called volatile (you lose data if powered down).
Lifetime: when is the storage allocated to an object available?

Events Impacting Object Lifetime
Lifetime has several aspects:
- Creation of objects
- Creation of bindings
- References to variables/subroutines/types/etc.
- (Re)activation and deactivation of bindings
- Destruction of objects

Allocation and Object Lifetime
How can objects be allocated?
- Statically: exist during the program's lifetime
- Stack: used for ephemeral objects
- Heap: objects have controlled lifetimes
Deallocation: how is it indicated?
- Explicitly: destructors/free/delete
- Implicitly: garbage collection
Initialization: separate (constructors)

Static Allocation
Done at compile time:
- Literals (and constants) bound to values
- Variables bound to addresses
Compiler notes undefined symbols:
- Library functions
- Global variables and system constants
Linker (and loader, if DLLs are used) resolves undefined references.

Stack Based Allocation
Stack layout is determined at compile time:
- Variables bound to offsets from top of stack.
- Layout called stack frame or activation record
- Compilers use registers
Function parameters and results need consistent treatment across modules:
- C/C++ use prototypes

Parameter Passing Conventions
- Actual parameters: at the call site
- Formal parameters: at the subroutine declaration
- Address: a memory location. Data objects containing addresses can be called:
  - Pointer: uses an explicit dereferencing operation.
  - Reference: uses implicit dereferencing.

Parameter Passing Conventions
- Call by value: copy the argument to the function
- Call by reference: pass a reference
- Call by address: pass the address to the function
- Call by result: pass the result back to the caller
- Call by value result: copy inputs to the function and copy results to the caller.
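A minimal sketch of three of these conventions, simulated in Python. The helper names (`call_by_value`, `call_by_reference`, `call_by_value_result`, `increment_all`) are hypothetical and exist only for illustration; Python itself passes object references, so the copying is done explicitly here.

```python
import copy

def increment_all(values):
    """Callee: adds 1 to every element of its formal parameter, in place."""
    for i in range(len(values)):
        values[i] += 1

def call_by_value(func, actual):
    func(copy.deepcopy(actual))      # callee works on a private copy

def call_by_reference(func, actual):
    func(actual)                     # callee works on the caller's object

def call_by_value_result(func, actual):
    local = copy.deepcopy(actual)    # copy in
    func(local)
    actual[:] = local                # copy out when the call returns

data = [1, 2]
call_by_value(increment_all, data)         # data is still [1, 2]
call_by_reference(increment_all, data)     # data is now [2, 3]
call_by_value_result(increment_all, data)  # data is now [3, 4]
```

With call by value result the caller's data changes only when the call returns, which matters if the callee can also reach the actual parameter by another route (aliasing).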

Call Site Code Generation for Stack Allocation
Call setup:
- Push register values on stack (if caller saves)
- Push parameters on stack (or load into registers)
Call function:
- Push return address on stack
- Go to function's start address
Call cleanup (if caller saves)

Subroutine Code Generation for Stack Allocation
- Prologue: push registers that will be overwritten on stack (if callee saves)
- Body of function
- Call cleanup (if caller saves)
- Copy results (if any)
- Pop parameters off stack.
- Pop registers
- Return

Stack and Frame Layout
- Stack here grows toward low addresses.

Heap Allocation
Heap provides dynamic memory management.
- Not to be confused with binary heap or binomial heap data structures.
- Under the hood, may periodically need to request additional memory from the O/S.
  - Requests large regions (requests are expensive).
- Done using a library (e.g. C)
- Or as part of the language (C++, Java,

Heap Data Structures
- Must track allocated/free memory.
- Metadata is added (pointers, size, etc.).

Memory Management
Holes can form where memory is freed.
- Coalesce adjacent holes
- Small holes fragment the memory.
Suppose you allocate a chunk smaller than the holes available; which hole do you take it from?
- First fit: the first hole found that it fits into
- Best fit: the smallest hole it fits into
- Worst fit: the largest hole it fits into
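The three fitting policies can be sketched as follows. This is an illustrative sketch only: the function name `choose_hole` and the representation of the free list as a plain list of hole sizes in address order are assumptions, not part of the course material.

```python
def choose_hole(holes, size, policy):
    """Return the index of the hole chosen for a request of `size`, or None.

    `holes` is a list of free-block sizes in address order (assumed layout).
    """
    # Keep only the holes that are large enough, remembering their positions.
    candidates = [(hole, index) for index, hole in enumerate(holes) if hole >= size]
    if not candidates:
        return None
    if policy == "first":
        return candidates[0][1]                          # first hole that fits
    if policy == "best":
        return min(candidates, key=lambda c: c[0])[1]    # smallest hole that fits
    if policy == "worst":
        return max(candidates, key=lambda c: c[0])[1]    # largest hole
    raise ValueError("unknown policy: " + policy)

holes = [30, 10, 50, 20]
first = choose_hole(holes, 15, "first")   # index 0 (the 30-unit hole)
best = choose_hole(holes, 15, "best")     # index 3 (the 20-unit hole)
worst = choose_hole(holes, 15, "worst")   # index 2 (the 50-unit hole)
```

Best fit leaves the smallest leftover hole, which tends to create fragments too small to reuse; worst fit leaves the largest leftover.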

When to Free Memory
Depends on the language.
- Explicit deallocation: needed for library approaches (e.g. C).
- Implicit deallocation: aka garbage collection
  - Garbage is unreferenced memory.
  - Compaction moves allocated memory to contiguous addresses (coalescing all holes).
  - Can cause timing variations (care is needed in real time systems).

Speeding Up Searching for a Free Block
- Recall that fitting schemes require finding sufficiently large blocks.
- Idea: organize the free list according to block size.
- Fibonacci Heap: use Fibonacci numbers for block sizes.
- Buddy System: use block sizes of 2^k
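A small sketch of the arithmetic behind the buddy system, assuming block sizes are powers of two and block addresses are offsets from the start of the managed region. The function names are hypothetical.

```python
def buddy_block_size(request, min_block=1):
    """Round a request up to the next power-of-two block size.

    `min_block` is assumed to be a power of two.
    """
    size = min_block
    while size < request:
        size *= 2
    return size

def buddy_address(offset, size):
    """Offset of the buddy of the block of `size` bytes starting at `offset`.

    Flipping the bit that corresponds to the block size yields the neighbour
    the block can be coalesced with when both are free.
    """
    return offset ^ size

size_for_100 = buddy_block_size(100)   # 128
buddy_of_first = buddy_address(0, 64)  # 64
```

Because the buddy's address is computed rather than searched for, coalescing freed blocks is cheap.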

Introduction to Scope
- Scope refers to the region of a program during which a binding is active.
- Consider the following code segment, what should the output be?

Scope Rules
Two popular answers to the problem:
- Static (lexical) scope: use compile time analysis. Normally in block structured languages, the containing scope is preferred; output is 1 in this case.
- Dynamic scope: value found at run time by resolving to the nearest stack frame in which the value is defined; output is 2 in this case.
Lexical scope is more popular.
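The two rules can be contrasted with a small hypothetical program. This is not the code segment from the slide; it is an illustrative sketch chosen so that lexical scope yields 1 and dynamic scope yields 2. Python is lexically scoped, so dynamic scope is simulated here with an explicit stack of frames (`frames` and `lookup` are assumed names).

```python
x = 1                          # global binding of x

def show_static():
    return x                   # resolved lexically: the global x

def caller_static():
    x = 2                      # a local x, invisible to show_static
    return show_static()

# Dynamic scope, simulated: each call pushes a frame of its own bindings.
frames = [{"x": 1}]            # the bottom frame plays the role of the global scope

def lookup(name):
    # Search from the most recently pushed frame toward the oldest.
    for frame in reversed(frames):
        if name in frame:
            return frame[name]
    raise NameError(name)

def show_dynamic():
    frames.append({})          # show declares nothing of its own
    try:
        return lookup("x")
    finally:
        frames.pop()

def caller_dynamic():
    frames.append({"x": 2})    # caller's local x
    try:
        return show_dynamic()
    finally:
        frames.pop()

static_result = caller_static()    # 1
dynamic_result = caller_dynamic()  # 2
```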

Variants of Static Scope
- Single global scope (BASIC): simplest
- Global and local (Fortran)
- Fortran common blocks
  - Support separate compilation
  - Give base address of region
  - Each program specifies (possibly different) layout
- Block structured (Pascal)

Modules and Separate Compilation
Modules support encapsulation (much like classes).
- Found in Modula 2, Euclid, Oberon and Ada.
For separate compilation:
- Define interfaces (data and subroutines)
- Export statements: published interfaces
- Import statements: use published interfaces

More Notation
Fundamental question: does the scope need to be explicitly imported to be visible?
- Yes: referred to as closed scope.
- No: referred to as open scope.
Aliasing: having more than one way to refer to the same object.

Classes and Scope
Classes provide encapsulation in object oriented programming (OOP).
- Support aggregating heterogeneous data and operations together.
- Interfaces are published
  - C++ public section in classes
- Internals can be hidden (a la private section in C++)
- Constructors and destructors supported.

OOP Features
I think of OOP as providing:
- Encapsulation: groups data with operations
- Inheritance: permits extension of more general base classes (and overriding behaviors)
- Polymorphism (overloading): allows operators/subroutines to have behaviors dependent on the types of arguments and results expected.

Dynamic Scope
- Dynamic scoping prefers the instance defined in the most recently invoked function.
- Not very popular currently (hard to debug)
- Found in interpreted languages (APL, older Lisp dialects, e.g. Emacs Lisp).
- Fans claim that it makes customizing subroutines easier.

Another Dynamic vs. Static Scope Example

Symbol Table Design Criteria
Symbol tables require:
- Fast insertion
- Fast lookup
- Occasional deletion (should be fast).
This motivates the use of hash tables.
- But ordinary hash tables are not good with nesting (a la classes/records/subroutines)

Operations on Symbol Tables (Static Scope)
A symbol table should support:
- Entering scope
- Leaving scope
- Inserting a symbol (with scope information)
- Looking up a symbol (with scope information)
It is often useful to store the symbol table in object/executables

LeBlanc-Cook Symbol Table Lookup 1/5
- Each scope is assigned a serial number
- Elements are never deleted from the table
- A scope counter is maintained
  - The first scope is 0
  - Every new scope encountered increments the counter
To track nesting, a scope stack is maintained.

LeBlanc-Cook Symbol Table Lookup 2/5
Put all symbols in a single hash table.
- Keywords not inserted (can use another hash).
- Entries indexed using both name and scope.
To look up a name:
- Look in the hash table for the (name, scope) pair.
- If not found:

LeBlanc-Cook Symbol Table Lookup 3/5
About hashing and hash functions:
- Is the universe of keys known in advance?
  - Yes: perfect minimal hashing may be possible.
  - No: must handle collisions
    - e.g. quadratic rehash or chaining
The symbol table algorithm has to handle collisions if hashing is used.

LeBlanc-Cook Symbol Table Lookup 4/5

LeBlanc-Cook Symbol Table Lookup 5/5

LeBlanc-Cook Symbol Table (An Example)
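The ingredients listed on the lookup slides (scope serial numbers, a scope counter, a scope stack, one hash table keyed by name and scope, no deletion) can be sketched as below. This is not the example from the slides. The class name `ScopedSymbolTable` is hypothetical, and the step taken when a (name, scope) pair is not found is an assumption: the sketch retries with each enclosing scope on the scope stack, innermost first.

```python
class ScopedSymbolTable:
    def __init__(self):
        self.table = {}          # (name, scope serial number) -> attributes
        self.scope_counter = 0   # the first scope is 0
        self.scope_stack = [0]   # innermost open scope is last

    def enter_scope(self):
        self.scope_counter += 1  # every new scope gets a fresh serial number
        self.scope_stack.append(self.scope_counter)
        return self.scope_counter

    def leave_scope(self):
        self.scope_stack.pop()   # entries stay in the table; they just become unreachable

    def insert(self, name, attributes):
        self.table[(name, self.scope_stack[-1])] = attributes

    def lookup(self, name):
        # Assumed fallback: try the innermost scope, then each enclosing one.
        for scope in reversed(self.scope_stack):
            if (name, scope) in self.table:
                return scope, self.table[(name, scope)]
        return None

symbols = ScopedSymbolTable()
symbols.insert("x", "int")
symbols.enter_scope()                  # scope 1
symbols.insert("x", "real")
inner = symbols.lookup("x")            # (1, "real")
symbols.leave_scope()
outer = symbols.lookup("x")            # (0, "int")
```

Python's `dict` stands in for the hash table here, so collision handling is hidden inside the dictionary implementation.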

Dynamic Scope and Symbol Table Management
Dynamic scope has different symbol table management needs than static scope.
- Needs insert, lookup, enter scope, leave scope.
  - Just like static scope
Competing approaches: simplicity vs. speed
- Association Lists: simple, fast scope entry/exit.
- Central Reference Table: like LeBlanc-Cook

Association Lists (A-Lists) combine list and stack treatment.
When a new scope is entered:
- Push its symbols on the stack
- Use a unidirectional linked list to implement the stack.
To find an item:
- Scan the stack starting at the top of the stack.
When leaving a scope
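A minimal sketch of an A-list, assuming each node of the singly linked list is a tuple `(name, value, scope_depth, next_node)`. The class and method names are hypothetical. The action on leaving a scope is not spelled out in the text above; the sketch assumes that scope's symbols are popped off the top of the stack.

```python
class AssociationList:
    def __init__(self):
        self.top = None    # head of the singly linked list (top of the stack)
        self.depth = 0     # current scope nesting depth

    def enter_scope(self):
        self.depth += 1

    def insert(self, name, value):
        # Push a new node; it points at the previous top of the stack.
        self.top = (name, value, self.depth, self.top)

    def lookup(self, name):
        node = self.top
        while node is not None:        # scan from the top of the stack down
            if node[0] == name:
                return node[1]
            node = node[3]
        raise KeyError(name)

    def leave_scope(self):
        # Assumed behavior: pop every symbol pushed by the scope being left.
        while self.top is not None and self.top[2] == self.depth:
            self.top = self.top[3]
        self.depth -= 1

alist = AssociationList()
alist.insert("x", 1)
alist.enter_scope()
alist.insert("x", 2)
inner_x = alist.lookup("x")   # 2: the most recent binding wins
alist.leave_scope()
outer_x = alist.lookup("x")   # 1
```

Scope entry and exit are cheap, but lookup is a linear scan, which is the simplicity versus speed trade-off.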

Central Reference Tables (1)
Central reference tables use hashing.
- Elements are keyed by symbol
- Each element is a stack
  - So we have one stack per symbol
  - Newest scope is on top
Use a unidirectional linked list to implement the stack.

Central Reference Tables (2)
To insert a symbol/scope:
- Hash on symbol, push symbol/scope on stack.
To find a symbol in a scope:
- Hash to the symbol's stack
- Use the scope at the top of the stack.
When leaving a scope:
- Pop all symbols in that scope from the top of their respective stacks.
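The three operations above can be sketched as follows. The names are hypothetical, a Python `dict` stands in for the hash table, and Python lists stand in for the linked-list stacks. The per-scope list of declared symbols (`scope_symbols`) is an assumption added so that leaving a scope knows which stacks to pop.

```python
class CentralReferenceTable:
    def __init__(self):
        self.stacks = {}            # symbol -> stack of (scope, value), newest on top
        self.scope_symbols = [[]]   # for each open scope, the symbols it declared

    def enter_scope(self):
        self.scope_symbols.append([])

    def insert(self, name, value):
        scope = len(self.scope_symbols) - 1
        self.stacks.setdefault(name, []).append((scope, value))
        self.scope_symbols[-1].append(name)

    def lookup(self, name):
        stack = self.stacks.get(name)
        if not stack:
            raise KeyError(name)
        return stack[-1][1]         # the newest scope's binding is on top

    def leave_scope(self):
        # Pop every symbol of the departing scope from its own stack.
        for name in self.scope_symbols.pop():
            self.stacks[name].pop()

crt = CentralReferenceTable()
crt.insert("x", 1)
crt.enter_scope()
crt.insert("x", 2)
crt_inner = crt.lookup("x")   # 2
crt.leave_scope()
crt_outer = crt.lookup("x")   # 1
```

Lookup is a single hash probe, at the cost of more work when a scope is left.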

Resolving Static Scope at Run Time
Consider a function F containing G.
- i.e. F and G are nested functions
- Suppose G uses an identifier in F's scope.
- How can G find F's frame pointer at run time?
- If G is always invoked by F, just do base + offset
  - Called static chaining: offset computed at compile time.
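One common way to picture this is a static link stored in each frame, pointing at the frame of the lexically enclosing function. The sketch below is an illustration under that assumption; `Frame` and `resolve` are hypothetical names, and dictionaries stand in for the base + offset addressing a compiler would emit.

```python
class Frame:
    def __init__(self, variables, static_link=None):
        self.variables = variables      # name -> value (stands in for offsets)
        self.static_link = static_link  # frame of the lexically enclosing function

def resolve(frame, hops, name):
    """Follow `hops` static links, then read `name` from that frame.

    `hops` is the nesting-depth difference, known at compile time.
    """
    for _ in range(hops):
        frame = frame.static_link
    return frame.variables[name]

f_frame = Frame({"count": 7})                  # activation of F
g_frame = Frame({"i": 0}, static_link=f_frame) # activation of G, nested in F
count_seen_by_g = resolve(g_frame, 1, "count") # 7
```

A recursive call of G from G would create a new frame whose static link still points at F's frame, even though its caller is G.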

An Example Requiring Dynamic Chaining

Subroutine Closures
Consider when a function, F, is passed as an argument to another function, G.
- E.g. comparison operators for sorting
- When G invokes F, how can we determine the scope?
Subroutine closures describe a function's scope and instruction space address
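A short sketch of the sorting case in Python, whose nested functions are closures. `make_comparator` and `sort_with` are hypothetical names; `functools.cmp_to_key` is the standard library adapter for comparison functions.

```python
from functools import cmp_to_key

def make_comparator(descending):
    # compare captures `descending` from the enclosing scope: code plus scope.
    def compare(a, b):
        result = (a > b) - (a < b)      # -1, 0, or 1
        return -result if descending else result
    return compare

def sort_with(items, compare):
    # Plays the role of G: it invokes the comparison function it was handed.
    return sorted(items, key=cmp_to_key(compare))

descending_order = sort_with([3, 1, 2], make_comparator(True))   # [3, 2, 1]
ascending_order = sort_with([3, 1, 2], make_comparator(False))   # [1, 2, 3]
```

When `sort_with` calls `compare`, the value of `descending` comes from the closure, not from `sort_with`'s own scope.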

Overloading Defined
An overloaded function or operator selects its semantics based on the types of its parameters and result.
- Implicit overloading: provided by the language
  - e.g. addition in Pascal can handle reals or integers
  - Write and Writeln in Pascal
- Explicit overloading: programmers resolve actions

Some Thoughts on Overloading
Should user defined overloading of operators be permitted?
- Pro: permits a consistent interface
  - e.g. A = B * C; good for integer, real, complex...
- Cons: you may need to read the entire program to understand a single line of code.
  - e.g. A = B * C; What if B and C are objects? Inheritance?
  - What to do with ephemeral objects? e.g. A * B *

More Thoughts
Meyer's Eiffel overloads A(i):
- Single parameter function
- Single index array
- Because functions and arrays are often interchangeable!
Operator vs. function overloading:
- Operator: syntactic sugar
- Function: programmers know to read code

Challenges of Overloading
- Compiler needs to be smart about types
- Separate compilation is hard
  - e.g. the Unix linker predates C++
- Name space mangling
  - Can break system tools (profilers/debuggers)
  - Compiler creates a unique name based on operator/function name and parameter/result types.
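A toy sketch of the idea behind mangling. The encoding below (a `_M` prefix followed by length-prefixed parts) is invented for illustration and is not the scheme of any real compiler; `mangle` is a hypothetical name.

```python
def mangle(name, param_types, result_type):
    """Build a unique linker-level name from a function name and its types.

    Each part is prefixed with its length so that parts cannot run together.
    """
    parts = [name] + list(param_types) + [result_type]
    return "_M" + "".join(str(len(part)) + part for part in parts)

int_area = mangle("area", ["int"], "int")                          # "_M4area3int3int"
double_area = mangle("area", ["double", "double"], "double")
```

The two overloads of the hypothetical `area` get different linker symbols, so a linker that knows nothing about types can still tell them apart. Tools that show raw symbols then display the mangled names, which is one way profilers and debuggers get confused.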

Templates in C++ are used for container classes.
- The base type describes elements in the container.
- The base type is a parameter to the template, passed when instantiated (or in a typedef).
Makes separate compilation hard:
- Typically interface needs to be compiled

Templates Pros and Cons
Templates promote code reuse.
- But also promote compiled code bloat
Recovering from syntax errors is hard!
- Make a small STL error, get pages of errors
- And the error messages are not helpful!
- Vandevoorde's Xroma: have the template developer give the compiler hints (also for code generation).

Summary
- Binding associates names and values
- Scope rules govern which name binds to which value in the event that a name is reused.
- Naming combined with type information permits overloading (promoting code reuse).