Compiler Design 13. Symbol Tables Kanat Bolazar March 4, 2010

Symbol Tables
- The job of the symbol table is to store all the names in the program and information about each name.
- In block-structured languages, roughly speaking, the symbol table collects information from declarations and uses that information whenever a name is used later in the program.
  - This information could be part of the syntax tree, but is put into a table for efficient access to names.
- If there are different occurrences of the same name, the symbol table assists in name resolution.

Symbol Table Entries: Simple Variables, Basic Information
- Variables (identifiers)
  - Character string (lexeme); may have limits on the number of characters
  - Data type
  - Storage class (if not already implied by the data type)
  - Name and lexical level of the block in which it is declared
  - Other access information, if necessary, such as modifiability constraints

Symbol Table Entries: Beyond Simple Variables
- Arrays
  - Also need the number of dimensions
  - Upper and lower bounds of each dimension
- Records and structures
  - List of fields
  - Information about each field
- Functions and procedures
  - Number and types of parameters
  - Type of return value
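
The entry kinds listed above could be modeled roughly as follows. This is a minimal sketch, not the course's implementation: the class names, field names, and string type names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Entry:
    """Basic information kept for every name (field names are illustrative)."""
    name: str                             # lexeme
    kind: str                             # "var", "array", "record", "func", ...
    data_type: str                        # for functions, the return type
    level: int                            # lexical level of the declaring block
    storage_class: Optional[str] = None   # only if not implied by the type
    modifiable: bool = True               # example of extra access information


@dataclass
class ArrayEntry(Entry):
    # One (lower, upper) pair per dimension; len(bounds) is the dimension count.
    bounds: List[Tuple[int, int]] = field(default_factory=list)


@dataclass
class RecordEntry(Entry):
    fields: List[Entry] = field(default_factory=list)


@dataclass
class FunctionEntry(Entry):
    params: List[Entry] = field(default_factory=list)


grid = ArrayEntry("grid", "array", "int", level=1, bounds=[(0, 9), (1, 5)])
f = FunctionEntry("max", "func", "int", level=0,
                  params=[Entry("a", "var", "int", 1), Entry("b", "var", "int", 1)])
```

Here `len(grid.bounds)` gives the number of dimensions and `len(f.params)` the number of parameters, so neither needs to be stored separately.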

Symbol Table Representation
- The two main operations are:
  - insert(name) makes an entry for this name
  - lookup(name) finds the relevant occurrence of the name by searching the table
- Lookups occur a lot more often than inserts.
- Hash tables are commonly used
  - Because of good average time complexity for lookup (O(1)).
[Figure: hash table with entries var1, class1, var3, fn1, var2, fn2]
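
As a rough illustration of the two operations, the sketch below backs a single-scope table with a Python `dict`, which is itself a hash table. The class name, the shape of the stored information, and the choice to raise on a duplicate are assumptions for the example.

```python
class SymbolTable:
    """A single-scope table backed by a Python dict (a hash table)."""

    def __init__(self):
        self._entries = {}

    def insert(self, name, info):
        # Names must be unique within one block, so a repeat is an error.
        if name in self._entries:
            raise KeyError(f"'{name}' is already declared")
        self._entries[name] = info

    def lookup(self, name):
        # Average-case O(1); returns None when the name is unknown.
        return self._entries.get(name)


table = SymbolTable()
table.insert("var1", {"kind": "var", "type": "int"})
table.insert("fn1", {"kind": "func", "type": "int"})
```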

Scope Analysis
- The scope of a name is tied to the idea of a block in the programming language:
  - Standard blocks (statement sequences, sometimes if statements)
  - Procedures and functions
  - Program (global program level)
  - Universe (predefined functions, etc.)
- Names must be unique within the block in which they are declared (no two objects with the same name in one block).
  - Some languages make exceptions for different types of objects (a function and a variable may have the same name).

Declaration Before Use?
- We are dealing primarily with languages in which declarations of names are required.
- Names of variables, constants, arrays, etc. must be declared before use.
- Names of functions and procedures vary:
  - C requires functions and procedures to also be declared before use, or at least given a prototype.
  - Java does not require this for methods (can call first, define later in the *.java file).
- Scope of a name (in a statically scoped language):
  - The scope of a constant, variable, array, etc. is from the end of its definition to the end of the block in which it is declared.
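
The declare-before-use rule for variables can be checked in a single pass over a block. The sketch below assumes a hypothetical flattened view of one block as a list of `("decl", name)` and `("use", name)` events in source order; a real compiler would do this while walking the syntax tree.

```python
def check_block(events):
    """Return the names that are used before they are declared.

    `events` is a hypothetical flattened view of one block, in source
    order: ("decl", name) or ("use", name).
    """
    declared = set()
    errors = []
    for action, name in events:
        if action == "decl":
            declared.add(name)       # the scope of `name` starts here
        elif name not in declared:
            errors.append(name)      # used outside its scope
    return errors


# Roughly: { y = 1; int x; x = 2; int y; }
events = [("use", "y"), ("decl", "x"), ("use", "x"), ("decl", "y")]
errors = check_block(events)   # ['y']
```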

Further Structure of Symbol Table
- For nested scopes, we may use lists of hash tables, with one element of the list for each scope.
- The lookup function will first search the current lexical level table and then continue on up the list, using the first occurrence of the name that it finds.
- Parts of the table not currently active may be kept for future semantic analysis.
[Figure: Table A and Table B, holding names x, z, x, y. B.x shadows A.x; lookup finds B.x first.]
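
A minimal sketch of the list-of-hash-tables lookup follows. The `Scope` class is hypothetical, and which names belong to table A and which to table B is an assumption here; only the shadowing of `A.x` by `B.x` comes from the slide.

```python
class Scope:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # enclosing scope, or None at the outermost level
        self.symbols = {}         # hash table for this lexical level

    def insert(self, ident, info):
        self.symbols[ident] = info

    def lookup(self, ident):
        # Search the current level first, then continue up the list.
        scope = self
        while scope is not None:
            if ident in scope.symbols:
                return scope.name, scope.symbols[ident]
            scope = scope.parent
        return None


a = Scope("A")
a.insert("x", "int")
a.insert("z", "char")
b = Scope("B", parent=a)
b.insert("x", "char")
b.insert("y", "int")

found = b.lookup("x")   # ("B", "char"): B.x shadows A.x
```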

More Symbol Table Functions
- In addition to lookup and insert, the symbol table will also need:
  - initializeScope(level), when a block is entered, to create a new hash table entry in the symbol table list
  - finalizeScope(level), on block exit, to put the current hash table into a background list
- This essentially makes a tree structure (scope A may contain scopes B1, B2, B3, ...), where one child may be distinguished as the active block.
- The symbol tables shown so far are all for the program being compiled; also needed is a way to look up names in the "universe".
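
One way to picture the scope tree is sketched below. It is only an illustration under assumptions: the method names are Python-style renderings of initializeScope and finalizeScope, and the level is derived from the active block rather than passed in as a parameter.

```python
class ScopeNode:
    def __init__(self, level, parent=None):
        self.level = level
        self.parent = parent
        self.symbols = {}
        self.children = []    # inner scopes, kept after they are closed


class SymbolTable:
    def __init__(self):
        self.root = ScopeNode(0)
        self.active = self.root      # the distinguished active block

    def initialize_scope(self):
        # Block entry: create a new hash table one level below the active one.
        node = ScopeNode(self.active.level + 1, self.active)
        self.active.children.append(node)
        self.active = node
        return node

    def finalize_scope(self):
        # Block exit: the table stays in the tree; only the active pointer moves.
        if self.active.parent is None:
            raise RuntimeError("cannot close the outermost scope")
        self.active = self.active.parent

    def insert(self, name, info):
        self.active.symbols[name] = info

    def lookup(self, name):
        node = self.active
        while node is not None:
            if name in node.symbols:
                return node.symbols[name]
            node = node.parent
        return None


st = SymbolTable()
st.insert("g", "int")
b1 = st.initialize_scope()
st.insert("t", "char")
st.finalize_scope()
b2 = st.initialize_scope()   # sibling of b1; `t` is not visible from here
```

After the last line, `b1` is still reachable through `st.root.children`, so later semantic analysis can revisit it even though it is no longer on the lookup path.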

Example: Predeclared Names in MicroJava
- Predeclared in MicroJava:
  - Types: int, char
  - Constants: null
  - Methods: ord(ch), chr(i), len(arr)
- We can put these in the symbol table as well:

  Kind    Name  Parameter
  Type    int
  Type    char
  Const   null
  Method  ord   Var ch
  Method  chr   Var i
  Method  len   Var arr
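
The predeclared names can be loaded into an outermost "universe" scope before compilation starts. The sketch below assumes plain dictionaries as scopes and a hypothetical `lookup` helper; it records only the kinds and parameter names given on the slide, not parameter or return types.

```python
def make_universe():
    """Build the outermost scope holding MicroJava's predeclared names."""
    return {
        "int":  {"kind": "Type"},
        "char": {"kind": "Type"},
        "null": {"kind": "Const"},
        "ord":  {"kind": "Method", "params": [("Var", "ch")]},
        "chr":  {"kind": "Method", "params": [("Var", "i")]},
        "len":  {"kind": "Method", "params": [("Var", "arr")]},
    }


def lookup(name, scopes):
    # `scopes` is ordered innermost first; the universe goes last.
    for scope in scopes:
        if name in scope:
            return scope[name]
    return None


universe = make_universe()
program = {"count": {"kind": "Var"}}   # hypothetical user declaration
scopes = [program, universe]
```

With this ordering, a user declaration is found before a predeclared name of the same spelling, and names in neither scope come back as `None`.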

Alternate Representation
- The lists of hash tables can be inefficient for lookup, since the system has to search up the list of lexical levels.
  - More names tend to be declared at level 0, thus making the most common occurrence the most expensive.
- An optimization of the symbol table as lists of hash tables is to keep one giant hash table.
  - Within that table, each name will have a list of occurrences identified by lexical level.
  - This representation keeps the (essentially) constant-time lookup.

Alternate Representation
- Single Symbol Table
  - Faster lookup.
  - Slow scope close: must remove c1, c2 after scope C ends.
- Hierarchical Symbol Table
  - Faster scope close.
  - Slow lookup (of globals, especially).
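
The single-table representation can be sketched as one dictionary whose values are stacks of occurrences tagged with a lexical level. The class and method names are assumptions, and the full scan in `close_scope` is the simplest possible version, chosen to make the "slow scope close" cost visible.

```python
from collections import defaultdict


class FlatSymbolTable:
    def __init__(self):
        # name -> stack of (level, info); the innermost declaration is last.
        self.table = defaultdict(list)
        self.level = 0

    def open_scope(self):
        self.level += 1

    def insert(self, name, info):
        self.table[name].append((self.level, info))

    def lookup(self, name):
        # One hash probe; the top of the stack is the visible declaration.
        stack = self.table.get(name)
        return stack[-1][1] if stack else None

    def close_scope(self):
        # The slow part: every name has to be checked for entries to remove.
        for name in list(self.table):
            stack = self.table[name]
            while stack and stack[-1][0] == self.level:
                stack.pop()
            if not stack:
                del self.table[name]
        self.level -= 1


t = FlatSymbolTable()
t.insert("x", "global int")
t.open_scope()                 # enter scope C
t.insert("c1", "int")
t.insert("x", "local char")
inner = t.lookup("x")          # "local char"
t.close_scope()                # c1 and the inner x are removed here
```

Keeping a per-scope list of the names declared in it would avoid scanning the whole table on close, at the cost of extra bookkeeping on insert.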

Static Scope
- The scoping system described so far assumes that the scope rules are for static scoping.
  - The static program layout of enclosing blocks determines the scoping of a name.
- There are also languages with dynamic scoping.
  - The scoping of a name depends on the call structure of the program at run time.
  - The name resolves to the closest block on the call stack that has a declaration of that name: the most recently called function or block.
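
The difference between the two rules shows up when a function uses a name that its caller also declares. The sketch below models scopes as plain dictionaries; the program it describes (`caller` declares its own `x` and then calls `show`, which uses `x`) is hypothetical.

```python
def resolve_static(name, scope):
    # Follow the enclosing-block links fixed by the program text.
    while scope is not None:
        if name in scope["vars"]:
            return scope["vars"][name]
        scope = scope["enclosing"]
    return None


def resolve_dynamic(name, call_stack):
    # Search the most recently called frame first, then older callers.
    for frame in reversed(call_stack):
        if name in frame["vars"]:
            return frame["vars"][name]
    return None


# `show` and `caller` are both declared at the global level.
global_scope = {"vars": {"x": "global x"}, "enclosing": None}
caller_scope = {"vars": {"x": "caller x"}, "enclosing": global_scope}
show_scope = {"vars": {}, "enclosing": global_scope}

# Run-time call order: program -> caller -> show
call_stack = [global_scope, caller_scope, show_scope]

static_result = resolve_static("x", show_scope)     # "global x"
dynamic_result = resolve_dynamic("x", call_stack)   # "caller x"
```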

Object-Oriented Scoping
- Languages like Java must keep symbol tables for:
  - The code being compiled
  - Any external classes that are known and referenced inside the code
  - The inheritance hierarchy above the class containing the code
- One method of implementation is to attach a symbol table to each class, with two nesting hierarchies:
  - One for lexical scoping inside individual methods
  - One to follow the inheritance hierarchy of the class
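
The two nesting hierarchies can be sketched with two link types: a lexical parent for blocks inside a method, and a superclass link for classes. The class names and the lookup order (locals first, then the class, then its superclasses) are assumptions for the example; it ignores external classes and access modifiers.

```python
class ClassTable:
    def __init__(self, name, superclass=None):
        self.name = name
        self.superclass = superclass   # inheritance hierarchy link
        self.members = {}              # fields and methods declared in this class

    def lookup_member(self, ident):
        cls = self
        while cls is not None:
            if ident in cls.members:
                return cls.name, cls.members[ident]
            cls = cls.superclass
        return None


class BlockScope:
    def __init__(self, owner, parent=None):
        self.owner = owner     # ClassTable of the class containing this code
        self.parent = parent   # lexical nesting link inside the method
        self.locals = {}

    def lookup(self, ident):
        # First hierarchy: lexical scopes inside the method.
        scope = self
        while scope is not None:
            if ident in scope.locals:
                return "local", scope.locals[ident]
            scope = scope.parent
        # Second hierarchy: the class and its superclasses.
        return self.owner.lookup_member(ident)


shape = ClassTable("Shape")
shape.members["area"] = "int"
circle = ClassTable("Circle", superclass=shape)
circle.members["radius"] = "int"

method = BlockScope(circle)
method.locals["r"] = "int"
inner = BlockScope(circle, parent=method)
```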

Testing and Error Recovery
- If a name is used, but the lookup fails to find any definition:
  - Give an error, but enter the name with dummy type information so that further uses do not also trigger errors.
- If a name is defined twice:
  - Give an ambiguity error; choose which type to use in later analysis, usually the first.
- Testing cases
  - Include all types of correct declarations
  - Incorrect cases may include
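
Both recovery strategies can be sketched in a few lines. The class name, the error message wording, and the `"<error>"` placeholder type are assumptions for the example.

```python
class CheckedTable:
    def __init__(self):
        self.entries = {}
        self.errors = []

    def declare(self, name, type_name):
        if name in self.entries:
            # Defined twice: report it and keep the first declaration.
            self.errors.append(f"'{name}' is already defined")
            return self.entries[name]
        self.entries[name] = type_name
        return type_name

    def use(self, name):
        if name not in self.entries:
            # Undefined: report once, then enter a dummy type so that
            # later uses of the same name stay quiet.
            self.errors.append(f"'{name}' is not defined")
            self.entries[name] = "<error>"
        return self.entries[name]


t = CheckedTable()
t.declare("x", "int")
t.declare("x", "char")   # duplicate: one error, "int" is kept
t.use("y")               # undefined: one error
t.use("y")               # no second error
```

A test suite for the table would pair cases like these with the correct declarations of every entry kind.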

References
- Nancy McCracken's original slides
- Linz University Compiler course materials (MicroJava)
- Keith Cooper and Linda Torczon, Engineering a Compiler, Elsevier, 2004.
- Kenneth C. Louden, Compiler Construction: Principles and Practice, PWS Publishing, 1997.
- Per Brinch Hansen, On Pascal Compilers, Prentice-Hall, 1985. Out of print.
- Aho, Lam, Sethi, and Ullman, Compilers: