Syntax-Directed Translation By Dr D Sasi Raja Sekhar


Syntax Directed Translation 1) The required information is associated with a programming language construct by attaching attributes to the grammar symbols representing the construct; the attribute values are computed by semantic rules. 2) There are two notations for associating semantic rules with productions: a) Syntax-Directed Definitions b) Translation Schemes

Syntax Directed Translation Syntax-directed definitions are high-level specifications for translations (implementation details are hidden). Translation schemes indicate the order in which semantic rules are to be evaluated, so some implementation details may be shown.

Conceptual View of Syntax-directed Translation Input string -> parse tree -> dependency graph -> evaluation order for semantic rules. An implementation does not have to follow this outline literally.

Syntax-Directed Definitions A syntax-directed definition is a generalization of a context-free grammar in which each grammar symbol has an associated set of attributes, partitioned into two subsets called the synthesized and inherited attributes of that grammar symbol. An attribute can be a string, a number, a type, a memory location, etc. The value of an attribute is defined by a semantic rule associated with a production.

Syntax-Directed Definitions, II • The value of a synthesized attribute at a node is computed from the values of attributes at the children of that node in the parse tree. • The value of an inherited attribute is computed from the values of attributes at the siblings and the parent of that node. • Semantic rules set up dependencies between attributes; these are represented by a graph called a dependency graph. • From the dependency graph set up by the semantic rules, an evaluation order is derived.

Annotated Parse-Trees • A semantic rule may also have side effects, such as printing a value or updating a global variable. • A parse tree that also shows the values of the attributes at each node is called an annotated parse tree. The process of computing the attribute values at the nodes is called annotating or decorating the parse tree.

Form of a Syntax-Directed Definition • Semantic rules for a production A -> α have the form b = f(c1, …, cn), where f is a function and either b is a synthesized attribute of A and c1, …, cn are attributes of symbols in α, or b is an inherited attribute of some symbol in α and c1, …, cn are attributes of symbols in A and α.

Synthesized Attributes • A syntax-directed definition that uses synthesized attributes exclusively is said to be an S-attributed definition. • A parse tree for an S-attributed definition can always be annotated by evaluating the semantic rules for the attributes at each node bottom up, from the leaves to the root.

Example of a Syntax-Directed Definition
Grammar symbols: L, E, T, F, n, +, *, (, ), digit
Nonterminals E, T, F have an attribute called val. The terminal digit has an attribute called lexval, whose value is provided by the lexical analyzer. The rule associated with the production L -> E n for the starting nonterminal L is just a procedure that prints the value of the arithmetic expression generated by E; it can be viewed as defining a dummy attribute for the nonterminal L.
PRODUCTION        SEMANTIC RULE
L -> E n          print(E.val)
E -> E1 + T       E.val = E1.val + T.val
E -> T            E.val = T.val
T -> T1 * F       T.val = T1.val * F.val
T -> F            T.val = F.val
F -> (E)          F.val = E.val
F -> digit        F.val = digit.lexval

Draw the Tree Example 3*5+4 n

Example 3*5+4 n: annotated parse tree
L  print(19)
  E  val=19
    E  val=15
      T  val=15
        T  val=3
          F  val=3
            digit  lexval=3
        *
        F  val=5
          digit  lexval=5
    +
    T  val=4
      F  val=4
        digit  lexval=4
  n
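The bottom-up annotation of this S-attributed example can be sketched in Python as a postorder walk. This is a minimal illustration, not the slides' own code: the tuple encoding of parse-tree nodes and the helper names val and F are assumptions made for the example.

```python
# Parse-tree nodes are tuples (symbol, child, ...); terminals are plain strings,
# except digit, which is ("digit", lexval). This encoding is an assumption.

def val(node):
    """Compute the synthesized attribute val bottom-up (postorder)."""
    symbol, *kids = node
    if symbol == "digit":
        return kids[0]                  # F.val = digit.lexval
    if len(kids) == 1:
        return val(kids[0])             # E.val = T.val, T.val = F.val, F -> digit
    if kids[0] == "(":
        return val(kids[1])             # F.val = E.val
    left, op, right = kids
    if op == "+":
        return val(left) + val(right)   # E.val = E1.val + T.val
    return val(left) * val(right)       # T.val = T1.val * F.val


def F(n):
    """Build the subtree F -> digit with digit.lexval = n."""
    return ("F", ("digit", n))


# Parse tree for 3*5+4 (the end marker n and the L node are left out).
tree = ("E",
        ("E", ("T", ("T", F(3)), "*", F(5))),
        "+",
        ("T", F(4)))

print(val(tree))  # 19
```

Every attribute is computed only from attributes of the children, which is why a single leaves-to-root pass is enough.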

Inherited Attributes • Inherited attributes are convenient for expressing the dependence of a programming language construct on the context in which it appears. • We can use an inherited attribute to keep track of whether an identifier appears on the left or right side of an assignment in order to decide whether the address or the value of the identifier is needed.

Example with Inherited Attributes • The nonterminal T has a synthesized attribute type whose value is determined by the keyword in the declaration. The semantic rule L.in = T.type, associated with the production D -> T L, sets the inherited attribute L.in to the type in the declaration. The rules then pass this type down the parse tree using the inherited attribute L.in.
PRODUCTION        SEMANTIC RULE
D -> T L          L.in = T.type
T -> int          T.type = integer
T -> real         T.type = real
L -> L1 , id      L1.in = L.in; addtype(id.entry, L.in)
L -> id           addtype(id.entry, L.in)

Draw the Tree Example real id 1, id 2 , id 3

Example real id1, id2, id3: annotated parse tree
D
  T  type=real
    real
  L  in=real          addtype(id3, real)
    L  in=real        addtype(id2, real)
      L  in=real      addtype(id1, real)
        id  entry=id1
      ,
      id  entry=id2
    ,
    id  entry=id3
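One way to see the inherited attribute at work is to pass it as a function argument. The sketch below assumes a pre-split token list and a dictionary standing in for the symbol table; the names symtab, D and L are illustrative choices, and addtype is modelled only by its side effect.

```python
symtab = {}                              # stand-in for the symbol table


def addtype(entry, typ):
    symtab[entry] = typ                  # side effect: record the type


def D(tokens):
    # D -> T L        L.in = T.type
    t_type = {"int": "integer", "real": "real"}[tokens[0]]   # T.type
    L(tokens[1:], t_type)


def L(tokens, inh):
    # inh plays the role of the inherited attribute L.in
    if len(tokens) == 1:                 # L -> id
        addtype(tokens[0], inh)
    else:                                # L -> L1 , id
        L(tokens[:-2], inh)              # L1.in = L.in
        addtype(tokens[-1], inh)


D("real id1 , id2 , id3".split())
print(symtab)
```

The type travels down the chain of L nodes unchanged, so every identifier in the list ends up with the type named at the start of the declaration.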

Dependency Graph • The interdependencies among the inherited and synthesized attributes at the nodes in a parse tree can be depicted by a directed graph called a dependency graph. • Construction: put each semantic rule into the form b = f(c1, …, ck) by introducing a dummy synthesized attribute b for every semantic rule that consists of a procedure call. • E.g., for the production L -> E n, the rule print(E.val) becomes dummy = print(E.val).

Dependency Graph Construction
for each node n in the parse tree do
    for each attribute a of the grammar symbol at n do
        construct a node in the dependency graph for a
for each node n in the parse tree do
    for each semantic rule b = f(c1, …, ck) associated with the production used at n do
        for i = 1 to k do
            construct an edge from the node for ci to the node for b
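The two loops above translate almost directly into Python. In this sketch the parse tree is assumed to be a flat list of node records, and each rule is written as a pair (b, [c1, …, ck]) of "node.attr" strings; that representation is a simplification chosen for the example, not something the slides prescribe.

```python
def dependency_graph(tree):
    """Return {attribute: set of attributes that depend on it}.

    tree is a list of dicts with keys 'name', 'attrs' and 'rules'
    (an assumed encoding); a rule is (b, [c1, ..., ck]).
    """
    graph = {}
    for n in tree:                           # first loop: one node per attribute
        for a in n["attrs"]:
            graph[f'{n["name"]}.{a}'] = set()
    for n in tree:                           # second loop: an edge ci -> b per argument
        for b, cs in n["rules"]:
            for c in cs:
                graph[c].add(b)
    return graph


# Parse tree for "real id": D -> T L, T -> real, L -> id.
# The addtype call at L is turned into a dummy attribute first.
tree = [
    {"name": "D", "attrs": [], "rules": [("L.in", ["T.type"])]},
    {"name": "T", "attrs": ["type"], "rules": [("T.type", [])]},
    {"name": "L", "attrs": ["in", "dummy"],
     "rules": [("L.dummy", ["id.entry", "L.in"])]},
    {"name": "id", "attrs": ["entry"], "rules": []},
]

for source, targets in dependency_graph(tree).items():
    print(source, "->", sorted(targets))
```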

Example I [Figure: annotated parse tree for 3*5+4 n, with digit.lexval = 3, 5 and 4 at the leaves, T.val = 15 for 3*5, T.val = 4 for the right operand, and E.val = 19 for the whole expression, which L prints.]

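Example I [Figure: dependency graph for 3*5+4 n. Each val node depends on the val or lexval nodes of its children: lexval=3 and lexval=5 feed val=3 and val=5, which feed val=15; lexval=4 feeds val=4; val=15 and val=4 feed val=19. The dummy attribute for print(E.val) depends on val=19.]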

Example II [Figure: annotated parse tree for real id1, id2, id3, with T.type = real, L.in = real at each L node, and addtype(id.entry, real) called for id1, id2 and id3.]

Dependency Graph-Example-II

Dependency Graph Example-II Nodes in the dependency graph are marked by numbers. There is an edge to node 5 for L.in from node 4 for T.type, according to the semantic rule L.in = T.type for the production D -> T L. The two downward edges into nodes 7 and 9 arise because L1.in depends on L.in according to the semantic rule L1.in = L.in for the production L -> L1 , id. Each of the semantic rules addtype(id.entry, L.in) associated with the L-productions leads to the creation of a dummy attribute. Nodes 6, 8, and 10 are constructed for these dummy attributes.

L-Attributed definitions • An SDD is L-attributed if, among the attributes of the symbols in a production body, dependency-graph edges go from left to right but never from right to left. • More precisely, each attribute must be either • Synthesized, or • Inherited, subject to the following restriction: if there is a production A -> X1 X2 … Xn and an inherited attribute Xi.a computed by a rule associated with this production, then the rule may only use: • Inherited attributes associated with the head A • Either inherited or synthesized attributes associated with the occurrences of symbols X1, X2, …, Xi-1 located to the left of Xi • Inherited or synthesized attributes associated with this occurrence of Xi itself, but only in such a way that there is no cycle in the dependency graph formed by the attributes of this Xi

Evaluation Order • A topological sort of a directed acyclic graph is any ordering m1, m2, …, mk of the nodes of the graph such that edges go from nodes earlier in the ordering to later nodes; that is, if mi -> mj is an edge from mi to mj, then mi appears before mj in the ordering. • In a topological sort, the attributes c1, c2, …, ck that a semantic rule b = f(c1, c2, …, ck) depends on are available at a node before f is evaluated.

Evaluation Order • The translation specified by a syntax-directed definition can be made precise as follows: 1) The underlying grammar is used to construct a parse tree. 2) The dependency graph is constructed. 3) From a topological sort of the dependency graph, an evaluation order for the semantic rules is obtained. 4) Evaluation of the semantic rules in this order yields the translation of the input string.

Topological sort for example dependency graph Each of the edges in the dependency graph goes from a lower-numbered node to a higher-numbered node, so a topological sort is obtained by writing down the nodes in the order of their numbers. This gives the following program (a4 is the attribute at node 4, and so on):
a4 = real
a5 = a4
addtype(id3.entry, a5)
a7 = a5
addtype(id2.entry, a7)
a9 = a7
addtype(id1.entry, a9)
Evaluating these semantic rules stores the type real in the symbol table entry for each identifier.
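An evaluation order for this graph can also be computed mechanically. The sketch below uses graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later). The edges among nodes 4 to 10 follow the description above; numbering the id.entry attributes of id1, id2 and id3 as nodes 1, 2 and 3 is an assumption, since the figure itself is not reproduced here.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of nodes it depends on (its predecessors).
# Nodes 1-3: id1.entry, id2.entry, id3.entry (assumed numbering)
# Node 4: T.type; nodes 5, 7, 9: L.in; nodes 6, 8, 10: dummy attributes
preds = {
    5: {4},        # L.in = T.type
    6: {3, 5},     # addtype(id3.entry, L.in)
    7: {5},        # L1.in = L.in
    8: {2, 7},     # addtype(id2.entry, L.in)
    9: {7},        # L1.in = L.in
    10: {1, 9},    # addtype(id1.entry, L.in)
}

order = list(TopologicalSorter(preds).static_order())
print(order)       # one valid evaluation order; several exist
```

Any order the sorter returns places every attribute after the attributes it depends on; the numeric order 1, 2, …, 10 is just one such order.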

Methods for evaluating Semantic Rules-I • Parse-tree methods: At compile time, these methods obtain an evaluation order from a topological sort of the dependency graph constructed from the parse tree for each input. • These methods will fail to find an evaluation order only if the dependency graph for the particular parse tree under consideration has a cycle.

Methods for evaluating Semantic Rules-II • Rule-based methods: At compiler-construction time, the semantic rules associated with productions are analysed, either by hand or by a specialized tool. • For each production, the order in which the attributes associated with that production are evaluated is predetermined at compiler-construction time.

Methods for evaluating Semantic Rules-III • Oblivious Methods: An evaluation order is chosen without considering the semantic rules • An Oblivious evaluation order restricts the class of syntax-directed definitions that can be implemented.

Applications of Syntax-Directed Translation: Syntax Trees • Syntax tree: an intermediate representation of the compiler’s input. • Using syntax trees decouples translation from parsing. • Syntax trees are used because translation routines that are invoked during parsing must live with two kinds of restrictions: • First, a grammar that is suitable for parsing may not reflect the natural hierarchical structure of the constructs in the language. • Second, the parsing method constrains the order in which nodes in a parse tree are considered.

Syntax Tree • An (abstract) syntax tree is a condensed form of parse tree useful for representing language constructs. • In a syntax tree, operators and keywords do not appear as leaves, but rather are associated with the interior node that would be the parent of those leaves in the parse tree. • Chains of single productions in the parse tree are collapsed in the syntax tree.

Difference between parse tree and syntax tree

Constructing Syntax Tree for Expressions • The construction of a syntax tree for an expression is similar to the translation of the expression into postfix form. • Each node in a syntax tree can be implemented as a record with several fields. • The functions to create the nodes of syntax tree for expressions with binary operators are: • mknode(op, left, right) creates an operator node with label op and two fields containing pointers to left and right. • mkleaf (id, entry) creates an identifier node with label id and a field containing entry, a pointer to the symbol-table entry for the identifier. • mkleaf(num, val) creates a number node with label num and a field containing val, the value of number.

Syntax Tree
PRODUCTION        SEMANTIC RULE
E -> E1 + T       E.node = mknode(“+”, E1.node, T.node)
E -> E1 - T       E.node = mknode(“-”, E1.node, T.node)
E -> T            E.node = T.node
T -> (E)          T.node = E.node
T -> id           T.node = mkleaf(id, id.entry)
T -> num          T.node = mkleaf(num, num.val)

Construction of Syntax Tree for a-4+c
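The sequence of calls that builds the tree for a-4+c bottom-up can be sketched as follows. The record layouts (Leaf and Op) and the use of plain strings in place of symbol-table pointers are assumptions for the illustration; only the function names mknode and mkleaf come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    label: str        # "id" or "num"
    value: object     # symbol-table entry or numeric value


@dataclass
class Op:
    op: str           # operator label
    left: object      # pointer to the left operand
    right: object     # pointer to the right operand


def mkleaf(label, value):
    return Leaf(label, value)


def mknode(op, left, right):
    return Op(op, left, right)


# a - 4 + c, with - and + left-associative
p1 = mkleaf("id", "a")      # "a" stands in for the symbol-table entry of a
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)    # a - 4
p4 = mkleaf("id", "c")
p5 = mknode("+", p3, p4)    # (a - 4) + c

print(p5)
```

The root p5 is the + node; its left child is the subtree for a-4, which reflects the left associativity encoded by the grammar.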

Syntax Directed Translation Schemes An SDT is a CFG with semantic actions embedded within the production bodies. Curly braces are placed around actions. Any SDT can be implemented by building the parse tree and performing the actions during a preorder traversal. We focus on the use of SDTs to implement two classes of SDDs: 1. The underlying grammar is LR-parsable and the SDD is S-attributed. 2. The underlying grammar is LL-parsable and the SDD is L-attributed.

Postfix translation schemes • The simplest case is when the grammar can be parsed bottom-up and the SDD is S-attributed. • In that case we can construct an SDT in which each action is placed at the end of its production and is executed along with the reduction of the body to the head of that production. • SDTs with all actions at the right ends of the production bodies are called postfix SDTs.

Example of postfix SDT
1) L -> E n          {print(E.val);}
2) E -> E1 + T       {E.val = E1.val + T.val;}
3) E -> T            {E.val = T.val;}
4) T -> T1 * F       {T.val = T1.val * F.val;}
5) T -> F            {T.val = F.val;}
6) F -> (E)          {F.val = E.val;}
7) F -> digit        {F.val = digit.lexval;}

Parse-Stack implementation of postfix SDT’s • In a shift-reduce parser we can easily implement semantic actions using the parser stack. • With each nonterminal (or state) on the stack we can associate a record holding its attributes. • In a reduction step we then execute the semantic action at the end of the production to evaluate the attribute(s) of the nonterminal on the left side of the production, • and put the value on the stack in place of the right side of the production.


Example
PRODUCTION        ACTION
L -> E n          {print(stack[top-1].val); top = top-1;}
E -> E1 + T       {stack[top-2].val = stack[top-2].val + stack[top].val; top = top-2;}
E -> T
T -> T1 * F       {stack[top-2].val = stack[top-2].val * stack[top].val; top = top-2;}
T -> F
F -> (E)          {stack[top-2].val = stack[top-1].val; top = top-2;}
F -> digit
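The stack manipulation can be tried out with a small simulation. This is only a sketch: instead of a real LR parser, the shift and reduce moves for the input 3*5+4 n are written out by hand, and the stack holds just the val fields. The function names shift and reduce and the list output are assumptions for the example.

```python
stack = []       # value stack kept in parallel with the parser stack
output = []      # collects what print(...) would write


def shift(val=None):
    stack.append(val)                    # only digit carries a value (lexval)


def reduce(production):
    if production == "L -> E n":
        output.append(stack[-2])         # print(stack[top-1].val)
        stack.pop()                      # top = top-1
    elif production == "E -> E + T":
        stack[-3] = stack[-3] + stack[-1]
        del stack[-2:]                   # top = top-2
    elif production == "T -> T * F":
        stack[-3] = stack[-3] * stack[-1]
        del stack[-2:]
    elif production == "F -> ( E )":
        stack[-3] = stack[-2]
        del stack[-2:]
    # E -> T, T -> F, F -> digit: the value is already at the top


# Moves a shift-reduce parser would make on 3*5+4 n (written by hand).
shift(3); reduce("F -> digit"); reduce("T -> F")
shift(); shift(5); reduce("F -> digit"); reduce("T -> T * F")
reduce("E -> T")
shift(); shift(4); reduce("F -> digit"); reduce("T -> F")
reduce("E -> E + T")
shift(); reduce("L -> E n")

print(output)    # [19]
```

The unit productions need no action because the attribute of the body symbol is already sitting in the stack slot that the head will occupy after the reduction.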

SDT’s with actions inside productions • For a production B -> X {a} Y: • If the parser is bottom-up, we perform action “a” as soon as this occurrence of X appears on the top of the parser stack. • If the parser is top-down, we perform “a” just before we expand Y.
1) L -> E n
2) E -> {print(‘+’);} E1 + T
3) E -> T
4) T -> {print(‘*’);} T1 * F
5) T -> F
6) F -> (E)
7) F -> digit {print(digit.lexval);}

SDT’s with actions inside productions (cont) • Any SDT can be implemented as follows: 1. Ignore the actions and produce a parse tree. 2. Examine each interior node N and add actions as new children at the correct position. 3. Perform a preorder traversal and execute actions when their nodes are visited. [Figure: parse tree for 3*5+4 with the actions {print(‘+’)}, {print(‘*’)}, {print(3)}, {print(5)} and {print(4)} added as children of the nodes for their productions.]
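The three steps can be sketched for the input 3*5+4 by writing the parse tree with the actions already attached as extra children and then walking it in preorder. The list encoding of the tree and the helper names act, F and traverse are assumptions for this illustration.

```python
out = []


def act(text):
    """Return an action node; visiting it 'prints' text."""
    return lambda: out.append(text)


def F(d):
    return ["F", "digit", act(str(d))]        # F -> digit {print(digit.lexval)}


# Each interior node is [symbol, child, ...]; terminals are plain strings.
tree = ["E", act("+"),                        # E -> {print('+')} E1 + T
        ["E", ["T", act("*"),                 # T -> {print('*')} T1 * F
               ["T", F(3)], "*", F(5)]],
        "+",
        ["T", F(4)]]


def traverse(node):
    if callable(node):
        node()                                # execute the action when visited
    elif isinstance(node, list):
        for child in node[1:]:                # node[0] is the grammar symbol
            traverse(child)
    # terminal leaves carry no action


traverse(tree)
print(" ".join(out))                          # + * 3 5 4
```

Because each operator's action precedes its operands in the production body, the traversal emits the expression in prefix form.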

Eliminating Left Recursion from SDT’s Here we consider two cases: 1. The only thing we care about is the order in which the actions in an SDT are performed (for example, when the actions only print output). 2. The SDD computes attributes rather than merely printing output.

Eliminating Left Recursion from SDT’s Case 1: Example:
E -> E1 + T {print(‘+’)}
E -> T
After the elimination of left recursion, treating the action as if it were a terminal, this becomes:
E -> T E’
E’ -> + T {print(‘+’)} E’
E’ -> ϵ
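A recursive-descent sketch of the transformed scheme shows that the actions still run in the original order. To keep the example self-contained it assumes a hypothetical production T -> digit {print(digit)}, which is not part of the slide, and collects the printed symbols in a list.

```python
def translate(tokens):
    out, pos = [], 0

    def T():
        nonlocal pos
        out.append(tokens[pos])          # assumed: T -> digit {print(digit)}
        pos += 1

    def E_prime():
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "+":
            pos += 1                     # match '+'
            T()
            out.append("+")              # {print('+')} sits between T and E'
            E_prime()
        # otherwise E' -> epsilon: nothing to do

    T()                                  # E -> T E'
    E_prime()
    return " ".join(out)


print(translate("9 + 5 + 2".split()))    # 9 5 + 2 +
```

The output is the same postfix string that the left-recursive scheme would produce, which is the point of treating the action like a terminal during the transformation.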

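Eliminating Left Recursion from SDT’s Case 2:
A -> A1 Y {A.a = g(A1.a, Y.y)}
A -> X {A.a = f(X.x)}
After the removal of left recursion the SDT becomes:
A -> X {A’.i = f(X.x)} A’ {A.a = A’.s}
A’ -> Y {A’1.i = g(A’.i, Y.y)} A’1 {A’.s = A’1.s}
A’ -> ϵ {A’.s = A’.i}
Here A’.i is an inherited attribute that accumulates the result so far, and A’.s is the synthesized attribute that passes the final value back up.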
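As a concrete instance of Case 2, take X to be a number, Y to be "- number", f the identity and g subtraction. These choices, and the list-of-tokens input, are assumptions made for the sketch. The inherited attribute becomes a function parameter and the synthesized attribute becomes the return value.

```python
def A(tokens):
    """tokens: a number followed by zero or more '-', number pairs."""
    i = tokens[0]                        # A'.i = f(X.x), with f the identity
    return A_prime(tokens[1:], i)        # A.a = A'.s


def A_prime(rest, inh):
    if not rest:                         # A' -> epsilon
        return inh                       # A'.s = A'.i
    y = rest[1]                          # Y.y: the operand after '-'
    return A_prime(rest[2:], inh - y)    # A'1.i = g(A'.i, Y.y); A'.s = A'1.s


print(A([9, "-", 5, "-", 2]))            # 2, i.e. (9 - 5) - 2
```

Although the transformed grammar is right-recursive, the running value carried in A’.i keeps the left-associative meaning of the original rules.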


SDT’s for L-Attributed definitions • We can convert an L-attributed SDD into an SDT using the following two rules: • Embed the action that computes the inherited attributes for a nonterminal A immediately before that occurrence of A in the body. If several inherited attributes of A depend on one another in an acyclic fashion, order them so that those needed first are computed first. • Place the action that computes a synthesized attribute for the head of a production at the end of the body of the production.


SDT’s for L-Attributed definitions
SDD:
S -> while (C) S1     L1 = new(); L2 = new(); S1.next = L1; C.false = S.next; C.true = L2; S.code = label || L1 || C.code || label || L2 || S1.code
SDT:
S -> while ( {L1 = new(); L2 = new(); C.false = S.next; C.true = L2;} C ) {S1.next = L1;} S1 {S.code = label || L1 || C.code || label || L2 || S1.code;}

Implementing L-Attributed SDD’s The following methods for translation during parsing are discussed: a. Use a recursive-descent parser with one function for each nonterminal A; the function receives the inherited attributes of A as arguments and returns the synthesized attributes of A. b. Generate code on the fly using a recursive-descent parser. c. Implement an SDT in conjunction with an LL parser. d. Implement an SDT in conjunction with an LR parser.

Translation during Recursive-Descent Parsing A recursive-descent parser has a function A for each nonterminal A. The parser can be extended into a translator as follows: a. The arguments of function A are the inherited attributes of nonterminal A. b. The return value of function A is the collection of synthesized attributes of nonterminal A.

Translation during Recursive-Descent Parsing In the body of function A, attributes are handled along with parsing: 1. Decide upon the production used to expand A. 2. Check that each terminal appears on the input when it is required. We assume that no backtracking is needed.

Translation during Recursive-Descent Parsing 3. Preserve, in local variables, the values of all attributes needed to compute inherited attributes for non-terminals in the body or synthesized attributes for the head nonterminal. 4. Call functions corresponding to non-terminals in the body of the selected production, providing them with the proper arguments.
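The steps above can be sketched for the while-statement SDD. This is a simplified illustration under stated assumptions: conditions and non-while statements are single tokens, code is a list of strings, a simple statement ends with a jump to its next label, and the class and method names (Parser, advance, new) are invented for the example.

```python
import itertools


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0
        self.counter = itertools.count(1)

    def new(self):
        return f"L{next(self.counter)}"          # fresh label

    def advance(self, expected=None):
        tok = self.tokens[self.pos]
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def C(self, false, true):
        # Inherited attributes arrive as arguments; C.code is returned.
        cond = self.advance()                    # assumed: one-token condition
        return [f"if {cond} goto {true}", f"goto {false}"]

    def S(self, next_label):
        if self.tokens[self.pos] == "while":     # S -> while ( C ) S1
            self.advance("while")
            self.advance("(")
            L1, L2 = self.new(), self.new()      # kept in local variables
            c_code = self.C(next_label, L2)      # C.false = S.next, C.true = L2
            self.advance(")")
            s1_code = self.S(L1)                 # S1.next = L1
            # S.code = label || L1 || C.code || label || L2 || S1.code
            return [f"label {L1}"] + c_code + [f"label {L2}"] + s1_code
        stmt = self.advance()                    # assumed: one-token statement
        return [stmt, f"goto {next_label}"]


code = Parser("while ( x ) body".split()).S("Lnext")
print("\n".join(code))
```

The inherited attribute S.next is the parameter of S, the labels are preserved in locals until they are needed, and the synthesized attribute S.code is the return value.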


On-The-Fly Code Generation Instead of constructing long strings of code, as in recursive-descent translation, pieces of code can be generated incrementally into an array or output file by executing actions in an SDT. The elements needed to make this technique work are: 1. For one or more nonterminals there is a main attribute (for example, S.code and C.code). 2. The main attributes are synthesized.

On-The-Fly Code Generation 3. The rules that evaluate the main attributes ensure that: a. The main attribute is the concatenation of the main attributes of the nonterminals appearing in the body of the production, possibly with other elements that are not main attributes. b. The main attributes of the nonterminals appear in the rule in the same order as the nonterminals themselves appear in the production body.
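Under these conditions the code attribute never has to be built as a value: each piece can be emitted as soon as it is known. The sketch below redoes the while-statement translation in that style. As before, one-token conditions and statements, the jump after a simple statement, and all function names are assumptions for the illustration.

```python
import itertools


def generate(tokens, next_label="Lnext"):
    out = []                                   # stands in for the output file
    toks = iter(tokens)
    fresh = (f"L{n}" for n in itertools.count(1))

    def expect(terminal):
        tok = next(toks)
        if tok != terminal:
            raise SyntaxError(f"expected {terminal!r}, got {tok!r}")

    def C(false, true):
        cond = next(toks)                      # assumed: one-token condition
        out.append(f"if {cond} goto {true}")
        out.append(f"goto {false}")

    def S(nxt):
        tok = next(toks)
        if tok == "while":                     # S -> while ( C ) S1
            expect("(")
            L1, L2 = next(fresh), next(fresh)
            out.append(f"label {L1}")          # emitted before C is parsed
            C(nxt, L2)
            expect(")")
            out.append(f"label {L2}")          # emitted before S1 is parsed
            S(L1)
        else:
            out.append(tok)                    # assumed: one-token statement
            out.append(f"goto {nxt}")

    S(next_label)
    return out


print("\n".join(generate("while ( x ) body".split())))
```

The functions return nothing; the emitted lines appear in the same order as the pieces of S.code in the concatenation rule, which is what conditions 3a and 3b guarantee.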


Intermediate Code Generation