Course Overview

PART I: overview material
    1 Introduction
    2 Language processors (tombstone diagrams, bootstrapping)
    3 Architecture of a compiler
PART II: inside a compiler
    4 Syntax analysis
    5 Contextual analysis
    6 Runtime organization
    7 Code generation
PART III: conclusion
    8 Interpretation
    9 Review

Syntax Analysis (Chapter 4)

Abstract Syntax Trees

• So far we have talked about how to build a recursive descent parser which recognizes a given language described by an LL(1) EBNF grammar.
• Now we will look at
    – how to represent ASTs as data structures.
    – how to modify the parser to construct an AST data structure.
• We make heavy use of Object-Oriented Programming! (classes, inheritance, dynamic method binding)

AST Representation: Possible Tree Shapes

The possible form of AST structures is determined by an AST grammar (as described earlier in Chapter 1).

Example: remember the Mini-Triangle abstract syntax

Command ::= V-name := Expression                        AssignCmd
          | Identifier ( Expression )                   CallCmd
          | if Expression then Command else Command     IfCmd
          | while Expression do Command                 WhileCmd
          | let Declaration in Command                  LetCmd
          | Command ; Command                           SequentialCmd

AST Representation: Possible Tree Shapes

Example: remember the Mini-Triangle abstract syntax (excerpt below)

Command ::= V-name := Expression     AssignCmd
          | ...

Tree shape: an AssignCommand node with two children, V and E.

AST Representation: Possible Tree Shapes

Example: remember the Mini-Triangle abstract syntax (excerpt below)

Command ::= ...
          | Identifier ( Expression )     CallCmd
          | ...

Tree shape: a CallCommand node with two children, Identifier and E. The Identifier node carries a Spelling.

AST Representation: Possible Tree Shapes

Example: remember the Mini-Triangle abstract syntax (excerpt below)

Command ::= ...
          | if Expression then Command else Command     IfCmd
          | ...

Tree shape: an IfCommand node with three children, E, C1 and C2.

AST Representation: Java (or C++) Data Structures

Example: Java classes to represent Mini-Triangle ASTs

1) A common (abstract) superclass for all AST nodes

    public abstract class AST { ... }

2) A Java class for each “type” of node.
    • abstract as well as concrete node types

    LHS ::= ...     Tag1
          | ...     Tag2

    Class hierarchy: AST (abstract) is the superclass of LHS (abstract), which is the superclass of Tag1, Tag2, … (concrete).

Example: Mini-Triangle Command ASTs

Command ::= V-name := Expression                        AssignCmd
          | Identifier ( Expression )                   CallCmd
          | if Expression then Command else Command     IfCmd
          | while Expression do Command                 WhileCmd
          | let Declaration in Command                  LetCmd
          | Command ; Command                           SequentialCmd

public abstract class Command extends AST { ... }
public class AssignCommand extends Command { ... }
public class CallCommand extends Command { ... }
public class IfCommand extends Command { ... }
etc.

Example: Mini-Triangle Command ASTs

Command ::= V-name := Expression          AssignCmd
          | Identifier ( Expression )     CallCmd
          | ...

public class AssignCommand extends Command {
    public Vname V;         // variable on left side of :=
    public Expression E;    // expression on right side of :=
    ...
}

public class CallCommand extends Command {
    public Identifier I;    // procedure name
    public Expression E;    // actual parameter
    ...
}
...

AST Terminal Nodes

public abstract class Terminal extends AST {
    public String spelling;
    ...
}

public class Identifier extends Terminal { ... }
public class IntegerLiteral extends Terminal { ... }
public class Operator extends Terminal { ... }

AST Construction

Of course, every concrete AST class needs a constructor. Examples:

public class AssignCommand extends Command {
    public Vname V;         // left side variable
    public Expression E;    // right side expression

    public AssignCommand(Vname V, Expression E) {
        this.V = V;
        this.E = E;
    }
    ...
}

public class Identifier extends Terminal {
    // a constructor is declared without the "class" keyword
    public Identifier(String spelling) {
        this.spelling = spelling;
    }
    ...
}
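To see the constructors at work, here is a minimal sketch that builds the AST for the command x := 42 bottom-up. SimpleVname and IntegerExpression are hypothetical concrete subclasses introduced only so the example compiles. The slides do not show the concrete V-name or Expression classes, so treat these two as assumptions.

```java
// Minimal stand-ins for the AST classes on the slides.
abstract class AST { }

abstract class Terminal extends AST {
    String spelling;
}

class Identifier extends Terminal {
    Identifier(String spelling) { this.spelling = spelling; }
}

class IntegerLiteral extends Terminal {
    IntegerLiteral(String spelling) { this.spelling = spelling; }
}

abstract class Vname extends AST { }

// Hypothetical: a V-name that consists of a single identifier.
class SimpleVname extends Vname {
    Identifier I;
    SimpleVname(Identifier I) { this.I = I; }
}

abstract class Expression extends AST { }

// Hypothetical: an expression that consists of a single integer literal.
class IntegerExpression extends Expression {
    IntegerLiteral IL;
    IntegerExpression(IntegerLiteral IL) { this.IL = IL; }
}

abstract class Command extends AST { }

class AssignCommand extends Command {
    Vname V;        // left side variable
    Expression E;   // right side expression
    AssignCommand(Vname V, Expression E) { this.V = V; this.E = E; }
}

class AstDemo {
    // Builds the AST for  x := 42  from the leaves upward.
    static AssignCommand buildExample() {
        Vname v = new SimpleVname(new Identifier("x"));
        Expression e = new IntegerExpression(new IntegerLiteral("42"));
        return new AssignCommand(v, e);
    }

    public static void main(String[] args) {
        AssignCommand cmd = buildExample();
        System.out.println(((SimpleVname) cmd.V).I.spelling);
        System.out.println(((IntegerExpression) cmd.E).IL.spelling);
    }
}
```

The leaves (Identifier, IntegerLiteral) are created first and handed to the constructors of their parent nodes, which is the same order in which a recursive descent parser produces them.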

AST Construction

We will now show how to refine our recursive descent parser to actually construct an AST.

N ::= X

private N parseN() {    // note that return type is N
    N theAST;
    parse X and simultaneously construct theAST
    return theAST;
}

Example: Constructing Mini-Triangle ASTs

Command ::= single-Command ( ; single-Command )*

// old (recognizing only) version:
private void parseCommand() {
    parseSingleCommand();
    while (currentToken.kind == Token.SEMICOLON) {
        acceptIt();
        parseSingleCommand();
    }
}

// AST-generating version:
private Command parseCommand() {
    Command theAST;
    theAST = parseSingleCommand();
    while (currentToken.kind == Token.SEMICOLON) {
        acceptIt();
        Command extraCmd = parseSingleCommand();
        theAST = new SequentialCommand(theAST, extraCmd);
    }
    return theAST;
}
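The loop in the AST-generating parseCommand produces a left-nested tree of SequentialCommand nodes. The sketch below makes that visible under simplifying assumptions: tokens are pre-split strings, and NamedCommand is a hypothetical stand-in for whatever parseSingleCommand would really build. There is no error handling.

```java
import java.util.List;

abstract class Command { }

// Hypothetical stand-in: a single command represented only by its name.
class NamedCommand extends Command {
    final String name;
    NamedCommand(String name) { this.name = name; }
    @Override public String toString() { return name; }
}

class SequentialCommand extends Command {
    final Command C1, C2;
    SequentialCommand(Command C1, Command C2) { this.C1 = C1; this.C2 = C2; }
    @Override public String toString() { return "Seq(" + C1 + ", " + C2 + ")"; }
}

class SeqParser {
    private final List<String> tokens;  // pre-split tokens, e.g. ["a", ";", "b"]
    private int pos = 0;

    SeqParser(List<String> tokens) { this.tokens = tokens; }

    // The current token, or a marker once the input is exhausted.
    private String current() {
        return pos < tokens.size() ? tokens.get(pos) : "<eot>";
    }

    private void acceptIt() { pos++; }

    // Stand-in for parseSingleCommand: consumes exactly one name.
    private Command parseSingleCommand() {
        Command c = new NamedCommand(current());
        acceptIt();
        return c;
    }

    // Command ::= single-Command ( ; single-Command )*
    Command parseCommand() {
        Command theAST = parseSingleCommand();
        while (current().equals(";")) {
            acceptIt();
            Command extraCmd = parseSingleCommand();
            // The tree built so far becomes the left child of the new node.
            theAST = new SequentialCommand(theAST, extraCmd);
        }
        return theAST;
    }
}
```

For the input a ; b ; c, calling new SeqParser(List.of("a", ";", "b", ";", "c")).parseCommand().toString() yields "Seq(Seq(a, b), c)": each iteration wraps the previous result, so the tree grows to the left.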

Example: Constructing Mini-Triangle ASTs

single-Command ::= Identifier ( := Expression | ( Expression ) )
                 | if Expression then single-Command else single-Command
                 | while Expression do single-Command
                 | let Declaration in single-Command
                 | begin Command end

private Command parseSingleCommand() {
    Command comAST;
    parse it and construct AST
    return comAST;
}

Example: Constructing Mini-Triangle ASTs

private Command parseSingleCommand() {
    Command comAST;
    switch (currentToken.kind) {
        case Token.IDENTIFIER:
            parse Identifier ( := Expression | ( Expression ) )
        case Token.IF:
            parse if Expression then single-Command else single-Command
        case Token.WHILE:
            parse while Expression do single-Command
        case Token.LET:
            parse let Declaration in single-Command
        case Token.BEGIN:
            parse begin Command end
        default:
            report syntax error
    }
    return comAST;
}

Example: Constructing Mini-Triangle ASTs

...
case Token.IDENTIFIER:
    // parse Identifier ( := Expression | ( Expression ) )
    Identifier idAST = parseIdentifier();
    switch (currentToken.kind) {
        // each case gets its own block so that expAST can be declared twice
        case Token.BECOMES: {
            acceptIt();
            Expression expAST = parseExpression();
            comAST = new AssignCommand(idAST, expAST);
            break;
        }
        case Token.LPAREN: {
            acceptIt();
            Expression expAST = parseExpression();
            comAST = new CallCommand(idAST, expAST);
            accept(Token.RPAREN);
            break;
        }
    }
    break;
...

Example: Constructing Mini-Triangle ASTs

...
    break;
case Token.IF:
    // parse if Expression then single-Command
    //       else single-Command
    acceptIt();
    Expression expAST = parseExpression();
    accept(Token.THEN);
    Command thenAST = parseSingleCommand();
    accept(Token.ELSE);
    Command elseAST = parseSingleCommand();
    comAST = new IfCommand(expAST, thenAST, elseAST);
    break;
case Token.WHILE:
...

Example: Constructing Mini-Triangle ASTs

...
    break;
case Token.BEGIN:
    // parse begin Command end
    acceptIt();
    comAST = parseCommand();
    accept(Token.END);
    break;
default:
    report syntax error
}
return comAST;
}

Syntax Analysis: Scanner

Dataflow chart:

Source Program (Stream of Characters)
    -> Scanner    (also produces Error Reports)
    -> Stream of “Tokens”
    -> Parser     (also produces Error Reports)
    -> Abstract Syntax Tree

Scanner

Remember:

public class Parser {
    private Token currentToken;

    private void accept(byte expectedKind) {
        if (currentToken.kind == expectedKind)
            currentToken = scanner.scan();    // scanner: we have not yet implemented this
        else
            report syntax error
    }

    private void acceptIt() {
        currentToken = scanner.scan();
    }

    public void parse() { ... }
    ...
}
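The difference between accept and acceptIt can be tried out without a real scanner. The following is only a sketch: a list of token kinds stands in for the scanner, the kind constants are made up for the example, and SyntaxError is a hypothetical exception type chosen here as one way to "report syntax error".

```java
import java.util.List;

// Hypothetical exception type used to report a syntax error.
class SyntaxError extends RuntimeException {
    SyntaxError(String message) { super(message); }
}

class AcceptDemo {
    // Token kinds made up for this example.
    static final byte IDENTIFIER = 0, BECOMES = 1, INTLITERAL = 2, EOT = 3;

    private final List<Byte> kinds;   // stands in for the scanner's output
    private int pos = 0;
    private byte currentKind;

    AcceptDemo(List<Byte> kinds) {
        this.kinds = kinds;
        currentKind = scan();
    }

    // Stand-in for scanner.scan(): the next kind, or EOT at the end.
    private byte scan() {
        return pos < kinds.size() ? kinds.get(pos++) : EOT;
    }

    byte currentKind() { return currentKind; }

    // Checks the current token's kind before moving to the next token.
    void accept(byte expectedKind) {
        if (currentKind == expectedKind)
            currentKind = scan();
        else
            throw new SyntaxError("expected kind " + expectedKind
                                  + " but found " + currentKind);
    }

    // Moves on unconditionally; for use when the kind was already checked.
    void acceptIt() {
        currentKind = scan();
    }
}
```

acceptIt is the cheaper call and is safe inside a switch branch that has just tested currentToken.kind, whereas accept is needed wherever the grammar demands a specific token that has not been tested yet, such as the closing parenthesis of a call.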

Steps for Developing a Scanner

1) Express the “lexical” grammar in EBNF (do necessary transformations)
2) Implement scanner based on this grammar (details explained later)
3) Modify scanner to keep track of spelling and kind of currently scanned token

To save some time we’ll do steps 2 and 3 together.

Developing a Scanner

Express the “lexical” grammar in EBNF

Token           ::= Identifier | Integer-Literal | Operator
                  | ; | := | ~ | ( | ) | eot
Identifier      ::= Letter (Letter | Digit)*
Integer-Literal ::= Digit*
Operator        ::= + | - | * | / | < | > | =
Separator       ::= Comment | space | eol
Comment         ::= ! Graphic* eol

Next perform substitution and left factorization...

Token     ::= Letter (Letter | Digit)*
            | Digit*
            | + | - | * | / | < | > | =
            | ; | : (= | ε) | ~ | ( | ) | eot
Separator ::= ! Graphic* eol | space | eol

Developing a Scanner

Now implement the scanner

public class Scanner {
    private char currentChar;
    private StringBuffer currentSpelling;
    private byte currentKind;

    // take and takeIt return nothing: they only update the scanner's state
    private void take(char expectedChar) { ... }
    private void takeIt() { ... }    // analogous to acceptIt

    // other private auxiliary methods and scanning methods go here

    public Token scan() { ... }
}

Developing a Scanner

The scanner will return instances of Token:

public class Token {
    byte kind;
    String spelling;

    final static byte
        IDENTIFIER = 0, INTLITERAL = 1, OPERATOR = 2,
        BEGIN = 3, CONST = 4, ...
    // in C++ can improve this by using an enum type

    public Token(byte kind, String spelling) {
        this.kind = kind;
        this.spelling = spelling;
        if spelling matches a keyword then change kind automatically
        (e.g. "begin" => 3, "const" => 4, …)
    }
    ...
}
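The keyword check in the Token constructor is left as pseudocode on the slide. One possible way to write it is sketched below. The lookup table KEYWORDS and the restriction to tokens scanned as IDENTIFIER are assumptions of this sketch, and only the two keywords named on the slide are included.

```java
import java.util.Map;

class Token {
    static final byte IDENTIFIER = 0, INTLITERAL = 1, OPERATOR = 2,
                      BEGIN = 3, CONST = 4;

    // Hypothetical keyword table: spelling -> token kind.
    private static final Map<String, Byte> KEYWORDS =
        Map.of("begin", BEGIN, "const", CONST);

    final byte kind;
    final String spelling;

    Token(byte kind, String spelling) {
        // Assumption: only something scanned as an identifier can turn
        // out to be a keyword, so other kinds are left untouched.
        if (kind == IDENTIFIER && KEYWORDS.containsKey(spelling)) {
            kind = KEYWORDS.get(spelling);
        }
        this.kind = kind;
        this.spelling = spelling;
    }
}
```

With this arrangement the scanner itself never needs to know the keywords: it scans "begin" as an ordinary identifier, and new Token(Token.IDENTIFIER, "begin") ends up with kind Token.BEGIN.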

Developing a Scanner

public class Scanner {
    private char currentChar = get first source char;
    private StringBuffer currentSpelling;
    private byte currentKind;

    // declared void: neither method returns a value
    private void take(char expectedChar) {
        if (currentChar == expectedChar) {
            currentSpelling.append(currentChar);
            currentChar = get next source char;
        }
        else
            report lexical error
    }

    private void takeIt() {    // analogous to acceptIt
        currentSpelling.append(currentChar);
        currentChar = get next source char;
    }
    ...

Developing a Scanner

...
    public Token scan() {
        // get rid of potential separators before scanning a token
        // (a separator starts with '!', a space or an end-of-line)
        while ( (currentChar == '!')
             || (currentChar == ' ')
             || (currentChar == '\n') )
            scanSeparator();
        currentSpelling = new StringBuffer();
        currentKind = scanToken();
        return new Token(currentKind, currentSpelling.toString());
    }

    // developed in much the same way as parsing methods
    private void scanSeparator() { ... }
    private byte scanToken() { ... }
    ...

Developing a Scanner

Token ::= Letter (Letter | Digit)* | Digit* | + | - | * | / | < | > | =
        | ; | : (= | ε) | ~ | ( | ) | eot

private byte scanToken() {
    switch (currentChar) {
        case 'a': case 'b': ... case 'z':
        case 'A': case 'B': ... case 'Z':
            scan Letter (Letter | Digit)*
            return Token.IDENTIFIER;
        case '0': ... case '9':
            scan Digit*
            return Token.INTLITERAL;
        case '+': case '-': ... case '=':
            takeIt();
            return Token.OPERATOR;
        ... etc. ...
    }
}

Developing a Scanner

Look at the identifier case in more detail. The pseudocode "scan Letter (Letter | Digit)*" is refined step by step: "scan Letter" becomes takeIt(), the repetition becomes a while loop, and "scan (Letter | Digit)" inside the loop becomes takeIt() again.

...
case 'a': case 'b': ... case 'z':
case 'A': case 'B': ... case 'Z':
    takeIt();
    while (isLetter(currentChar) || isDigit(currentChar))
        takeIt();
    return Token.IDENTIFIER;
case '0': ... case '9':
    ...
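Putting the scanner slides together, a small runnable version might look like the sketch below. Several details are assumptions of this sketch and do not come from the slides: the source is a String, '\u0000' marks end of text, the kind constants after OPERATOR and the ERROR kind are invented, Character.isLetter and Character.isDigit (which also accept non-ASCII letters and digits) replace isLetter and isDigit, and an integer literal is scanned as one or more digits.

```java
class MiniScanner {
    // The first three kinds follow the slides; the rest are hypothetical.
    static final byte IDENTIFIER = 0, INTLITERAL = 1, OPERATOR = 2,
                      SEMICOLON = 3, COLON = 4, BECOMES = 5, IS = 6,
                      LPAREN = 7, RPAREN = 8, EOT = 9, ERROR = 10;
    private static final char EOT_CHAR = '\u0000';  // assumed end-of-text marker

    static class Token {
        final byte kind;
        final String spelling;
        Token(byte kind, String spelling) { this.kind = kind; this.spelling = spelling; }
    }

    private final String source;
    private int pos = 0;
    private char currentChar;
    private StringBuilder currentSpelling = new StringBuilder();

    MiniScanner(String source) {
        this.source = source;
        currentChar = nextSourceChar();
    }

    private char nextSourceChar() {
        return pos < source.length() ? source.charAt(pos++) : EOT_CHAR;
    }

    private void takeIt() {
        currentSpelling.append(currentChar);
        currentChar = nextSourceChar();
    }

    // Separator ::= ! Graphic* eol | space | eol
    private void scanSeparator() {
        if (currentChar == '!') {
            while (currentChar != '\n' && currentChar != EOT_CHAR) takeIt();
        }
        if (currentChar != EOT_CHAR) takeIt();  // the space or the eol
    }

    private byte scanToken() {
        if (Character.isLetter(currentChar)) {          // Letter (Letter | Digit)*
            takeIt();
            while (Character.isLetter(currentChar) || Character.isDigit(currentChar))
                takeIt();
            return IDENTIFIER;
        }
        if (Character.isDigit(currentChar)) {           // one or more digits
            takeIt();
            while (Character.isDigit(currentChar)) takeIt();
            return INTLITERAL;
        }
        switch (currentChar) {
            case '+': case '-': case '*': case '/':
            case '<': case '>': case '=':
                takeIt(); return OPERATOR;
            case ';': takeIt(); return SEMICOLON;
            case ':':                                   // : (= | ε)
                takeIt();
                if (currentChar == '=') { takeIt(); return BECOMES; }
                return COLON;
            case '~': takeIt(); return IS;
            case '(': takeIt(); return LPAREN;
            case ')': takeIt(); return RPAREN;
            case EOT_CHAR: return EOT;
            default: takeIt(); return ERROR;            // unknown character
        }
    }

    Token scan() {
        // Skip separators; whatever they append to the spelling is discarded below.
        while (currentChar == '!' || currentChar == ' ' || currentChar == '\n')
            scanSeparator();
        currentSpelling = new StringBuilder();
        byte kind = scanToken();
        return new Token(kind, currentSpelling.toString());
    }
}
```

Scanning the text "x := 42 ! c" followed by a newline and ";" yields an IDENTIFIER "x", a BECOMES ":=", an INTLITERAL "42", a SEMICOLON ";" and then EOT tokens with an empty spelling. The left-factored rule : (= | ε) shows up as the single lookahead test after the colon has been taken.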