2-1. LEX & YACC

Overview
§ Syntax
  – What a program in the language looks like
  – Context-free grammar, BNF
§ Syntax-directed translation
  – A grammar-oriented compiling technique
  – Semantics
§ Yacc

Syntax Definition
§ Grammar: describes the hierarchical structure of many programming language constructs
§ stmt -> if (expr) stmt else stmt
§ A context-free grammar has four components:
  – A set of tokens, known as terminal symbols
  – A set of nonterminals
  – A set of productions
    • left side -> right side (left side: a nonterminal; right side: a sequence of tokens and/or nonterminals)
  – A designation of one of the nonterminals as the start symbol

An Example
    expression -> expression + term | term
    term       -> term * factor | factor
    factor     -> ( expression ) | identifier
§ Explained using a concrete example.

Parse Trees
§ Definition
  – The root is labeled by the start symbol
  – Each leaf is labeled by a token or by ε
  – Each interior node is labeled by a nonterminal
    • If A labels an interior node and X1, …, Xn label its children from left to right, then A -> X1 … Xn is a production
§ Explained using the previous example
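The definition above can be sketched as a small C data structure. This is only an illustration: the struct layout, the names `struct node`, `yield`, and `example_yield`, and the choice of the string `identifier * identifier` are assumptions, and the tree is built by hand from the productions expression -> term, term -> term * factor | factor, factor -> identifier rather than by a parser.

```c
#include <assert.h>
#include <string.h>

#define MAX_CHILDREN 3

/* A parse-tree node: interior nodes carry a nonterminal, leaves a token. */
struct node {
    const char *label;
    int nchildren;                          /* 0 for a leaf */
    const struct node *child[MAX_CHILDREN]; /* children, left to right */
};

/* Append the leaf labels below n, left to right, to out (the "yield"). */
static void yield(const struct node *n, char *out, size_t cap)
{
    assert(n != NULL && out != NULL);
    if (n->nchildren == 0) {
        if (out[0] != '\0')
            strncat(out, " ", cap - strlen(out) - 1);
        strncat(out, n->label, cap - strlen(out) - 1);
        return;
    }
    for (int i = 0; i < n->nchildren; i++)
        yield(n->child[i], out, cap);
}

/* Hand-built tree for "identifier * identifier"; writes its yield to out. */
static void example_yield(char *out, size_t cap)
{
    static const struct node id1  = { "identifier", 0, { 0 } };
    static const struct node id2  = { "identifier", 0, { 0 } };
    static const struct node star = { "*", 0, { 0 } };
    static const struct node f1   = { "factor", 1, { &id1 } };
    static const struct node f2   = { "factor", 1, { &id2 } };
    static const struct node t1   = { "term", 1, { &f1 } };            /* term -> factor */
    static const struct node t2   = { "term", 3, { &t1, &star, &f2 } }; /* term -> term * factor */
    static const struct node root = { "expression", 1, { &t2 } };       /* start symbol */

    assert(cap > 0);
    out[0] = '\0';
    yield(&root, out, cap);
}
```

Calling `example_yield(buf, sizeof buf)` on a 64-byte buffer leaves the token string `identifier * identifier` in `buf`: reading the leaves from left to right gives back the string the tree derives.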

Ambiguity, Associativity, Precedence
§ Ambiguity of a grammar
  – A grammar is ambiguous if some string of tokens has more than one parse tree
  – E -> E + E | E * E | id
§ Associativity of operators
  – Decides grouping between operators of the same precedence
  – Left or right associative
§ Precedence of operators
  – Decides grouping between different operators
§ Explained by example: +, *, **
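One way to see how precedence and associativity remove the ambiguity is a small evaluator. The sketch below uses precedence climbing, a hand-written technique that is not part of the slides, over single-digit operands. It assumes the conventional table: `+` lowest and left associative, `*` higher and left associative, `**` highest and right associative. The names `eval`, `parse_expr`, `peek_op`, and `ipow` are made up for the illustration.

```c
#include <assert.h>
#include <ctype.h>

static const char *p;   /* cursor into the input string */

static void skip(void) { while (*p == ' ') p++; }

/* Integer power by repeated multiplication (exponent >= 0). */
static long ipow(long base, long exp)
{
    long r = 1;
    while (exp-- > 0)
        r *= base;
    return r;
}

/* Look at the next operator without consuming it.
   Returns its precedence (0 if there is none). */
static int peek_op(int *len, int *right_assoc)
{
    *len = 0;
    *right_assoc = 0;
    skip();
    if (p[0] == '*' && p[1] == '*') { *len = 2; *right_assoc = 1; return 3; }
    if (p[0] == '*') { *len = 1; return 2; }
    if (p[0] == '+') { *len = 1; return 1; }
    return 0;
}

/* Parse and evaluate operators whose precedence is at least min_prec. */
static long parse_expr(int min_prec)
{
    skip();
    assert(isdigit((unsigned char)*p));   /* operands are single digits */
    long lhs = *p++ - '0';
    for (;;) {
        int len, right_assoc;
        int prec = peek_op(&len, &right_assoc);
        if (prec == 0 || prec < min_prec)
            return lhs;
        const char *op = p;
        p += len;
        /* Left associative: the right operand may only contain tighter
           operators. Right associative: it may contain the same operator. */
        long rhs = parse_expr(right_assoc ? prec : prec + 1);
        if (len == 2)
            lhs = ipow(lhs, rhs);
        else if (*op == '*')
            lhs *= rhs;
        else
            lhs += rhs;
    }
}

static long eval(const char *s)
{
    p = s;
    return parse_expr(1);
}
```

With this table, `eval("1 + 2 * 3")` groups as 1 + (2 * 3) and gives 7, while `eval("2 ** 3 ** 2")` groups as 2 ** (3 ** 2) and gives 512. The ambiguous grammar E -> E + E | E * E | id would also allow (1 + 2) * 3 = 9 for the first string.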

Syntax-directed translation

    Production              Semantic Rule
    expr -> expr1 + term    expr.t := expr1.t || term.t || '+'
    expr -> expr1 - term    expr.t := expr1.t || term.t || '-'
    expr -> term            expr.t := term.t
    term -> 0               term.t := '0'
    term -> 1               term.t := '1'
    …                       …
    term -> 9               term.t := '9'

Figure 2.5: Syntax-directed definition for infix to postfix translation
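The semantic rules can be traced with a short C function. This is a sketch under assumptions: operands are single digits, the input has no spaces, and the function name `infix_to_postfix` and its return convention are invented here. Each loop iteration applies the rule expr.t := expr1.t || term.t || op by appending the digit and then the operator to the string built so far.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Translate single-digit infix with + and - (for example "9-5+2") to postfix.
   Returns 0 on success, -1 on a syntax error or if out is too small;
   the contents of out are unspecified on failure. */
static int infix_to_postfix(const char *in, char *out, size_t cap)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    /* The postfix form has exactly as many characters as the infix form. */
    if (strlen(in) + 1 > cap)
        return -1;

    /* expr -> term: expr.t is the digit itself. */
    if (!isdigit((unsigned char)*in))
        return -1;
    out[n++] = *in++;

    /* expr -> expr1 op term: append term.t, then the operator. */
    while (*in == '+' || *in == '-') {
        char op = *in++;
        if (!isdigit((unsigned char)*in))
            return -1;
        out[n++] = *in++;
        out[n++] = op;
    }

    if (*in != '\0')
        return -1;          /* trailing characters that the grammar rejects */
    out[n] = '\0';
    return 0;
}
```

For the input `9-5+2` the function writes `95-2+`: first `9`, then `5` followed by `-`, then `2` followed by `+`.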

[Figure 2.6: Attribute values at nodes in a parse tree]
[Figure 2.8: Annotated parse tree for begin west south]

Lex is a lexical analyzer generator and Yacc is a parser generator. Lex programs recognize regular expressions, and Yacc generates parsers that accept a large class of context-free grammars. Combined, the Lex-generated scanner performs the lexical analysis phase of a compiler and hands its tokens to the Yacc-generated parser.
[Figure: how lex and yacc are combined]

Lex
• Specify a set of lexical rules to lex in a source file.
• The general format of the source file is:

    {definitions}
    %%
    {rules}
    %%
    {programmer subroutines}

• Example:

    digit       [0-9]
    digits      {digit}+
    whitespace  [ \t\n]
    %%
    "["           { printf("OPEN_BRAC\n"); }
    "]"           { printf("CLOSE_BRAC\n"); }
    "+"           { printf("ADDOP\n"); }
    "*"           { printf("MULTOP\n"); }
    {digits}      { printf("NUMBER = %s\n", yytext); }
    {whitespace}  ;   /* braces are required to refer to the definition */

Building and running the scanner:

    $ lex foo.lex
    $ cc lex.yy.c -ll
    $ a.out        /* a.out expects its input from standard input */

    input:
        [ 1 + 2 * 3 ]

    output:
        OPEN_BRAC
        NUMBER = 1
        ADDOP
        NUMBER = 2
        MULTOP
        NUMBER = 3
        CLOSE_BRAC
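To make clear what the generated scanner does with that input, here is a hand-written C approximation of the same rules. It is not the code lex emits. The function name `scan` and its buffer-based interface are assumptions chosen so the result can be inspected, and unlike lex (whose default action echoes unmatched characters) this sketch rejects them.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Scan in and append one line per token to out, mirroring the lex rules.
   Returns the number of tokens, or -1 on an unknown character or overflow. */
static int scan(const char *in, char *out, size_t cap)
{
    size_t used = 0;
    int ntokens = 0;

    assert(in != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    while (*in != '\0') {
        int w;
        if (*in == ' ' || *in == '\t' || *in == '\n') {   /* {whitespace} ; */
            in++;
            continue;
        }
        if (isdigit((unsigned char)*in)) {                /* {digits} */
            const char *start = in;
            while (isdigit((unsigned char)*in))
                in++;
            w = snprintf(out + used, cap - used, "NUMBER = %.*s\n",
                         (int)(in - start), start);
        } else {
            const char *name;
            switch (*in) {
            case '[': name = "OPEN_BRAC";  break;
            case ']': name = "CLOSE_BRAC"; break;
            case '+': name = "ADDOP";      break;
            case '*': name = "MULTOP";     break;
            default:  return -1;           /* no rule matches */
            }
            in++;
            w = snprintf(out + used, cap - used, "%s\n", name);
        }
        if (w < 0 || (size_t)w >= cap - used)
            return -1;                     /* output buffer too small */
        used += (size_t)w;
        ntokens++;
    }
    return ntokens;
}
```

Given the input line `[ 1 + 2 * 3 ]`, `scan` returns 7 and fills the buffer with the same seven lines that the lex-generated program prints.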

Yacc
- The parser gets its input (a sequence of tokens) from the lexical analyzer (created using lex).
- The format of the grammar file:

    {declarations}
    %%
    {rules}
    %%
    {programs}

Lex source (expr.lex): each rule now returns a token to the parser instead of printing it.

    digit       [0-9]
    digits      {digit}+
    whitespace  [ \t\n]
    %%
    "["           { return (OPEN_BRAC); }
    "]"           { return (CLOSE_BRAC); }
    "+"           { return (ADDOP); }
    "*"           { return (MULTOP); }
    {digits}      { yylval = atoi(yytext); return (NUMBER); }
    {whitespace}  ;   /* braces are required to refer to the definition */

Yacc source (expr.yacc):

    %{
    /* Declarations needed by the actions below; modern C compilers
       reject implicit declarations. */
    #include <stdio.h>
    int yylex(void);
    int yyerror(char *s);
    %}
    %start mystartsymbol
    %token ADDOP MULTOP NUMBER OPEN_BRAC CLOSE_BRAC
    %left ADDOP
    %left MULTOP      /* declared later, so it binds tighter than ADDOP */
    %%
    mystartsymbol : expr { printf("the value of the expression is %d\n", $1); }
                  ;
    expr : OPEN_BRAC expr CLOSE_BRAC { $$ = $2; }
         | expr ADDOP expr           { $$ = $1 + $3; }
         | expr MULTOP expr          { $$ = $1 * $3; }
         | NUMBER                    { $$ = $1; }
         ;
    %%
    /* start of programs */
    #include "lex.yy.c"

    int main(void) { return yyparse(); }

    int yyerror(char *s) { fprintf(stderr, "%s\n", s); return 0; }

Shell Program

    : : :
    echo lex expr.lex
    echo yacc expr.yacc
    echo cc y.tab.c -ll

Web site
§ http://www.uman.com/lexyacc.shtml
  PCYACC 9.0 is a complete language development environment that generates C, C#, C++, Java, Delphi, and VBS source code from input Language Description Grammars for building Assemblers, Compilers, Interpreters, Browsers, Page Description Languages, Language Translators, Syntax Directed Editors, Language Validators, Natural Language Processors, Expert System Shells, and Query Languages. The PCYACC Tool-Kit includes PCLEX, Visual Debugging Tools, Object-Oriented Class Libraries, and Pre-Written "Drop-In" Language engines for virtually every computer language in the world.