YACC


Introduction
• What is YACC?
  – A tool for automatically generating a parser from a grammar written in a yacc specification (.y file).
• YACC (Yet Another Compiler-Compiler) is a program designed to compile an LALR(1) grammar and to produce the source code of the syntactic analyzer (parser) for the language generated by this grammar.
• A grammar specifies a set of production rules, which define a language.
• A production rule specifies a sequence of symbols that forms a legal construct (a "sentence") in the language.

History
• Yacc was originally written by Stephen C. Johnson, 1975.
• Variants:
  – lex, yacc (AT&T)
  – bison: a yacc replacement (GNU)
  – flex: fast lexical analyzer generator (a free lex replacement)
  – BSD yacc
  – PCLEX, PCYACC (Abraxas Software)

How YACC Works
(1) Parse:   YACC source (translate.y) → yacc → y.tab.c
(2) Compile: y.tab.c → cc/gcc → a.out
(3) Run:     token stream → a.out → output

Skeleton of a yacc specification (.y file)

translate.y:

    %{
    < C global variables, prototypes, comments >
    %}
    [DEFINITION SECTION]
    %%
    [PRODUCTION RULES SECTION]
    %%
    < C auxiliary subroutines >

• y.tab.c is generated after running yacc on this file.
• %{ ... %}: this part will be embedded into y.tab.c.
• Definition section: contains token declarations. Tokens are recognized in the lexer.
• Production rules section: defines how to "understand" the input language, and what actions to take for each "sentence".
• C auxiliary subroutines: any user code. For example, a main function to call the parser function yyparse().

YACC File Format
• Definition section
  – declarations of tokens
  – type of values used on parser stack
• Rules section
  – list of grammar rules with semantic routines
• User code
• Comments in /* ... */ may appear in any of the sections

Declaration Section
• Two optional sections
  – Ordinary C declarations delimited by %{ and %}
  – Declarations of grammar tokens
• %token DIGIT
  – Declares DIGIT to be a token.
  – Tokens specified in this section can be used as terminals in the second and third sections.

Translation Rules Section
• Each rule consists of a grammar production and its associated semantic action.
• A production
    <left side> → <alt 1> | <alt 2> | … | <alt n>
  would be written in YACC as
    <left side> : <alt 1> {semantic action 1}
                | <alt 2> {semantic action 2}
                | …
                | <alt n> {semantic action n}
                ;

Translation Rules Section
• Rule section is a grammar
• Example
    expr   : expr '+' term | term ;
    term   : term '*' factor | factor ;
    factor : '(' expr ')' | ID | NUM ;
• A semantic action is a sequence of C statements.
• The semantic action is performed when the parser reduces by the associated production.
• Normally the semantic action computes a value for $$ (the value of the left side) in terms of the $i's ($i is the value of the i-th symbol on the right side).
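To make the $$ and $i notation concrete, the following is a minimal sketch of what happens on the parser's value stack when a reduction runs an action such as $$ = $1 + $3. It is a hand-written illustration, not the code yacc generates: the names value_stack, shift_value, reduce, act_add, act_mul and trace_2_plus_3_times_4 are all hypothetical, and the state stack and parse tables of a real LALR(1) parser are left out.

```c
#include <stddef.h>

#define STACK_MAX 64

/* Hypothetical value stack: one slot per grammar symbol on the parse stack. */
static int value_stack[STACK_MAX];
static size_t value_top = 0;

/* Shifting a token pushes its attribute value. */
static void shift_value(int v) { value_stack[value_top++] = v; }

/* Reducing by a rule with rhs_len right-side symbols:
 * rhs[0] plays the role of $1, rhs[1] of $2, and so on.
 * The action's result plays the role of $$ and replaces the popped slots. */
static void reduce(size_t rhs_len, int (*action)(const int *rhs))
{
    const int *rhs = &value_stack[value_top - rhs_len];
    int result = action(rhs);
    value_top -= rhs_len;
    value_stack[value_top++] = result;
}

static int act_add(const int *rhs) { return rhs[0] + rhs[2]; } /* $$ = $1 + $3 */
static int act_mul(const int *rhs) { return rhs[0] * rhs[2]; } /* $$ = $1 * $3 */

/* Replays the shifts and reductions for the input 2 + 3 * 4.
 * Unit reductions (factor to term, term to expr) just copy $1,
 * so they leave the value stack unchanged and are omitted here. */
int trace_2_plus_3_times_4(void)
{
    value_top = 0;
    shift_value(2);
    shift_value('+');     /* operator slot; its value is never read */
    shift_value(3);
    shift_value('*');
    shift_value(4);
    reduce(3, act_mul);   /* term : term '*' factor, stack is now 2 '+' 12 */
    reduce(3, act_add);   /* expr : expr '+' term,   stack is now 14       */
    return value_stack[0];
}
```

Calling trace_2_plus_3_times_4() returns 14. The multiplication is reduced first because the grammar nests term below expr.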

The Position of Rules
    expr   : expr '+' term     { $$ = $1 + $3; }
           | term
           ;
    term   : term '*' factor   { $$ = $1 * $3; }
           | factor            { $$ = $1; }
           ;
    factor : '(' expr ')'      { $$ = $2; }
           | ID
           | NUM
           ;

Supporting C Routines
• A lexical analyzer by the name yylex() should be provided.
  – yylex() produces a pair consisting of a token and its attribute value.
  – If a token such as DIGIT is returned, it must be declared in the first section.
  – The attribute value associated with a token is communicated to the parser through a Yacc-defined variable, yylval.
• Error recovery routines may be added as necessary.
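The token/attribute pair can be sketched as follows. This is a minimal illustration, assuming a token named NUMBER and an input held in a string. The helper set_input, the numeric value chosen for NUMBER, and the local definition of yylval are assumptions made so the fragment compiles on its own. In a real build the token code comes from the generated header (y.tab.h, produced by yacc -d) and yylval is defined by the generated parser.

```c
#include <ctype.h>

/* Assumption: stand-in for the token code that y.tab.h would define. */
#define NUMBER 258

/* Normally defined by the yacc-generated parser; defined here so the
 * sketch is self-contained. */
int yylval;

/* Hypothetical input buffer, used instead of stdin. */
static const char *input = "";

void set_input(const char *text) { input = text; }

/* Returns the next token code; for NUMBER the attribute value is left
 * in yylval. Returning 0 tells the parser that the input has ended. */
int yylex(void)
{
    while (*input == ' ' || *input == '\t')
        input++;                          /* skip blanks */

    if (*input == '\0')
        return 0;                         /* end of input */

    if (isdigit((unsigned char)*input)) {
        yylval = 0;
        while (isdigit((unsigned char)*input)) {
            yylval = yylval * 10 + (*input - '0');
            input++;
        }
        return NUMBER;                    /* token; attribute is in yylval */
    }

    return (unsigned char)*input++;       /* any other character is its own token */
}
```

After set_input("12 + 7"), successive calls to yylex() return NUMBER (with yylval set to 12), '+', NUMBER (with yylval set to 7), and finally 0.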

Sample yacc program
    %{
    #include <ctype.h>
    #include <stdio.h>   /* for printf */
    %}
    %token DIGIT
    %%
    line   : expr '\n'         { printf("%d\n", $1); }
           ;
    expr   : expr '+' term     { $$ = $1 + $3; }
           | term
           ;
    term   : term '*' factor   { $$ = $1 * $3; }
           | factor
           ;
    factor : '(' expr ')'      { $$ = $2; }
           | DIGIT
           ;
    %%

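Sample yacc program
    %{
    #include <ctype.h>
    #include <stdio.h>   /* for printf and getchar */
    %}
    %token DIGIT
    %%
    line   : expr '\n'         { printf("%d\n", $1); }
           ;
    expr   : expr '+' term     { $$ = $1 + $3; }
           | term
           ;
    term   : term '*' factor   { $$ = $1 * $3; }
           | factor
           ;
    factor : '(' expr ')'      { $$ = $2; }
           | DIGIT
           ;
    %%
    int yylex(void)
    {
        int c;
        c = getchar();
        if (isdigit(c)) {
            yylval = c - '0';   /* attribute value: the digit's numeric value */
            return DIGIT;
        }
        return c;               /* any other character is its own token */
    }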

Yacc with ambiguous grammar
Precedence / Associativity
    %token NUMBER
    %left '+' '-'
    %left '*' '/'
    %right UMINUS
    %%
    lines : expr '\n'             { printf("%d\n", $1); }
          ;
    expr  : expr '+' expr         { $$ = $1 + $3; }
          | expr '-' expr         { $$ = $1 - $3; }
          | expr '*' expr         { $$ = $1 * $3; }
          | expr '/' expr         { if ($3 == 0) yyerror("divide 0");
                                    else $$ = $1 / $3; }
          | '-' expr %prec UMINUS { $$ = -$2; }
          | NUMBER
          ;
    %%
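The way precedence declarations settle a shift/reduce choice can be sketched as a small decision function. This is an illustration of the rule yacc applies, not code taken from yacc: the type and function names are hypothetical, and precedence levels are assumed to be numbered in declaration order (1 for '+' and '-', 2 for '*' and '/', 3 for UMINUS), with later declarations binding tighter. The ASSOC_NONASSOC case corresponds to yacc's %nonassoc declaration, which the grammar above does not use.

```c
typedef enum { ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONASSOC } Assoc;
typedef enum { ACT_SHIFT, ACT_REDUCE, ACT_ERROR } Action;

/* rule_prec:   precedence of the rule that could be reduced (by default
 *              that of its last terminal, or the one named by %prec).
 * token_prec:  precedence of the lookahead token.
 * token_assoc: associativity declared for the lookahead token. */
Action resolve_shift_reduce(int rule_prec, int token_prec, Assoc token_assoc)
{
    if (token_prec > rule_prec)
        return ACT_SHIFT;             /* lookahead binds tighter */
    if (token_prec < rule_prec)
        return ACT_REDUCE;            /* rule binds tighter */

    /* Equal precedence: associativity decides. */
    switch (token_assoc) {
    case ASSOC_LEFT:  return ACT_REDUCE;   /* a - b - c is (a - b) - c */
    case ASSOC_RIGHT: return ACT_SHIFT;    /* group to the right */
    default:          return ACT_ERROR;    /* %nonassoc: syntax error */
    }
}
```

With expr '+' expr on the stack and '*' as lookahead, resolve_shift_reduce(1, 2, ASSOC_LEFT) yields ACT_SHIFT, so multiplication groups first. With '-' expr on the stack (precedence 3 through %prec UMINUS) and '*' as lookahead, resolve_shift_reduce(3, 2, ASSOC_LEFT) yields ACT_REDUCE, so the unary minus applies to its operand before the multiplication.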

Conflicts
• shift/reduce conflict
  – occurs when a grammar is written in such a way that a decision between shifting and reducing cannot be made.
  – ex: the IF-ELSE ambiguity.
  – To resolve this conflict, yacc will choose to shift.
• reduce/reduce conflict
    start : expr | stmt ;
    expr  : CONSTANT ;
    stmt  : CONSTANT ;
  – Yacc resolves the conflict by reducing using the rule that occurs earlier in the grammar.
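The effect of preferring shift in the IF-ELSE case can be seen with a small hand-written parser. This is not yacc's LALR(1) machinery. It is a recursive-descent sketch that mimics the same choice by always consuming an else as soon as one is available. The one-letter token encoding ('i' for an if with its condition, 'e' for else, 's' for a simple statement) and the names parse_stmt and bracket_if_else are assumptions made for the illustration.

```c
/* Toy grammar (assumed encoding):
 *   stmt : 'i' stmt | 'i' stmt 'e' stmt | 's' ;
 */

static const char *cur;   /* next unread token */
static char *out;         /* write position in the output buffer */

/* Copies one statement to the output, wrapping every if-statement in
 * parentheses so the grouping is visible. Returns 1 on success. */
static int parse_stmt(void)
{
    if (*cur == 's') {
        *out++ = *cur++;
        return 1;
    }
    if (*cur != 'i')
        return 0;                 /* syntax error */

    *out++ = '(';
    *out++ = *cur++;              /* the 'i' */
    if (!parse_stmt())
        return 0;
    if (*cur == 'e') {            /* "prefer shift": take the else now */
        *out++ = *cur++;
        if (!parse_stmt())
            return 0;
    }
    *out++ = ')';
    return 1;
}

/* buffer must hold at least 3 * strlen(tokens) + 1 bytes.
 * Returns 1 if tokens form exactly one statement, 0 otherwise. */
int bracket_if_else(const char *tokens, char *buffer)
{
    int ok;
    cur = tokens;
    out = buffer;
    ok = parse_stmt() && *cur == '\0';
    *out = '\0';
    return ok;
}
```

For the token string "iises" (if, if, statement, else, statement) the buffer receives "(i(ises))": the else is attached to the nearest if, which is the grouping a yacc parser produces when it resolves the conflict by shifting.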