International Institute of Information Technology, Pune
Department of Computer Engineering
Systems Programming & Operating Systems
Unit – III Case Study: Overview of LEX and YACC

LEX & YACC
• What is Lex?
• Lex is a tool that generates a "lexical analyser" (also called a lexer or scanner).
• Its main job is to break up an input stream into more usable elements (tokens).
• Or, in other words, to identify the "interesting bits" in a text file.
• What is Yacc?
• Yacc is a tool that generates a "parser".
• In the course of its normal work, the parser also verifies that the input is syntactically sound.
• YACC stands for "Yet Another Compiler-Compiler", because this kind of analysis of text files is normally associated with writing compilers.

Prof. Deptii Chaudhari, I2IT, Pune

LEX Program Structure

Definitions:
    %{
    C global variables, prototypes, comments
    %}
    %%
Production rules:
    ------------------
    %%
User subroutine section (optional)

• In the rules section, each rule is made up of two parts: a pattern and an action, separated by whitespace.
• The lexer that lex generates executes the action when it recognizes the pattern.
• The user subroutine section consists of any legal C code.
• Lex copies it to the C file after the end of the lex-generated code.
• Lex translates the lex specification into a C source file called lex.yy.c, which we compile and link with the lex library using -ll.
• We can then execute the resulting program to check that it works as expected.

Example

    %{
    #include <stdio.h>
    %}
    %%
    [0123456789]+           printf("NUMBER\n");
    [a-zA-Z][a-zA-Z0-9]*    printf("WORD\n");
    %%

• Running the program

    $ lex example_lex.l
    $ gcc lex.yy.c -ll
    $ ./a.out

Pattern Matching Primitives

    Metacharacter   Matches
    .               any character except newline
    \n              newline
    *               zero or more copies of the preceding expression
    +               one or more copies of the preceding expression
    ?               zero or one copy of the preceding expression
    ^               beginning of line
    $               end of line
    a|b             a or b
    (ab)+           one or more copies of ab (grouping)
    "a+b"           literal "a+b" (C escapes still work)
    []              character class

Pattern Matching Examples

    Expression      Matches
    abc*            ab, abc, abcc, abccc, ...
    abc+            abc, abcc, abccc, ...
    a(bc)+          abc, abcbc, abcbcbc, ...
    a(bc)?          a, abc
    [abc]           one of: a, b, c
    [a-z]           any letter, a through z
    [a\-z]          one of: a, -, z
    [-az]           one of: -, a, z
    [A-Za-z0-9]+    one or more alphanumeric characters
    [ \t\n]+        whitespace
    [^ab]           anything except: a, b
    [a^b]           one of: a, ^, b
    [a|b]           one of: a, |, b
    a|b             one of: a, b
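Several of the expressions above can be spot-checked outside lex. The following is a minimal sketch in C, assuming a POSIX system that provides <regex.h>. POSIX extended regular expressions agree with lex patterns for these simple cases, but they are not the same dialect: quoted literals such as "a+b" and the escaped dash in [a\-z] are lex-specific and are not covered here. The helper name matches_whole is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of lex: returns true when the whole of
 * `text` matches the POSIX extended regular expression `pattern`. */
bool matches_whole(const char *pattern, const char *text)
{
    char anchored[256];
    regex_t re;
    bool ok;

    assert(pattern != NULL && text != NULL);

    /* Anchor the pattern so that a partial match does not count. */
    int n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    if (n < 0 || (size_t)n >= sizeof anchored)
        return false;                     /* pattern too long for this sketch */

    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return false;                     /* pattern did not compile */

    ok = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

For instance, matches_whole("abc*", "ab") holds because c may occur zero times, while matches_whole("abc+", "ab") does not because + requires at least one c.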

Operation of yylex()
• When lex compiles the input specification, it generates the C file lex.yy.c, which contains the routine int yylex(void).
• This routine reads the input, trying to match it against the token patterns specified in the rules section.
• On a match, the associated action is executed.
• Calling yylex() starts the pattern-matching process.
• Lex stores the matched string at the address pointed to by yytext.
• The length of the matched string is kept in yyleng, while the value of the token is kept in the variable yylval, which the action assigns.
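The behaviour described above can be imitated by hand to see what the bookkeeping looks like. This is only a sketch under stated assumptions: a real lex-generated yylex() is table-driven and reads from yyin, whereas this stand-in scans a string, recognises just the NUMBER and WORD patterns from the earlier example, and skips whitespace (the earlier example does not). All names prefixed with demo_ and the TOK_ constants are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { TOK_EOF = 0, TOK_NUMBER = 1, TOK_WORD = 2, TOK_OTHER = 3 };

static const char *demo_input = "";  /* unread part of the input          */
char   demo_yytext[64];              /* stands in for yytext              */
size_t demo_yyleng;                  /* stands in for yyleng              */

/* Point the scanner at a new input string. */
void demo_set_input(const char *s)
{
    assert(s != NULL);
    demo_input = s;
}

/* Hand-written stand-in for yylex(): returns the next token code and
 * leaves the matched text in demo_yytext and its length in demo_yyleng. */
int demo_yylex(void)
{
    const char *p = demo_input;
    const char *start;
    int tok;

    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;                                   /* assumption: skip whitespace */
    start = p;

    if (*p == '\0') {                          /* end of input */
        demo_input = p;
        demo_yytext[0] = '\0';
        demo_yyleng = 0;
        return TOK_EOF;
    }
    if (isdigit((unsigned char)*p)) {          /* [0-9]+ */
        while (isdigit((unsigned char)*p))
            p++;
        tok = TOK_NUMBER;
    } else if (isalpha((unsigned char)*p)) {   /* [a-zA-Z][a-zA-Z0-9]* */
        while (isalnum((unsigned char)*p))
            p++;
        tok = TOK_WORD;
    } else {                                   /* any other single character */
        p++;
        tok = TOK_OTHER;
    }

    /* Copy the lexeme; over-long lexemes are truncated in this sketch. */
    demo_yyleng = (size_t)(p - start);
    if (demo_yyleng >= sizeof demo_yytext)
        demo_yyleng = sizeof demo_yytext - 1;
    memcpy(demo_yytext, start, demo_yyleng);
    demo_yytext[demo_yyleng] = '\0';

    demo_input = p;                            /* resume here on the next call */
    return tok;
}
```

After demo_set_input("abc 123"), the first call returns TOK_WORD with demo_yytext holding "abc", and the second returns TOK_NUMBER with demo_yytext holding "123".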

Lex specification:

    %{
    int com=0;
    %}
    %%
    "/*"[^\n]+"*/"  {com++; fprintf(yyout, " "); }
    %%
    int main()
    {
    printf("Write a C program\n");
    yyout=fopen("output", "w");
    yylex();
    printf("Comment=%d\n", com);
    return 0;
    }

Sample run:

    $ cc lex.yy.c -ll
    $ ./a.out
    Write a C program
    #include<stdio.h>
    int main()
    {
    int a, b;
    /*float c; */
    printf("Hi");
    /*printf("Hello"); */
    }
    Comment=2
    $ cat output
    #include<stdio.h>
    int main()
    {
    int a, b;
    printf("Hi");
    }

Lex Predefined Variables

YACC
• YACC is a parser generator that takes an input file with an attribute-enriched BNF (Backus-Naur Form) grammar specification.
• It generates the output C file y.tab.c, containing the function int yyparse(void) that implements the parser.
• This function automatically invokes yylex() every time it needs a token to continue parsing.
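The way the parser pulls tokens from the lexer on demand can be sketched by hand. This is an illustration of the call pattern only, under stated assumptions: yacc generates a table-driven parser from the grammar, whereas this stand-in is a hand-written loop for the single rule NUMBER ('+' NUMBER)*. The names sketch_yylex, sketch_yyparse, sketch_lexer_calls, token_value and the T_ constants are hypothetical; token_value plays the role of yylval.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* T_NUMBER is chosen above 255 so that single characters such as '+'
 * can be returned as their own character codes. */
enum { T_END = 0, T_NUMBER = 256 };

static const char *src = "";  /* unread input                          */
static int token_value;       /* stands in for yylval                  */
static int lexer_calls;       /* how often the parser asked for a token */

/* Stand-in lexer: returns one token per call. */
static int sketch_yylex(void)
{
    lexer_calls++;
    while (*src == ' ' || *src == '\t')
        src++;
    if (*src == '\0' || *src == '\n')
        return T_END;
    if (isdigit((unsigned char)*src)) {
        token_value = 0;
        while (isdigit((unsigned char)*src))
            token_value = token_value * 10 + (*src++ - '0');
        return T_NUMBER;
    }
    return (unsigned char)*src++;   /* any other character is its own token */
}

/* Stand-in parser for NUMBER ('+' NUMBER)*.
 * Returns true and stores the sum in *result when the input is valid. */
bool sketch_yyparse(const char *input, int *result)
{
    int tok, sum;

    assert(input != NULL && result != NULL);
    src = input;
    lexer_calls = 0;

    if (sketch_yylex() != T_NUMBER)       /* the parser asks for a token... */
        return false;
    sum = token_value;

    while ((tok = sketch_yylex()) == '+') {   /* ...and again when it needs more */
        if (sketch_yylex() != T_NUMBER)
            return false;
        sum += token_value;
    }
    if (tok != T_END)
        return false;                     /* trailing input is a syntax error */

    *result = sum;
    return true;
}

/* Number of lexer calls made by the most recent parse. */
int sketch_lexer_calls(void)
{
    return lexer_calls;
}
```

Parsing "1 + 2 + 39" succeeds with a sum of 42 after six lexer calls: three numbers, two plus signs, and the end marker. The lexer never runs ahead on its own; each token is produced only when the parser requests it.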

Structure of YACC Program

Definitions:
    %{
    C global variables, prototypes, comments
    %}
    %%
Context-free grammar and action for each production:
    ------------------
    %%
Subroutines/Functions

arithmatic.l

    %{
    #include <stdio.h>
    #include <stdlib.h>     /* for atoi() */
    #include "y.tab.h"
    extern int yylval;
    %}
    %%
    [0-9]+  { yylval=atoi(yytext); return NUMBER; }   /* pass the number's value to the parser */
    [\t]    ;                                         /* ignore tabs */
    [\n]    return 0;                                 /* end of input line ends parsing */
    .       return yytext[0];                         /* any other character is its own token */
    %%
    int yywrap()
    {
    return 1;
    }

How To Run:

    $ yacc -d arithmatic.y
    $ lex arithmatic.l
    $ gcc lex.yy.c y.tab.c
    $ ./a.out