Parser and Scanner Generation An Introduction Using SLK

Parser and Scanner Generation: An Introduction Using SLK and Flex++ Creative Commons License – Curt Hill

Introduction
• There are a variety of parser generators and scanner generators
• The standards for UNIX seem to be lex and yacc
  – Yacc seems to have replaced something earlier
• The GNU versions are flex and bison
• We will use flex++, which is somewhat more parameterizable
• We will also use SLK, which seems to accept a better set of languages

Warning
• This process is complicated
• This presentation gives an overview of the process
• More detailed coverage of Flex++ and SLK follows
  – Neither program is really easy

Process 1
• Craft the BNF
• Feed this into your parser generator
• The parser generator should generate several files
  – Some of these will compile into the later pieces of the parser
  – Others will give information to help you debug your grammar
  – One should give you the reserved words of the language

Debugging Grammar?
• Typically, what happens is that some construct was left out
  – A non-terminal never appears on the left-hand side of a production
• When this is the case, the parser generator treats it as a terminal and lists it as a reserved word
• Thus, we look at the reserved words to see if they are all really valid
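The check described above can be sketched in a few lines of C++. This is only an illustration of the idea, not how SLK is implemented: the `Production` struct, the `inferred_terminals` function, and the sample grammar are all hypothetical names invented for this sketch.

```cpp
#include <set>
#include <string>
#include <vector>

// One BNF production: lhs -> rhs[0] rhs[1] ...
struct Production {
    std::string lhs;
    std::vector<std::string> rhs;
};

// Returns every symbol that appears on some right-hand side but never on a
// left-hand side. A generator has no choice but to treat these as terminals.
std::set<std::string> inferred_terminals(const std::vector<Production>& grammar) {
    std::set<std::string> defined;
    for (const auto& p : grammar) defined.insert(p.lhs);

    std::set<std::string> terminals;
    for (const auto& p : grammar)
        for (const auto& symbol : p.rhs)
            if (defined.count(symbol) == 0) terminals.insert(symbol);
    return terminals;
}

// Hypothetical grammar in which the productions for "term" were forgotten.
std::vector<Production> sample_grammar() {
    return {
        {"stmt", {"IF", "expr", "THEN", "stmt"}},
        {"stmt", {"ID", "ASSIGN", "expr"}},
        {"expr", {"term", "PLUS", "term"}},
    };
}
```

Calling `inferred_terminals(sample_grammar())` yields the five intended terminals plus `term`. Seeing `term` in the reserved word list is the signal that its productions are missing.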

Makefile
• Since we are likely to go through this process several times, we use Make to create our executable
• Unlike in all previous makefiles, we will use Flex++ and SLK just as if they were compilers
  – The generated source code must be newer than the SLK and Flex++ input; otherwise Make reruns the generator

Tagging
• SLK has a parameter that allows all files to be prefixed by some letters
• This allows files with the same function but for different languages to be distinguished
• For example, if we were working on a C and a separate C++ parser, we might use prefixes of C_ or CPP_
• In this class we will use our initials
  – One parser generator will be quite enough

Process 2
• Start creating the input to Flex++
• As previously mentioned, the parser generator generates a file that will list the terminals
  – In SLK this is the XXXKeywords.txt file or the XXXConstants.h file
  – Where XXX would be your initials

Process 2
• Design your token data type
  – I suggest a class
• It should have an enumeration or integer that is the token type
• It should have a string to represent identifiers or number tokens
• It should record the line and column number where the token was found
• It should have the usual constructors
  – Possibly one with all of the above
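A minimal sketch of such a token class follows. The names `Token` and `TokenType` and the enumerators are assumptions made for illustration; in a real project the token codes would presumably come from the generated XXXConstants.h rather than from a hand-written enumeration.

```cpp
#include <string>
#include <utility>

// Hypothetical token categories; real codes would come from the generated
// constants file.
enum class TokenType { EndOfInput, Identifier, Number, ReservedWord, Punctuation, Error };

class Token {
public:
    // Default constructor: an end-of-input token at an unknown position.
    Token() = default;

    // Constructor that takes all of the fields at once.
    Token(TokenType type, std::string text, int line, int column)
        : type_(type), text_(std::move(text)), line_(line), column_(column) {}

    TokenType type() const { return type_; }
    const std::string& text() const { return text_; }  // lexeme for names and numbers
    int line() const { return line_; }
    int column() const { return column_; }

private:
    TokenType type_ = TokenType::EndOfInput;
    std::string text_;
    int line_ = 0;    // 0 means "position not recorded"
    int column_ = 0;
};
```

A scanner rule could then return something like `Token(TokenType::Identifier, "count", 3, 7)`, which gives later error messages a position to report.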

Process 3
• Start the scanner definition section
  – Name the scanner
  – Set whatever definitions make sense
  – The header definition holds the includes
• Determine what the scanner will return:
  – Integer
  – Enumeration
  – Object (this is the recommended one)

Process 4
• Start the rules section of the scanner generator
• For each terminal create a rule
  – Punctuation and reserved words are easy
  – Numbers and names are somewhat harder
  – Comments are hardest
• Each token rule should return an initialized object
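To make the idea of one rule per terminal concrete, here is a hand-written C++ sketch that uses `std::regex`. It is not Flex++ input and not what Flex++ generates. The names `Kind`, `Lexeme`, `Rule`, and `scan`, and the tiny language it recognizes, are assumptions for illustration. It tries the rules in order and takes the first that matches, whereas flex-style scanners prefer the longest match. Comments are left out.

```cpp
#include <cctype>
#include <regex>
#include <string>
#include <vector>

enum class Kind { Reserved, Identifier, Number, Punctuation, Error };

struct Lexeme {
    Kind kind;
    std::string text;
};

struct Rule {
    Kind kind;
    std::regex pattern;
};

// Splits the input into lexemes by trying each rule at the current position.
std::vector<Lexeme> scan(const std::string& input) {
    // Order matters: reserved words must be tried before identifiers.
    static const std::vector<Rule> rules = {
        {Kind::Reserved,    std::regex("(if|then|else)\\b")},
        {Kind::Identifier,  std::regex("[A-Za-z_][A-Za-z0-9_]*")},
        {Kind::Number,      std::regex("[0-9]+")},
        {Kind::Punctuation, std::regex(":=|[;()+*-]")},
    };

    std::vector<Lexeme> out;
    auto pos = input.cbegin();
    while (pos != input.cend()) {
        if (std::isspace(static_cast<unsigned char>(*pos))) { ++pos; continue; }

        bool matched = false;
        for (const auto& rule : rules) {
            std::smatch m;
            // match_continuous anchors the match at pos.
            if (std::regex_search(pos, input.cend(), m, rule.pattern,
                                  std::regex_constants::match_continuous)) {
                out.push_back({rule.kind, m.str()});
                pos += m.length();
                matched = true;
                break;
            }
        }
        if (!matched) {  // no rule applies: emit a one-character error lexeme
            out.push_back({Kind::Error, std::string(1, *pos)});
            ++pos;
        }
    }
    return out;
}
```

The reserved word pattern ends in `\b` so that a name such as `ifx` falls through to the identifier rule instead of being split into `if` and `x`.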

Process 5
• There are certain things that a parser generator cannot determine
• These are often called semantic routines
• One of these is what to do if there is an error in the input file
• Next, we design the error handling
  – This is the XXXError class
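As a rough illustration of what an error handling class might do, the sketch below records each problem with its position so that parsing can continue and all messages can be reported at the end. The class name `ErrorLog` and its methods are hypothetical. The real XXXError interface is dictated by the code SLK generates and is covered later.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical error recorder; not the interface SLK expects.
class ErrorLog {
public:
    // Called when the parser expected one symbol but the scanner produced another.
    void mismatch(const std::string& expected, const std::string& found,
                  int line, int column) {
        std::ostringstream msg;
        msg << line << ':' << column << ": expected " << expected
            << " but found " << found;
        messages_.push_back(msg.str());
    }

    // True while no error has been recorded.
    bool ok() const { return messages_.empty(); }

    const std::vector<std::string>& messages() const { return messages_; }

private:
    std::vector<std::string> messages_;
};
```

Collecting messages instead of stopping at the first one is a design choice: it lets a single run report several mistakes in the input file.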

Process 6
• Another thing that cannot be predetermined by the parser generator is what to do for a semantic action
• BNF cannot:
  – Check types
  – Generate code
• Instead, we specify a function that does something at various steps
• The writing of these functions is what this step is about
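The sketch below shows one way such functions could look: the parser calls `execute` with an action number at chosen points in the grammar, and the actions keep a stack of operand types so that an addition can be type checked. The class name `Actions`, the action numbers, and the two-type language are all assumptions for illustration, not the interface SLK generates.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

enum class Type { Int, Bool };

// Hypothetical semantic actions driven by action numbers.
class Actions {
public:
    void execute(int action) {
        switch (action) {
        case 1: types_.push_back(Type::Int);  break;  // saw an integer literal
        case 2: types_.push_back(Type::Bool); break;  // saw a boolean literal
        case 3: check_addition();             break;  // finished "expr + expr"
        default: throw std::invalid_argument("unknown action number");
        }
    }

    const std::vector<std::string>& errors() const { return errors_; }

private:
    // Pops both operand types, reports a type error if either is not Int,
    // and pushes Int as the result type so checking can continue.
    void check_addition() {
        if (types_.size() < 2) throw std::logic_error("operand stack underflow");
        Type right = types_.back(); types_.pop_back();
        Type left = types_.back();  types_.pop_back();
        if (left != Type::Int || right != Type::Int)
            errors_.push_back("operands of + must be integers");
        types_.push_back(Type::Int);
    }

    std::vector<Type> types_;
    std::vector<std::string> errors_;
};
```

For the input `1 + 2` the parser would call actions 1, 1, 3 and no error is recorded. For `1 + true` it would call 1, 2, 3 and one type error is recorded.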

Process 7
• Once all this is done the parser should be makeable
• Of course, at this step we need to have various legal and illegal programs to test the parser
• This will require various refinements
  – Anywhere from the grammar to one of the semantic routines

Finally
• Next, we consider the SLK and Flex++ programs in more detail
• We will then look at a completed project
• An assignment would be next
• Are you excited yet?