Syntax versus Semantics

Syntax versus Semantics
Sections 3.1, 3.2, 3.3
• Syntax – form (or structure) of PL constructs
  – Syntax of the C++ if statement:
    • the keyword if, followed by
    • a left parenthesis, followed by
    • a boolean-valued expression, followed by
    • a right parenthesis, followed by
    • a statement
  – General form: if ( <expr> ) <statement>
  – Example: if ( a > 0 && b > foo.bar(a) ) cout << "Yeah, baby!" << endl;

Syntax versus Semantics
• Semantics – meaning of PL constructs
  – Semantics of the C++ if statement:
    • Evaluate the boolean-valued expression <expr>
    • If the result is true, execute <statement>
    • If the result is false, do not execute <statement>

Describing syntax
• Describing syntax is easier than describing semantics
• A universally accepted notation (BNF) is available for describing syntax
• Programming language design goal: semantics should follow directly from syntax (form should suggest meaning)

Components of a language
• Lexemes – smallest units of the language; “words”
  – Examples: 3, =, ==, +, count, while, if, else
• Token – name of a category of lexemes
  – Identifier: index, count, j, k (∞ lexemes in class)
  – Int_constant: 3, 100
  – Plusop: + (just one lexeme in class)
• Sentences (aka strings, statements)
  – A sentence is a sequence of tokens
  – An entire “program” is a string/sentence
  – The question is then: is the sentence in the C++ language?
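
The lexeme/token distinction above can be sketched as a tiny scanner that pairs each lexeme with the name of its category. This is only an illustration: Identifier, Int_constant and Plusop come from the slide, while Keyword, Eqop, Assignop and Skip are names assumed here for the remaining example lexemes.

```python
import re

# Token categories. Identifier, Int_constant and Plusop are from the slide;
# Keyword, Eqop, Assignop and Skip are hypothetical names added for this sketch.
TOKEN_SPEC = [
    ("Int_constant", r"\d+"),
    ("Keyword",      r"\b(?:while|if|else)\b"),
    ("Identifier",   r"[A-Za-z_]\w*"),
    ("Eqop",         r"=="),
    ("Assignop",     r"="),
    ("Plusop",       r"\+"),
    ("Skip",         r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (token, lexeme) pairs for the input text."""
    pairs = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "Skip":          # whitespace separates lexemes but is not one
            pairs.append((m.lastgroup, m.group()))
        pos = m.end()
    return pairs
```

For instance, tokenize("count = count + 3") yields two Identifier tokens that share the lexeme count, which is the sense in which a token names a whole class of lexemes.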

Formal methods of describing syntax
• Backus-Naur Form (BNF)
• Context-free grammar
• Extended BNF
• Syntax graph

BNF - Origins
• Backus-Naur Form – John Backus and Peter Naur developed a notation (called BNF) for describing programming language syntax
• Context-free grammar – Noam Chomsky, a linguist, identified four classes of grammars, one of which is the context-free grammar
• BNF and context-free grammars are equivalent in descriptive power

BNF - Definitions
• Metalanguage – language used to describe another language
• The BNF metalanguage contains
  – Rules (also called productions)
    • Format: <LHS> → <RHS>
  – Non-terminals – abstractions; each can be defined by other non-terminals and terminals; <LHS> is always a non-terminal

BNF - Definitions
• Terminals – lexemes and tokens; <RHS> can be a mixture of terminals and non-terminals
• Examples (from Pascal)
  – <ifstmt> → if <logic-expr> then <stmt>
  – <ifstmt> → if <logic-expr> then <stmt> else <stmt>
• Another way to write the two rules above
  – <ifstmt> → if <logic-expr> then <stmt> | if <logic-expr> then <stmt> else <stmt>

BNF - Definitions
• Grammar – collection of rules
• Recursive rule – LHS appears in the RHS; useful for expressing variable-length lists
  – <ident-list> → identifier | identifier , <ident-list>
  – This rule is right-recursive because <ident-list> appears at the end (right side) of the RHS

BNF - Definitions
• Start symbol
  – BNF is a generative device
  – A sentence of the language is generated by applying rules
  – The first rule applied is one whose <LHS> is the start symbol
    • <program> → begin <stmt-list> end

BNF - Definitions
• Derivation – the generation of a sentence; the sentence is derived from the start symbol
• Leftmost derivation – each step replaces the leftmost non-terminal in the sentential form
  – <program> → begin <stmt_list> end
  – → begin <stmt> ; <stmt_list> end
  – → begin <var> := <expression> ; <stmt_list> end
  – …
  – → begin foo := 3 + 7; print(foo) end
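
A leftmost derivation can be replayed mechanically: at each step, find the leftmost non-terminal and replace it with the right-hand side of a chosen rule. The sketch below assumes a small grammar of its own (the slide does not give the full grammar, so the rules for <stmt_list>, <var> and <expression> are illustrative), and the helper name leftmost_derivation is hypothetical.

```python
# Assumed grammar for illustration only; the slide shows just the <program> rule.
GRAMMAR = {
    "<program>":    [["begin", "<stmt_list>", "end"]],
    "<stmt_list>":  [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>":       [["<var>", ":=", "<expression>"]],
    "<var>":        [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"], ["<var>"]],
}

def leftmost_derivation(start, choices):
    """Apply one rule per choice to the leftmost non-terminal.

    choices[i] is the index of the alternative to use at step i.
    Returns every sentential form, starting with the start symbol.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost non-terminal in the current sentential form.
        idx = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form = form[:idx] + GRAMMAR[form[idx]][choice] + form[idx + 1:]
        steps.append(" ".join(form))
    return steps

steps = leftmost_derivation("<program>", [0, 0, 0, 0, 0, 1, 2])
for s in steps:
    print("=>", s)
```

The last sentential form contains no non-terminals, so it is a sentence of the assumed grammar.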

BNF - Definitions
• Parse tree – hierarchical structure of a sentence; internal nodes are non-terminals; leaf nodes are terminals
• Ambiguous grammar – a grammar in which some sentence has two or more distinct parse trees (since a compiler generates code from a parse tree, it could generate incorrect code if the grammar were ambiguous!)
[Figure: partial parse tree with root program and children begin, stmt_list, end; stmt_list expands to stmt ; stmt_list, stmt expands to var := expression, and so on.]

Ambiguous grammar for a simple assignment statement
• Rule 1: <assign> → <id> := <expr>
• Rule 2: <id> → A | B | C
• Rule 3: <expr> → <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>
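
One way to see the ambiguity of Rule 3 is to enumerate every parse tree of an expression by brute force. The sketch below is an illustration rather than a standard parsing algorithm: it tries each operator as the top-level split, and writes each tree as a bracketed string. The function name parse_trees and the bracket notation are choices made here.

```python
from functools import lru_cache

def parse_trees(tokens):
    """Enumerate all parse trees of <expr> under Rule 3, as bracketed strings."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def expr(i, j):
        # All trees that derive tokens[i:j] from <expr>.
        trees = []
        if j - i == 1 and tokens[i] in ("A", "B", "C"):
            trees.append(tokens[i])                          # <expr> -> <id>
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            trees += [f"({t})" for t in expr(i + 1, j - 1)]  # <expr> -> ( <expr> )
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):                      # <expr> -> <expr> op <expr>
                for left in expr(i, k):
                    for right in expr(k + 1, j):
                        trees.append(f"[{left} {tokens[k]} {right}]")
        return tuple(trees)

    return list(expr(0, len(tokens)))
```

Calling parse_trees("B + C * A".split()) returns two trees, one grouping C * A first and one grouping B + C first, so the right-hand side of A := B + C * A has two distinct parse trees under this grammar.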

Problems
• Show a leftmost derivation for the sentence A := B + C * A using the grammar on the previous slide
• Show that the grammar is ambiguous by drawing two parse trees for the sentence

Another grammar for simple assignment statements
• Rule 1: <assign> → <id> := <expr>
• Rule 2: <id> → A | B | C
• Rule 3: <expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>

Problems
• Show a derivation for the sentence A := B + C * A
• Show the parse tree for A := B * C + A
• Is the grammar ambiguous?
• Which operator has higher precedence, + or *?

Precedence
• Given a statement with multiple operators, the precedence rules indicate the order in which the operators are evaluated
• The precedence of the operators in a statement can be determined by drawing the parse tree
• Operators lower in the tree have higher precedence, because they are “evaluated” earlier
• Note: operators may have equal precedence, in which case the evaluation order is determined by associativity

Another grammar for simple assignment statements
• Rule 1: <assign> → <id> := <expr>
• Rule 2: <id> → A | B | C | D
• Rule 3: <expr> → <expr> + <term> | <term>
• Rule 4: <term> → <term> * <factor> | <factor>
• Rule 5: <factor> → ( <expr> ) | <id>
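
Rules 3 to 5 can be turned into a small hand-written parser to see how the grammar groups operands. This is a sketch under one stated assumption: a recursive-descent parser cannot call a left-recursive rule directly, so the left recursion in Rules 3 and 4 is written as a loop that keeps the same left-to-right grouping. The function name parse_expr and the bracketed output are choices made here.

```python
def parse_expr(tokens):
    """Parse <expr> per Rules 3-5; return a bracketed string showing the grouping."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():                      # Rule 5: <factor> -> ( <expr> ) | <id>
        tok = advance()
        if tok == "(":
            inner = expr()
            if advance() != ")":
                raise SyntaxError("expected )")
            return inner
        if tok in ("A", "B", "C", "D"):
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():                        # Rule 4: <term> -> <term> * <factor> | <factor>
        left = factor()
        while peek() == "*":           # loop stands in for the left recursion
            advance()
            left = f"[{left} * {factor()}]"
        return left

    def expr():                        # Rule 3: <expr> -> <expr> + <term> | <term>
        left = term()
        while peek() == "+":
            advance()
            left = f"[{left} + {term()}]"
        return left

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

Because * is only ever consumed inside term(), every product is grouped before the enclosing sum is built, whichever operator appears first in the input.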

Problems
• Show the derivation for A := B + C * D
• Show the parse tree
• Is the grammar ambiguous?
• Which has higher precedence, + or *?

Designing an unambiguous grammar with desired precedence
• Each operator should have its own abstraction
  – Last example:
    • Abstraction for + was <expr>
    • Abstraction for * was <term>
• Operators with lower precedence should be derived first
  – Last example:
    • <expr> → <expr> + <term>
    • <term> → <term> * <factor>
    • + is derived before *

Associativity
• Given operators with equal precedence, the associativity determines whether the operators are evaluated left to right or right to left
  – Left-associative operator – evaluated left to right
  – Right-associative operator – evaluated right to left

Another grammar for simple assignment statements
• Rule 1: <assign> → <id> := <expr>
• Rule 2: <expr> → <id> - <expr> | <id>
• Rule 3: <id> → A | B | C

Problems
• Draw a parse tree for A := A - B - C
• Is the grammar ambiguous?
• What is the associativity of the subtract operator?

Yet another grammar (YAG)
• Rule 1: <assign> → <id> := <expr>
• Rule 2: <expr> → <expr> - <id> | <id>
• Rule 3: <id> → A | B | C

Problems
• Draw a parse tree for A := A - B - C
• Is the grammar ambiguous?
• What is the associativity of the subtract operator?

Designing a grammar with desired associativity
• Left-recursive rule – yields a left-associative operator
• Right-recursive rule – yields a right-associative operator
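
The effect of the recursion direction shows up as soon as the operands are given values. The two functions below are a sketch that mirrors the shape of each subtraction rule, working on a plain list of numbers instead of a token stream; the function names and the sample values are assumptions made for illustration.

```python
def eval_right_recursive(nums):
    """Mirror <expr> -> <id> - <expr> | <id>: the tail is grouped first."""
    if len(nums) == 1:
        return nums[0]
    return nums[0] - eval_right_recursive(nums[1:])

def eval_left_recursive(nums):
    """Mirror <expr> -> <expr> - <id> | <id>: the head is grouped first."""
    if len(nums) == 1:
        return nums[0]
    return eval_left_recursive(nums[:-1]) - nums[-1]

# Sample values standing in for A, B, C.
a, b, c = 10, 4, 3
print(eval_right_recursive([a, b, c]))  # computes a - (b - c)
print(eval_left_recursive([a, b, c]))   # computes (a - b) - c
```

With these values the right-recursive shape gives 10 - (4 - 3) = 9 and the left-recursive shape gives (10 - 4) - 3 = 3; only the second matches the usual meaning of subtraction.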

Extended BNF
• For wimps: simplifies some of the BNF rules
• “Metasymbols” are underlined on the original slide; here they are [ ], { }, and the ( | ) grouping in item 3
• 1. An optional part (0 or 1 time)
  – EBNF: <ifstm> → if ( <exp> ) <stm> [ else <stm> ]
  – BNF: <ifstm> → if ( <exp> ) <stm> | if ( <exp> ) <stm> else <stm>
• 2. Optional repeat (0 or more times)
  – EBNF: <idlist> → <id> { , <id> }
  – BNF: <idlist> → <id> | <id> , <idlist>
• 3. Multiple choice (1 from a set; radio button)
  – EBNF: <expr> → <expr> ( + | - ) <term>
  – BNF: <expr> → <expr> + <term>
  – BNF: <expr> → <expr> - <term>
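
The EBNF repetition braces map naturally onto a loop in a hand-written recognizer. The sketch below handles <idlist> → <id> { , <id> } over a list of tokens; the function name parse_idlist is hypothetical, and treating any Python-style identifier as an <id> is an assumption made here.

```python
def parse_idlist(tokens):
    """Recognize <idlist> -> <id> { , <id> } and return the identifiers found."""
    def is_id(tok):
        # Assumption: any Python-style identifier counts as an <id>.
        return tok.isidentifier()

    if not tokens or not is_id(tokens[0]):
        raise SyntaxError("expected identifier")
    ids = [tokens[0]]
    pos = 1
    # The { , <id> } part: repeat zero or more times.
    while pos < len(tokens) and tokens[pos] == ",":
        if pos + 1 >= len(tokens) or not is_id(tokens[pos + 1]):
            raise SyntaxError("expected identifier after ','")
        ids.append(tokens[pos + 1])
        pos += 2
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return ids
```

A single identifier is accepted because the loop body may run zero times, which is the base case that the plain BNF form has to spell out as a separate alternative.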