Chapter 2 Syntax
Syntax
- The syntax of a programming language specifies the structure of the language
- The lexical structure specifies how words can be constituted from characters
- The syntactic structure specifies how sentences can be constituted from words
Lexical Structure
- The tokens of a programming language are the basic grammatical categories that serve as the building blocks of syntax
- A program is viewed as a stream of tokens
Standard Token Categories
- Keywords, such as if and while
- Literals or constants, such as 42 (a numeric literal) or "hello" (a string literal)
- Special symbols, such as ";", "<=", or "+"
- Identifiers, such as x24, putchar, or monthly_balance
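The four categories above can be sketched as a small classifier. This is only an illustration: the names `token_category`, `classify`, and the short keyword table are assumptions made for the example, and the function decides by the first character alone, so it expects a lexeme that is already well formed.

```c
#include <ctype.h>
#include <string.h>

enum token_category { TOK_KEYWORD, TOK_LITERAL, TOK_SYMBOL, TOK_IDENTIFIER };

/* Hypothetical keyword table; a real language fixes a longer list. */
static const char *const keywords[] = { "if", "else", "while", "for", "int" };

/* Classify one well-formed lexeme by looking at the keyword table
   and then at its first character only. */
enum token_category classify(const char *lexeme) {
    size_t i;
    unsigned char first = (unsigned char)lexeme[0];

    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i]) == 0)
            return TOK_KEYWORD;
    if (isdigit(first) || first == '"')
        return TOK_LITERAL;          /* numeric or string literal */
    if (isalpha(first) || first == '_')
        return TOK_IDENTIFIER;
    return TOK_SYMBOL;               /* everything else: ; <= + ... */
}
```

For instance, `classify("while")` yields `TOK_KEYWORD` and `classify("x24")` yields `TOK_IDENTIFIER`. Keywords are checked first because they would otherwise look like identifiers.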
White Spaces and Comments
- White spaces and comments are ignored except when they function as delimiters
- Typical white spaces: newlines, tabs, spaces
- Comments:
  - /* … */ and // … (C, C++, Java)
  - -- … (Ada, Haskell)
  - (* … *) (Pascal, ML)
  - ; … (Scheme)
C tokens
There are six classes of tokens: identifiers, keywords, constants, string literals, operators, and other separators. Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments as described below (collectively, "white space") are ignored except as they separate tokens. Some white space is required to separate otherwise adjacent identifiers, keywords, and constants. If the input stream has been separated into tokens up to a given character, the next token is the longest string of characters that could constitute a token.
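The longest-token rule in the last sentence can be observed directly in C. The sketch below uses a hypothetical function name, `longest_match_demo`. Because the scanner always takes the longest possible token, `x+++y` is split into `x`, `++`, `+`, `y` and never into `x`, `+`, `++`, `y`.

```c
/* Demonstrates the longest-token rule: x+++y is scanned as x ++ + y,
   that is, (x++) + y. The results are returned through the pointers. */
void longest_match_demo(int *sum, int *x_after) {
    int x = 1, y = 2;
    int s = x+++y;      /* tokens: x, ++, +, y */
    *sum = s;           /* 1 + 2; x is incremented afterwards */
    *x_after = x;       /* x was post-incremented to 2 */
}
```

Had the scanner chosen `x + (++y)` instead, the sum would differ and `x` would stay unchanged, so the two stored values show which tokenization was used.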
An Example

    /* This program counts from 1 to 10. */
    #include <stdio.h>

    int main(void) {
        int i;
        for (i = 1; i <= 10; i++) {
            printf("%d\n", i);   /* print each number on its own line */
        }
        return 0;
    }
Backus-Naur Form (BNF)
- BNF is a notation widely used in the formal definition of syntactic structure
- A BNF grammar consists of a set of rewriting rules ρ, a set of terminal symbols Σ, a set of nonterminal symbols N, and a "start symbol" S ∈ N
- Each rule in ρ has the form A → ω, where A ∈ N and ω ∈ (N ∪ Σ)*
Backus-Naur Form
- The terminals in Σ form the basic alphabet (tokens) from which programs are constructed
- The nonterminals in N identify grammatical categories like Identifier, Integer, Expression, Statement, Function, Program
- The start symbol S identifies the principal grammatical category being defined by the grammar
Examples
1. binaryDigit → 0
   binaryDigit → 1
   or, in one rule: binaryDigit → 0 | 1        ("|" is the metasymbol "or")
2. Integer → Digit | Integer Digit             (juxtaposition is the metasymbol "concatenate")
   Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Derivation
Integer ⇒ Integer Digit
        ⇒ Integer Digit Digit      (sentential form)
        ⇒ Digit Digit Digit
        ⇒ 3 Digit Digit
        ⇒ 3 5 Digit
        ⇒ 3 5 2                    (sentence)
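As a rough sketch, the rule Integer → Digit | Integer Digit describes exactly the strings made of one or more digits, so a recognizer can replace the left recursion with a loop. The function name `is_integer` is an assumption made for this example.

```c
#include <ctype.h>

/* Returns 1 if s can be derived from Integer, 0 otherwise.
   Integer -> Digit | Integer Digit  means "one or more digits". */
int is_integer(const char *s) {
    if (!isdigit((unsigned char)*s))
        return 0;                        /* at least one Digit is required */
    while (isdigit((unsigned char)*s))
        s++;                             /* each extra Digit is one more
                                            use of Integer -> Integer Digit */
    return *s == '\0';                   /* nothing may follow the digits */
}
```

With this definition `is_integer("352")` returns 1, while the empty string and `"3a"` are rejected.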
Parse Tree
[Figure: parse tree, with a sentential form marked on it]
Example: Expression
Assignment → Identifier = Expression
Expression → Term | Expression + Term | Expression - Term
Term → Factor | Term * Factor | Term / Factor
Factor → Identifier | Literal | ( Expression )
Example: Expression
[Figure: parse tree for x + 2 * y]
Syntax for a Subset of C
Program → void main ( ) { Declarations Statements }
Declarations → ε | Declarations Declaration
Declaration → Type Identifiers ;
Type → int | boolean
Identifiers → Identifier | Identifiers , Identifier
Statements → ε | Statements Statement
Statement → ; | Block | Assignment | IfStatement | WhileStatement
Block → { Statements }
Assignment → Identifier = Expression ;
IfStatement → if ( Expression ) Statement | if ( Expression ) Statement else Statement
WhileStatement → while ( Expression ) Statement
Syntax for a Subset of C
Expression → Conjunction | Expression || Conjunction
Conjunction → Relation | Conjunction && Relation
Relation → Addition | Relation <= Addition | Relation >= Addition | Relation == Addition | Relation != Addition
Addition → Term | Addition + Term | Addition - Term
Term → Negation | Term * Negation | Term / Negation
Negation → Factor | ! Factor
Factor → Identifier | Literal | ( Expression )
Example: Program
[Figure: parse tree for the program below]

    void main ( ) { int x; x = 1; }
Ambiguity
- A grammar is ambiguous if it permits a string to be parsed into two or more different parse trees
AmbExp → Integer | AmbExp - AmbExp
Example string: 2 - 3 - 4
An Example
[Figure: the two parse trees for 2 - 3 - 4]
2 - (3 - 4)        (2 - 3) - 4
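The two groupings are not just different trees; they give different values. A minimal sketch, with function names chosen only for this illustration:

```c
/* The two parse trees for a - b - c, written out with parentheses. */
int left_grouping(int a, int b, int c)  { return (a - b) - c; }
int right_grouping(int a, int b, int c) { return a - (b - c); }

/* What C itself does with the unparenthesized form:
   binary minus is left-associative. */
int c_grouping(int a, int b, int c)     { return a - b - c; }
```

For 2 - 3 - 4 the left grouping gives -5 and the right grouping gives 3. C's own grammar is unambiguous here and agrees with the left grouping.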
The Dangling Else Problem

    if ( x < 0 )
    if ( y < 0 ) y = y - 1;
    else y = 0;

The grammar lets the else attach to either if, so this statement has two parse trees.
The Dangling Else Problem
- Solution I: use a special closing keyword, such as fi, to explicitly close every if statement (Ada, for example, closes each if with end if)
  IfStatement → if ( E ) S fi | if ( E ) S else S fi
- Solution II: use an explicit rule outside the BNF syntax. For example, in C, every else clause is associated with the closest preceding if in the statement
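Solution II can be checked with a short experiment. The function name `dangling_else` is hypothetical; the body is the statement from the slides, indented to match the way C actually parses it. Some compilers warn about this layout and suggest braces.

```c
/* Returns the final value of y. The else belongs to the inner if,
   so when x >= 0 nothing happens to y at all. */
int dangling_else(int x, int y) {
    if (x < 0)
        if (y < 0)
            y = y - 1;
        else
            y = 0;
    return y;
}
```

The telling case is `dangling_else(1, 5)`: it returns 5. If the else were paired with the outer if, the result would be 0 instead.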
Extended BNF (EBNF)
- EBNF introduces three kinds of brackets:
  - { } denotes repetition, which simplifies the specification of recursion
  - [ ] denotes an optional part
  - ( ) is used for grouping
An Example
BNF:
Expression → Term | Expression + Term | Expression - Term
Term → Factor | Term * Factor | Term / Factor
Factor → + number | - number | number
EBNF:
Expression → Term { ( + | - ) Term }        ( ) grouping; { } zero or more occurrences
Term → Factor { ( * | / ) Factor }
Factor → [ + | - ] number                   [ ] optional
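EBNF brackets map naturally onto control flow: an optional part becomes an if, a repetition becomes a while. The sketch below handles only the Factor rule and assumes number → digit { digit }, which the slide does not spell out at this point; the name `parse_factor` is hypothetical, and overflow is not checked.

```c
#include <ctype.h>

/* Factor -> [ + | - ] number      number -> digit { digit }  (assumed)
   Returns 1 and stores the value if s is a Factor, otherwise returns 0. */
int parse_factor(const char *s, int *value) {
    int sign = 1, n = 0;

    if (*s == '+' || *s == '-') {            /* [ + | - ] : optional, so an if */
        if (*s == '-')
            sign = -1;
        s++;
    }
    if (!isdigit((unsigned char)*s))
        return 0;                            /* number needs a first digit */
    while (isdigit((unsigned char)*s)) {     /* { digit } : repetition, so a while */
        n = n * 10 + (*s - '0');
        s++;
    }
    if (*s != '\0')
        return 0;                            /* trailing characters: not a Factor */
    *value = sign * n;
    return 1;
}
```

Calling `parse_factor("-42", &v)` succeeds and stores -42 in `v`, whereas a bare sign such as `"-"` is rejected.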
Abstract Syntax
- The abstract syntax of a language identifies the essential syntactic elements in a program without describing how they are concretely constructed
Pascal: while i < n do begin i := i + 1 end
C:      while (i < n) { i = i + 1; }
Example: Loop
- Thinking of a loop abstractly, the essential elements are a test expression for continuing the loop and a body, which is the statement to be repeated
- All other elements constitute nonessential "syntactic sugar"
- The complete syntax is usually called concrete syntax
Example: Loop
Pascal: while i < n do begin i := i + 1 end
C:      while (i < n) { i = i + 1; }
[Figure: the abstract syntax tree shared by both loops: loop( <(i, n), =(i, +(i, 1)) )]
Example: Expression
x + 2 * y
[Figure: abstract syntax tree +( x, *( 2, y ) )]
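One possible way to hold such a tree in C is a tagged node structure with a recursive evaluator. Everything here is an assumption made for illustration: the type `struct node`, the evaluator `eval`, and the restriction to the variables x and y and the operators + and *.

```c
#include <stddef.h>

enum node_kind { NODE_LITERAL, NODE_VARIABLE, NODE_BINARY };

struct node {
    enum node_kind kind;
    int value;                   /* used by NODE_LITERAL */
    char name;                   /* used by NODE_VARIABLE: 'x' or 'y' */
    char op;                     /* used by NODE_BINARY: '+' or '*' */
    const struct node *left;
    const struct node *right;
};

/* The tree +( x, *( 2, y ) ) for the expression x + 2 * y. */
static const struct node node_x   = { NODE_VARIABLE, 0, 'x', 0, NULL, NULL };
static const struct node node_2   = { NODE_LITERAL, 2, 0, 0, NULL, NULL };
static const struct node node_y   = { NODE_VARIABLE, 0, 'y', 0, NULL, NULL };
static const struct node node_mul = { NODE_BINARY, 0, 0, '*', &node_2, &node_y };
const struct node x_plus_2y       = { NODE_BINARY, 0, 0, '+', &node_x, &node_mul };

/* Evaluate a tree for given values of x and y. */
int eval(const struct node *n, int x, int y) {
    int l, r;
    switch (n->kind) {
    case NODE_LITERAL:
        return n->value;
    case NODE_VARIABLE:
        return n->name == 'x' ? x : y;
    case NODE_BINARY:
        l = eval(n->left, x, y);
        r = eval(n->right, x, y);
        return n->op == '+' ? l + r : l * r;
    }
    return 0;   /* not reached for well-formed trees */
}
```

The tree carries no parentheses or precedence rules; the fact that * binds tighter than + is recorded purely by * sitting below + in the structure.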
Parser
- A parser of a language accepts or rejects strings based on whether they are legal strings in the language
- In a recursive-descent parser, each nonterminal is implemented as a function, and each terminal is handled by matching it against the current token
Example: Calculator
command → expr '\n'
expr → term { '+' term }
term → factor { '*' factor }
factor → number | '(' expr ')'
number → digit { digit }
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
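Example: Calculator

    #include <ctype.h>
    #include <stdlib.h>
    #include <stdio.h>

    int token;      /* the current input character */
    int pos = 0;    /* number of characters read so far */

    /* one function per nonterminal */
    void command(void);
    void expr(void);
    void term(void);
    void factor(void);
    void number(void);
    void digit(void);

    /* helpers, declared here because they are used before they are defined */
    void parse(void);
    void getToken(void);
    void match(char c);
    void error(void);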
Example: Calculator

    int main(void) {
        parse();
        return 0;
    }

    void parse(void) {
        getToken();     /* load the first token */
        command();      /* start symbol */
    }

    /* read the next character, skipping blanks */
    void getToken(void) {
        token = getchar();
        pos++;
        while (token == ' ') {
            token = getchar();
            pos++;
        }
    }
Example: Calculator
command → expr '\n'

    void command(void) {
        expr();
        match('\n');
    }

    /* consume the current token if it is c, otherwise report an error */
    void match(char c) {
        if (token == c) getToken();
        else error();
    }
Example: Calculator
expr → term { '+' term }
term → factor { '*' factor }

    void expr(void) {
        term();
        while (token == '+') {
            match('+');
            term();
        }
    }

    void term(void) {
        factor();
        while (token == '*') {
            match('*');
            factor();   /* the rule repeats factor, not term */
        }
    }
Example: Calculator
factor → number | '(' expr ')'

    void factor(void) {
        if (token == '(') {
            match('(');
            expr();
            match(')');
        } else {
            number();
        }
    }

number → digit { digit }

    void number(void) {
        digit();
        while (isdigit(token)) digit();
    }
Example: Calculator

    void digit(void) {
        if (isdigit(token)) match(token);
        else error();
    }

    /* report where parsing failed and stop */
    void error(void) {
        printf("parse error: position %d: character %c\n", pos, token);
        exit(1);
    }
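The slide code reads standard input and exits on the first error, which makes it awkward to try on several inputs. The following is a hedged variant, not the slides' program: it reads from a string, records failure in a flag instead of calling exit, and treats the end of the string as the end of the command in place of '\n'. The names `accepts`, `input`, `ok`, and `get_token` are assumptions made for this sketch.

```c
#include <ctype.h>

static const char *input;   /* rest of the string being parsed */
static int token;           /* current character, 0 at end of input */
static int ok;              /* cleared on the first syntax error */

/* Advance to the next non-blank character. As in the slide code, blanks
   are skipped everywhere, so "1 2" is read as the single number 12. */
static void get_token(void) {
    do {
        token = (unsigned char)*input;
        if (*input != '\0')
            input++;
    } while (token == ' ');
}

static void match(int c) {
    if (token == c) get_token();
    else ok = 0;
}

static void expr(void);

/* number -> digit { digit } */
static void number(void) {
    if (!isdigit(token)) { ok = 0; return; }
    while (isdigit(token)) get_token();
}

/* factor -> number | '(' expr ')' */
static void factor(void) {
    if (token == '(') { match('('); expr(); match(')'); }
    else number();
}

/* term -> factor { '*' factor } */
static void term(void) {
    factor();
    while (ok && token == '*') { match('*'); factor(); }
}

/* expr -> term { '+' term } */
static void expr(void) {
    term();
    while (ok && token == '+') { match('+'); term(); }
}

/* Returns 1 if the whole string is a legal expression, 0 otherwise. */
int accepts(const char *s) {
    input = s;
    ok = 1;
    get_token();
    expr();
    return ok && token == '\0';
}
```

With this variant, `accepts("2+3*(4+5)")` returns 1, while `accepts("2+")` and `accepts("(2+3")` return 0. Using a flag rather than exit keeps the functions reusable across calls, at the cost of checking `ok` in each loop.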