ICS 804: Theory of Computation - Ibrahim Otieno, iotieno@uonbi.ac.ke, +254-0722-429297, SCI/ICT Building Rm. G15
Course Outline: Mathematical Preliminaries; Turing Machines (including Additional Varieties of Turing Machines); Recursion Theory; Markov Algorithms; Register Machines; Regular Languages and finite-state automata; Aspects of Computability
Last week: Register Machines; Register machines and formal languages; Model-independent characterization of computational feasibility
Regular Languages and finite-state automata: Regular Expressions and regular languages; Deterministic FSA; Non-deterministic FSA; Finite-state automata with epsilon moves; Generative grammars; Context-free, Context-sensitive languages; Chomsky Hierarchy
Regular Expressions and Regular Languages
Characterizing formal languages
• Plain words: the language of all and only those words over Σ = {a, b} of length 2 (aa, ab, ba, bb)
• Set abstraction: {w | w ∈ Σ* and |w| = 2}
• New way: regular expressions denote languages
Regular Expressions Denote Languages
• a* denotes language {a^n | n ≥ 0}
• (a).(b) or just ab denotes unit language {ab}
• a*b* denotes {a^n b^m | n, m ≥ 0}
• a^2 b^3 denotes {aabbb}
• a^n is not a regular expression
• a+bb denotes {a^n bb | n ≥ 1}
• a?bb denotes {a^n bb | 0 ≤ n ≤ 1}
• a|b denotes {a, b}
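The listed expressions can be tried out with Python's `re` module, whose syntax happens to agree with the slide notation for these particular patterns. This is only a sketch: the sample words and the helper name `in_language` are illustrative assumptions, and `re.fullmatch` is used so that the whole word, not just a prefix, must belong to the language.

```python
import re

# Patterns from the slide, written in Python's `re` syntax.
# The sample words are illustrative members of each denoted language.
EXAMPLES = {
    "a*": ["", "a", "aaa"],
    "ab": ["ab"],
    "a*b*": ["", "aab", "bbb"],
    "a+bb": ["abb", "aaabb"],
    "a?bb": ["bb", "abb"],
    "a|b": ["a", "b"],
}


def in_language(pattern: str, word: str) -> bool:
    """True if the whole word belongs to the language denoted by pattern."""
    return re.fullmatch(pattern, word) is not None


for pattern, words in EXAMPLES.items():
    for word in words:
        assert in_language(pattern, word)

# Words that lie outside the denoted languages.
assert not in_language("a*b*", "ba")    # an a may not follow a b
assert not in_language("a+bb", "bb")    # + requires at least one a
assert not in_language("a?bb", "aabb")  # ? allows at most one a
```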
Exercise
• What does ((a.b)*)|((b.a)*) mean? Give a few examples
• Answer: the language containing all and only even-length words consisting of alternating a's and b's: {ε, ab, ba, abab, baba, …}
Kleene-Closure Operator (recap)
• Symbol *: a unary operation on languages
• Given language L, L* =def {w | for some n ≥ 0, w is the concatenation of n words of L}
• L* is the result of concatenating 0 or more words of L
Language forming operations (recap)
• Binary concatenation operation: .
• L1.L2 =def {w1w2 | w1 ∈ L1 & w2 ∈ L2}
• The language that results from taking a word from L1 and appending to it a word from L2
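Both operations can be sketched on finite languages represented as Python sets of strings. Since L* is infinite whenever L contains a non-empty word, the closure below is only a finite approximation that concatenates at most `max_n` words; the function names and the `max_n` parameter are assumptions made for this illustration.

```python
from itertools import product


def concat(l1: set[str], l2: set[str]) -> set[str]:
    """L1.L2: a word of L1 followed by a word of L2, in all combinations."""
    return {w1 + w2 for w1, w2 in product(l1, l2)}


def kleene_star(lang: set[str], max_n: int) -> set[str]:
    """Finite approximation of L*: concatenations of at most max_n words of L."""
    result = {""}   # n = 0 contributes the empty word
    layer = {""}    # concatenations of exactly k words, starting with k = 0
    for _ in range(max_n):
        layer = concat(layer, lang)
        result |= layer
    return result


print(sorted(concat({"a", "ab"}, {"b", ""})))
print(sorted(kleene_star({"ab"}, 3)))
```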
Definition - Regular Expression
(i) ∅ is a regular expression (over Σ) and denotes language ∅
(ii) ε is a regular expression (over Σ) and denotes language {ε}
(iii) If s is in Σ* then s itself is a regular expression and denotes language {s}
(iv) Suppose r and s are regular expressions that denote languages Lr and Ls, then
  (a) (r|s) is a regular expression that denotes Lr ∪ Ls
  (b) (r.s) is a regular expression that denotes Lr.Ls
  (c) (r*) is a regular expression that denotes (Lr)*
(v) No expression is a regular expression unless it is obtainable from (i) – (iv)
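The inductive definition translates fairly directly into a recursive evaluator. The sketch below assumes a tuple encoding of expressions chosen for this illustration (`("empty",)`, `("eps",)`, `("word", s)`, `("alt", r, s)`, `("cat", r, s)`, `("star", r)`) and returns only the words of length at most `max_len`, because the denoted language may be infinite.

```python
def denotes(expr, max_len):
    """Words of length <= max_len in the language denoted by expr."""
    kind = expr[0]
    if kind == "empty":   # clause (i): the empty language
        return set()
    if kind == "eps":     # clause (ii): the language {ε}
        return {""}
    if kind == "word":    # clause (iii): the unit language {s}
        return {expr[1]} if len(expr[1]) <= max_len else set()
    if kind == "alt":     # clause (iv)(a): union
        return denotes(expr[1], max_len) | denotes(expr[2], max_len)
    if kind == "cat":     # clause (iv)(b): concatenation
        left = denotes(expr[1], max_len)
        right = denotes(expr[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if kind == "star":    # clause (iv)(c): Kleene closure
        base = denotes(expr[1], max_len)
        result, frontier = {""}, {""}
        while frontier:   # keep appending words of the base language
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression kind: {kind}")


# ((a.b)*)|((b.a)*) in the assumed tuple encoding
expr = ("alt", ("star", ("word", "ab")), ("star", ("word", "ba")))
print(sorted(denotes(expr, 4)))
```

Bounding the word length keeps the star case finite; the loop stops as soon as no new word within the bound is produced.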
Definition
• What about (r+)? (r+) = ((r*).r)
• What about (r?)? (r?) = (ε|r)
Notation
• Usually forget about parentheses: (ab) = ab
• (a|b|c): 3-word language {a, b, c}
• Precedence: parentheses > superscript > concatenation > alternation
• ab* = a(b*) ≠ (ab)*
• a|ba = (a|(b.a)) ≠ ((a|b).a)
Regular Languages
• Let L be a language over alphabet Σ, i.e., L ⊆ Σ*. Then L is said to be a regular language if L is denoted by some regular expression over Σ
• Let Σ be a finite alphabet and L1 and L2 regular languages over Σ. Then L1 ∪ L2, L1.L2, and L1* are also regular languages
Remarks
• If Σ is a finite alphabet and w is any word over Σ (i.e., w ∈ Σ*), then unit language {w} is regular
• If Σ is a finite alphabet, then any finite language over Σ is regular
Deterministic Finite State Automata
Finite State Automata
• New model of computation: analysis of the kind of computation that requires a fixed (finite) amount of memory for arbitrary input
• Also called finite-state machines
Deterministic Finite-State Automata
• Σ = {a, b}
• Vertices and arcs
• Labels of arcs are members of Σ
• No tapes, but input
• Input: (possibly empty) word over Σ, e.g. abb
Deterministic Finite-State Automata
• Accepting configuration: FSA halts in state q1
• The FSA accepts word abb
• It does not accept e.g. aba
• q2: trap state
• L = {ab^n | n ≥ 0}
Determinism
• For each state/symbol pair, FSA M has at most one instruction (determinism) and at least one instruction (this makes M fully defined), hence exactly one
• Determinism means that, within any state diagram for an FSA, the path labeled by a given word w is unique: for each word w ∈ Σ*, there is exactly one path starting at q0 and labeled by w
Exercise
• Which regular language is accepted by this FSA?
• What are the accepting states?
• Is ε an accepted word?
• What is the trap state?
• Is the trap state a sink?
• Is the language finite?
Exercise (answers)
• Which regular language is accepted by this FSA? a(a|b)a?
• What are the accepting states? q2 and q3
• Is ε an accepted word? no
• What is the trap state? q4
• Is the language finite? yes
Exercise
• Which regular language is accepted by this FSA?
• What are the accepting states?
• Is ε an accepted word?
• What is the trap state?
• Is the language finite?
Exercise (answers)
• Which regular language is accepted by this FSA? (aba)*
• What are the accepting states? q0
• Is ε an accepted word? yes
• What is the trap state? q3
• Is the language finite? no
Alternate description
• δ_M(q0, a) = q1
• δ_M(q0, b) = q2
• δ_M(q1, b) = q1
• δ_M(q1, a) = q2
• δ_M(q2, a) = q2
• δ_M(q2, b) = q2
Formal Definition
A deterministic FSA is a quintuple ⟨Σ, Q, q_init, F, δ_M⟩
◦ Σ is the input alphabet
◦ Q is a finite, nonempty set of states
◦ q_init ∈ Q is the initial state or start state
◦ F ⊆ Q is a (possibly empty) set of accepting or terminal states
◦ δ_M: Q × Σ → Q is the transition function (total and single valued)
Word Acceptance: A deterministic finite-state automaton M accepts word w ∈ Σ* if the unique path starting at q_init and labeled by w leads to some member of F
Language Acceptance
• The language accepted by M is the set of all and only those words over Σ that are accepted by M
• We write L(M) for the language accepted by M
• FSAs are language acceptors only
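Word acceptance for a deterministic FSA amounts to following the single path labeled by the input. The sketch below encodes the example machine for {ab^n | n ≥ 0} as a Python dictionary. The entry for (q2, a) is an assumption added so that the transition function is total, and the names `DELTA`, `Q_INIT`, `ACCEPTING` and `dfa_accepts` are illustrative only.

```python
# Transition function of the example machine over Σ = {a, b}.
# The (q2, a) entry is assumed here so that the function is total;
# q2 is the trap state.
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q2",
    ("q1", "a"): "q2", ("q1", "b"): "q1",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
}
Q_INIT = "q0"
ACCEPTING = {"q1"}


def dfa_accepts(word: str) -> bool:
    """Follow the unique path labeled by word and test the final state."""
    state = Q_INIT
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING


for w in ["abb", "aba", "a", ""]:
    print(repr(w), dfa_accepts(w))
```

Because the function is total and single valued, the loop never has to choose between alternatives and never gets stuck.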
Nondeterministic Finite State Automata
Non-determinism
• Cf. Turing Machines
• Existence of alternative instructions for a given state/symbol pair
A Nondeterministic Machine (state diagrams not reproduced)
• L = (ab)* ∪ {a} = (ab)*|a
• L = (ab)*
Non-determinism
• Nondeterministic FSA are usually easier to design but run the risk of accepting unintended words
• δ_M: Q × Σ → Q is a transition mapping, assumed to be total but permitted to be multi-valued
• Cf. difference between function and mapping!
Formal Definition
A nondeterministic FSA is a quintuple ⟨Σ, Q, q_init, F, δ_M⟩
◦ Σ is the input alphabet
◦ Q is a finite, nonempty set of states
◦ q_init ∈ Q is the initial state or start state
◦ F ⊆ Q is a (possibly empty) set of accepting or terminal states
◦ δ_M: Q × Σ → Q is the transition mapping (total and possibly multi-valued)
Word Acceptance
• Word w ∈ Σ* is accepted by FSA M provided there exists some path, labeled by w, in the state diagram of M leading from q_init to a terminal state
• Cf. deterministic definition of word acceptance: unique path
Language Acceptance: The language accepted by a nondeterministic FSA M is the set of words accepted by M.
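The "some path exists" condition can be checked by tracking the set of all states reachable on the input read so far. The machine below is a hypothetical nondeterministic FSA for (ab)*|a, designed for this sketch rather than taken from the slides; its state names are assumptions, and a missing dictionary entry stands for an empty set of successors.

```python
# Hypothetical nondeterministic FSA for (ab)*|a.
# A missing (state, symbol) entry means there is no successor.
NFA_DELTA = {
    ("s", "a"): {"p", "f"},   # two alternatives for the same state/symbol pair
    ("p", "b"): {"r"},
    ("r", "a"): {"p"},
}
NFA_INIT = "s"
NFA_ACCEPTING = {"s", "f", "r"}


def nfa_accepts(word: str) -> bool:
    """True if some path labeled by word leads to an accepting state."""
    current = {NFA_INIT}
    for symbol in word:
        current = {nxt for state in current
                   for nxt in NFA_DELTA.get((state, symbol), set())}
    return bool(current & NFA_ACCEPTING)


for w in ["", "a", "ab", "abab", "aba", "aa"]:
    print(repr(w), nfa_accepts(w))
```

A more obvious design that loops back to the start state after reading ab would also accept aba, which illustrates the risk of accepting unintended words.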
Nondeterminism
• Nondeterministic FSA are easier to design
• For every nondeterministic FSA, there exists an equivalent deterministic FSA
• We can automatically convert the nondeterministic FSA to an equivalent deterministic FSA through subset construction
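The subset construction can be sketched as a breadth-first exploration in which each state of the deterministic machine is a set of states of the nondeterministic one. The input machine below is a hypothetical nondeterministic FSA for (ab)*|a without ε-moves; all names are assumptions made for this illustration.

```python
from collections import deque


def subset_construction(nfa_delta, q_init, accepting, alphabet):
    """Build (dfa_delta, dfa_init, dfa_accepting) from an NFA without ε-moves."""
    start = frozenset({q_init})
    dfa_delta = {}
    seen = {start}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        for symbol in alphabet:
            # Union of the successors of every NFA state in the subset.
            target = frozenset(
                nxt for state in subset
                for nxt in nfa_delta.get((state, symbol), ())
            )
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset accepts if it contains at least one accepting NFA state.
    dfa_accepting = {s for s in seen if s & set(accepting)}
    return dfa_delta, start, dfa_accepting


def run_dfa(dfa, word):
    """Run the constructed deterministic machine on word."""
    delta, state, accepting = dfa
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


# Hypothetical NFA for (ab)*|a
nfa_delta = {("s", "a"): {"p", "f"}, ("p", "b"): {"r"}, ("r", "a"): {"p"}}
dfa = subset_construction(nfa_delta, "s", {"s", "f", "r"}, "ab")
print(run_dfa(dfa, "abab"), run_dfa(dfa, "aba"))
```

In the result the empty subset plays the role of the trap state, and only subsets reachable from the start subset are generated.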
Finite-state automata with epsilon moves
Epsilon moves
• Executing arcs labeled ε does not advance the input
• ε-arcs may or may not introduce nondeterminism
Example (state diagrams not reproduced): a*b*c*
Equivalence Result: Let M be an FSA with ε-moves. Then there exists an FSA M´ with no ε-moves such that L(M) = L(M´)
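One way to see why ε-moves add no power is to simulate them through ε-closures: after every input symbol, add all states reachable through ε-arcs alone. The machine below is a hypothetical FSA with ε-moves for a*b*c*, assumed for this sketch; the state numbering and the use of the empty string as the ε label are illustrative choices.

```python
EPS = ""   # label used here for ε-arcs (an encoding choice for this sketch)

# Hypothetical FSA with ε-moves for a*b*c*: state 0 loops on a, state 1
# on b, state 2 on c, and ε-arcs lead from 0 to 1 and from 1 to 2.
EPS_DELTA = {
    (0, "a"): {0}, (0, EPS): {1},
    (1, "b"): {1}, (1, EPS): {2},
    (2, "c"): {2},
}
EPS_INIT = 0
EPS_ACCEPTING = {2}


def eps_closure(states):
    """All states reachable from the given states using only ε-arcs."""
    closure, stack = set(states), list(states)
    while stack:
        state = stack.pop()
        for nxt in EPS_DELTA.get((state, EPS), ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


def eps_accepts(word: str) -> bool:
    """Simulate the machine; ε-arcs are followed without consuming input."""
    current = eps_closure({EPS_INIT})
    for symbol in word:
        moved = {nxt for state in current
                 for nxt in EPS_DELTA.get((state, symbol), ())}
        current = eps_closure(moved)
    return bool(current & EPS_ACCEPTING)


for w in ["", "abc", "aacc", "ba"]:
    print(repr(w), eps_accepts(w))
```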
Non-determinism: ε-moves do not necessarily imply nondeterminism (state diagram not reproduced)
Regular languages: The family of regular languages is identical to the family of FSA-acceptable languages!
Generative Grammars
Generative Grammars
• Alternative characterization of the family of (regular) languages
• Example with just 2 productions: (1) S → aSb (2) S → ε
• Generates all words of form a^n b^n for n ≥ 0
• e.g. aaabbb: S ⇒ aSb (1) ⇒ aaSbb (1) ⇒ aaaSbbb (1) ⇒ aaabbb (2)
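The derivation of a^n b^n can be replayed mechanically: apply production (1) n times, then production (2) once. The function name `derive` and the use of plain string replacement are assumptions made for this small sketch.

```python
def derive(n: int) -> list[str]:
    """Sentential forms in the derivation of a^n b^n from start symbol S."""
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", "aSb", 1)   # production (1): S -> aSb
        steps.append(form)
    form = form.replace("S", "", 1)          # production (2): S -> ε
    steps.append(form)
    return steps


print(" => ".join(derive(3)))
```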
Definition
• Terminal alphabet Σ: grammar terminals (usually lowercase)
• Non-terminal alphabet Γ: grammar non-terminals (usually uppercase)
• Start symbol S in Γ
• Production set
• Empty productions
Second Example
• (1) S → aaXcc (2) X → aXc (3) X → b
• Generates all words of form a^n b c^n for n ≥ 2
• e.g. aaaabcccc: S ⇒ aaXcc (1) ⇒ aaaXccc (2) ⇒ aaaaXcccc (2) ⇒ aaaabcccc (3)
Third Example
• (1) S → aS´bc (2) S → ε (3) S´ → aS´bC (4) S´ → ε (5) Cb → bC (6) Cc → cc
• Generates language {a^n b^n c^n | n ≥ 0}
• e.g. aaabbbccc: S ⇒ aS´bc (1) ⇒ aaS´bCbc (3) ⇒ aaaS´bCbCbc (3) ⇒ aaabCbCbc (4) ⇒ aaabbCCbc (5) ⇒ aaabbCbCc (5) ⇒ aaabbbCCc (5) ⇒ aaabbbCcc (6) ⇒ aaabbbccc (6)
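Because productions (5) and (6) have more than one symbol on the left-hand side, derivations in this grammar are plain string rewriting. The sketch below searches breadth-first over sentential forms and collects the terminal words up to a length bound. Writing the nonterminal S´ as `T` and the name `generated_words` are assumptions made so that every symbol is a single character.

```python
from collections import deque

# Productions (1)-(6); the nonterminal S´ is written as T here.
PRODUCTIONS = [
    ("S", "aTbc"), ("S", ""),
    ("T", "aTbC"), ("T", ""),
    ("Cb", "bC"), ("Cc", "cc"),
]


def generated_words(max_len: int) -> set[str]:
    """Terminal words of length <= max_len derivable from S."""
    # Only S and T can be erased, and a form holds at most one of them,
    # so forms longer than max_len + 1 cannot yield a short enough word.
    limit = max_len + 1
    seen, queue, words = {"S"}, deque(["S"]), set()
    while queue:
        form = queue.popleft()
        if form == "" or form.islower():   # no nonterminals left
            if len(form) <= max_len:
                words.add(form)
            continue
        for lhs, rhs in PRODUCTIONS:
            start = form.find(lhs)
            while start != -1:             # rewrite every occurrence of lhs
                new = form[:start] + rhs + form[start + len(lhs):]
                if len(new) <= limit and new not in seen:
                    seen.add(new)
                    queue.append(new)
                start = form.find(lhs, start + 1)
    return words


print(sorted(generated_words(6)))
```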
Equivalence: Two generative grammars G and G´ are said to be equivalent if L(G) = L(G´).
Right-linear grammars: A generative grammar where the production rules are either of the form X → wY or X → w, where X, Y are nonterminals and w is a (possibly empty) word
Equivalence result
• If L is generated by a right-linear grammar, then L is regular
• A language generated by a right-linear grammar (i.e. a regular language) can always be accepted by an FSA
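The link between right-linear grammars and FSAs can be made concrete by treating nonterminals as states: a production X → wY reads w and moves from X to Y, and X → w reads w and accepts. The sketch below recognizes words in that way. The grammar used (S → aX, X → bX, X → ε, generating {ab^n | n ≥ 0}) and the triple encoding of productions are assumptions made for this illustration.

```python
def rl_accepts(productions, start, word):
    """Recognize word with a right-linear grammar.

    productions: list of (X, w, Y) for X -> wY, with Y = None for X -> w.
    """
    seen = set()
    stack = [(start, 0)]          # (current nonterminal, input position)
    while stack:
        nonterminal, pos = stack.pop()
        if (nonterminal, pos) in seen:
            continue              # avoids looping on productions like X -> Y
        seen.add((nonterminal, pos))
        for lhs, w, nxt in productions:
            if lhs != nonterminal or not word.startswith(w, pos):
                continue
            if nxt is None:
                if pos + len(w) == len(word):
                    return True   # X -> w consumed exactly the rest
            else:
                stack.append((nxt, pos + len(w)))
    return False


# Hypothetical right-linear grammar for {ab^n | n >= 0}
GRAMMAR = [("S", "a", "X"), ("X", "b", "X"), ("X", "", None)]

for w in ["a", "abb", "", "aba"]:
    print(repr(w), rl_accepts(GRAMMAR, "S", w))
```

Each (nonterminal, position) pair is visited at most once, so the search behaves like a nondeterministic FSA whose states are the nonterminals.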
Context-free Languages
• There are languages that can be generated by a context-free grammar that are not regular, e.g. (1) S → aSb (2) S → ε, with L = {a^n b^n | n ≥ 0}
• Context-free grammars have a single non-terminal on the left-hand side of the production
• Context-free languages: the class of languages that can be generated by some context-free grammar
Context-sensitive languages
• There are languages that cannot be generated by a context-free grammar, e.g. (1) S → aS´bc (2) S → ε (3) S´ → aS´bC (4) S´ → ε (5) Cb → bC (6) Cc → cc
• Language L is a context-sensitive language if there exists a context-sensitive grammar G such that either L = L(G) or L = L(G) ∪ {ε}
• Context-sensitive languages: the class of languages that can be generated by some context-sensitive grammar
An equivalence result
• Any context-sensitive language is Turing-recognizable, but not vice versa
• There exist Turing-recognizable (recursively enumerable) languages that are not context-sensitive
The Chomsky Hierarchy
• Regular languages. G: right-linear grammars. Accepted by deterministic FSA
• Context-free languages. G: context-free grammars. Accepted by nondeterministic push-down stack automaton
• Context-sensitive languages. G: context-sensitive grammars. Accepted by linear-bounded automaton
• Recursively enumerable languages. Accepted by deterministic 1-tape Turing Machine