Tree Automata
First: a reminder on automata on words
Typing semistructured data

Finite state automata on words
• Alphabet
• States
• Initial state
• Accepting states
• Transitions

Nondeterministic automaton: Example
[Figure: runs of a nondeterministic automaton with states q0, q1, q2 on a word over {a, b}. A run ending in q0 is rejected (KO); a run reaching q2 is accepted (OK).]
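A minimal Python sketch of how a nondeterministic automaton reads a word: it tracks the set of states reachable after each letter, which covers all runs at once. The automaton below is a hypothetical example, not the one from the slide's figure: states q0, q1, q2, accepting the words over {a, b} that end in "ba".

```python
# Hypothetical NFA (an assumption for illustration, not taken from the slide):
# it accepts the words over {a, b} that end in "ba".
DELTA = {
    ("q0", "a"): {"q0"},
    ("q0", "b"): {"q0", "q1"},  # nondeterministic choice on b
    ("q1", "a"): {"q2"},
}
INITIAL = "q0"
ACCEPTING = {"q2"}


def accepts(word):
    """Return True if at least one run of the NFA on `word` ends in an accepting state."""
    current = {INITIAL}
    for letter in word:
        # Follow every possible transition from every currently reachable state.
        current = {t for s in current for t in DELTA.get((s, letter), set())}
    return bool(current & ACCEPTING)
```

On the word "aba", the run that stays in q0 fails, but the run q0, q0, q1, q2 succeeds, so the word is accepted.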

Reminder
• Deterministic
– No ε-transitions
– No alternative transitions, i.e. two transitions from the same state on the same letter
• Determinization
– It is possible to obtain an equivalent deterministic automaton
– State of new automaton = set of states of the original one
– Possible exponential blow-up
• Minimization
• Limitations
– Cannot recognize all context-free languages, e.g. { a^n b^n | n ≥ 0 }
• Essential tool
– e.g., lexical analysis
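The determinization bullets can be sketched as the classical subset construction. The Python code below is an illustrative sketch under stated assumptions: the NFA has no ε-transitions, its transitions are a dictionary from (state, letter) to a set of states, and the example NFA (words ending in "ba") is hypothetical.

```python
from collections import deque


def determinize(alphabet, delta, initial, accepting):
    """Subset construction: each state of the new automaton is a set of original states."""
    start = frozenset([initial])
    dfa_delta = {}
    seen = {start}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        for letter in alphabet:
            # The successor subset collects all successors of all members.
            target = frozenset(t for s in subset for t in delta.get((s, letter), ()))
            dfa_delta[(subset, letter)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is accepting if it contains at least one accepting original state.
    dfa_accepting = {s for s in seen if s & accepting}
    return seen, dfa_delta, start, dfa_accepting


def dfa_accepts(dfa_delta, start, dfa_accepting, word):
    state = start
    for letter in word:
        state = dfa_delta[(state, letter)]
    return state in dfa_accepting


# Hypothetical NFA: words over {a, b} ending in "ba".
NFA_DELTA = {("q0", "a"): {"q0"}, ("q0", "b"): {"q0", "q1"}, ("q1", "a"): {"q2"}}
STATES, DFA_DELTA, START, DFA_ACCEPTING = determinize("ab", NFA_DELTA, "q0", {"q2"})
```

Only reachable subsets are built, which is why this example yields 3 states rather than all 8 subsets. In the worst case the number of subsets is exponential.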

Reminder (2)
• L(A) = set of words accepted by automaton A
• Regular languages
• Can be described by regular expressions, e.g. a(b+c)*d
• Closed under complement
• Closed under union, intersection
– Product automaton with states (s, s’), where s is from A and s’ is from A’
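The product construction can be sketched in Python as follows. This is an illustration assuming two complete deterministic automata; the two example automata (even number of a's, words ending in b) and all names are hypothetical.

```python
def product_automaton(alphabet, dfa_a, dfa_b):
    """Build the reachable part of the product of two complete DFAs.

    Each DFA is a triple (delta, initial, accepting), where delta maps
    (state, letter) to a single state.
    """
    delta_a, init_a, acc_a = dfa_a
    delta_b, init_b, acc_b = dfa_b
    start = (init_a, init_b)
    delta, seen, todo = {}, {start}, [start]
    while todo:
        s, t = todo.pop()
        for letter in alphabet:
            # Both components move in lockstep on the same letter.
            target = (delta_a[(s, letter)], delta_b[(t, letter)])
            delta[((s, t), letter)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    both = {(s, t) for (s, t) in seen if s in acc_a and t in acc_b}   # intersection
    either = {(s, t) for (s, t) in seen if s in acc_a or t in acc_b}  # union
    return delta, start, both, either


def run(delta, start, word):
    state = start
    for letter in word:
        state = delta[(state, letter)]
    return state


# Hypothetical example automata over {a, b}.
EVEN_A = ({("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"}, "e", {"e"})
ENDS_B = ({("n", "a"): "n", ("n", "b"): "y", ("y", "a"): "n", ("y", "b"): "y"}, "n", {"y"})

DELTA, START, BOTH, EITHER = product_automaton("ab", EVEN_A, ENDS_B)
```

The same product serves for both operations: only the choice of accepting pairs differs.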

Automata on words versus trees
• Words: left to right or right to left, no difference
• Trees: bottom-up or top-down, differences

Automata on ranked trees

Binary tree automata
• Parallel evaluation, bottom-up
• For leaves: a rule assigns a state to a leaf based on its label
• For other nodes: a rule assigns a state to a node based on its label and the states of its two children
[Figure: a binary tree with labels a, b whose nodes are assigned states such as q, q’, q” from the leaves up.]

Bottom-up tree automata
• Bottom-up: if a node labeled a has its children in states q, q’, then the node moves nondeterministically to state r or r’
• Accepts if the root is in some state in F
• Not deterministic if there are alternatives or ε-transitions
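A minimal Python sketch of a nondeterministic bottom-up run on binary trees. The tree encoding (a 1-tuple for a leaf, a 3-tuple for an inner node) and the example automaton are assumptions made for illustration; the example accepts trees over {a, b} that have at least one leaf labeled b.

```python
# Hypothetical automaton: q1 means "a b-leaf was guessed somewhere below".
LEAF_RULES = {"a": {"q0"}, "b": {"q0", "q1"}}  # label -> possible states
NODE_RULES = {}                                # (label, left state, right state) -> states
for label in ("a", "b"):
    NODE_RULES[(label, "q0", "q0")] = {"q0"}
    NODE_RULES[(label, "q1", "q0")] = {"q1"}
    NODE_RULES[(label, "q0", "q1")] = {"q1"}
ACCEPTING = {"q1"}


def reachable_states(tree):
    """Return every state the automaton can assign to the root of `tree`."""
    if len(tree) == 1:                          # leaf: (label,)
        return set(LEAF_RULES.get(tree[0], ()))
    label, left, right = tree                   # inner node: (label, left, right)
    states = set()
    for q_left in reachable_states(left):
        for q_right in reachable_states(right):
            states |= NODE_RULES.get((label, q_left, q_right), set())
    return states


def accepts(tree):
    # Accept if the root can be in some accepting state.
    return bool(reachable_states(tree) & ACCEPTING)
```

Computing the set of reachable states per node handles all nondeterministic choices at once, much like tracking state sets for word automata.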

Example: deterministic bottom-up

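Boolean circuit evaluation
[Figure: a Boolean circuit drawn as a tree, with gates at internal nodes and constants 0 and 1 at the leaves, evaluated bottom-up. The root evaluates to 1, so the tree is accepted (OK).]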
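Boolean circuit evaluation can be written as a deterministic bottom-up automaton with two states, 0 and 1. The Python sketch below assumes binary "and" and "or" gates and a tuple encoding of trees; the gate names and the encoding are illustrative choices, not taken from the slide.

```python
# Leaves: the label determines the state.
LEAF_RULES = {"0": 0, "1": 1}

# Inner nodes: exactly one target state per (gate, left state, right state),
# which is what makes the automaton deterministic.
NODE_RULES = {}
for x in (0, 1):
    for y in (0, 1):
        NODE_RULES[("and", x, y)] = x & y
        NODE_RULES[("or", x, y)] = x | y

ACCEPTING = {1}


def state_of(tree):
    """Return the unique state assigned to the root of `tree`."""
    if len(tree) == 1:                 # leaf: (label,)
        return LEAF_RULES[tree[0]]
    gate, left, right = tree           # inner node: (gate, left, right)
    return NODE_RULES[(gate, state_of(left), state_of(right))]


def accepts(tree):
    return state_of(tree) in ACCEPTING
```

For instance, the circuit or(and(1, 0), 1) evaluates to 1 and is accepted.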

Regular tree language = set of trees accepted by a bottom-up tree automaton

Regular tree languages
Theorem: the following are equivalent
– L is a regular tree language
– L is accepted by a nondeterministic bottom-up automaton
– L is accepted by a nondeterministic top-down automaton
Deterministic top-down is weaker

Top-down tree automata
• Top-down: if a node labeled a is in state q”, then its left child moves to state q and its right child to state q’
• Accepts if all leaves are in states in F
• Not deterministic if there are alternative transitions for the same label and state
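A minimal Python sketch of a nondeterministic top-down run on binary trees. Two modeling assumptions are made here: trees are tuples, and leaf acceptance is given as a set of (label, state) pairs so that the leaf label can matter. The example automaton is hypothetical; it accepts exactly the two trees r(a, b) and r(b, a).

```python
# (label, state) -> set of (left child state, right child state)
RULES = {("r", "q0"): {("qa", "qb"), ("qb", "qa")}}  # two alternatives: nondeterministic
LEAF_ACCEPT = {("a", "qa"), ("b", "qb")}             # assumption: accepting (label, state) pairs
INITIAL = "q0"


def accepts_from(tree, state):
    """Return True if some run started in `state` at the root accepts `tree`."""
    if len(tree) == 1:                               # leaf: (label,)
        return (tree[0], state) in LEAF_ACCEPT
    label, left, right = tree                        # inner node: (label, left, right)
    return any(
        accepts_from(left, q_left) and accepts_from(right, q_right)
        for q_left, q_right in RULES.get((label, state), ())
    )


def accepts(tree):
    return accepts_from(tree, INITIAL)
```

The two alternatives for (r, q0) are what let the automaton correlate the two leaves. A deterministic top-down automaton has a single alternative and cannot do this.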

Why is deterministic top-down weaker?
• Consider the language
– L = { <r><a/><b/></r>, <r><b/><a/></r> }
• It can be accepted by a bottom-up TA
– Exercise: write a BUTA A such that L = L(A)
• Suppose that B is a deterministic top-down TA that accepts both trees in L
– Exercise: show that B also accepts <r><a/><a/></r>
– A contradiction
Fact: no deterministic top-down tree automaton accepts exactly L
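The fact can be checked by brute force on a small state space. The Python sketch below is an illustration under stated assumptions, not a proof: the automaton has 3 states, the single rule for the root label r sends fixed states to the left and right child, and leaf acceptance is a set of (label, state) pairs. It enumerates every such deterministic top-down automaton and looks for one that accepts exactly L among the four trees r(x, y).

```python
from itertools import combinations, product

STATES = (0, 1, 2)
LABELS = ("a", "b")
TREES = [(x, y) for x in LABELS for y in LABELS]  # (x, y) stands for the tree r(x, y)
TARGET = {("a", "b"), ("b", "a")}                 # the language L


def subsets(items):
    """Yield every subset of `items` as a set."""
    for size in range(len(items) + 1):
        for chosen in combinations(items, size):
            yield set(chosen)


def accepted(child_states, leaf_accept):
    """Trees r(x, y) accepted when the root rule sends child_states to the children."""
    q_left, q_right = child_states
    return {
        (x, y) for (x, y) in TREES
        if (x, q_left) in leaf_accept and (y, q_right) in leaf_accept
    }


def exists_deterministic_top_down():
    leaf_pairs = [(label, q) for label in LABELS for q in STATES]
    for child_states in product(STATES, repeat=2):  # the unique rule for (r, initial state)
        for leaf_accept in subsets(leaf_pairs):
            if accepted(child_states, leaf_accept) == TARGET:
                return True
    return False
```

The search finds no such automaton. The reason matches the exercise: accepting r(a, b) and r(b, a) forces the left-child state to accept both a and b, and likewise for the right-child state, so r(a, a) is accepted as well.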

Ranked tree automata: Properties
• Like for words
• Determinization
• Minimization
• Closed under
– Complement
– Intersection
– Union

…But
• XML documents are unranked: book (intro, section*, conclusion)

Automata on unranked trees

Unranked tree automata
Issue: represent an infinite set of transitions
Solution: a regular language

Unranked tree automata (2)
• Rule: a label a, a regular expression Q over states, and target states r1, …, rm
• Meaning: if the states of the children of some node labeled a form a word in L(Q), this node moves to some state in {r1, …, rm}
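A minimal Python sketch of such rules, using the book (intro, section*, conclusion) example. The encoding is an assumption made for illustration: each state is a single character so that a sequence of child states is a string, and the regular language Q is a Python regular expression matched with `re.fullmatch`. The brute-force enumeration of child state combinations is simple but exponential in the worst case.

```python
import re
from itertools import product

# Each rule: (label, regular expression over child states, target states).
# Hypothetical states: i = intro, s = section, c = conclusion, b = book.
RULES = [
    ("intro", r"", {"i"}),         # leaves: the children form the empty word
    ("section", r"", {"s"}),
    ("conclusion", r"", {"c"}),
    ("book", r"is*c", {"b"}),      # one intro, any number of sections, one conclusion
]
ACCEPTING = {"b"}


def states_of(tree):
    """Return every state the automaton can assign to the root of `tree`."""
    label, children = tree         # unranked node: (label, [children])
    child_states = [states_of(child) for child in children]
    result = set()
    for rule_label, pattern, targets in RULES:
        if rule_label != label:
            continue
        # Try every way of picking one possible state per child.
        for word in product(*child_states):
            if re.fullmatch(pattern, "".join(word)):
                result |= targets
                break
    return result


def accepts(tree):
    return bool(states_of(tree) & ACCEPTING)
```

A single rule for book covers infinitely many child sequences, which is the point of describing transitions with a regular language.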

Building on ranked trees
[Figure: an unranked tree with labels a, b and its encoding as a ranked (binary) tree.]
Ranked tree: FirstChild-NextSibling
F: encoding into a ranked tree
F is a bijection
F⁻¹: decoding
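The FirstChild-NextSibling encoding and its inverse can be sketched in Python as follows. The tree representations are assumptions made for illustration: an unranked node is (label, [children]) and a binary node is (label, first_child, next_sibling), with None for a missing child.

```python
def encode_forest(trees):
    """Encode a list of sibling trees into one binary tree (or None if empty)."""
    if not trees:
        return None
    (label, children), rest = trees[0], trees[1:]
    # Left child = first child of this node, right child = its next sibling.
    return (label, encode_forest(children), encode_forest(rest))


def encode(tree):
    """F: encode a single unranked tree into a binary tree."""
    return encode_forest([tree])


def decode_forest(node):
    """Decode a binary tree back into the list of sibling trees it represents."""
    if node is None:
        return []
    label, first_child, next_sibling = node
    return [(label, decode_forest(first_child))] + decode_forest(next_sibling)


def decode(node):
    """F⁻¹: decode a binary tree whose root has no next sibling."""
    forest = decode_forest(node)
    if len(forest) != 1:
        raise ValueError("the root of an encoded tree must not have a sibling")
    return forest[0]


TREE = ("a", [("b", []), ("c", [("d", [])]), ("b", [])])
```

In this representation the encoding is a bijection between unranked trees and binary trees whose root has no right child, which is why `decode` rejects any other input.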

Building on bottom-up ranked trees (2)
• For each unranked TA A, there is a ranked TA accepting F(L(A))
• For each ranked TA A, there is an unranked TA accepting F⁻¹(L(A))
• Both are easy to construct
Consequence: unranked TA are closed under union, intersection, complement
Determinization is also possible, but a bit more tricky