Natural Language Processing (Elaborazione del linguaggio naturale)
Fabio Massimo Zanzotto (FMZ)
Part five: Feature Structures
Where are we?
Target of the analysis:
– interpret NL sentences with respect to some unambiguous internal language
• A natural language is an ambiguous, social beast, whereas a formal language is unambiguous and decided top-down
What is a language model?
– a way of treating infinitely many sentences with generative machinery built from a finite set of rules
How we proceeded so far...
AXIOM: a syntactic interpretation of a sentence helps in understanding its semantics
Let's build a syntactic model for NL!
• Analysis of the Chomsky hierarchy
• Use of context-free formalisms and related parsing algorithms
– CYK
– DCG (in Prolog)
How we proceeded so far...
OBSERVATION: NL is more difficult than we may think
Let us give up on total grammaticality!
• Models offering partial analysis
– CYK
– Chart parsing and the Earley algorithm
Our Aim
Lines of development:
• Grammatical representation power: build a formalism/model that makes it possible to reduce unnecessary interpretations
• Grammar use: build a formalism (and an associated algorithm) able to represent partial analyses
Our Aim
Lines of development:
• Grammatical representation power:
– CFG (context-free grammars)
– DCG
• Grammar use:
– CYK
– Chart parsing and the Earley algorithm
Observing natural language
Toy examples:
• La vecchia porta la sbarra ("The old woman carries the bar" or "The old door bars her")
• Il vecchio porta la sbarra ("The old man carries the bar")
• Flying planes can be dangerous
• Flying planes is dangerous
A sample grammar (introspectively produced)
S → NP VP | S SBAR | SBAR S
SBAR → CongSub S
S → S CongCoord S | S, S CongCoord S
NP → SBAR
VP → VerbX NP | VerbX NP PP
VerbX → Verb | Modal Verb
NP → Art Noun | Art Adj Noun | Verb Noun | NP PP
PP → Prep NP
Observations
• The sample grammar is insufficient!
• Spurious interpretations are produced for unambiguous sentences
• We are losing the eternal struggle between coverage and induced ambiguity
NP → Art Noun | Art Adj Noun | Verb Noun | NP PP
Example: "the old man carries apples"
A) the old man (VP carries apples)
B) the old man (NP carries apples)
Necessary extensions
• Introducing notions like:
– gender: masculine, feminine
– number: singular, plural
– person (for verbs)
– tense (for verbs)
– mood (for verbs)
Grammar
Adding number (Sing, Plur):
NPSing → ArtSing NounSing
NPPlur → ArtPlur NounPlur
VPSing → VerbXSing NP | VerbXSing NP PP
VPPlur → VerbXPlur NP | VerbXPlur NP PP
S → NPSing VPSing | NPPlur VPPlur
Grammar
Adding number (Sing, Plur) and gender (Mas, Fem):
NPMasSing → ArtMasSing NounMasSing
NPFemSing → ArtFemSing NounFemSing
NPMasPlur → ArtMasPlur NounMasPlur
NPFemPlur → ArtFemPlur NounFemPlur
VPSing → VerbXSing NP | VerbXSing NP PP
VPPlur → VerbXPlur NP | VerbXPlur NP PP
S → NPMasSing VPSing | NPFemSing VPSing | NPMasPlur VPPlur | NPFemPlur VPPlur
!! Rules are proliferating uncontrollably !!
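The proliferation can be made concrete with a small sketch. It is not part of the original slides: the function name `expand_np_rules` and the `FEATURES` table are illustrative assumptions. It spells out one atomic NP rule per combination of feature values, so the rule count is the product of the sizes of the value sets.

```python
from itertools import product

# Illustrative feature inventory (assumption): the slides use gender and number.
FEATURES = {
    "gen": ["Mas", "Fem"],
    "num": ["Sing", "Plur"],
}

def expand_np_rules(features):
    """Spell out one atomic CFG rule per combination of feature values."""
    rules = []
    for combo in product(*features.values()):
        suffix = "".join(combo)
        rules.append(f"NP{suffix} -> Art{suffix} Noun{suffix}")
    return rules

rules = expand_np_rules(FEATURES)
for rule in rules:
    print(rule)
print(len(rules))  # 2 genders x 2 numbers = 4 rules
```

Each further feature multiplies the count again, which is why encoding agreement in category names does not scale.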
Feature Structures
What do we desire?
Adding number (Sing, Plur) and gender (Mas, Fem) currently requires four NP rules:
NPMasSing → ArtMasSing NounMasSing
NPFemSing → ArtFemSing NounFemSing
NPMasPlur → ArtMasPlur NounMasPlur
NPFemPlur → ArtFemPlur NounFemPlur
We would like a single rule in which X and Y are variables shared by all three categories:
NP_Gen:X_Num:Y → Art_Gen:X_Num:Y Noun_Gen:X_Num:Y
Feature Structures
Feature structures (information containers) are:
• Sets of attribute-value pairs
• The value of an attribute may be:
– a final value (i.e., an element from a set)
– a feature structure
Example:
Cat: np
Agreement:
    Gen: mas
    Num: sing
Feature Structures
Formally, if $F$ is a feature structure:
• $F$ is a set of pairs $(f, v)$
• for each $(f, v) \in F$, either
– $v$ is a final value, or
– $v$ is a feature structure
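One way to picture this recursive definition is as nested dictionaries. The sketch below is an assumption for illustration only (the slides themselves use Prolog lists later on): attributes are strings, final values are strings or integers, and `is_feature_structure` is a hypothetical helper that checks the definition.

```python
def is_feature_structure(fs):
    """Check the recursive definition: each value is final or a feature structure."""
    if not isinstance(fs, dict):
        return False
    for attribute, value in fs.items():
        if not isinstance(attribute, str):
            return False
        if isinstance(value, dict):
            # Nested feature structure: check it recursively.
            if not is_feature_structure(value):
                return False
        elif not isinstance(value, (str, int)):
            # Anything else must be a final (atomic) value.
            return False
    return True

# The example from the slides: Cat: np, Agreement: [Gen: mas, Num: sing]
np_fs = {"cat": "np", "agreement": {"gen": "mas", "num": "sing"}}
print(is_feature_structure(np_fs))
```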
Feature Structures: Lexicon
• nouns
– forma_superficiale (surface form)
– lemma
– genere (gender)
– numero (number)
• verbs
– forma_superficiale (surface form)
– radice (root)
– coniugazione (conjugation): are, ere, ire
– genere (gender): mas, fem
– numero (number): sing, plur
– persona (person): 1, 2, 3
– modo (mood): indicativo, congiuntivo, imperativo
– tempo (tense): presente, passato, ...
– verso (voice): attivo, passivo
Lexicon: examples
forma_superficiale: mangeremo
radice: mangi
coniugazione: are
numero: plur
persona: 1
modo: indicativo
tempo: futuro
Lexicon: examples
forma_superficiale: mangerebbe
radice: mangi
lemma: mangiare
coniugazione: are
numero: sing
persona: 3
modo: condizionale
tempo: presente
Lexicon: examples
forma_superficiale: uomini
lemma: uomo
numero: plur
genere: mas
How to use the lexicon?
The sentence "l'uomo mangerebbe pere" ("the man would eat pears") may be seen as:
[forma_superficiale: l'] [forma_superficiale: uomo] [forma_superficiale: mangerebbe] [forma_superficiale: pere]
Lexicon entry for "mangerebbe":
forma_superficiale: mangerebbe
radice: mangi
lemma: mangiare
coniugazione: are
numero: sing
persona: 3
modo: condizionale
tempo: presente
Comparing feature structures: subsumption
• A feature structure $F_1$ subsumes $F_2$ ($F_1 \sqsubseteq F_2$) if all the information in $F_1$ is also in $F_2$
Formally, $F_1 \sqsubseteq F_2$ if and only if
$\forall (f, v) \in F_1 \;\; \exists (f, v') \in F_2 \,.\; v = v' \text{ or } v \sqsubseteq v'$
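The definition translates almost line by line into code. The following is a minimal sketch, assuming feature structures are represented as nested Python dictionaries (a representation chosen here for illustration, not the one used in the slides); `subsumes` is a hypothetical name.

```python
def subsumes(f1, f2):
    """Return True if every piece of information in f1 is also in f2."""
    for attribute, v1 in f1.items():
        if attribute not in f2:
            return False
        v2 = f2[attribute]
        if isinstance(v1, dict) and isinstance(v2, dict):
            # Both values are feature structures: compare them recursively.
            if not subsumes(v1, v2):
                return False
        elif v1 != v2:
            # Final values must be identical.
            return False
    return True

token = {"forma_superficiale": "mangerebbe"}
entry = {
    "forma_superficiale": "mangerebbe",
    "radice": "mangi",
    "lemma": "mangiare",
    "numero": "sing",
    "persona": 3,
}
print(subsumes(token, entry))  # the token carries less information
print(subsumes(entry, token))  # the entry carries more information
```

Subsumption is a partial order: the second call fails because the entry has attributes the token lacks.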
After the lexicon and subsumption
The sentence "l'uomo mangerebbe pere" may be seen as:
[forma_superficiale: l'] [forma_superficiale: uomo] [forma_superficiale: mangerebbe] [forma_superficiale: pere]
Each token structure subsumes its lexicon entry, so the entry can be retrieved.
Entry for "uomo":
forma_superficiale: uomo
lemma: uomo
numero: sing
genere: mas
Entry for "mangerebbe":
forma_superficiale: mangerebbe
radice: mangi
lemma: mangiare
coniugazione: are
numero: sing
persona: 3
modo: condizionale
tempo: presente
What if?
The sentence "l'uomo mangerebbe pere" may be seen as:
[forma_superficiale: l'] [forma_superficiale: uomo] [forma_superficiale: mangerebbe, forma_fonologica: xxxx] [forma_superficiale: pere]
Lexicon entry for "mangerebbe":
forma_superficiale: mangerebbe
radice: mangi
lemma: mangiare
coniugazione: are
numero: sing
persona: 3
modo: condizionale
tempo: presente
Subsumption is not sufficient! The token now carries forma_fonologica, which the entry lacks, so neither structure subsumes the other.
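Unification
Unification is a partial operation on two feature structures: when it is defined, the new feature structure contains all the information of the two.
$F_1 \sqcup F_2$ is such that:
– $F_1 \sqsubseteq F_1 \sqcup F_2$
– $F_2 \sqsubseteq F_1 \sqcup F_2$
– if $H$ has the property $F_1 \sqsubseteq H$ and $F_2 \sqsubseteq H$, then $F_1 \sqcup F_2 \sqsubseteq H$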
Unification: example
forma_superficiale: mangerebbe
forma_fonologica: xxx
$\sqcup$
forma_superficiale: mangerebbe
radice: mangi
lemma: mangiare
cat: verbo
coniugazione: are
numero: sing
persona: 3
modo: condizionale
tempo: presente
=
forma_superficiale: mangerebbe
radice: mangi
forma_fonologica: xxx
lemma: mangiare
cat: verbo
coniugazione: are
numero: sing
persona: 3
modo: condizionale
tempo: presente
Unification
The unification of two feature structures may not exist. Here the two values of forma_superficiale clash (mangia vs. mangerebbe):
forma_superficiale: mangia
forma_fonologica: xxx
$\sqcup$
forma_superficiale: mangerebbe
radice: mangi
lemma: mangiare
cat: verbo
coniugazione: are
numero: sing
persona: 3
modo: condizionale
tempo: presente
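Both the successful and the failing case can be reproduced with a short sketch. It assumes feature structures are nested Python dictionaries and signals failure by returning `None`; the function name `unify` and this convention are illustrative choices, not the course's implementation.

```python
def unify(f1, f2):
    """Return the unification of f1 and f2, or None if they clash."""
    result = dict(f1)  # shallow copy, so the inputs are not modified
    for attribute, v2 in f2.items():
        if attribute not in result:
            # Information present only in f2 is simply added.
            result[attribute] = v2
            continue
        v1 = result[attribute]
        if isinstance(v1, dict) and isinstance(v2, dict):
            merged = unify(v1, v2)
            if merged is None:
                return None
            result[attribute] = merged
        elif v1 != v2:
            # Two different final values: unification is undefined.
            return None
    return result

entry = {
    "forma_superficiale": "mangerebbe", "radice": "mangi",
    "lemma": "mangiare", "cat": "verbo", "coniugazione": "are",
    "numero": "sing", "persona": 3,
    "modo": "condizionale", "tempo": "presente",
}
token = {"forma_superficiale": "mangerebbe", "forma_fonologica": "xxx"}
other = {"forma_superficiale": "mangia", "forma_fonologica": "xxx"}

print(unify(token, entry))  # merged structure with all attributes
print(unify(other, entry))  # None: the surface forms differ
```

Returning a fresh dictionary keeps the lexicon entries untouched, at the cost of copying on every call.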
Coindexing
What if we want to apply this rule? The tag [1] marks a value shared by the two daughters:
[cat: s] → [cat: nome, numero: [1]] [cat: verbo, numero: [1], persona: 3]
Entry for "uomo":
forma_superficiale: uomo
lemma: uomo
cat: nome
numero: sing
genere: mas
Entry for "mangerebbe":
forma_superficiale: mangerebbe
radice: mangi
lemma: mangiare
cat: verbo
coniugazione: are
numero: sing
persona: 3
modo: condizionale
tempo: presente
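A rough sketch of how such a coindexed rule could be applied is given below. It is an illustration under stated assumptions: feature structures are nested Python dictionaries, `unify` returns `None` on failure, and the tag [1] is simulated by unifying the `numero` values of the two daughters. The helper `apply_s_rule` is hypothetical and hard-codes this one rule.

```python
def unify(f1, f2):
    """Unify two feature structures; return None on a clash."""
    result = dict(f1)
    for attribute, v2 in f2.items():
        if attribute not in result:
            result[attribute] = v2
        elif isinstance(result[attribute], dict) and isinstance(v2, dict):
            merged = unify(result[attribute], v2)
            if merged is None:
                return None
            result[attribute] = merged
        elif result[attribute] != v2:
            return None
    return result

def apply_s_rule(noun, verb):
    """Apply [cat: s] -> [cat: nome, numero: [1]] [cat: verbo, numero: [1], persona: 3]."""
    noun_fs = unify(noun, {"cat": "nome"})
    verb_fs = unify(verb, {"cat": "verbo", "persona": 3})
    if noun_fs is None or verb_fs is None:
        return None
    # Coindexing [1]: both daughters must agree on a single value for numero.
    noun_num = {k: v for k, v in noun_fs.items() if k == "numero"}
    verb_num = {k: v for k, v in verb_fs.items() if k == "numero"}
    shared = unify(noun_num, verb_num)
    if shared is None:
        return None
    return {"cat": "s"}, unify(noun_fs, shared), unify(verb_fs, shared)

uomo = {"forma_superficiale": "uomo", "lemma": "uomo", "cat": "nome",
        "numero": "sing", "genere": "mas"}
uomini = {"forma_superficiale": "uomini", "lemma": "uomo", "cat": "nome",
          "numero": "plur", "genere": "mas"}
mangerebbe = {"forma_superficiale": "mangerebbe", "lemma": "mangiare",
              "cat": "verbo", "numero": "sing", "persona": 3}

print(apply_s_rule(uomo, mangerebbe) is not None)  # numbers agree
print(apply_s_rule(uomini, mangerebbe) is None)    # plur vs. sing clash
```

A full implementation would treat [1] as a genuine shared variable inside the structures rather than comparing one attribute by hand.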
Feature Structures in Prolog
• Feature structures will be represented as an open list of attribute-value pairs
• : (the colon) will be used to form attribute-value pairs
e.g.
[number:sg, person:3 | _]
[cat:np, agr:[number:sg, person:3 | _] | _]
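Unification in Prolog
% Succeeds when the two terms unify directly (equal values or an unbound variable).
unify0(Dag, Dag) :- !.
% Otherwise look up each feature of the first structure in the second.
unify0([Feature:Value|Rest], Dag) :-
    val(Feature, Value, Dag, StripDag),
    unify0(Rest, StripDag).

% val(F, V, Dag, Rest): Dag contains F with a value that unifies with V;
% Rest is Dag without that pair.
val(Feature, Value1, [Feature:Value2|Rest], Rest) :-
    !, unify0(Value1, Value2).
val(Feature, Value, [Dag|Rest], [Dag|NewRest]) :-
    !, val(Feature, Value, Rest, NewRest).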
Where did we work today?
Lines of development:
• Grammatical representation power: build a formalism/model that makes it possible to reduce unnecessary interpretations
• Grammar use: build a formalism (and an associated algorithm) able to represent partial analyses