Computational Morphology and its Implications for Theoretical Morphology

Richard Sproat
University of Illinois at Urbana-Champaign
PASCAL Morpho Challenge, Venice, April 12, 2006
Computational Morphology/Theoretical Morphology

“Item-and-arrangement” versus “Item-and-process”
• Charles Hockett (1954), “Two models of grammatical description”:
  – Item-and-arrangement: words are composed of morphemes that are put together by a kind of “word syntax”
  – Item-and-process: words are built up via the application of rules that add phonological and morphosyntactic information

Stump’s classification
• Two dimensions: lexical vs. inferential, and incremental vs. realizational
  – Lexical: hoot+s[3 sg]
  – Inferential: Ø → s / hoot[3 sg]
  – Incremental: hoots = 3 sg because of -s
  – Realizational: -s is introduced due to 3 sg
• Lexical-incremental (Lieber): affix is a lexical entry that introduces morphosyntactic features
• Inferential-incremental: Steele
• Lexical-realizational: Halle & Marantz
• Inferential-realizational (Stump; Beard’s LMBM): affix introduced because of morphosyntactic features

Computational morphology
• Nearly all morphological operations can be expressed in terms of regular relations.
  – The only possible exception is reduplication
• Regular relations are relations over pairs of strings that can be constructed solely by the operations of:
  – Concatenation: if R, S are regular relations then so is R • S
  – Union: if R, S are regular relations then so is R ∪ S
  – Kleene closure: if R is a regular relation then so is R* (0 or more instances of R concatenated with itself)
• Regular relations are closed under composition: if R, S are regular relations, then so is R ∘ S
• Implemented with finite-state transducers
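The operations on this slide can be illustrated with a minimal sketch that treats a relation as a finite Python set of (input, output) string pairs. This is only an illustration: real systems use finite-state transducers, which can represent infinite relations, and the function names and the `max_reps` bound on Kleene closure are assumptions made here.

```python
from itertools import product

def concat(r, s):
    """Concatenation R • S: pair up inputs and outputs side by side."""
    return {(a + c, b + d) for (a, b), (c, d) in product(r, s)}

def union(r, s):
    """Union R ∪ S."""
    return r | s

def compose(r, s):
    """Composition R ∘ S: the output of R feeds the input of S."""
    return {(a, d) for (a, b) in r for (c, d) in s if b == c}

def star(r, max_reps):
    """Kleene closure R*, truncated at max_reps repetitions (assumption:
    a finite set cannot hold the full, infinite closure)."""
    result = {("", "")}
    power = {("", "")}
    for _ in range(max_reps):
        power = concat(power, r)
        result |= power
    return result

R = {("a", "b")}
S = {("b", "c")}
print(concat(R, S))   # {('ab', 'bc')}
print(compose(R, S))  # {('a', 'c')}
print(star(R, 2))     # contains ('', ''), ('a', 'b'), ('aa', 'bb')
```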

Transducers and composition (Johnson, 1972; Koskenniemi, 1983; Kaplan & Kay, 1994; Mohri & Sproat, 1996)
• Consider the 3-letter alphabet {a, b, c}
• Given a rule a → b, the equivalent transducer is:
  [Figure: transducer for a → b]
• Example: abbca → bbbcb

Another rule
• b → c / _ b

The two rules composed
• a → b composed with b → c / _ b
• Example: abbca → ccbcb
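The composed mapping on this slide can be checked with a small sketch in which each rule is a Python string function and composition is function chaining. This stands in for transducer composition rather than implementing it; the function names are illustrative, and the contextual rule is assumed to apply simultaneously to every b that is followed by b in its input.

```python
import re

def rule_a_to_b(s):
    """a → b, applied everywhere."""
    return s.replace("a", "b")

def rule_b_to_c_before_b(s):
    """b → c / _ b: rewrite b as c when the next input symbol is b.
    The lookahead inspects the input string, so application is simultaneous."""
    return re.sub(r"b(?=b)", "c", s)

def compose(*rules):
    """Chain rules so that the output of each feeds the next."""
    def composed(s):
        for rule in rules:
            s = rule(s)
        return s
    return composed

both = compose(rule_a_to_b, rule_b_to_c_before_b)
print(rule_a_to_b("abbca"))  # bbbcb
print(both("abbca"))         # ccbcb
```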

Composition and morphology
• Composition is the most general computational mechanism that handles morphological operations (Roark and Sproat, 2006)
• Affixation (which is more typically handled using concatenation) can also be handled using composition
• Composition, and other closure properties of regular relations, imply that there is no fundamental difference between morphological theories.

Affixation as composition
[Figure: transducer; labels: “Any string over the alphabet”, “Insert b”]

Is this Rube-Goldbergesque?
• No! Because many affixes either impose requirements on their base or modify their base.
• Cf. Yowlumne (aka Yawelmani) (Archangeli, 1984)
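A rough sketch of affixation as composition: relations are modeled here as Python functions from a string to the set of its outputs, so that composition is just chained application. The base "abc", the suffix b, and the requirement that the base end in c are all hypothetical, chosen only to show that an affix with a requirement on its base is simply a more restricted relation.

```python
def compose(r, s):
    """R ∘ S for relations given as functions from a string to a set of strings."""
    return lambda x: {z for y in r(x) for z in s(y)}

def base(word):
    """Identity relation restricted to a single base string."""
    return lambda x: {x} if x == word else set()

def suffix_b(x):
    """Pass any string over the alphabet through unchanged, then insert b at the end."""
    return {x + "b"}

def suffix_b_after_c(x):
    """Hypothetical affix with a requirement: it attaches only to bases ending in c."""
    return {x + "b"} if x.endswith("c") else set()

print(compose(base("abc"), suffix_b)("abc"))          # {'abcb'}
print(compose(base("aba"), suffix_b_after_c)("aba"))  # set(): requirement not met
```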

Yowlumne gerundial -inay
• -inay requires the template CVC(C)
• Composing the base with k 1 will modify the base and add [+GER]

CVC(C)

Some morphological operations
• Subsegmental morphology
• Truncation
• Infixation
• Root-and-pattern morphology
• Reduplication
• Morphomic requirements (Aronoff, 1994)
All of these can be handled using composition

German diminutives

Koasati truncation (Lombardi & McCarthy, 1991)

Two kinds of infixation
• Extrametrical infixation
  – E.g. Bontoc
• Positively circumscribed infixation
  – E.g. Ulwa

Bontoc infixation (Seidenadel, 1907)

Ulwa infixation (CODIUL, 1989)

Root & pattern morphology (McCarthy 1979)
k t b

Root & pattern morphology

Root & pattern morphology: related approaches
• Beesley & Karttunen (2000) propose an approach using compile-replace plus merge
  [Figure: merge example]
  – Surface form is a regular expression
• Kiraz (2000) proposes a multitape solution
• But all of these are equivalent to composition

Reduplication: Gothic (Wright 1910)
• Prefix a syllable of the form (A)Cai to the stem, where C is a consonant position and A is an optional appendix
• Copy the onset of the stem to the C position. If there is a pre-onset appendix /s/, copy this to the appendix position
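The two steps above can be sketched as a plain string function. This is not a finite-state implementation, and several details are assumptions made for illustration: stems are simplified ASCII stand-ins, /s/ is treated as a pre-onset appendix only before p, t, or k, and only the first onset consonant is copied into the C position.

```python
STOPS = set("ptk")     # assumption: /s/ counts as an appendix only before these
VOWELS = set("aeiou")  # simplified ASCII vowel inventory

def reduplicate(stem):
    """Prefix (A)Cai to the stem, copying the onset and any /s/ appendix."""
    if len(stem) >= 2 and stem[0] == "s" and stem[1] in STOPS:
        appendix, onset = "s", stem[1]   # pre-onset /s/ plus onset stop
    elif stem and stem[0] not in VOWELS:
        appendix, onset = "", stem[0]    # plain consonantal onset
    else:
        appendix, onset = "", ""         # vowel-initial stem: nothing to copy
    return appendix + onset + "ai" + stem

print(reduplicate("hait"))   # haihait
print(reduplicate("skaid"))  # skaiskaid
print(reduplicate("slep"))   # saislep
```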

Bambara reduplication (Culy, 1985)
This is apparently beyond the power of finite-state methods.

Factoring reduplication
• Prosodic constraints
• Copy verification transducer C

Gothic index transducer

Factoring reduplication
• Then reduplication in Gothic can be modeled as: α ∘ C
• More generally, one can model reduplication as the following composition, where P implements the prosodic constraints, C the copy constraints, and A optional phonological adjustments: P ∘ C ∘ A

Other approaches
• Walther (2000a, 2000b) proposes a special kind of transducer involving
  – Repeat arcs: move backwards in a string and repeat
  – Skip arcs: skip over portions of the string
• Cohen-Sygal & Wintner (forthcoming) introduce finite-state registered automata, extending FSAs with registers
• These methods generally seem to presume exact copies

Non-exact copies
• Dakota (Inkelas & Zoll, 1999):

Non-exact copies
• Basic and modified stems in Sye (Inkelas & Zoll, 1999): “they will fall over”

Morphological Doubling Theory (Inkelas & Zoll, 1999)
• In contradistinction to the more common “correspondence” theory:
  – Reduplication involves doubling at the morphosyntactic level
  – Phonological doubling is thus expected, but not required

Gothic reduplication under Morphological Doubling Theory

More
• Composition also elegantly accounts for other phenomena such as prosodic circumscription (McCarthy and Prince, 1990) or morphomic requirements (Aronoff, 1994).
• Composition of regular relations can model rules
• It can also model affixation
• It doesn’t matter if you describe affixation as lexical-incremental or inferential-realizational

Morphomic requirements (Aronoff, 1994)
Latin 3rd stem

So?
• 3rd stem is not morphologically uniform:
  – It differs across different verb classes and some verbs have idiosyncratic third stems
• It is not semantically coherent:
  – Forms that require the 3rd stem are a motley crew
• Yet there is clearly a notion of 3rd stem:
  – If you tell me the 3rd stem of a verb, I can tell you how the agentive noun, the supine, the perfect participle … are formed
• 3rd stem has a purely morphological function

3rd stem is just prosodically induced affixation
• Assume we have a transducer T that forms the 3rd stem of a verb:
  – Of course, T will have to allow for a lot of idiosyncratic changes
Σ* >3st:ε Σ*

Summary so far
• Most or all morphological operations can be handled with composition
• We wish to show next that this fact, along with general properties of regular languages and relations, allows us to dispense with distinctions between morphological theories.

Return to Stump (2001)
• In Roark & Sproat (2006) we reanalyze Stump’s analyses of:
  – Sanskrit nominal declensions
  – Swahili verbal declensions
  – Breton double plurals
• All of which purport to show the need for a realizational-inferential account.
• Here we will consider:
  – A simple example from Beard & Volpe’s analysis of English agentive nominals
  – A quick overview of the Sanskrit case.

English Agentive Nominals (cf. Beard & Volpe, 2005)
• read-er, stand-ee, correspond-ent, record-ist, cook
• ε → ent / [+ent][+noun, +agentive] Σ* __ $
• Call the set of all agentive rules R
• We can define a new ‘metarule’ R′ that is the union of all rules in R.

Feature [+noun, +agentive]
• Presumably this is also introduced by rule: call this rule M
• Then given a base B, the base with that feature specification added is given by B ∘ M
• Then the appropriate suffixed form is given by [B ∘ M] ∘ R′
• But this can be written, by associativity, as B ∘ [M ∘ R′]
• Finally, [M ∘ R′] can be precomposed; call this R′′
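The regrouping can be illustrated with a toy sketch in which relations are Python functions from a string to a set of strings. Everything concrete here is an assumption for illustration: the bracketed string encoding of diacritics and features, the four-suffix rule set, and the helper names. Zero-derived forms such as cook are not modeled, and agreement on one input is a demonstration, not a proof of associativity.

```python
import re

FEATURE = "[+noun,+agentive]"
SUFFIXES = ["er", "ee", "ent", "ist"]  # toy stand-in for the rule set R

def compose(r, s):
    """R ∘ S for relations given as functions from a string to a set of strings."""
    return lambda x: {z for y in r(x) for z in s(y)}

def union(*rels):
    """Union of relations: collect the outputs of every member."""
    return lambda x: set().union(*(r(x) for r in rels))

def base(word):
    """B: identity relation on a single base string."""
    return lambda x: {x} if x == word else set()

def M(x):
    """Insert the feature after the base's class diacritic, e.g. [+ent]."""
    m = re.match(r"(\[\+\w+\])(.*)$", x)
    return {m.group(1) + FEATURE + m.group(2)} if m else set()

def make_rule(suffix):
    """ε → suffix / [+suffix][+noun,+agentive] Σ* __ $"""
    prefix = f"[+{suffix}]" + FEATURE
    return lambda x: {x + suffix} if x.startswith(prefix) else set()

R_prime = union(*(make_rule(s) for s in SUFFIXES))  # R′
R_double_prime = compose(M, R_prime)                # R′′ = M ∘ R′, precomposed

B = base("[+ent]correspond")
left = compose(compose(B, M), R_prime)   # [B ∘ M] ∘ R′
right = compose(B, R_double_prime)       # B ∘ [M ∘ R′]
print(left("[+ent]correspond") == right("[+ent]correspond"))  # True
```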

So what?
• R′′:
  – Introduces the morphosyntactic feature [+noun, +agentive]
  – Introduces the affixal morphology as appropriate to the base
• In short, R′′ encodes a lexical-incremental model of morphology.

Sanskrit declensions

Sanskrit declensions

Issues with Sanskrit
• Nouns have two or three stems
  – strong, middle and (optionally) weakest
• A different series of stem alternations crosscuts this: guna, vrddhi, and zero:
  – “foot”: pād-, pad-, pd-
  – strong stems may be guna or vrddhi
  – middle stems may be zero, or a lexeme-specific stem
  – weakest stems may be zero or a lexeme-specific stem

Sanskrit declensions
[Figure labels: guna, zero]

Sanskrit declensions
[Figure labels: vrddhi, lexeme-class particular]

Further issues
• Stump argues for the Indexing Autonomy Hypothesis:
  – A stem’s index is independent of the form used for the stem
  – Sanskrit nominal declensions are morphomic in Aronoff’s sense
• Also involved are rules of referral whereby a particular form is systematically used to represent more than one slot in the paradigm.
  – For example, in Latin the ablative and dative plural in nominal paradigms are identical no matter what form is used for the particular paradigm
• So we have several layers of complexity here, which would seem to make an “item-and-arrangement” approach impossible

Computational analysis

Refactoring
But this is just an item-and-arrangement analysis

Summary
• Theoretical distinctions between different approaches to morphology seem to relate to the issue of how cleanly one can describe a given phenomenon.
• But it is not clear that they relate to important differences in underlying mechanisms.

Why morphological theory?
• Morphology has tended to develop highly articulated theories that are (often) intended to represent the morphological component of some putative ‘language faculty’.
• Need a set of mechanisms to account for complex morphological systems
  – e.g. Sanskrit.
• Need to account for observed universals
  – These might relate to built-in predispositions, but equally well might relate to historical change; cf. Blevins (2004)
• Linguistic phenomena are complex: how can children learn them?
  – Clearly relates to learning mechanisms

Whither morphological theory?
• Assumptions underlying linguistic theory have not changed much in the last 50 years
  – Arguments against statistical learning methods are based on antiquated notions of what statistical methods are capable of
• Meanwhile there have been significant advances in machine learning over the past 10-20 years.
• Some of this has made it into computational linguistics in the form of grammar induction methods (cf. Klein and Manning, 2004; Smith 2006)

Morphological theory redux
• Computational arguments (above) suggest there may not be as much difference between morphological theories as people like to think
• Recent work on induction of morphology suggests that we need to revisit our assumptions.
• Issues of the future will likely be:
  – What historical mechanisms explain the observed patterns across the world’s languages?
  – What general learning mechanisms can account for children’s learning of morphology?