Automatic Methods to Detect the Compositionality of Multiwords

Automatic Methods to Detect the Compositionality of Multiwords
Diana McCarthy, 10/6/2020
Keywords on the title slide: collocations, idioms, (non-)compositionality, pragmatic, semantic, syntactic

Outline
1. What we want to cover
2. Why we do it
3. A survey of current methods
4. Approaches to evaluation
5. Comparison of some of the results
6. Conclusions
7. Directions for the future
Automatic Methods for Detecting Compositionality, Diana McCarthy, 10/6/2020

Compositionality, non-compositionality and decomposability
• Compositionality: the meaning of the phrase is a function of the meaning of the parts
• Non-compositionality: the meaning of the phrase is not a function of the meaning of the parts
• Decomposability: the meaning of the phrase can be ascribed to its parts
  - Idiosyncratic: spill the beans, let the cat out of the bag
  - Simple: traffic light, car park

Correlation (or confusion) of compositionality:
• with productivity, e.g. frying pan, car park, one brick short of a load, one slice short of a loaf, one pear short of a fruit salad
• with statistical frequency of occurrence

Motivation
• Any requirement for semantic interpretation will require handling of non-compositional multiwords in order to arrive at the correct interpretation, e.g. “She kicked the bucket”
• Associated syntactic behaviour is needed for parsing, e.g. “blow up the houses of parliament”
• Important for lexical acquisition, e.g. “eat hot dog”
• Associated non-productive and syntactic behaviour is important for generation, e.g. “wine and dine”

Methods: the main categories
• Statistical: p(see, red) / (p(see) p(red))
• Translations: see red <-> aberrear
• Dictionaries: listings, semantic codes and semantic relationships
• Substitutions: see red, see yellow, see blue
• Distributional: see: look, perceive, gaze…; red: yellow, orange, blue…

Statistical Methods
• Statistical measures, e.g. pointwise mutual information: Venkatapathy and Joshi (2006), useful for alignment
• Syntactic flexibility: Fazly and Stevenson (2006) (verb+noun compounds); idiomatic nature reflected in passivization, determiner type and pluralization
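A minimal sketch of the pointwise mutual information score mentioned above, computed from raw corpus counts as log2 of p(x, y) / (p(x) p(y)). The function name `pmi` and the toy counts are assumptions made for illustration; they are not figures from any of the cited papers.

```python
import math


def pmi(pair_count, x_count, y_count, total):
    """Pointwise mutual information, log2( p(x, y) / (p(x) * p(y)) ), from raw counts."""
    p_xy = pair_count / total
    p_x = x_count / total
    p_y = y_count / total
    return math.log2(p_xy / (p_x * p_y))


# Invented counts: the pair occurs 20 times, its parts 1000 and 500 times,
# in a corpus of one million observations.
score = pmi(pair_count=20, x_count=1000, y_count=500, total=1_000_000)
```

A score near zero means the pair co-occurs about as often as independence would predict; large positive scores indicate a strong association, which is related to, but not the same as, non-compositionality.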

Translations
• Melamed (1997): "non-compositional compounds"; statistical comparison of translation models i) with concatenated words, ii) with separate words
• Mukerjee et al. (2006): Hindi-English parallel corpora used for detecting Hindi complex predicates
• Venkatapathy and Joshi (2006): compositionality (PMI) used for alignment
• Translations from one ↔ many are not necessarily non-compositional, e.g. swimming pool (piscine), video tape (video). Nevertheless, very useful for finding collocations for a language pair
• Villada Moirón and Tiedemann (2006): diversity of translations for an expression; overlap between the meaning of the expression's translations and those of its component words

Substitution Methods
• Pearce (2001): anti-collocations using WordNet synonyms (baggage, luggage), e.g. “emotional baggage” vs “emotional luggage”
• Lin (1999): PMI; 95% significant difference between phrase and phrase with close substitute. Close substitutes found from an automatically generated thesaurus (Lin, 1998), e.g. see: gaze, look, perceive…
• Lexical fixedness, Fazly and Stevenson (2006) (verb+noun compounds): as Lin (1999), but using the difference between the PMI of the target and the average PMI of the set of substitutes
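The lexical fixedness idea as worded on this slide, the target's PMI compared with the average PMI of its substitution variants, can be sketched as below. This follows the slide's description only; the published measure may differ in detail (for example in how the difference is normalised). The function name and the PMI values are hypothetical.

```python
from statistics import mean


def fixedness_by_pmi_gap(target_pmi, substitute_pmis):
    """Gap between the target's PMI and the mean PMI of its substitution variants."""
    if not substitute_pmis:
        raise ValueError("need at least one substitute")
    return target_pmi - mean(substitute_pmis)


# Invented PMI values: the target phrase versus three variants in which one
# word has been replaced by a near-synonym.
gap = fixedness_by_pmi_gap(6.0, [1.0, 2.0, 3.0])
```

A large positive gap suggests the phrase resists substitution, which is evidence of lexical fixedness rather than direct evidence of non-compositionality.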

Dictionary methods
• Hashimoto et al. (2006): recognition of idiomatic tokens in a Japanese corpus using syntactic evidence and information in an idiom dictionary
• Baldwin et al. (2003): using hierarchical information in WordNet to model decomposability for evaluation
• Piao et al. (2006): lexical resource (Lancaster Semantic Lexicon) used to compare the meaning of a listed multiword to that of its component words; semantic distance measured using the semantic tags given in the lexicon

Substitution Methods Contd…
What is being captured?
• Bannard et al. (2003) and Baldwin et al. (2003) argue that these methods capture non-productivity (simple decomposable collocations)
• NB Pearce (2001) is explicitly targeting collocations rather than compositionality
• Fazly and Stevenson (2006) acknowledge that the relationship between compositionality and lexical fixedness is only partial, but the relationship exists nevertheless

Selectional Preference Models
• Bannard (2002): verb particle data, eat up <object> vs eat <object>; (Li and Abe, 1995) models acquired using corpus data and WordNet
• Current work (McCarthy): “prototypical selectional preference models” acquired using corpus data and an automatically generated thesaurus (Lin, 1998; see later), e.g. drink <object> vs drink tea, throw <object> vs throw light

Distributional Approaches: Latent Semantic Analysis
Contexts of ‘dog’
context   frequency
bark      50
animal    30
food      10
water     5
drink     3
bath      1

Distributional Approaches: Thesaurus creation
Example: dog, hot and “hot dog”
contexts: feed the dog, keep dogs, keep cats, stroke cats, feed the horse
dog: cat, animal, pet, horse …
--------------------------------
contexts: hot water, cold water, hot milk, warm milk, boiling milk, hot weather
hot: cold, warm, boiling, mild …
--------------------------------
contexts: eat the sandwich, eat the hot dog, cook the hot dog, serve the burger
“hot dog”: hamburger, sandwich, pizza

Distributional Approaches
• Schone and Jurafsky (2001): LSA; weighted sum of vectors for component words compared to the vector of the MWE candidate
• Baldwin et al. (2003): decomposability (simple vs non- or idiosyncratic) of noun compounds and verb particle constructions; compared vectors of constituent words in isolation
• Bannard et al. (2003): compare LSA with Lin (1999) on verb particle constructions
• Katz and Giesbrecht (2006): token analysis for 1 example, "ins Wasser fallen"; compare literal and compositional vectors for this example. Type-based experiment with composed vectors where constituent words have occurred in isolation
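The comparison of a composed vector with the vector of the phrase itself can be illustrated with a small sketch. It uses raw context-count vectors and cosine similarity; the LSA dimensionality reduction used in the cited work is deliberately left out, and the counts, equal weights and function names are all assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * v.get(key, 0.0) for key, value in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def weighted_sum(vectors, weights):
    """Compose several sparse vectors into one by a weighted sum."""
    composed = {}
    for vector, weight in zip(vectors, weights):
        for key, value in vector.items():
            composed[key] = composed.get(key, 0.0) + weight * value
    return composed


# Invented context counts for the component words and for the phrase.
hot = {"water": 5, "milk": 4, "weather": 2}
dog = {"feed": 6, "keep": 3, "bark": 5}
hot_dog = {"eat": 7, "cook": 3, "serve": 2}

composed = weighted_sum([hot, dog], [0.5, 0.5])
similarity = cosine(composed, hot_dog)
```

With these toy counts the phrase shares no contexts with its parts, so the similarity is zero, the pattern one would expect for a non-compositional phrase. Real data would give graded values.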

Distributional Methods
• McCarthy et al. (2003) look at the overlap of similar words (neighbours) in a distributional thesaurus for a verb, e.g. climb, compared to a verb and particle construction, e.g. climb down
• Example neighbours shown on the slide: clamber up, climb up, slither down, walk down, creep down, walk, jump, go up
• Various other measures, including the number of neighbours in the phrasal set with the same particle (minus the number having the same particle in the simplex verb neighbours)
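A rough sketch of the two neighbour-based measures described above: the overlap between the top X neighbours of the simplex verb and of the phrasal verb, and the count of phrasal-verb neighbours sharing the particle minus the same count for the simplex verb. The neighbour lists are invented (built from words on the slide), and the function names are hypothetical; the published measures may differ in detail.

```python
def overlap(simplex_neighbours, phrasal_neighbours, x):
    """Number of neighbours shared by the top-x lists of the verb and the phrasal verb."""
    return len(set(simplex_neighbours[:x]) & set(phrasal_neighbours[:x]))


def count_with_particle(neighbours, particle, x):
    """How many of the top-x neighbours are multiword entries ending in the particle."""
    count = 0
    for entry in neighbours[:x]:
        words = entry.split()
        if len(words) > 1 and words[-1] == particle:
            count += 1
    return count


def same_particle_minus_simplex(simplex_neighbours, phrasal_neighbours, particle, x):
    """Same-particle neighbours of the phrasal verb minus those of the simplex verb."""
    return (count_with_particle(phrasal_neighbours, particle, x)
            - count_with_particle(simplex_neighbours, particle, x))


# Invented ranked neighbour lists for "climb" and "climb down".
climb = ["walk", "jump", "clamber up", "climb up", "go up"]
climb_down = ["slither down", "walk down", "creep down", "climb up", "walk"]

shared = overlap(climb, climb_down, 5)
particle_score = same_particle_minus_simplex(climb, climb_down, "down", 5)
```

Higher values of either measure are read as evidence that the phrasal verb behaves like a compositional combination of verb and particle.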

Combining approaches
Venkatapathy and Joshi (2005)
1. frequency
2. PMI
3. substitution based on Lin (1999)
4. distributed frequency of object
5. distributed frequency of object with dissimilar verbs
6. LSA similarity of V-O with verbal form of O
7. LSA dissimilarity of V-O with V
All combined with SVM ranking

Method: Selectional Preferences using distributional thesaurus (McCarthy)
• Is the argument prototypical for this predicate and argument relationship? E.g. eat my hat
• Like substitution methods, but not explicitly looking for a substitute
• Verb + direct objects, e.g. eat {meal 5, dinner 5, tea 6, lunch 10, food 6, sandwich 3, duck 1, cheese 2, hat 3}
food: sandwich, cheese, meat, duck …
----------------
meal: dinner, lunch, tea, supper …
----------------
clothing: shirt, belt, hat, trousers …

Methods for evaluation: token based
- Hashimoto et al. (2006): 300 example sentences of 100 idioms; information from dictionary for discrimination
- Katz and Giesbrecht (2006): 67 occurrences of 1 idiom (ins Wasser fallen); literal and idiomatic readings have orthogonal LSA vectors; individual token vectors are compared to these

Methods for evaluation: type based
• Dictionary - Schone and Jurafsky (2001), Fazly and Stevenson (2006)
• Using is-a links (hyponymy) - Baldwin et al. (2003), WordNet
• Manual verification - Lin (1999)
• Web as validation - Villavicencio (2005), Hayes et al. (2005)
• Compositionality judgements
  - Contribution from constituents (Bannard, 2002), (Bannard et al., 2003)
  - Along a continuum (McCarthy et al., 2003), (Venkatapathy and Joshi, 2005)

Some results: Compositionality Judgements on a Continuum
• McCarthy et al. (2003): 111 phrasal verb versus verb constructions (scale 0-10), e.g. carry out, cloud over, climb up; 3 native English speakers; highly significant Kendall coefficient of concordance
• Venkatapathy and Joshi (2005): 765 verb object pairs (scale 1-6), e.g. change hands, take interest, announce plan; 2 fluent English speakers; Spearman's rank correlation coefficient; good level of agreement

Results: McCarthy et al. datasets
measure    X     rs      Z score   p under H0
Overlap    30    0.166   1.74      0.04
Overlap    50    0.136   1.43      0.08
OverlapS   30    0.306   3.21      <0.0007
OverlapS   50    0.303   3.18      <0.0007
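The Z scores reported here appear consistent with the large-sample approximation Z = rs * sqrt(n - 1) for n = 111 items; for example 0.166 * sqrt(110) is about 1.74. That reading is an inference from the numbers, not something the slides state. The sketch below computes Spearman's rs (with average ranks for ties) and that Z score; the function names are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def z_score(rs, n):
    """Large-sample normal approximation for testing rs against zero."""
    return rs * math.sqrt(n - 1)
```

For instance, `z_score(0.306, 111)` comes out close to the 3.21 shown in the table.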

Results: McCarthy et al. datasets, X = 500
measure                                         statistic      Z score   p under H0
sameparticle                                    rs = 0.414     4.34      <0.00003
sameparticle-simplex                            rs = 0.49      5.17      <0.00003
simplexasneighbour                              Mann Whitney   0.950     0.171
simplexrank                                     rs = -0.115    -1.21     0.113
simplexscore                                    rs = 0.052     0.54      0.295
Piao et al. (2006), semantic lexicon (79/116)   rs = 0.354               0.001357

Correlation of McCarthy et al. (2003) human rankings with statistics and dictionaries
measure         statistic      Z score   p under H0
LLR             rs = -0.168    -1.76     0.0392
χ2              rs = -0.213    -2.22     0.0139
MI              rs = -0.248    -2.60     0.0047
Phrasal freq    rs = -0.096    -1.01     0.156
Simplex freq    rs = 0.092     0.96      0.169
WordNet         Mann Whitney   2.39      0.0084
ANLT phrasals   Mann Whitney   3.03      0.0012

Correlation of measures with man-made resources (Mann Whitney Z scores)
measure                In WordNet   In ANLT phrasals
PMI                    -2.61        -4.53
sameparticle-simplex   3.71         4.59

Results with Venkatapathy and Joshi (2005) dataset
feature                                                   correlation
1) Frequency (BNC)                                        .129
2) PMI                                                    .203
3) Distributed frequency of object                        .111
4) Distributed frequency of object with dissimilar verbs  .139
5) LSA dissimilarity of V-O with V                        .139
6) LSA similarity of V-O with verbal form of O            .300
7) ~ Lin (1999) substitution                              .210
Ranking SVM function (using 1-7)                          .448
McCarthy 1/pref score (638/765)                           -.403

Conclusions
• Purpose of task should match method and evaluation
• Evaluation is tricky
• Decisions are not clear cut
• Statistical measures and substitution methods may be useful, though they capture behaviour that correlates with compositionality
• Distributional approaches are promising for languages without resources
• Selectional preferences may add useful information, alongside other measures

Future
• Address tokens as well as types
• Tokens on a continuum
• Error analysis
• Separating non-decomposable from idiosyncratically decomposable
• Detecting what multiwords mean; distributional approaches might be promising in this respect, e.g. kick the bucket --- die
• Share datasets!!!

References
Baldwin, Timothy, Colin Bannard, Takaaki Tanaka and Dominic Widdows (2003) An Empirical Model of Multiword Expression Decomposability. In Proceedings of the ACL Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, Sapporo, Japan, pp. 89–96.
Bannard, Colin (2002) Statistical Techniques for Automatically Inferring the Semantics of Verb-Particle Constructions. LinGO Working Paper No. 2002-06. http://lingo.stanford.edu/pubs/WP-2002-06.pdf
Bannard, Colin, Timothy Baldwin and Alex Lascarides (2003) A Statistical Approach to the Semantics of Verb-Particles. In Proceedings of the ACL Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, Sapporo, Japan, pp. 65–72.
Fazly, Afsaneh and Suzanne Stevenson (2006) Automatically Constructing a Lexicon of Verb Phrase Idiomatic Combinations. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Trento, Italy, pp. 337–344.
Hashimoto, Chikara, Satoshi Sato and Takehito Utsuro (2006) Japanese Idiom Recognition: Drawing a Line between Literal and Idiomatic Meanings. In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, Sydney, Australia, pp. 353–360.
Hayes, Jer, Nuno Seco and Tony Veale (2005) Creative Discovery in the Lexical Validation Gap. Computer Speech and Language, 19(4): 513–523.
Katz, Graham and Eugenie Giesbrecht (2006) Automatic Identification of Non-Compositional Multi-Word Expressions using Latent Semantic Analysis. In Proceedings of the ACL Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties, Sydney, Australia.
Lin, Dekang (1998) Automatic Retrieval and Clustering of Similar Words. In Proceedings of the 17th International Conference on Computational Linguistics and the 36th Annual Meeting of the Association for Computational Linguistics, Montreal, Canada.
Lin, Dekang (1999) Automatic Identification of Non-Compositional Phrases. In Proceedings of ACL-99, University of Maryland, College Park, Maryland, pp. 317–324.
Melamed, I. Dan (1997) Automatic Discovery of Non-Compositional Compounds in Parallel Data. In Proceedings of the 2nd Conference on Empirical Methods in Natural Language Processing (EMNLP), Providence, RI.

References continued
McCarthy, Diana, Bill Keller and John Carroll (2003) Detecting a Continuum of Compositionality in Phrasal Verbs. In Proceedings of the ACL-SIGLEX Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, Sapporo, Japan.
Mukerjee, Amitabha, Ankit Soni and Achla M Raina (2006) Detecting Complex Predicates in Hindi using POS Projection across Parallel Corpora. In Proceedings of the ACL Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties, Sydney, Australia, pp. 28–35.
Pearce, Darren (2001) Synonymy in Collocation Extraction. In WordNet and Other Lexical Resources: Applications, Extensions and Customizations (NAACL 2001 Workshop), Carnegie Mellon University, Pittsburgh, June 2001, pp. 41–46.
Piao, Scott S. L., Paul Rayson, Olga Mudraya, Andrew Wilson and Roger Garside (2006) Measuring MWE Compositionality Using Semantic Annotation. In Proceedings of the ACL Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties, Sydney, Australia, pp. 28–35.
Schone, Patrick and Daniel Jurafsky (2001) Is Knowledge-Free Induction of Multiword Unit Dictionary Headwords a Solved Problem? In Proceedings of Empirical Methods in Natural Language Processing, Pittsburgh, PA.
Venkatapathy, Sriram and Aravind K. Joshi (2005) Measuring the Relative Compositionality of Verb-Noun (V-N) Collocations by Integrating Features. In Proceedings of HLT/EMNLP, Vancouver.
Villada Moirón, Begoña and Joerg Tiedemann (2006) Identifying Idiomatic Expressions using Automatic Word-Alignment. In Proceedings of the EACL Workshop on Multiword Expressions in a Multilingual Context, Trento, Italy.
Villavicencio, A. (2005) The Availability of Verb-Particle Constructions in Lexical Resources: How Much is Enough? Computer Speech and Language, 19(4).