ADAM MICKIEWICZ UNIVERSITY IN POZNAŃ, Faculty of English
Extracting neologisms from a corpus using NeoDet
Marta Grochocka
martag@wa.amu.edu.pl

The development of a lexical item (Bauer 1983)
1. Nonce formation
2. Neologism (Fischer 1998)
   1. certain frequency over a certain period of time
   2. distribution in different contexts and domains
3. Institutionalization
4. Lexicalization
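The two criteria attributed to Fischer (frequency over a period of time, distribution across contexts and domains) can be made concrete with a small sketch. It is only an illustration: the function name usage_profile, the (word, month, domain) record layout, and the sample records are all made up for this example and are not taken from the study.

```python
from collections import defaultdict

def usage_profile(occurrences):
    """Summarize each word's total frequency, active months and domains."""
    profile = defaultdict(lambda: {"count": 0, "months": set(), "domains": set()})
    for word, month, domain in occurrences:
        entry = profile[word]
        entry["count"] += 1
        entry["months"].add(month)
        entry["domains"].add(domain)
    return dict(profile)

# Hypothetical records: (word, month of publication, domain of the text).
occurrences = [
    ("retweet", "2009-03", "technology"),
    ("retweet", "2009-11", "politics"),
    ("retweet", "2010-06", "entertainment"),
    ("welectricity", "2010-06", "technology"),
]
profile = usage_profile(occurrences)
for word, stats in profile.items():
    print(word, stats["count"], len(stats["months"]), len(stats["domains"]))
```

A word that recurs across several months and several domains fits the "neologism" stage better than one seen once; where to set the thresholds is left open here.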

Types of neologisms
• formal: a new word, including acronyms and affixes, e.g. PC, e-, -gate (Metcalf 2002)
• syntactic: a new expression or grammatical construction
• semantic: a new meaning of an already existing word
• borrowing

Methodology
Aims of the study: to examine productive morphological processes in English by means of studying formal neologisms
PART 1: Formal classification
PART 2: Semantic classification

Neologism detector tool
Functions:
1. compilation of the study corpus
2. neologism extraction based on the exclusion principle
3. neologism management

Neologism extraction process
Study corpus and exclusion sources → neologism candidates → manual verification → neologism management
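The exclusion principle behind this process can be sketched in a few lines of Python. This is a minimal illustration, not NeoDet's actual code: the function name extract_candidates, the crude tokenizer, and the toy word lists are all assumptions made for the example.

```python
import re
from collections import Counter

def extract_candidates(corpus_text, exclusion_sources):
    """Return word forms (with counts) that occur in no exclusion source."""
    # Crude tokenizer (an assumption): letters, optionally joined by
    # hyphens or apostrophes.
    tokens = re.findall(r"[A-Za-z]+(?:[-'][A-Za-z]+)*", corpus_text)
    counts = Counter(token.lower() for token in tokens)
    # Merge every exclusion source into one set of known words.
    known = set()
    for source in exclusion_sources:
        known.update(word.lower() for word in source)
    # Whatever is left over is a neologism candidate for manual verification.
    return {word: n for word, n in counts.items() if word not in known}

# Toy stand-ins for the study corpus and two exclusion sources.
corpus = "Fans of the retrotastic show said the retrotastic set beat any datablog."
reference_corpus_words = {"fans", "of", "the", "show", "said", "set", "beat", "any"}
dictionary_words = {"fan", "show", "set"}

candidates = extract_candidates(corpus, [reference_corpus_words, dictionary_words])
print(candidates)
```

The output is only a list of candidates; as the process above indicates, each one still has to be verified manually before it counts as a neologism.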

Study corpus size and content
• 14.3 million words
• newspaper articles and blogs published between 1st Jan. 2009 and 26th Oct. 2010
• daily broadsheets: The Daily Telegraph, The Times, The Guardian
• tabloids: The Sun, The Daily Mail
• almost 9,000 neologism candidates analyzed (out of ca. 73,000)
• 121 neologisms extracted (without borrowings)

Exclusion sources
Corpus:
• The British National Corpus (1991-1994)
General dictionaries:
• Oxford Advanced Learner's Dictionary 7th Edition, OALD 7 (2005)
• Merriam-Webster's Collegiate Dictionary 11th Edition, MW 11 (2006)
• Macmillan English Dictionary 2nd Edition, MEDAL 2 (2007)
• Cambridge Advanced Learner's Dictionary 3rd Edition, CALD 3 (2008)
• Chambers 21st Century Dictionary, CH 11 (2008)
Google:
• COBUILD
• Longman Dictionary of Contemporary English 5th Edition, LDOCE 5 (2009)
• Dictionary.com
Slang dictionaries:
• The Oxford Dictionary of New Words (1991)
• The Probert Encyclopaedia of Slang (2004)
• The Concise New Partridge Dictionary of Slang and Unconventional English (2007)
• The Dictionary of Contemporary Slang (2007)
Word lists:
• proper names, geographical names

Neologism candidates analysis

Search engine

Neologism management 1

Neologism management 2

Neologism management 3

Formal classification of neologisms

Blends
• Twitterati (Twitter + glitterati)
• welectricity (wellingtons + electricity)
• retrotastic (retro + fantastic)
• girlicious (girl + delicious)
• Frankenfish (Frankenstein + fish)
• Obamarita (Obama + margarita)
• Holohoax (Holocaust + hoax)
• zeroflation (zero + inflation)
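Each blend above joins the beginning of one source word to the end of another. A short sketch can check that pattern mechanically. The function blend_splits is a hypothetical helper written for this illustration; it is not part of NeoDet, and it only tests the simple "head of the first word plus tail of the second" pattern, ignoring case.

```python
def blend_splits(blend, first, second):
    """List (head, tail) splits where `first` starts with head and `second` ends with tail."""
    blend_l, first_l, second_l = blend.lower(), first.lower(), second.lower()
    splits = []
    # Try every cut point that leaves a non-empty head and a non-empty tail.
    for i in range(1, len(blend_l)):
        head, tail = blend_l[:i], blend_l[i:]
        if first_l.startswith(head) and second_l.endswith(tail):
            splits.append((head, tail))
    return splits

# The blends and source words listed on the slide.
examples = [
    ("Twitterati", "Twitter", "glitterati"),
    ("welectricity", "wellingtons", "electricity"),
    ("retrotastic", "retro", "fantastic"),
    ("girlicious", "girl", "delicious"),
    ("Frankenfish", "Frankenstein", "fish"),
    ("Obamarita", "Obama", "margarita"),
    ("Holohoax", "Holocaust", "hoax"),
    ("zeroflation", "zero", "inflation"),
]
for blend, first, second in examples:
    print(blend, bool(blend_splits(blend, first, second)))
```

Several cut points can be valid for one blend when the source words overlap (Twitterati shares "itter" with both sources), which is why the function returns a list rather than a single split.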

Semantic classification of neologisms

Semantic classification – examples
IT and communications technology: beatblogger, cyber-locker, datablog, Facebooker, gamification, iPad, to liveblog, to retweet
Politics and current affairs: Af-Pak, Muslimist, Obamanomics
Business and finance: infocapitalism, micro-employment, zeroflation
Entertainment: celebdom, fabby, lip-syncher, pet-set, retrotastic
Food and dieting: frankenfish, orthorexic

Problems
• impossible to detect semantic and syntactic neologisms
• alternative spelling, e.g. micro-blog, G & T
• items provided as examples in the exclusion sources not analyzed by NeoDet
• failure of the online exclusion sources to respond to the queries made by NeoDet
• overrepresentation of the Entertainment and News section in the study corpus
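The alternative-spelling problem arises because a form such as micro-blog does not match an exclusion-source entry spelled differently. One possible mitigation, sketched here purely as an assumption and not as something NeoDet does, is to compare forms through a normalized key. The name spelling_key and the normalization rules are hypothetical.

```python
import re

def spelling_key(form):
    """Collapse case, hyphens, spaces and ampersand spacing into one comparison key."""
    key = form.lower()
    key = re.sub(r"\s*&\s*", "&", key)  # "g & t" -> "g&t"
    key = re.sub(r"[-\s]+", "", key)    # "micro-blog", "micro blog" -> "microblog"
    return key

# Variants of the slide's two examples collapse to the same key.
for variant in ["micro-blog", "microblog", "Micro blog", "G & T", "G&T"]:
    print(variant, "->", spelling_key(variant))
```

Such a key trades precision for recall: it would also merge forms that differ meaningfully only by a hyphen or space, so matches would still need manual checking.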

Conclusions
• formal neologisms as indicators of productive word formation processes
• confirmation of the status of affixation and compounding as the most popular methods of extending the lexicon
• blends as an important source of neologisms coined with the purpose of being witty, amusing and memorable
• the largest number of neologisms in the area of IT and communications technology

Thank you!