DIFFERENCES IN MORPHOSYNTACTIC ANALYTICITY BETWEEN LANGUAGE VARIETIES
Helle Metslang, Külli Habicht, Pärtel Lippus, Karl Pajusalu
51st Annual Meeting of the Societas Linguistica Europaea, Tallinn, 29.8–1.9.2018
WS “Circum-Baltic languages: varieties, comparisons, and change”
2 Outline
• Variation in analyticity in languages and registers
• Data
• Analytic-synthetic pairs in Estonian
• Case 1: phrasal verb – simple verb
• Case 2: adpositional phrase – case form
• Conclusions
3 Variation in analyticity in languages and registers
• Historically, Estonian is an agglutinative Finno-Ugric language, but it shows stronger tendencies toward analyticity than, e.g., Finnish (Grünthal 2000).
• Estonian tends to choose analytic expressions where Finnish prefers synthetic ones (Metslang 1994; a comparison of the standard languages based on fiction texts).
• The degree of analyticity may differ between language varieties.
• Analyticity brings greater transparency and is therefore preferred in spoken language and by non-native speakers (Haspelmath, Michaelis 2017).
• In written communication, economy trumps transparency, so syntheticity is preferred.
• The degree of analyticity may be influenced by language contacts and language planning.
4 Variation in analyticity in registers
• Research questions:
  • How can language varieties be compared with regard to analyticity/syntheticity?
  • Estonian example: how does analyticity differ between registers?
5 Measuring analyticity and syntheticity
• Measuring separately: analyticity index, syntheticity index (Szmrecsanyi 2012, Siegel et al. 2014)
• Measuring together: reflects the choice between two alternative ways of expression
• Indicators: pairs of synonymous expressions that differ in analyticity/syntheticity
  1) Quantitative analysis: compare the normalized frequencies of the analytic expressions in corpus material
  2) Qualitative analysis: compare the usage of typical analytic/synthetic pairs in different varieties
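The quantitative comparison rests on normalizing raw hit counts by corpus size. A minimal sketch in Python, assuming the usual definition (hits divided by corpus size in words, times 10 000). The function name `per_10k` and the counts below are hypothetical and are not taken from the study:

```python
def per_10k(hits: int, corpus_size: int) -> float:
    """Normalized frequency: occurrences per 10 000 words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return hits * 10_000 / corpus_size


# Hypothetical raw counts and corpus sizes (in words), for illustration only.
toy_counts = {
    "corpus_A": (150, 1_000_000),
    "corpus_B": (40, 400_000),
}

for name, (hits, size) in toy_counts.items():
    print(f"{name}: {per_10k(hits, size):.2f} per 10 000 words")
```

Normalizing in this way is what makes corpora of very different sizes comparable on one scale.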
6 Background. Periods of standard language development
1. Pre-standardization (until the last quarter of the 19th century) – development of standard Estonian. Bible translations as the source of the standard language; considerable variation in use between different authors and regions.
2. Standardization (until World War II) – conscious development of the standard by native speakers, popularization of the standard.
3. Post-standardization (to the present day) – ensuring the functioning of the standard language with many registers, emergence of internet language, study of spoken language, weakening of standard language norms starting in the 1960s.
7 Studied varieties
• Written language of non-native speakers
  • 17th–18th century written Estonian, developed by Germans
  • 21st century L2 Estonian interlanguage
• Modern standard language
  • Newspaper texts
  • Fiction texts
  • Academic texts
• Internet language (texts on websites)
• Spontaneous spoken language
8 Studied phenomena
• 6 verbs:
  (ära) hävitama ‘destroy’
  (ära) keelama ‘forbid’
  (ära) kustutama ‘erase’
  (ära) lõpetama ‘finish’
  (ära) unustama ‘forget’
  (ära) varastama ‘steal’
• 6 adessive – noun + adp peal pairs:
  merel – mere peal ‘on the sea’
  mäel – mäe peal ‘on the mountain’
  laeval – laeva peal ‘on the ship’
  laual – laua peal ‘on the table’
  toolil – tooli peal ‘on the chair’
  turul – turu peal ‘at the market’
9 Data: Estonian language corpora
• OLE = Corpus of Old Literary Estonian. vakk.ut.ee, doi:10.15155/TY.0005
• PC = Phonetic Corpus of Estonian Spontaneous Speech. https://www.keel.ut.ee/en/languages-resources/phonetic-corpus-estonian-spontaneous-speech
• ERC = Estonian Reference Corpus. http://www.cl.ut.ee/korpused/segakorpus/index.php?lang=en
• EtTen. www.keeleveeb.ee/dict/corpus/ettenten
• Estonian Interlanguage Corpus. http://evkk.tlu.ee/?language=en
10 Amount of data in corpora
OLE (Old Literary Estonian): ~1 million
PC (spontaneous spoken language): ~395 000
NEWS (standard written Estonian, newspapers): 5 million
FICT (standard written Estonian, fiction texts): 5 million
RES (standard written Estonian, academic texts): 5 million
WEB (Internet language): ~315 million
L2 (interlanguage of L2 learners): ~844 000
11 Analytic-synthetic pairs 1: phrasal and simple verbs
Phrasal verbs with the perfective particle ära vs. perfectivity marking via object case and/or the semantics of the verb
(1a, b) Peeter koristas köögi (ära)
        Peeter tidy:PST kitchen.GEN (PP)
        ‘Peeter tidied up the kitchen’
(1c)    Peeter koristas kööki
        Peeter tidy:PST kitchen.PART
        ‘Peeter was tidying the kitchen’
12 Analytic-synthetic pairs 2: phrases with the postposition peal ‘on’ vs. the adessive case
(2a) Arvuti on laua peal
     computer is table.GEN on
     ‘The computer is on the table’
(2b) Arvuti on laua-l
     computer is table-AD
     ‘The computer is on the table’
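One way to obtain raw counts for such a pair is a plain word-form search. The sketch below is illustrative only: it assumes unannotated text and exact matching of adjacent word forms, which is not necessarily how the corpora in this study were queried. `count_variants` is a hypothetical helper, the pair list repeats the six adessive/peal pairs named among the studied phenomena, and the sample text reuses example sentences from the slides:

```python
import re
from typing import Tuple

# Synthetic (adessive) form -> analytic (genitive + peal) phrase.
PAIRS = {
    "merel": "mere peal",
    "mäel": "mäe peal",
    "laeval": "laeva peal",
    "laual": "laua peal",
    "toolil": "tooli peal",
    "turul": "turu peal",
}


def count_variants(text: str, synthetic: str, analytic: str) -> Tuple[int, int]:
    """Return (synthetic_hits, analytic_hits) using whole-word matching."""
    lowered = text.lower()
    syn = len(re.findall(rf"\b{re.escape(synthetic)}\b", lowered))
    ana = len(re.findall(rf"\b{re.escape(analytic)}\b", lowered))
    return syn, ana


sample = "Arvuti on laua peal. Arvuti on laual. Laua peal on minu arvuti."
for synthetic, analytic in PAIRS.items():
    syn, ana = count_variants(sample, synthetic, analytic)
    if syn or ana:
        print(f"{synthetic}: {syn}, {analytic}: {ana}")
```

A surface search like this cannot tell apart homonymous forms or non-locative uses, so counts from real corpora would normally need morphological annotation or manual checking.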
13 Factors influencing analyticity/syntheticity
• Greater transparency of analytic expressions (spoken language, language of non-native speakers)
• In speech, analytic expressions determine the sentence rhythm
• Imitation of spoken interaction in text (fiction)
• Analyticity of German (native language of the developers and users of Old Literary Estonian)
• Favoring of synthetic expressions by native Estonian language planners in the 20th century (Modern Estonian, especially edited academic and newspaper texts)
• Syntheticity is encouraged by the information density of academic texts
• Combination of written and spoken language features in internet language; competition between economy and transparency
14 Hypotheses
We hypothesize that analyticity is greater
• in spoken language
• in the language of non-native speakers (OLE, L2)
than
• in modern written language (especially in academic texts).
Between these extremes, we should find
• internet language
• language of fiction
15 Comparison of the normalized frequencies of the analytic expressions in corpus material 1
[Bar chart: Verb + ära, normalized frequencies per 10 000 words, by corpus: OLE, PC, NEWS, FICT, RES, WEB, L2. Data labels, in descending order: 52.45, 31.53, 21.61, 12.18, 8.28, 6.58, 3.77.]
16 Comparison of the normalized frequencies of the analytic expressions in corpus material 2
[Bar chart: Postposition peal, normalized frequencies per 10 000 words, by corpus: OLE, PC, NEWS, FICT, RES, WEB, L2. Data labels: 21.19, 9.01, 4.25, 3.39, 0.92, 0.74, 1.87.]
17 Conclusions from quantitative analysis
• In descending order of analyticity, based on both linguistic phenomena examined:
  1) Old Literary Estonian
  2) Spoken language
  3) Fiction texts
  4) Web texts
  5) Newspaper texts
  6) Academic texts
• Modern L2 texts show a difference: the postposition peal is used frequently, while the particle ära is not. Possible explanations:
  • The construction N + peal is a synonymous alternative to the adessive, and the more transparent alternative is preferred.
  • Ära merely duplicates other ways of expressing perfectivity and is easy to omit.
  • The methods of standard language instruction may also have an effect, stressing the superiority of synthetic forms and the need to avoid the redundancy of analytic forms.
18 Example of typical variation: ära + V vs. V
ära unustama – unustama ‘forget’
[Stacked bar chart: shares (0–100%) of unustama and unustama ära by corpus: OLE, PC, NEWS, FICT, RES, WEB, L2.]
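A chart of this kind shows, for each corpus, what percentage of all occurrences of the pair is analytic. A minimal sketch of that calculation follows. The helper name `analytic_share` and all counts are hypothetical placeholders, not the study's data:

```python
def analytic_share(analytic: int, synthetic: int) -> float:
    """Percentage of analytic variants among all occurrences of the pair."""
    total = analytic + synthetic
    if total == 0:
        raise ValueError("no occurrences of either variant")
    return 100 * analytic / total


# Hypothetical counts (analytic, synthetic) per variety, for illustration only.
toy_counts = {
    "variety_A": (30, 10),
    "variety_B": (5, 45),
}

for variety, (analytic, synthetic) in toy_counts.items():
    share = analytic_share(analytic, synthetic)
    print(f"{variety}: {share:.1f}% analytic, {100 - share:.1f}% synthetic")
```

Unlike a frequency per 10 000 words, this share does not depend on corpus size or on how often the verb occurs in a variety, so it isolates the choice between the two variants.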
19 ära unustama – unustama: examples
Se v lle vnnutis Eua Iumala Kescku erra, ninck /.../ soÿ sest Puhst, ninck andis Adamille kaas.. (OLE, 1603)
‘Eve forgot God’s command, ate from the tree and gave it to Adam as well’
ma mäletan et ta ütles mulle numbri ka aga ma unustasin selle ära (PC)
‘I remember that he told me the number but I forgot it’
Oli mis oli, unustame ära. (FICT)
‘Be that as it may, let’s forget it.’
Ühtne karjala kirjakeel unustati kohe ja vene keele kõrval sai ametlikuks keeleks taas soome keel. (RES)
‘The common Karelian standard language was immediately forgotten and Finnish once again joined Russian as an official language.’
Ta unustas sõltumatust ja vabadust. (L2)
‘He forgot independence and freedom’
20 What is revealed by the typical variation of a simple verb and a phrasal verb with ära?
• By far the most analytic variety is the translation-based OLE, where phrasal verbs with ära copy the German structure.
• In spontaneous spoken language, analytic and synthetic usage are found in equal proportion.
• Analyticity is greater in fiction and internet language than in newspaper and academic texts. Possible reasons: weaker influence of norms in the former group, greater proximity to the standard language in the latter group.
• Fiction texts feature imitations of spoken interaction; in narrative, perfective sequential events are presented and highlighted by means of perfective particles.
• The use of ära is unexpectedly low in L2 texts. This could be explained by language learners’ desire to use simpler structures and by the methods of standard language instruction.
21 Example of a typical varying phrase: laual vs. laua peal ‘on the table’
[Stacked bar chart: shares (0–100%) of laual and laua peal by corpus: OLE, PC, NEWS, FICT, RES, WEB, L2.]
22 laual – laua peal: examples
Ja sedda roga, mis ta laua peäl, ja ta sullaste honed, ja ta teenride ammetid ja nende rided.. (OLE, 1739)
‘And the food on his table, and his farmhands’ homes, and his servants’ positions and clothing’
avastas et salatikauss on laua peal (PC)
‘discovered that the salad bowl is on the table’
Laua peal on minu arvuti. (L2)
‘My computer is on the table’
Laual vedeleb lohakalt raputatud tuhk (NEWS)
‘There is ash strewn carelessly on the table’
Rulli tainas jahusel laual 1 cm paksuseks. (WEB)
‘Roll the dough on a floury table to 1 cm thickness’
23 What is revealed by the typical variation of the adessive and a postpositional phrase?
• Analyticity is greater in spontaneous spoken language and in OLE. Possible explanations: desire for transparency; in spoken language, also rhythm and redundancy; in OLE, the influence of German analytic constructions.
• L2 texts show greater analyticity than L1 standard language. Analyticity is motivated by the desire for transparency as well as by overgeneralization from a more analytic L1.
24 What is revealed by the typical variation of the adessive and a postpositional phrase?
• Modern newspaper, fiction, and academic texts are dominated by the synthetic forms favored in standard language. This reflects the recommendations of language planners and the desire for information density. On the different motivations, see also Klavan 2012: 253–257; Klavan, Kesküla, Ojava 2011.
• Internet language shows somewhat greater analyticity than newspaper, fiction, and academic texts. This confirms the placement of internet language as a modern variety between spontaneous spoken language and standard language.
25 Conclusions 1
• The preliminary results demonstrate the efficacy of the method and mainly support the hypotheses.
• The degree of analyticity of different language varieties differs according to various factors, the most important of which are
  • the drive for transparency and simplicity,
  • the influence of contact languages,
  • methods of standard language instruction, and
  • language planning attitudes.
26 Conclusions 2
• The results of the quantitative and qualitative comparisons agree; therefore we can confidently identify a relationship between language variety and the degree of analyticity.
• The language varieties examined behave similarly with regard to both indicator phenomena and in both the quantitative and the qualitative analysis:
  1) greater analyticity (Old Literary Estonian, spontaneous spoken language),
  2) medium analyticity (fiction and web texts),
  3) low analyticity (newspaper and academic texts).
• The analyticity of L2 texts differs between the two indicator phenomena examined, reflecting the strategic choices made by language learners.
27 Conclusions 3
• The method is suitable for comparing degrees of analyticity across registers in one or multiple languages, provided the languages have alternative pairs of analytic and synthetic expressions, e.g. for comparison of Finnic languages across varieties.
• Our study demonstrates the importance of considering cross-register differences in evaluating the degree of analyticity in languages.
28 References
• Grünthal, Riho (2000), Typological characteristics of the Finnic languages: A reappraisal, in J. Laakso (ed.), Facing Finnic: Some Challenges to Historical and Contact Linguistics, Helsinki: Finno-Ugrian Society, 31–63.
• Haspelmath, Martin & Susanne Maria Michaelis (2017), Analytic and synthetic. Typological change in varieties of European languages, in I. Buchstaller and B. Siebenhaar (eds.), Language Variation – European Perspectives VI: Selected Papers from the 8th International Conference on Language Variation in Europe (ICLaVE 8), Leipzig, 2015, Amsterdam: Benjamins, 3–22.
• Klavan, Jane (2012), Evidence in linguistics: corpus-linguistic and experimental methods for studying grammatical synonymy (Dissertationes linguisticae universitatis Tartuensis 15), Tartu: University of Tartu Press.
• Klavan, Jane, Kaisa Kesküla & Laura Ojava (2011), The division of labour between synonymous locative cases and adpositions: the Estonian adessive and the adposition peal ‘on’, in S. Kittilä, K. Västi, and J. Ylikoski (eds.), Studies on Case, Animacy and Semantic Roles, Amsterdam: Benjamins, 111–134.
• Metslang, Helle (1994), Temporal Relations in the Predicate and the Grammatical System of Estonian and Finnish, Oulu: University of Oulu.
• Metslang, Helle (2001), On the developments of the Estonian aspect: The verbal particle ära, in Ö. Dahl and M. Koptjevskaja-Tamm (eds.), Circum-Baltic Languages, vol. 2: Grammar and Typology (Studies in Language Companion Series 55), Amsterdam: Benjamins, 443–479.
• Siegel, Jeff, Benedikt Szmrecsanyi & Bernd Kortmann (2014), Measuring analyticity and syntheticity in creoles, Journal of Pidgin and Creole Languages 29:1, 49–85.
• Szmrecsanyi, Benedikt (2012), Analyticity and syntheticity in the history of English, in T. Nevalainen and E. Closs Traugott (eds.), The Oxford Handbook of the History of English, Oxford: Oxford University Press, 654–665.