Frequency dictionary of Belarusian borrowings in the Belarusian variety of the Russian language
Olga Goritskaya & Mikita Suprunchuk
Minsk State Linguistic University (Belarus)
SlaviCorp 2018 (24–26 September 2018, Charles University, Prague)

Outline
• A very short introduction to the sociolinguistic situation in Belarus and the Belarusian-Russian language continuum.
• Corpora that can be used to study the Belarusian variety of the Russian language.
• Frequency dictionary of Belarusian borrowings in Belarusian Russian: stages of creation and problems.

Languages in Belarus

Census in Belarus (2009): Belarusian, Russian, other (ethnic Belarusians: 84% of the population)
• the Belarusian language: symbolic function (national identity)
• the Russian language: means of communication

Language continuum in Belarus (from Russian, R, to Belarusian, B)
• Standard Russian (norms of Russian are codified only in Russia)
• Belarusian Russian (variety in formation)
• Belarusian-Russian mixed speech (so-called Trasjanka, ‘a blend of hay and straw’)
• Russified Belarusian (Belarusian with Russian influence)
• Standard Belarusian (two standards)

Russian-speaking Belarusians sometimes use Belarusian words
‘He thinks that he speaks Russian’ + examples of contact-induced phenomena (not all of them are true)

Frequency dictionary of Belarusian borrowings in the Belarusian variety of the Russian language
• Work in progress
• Corpus-based (other studies of Belarusian Russian are mostly based on introspection and small amounts of data)
• For research purposes, part of a project on lexical and grammatical variation in Belarusian Russian

Corpora

Corpora of the varieties of Russian

Subcorpus of regional and foreign press in the Russian National Corpus (ruscorpora.ru/searchregional.html)
• 13.2 M words
• Belarus (Grodno region): 2.7 M words (some texts are in Belarusian)

Integrum database (integrum.ru)
• > 5000 newspapers, magazines and online media, > 500 million documents
• it is not a corpus
• in Belarusian libraries, the Belarusian part of Integrum is not available for free

The General Internet-Corpus of Russian (GICR) (webcorpora.ru)
• 19.7 G words
• rich metadata (author’s place of residence, age, etc.)
• under development now

Araneum Russicum Externum (aranea.juls.savba.sk/guest)
• all countries, incl. Belarus
• Minus: 120 M
• Maius: 1.20 G

The main subcorpus for the dictionary: LiveJournal (blogs)
• 8.7 G words (all countries)
• 21st century (the majority of texts: 2010–2013)
• diverse texts (diary-style text entries and comments, mass media, fiction, etc.), a lot of innovations
• doesn’t represent the entire population (the vast majority of the blog authors were born in 1970–1990)

Belarus: ≈160 million words, ≈1.9% of the LJ subcorpus
[Pie chart: shares of the LJ subcorpus by country: Russia 49%, not available 32%, other countries 11%, Ukraine 7%, Belarus 2%]

Stages
1. Define the word list (borrowings from Belarusian)
2. Find relevant contexts for each word
3. Make the frequency list
4. Analyze the results

Word list

Word list: sources
A list of words typical of the Belarusian variety of Russian:
• our own observations,
• publications by other researchers on this topic,
• “naïve” metalinguistic commentaries,
• “The Language of Russian Cities” dictionary (edited by V. Belikov) and the corresponding Internet forum: http://forum.lingvolive.com/cat/l26.

Word list: operational definition
Belarusian borrowings (loanwords): Belarusian words that are used in Russian speech.
• Some of these words were borrowed from other languages (Polish, German, Yiddish, etc.) into Belarusian.
• Some of these words are shared with Ukrainian.
[Diagram of a borrowing path: German, Polish, Belarusian or Belarusian-Russian mixed speech, Belarusian variety of Russian]

Word list: operational definition
If a word was present in some dictionaries of the Belarusian language and absent from the dictionaries of the Russian language (20th–21st centuries), we added it to the word list.
Some words are present in the dictionaries of both Belarusian and Russian but (can) differ in meaning, stylistic functions, frequency, etc. in the Russian and Belarusian varieties of Russian. These are excluded from the list: хлопец, хлопчык ‘guy, boy’, чарка ‘glass’, дык (discourse marker), etc. They are a topic for further research.

Exception: драник ‘potato pancake’
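The inclusion rule above can be read as a set difference with a short list of manual overrides. The following is only a minimal sketch under that reading: the function name, the `exceptions` parameter and the tiny lemma sets are hypothetical placeholders, not the dictionaries or the procedure the authors actually used.

```python
def candidate_borrowings(belarusian_lemmas, russian_lemmas, exceptions=frozenset()):
    """Return lemmas found in Belarusian dictionaries but not in Russian ones.

    `exceptions` holds words added by hand even though the rule would
    reject them (the slides name one such exception, драник).
    """
    candidates = set(belarusian_lemmas) - set(russian_lemmas)
    candidates |= set(exceptions)
    return sorted(candidates)


# Toy data, invented for illustration only.
belarusian = {"шуфлядка", "бусел", "чарка"}
russian = {"чарка", "драник"}

word_list = candidate_borrowings(belarusian, russian, exceptions={"драник"})
print(word_list)
```

In this toy run чарка drops out because it appears in both sets, which mirrors the exclusion of words shared by Belarusian and Russian dictionaries.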

Debates: Belarusian бульба ‘potato’ vs. Russian бульба (a term in botany and engineering)

“You can’t visit/live in Belarus for long without stumbling across the famous … Bulba is the Belarusian word for potato and is used more often* than the Russian word for potato. It is clearly a term of fond affection more than a simple noun. Belarusians love Bulba - in every way and every form and it is hard to imagine any meal without them!” (etobelarusdetka.com)

* This is not true (O. G. & M. S.)

Collection of relevant contexts

Technical problems
1. There are ≈2500 locations in the LiveJournal subcorpus (Belarus, Minsk, etc.). How can all contexts from Belarus be extracted quickly?
2. The corpus is called the General Internet-Corpus of Russian, but it contains contexts in Belarusian, Ukrainian and other languages.
➔ Programs to process the data.
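The slides do not show the processing programs themselves. As a rough illustration of the two problems, here is a minimal sketch that keeps only records whose location is in a hand-made set of Belarusian locations and whose text looks Russian. The record fields (`location`, `text`), the location set and the 20% threshold are assumptions made for this example, and the letter-based test is a crude heuristic, not a real language identifier.

```python
import re

# Cyrillic letters used in Belarusian or Ukrainian but not in modern Russian:
# ў, і, ї, є, ґ (written as escapes to avoid look-alike Latin characters).
NON_RUSSIAN_LETTERS = set("\u045e\u0456\u0457\u0454\u0491")
WORD_RE = re.compile(r"[^\W\d_]+")  # runs of Unicode letters


def looks_russian(text, max_foreign_share=0.2):
    """Heuristic: few words contain letters that Russian does not use."""
    words = WORD_RE.findall(text.lower())
    if not words:
        return False
    foreign = sum(1 for w in words if NON_RUSSIAN_LETTERS & set(w))
    return foreign / len(words) <= max_foreign_share


def belarus_russian_contexts(records, belarus_locations):
    """Yield records from Belarusian locations whose text looks Russian."""
    for rec in records:
        if rec["location"] in belarus_locations and looks_russian(rec["text"]):
            yield rec


# Invented sample records.
records = [
    {"location": "Minsk", "text": "Купил драники и бульбу на рынке"},
    {"location": "Minsk", "text": "Я чытаў гэтую кнігу ўчора"},
    {"location": "Moscow", "text": "Сегодня хорошая погода"},
]
kept = list(belarus_russian_contexts(records, {"Belarus", "Minsk"}))
print([r["text"] for r in kept])
```

A share threshold is used instead of rejecting any text with a non-Russian letter, so that a Russian sentence quoting a single word in Belarusian spelling is not discarded.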

Morphology
Belarusian words are not present in the dictionary used for the corpus: TnT-Russian + mystem (7 million forms).
Problems: forms with morphological alternation:
• [lemma="оводень"]: only оводень;
• [word="оводн.*"]: оводни, оводней, оводнями …
➔ Manual input of word forms.
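Manual input of word forms can be turned into a single query by joining the forms into an alternation. The sketch below assumes a CQL-style `[word="..."]` attribute with regular-expression values, as in the queries quoted above. The helper name is hypothetical, the form list contains only the forms mentioned on the slide, and Python 3.7 or later is assumed (older versions of `re.escape` also escape non-ASCII letters).

```python
import re


def cql_for_forms(forms):
    """Build one [word="a|b|c"] query from manually entered word forms."""
    unique = sorted({form.lower() for form in forms})
    # Escape regex metacharacters so that each form matches literally.
    alternation = "|".join(re.escape(form) for form in unique)
    return f'[word="{alternation}"]'


# Forms mentioned on the slide; a full paradigm would have more entries.
query = cql_for_forms(["оводень", "оводни", "оводней", "оводнями"])
print(query)
```

Compared with a prefix pattern such as `оводн.*`, an explicit alternation covers the alternating stem оводень and cannot match unrelated words that merely share the prefix.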

Theoretical problems
• Belarusian and Russian are genetically closely related and structurally very similar languages.
• Where is the border between idioms, e.g. between Belarusian Russian and mixed speech?

No clear border between code-switching and borrowing + integration of borrowings as an ongoing process

Proper names
Official bilingualism, symbolic function of Belarusian, functional dominance of Russian ➔ various Belarusian proper names are used in Russian speech

Names containing only Belarusian words: Бусел ‘stork’ (one of the Belarusian symbols)

Multi-word names (Belarusian + Russian word)

Proper > common nouns

Contexts are diverse
We can include various contexts in our sample, but mark them rather than discard them:
• metalinguistic contexts,
• proper names (which types?),
• phraseology, etc.
[Chart: metalinguistic vs. natural contexts for шуфлядка 'drawer' and ссобойка 'packed lunch for work, school, etc.']

Frequency list

322 words in the list, including 159 nouns
3 columns:
• instances
• texts
• authors
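The three columns can be derived from the collected contexts by counting occurrences, distinct texts and distinct authors per word. This is a minimal sketch: the record fields (`lemma`, `text_id`, `author`) and the sample data are invented for illustration and do not reflect the authors' actual data format.

```python
from collections import defaultdict


def frequency_table(contexts):
    """Return rows (lemma, instances, texts, authors), most frequent first."""
    instances = defaultdict(int)
    texts = defaultdict(set)
    authors = defaultdict(set)
    for ctx in contexts:
        lemma = ctx["lemma"]
        instances[lemma] += 1                # every occurrence counts
        texts[lemma].add(ctx["text_id"])     # distinct texts only
        authors[lemma].add(ctx["author"])    # distinct authors only
    rows = [
        (lemma, instances[lemma], len(texts[lemma]), len(authors[lemma]))
        for lemma in instances
    ]
    # Sort by instances (descending), then alphabetically for stable output.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Invented sample contexts.
contexts = [
    {"lemma": "драник", "text_id": "t1", "author": "a1"},
    {"lemma": "драник", "text_id": "t1", "author": "a1"},
    {"lemma": "драник", "text_id": "t2", "author": "a2"},
    {"lemma": "шуфлядка", "text_id": "t3", "author": "a1"},
]
table = frequency_table(contexts)
for row in table:
    print(row)
```

Keeping texts and authors alongside raw instances helps show whether a high count comes from many writers or from a single prolific one.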

Preliminary analysis of the results

A lot of words (26%) were not found in the corpus (in Russian texts)
[Chart: frequency 0: 42 words; frequency > 0: 117 words (42 of 159 ≈ 26%)]

The majority of words have low frequency
[Chart: number of texts per word; axis from 0 to 700]

The most frequent nouns
• драник 'potato pancake': 697
• буся, буська 'kiss; cute person': 493
• змагар 'oppositionist': 299
• мова 'language (usually Belarusian)': 206
• брама 'gate': 175
• шуфляда, шуфлядка 'drawer': 156
• трасянка 'mixed speech': 147
• бульба 'potato': 127
• спадар 'gentleman, mister, man': 112
• плошча 'square; post-election rally': 92

Plans for the near future
• Conduct an in-depth analysis of the results (semantics + pragmatics, frequency of graphic variants, etc.).
• Compare the frequency of the Belarusian borrowings with their standard Russian equivalents (if any) in Belarusian Russian.
• Analyze the words that differ in meaning, stylistic functions, frequency, etc. in the Russian and Belarusian varieties of Russian (or find that the expected difference is not significant).
• Create an online resource about lexical variation in Belarusian Russian.
