Russian multimodal corpora
Andrej A. Kibrik (Inst. of Linguistics RAN and MSU)
aakibrik@gmail.com
Multimodality
§ Traditional linguistic approach: language = verbal material
§ Multimodal approach: linguistic communication involves several modes, or channels
§ Apart from the verbal mode, also:
  § non-segmental sound (= prosody)
  § visual mode (= “body language”)
§ These modes are no less important for linguistic communication than the traditional verbal mode
§ “Any use of language is inescapably multimodal” (Scollon 2006)
§ In this talk:
  § I. Corpora annotated for prosody
  § II. Corpora annotated for gesticulation and prosody
I. CORPORA ANNOTATED FOR PROSODY
§ Night Dream Stories
§ Siberian Life Stories
§ Funny Life Stories
Ø This work is currently supported by the Russian Academy of Sciences project “Corpus Linguistics”: http://www.corpling-ran.ru/
Ø This online service has been created: http://mib431.ru/corpus/#
Night Dream Stories
§ Authors: E. A. Korabelnikova, A. A. Kibrik, V. I. Podlesskaya, A. O. Litvinenko, N. A. Korotaev, M. K. Buryakov et al.
§ Goals:
  § Multi-purpose corpus of spoken Russian
  § Comparison of language produced by normal speakers and speakers with neural disorders
§ Discourse type: personal stories
§ Speakers: children and adolescents
§ Setting:
  § When:
    • Recorded in the 1990s
    • 2000-2009: the NDS project
    • 2011: the current stage
  § Where: mostly in a clinic
  § How: immediately after wake-up
Night Dream Stories
§ Composition
  § Audio files
    • Marked for temporal structure
  § Transcripts of three levels of detail: minimal, medium, and full
§ Volume
  § 129 stories
  § Almost 2 hours
    Ø Conservative estimate: transcribing one minute of talk takes an experienced transcriber 5 hours of work
  § 14,000 words
  § 3776 elementary discourse units (EDUs), the basic building blocks of spoken language
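As a rough worked example, the figures on this slide can be combined into a few derived numbers. The sketch below rounds "almost 2 hours" up to 120 minutes, so the effort figure is an upper bound; the variable names are illustrative and not part of the project.

```python
# Figures quoted on the slide.
stories = 129
words = 14_000
edus = 3776
hours_per_audio_minute = 5   # the slide's conservative transcription estimate

# Assumption: "almost 2 hours" is rounded up to 120 minutes (upper bound).
audio_minutes = 120

transcription_hours = audio_minutes * hours_per_audio_minute
words_per_edu = words / edus
edus_per_story = edus / stories

print(f"Transcription effort: up to about {transcription_hours} person-hours")
print(f"Average EDU length: {words_per_edu:.1f} words")
print(f"Average story length: {edus_per_story:.1f} EDUs")
```

Under these assumptions the corpus represents up to roughly 600 person-hours of transcription, with EDUs averaging a little under four words each.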
Night Dream Stories
§ What’s in the transcript?
  § EDUs
  § Temporal dynamics
  § Pauses
  § Disfluencies
  § Accents
  § Tone in accents
  § Illocutionary characteristics
  § Phase
  § Emphasis
  § Reduction
  § Tempo
  § Tonal register
  § General characterization
  § Comments on specific EDUs
  § Etc.
Night Dream Stories
§ Project site
§ Example: 016z
§ Play
§ Three levels of detail in transcript
§ Play by EDU
Night Dream Stories: ELAN annotation
Siberian Life Stories
§ Authors: K. V. Orlova, N. A. Korotaev, V. I. Podlesskaya, A. O. Litvinenko, M. L. Pal’ko, M. L. Buryakov, E. I. Il’yina
§ Differences from the Night Dream Stories corpus
  § Various age groups
  § “Tell me about a remarkable episode in your life”
  § Temporal dynamics was marked up in a more sophisticated way
§ Volume:
  § 17 stories
  § 40 min.
  § 1267 EDUs
Funny Life Stories
§ Authors: A. A. Kibrik, N. Molchanova, T. Sokolova, N. A. Korotaev et al.
§ Goal: resource for comparing written and spoken discourse
§ Differences from the Night Dream Stories corpus
  § Students
  § “Tell me about a funny episode in your life”
  § Next week: “Write down the funny episode”
  § Each story is represented in a spoken (audio + transcript) and a written version
§ Volume:
  § 40 spoken and 40 written stories
  § Spoken: 70 minutes, 2391 EDUs, 7,000 words
  § Written: 10,000 words
Spoken corpora: Problems and prospects
§ Problem
  § Adobe Flash Player, integrated into browsers, does not find the proper end of an EDU
  § An HTML5 player is used (refresh rate 0.25 s) but the result is not satisfactory
  § Solution?
§ Prospects
  § Downloadable version
  § ELAN multi-tier annotation
  § Customization of transcription
  § Search and statistics:
    • Prosody, such as accents, disfluencies, etc.
    • Frequent lexicon
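One way to see why a 0.25 s refresh rate is a problem for playing a single EDU is to estimate how far playback can run past the EDU boundary. The sketch below assumes that the player can only check the playback position at fixed 0.25 s ticks counted from the start of the file; that mechanism, the function name, and the sample times are assumptions for illustration, since the slide gives only the refresh rate.

```python
def stop_time_ms(edu_end_ms: int, poll_interval_ms: int = 250) -> int:
    """Return the first polling tick at or after the EDU end.

    Assumption: the position is checked only at multiples of poll_interval_ms,
    so playback cannot be stopped before the next tick.
    """
    ticks = -(-edu_end_ms // poll_interval_ms)  # ceiling division on integers
    return ticks * poll_interval_ms


# Hypothetical EDU end times in milliseconds (not taken from the corpus).
for edu_end in (3000, 3001, 3100):
    overshoot = stop_time_ms(edu_end) - edu_end
    print(f"EDU ends at {edu_end} ms, playback stops {overshoot} ms late")
```

Under this assumption the overshoot can approach a quarter of a second, which is comparable to the length of short pauses and may be enough to make the start of the next EDU audible.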
Another spoken corpus: Stories about presents and skiing
§ Authors: V. G. Xurshudyan, V. I. Podlesskaya, N. A. Korotaev, A. O. Litvinenko, O. A. Savel’eva et al.
§ Goal:
  § Comparison of original comics-based stories and subsequent retellings
  § Cross-linguistic comparison
    • Russian, Belarusian, Polish, Armenian, Italian, French, Japanese, English
§ Design:
  § Hyperfull transcription (intonation constructions)
  § Stories elicited from pictures
  § Retellings (by the same speaker) on the next day
  § 10 speakers for each language
§ Volume (Russian):
  § 35 min.
  § 10 stories
  § 5500 words
II. CORPORA ANNOTATED FOR GESTURES
§ Pear Stories 1
§ Pear Stories 2
Pear Stories 1
§ Author: Julia V. Nikolaeva
§ Goal: Study the coordination between gestures and discourse structure, both local and global
§ Discourse type:
  § Retellings of the Pear Film (Chafe 1980)
  § Monologue with backchannels
§ Speakers: students (in pairs)
§ Setting:
  § When: recorded in 2006
  § Where: Faculty of Foreign Languages, MSU
  § How:
    • To a person who had not seen the film
    • The picture includes both interlocutors
Pear Stories 1
§ Composition
  § Video
  § Audio
  § ELAN annotation
§ Volume
  § 8 retellings
  § 20 minutes
  § 2500 words
  § 596 EDUs
  § 325 gestures
Pear Stories 1: Tiers
§ Transcript
§ Gesture 1
§ Rhythmic gesture 1
§ Hand(s) 1
§ Gesture 2
§ Rhythmic gesture 2
§ Hand(s) 2
§ Comments
§ Discourse level
§ Catchment
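Since ELAN stores its tiers in an XML file (.eaf), a tier layout like the one above can be read with a standard XML parser. The following is a minimal sketch, not the project's tooling: the miniature document, its annotation values, and its times are invented, only two of the tier names from the slide are used, and the helper names `read_tiers` and `gestures_per_edu` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented miniature EAF document; values and times are made up for illustration.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1200"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="2500"/>
    <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="900"/>
    <TIME_SLOT TIME_SLOT_ID="ts5" TIME_VALUE="1800"/>
  </TIME_ORDER>
  <TIER TIER_ID="Transcript">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>EDU one</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>EDU two</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="Gesture 1">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a3" TIME_SLOT_REF1="ts4" TIME_SLOT_REF2="ts5">
        <ANNOTATION_VALUE>Iconic</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_tiers(eaf_xml: str) -> dict:
    """Map each tier name to a list of (start_ms, end_ms, value) tuples.

    Assumes every time slot carries a TIME_VALUE, as in the sample above.
    """
    root = ET.fromstring(eaf_xml)
    slots = {ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
             for ts in root.iter("TIME_SLOT")}
    tiers = {}
    for tier in root.iter("TIER"):
        rows = []
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            rows.append((start, end, ann.findtext("ANNOTATION_VALUE", default="")))
        tiers[tier.get("TIER_ID")] = rows
    return tiers


def gestures_per_edu(edus, gestures):
    """For each EDU, list the gesture labels whose time span overlaps it."""
    return [(text, [g for g_start, g_end, g in gestures
                    if g_start < e_end and g_end > e_start])
            for e_start, e_end, text in edus]


tiers = read_tiers(SAMPLE_EAF)
for edu, labels in gestures_per_edu(tiers["Transcript"], tiers["Gesture 1"]):
    print(edu, labels)
```

Keeping each tier as a list of time-stamped tuples makes it straightforward to ask how gestures line up with EDUs, which is the kind of coordination the corpus is designed to study.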
Pear Stories 1: Tier “Transcript”
§ Verbal component
§ Local discourse structure: EDUs
§ Dialog structure
§ Prosody
  § Pauses
  § Disfluencies
  § Illocutionary and phasal structure
  § Reduction
  § Smiling and laughing
§ Gestures
  § Punctual: |
    • Short gestures: beats
    • Emphasized points in extended gestures
  § Extended
    • Beginning: {
    • PEAK PHASE
    • End: }
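The gesture markers listed above lend themselves to simple automatic counting. The sketch below is only an illustration under stated assumptions: the slide does not specify exactly how the markers are placed in a transcript line, so the sample line is invented, and reading fully capitalized words inside braces as the peak phase is an assumption. The function name `gesture_marks` is hypothetical.

```python
import re


def gesture_marks(line: str) -> dict:
    """Count gesture markers in one transcript line.

    Assumed notation: "|" marks a punctual gesture, "{" and "}" delimit an
    extended gesture, and fully capitalized words inside braces mark its peak.
    """
    extended = re.findall(r"\{([^{}]*)\}", line)      # text inside each {...}
    outside = re.sub(r"\{[^{}]*\}", "", line)         # line with {...} spans removed
    return {
        "beats": outside.count("|"),                           # "|" outside extended gestures
        "emphasized_points": sum(s.count("|") for s in extended),  # "|" inside extended gestures
        "extended": len(extended),
        "peaks": [" ".join(w for w in s.split() if w.isupper()) for s in extended],
    }


# Invented English sample line (the corpus itself is Russian).
sample = "and then | he {takes the BASKET | away} and | leaves"
print(gesture_marks(sample))
```

Because the peak test uses `str.isupper`, the same sketch would also work on Cyrillic text, provided the transcripts follow the notation assumed here.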
Pear Stories 1: Tier “Gesture”
GESTURE TYPES:
§ Pointing
§ Iconic
§ Rhythmic
§ Beats
§ Metatextual
§ Emblems
§ Blurred
§ Unclear
Pear Stories 1: Tier “Hands”
§ Right
§ Left
§ Two hands
Pear Stories 1: Tier “Catchment”
§ Gesture shape, gesture location and meaning are kept throughout several EDUs
§ “Gestural sentence”
§ Switch to ELAN, 387-392
Pear Stories 2
§ Authors: O. V. Fedorova, S. Maljutina, Ju. Akinina, O. V. Dragoj
§ Goal: Study of discourse strategies in aphasics, compared to normal participants
§ Parallel corpus of normal and aphasic retellings
§ Composition
  § Video
  § Audio
  § Transcripts
    • Verbal component
    • Pauses
    • Disfluencies
    • Comments
§ Volume:
  § 30 normal and 23 aphasic retellings
  § 12,000 words
Pear Stories 2a
§ Authors: O. V. Fedorova, A. Fejn, E. Pavlova
§ Goal: Study the inheritance of discourse strategies between the original and second retellings
  § The status of the discourse protagonist, as reflected in the verbal vs. gestural component
§ Corpus
  § Three original retellings (normal speakers)
  § 3 x 8 = 24 second retellings
§ Composition
  § Video
  § Audio
  § Transcripts
    • Verbal component
    • Pauses
    • Disfluencies
    • Comments
    • Gestures
§ Multimodal analysis provides richer information on speakers’ strategies than the verbal component alone
Conclusion
§ Developing multimodal corpora brings us closer to a genuine understanding of human communication
§ In a better world, the reasonable sequence in the scientific study of language would have been:
  (1) basic, original use of language: spoken face-to-face communication
  (2) derived, secondary use of language: written texts
§ But since we cannot reverse the history of linguistics, let us explore the fundamental form of language now: better late than never
Conclusion
§ For other major languages, some multimodal corpora already exist; see http://www.multimodal-corpora.org/
§ Often designers of multimodal corpora just add gesture and other visual information to the verbal component
§ But it is particularly important to also include the prosodic channel
§ Only a combination of all three can give us a realistic picture of human communication
[Diagram: language as the combination of the visual, verbal, and prosodic channels]
Conclusion
§ Russian multimodal corpora are still in their incipient stage
§ But they are steps in the right direction
§ On the basis of the accumulated expertise, we could undertake a multimodal corpus that
  § is prosodically highly detailed
  § at the same time, contains sufficiently detailed gesture and body language annotation
  Ø and therefore approaches an ecologically realistic model of actual human communication
Conclusion
§ Uses of such a future product
  § linguistic research
  § psychological research
  § sociological research
  § as well as various applied uses, such as spoken human-computer interaction and language teaching
Thank you for your attention!
[Diagram: language as the combination of the visual, verbal, and prosodic channels]