Adding Spoken Dialogue to a Text-Based Tutorial Dialogue System
Diane J. Litman
Learning Research and Development Center & Computer Science Department
University of Pittsburgh, PA 15260 USA

Why Spoken Dialogue Tutoring?
- Motivation
  - Promote learning gains by enhancing communication richness
- Benefits for Intelligent Tutoring Systems
  - Naturalness and ease of use
  - New sources of evidence regarding dialogue and pedagogy
  - Complement to current talking heads
  - Hands-free aspect can support multimodal interaction
- Benefits for Spoken Dialogue Systems
  - Tutoring is a challenging and largely unexplored application

Research Questions
- What are the advantages (and disadvantages) of using speech over text?
- Can prosody be used to infer pedagogically significant information?
- Can the tutoring system make use of such inferences?
- Testbed: Why2, an existing University of Pittsburgh text-based dialogue tutoring system

Previous Work: Identifying and Recovering from User Problems
- Prediction of dialogue-level and utterance-level problems
  - Automatically computable features (prosody, ASR, history, efficiency, quality, lexical) can accurately predict annotated turn classes (misrecognitions: 93%; aware sites: 89%; corrections: 84%)
  - Prosodic features include pitch, energy, duration, pause, tempo, and silence
- Adaptation
  - Task success improved from 23% to 65% by detecting and reacting to problematic subdialogues

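To make the idea of "automatically computable features" concrete, here is a minimal Python sketch of deriving a few turn-level features and passing them to a predictor. Everything in it is an assumption for illustration: the `Turn` fields, the feature names, and especially the thresholds in `likely_misrecognition`, which is a hand-written rule standing in for whatever learned classifier the prior work used (the slide does not name one).

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One user turn with a few measurable properties (hypothetical fields)."""
    duration_s: float       # length of the spoken turn, in seconds
    pause_before_s: float   # silence between the system prompt and the user's speech
    n_words: int            # number of words in the recognizer's hypothesis
    asr_confidence: float   # recognizer confidence score in [0, 1]


def features(turn: Turn) -> dict:
    """Derive a small feature vector: duration, pause, tempo, ASR confidence."""
    # Tempo is approximated as words per second; guard against zero duration.
    tempo = turn.n_words / turn.duration_s if turn.duration_s > 0 else 0.0
    return {
        "duration_s": turn.duration_s,
        "pause_before_s": turn.pause_before_s,
        "tempo_wps": tempo,
        "asr_confidence": turn.asr_confidence,
    }


def likely_misrecognition(f: dict) -> bool:
    """Toy stand-in for a learned classifier; the thresholds are invented."""
    slow_and_long = f["duration_s"] > 3.0 and f["tempo_wps"] < 2.0
    return f["asr_confidence"] < 0.5 or slow_and_long


turn = Turn(duration_s=4.0, pause_before_s=0.8, n_words=6, asr_confidence=0.7)
f = features(turn)
print(f["tempo_wps"], likely_misrecognition(f))
```

In practice the rule would be replaced by a model trained on annotated turns, with the feature dictionary extended to the other feature families listed on the slide (history, efficiency, quality, lexical).
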
Example with Annotations
S: Hi, this is AT&T Amtrak schedule system. This is TOOT. How may I help you?
U: I'd like to find a train from Ch- to Chicago from Baltimore. MISRECOGNITION
S: Sorry, I can't understand you. Please repeat your utterance.
U: I'd like to find a train from Baltimore to Chicago. AWARE, CORRECTION

Implications for Tutorial Dialogue
- Hypothesis
  - Similar features will prove useful for predicting and adapting to problematic pedagogical situations
- Challenge
  - What pedagogical situations should (and can) be annotated?
- Starting Points
  - "Tutorial-level misrecognitions" (e.g., a correct student answer is understood to be incorrect by the system)
  - Problematic affective states (e.g., confusion, boredom, anger, frustration [Evens 2002])
  - Off-talk, subjective language
  - Initiative (e.g., papers in this session)
  - Dialogue acts / discourse structure

Emotion and Prosody (Shriberg et al. 2001)
[Figure: ten numbered example utterances (1-10) grouped under the emotion labels Neutral, Annoyed, Disappointed/tired, Frustrated, and Amused/surprised. Utterances shown include "July 30", "Yes", "No", "Late morning (HYP)", "No, I am … (HYP)", and "There is no Manila..."]

Summary
- Adding spoken dialogue to tutoring systems provides both opportunities and challenges
- Expected Contributions
  - Empirical comparisons with text-based tutoring dialogue systems
  - Annotation schemes for dialogue states of potential pedagogical interest
  - Use of prosodic and other features to predict such states
  - Exploitation of such predictions by the tutoring system

Example with Annotations
S: Hi, this is AT&T Amtrak schedule system. This is TOOT. How may I help you?
U: I'd like to find a train from Ch- to Chicago from Baltimore. MISRECOGNITION
S: Sorry, I can't understand you. Please repeat your utterance.
U: I'd like to find a train from Baltimore to Chicago. MISRECOGNITION, AWARE, CORRECTION
S: I heard you say "leave today around 5 thirty in the morning." Which city do you want to go to?
U: I would like to leave Saturday at eight in the morning. MISRECOGNITION, AWARE, CORRECTION
S: You can say the arrival city name, such as "New York City."
U: Baltimore to Chicago. AWARE, CORRECTION

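The annotated exchange above can be held in a simple data structure so that label frequencies are easy to compute. The following Python sketch transcribes the four user turns and their labels from the slide; the tuple layout and the helper names `label_counts` and `label_rates` are assumptions made for illustration, not part of any annotation tool described in the talk.

```python
from collections import Counter

# User turns and their annotation labels, transcribed from the slide.
# The (utterance, labels) tuple layout is a hypothetical representation.
USER_TURNS = [
    ("I'd like to find a train from Ch- to Chicago from Baltimore.",
     {"MISRECOGNITION"}),
    ("I'd like to find a train from Baltimore to Chicago.",
     {"MISRECOGNITION", "AWARE", "CORRECTION"}),
    ("I would like to leave Saturday at eight in the morning.",
     {"MISRECOGNITION", "AWARE", "CORRECTION"}),
    ("Baltimore to Chicago.",
     {"AWARE", "CORRECTION"}),
]


def label_counts(turns):
    """Count how many user turns carry each annotation label."""
    counts = Counter()
    for _utterance, labels in turns:
        counts.update(labels)
    return counts


def label_rates(turns):
    """Fraction of user turns carrying each label."""
    counts = label_counts(turns)
    return {label: n / len(turns) for label, n in counts.items()}


for label, rate in sorted(label_rates(USER_TURNS).items()):
    print(f"{label}: {rate:.2f}")
```

Storing the labels as a set per turn reflects that a single turn can be a misrecognition, an aware site, and a correction at once, as the second and third user turns show.
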