Spoken Dialog Systems
Diane J. Litman
Professor, Computer Science Department

Spoken Dialog Systems
• Systems that interact with users via speech
• Provide automated telephone or microphone access to a back-end
• Advantages: naturalness, efficiency, eyes and hands free
[Diagram: user <-> Speech Recognition / TTS or recording <-> Spoken Dialog System <-> DB, web, system]

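The diagram on this slide can be read as a three-stage loop: recognize the user's speech, consult the back-end, and speak the reply. The following is a minimal sketch of that loop, not the architecture of any particular system. The class name `SpokenDialogSystem`, the `turn` method, and the toy lookup table are all hypothetical, and the "audio" here is just UTF-8 bytes standing in for a real recognizer and synthesizer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpokenDialogSystem:
    """Wires together the three boxes from the diagram (all stand-ins here)."""
    recognize: Callable[[bytes], str]  # speech recognition: audio -> text
    backend: Callable[[str], str]      # DB, web, or other system: query -> answer
    speak: Callable[[str], bytes]      # TTS or recording: text -> audio

    def turn(self, user_audio: bytes) -> bytes:
        """Handle one user utterance and return the system's spoken reply."""
        text = self.recognize(user_audio)
        answer = self.backend(text)
        return self.speak(answer)


# Hypothetical toy back-end: a lookup table instead of a real DB or web service.
TOY_DB = {"hello": "Hello! How can I help?"}

toy_system = SpokenDialogSystem(
    # Pretend the audio is UTF-8 text; a real system would call a recognizer here.
    recognize=lambda audio: audio.decode("utf-8").strip().lower(),
    backend=lambda query: TOY_DB.get(query, "Could you please repeat that?"),
    # Pretend the reply bytes are synthesized speech.
    speak=lambda text: text.encode("utf-8"),
)
```

Keeping the three stages as swappable callables mirrors the slide's point that the dialog system sits between the speech components and an arbitrary back-end.
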
Challenges in Spoken Dialog Systems
• Automated speech recognition
  ◦ Sphinx, Microsoft Speech, Dragon NaturallySpeaking
• Natural language understanding
• Dialog management
  ◦ How to keep the conversation going? Best strategy?
  ◦ How to detect errors in communication?
  ◦ How to recover from errors?
• Spoken language generation

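One common way to approach the error detection and recovery questions above is to threshold the recognizer's confidence score: reject very uncertain input, confirm moderately uncertain input, and accept the rest. The sketch below illustrates that general strategy only; it is not taken from the systems described in these slides. It assumes the recognizer supplies a confidence in [0, 1], and the function name and both threshold values are hypothetical.

```python
from typing import Tuple

REPEAT_PROMPT = "Could you please repeat that?"


def handle_hypothesis(hypothesis: str, confidence: float,
                      reject_below: float = 0.3,
                      confirm_below: float = 0.7) -> Tuple[str, str]:
    """Return (action, system_prompt) for one recognized utterance.

    The thresholds are illustrative assumptions, not tuned values.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if confidence < reject_below:
        # Likely misrecognition: ask the user to say it again.
        return "reject", REPEAT_PROMPT
    if confidence < confirm_below:
        # Uncertain: check the hypothesis with the user before acting on it.
        return "confirm", f'Did you say "{hypothesis}"?'
    # Confident enough to pass the hypothesis on without a recovery prompt.
    return "accept", ""
```
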
Application areas I have worked on
• AT&T
  ◦ Phone-based Information Access
  ◦ Call Centers
  ◦ Social Networking Systems
• Pitt
  ◦ (Physics) Tutoring
  ◦ Backup for Port Authority human operators
• Other Interests
  ◦ Training, Troubleshooting, PDAs

Speech-based Computer Tutors
• What are they?
• Example
  ◦ Tutor: Well, if an object has non zero constant velocity, is it moving or staying still?
  ◦ Student: Moving
  ◦ Tutor: Yep. If it’s moving, then its position is changing. So then what will happen to the packet’s horizontal displacement from the point of its release?
  ◦ Student: It will change
• Intersection of two fields:
  ◦ Spoken Dialog Systems
  ◦ Intelligent Tutoring Systems

Intelligent Tutoring Systems
• Education
  ◦ Classroom instruction [most frequent form]
  ◦ Human (one-on-one) tutoring [most effective form]
• Computer tutors – Intelligent Tutoring Systems
  ◦ Not as good as human tutors
  ◦ Ways to address the performance gap
    - (Spoken) dialog systems
    - Affective (dialog) systems

• Back-end is Why2-Atlas system [VanLehn, Jordan, Rose et al. 2002]
• Sphinx2 speech recognition and Cepstral text-to-speech

Current Research Directions
• Automatic System Optimization
  ◦ Can a system learn to optimize behavior based on prior data?
• Evaluation
  ◦ How can we tell if we are improving a system?
    - Speech vs. keyboard, TTS vs. recordings, graphics vs. transcripts
  ◦ Can systems be tested with simulated rather than real users?
• Affective Computing
  ◦ How can user emotions be predicted in real time?
  ◦ How can the system exploit such information?
• Prosodic and Linguistic Analysis
  ◦ Respond to both what a user says, and how it is said

Human-Computer Excerpt
Tutor 26: Let me ask my question in a more general way. What happens to the velocity of a body when there is no force acting on it?
Student 27: dammit (ASR: it is)
Tutor 28: Could you please repeat that?
Student 29: same (ASR: i same)
Tutor 30: Right. And what happens to the HORIZONTAL velocity of a body when there is no HORIZONTAL force acting on it?
Student 31: zero (ASR: the zero)
Tutor 32: Well... OK, let's consider how forces in the vertical direction affect the motion of a body in a horizontal direction in a different scenario <…omitted…>
Student 33: oh gravity you already said this (ASR: beats gravity you're exert less)
Tutor 34: Fine. Are there any other forces acting on the apple as it falls?
Student 35: no why are you doing this again (ASR: no y and to it yes)
Tutor 36: Correct. The only force acting on the apple as it falls to the earth is the earth's gravitational force. In which direction does gravitational force act?
Student 37: downward you computer (ASR: downward you computer)

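The gap between what the student said and what the recognizer heard in this excerpt can be quantified with word error rate (WER): word-level edit distance divided by the number of words actually spoken. The sketch below applies that standard metric to the six student turns above. The helper names and the lowercase, whitespace-based tokenization are assumptions made for illustration; the slides do not report WER for this excerpt.

```python
from typing import List, Tuple


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # spoken word dropped by ASR
                            curr[j - 1] + 1,      # extra word inserted by ASR
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


# (what the student said, what the recognizer output), copied from the excerpt.
EXCERPT: List[Tuple[str, str]] = [
    ("dammit", "it is"),
    ("same", "i same"),
    ("zero", "the zero"),
    ("oh gravity you already said this", "beats gravity you're exert less"),
    ("no why are you doing this again", "no y and to it yes"),
    ("downward you computer", "downward you computer"),
]


def corpus_wer(pairs: List[Tuple[str, str]]) -> float:
    """Total word edits divided by total spoken words across all pairs."""
    edits = sum(word_edit_distance(r.lower().split(), h.lower().split())
                for r, h in pairs)
    spoken = sum(len(r.split()) for r, _ in pairs)
    return edits / spoken


if __name__ == "__main__":
    for said, heard in EXCERPT:
        d = word_edit_distance(said.split(), heard.split())
        print(f"{said!r} -> {heard!r}: {d} word edits")
    print(f"corpus WER: {corpus_wer(EXCERPT):.2f}")
```

WER counts only word mismatches, so it does not show whether the tutor's response was still appropriate; Student 29 is a turn where the recognizer inserted a word and the tutor nevertheless replied "Right."
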
Thank You! Questions?
• Further Information
• http://www.cs.pitt.edu/~litman/itspoke.html