
Speech Technology Group
Cambridge Research Lab, Toshiba Research Europe Ltd
Kate Knill, Manager, Interaction Technology
kate.knill@crl.toshiba.co.uk
12 January 2010
Copyright 2009, Toshiba Corporation.

Toshiba
• World leader in high technology
• 3 key areas:
  – Digital media
  – Electronic devices and components
  – Social infrastructure systems
• 197,000 employees worldwide
• Sales over US$70 billion
• Strong ecological commitment

Toshiba R&D: Toward the Innovation Driven Company
• Toshiba Corporate R&D Center
• Toshiba China R&D Center, Beijing
• TARI Branch Office in Silicon Valley, San Jose
• Toshiba America Research, Inc., Piscataway, New Jersey
• Toshiba Research Europe Limited
  – Cambridge Research Laboratory (CRL)
  – Telecommunications Research Laboratory, Bristol

Toshiba Cambridge Research Lab
• Established 1991
  – Semiconductor Physics for the 21st Century
  – Quantum Information
  – Nano-biotechnology
• Speech Technology Group added 2002
• Computer Vision Group added 2006

Toshiba Speech and Language R&D
• Toshiba China R&D, Beijing
• Toshiba Research Europe Ltd, Cambridge
• Toshiba Corporate R&D Center, Kawasaki

CRL Speech Technology Group
• Focus on embedded ASR and TTS
  – Core technology research and development
    • Noise and speaker robustness
    • LVCSR
    • HMM-TTS
  – European and North American languages
• Approx 15 researchers
  – Multinational team
  – Mix of engineers, computer scientists and linguists
[Map labels: Toshiba China R&D, Beijing; Toshiba Corporate R&D Center, Kawasaki]

Vision of Toshiba Speech Research
• Enhance the human-machine interface
  – Interact with devices how, when and where you want
• Create a paradigm shift
  – Input/output communication

Speech Recognition Challenges
• Current ASR engines still suffer from lack of robustness
  – Major limitation in deploying speech recognition systems
[Diagram labels: Task Robustness, Speaker Robustness, Noise Robustness]

Text-to-Speech Synthesis Challenges
• Increase in naturalness of synthesis
  – Same or even smaller footprint!
• Increase in voice variety
  – Faster, cheaper addition
  – Non-professional voices
[Diagram labels: neutral, friendly, expressive, emotional; large corpus professional voice, small corpus amateur voices]

Toshiba in SCALE: Second Supervisor
• Recognition
  – Kate Knill
  – KK Chin
  – Projects:
    • RS-3 Hierarchical Trajectory Models for Speech Recognition, Heyun Huang, Lou Boves
    • AHSR-2 Data Association Multisource Acoustic Models, Liang Lu, Steve Renals
• Synthesis
  – Heiga Zen
  – Projects:
    • RS-1 Trajectory HMMs for Reactive Speech Synthesis, Cassia Valentini, Simon King
    • RS-4 Speech Synthesis by Analysis, Mauro Nicolao, Roger Moore
