SSML Extensions for TTS in Indian Languages II

  • Slides: 21
SSML Extensions for TTS in Indian Languages
II Workshop on Internationalizing SSML, 30-31 May 2006, Greece
Nixon Patel and Kishore Prahallad
Bhrigus Inc., Hyderabad, India
IIIT Hyderabad, India

Topics
  • About Bhrigus
  • Collaborative efforts between Bhrigus and IIIT Hyderabad
  • Nature of Indian language scripts: convergence and divergence
  • Issues across TTS rendering in all these languages
  • Proposed solutions/tags:
      • Syllable element
      • Alien element
      • Dialect element

© Copyright 2006, Bhrigus Software Private Limited.

Bhrigus voice & data solutions
http://www.bhrigus.com

About Bhrigus
  • Established: 2002
  • Key customers: Hewitt Associates, AT&T, Pfizer, Merrill Lynch, Union Pacific Railroad, CDIA, Southwestern Energy, Orange County, Stryker
  • Business: providing IVR, speech and enterprise solutions to BFSI, telcos, contact centers and manufacturing companies
  • SEI CMM Level 4 process implementation under way; ISO 9001:2000 certified by KPMG

Speech and Language Technology Lab @ Bhrigus
  • Playing a leadership role in the development of ASR and TTS for all official Indian languages, to provide voice solutions for the Indian market
  • Collaborations: IIIT Hyderabad and Carnegie Mellon University
  • 10-member team plus board of advisors
      • 3 Ph.D.s and 4 Masters
      • Synthesis team, recognition team, linguist team and language resources team
  • Initiating SSML and VXML chapters in India

Collaborative Efforts
  • Bhrigus Inc., Hyderabad: voice-based solution provider
  • IIIT Hyderabad: one of the leading universities in India doing speech research
  • Telugu TTS: a collaborative effort between Bhrigus Inc. and IIIT
  • Goal: develop ASR and TTS for all official Indian languages

Nature of Indian Language (IL) Scripts
  • The basic units of the writing system are Aksharas
  • An Akshara is an orthographic representation of a speech sound
  • An Akshara is syllabic in nature; typical forms are V, CV, CCV and CCCV (C: consonant, V: vowel)
  • An Akshara always ends with a vowel (or nasalized vowel) in written form
  • ~1652 dialects/native languages
  • 22 languages officially recognized
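
The C*V shape of an Akshara can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the vowel set below is a hypothetical five-letter inventory for a romanized, Itrans-like spelling (the actual Itrans-3 / OM table is not reproduced in these slides), and nasalization markers are not handled.

```python
import re

# Assumption: vowel letters of a romanized, Itrans-like spelling.
# This is an illustrative inventory, not the real Itrans-3 / OM table.
VOWELS = "aeiou"

# One Akshara: zero or more non-vowel letters followed by a run of vowel
# letters, which covers the V, CV, CCV and CCCV shapes.
AKSHARA = re.compile(rf"[^{VOWELS}]*[{VOWELS}]+")


def split_aksharas(word):
    """Split a romanized word into Akshara-like units.

    Trailing consonants that are not followed by a vowel do not form an
    Akshara; they are returned as a final leftover unit.
    """
    word = word.lower()
    units = AKSHARA.findall(word)
    leftover = word[sum(len(u) for u in units):]
    if leftover:
        units.append(leftover)
    return units


print(split_aksharas("naatoo"))  # ['naa', 'too']
print(split_aksharas("amma"))    # ['a', 'mma']
```

A leftover unit, as in "baank", signals a word that does not fit the vowel-final pattern described above.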

Convergence of IL Scripts
  • Aksharas are syllabic in nature
  • Common phonetic base
      • A common set of speech sounds is shared across all languages
      • Fairly good (though not exact) correspondence between a sequence of Aksharas and the corresponding sequence of sounds, often referred to as letter-to-sound rules
  • Written from left to right, as in European languages
  • Words are separated by spaces, as in European languages

Divergence of IL Scripts
  • Each IL has its own script
  • All ILs share a common phonetic base; however, the phonotactics of each IL differ from those of the others
  • ILs are non-tonal languages, unlike eastern languages such as Chinese

How to Represent Indian Language Scripts
  • Unicode
      • Useful for *rendering* Indian language scripts
      • Not suitable for keying in through a QWERTY keyboard
      • Not suitable for building modules such as text normalization (many editors cannot display the Unicode characters)
  • Itrans-3 / OM: a transliteration scheme by IISc Bangalore, India and Carnegie Mellon University
      • Useful for *keying in and storing* Indian language scripts using QWERTY keyboards
      • Useful for processing and for writing modules/rules for letter-to-sound, text normalization, etc.

Itrans-3 / OM Notation
  [Notation chart: image not available]

Why Itrans-3/OM?
  • Developed with user readability in mind: easier to read and type
  • It is case-insensitive
  • The scheme is phonetic in nature: each character corresponds to the actual sound being spoken
  • Thus a single transliteration scheme is used for all the Indian languages, as they share the same set of sounds
  • Each character (corresponding to a phone/sound) is no more than three letters long
  • Adopted across universities in India and abroad and by some industrial labs such as Bhrigus Inc.

Issues in TTS Rendering in IL
  • TTS should be able to pronounce words Akshara (syllable) by Akshara (syllable)
  • The languages are heavily influenced by English (alien) words
  • Alien words occur in the middle of sentences
  • Each language has its own dialects

SSML Tag: Phoneme Element <phoneme>
  • <phoneme alphabet="itrans-3" ph="n aa t oo"> naatoo </phoneme>
  • The ph attribute specifies the phoneme/phone string
  • Rendering “n” “aa” “t” “oo” individually does not make sense to native speakers of Indian languages
  • Sounds need to be rendered in terms of syllables

Syllable Element <syllable>
  • <syllable alphabet="itrans-3" syl="naa too"> naatoo </syllable>
  • Renders “naa” and “too”, which are Aksharas (syllables)
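
As a rough illustration of how a front end might emit the proposed element, the Python sketch below builds the markup from a word and its Akshara sequence. The function name `syllable_element` is hypothetical, and the `<syllable>` element with its `alphabet` and `syl` attributes is the proposal from this slide, not part of standard SSML.

```python
import xml.etree.ElementTree as ET


def syllable_element(word, aksharas, alphabet="itrans-3"):
    """Serialize the proposed <syllable> element for one word.

    `aksharas` is the list of syllables to render; they are joined with
    spaces into the `syl` attribute, mirroring the slide's example.
    """
    elem = ET.Element("syllable", {"alphabet": alphabet, "syl": " ".join(aksharas)})
    elem.text = word
    # ElementTree escapes any special characters in the text and attributes.
    return ET.tostring(elem, encoding="unicode")


print(syllable_element("naatoo", ["naa", "too"]))
# <syllable alphabet="itrans-3" syl="naa too">naatoo</syllable>
```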

Motivation for Loan Word <alien>
  • Informal experiments suggested that 33% of the errors of IL TTS occur while rendering alien (non-native) words
  • Such alien words could be detected automatically thanks to the syllabic properties of the Indian languages

Example of Loan Word
  • BANK has to be pronounced as /B/ /AE/ /N/ /K/
  • The /AE/ phoneme does not exist in the Indian language phone set
  • <alien> baank </alien>
  • Alien (non-native) words could be rendered using different pronunciation dictionaries or letter-to-sound rules
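
The idea that alien words can be detected from syllabic properties can be sketched in Python as follows. This is a deliberately crude heuristic, not the authors' detector: it assumes a romanized spelling with a hypothetical five-letter vowel set and flags any word that does not end in a vowel letter, since a native word written as a sequence of Aksharas ends in a vowel. Nasalized endings and punctuation are not handled and would be misclassified.

```python
# Assumption: vowel letters of a romanized, Itrans-like spelling.
VOWELS = set("aeiou")


def looks_alien(word):
    """Return True when a non-empty romanized word does not end in a vowel letter."""
    word = word.lower()
    return bool(word) and word[-1] not in VOWELS


def tag_aliens(sentence):
    """Wrap each suspected alien word in the proposed <alien> element."""
    return " ".join(
        f"<alien>{w}</alien>" if looks_alien(w) else w
        for w in sentence.split()
    )


print(tag_aliens("naatoo baank"))  # naatoo <alien>baank</alien>
```

Words tagged this way could then be routed to a separate pronunciation dictionary or set of letter-to-sound rules, as the slide suggests.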

Dialect Element <dialect>
  • Each language has its own dialects
  • TTS should be able to handle dialects without unloading the language resources

Dialect Element <dialect>: Example
  <?xml version="1.0"?>
  <speak version="1.0" xml:lang="tel-in">
    <voice gender="female">
      <dialect name="andhra"> yekkad'iki vel'laali </dialect>
      <dialect name="telengana" pro="yaad'iki poovaale"> yekkad'iki vel'laali </dialect>
    </voice>
  </speak>
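
One way a synthesizer front end might interpret the markup above is sketched below in Python. The semantics are an assumption read off the example: when a `<dialect>` element carries a `pro` attribute, that string is spoken in place of the element's text; otherwise the text itself is spoken. The function name `spoken_forms` is hypothetical.

```python
import xml.etree.ElementTree as ET

SSML = """<?xml version="1.0"?>
<speak version="1.0" xml:lang="tel-in">
  <voice gender="female">
    <dialect name="andhra">yekkad'iki vel'laali</dialect>
    <dialect name="telengana" pro="yaad'iki poovaale">yekkad'iki vel'laali</dialect>
  </voice>
</speak>"""


def spoken_forms(ssml):
    """Return (dialect name, string to speak) pairs in document order.

    Assumed rule: the `pro` attribute, when present, overrides the
    element text for that dialect.
    """
    root = ET.fromstring(ssml)
    forms = []
    for elem in root.iter("dialect"):
        text = (elem.text or "").strip()
        forms.append((elem.get("name"), elem.get("pro", text)))
    return forms


for name, form in spoken_forms(SSML):
    print(name, "->", form)
# andhra -> yekkad'iki vel'laali
# telengana -> yaad'iki poovaale
```

Because both dialects are resolved from the same parsed document, a sketch like this switches dialect per element without reloading anything, which is the behavior the previous slide asks for.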

Conclusions
  • Bhrigus Inc., Hyderabad is taking a lead position in developing ASR and TTS for Indian languages
  • Proposed <syllable>, <alien> and <dialect> elements as SSML extensions

References
  1. Prahallad Lavanya, Prahallad Kishore and Ganapathiraju Madhavi, A Simple Approach for Building Transliteration Editors for Indian Languages, Journal of Zhejiang University Science, vol. 6A, no. 11, pp. 1354-1361, Oct 2005.