Neural Net Algorithms for SC Vowel Recognition Presentation















Neural Net Algorithms for SC Vowel Recognition
Presentation for EE 645 Neural Networks and Learning Algorithms, Spring 2003
Diana Stojanovic

Summary
- Neural net algorithms applied to recognition of Serbo-Croatian (SC) vowels
- Follows the Thubthong & Kijsirikul (2001) paper on Thai phoneme recognition
- Light background will be provided

Introduction
- Speech recognition has many applications (PCs, cell phones, home appliance activation à la Dilbert, etc.)

Introduction 2
- There are various algorithms for recognizing speech, some of which rely on recognizing individual phonemes or sounds

Block diagram of speech recognition system
[Diagram: Signal Processing -> Speech Recognition]
For this project:
- Signal Processing: segmentation, spectral analysis
- Speech Recognition: individual vowel recognition

Previous work
- Thubthong & Kijsirikul (2001) tested a multi-class Support Vector Machine (SVM) against a Multilayer Perceptron (MLP) for recognition of Thai vowels and tones
- They claim the SVM is superior, although the recognition rates differ by only 2-3% for systems of comparable complexity

About speech sounds
- A speech sound is an acoustic wave
- The speaker’s vocal tract shapes the spectrum of each sound
- The spectrum depends on the speaker and on the properties of the particular sound (for instance /u/), so recognition in the spectral domain is possible

Vowel Formants
- Vowels can be recognized in the spectral domain by characteristic “lines” corresponding to their properties (backness, height, lip rounding, etc.)
- These “lines”, called formants, occur at the resonant frequencies of the vocal tract
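
To make the idea of spectral "lines" concrete, here is a minimal sketch that picks the two strongest peaks out of a magnitude spectrum. It is not the measurement method used in the project (formants there were measured with PCQuirer). The test signal is a hypothetical stand-in made of two sinusoids at 300 Hz and 2300 Hz, the 11000 Hz sample rate is an assumed reading of "11 kHz", and the function names are illustrative. Real formants are peaks of the spectral envelope and are usually estimated with more robust methods.

```python
import cmath
import math


def magnitude_spectrum(signal, sample_rate):
    """Naive DFT: return (frequency_hz, magnitude) pairs up to Nyquist."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                  for t, s in enumerate(signal))
        spectrum.append((k * sample_rate / n, abs(acc)))
    return spectrum


def strongest_peaks(spectrum, count=2):
    """Return the frequencies of the `count` largest local maxima, ascending."""
    peaks = [
        spectrum[i]
        for i in range(1, len(spectrum) - 1)
        if spectrum[i][1] > spectrum[i - 1][1]
        and spectrum[i][1] > spectrum[i + 1][1]
    ]
    peaks.sort(key=lambda p: p[1], reverse=True)
    return sorted(freq for freq, _ in peaks[:count])


# Hypothetical test signal (NOT project data): two "formant-like" tones.
SAMPLE_RATE = 11000   # assumed value for "11 kHz"
N_SAMPLES = 220       # 50 Hz bin spacing, so both tones fall on exact bins
signal = [
    math.sin(2 * math.pi * 300 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2300 * t / SAMPLE_RATE)
    for t in range(N_SAMPLES)
]

f1, f2 = strongest_peaks(magnitude_spectrum(signal, SAMPLE_RATE))
print(f1, f2)
```

The two returned frequencies play the role of the (F1, F2) pair that a recognizer would take as input.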

Serbo-Croatian Vowel Chart

Data Used in the Project
Data collection and properties
- Type of speech: speaker dependent, accented syllables
- 480 isolated words were recorded and digitized at 11 kHz
- Vowels in accented position were segmented manually
- Vowel formants were measured with PCQuirer

Sound Features Measured
- Only the first two formants were used for training the nets, in order to reduce complexity
- Given the properties of the SC sounds, performance should not suffer from this low dimensionality

Perceptron, Backprop and Support Vector Machine
- We learned about these throughout the semester. For details, please refer to the paper
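
As a reminder of the simplest of these algorithms, the following is a minimal sketch of the classic perceptron learning rule on two-dimensional (F1, F2) inputs. It separates only two vowels, /i/ and /u/. The four formant pairs (in kHz) are made-up illustrative values, not measurements from the project, and the function names are hypothetical. The nets actually trained in the project are described in the paper.

```python
def train_perceptron(samples, labels, max_epochs=100):
    """Perceptron rule: on each mistake, move the weights toward the sample.

    samples: list of (f1_khz, f2_khz); labels: +1 or -1.
    Stops early once a full pass produces no mistakes.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (f1, f2), y in zip(samples, labels):
            if y * (w[0] * f1 + w[1] * f2 + b) <= 0:
                w[0] += y * f1
                w[1] += y * f2
                b += y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


def predict(w, b, x):
    """Return +1 (/i/) or -1 (/u/) for a formant pair x = (f1_khz, f2_khz)."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1


# Illustrative (F1, F2) values in kHz; NOT data from the project.
samples = [(0.30, 2.30), (0.28, 2.20), (0.30, 0.70), (0.32, 0.80)]
labels = [1, 1, -1, -1]   # +1 = /i/, -1 = /u/

w, b = train_perceptron(samples, labels)
print([predict(w, b, x) for x in samples])
```

Because /i/ and /u/ differ mainly in F2, a single linear boundary is enough here. Distinguishing all the vowels needs a multi-class scheme such as an MLP or a combination of binary classifiers.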

Results for Thai (previous work)
                           MLP      SVM
Recognition rate (DDAG)    92.28%   94.99%

Results for SC (present work)
                           MLP      SVM
Recognition rate           90-95%   Work in progress
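
DDAG (Decision Directed Acyclic Graph) is one way to build a multi-class decision out of pairwise binary classifiers: keep a list of candidate classes, test the first against the last, drop the loser, and repeat until one class remains. The sketch below illustrates only this elimination logic. The pairwise deciders are hypothetical nearest-centroid stand-ins for trained binary SVMs, and the centroid values (in kHz) are made up for illustration; none of this is taken from the project or from the Thai paper.

```python
def make_pairwise(centroids):
    """Build one stand-in binary decider per class pair (a, b), a before b.

    Each decider returns whichever of its two classes has the nearer
    centroid. A real system would use a trained binary SVM here.
    """
    names = list(centroids)
    table = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            def decide(x, a=a, b=b):
                da = sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[a]))
                db = sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[b]))
                return a if da <= db else b
            table[(a, b)] = decide
    return table


def ddag_predict(x, classes, pairwise):
    """DDAG evaluation: N classes are resolved with N - 1 pairwise tests."""
    remaining = list(classes)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if pairwise[(first, last)](x) == first:
            remaining.pop()       # 'last' is eliminated
        else:
            remaining.pop(0)      # 'first' is eliminated
    return remaining[0]


# Hypothetical (F1, F2) centroids in kHz for five vowels; NOT project data.
centroids = {
    "i": (0.3, 2.3),
    "e": (0.5, 1.9),
    "a": (0.7, 1.3),
    "o": (0.5, 0.9),
    "u": (0.3, 0.7),
}
pairwise = make_pairwise(centroids)
print(ddag_predict((0.68, 1.25), list(centroids), pairwise))
```

With five vowels there are ten pairwise classifiers to train, but each prediction evaluates only four of them.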

What is next?
- First, finish the SVM results
- Examine fast, connected speech
- Speaker-independent recognition