Neural Net Algorithms for SC Vowel Recognition
Presentation for EE 645 Neural Networks and Learning Algorithms, Spring 2003
Diana Stojanovic
Summary
- Neural net algorithms applied to recognition of Serbo-Croatian (SC) vowels
- Follows the Thubthong & Kijsirkul (2001) paper on Thai phoneme recognition
- Light background will be provided
Introduction
- Speech recognition has many applications (PCs, cell phones, home appliance activation a la Dilbert, etc.)
Introduction 2
- There are various algorithms for recognizing speech, some of which rely on the recognition of individual phonemes or sounds
Block diagram of speech recognition system
Signal Processing -> Speech Recognition
For this project:
- Signal Processing: segmentation, spectral analysis
- Speech Recognition: individual vowel recognition
Previous work
- Thubthong & Kijsirkul (2001) tested a multi-class Support Vector Machine (SVM) vs. a Multilayer Perceptron (MLP) for recognition of Thai vowels and tones
- They claim superiority of the SVM, while the recognition rate differs by 2-3% for comparably complex systems
About speech sounds
- A speech sound is an acoustic wave
- The speaker's vocal tract shapes the spectrum of each sound
- The spectrum depends on the speaker and on the properties of the particular sound (for instance /u/), so recognition in the spectral domain is possible
Vowel Formants
- Vowels can be recognized in the spectral domain by the characteristic "lines" corresponding to their properties (backness, height, lip rounding, etc.)
- These "lines", the formants, occur at resonant frequencies of the vocal tract
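
To make the idea of formants as spectral peaks concrete, here is a minimal Python sketch. It is a toy illustration, not the procedure used in the project (the formants there were measured with PCQuirer). The resonance frequencies of 300 Hz and 2300 Hz, the exact 11000 Hz sampling rate, the frame length, and all function names are assumptions chosen for the example; real formant tracking usually relies on more robust methods such as linear prediction.

```python
import cmath
import math

SAMPLE_RATE = 11000  # Hz; the slides say 11 kHz, the exact rate is assumed
FRAME_LEN = 440      # 40 ms frame, so each DFT bin is 25 Hz wide


def synth_frame(freqs_hz, amps):
    """Toy 'vowel': a sum of sinusoids at the given resonance frequencies."""
    return [
        sum(a * math.sin(2 * math.pi * f * n / SAMPLE_RATE)
            for f, a in zip(freqs_hz, amps))
        for n in range(FRAME_LEN)
    ]


def magnitude_spectrum(frame):
    """Magnitude of a naive DFT for the bins below the Nyquist frequency."""
    size = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / size)
                for i, x in enumerate(frame)))
        for k in range(size // 2)
    ]


def strongest_peaks_hz(spectrum, count=2):
    """Frequencies (Hz) of the `count` largest local maxima, ascending."""
    peaks = [
        k for k in range(1, len(spectrum) - 1)
        if spectrum[k] > spectrum[k - 1] and spectrum[k] >= spectrum[k + 1]
    ]
    strongest = sorted(peaks, key=lambda k: spectrum[k], reverse=True)[:count]
    return sorted(k * SAMPLE_RATE / FRAME_LEN for k in strongest)


# Hypothetical resonances standing in for F1 and F2 of a front, high vowel.
frame = synth_frame([300.0, 2300.0], [1.0, 0.5])
print(strongest_peaks_hz(magnitude_spectrum(frame)))
```

The two strongest peaks of the toy spectrum play the role of the first two formants, which are the features used later for training.
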
Serbo-Croatian Vowel Chart
Data Used in the Project
Data collection and properties
- Type of speech: speaker-dependent, accented syllables
- 480 isolated words were recorded and digitized at 11 kHz
- Vowels in accented position were segmented manually
- Vowel formants were measured with PCQuirer
Sound Features Measured
- Only the first two formants were used for training the nets, in order to reduce complexity
- Given the properties of the SC sounds, performance should not suffer from this low dimensionality
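
As a rough illustration of why two formants can be enough for a small vowel inventory, the sketch below classifies a point in the (F1, F2) plane by its nearest class centroid. This nearest-centroid rule is a stand-in chosen for brevity, not the MLP or SVM used in the project, and the centroid values are hypothetical round numbers for a generic five-vowel system, not the measurements collected here.

```python
import math

# Hypothetical (F1, F2) centroids in Hz for a five-vowel system.
# Illustrative round numbers only, NOT the data measured in this project.
CENTROIDS = {
    "i": (300.0, 2300.0),
    "e": (500.0, 1900.0),
    "a": (750.0, 1300.0),
    "o": (500.0, 900.0),
    "u": (320.0, 800.0),
}


def classify(f1_hz, f2_hz):
    """Return the vowel whose centroid is closest in the F1-F2 plane."""
    return min(
        CENTROIDS,
        key=lambda v: math.hypot(f1_hz - CENTROIDS[v][0],
                                 f2_hz - CENTROIDS[v][1]),
    )


print(classify(310.0, 2250.0))
print(classify(700.0, 1350.0))
```

When the classes are well separated in this plane, even a simple rule like this one assigns nearby points to the expected vowel; a trained net has the same two inputs to work with.
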
Perceptron, Backprop and Support Vector Machine
- We learned about these throughout the semester. For details, please refer to the paper
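
For readers without the course background, the following is a minimal sketch of the simplest of the three methods, a single perceptron trained with the classic error-correction rule. It separates two vowel classes using (F1, F2) in kHz. The sample points are hypothetical, the function names are made up for this example, and the sketch does not cover backprop or SVM training.

```python
def train_perceptron(samples, labels, max_epochs=100):
    """Classic perceptron rule: update weights only on misclassified points."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for (x1, x2), y in zip(samples, labels):
            if y * (w[0] * x1 + w[1] * x2 + b) <= 0:
                w[0] += y * x1
                w[1] += y * x2
                b += y
                errors += 1
        if errors == 0:  # every training point is on the correct side
            break
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the decision line."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1


# Hypothetical (F1, F2) points in kHz: label +1 for /i/, -1 for /u/.
samples = [(0.30, 2.30), (0.28, 2.20), (0.33, 2.40),
           (0.32, 0.80), (0.35, 0.75), (0.30, 0.90)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_perceptron(samples, labels)
print([predict(w, b, x) for x in samples])
```

Because the two toy classes are linearly separable, the rule stops after a finite number of updates. An MLP trained with backprop extends this idea to several layers and to non-separable classes.
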
Results: recognition rate
                                   MLP       SVM
Thai (previous work, DDAG)         92.28%    94.99%
SC (present work)                  90-95%    work in progress
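
The DDAG label in the Thai results is understood here as the Decision Directed Acyclic Graph scheme for combining pairwise (two-class) classifiers into a multi-class decision. The sketch below shows only the evaluation step, under that assumption. The pairwise decision function is a hypothetical nearest-prototype rule standing in for trained pairwise SVMs, and the prototype values are illustrative, not taken from either study.

```python
import math


def ddag_predict(classes, decide, x):
    """Evaluate a DDAG: compare the first and last remaining classes,
    drop the loser, and repeat until a single class is left."""
    remaining = list(classes)
    evaluations = 0
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        winner = decide(first, last, x)
        evaluations += 1
        if winner == first:
            remaining.pop()     # 'last' is eliminated
        else:
            remaining.pop(0)    # 'first' is eliminated
    return remaining[0], evaluations


# Hypothetical (F1, F2) prototypes in Hz, used only to build a toy
# pairwise rule; a real system would call a trained two-class SVM here.
PROTOTYPES = {
    "i": (300.0, 2300.0),
    "e": (500.0, 1900.0),
    "a": (750.0, 1300.0),
    "o": (500.0, 900.0),
    "u": (320.0, 800.0),
}


def nearest_of_pair(a, b, x):
    """Toy pairwise classifier: pick whichever prototype is closer to x."""
    def dist(v):
        return math.hypot(x[0] - PROTOTYPES[v][0], x[1] - PROTOTYPES[v][1])
    return a if dist(a) <= dist(b) else b


label, n_evals = ddag_predict(list(PROTOTYPES), nearest_of_pair, (520.0, 950.0))
print(label, n_evals)
```

With N classes the graph needs N - 1 pairwise evaluations per input, rather than evaluating all N(N - 1)/2 pairwise classifiers.
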
What is next?
- First, finish the SVM results
- Examine fast, connected speech
- Speaker-independent recognition