Lecture 13 Summary: The Bayesian method for pronunciation

ECE 7000 Natural Language Processing

The Bayesian algorithm can be used to solve the pronunciation subproblem in speech recognition.

Pronunciation subproblem: Given a series of phones, compute the most probable word that generated them. That is, select the single word for which P(Word | Observation) is highest. By Bayes' rule this is the word that maximizes P(Observation | Word) P(Word), since P(Observation) is the same for every candidate word.

Simplifications:
1. The correct string of phones is given. A real speech recognizer relies on probabilistic estimators for each phone, so it is never entirely sure about the identity of any particular phone.
2. The word boundaries are known.
3. Rules of pronunciation are associated with probabilities. The rules are run over the lexicon to generate the different surface forms, each with its own probability.
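The word selection step can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the phone string, the candidate words, and every probability below are made-up values chosen for the example, and the names `PRIOR`, `LIKELIHOOD`, and `most_probable_word` are hypothetical rather than part of the lecture.

```python
# Sketch of Bayesian word selection for the pronunciation subproblem.
# All numbers are illustrative assumptions, not corpus estimates.

# P(Word): prior probability of each candidate word.
PRIOR = {"new": 0.001, "neat": 0.0002, "need": 0.0005, "knee": 0.00002}

# P(Observation | Word): pre-stored pronunciation variants with their scores,
# keyed by the observed phone string.
LIKELIHOOD = {
    ("n", "iy"): {"new": 0.3, "neat": 0.5, "need": 0.1, "knee": 1.0},
}


def most_probable_word(phones):
    """Return (best_word, scores), where scores[w] = P(O | w) * P(w)."""
    candidates = LIKELIHOOD.get(tuple(phones), {})
    scores = {word: p_obs * PRIOR[word] for word, p_obs in candidates.items()}
    if not scores:
        return None, scores
    # P(Observation) is constant across words, so it is left out of the score.
    return max(scores, key=scores.get), scores


best, scores = most_probable_word(["n", "iy"])
print(best)
```

With these assumed values, "knee" has the highest likelihood but the lowest prior, so the product favours "new". This shows why both terms are needed.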

Difference between Bayesian methods for pronunciation and spelling errors:
The Bayesian spelling error correction algorithm has 2 components:
1) Candidate generation
2) Candidate scoring
In speech recognizers, each pronunciation is expanded in advance and all the possible variants are pre-stored with their scores. So there is no need for candidate generation in the case of pronunciation.

Decision Trees:
For pronunciation variation we use a particular type of decision tree called the Classification and Regression Tree (CART).
Input: A lexical phone described in terms of a set of features.
Output: Classification into a category and the associated probability.
They are trained for a particular surface phone.
Advantages:
They are automatically induced from a labeled corpus.
They are concise.
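The input and output of such a tree can be illustrated with a small Python sketch. It is a toy under loud assumptions: the tree is written by hand rather than induced from a labeled corpus, the feature names (`word_final`, `next_is_vowel`), the surface labels, and all probabilities are invented for the example, and `TREE` and `classify` are hypothetical names.

```python
# Toy decision tree for the surface realization of one phone (here a /t/).
# Internal nodes test a yes/no feature; leaves hold a probability
# distribution over categories. Structure and numbers are assumptions.
TREE = {
    "feature": "word_final",
    "yes": {
        "feature": "next_is_vowel",
        "yes": {"leaf": {"dx": 0.6, "t": 0.4}},
        "no": {"leaf": {"t": 0.5, "NULL": 0.3, "q": 0.2}},
    },
    "no": {"leaf": {"t": 0.9, "dx": 0.1}},
}


def classify(tree, features):
    """Walk the tree and return (category, probability) from the leaf reached."""
    node = tree
    while "leaf" not in node:
        branch = "yes" if features[node["feature"]] else "no"
        node = node[branch]
    distribution = node["leaf"]
    category = max(distribution, key=distribution.get)
    return category, distribution[category]


# A word-final phone followed by a vowel.
print(classify(TREE, {"word_final": True, "next_is_vowel": True}))
```

A real CART would learn both the questions and the leaf distributions from labeled data; the sketch only shows how a feature description is mapped to a category and its associated probability.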
