ITCS 6010 Spoken Language Systems: Architecture

Elements of a Spoken Language System
  • Endpointing
  • Feature extraction
  • Recognition
  • Natural language understanding
  • Dialog management
[Figure: block diagram with boxes labeled Endpointing, Feature Extraction, Recognition, Natural Language Understanding, and Dialog Management]

Elements of a Spoken Language System (cont’d)
  • Endpointing
      • Detects the beginning and ending of speech
      • Represents caller’s spoken utterance as a waveform
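
Endpointing can be illustrated with a simple energy threshold. This is only a minimal sketch, not the method the course prescribes: the function name `find_endpoints`, the frame length, and the threshold value are all assumptions chosen for illustration, and practical endpointers are considerably more robust to noise.

```python
def find_endpoints(samples, frame_len=160, threshold=0.01):
    """Return (start, end) sample indices of the speech region, or None.

    Hypothetical energy-based endpointer: a frame counts as speech when
    its mean squared amplitude exceeds `threshold`.
    """
    speech_frames = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy > threshold:
            speech_frames.append(i)
    if not speech_frames:
        return None
    # Start of the first speech frame, end of the last speech frame.
    return speech_frames[0], speech_frames[-1] + frame_len


# Synthetic waveform: silence, a burst standing in for speech, silence.
signal = [0.0] * 320 + [0.5, -0.5] * 160 + [0.0] * 320
endpoints = find_endpoints(signal)
```

The returned pair marks where the utterance begins and ends, so only that slice of the waveform needs to be passed on to feature extraction.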

Elements of a Spoken Language System (cont’d)
  • Feature extraction
      • Transforms endpointed utterance into a sequence of feature vectors
      • Feature vector – list of numbers that represent measurable characteristics of speech
      • Characteristics are related to energy amounts at varying frequencies
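
The idea of "energy amounts at varying frequencies" can be sketched as log energy per frequency band. The code below is an assumption-laden illustration: it uses a naive discrete Fourier transform and equal-width bands, and the names `band_energies` and `extract_features` are hypothetical. Production front ends typically use more elaborate features.

```python
import cmath
import math


def band_energies(frame, num_bands=4):
    """Return one feature vector: log energy in each frequency band."""
    n = len(frame)
    half = n // 2
    # Naive DFT power spectrum for the lower half of the bins.
    power = []
    for k in range(half):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
        power.append(abs(s) ** 2)
    band_size = half // num_bands
    # Small constant avoids log(0) for empty bands.
    return [math.log(sum(power[b * band_size:(b + 1) * band_size]) + 1e-10)
            for b in range(num_bands)]


def extract_features(samples, frame_len=32, num_bands=4):
    """Split the utterance into frames and compute one vector per frame."""
    return [band_energies(samples[i:i + frame_len], num_bands)
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


# A low-frequency tone: its energy should land in the lowest band.
tone = [math.sin(2 * math.pi * 2 * t / 32) for t in range(64)]
features = extract_features(tone)
```

Each element of `features` is one feature vector, so the utterance becomes a sequence of short lists of numbers, as the slide describes.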

Elements of a Spoken Language System (cont’d)
  • Recognizer
      • Determines spoken words using feature vectors
      • Recognition model
          • Contains all word strings caller can say
          • Consists of:
              1. Acoustic model
              2. Dictionary
              3. Grammar

Elements of a Spoken Language System (cont’d)
  • Acoustic model
      • Internal representation of pronunciation of each basic sound/phoneme
      • Created by training process
      • Modeled features are same as those in feature vectors

Elements of a Spoken Language System (cont’d)
  • Dictionary
      • List of words and pronunciations
      • Indicates which acoustic models create a word
      • Can contain multiple entries/pronunciations for a word

    Dallas       dal*s
    Boston       bost*n
    economics    E k * n A m I k s
    economics    i k * n A m I k s
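
The dictionary entries above map naturally onto a lookup table in which each word has a list of pronunciations. This is a minimal sketch: the names `DICTIONARY` and `pronunciations` are hypothetical, and the pronunciation strings are copied as written on the slide rather than split into individual phonemes.

```python
# Word -> list of pronunciations. A list allows multiple entries per word.
DICTIONARY = {
    "Dallas": ["dal*s"],
    "Boston": ["bost*n"],
    "economics": ["E k * n A m I k s", "i k * n A m I k s"],
}


def pronunciations(word):
    """Return every known pronunciation of `word` (empty if unknown)."""
    return DICTIONARY.get(word, [])


variants = pronunciations("economics")
```

Storing a list per word is what lets the recognizer accept either pronunciation of "economics" as the same word.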

Elements of a Spoken Language System (cont’d)
  • Grammar
      • Defines everything caller can say to system
      • Includes all possible strings of words and rules that associate meaning to strings
      • Two types of grammars:
          • Rule-based grammar – set of explicit rules that completely define the grammar
          • Statistical language model (SLM) – statistical grammar created from the probability of word occurrence in a given context
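
The contrast between the two grammar types can be sketched in a few lines. Everything here is a toy assumption: the phrases, the tiny training corpus, and the helper names `rule_based_strings` and `train_bigram` are invented for illustration, and a bigram model is just one simple form an SLM might take.

```python
from collections import Counter
from itertools import product

# Rule-based: explicit rules enumerate every string the caller may say.
CITIES = ["Dallas", "Boston"]
PREFIXES = [[], ["fly", "to"]]  # optional carrier phrase


def rule_based_strings():
    """Expand the rules into the complete set of allowed word strings."""
    return {" ".join(prefix + [city])
            for prefix, city in product(PREFIXES, CITIES)}


# Statistical: estimate P(word | previous word) from example sentences.
def train_bigram(sentences):
    pair_counts = Counter()
    context_counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        for prev, cur in zip(words, words[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1

    def prob(prev, cur):
        if context_counts[prev] == 0:
            return 0.0
        return pair_counts[(prev, cur)] / context_counts[prev]

    return prob


corpus = ["fly to Dallas", "fly to Boston", "fly to Boston"]
prob = train_bigram(corpus)
```

The rule-based grammar accepts exactly the enumerated strings, whereas the bigram model assigns every word sequence a probability learned from the corpus.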

Elements of a Spoken Language System (cont’d)
  • Recognition search
      • For each word model as defined in grammar:
          • Defined in dictionary
          • Has appropriate sequence of acoustic models
      • Feature vectors compared to word model
  • Recognition
      • Comparison of possible models against sequence of feature vectors to find best match
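
The search can be sketched with template matching. Note the substitution: this example uses dynamic time warping (DTW) over toy prototype vectors purely to make the "compare and pick the best match" step concrete; it is not claimed to be the search the slides have in mind. The phoneme labels, prototype vectors, and words below are all hypothetical.

```python
import math

# Hypothetical acoustic models: one prototype feature vector per phoneme.
ACOUSTIC = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
# Hypothetical dictionary: word -> sequence of phonemes.
DICTIONARY = {"ab": ["a", "b"], "ba": ["b", "a"], "cab": ["c", "a", "b"]}
# Hypothetical grammar: the words the caller is allowed to say.
GRAMMAR = ["ab", "ba", "cab"]


def dtw(seq_a, seq_b):
    """Dynamic time warping distance between two vector sequences."""
    inf = float("inf")
    cost = [[inf] * (len(seq_b) + 1) for _ in range(len(seq_a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(seq_a) + 1):
        for j in range(1, len(seq_b) + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stay on same model state
                                 cost[i][j - 1],      # skip ahead in the model
                                 cost[i - 1][j - 1])  # advance both
    return cost[-1][-1]


def recognize(features):
    """Score every word model in the grammar; the lowest distance wins."""
    scores = {}
    for word in GRAMMAR:
        word_model = [ACOUSTIC[p] for p in DICTIONARY[word]]
        scores[word] = dtw(features, word_model)
    return min(scores, key=scores.get), scores


observed = [[0.9, 0.1], [1.0, 0.0], [0.1, 0.9], [0.0, 1.0]]
best_word, all_scores = recognize(observed)
```

Each word model is built by looking the word up in the dictionary and concatenating the corresponding acoustic models, which mirrors the three-part recognition model described earlier.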

Elements of a Spoken Language System (cont’d)
  • 3 important features of recognition
      1. Confidence measures
      2. N-best processing
      3. Barge-in

Elements of a Spoken Language System (cont’d)
  • Confidence measures
      • Quantitative measure of the recognizer’s confidence that it found the right match
      • Measure of closeness between feature vectors of caller’s utterance and best-matching path
      • Used by designers in design process
          • e.g. to determine if explicit confirmation is required
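
One way a designer might use a confidence measure is to compare it against thresholds. The sketch below is an assumption: the function name `next_action`, the threshold values, and the third "reprompt" outcome are illustrative choices, not something the slides specify.

```python
def next_action(confidence, accept=0.8, reject=0.4):
    """Map a recognizer confidence score (0..1) to a dialog action.

    The thresholds are hypothetical and would be tuned per application.
    """
    if confidence >= accept:
        return "accept"    # confident enough to proceed without confirming
    if confidence >= reject:
        return "confirm"   # ask the caller for explicit confirmation
    return "reprompt"      # too uncertain; ask the caller to repeat


action = next_action(0.55)
```

With these example thresholds, a mid-range score triggers an explicit confirmation such as "Did you say Boston?".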

Elements of a Spoken Language System (cont’d)
  • N-best processing
      • A number of results (best possible matches) returned with their confidence measures
  • Barge-in
      • Allows callers to interrupt prompt
      • Recognizer starts listening at beginning of prompt
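
N-best processing can be pictured as returning the top few hypotheses rather than a single answer. In this sketch the hypothesis list, its confidence values, and the helper names `n_best` and `first_not_rejected` are made up for illustration.

```python
def n_best(hypotheses, n=3):
    """Return the n highest-confidence (text, confidence) pairs."""
    return sorted(hypotheses, key=lambda h: h[1], reverse=True)[:n]


def first_not_rejected(ranked, rejected):
    """Pick the best hypothesis the caller has not already rejected."""
    for text, confidence in ranked:
        if text not in rejected:
            return text, confidence
    return None


# Hypothetical recognizer output for one utterance.
hypotheses = [("Houston", 0.21), ("Boston", 0.62),
              ("Dallas", 0.05), ("Austin", 0.58)]
top_two = n_best(hypotheses, n=2)
fallback = first_not_rejected(top_two, rejected={"Boston"})
```

Keeping the runner-up is useful when the caller says "no" to the first guess: the application can offer the next candidate instead of starting over.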

Elements of a Spoken Language System (cont’d)
  • Natural language understanding
      • Assigns meaning to spoken words
      • Slots defined for each item of information required
      • Example
  • Dialog manager
      • Determines application’s next step
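
Slot filling and the dialog manager's choice of next step can be sketched together. This is a hypothetical travel example, not the one from the original slide: the slot names, the keyword rules, and the functions `fill_slots` and `next_step` are assumptions made for illustration.

```python
# Hypothetical vocabulary: lowercase spoken form -> canonical city name.
CITIES = {"dallas": "Dallas", "boston": "Boston"}


def fill_slots(utterance):
    """Assign meaning to recognized words by filling named slots."""
    slots = {"origin": None, "destination": None}
    words = utterance.lower().split()
    for prev, word in zip(words, words[1:]):
        if word in CITIES:
            if prev == "from":
                slots["origin"] = CITIES[word]
            elif prev == "to":
                slots["destination"] = CITIES[word]
    return slots


def next_step(slots):
    """Dialog manager: ask for the first missing slot, else confirm."""
    for name, value in slots.items():
        if value is None:
            return f"ask_{name}"
    return "confirm_itinerary"


full = fill_slots("I want to fly from Dallas to Boston")
partial = fill_slots("to Boston")
```

When a slot is still empty, the dialog manager's next step is to prompt for it; once every slot is filled, it can move on to confirmation.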