Fundamentals of Speech Recognition
• Goal: automatic recognition of speech by machine.

Fundamentals of Speech Recognition
• Disciplines applied to most speech recognition problems:
➢ Signal processing: the process of extracting relevant information from the speech signal in an efficient and robust manner.
➢ Physics: the science of understanding the relationship between the physical speech signal and the physiological mechanisms that produce speech and with which speech is perceived.
➢ Pattern recognition: the research area that studies the operation and design of systems that recognize patterns in data.

Fundamentals of Speech Recognition
➢ Communication and information theory: the methods for detecting the presence of particular speech patterns.
➢ Linguistics: the relationships between sounds (phonology), words in a language (syntax), the meaning of spoken words (semantics), and the sense derived from that meaning (pragmatics).
➢ Physiology: understanding of the mechanisms within the human central nervous system that account for speech production and perception.

Fundamentals of Speech Recognition
➢ Computer science: the study of efficient algorithms for implementing, in software and hardware, the various methods used in a practical speech recognition system.
➢ Psychology: the science of understanding the factors that enable a technology to be used by human beings in practical tasks.

The Paradigm for Speech Recognition
• Word recognition model: the spoken output is recognized, that is, the speech signal is decoded into a series of words that are meaningful according to syntax, semantics, and pragmatics.
• Higher-level processor: obtains the meaning of the recognized words. It uses a dynamic knowledge representation to modify the syntax, semantics, and pragmatics according to the context of what it has previously recognized.
• Feedback: the updated constraints limit the search for valid input sentences from the user.
• Response: the system responds to the user in the form of a voice output.
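The loop described on this slide can be illustrated with a toy sketch. It is not a real recognizer: the `GRAMMAR` table, the function names, and the hand-supplied hypothesis scores are all hypothetical, and the text responses only stand in for voice output. The point is to show how context from the higher-level processor restricts which sentences the word recognition stage may choose.

```python
# Hypothetical task grammar: for each dialogue context, the sentences
# that count as valid input. This plays the role of the dynamic
# knowledge representation.
GRAMMAR = {
    "start": {"call", "play music"},
    "call": {"home", "office"},
}


def recognize(hypotheses, context):
    """Stand-in for the word recognition model.

    `hypotheses` maps candidate sentences to scores that a real decoder
    would compute from the speech signal. Only sentences valid in the
    current context are searched (the feedback path).
    """
    valid = {s: score for s, score in hypotheses.items() if s in GRAMMAR[context]}
    return max(valid, key=valid.get) if valid else None


def interpret(sentence, context):
    """Stand-in for the higher-level processor.

    Returns the system response and the context for the next turn.
    """
    if sentence is None:
        return "Sorry, please repeat.", context
    if context == "start" and sentence == "call":
        return "Who should I call?", "call"
    if context == "call":
        return f"Calling {sentence}.", "start"
    return f"OK: {sentence}.", "start"


def run_dialogue(turns):
    """Run the recognize -> interpret -> respond loop over several turns."""
    context = "start"
    responses = []
    for hypotheses in turns:
        sentence = recognize(hypotheses, context)
        response, context = interpret(sentence, context)
        responses.append(response)  # would be spoken in a real system
    return responses


turns = [
    {"call": 0.6, "home": 0.7},  # "home" scores higher but is not valid yet
    {"call": 0.8, "home": 0.5},  # now only "home" and "office" are valid
]
print(run_dialogue(turns))
```

In both turns the higher-scoring hypothesis is rejected because it is not valid in the current context, which is the sense in which the feedback limits the search for valid input sentences.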

Go through a Brief History