74.419 Artificial Intelligence 2004: Speech & Natural Language
- Slides: 18
74.419 Artificial Intelligence 2004
Speech & Natural Language Processing
• Speech Recognition
  • acoustic signal as input
  • conversion into written words
• Natural Language Processing
  • written text as input
  • sentences (well-formed or not)
• Spoken Language Understanding
  • analysis of spoken language (transcribed speech)
Speech & Natural Language Processing
Areas in Speech Recognition
• Signal Processing
• Phonetics
• Word Recognition
Areas in Natural Language Processing
• Morphology
• Grammar & Parsing (syntactic analysis)
• Semantics
• Pragmatics
• Discourse / Dialogue
Spoken Language Understanding
Speech Production & Reception
Sound and Hearing
• change in air pressure → sound wave
• reception through inner ear membrane / microphone
• break-up into frequency components: receptors in cochlea / mathematical frequency analysis (e.g. Fast Fourier Transform, FFT) → Frequency Spectrum
• perception / recognition of phonemes and subsequently words (e.g. Neural Networks, Hidden Markov Models)
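The "mathematical frequency analysis" step can be sketched as follows. This is a minimal illustration, not the method used in the course: it assumes a textbook recursive radix-2 FFT, a hypothetical 8000 Hz sampling rate, and a synthetic 1000 Hz test tone in place of real speech. The names `fft`, `dominant_frequency`, `SAMPLE_RATE` and `tone` are introduced here for illustration only.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])  # transform of even-indexed samples
    odd = fft(samples[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples, sample_rate):
    """Frequency (Hz) of the strongest spectral bin below the Nyquist limit."""
    spectrum = fft(samples)
    half = len(samples) // 2
    magnitudes = [abs(c) for c in spectrum[:half]]
    peak_bin = max(range(1, half), key=lambda k: magnitudes[k])  # skip DC bin 0
    return peak_bin * sample_rate / len(samples)


SAMPLE_RATE = 8000  # Hz (assumed for this example)
N = 1024            # number of samples, a power of two

# Synthetic "sound wave": a pure 1000 Hz tone standing in for real speech.
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(N)]

print(dominant_frequency(tone, SAMPLE_RATE))  # 1000.0
```

Real speech has many strong components at once, so a recognizer would keep the whole magnitude spectrum of each time frame rather than a single peak.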
Speech Recognition
Signal Processing / Analysis
• Acoustic / sound wave
  → Filtering, Sampling
  → Spectral Analysis; FFT
• Frequency Spectrum
  → Features (Phonemes; Context)
  → Phoneme Recognition: HMM, Neural Networks
• Phonemes
  → Grammar or Statistics
• Phoneme Sequences / Words
  → Grammar or Statistics for likely word sequences
• Word Sequence / Sentence
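The HMM-based phoneme recognition stage can be illustrated with Viterbi decoding over a toy model. This is a sketch under stated assumptions: the three phoneme states, the three quantized acoustic symbols ("burst", "vowel", "silence") and all probabilities are invented for the example and are not taken from the slides or from any real recognizer.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its log-probability)."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical toy model for the word "cat" (/k/ /ae/ /t/).
states = ["k", "ae", "t"]
start_p = {"k": 0.8, "ae": 0.1, "t": 0.1}
trans_p = {
    "k": {"k": 0.5, "ae": 0.4, "t": 0.1},
    "ae": {"k": 0.1, "ae": 0.6, "t": 0.3},
    "t": {"k": 0.1, "ae": 0.1, "t": 0.8},
}
emit_p = {
    "k": {"burst": 0.7, "vowel": 0.1, "silence": 0.2},
    "ae": {"burst": 0.1, "vowel": 0.8, "silence": 0.1},
    "t": {"burst": 0.4, "vowel": 0.1, "silence": 0.5},
}

# One acoustic symbol per time frame (assumed output of the feature stage).
observed = ["burst", "vowel", "vowel", "silence"]
path, log_prob = viterbi(observed, states, start_p, trans_p, emit_p)
print(path)  # ['k', 'ae', 'ae', 't']
```

Collapsing repeated states in the decoded path gives the phoneme sequence k, ae, t, which the later grammar or statistics stage would map to a word.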
Speech Signal
Analog-digital conversion of the acoustic signal → sampling; analysis of the signal in time frames ("windows")
Characteristics of a Speech Signal
• formants: strong frequency components; characterize e.g. vowels, gender of speaker; appear as dark stripes in the spectrogram
• pitch: fundamental frequency (baseline for the higher-frequency harmonics, which the formants emphasize)
• place of articulation (recognition model based on model of vocal tract)
• change in frequency distribution
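Splitting a sampled signal into time frames and estimating the pitch of each frame might look like the sketch below. The assumptions are loud ones: the autocorrelation method is one common way to estimate the fundamental frequency and is not named in the slides, and the sampling rate, frame length, hop size, search range and synthetic 200 Hz test signal are all chosen for illustration.

```python
import math


def split_into_frames(signal, frame_len, hop):
    """Cut a sampled signal into overlapping time frames ("windows")."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def estimate_pitch(frame, sample_rate, fmin=80.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak."""
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered

    def autocorr(lag):
        # Similarity between the frame and a copy of itself shifted by `lag`.
        return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return sample_rate / best_lag


SAMPLE_RATE = 8000  # Hz (assumed)

# Synthetic voiced sound: 200 Hz fundamental plus two weaker harmonics.
signal = [
    math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 400 * t / SAMPLE_RATE)
    + 0.25 * math.sin(2 * math.pi * 600 * t / SAMPLE_RATE)
    for t in range(1024)
]

# 256 samples = 32 ms per frame, with 50% overlap between frames.
windows = split_into_frames(signal, frame_len=256, hop=128)
pitches = [estimate_pitch(w, SAMPLE_RATE) for w in windows]
print(len(windows), [round(p) for p in pitches])  # pitch values close to 200 Hz
```

The frame must be long enough to hold at least two periods of the lowest pitch searched for, which is why 256 samples are used with an 80 Hz lower bound here.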
Video of glottis and speech signal in lingWAVES (from http://www.lingcom.de)
Speech Recognition Characteristics
• Speech recognition vs. speaker identification
• Speaker-dependent vs. speaker-independent
• Single word vs. continuous speech
• Large vs. small vocabulary
Additional References
Huang, X., A. Acero & H. Hon: Spoken Language Processing. A Guide to Theory, Algorithms, and System Development. Prentice Hall, NJ, 2001.