Landmark-Based Speech Recognition: Spectrogram Reading, Support Vector Machines, Dynamic Bayesian Networks, and Phonology
Mark Hasegawa-Johnson, jhasegaw@uiuc.edu
University of Illinois at Urbana-Champaign, USA
Lecture 2: Acoustics of Vowel and Glide Production
• One-Dimensional Linear Acoustics
  – The Acoustic Wave Equation
  – Transmission Lines
  – Standing Wave Patterns
• One-Tube Models
  – Schwa
  – Front cavity resonance of fricatives
• Two-Tube Models
  – The vowel /a/
  – Helmholtz Resonator
  – The vowels /u, i, e/
• Perturbation Theory
  – The vowels /u/, /o/ revisited
  – Glides
1. One-Dimensional Acoustic Wave Equation and Solutions
Acoustics: Constitutive Equations
Acoustic Plane Waves: Time Domain
Acoustic Plane Waves: Frequency Domain
Solution for a Tube with Constant Area and Hard Walls
2. One-Tube Models
Boundary Conditions (tube extends from x = 0 to x = L)
Resonant Frequencies
Standing Wave Patterns
Standing Wave Patterns: Quarter-Wave Resonators
– Tube Closed at the Left End, Open at the Right End
Standing Wave Patterns: Half-Wave Resonators
– Tube Closed at Both Ends
– Tube Open at Both Ends
Schwa and Invv (the vowels in “a tug”)
– F1 = c/(4L) = 500 Hz
– F2 = 3c/(4L) = 1500 Hz
– F3 = 5c/(4L) = 2500 Hz
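The one-tube resonance formulas can be checked numerically. The sketch below is illustrative only: the speed of sound (354 m/s) and the tract length (17.7 cm) are assumed values chosen to reproduce the slide's 500, 1500, 2500 Hz pattern, and the function names are hypothetical.

```python
# Assumed values (not stated on the slide).
SPEED_OF_SOUND = 354.0  # m/s
TRACT_LENGTH = 0.177    # m

def quarter_wave_resonances(length, count=3, c=SPEED_OF_SOUND):
    """Tube closed at one end, open at the other: F_n = (2n - 1) c / (4 L)."""
    return [(2 * n - 1) * c / (4.0 * length) for n in range(1, count + 1)]

def half_wave_resonances(length, count=3, c=SPEED_OF_SOUND):
    """Tube with identical ends (both closed or both open): F_n = n c / (2 L).

    Only n >= 1 is listed; the 0 Hz solution is omitted here.
    """
    return [n * c / (2.0 * length) for n in range(1, count + 1)]

if __name__ == "__main__":
    # Schwa modeled as a uniform quarter-wave tube.
    print([round(f) for f in quarter_wave_resonances(TRACT_LENGTH)])
    print([round(f) for f in half_wave_resonances(TRACT_LENGTH)])
```

With these assumed values the quarter-wave resonances come out at approximately 500, 1500, and 2500 Hz, and the half-wave resonances at approximately 1000, 2000, and 3000 Hz.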
Front Cavity Resonances of a Fricative
– /s/: Front Cavity Resonance = 4500 Hz = c/(4L) if Front Cavity Length is L = 1.9 cm
– /sh/: Front Cavity Resonance = 2200 Hz = c/(4L) if Front Cavity Length is L = 4.0 cm
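The same quarter-wave relation can be inverted to estimate a front cavity length from an observed resonance. This is a rough sketch, assuming c = 354 m/s (the slide does not state the value it used, so the lengths differ slightly from the rounded figures above); the function name is hypothetical.

```python
SPEED_OF_SOUND = 354.0  # m/s, assumed value

def front_cavity_length(resonance_hz, c=SPEED_OF_SOUND):
    """Invert F = c / (4 L) to get the front cavity length L in meters."""
    return c / (4.0 * resonance_hz)

if __name__ == "__main__":
    for label, freq in [("/s/", 4500.0), ("/sh/", 2200.0)]:
        length_cm = 100.0 * front_cavity_length(freq)
        print(f"{label}: {freq:.0f} Hz -> L of about {length_cm:.1f} cm")
```

Under this assumption the estimates are close to 2.0 cm for /s/ and 4.0 cm for /sh/, in line with the slide's approximate values.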
3. Two-Tube Models
Conservation of Mass at the Juncture of Two Tubes
– If A2 = A1/2, then U2(x, t) = 2 U1(x, t)
– Total liters/second transmitted = (velocity) × (tube area)
Two-Tube Model: Two Different Sets of Waves
– Tube 1: Incident Wave P1+, Reflected Wave P1−
– Tube 2: Incident Wave P2−, Reflected Wave P2+
Two-Tube Model: Solution in the Time Domain
Two-Tube Model in the Frequency Domain
Approximate Solution of the Two-Tube Model, A1 >> A2 (tube lengths LBACK and LFRONT)
Approximate solution: Assume that the two tubes are completely decoupled, so that the formants include
– F(BACK CAVITY) = c/(4 LBACK)
– F(FRONT CAVITY) = c/(4 LFRONT)
The Vowels /AA/, /AH/
– LBACK = 8.8 cm: F2 = c/(4 LBACK) = 1000 Hz
– LFRONT = 12.6 cm: F1 = c/(4 LFRONT) = 700 Hz
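The decoupled approximation amounts to computing quarter-wave resonances for each cavity separately and merging them in ascending order. A minimal sketch, assuming c = 354 m/s and treating both cavities as quarter-wave resonators; the function names are hypothetical.

```python
SPEED_OF_SOUND = 354.0  # m/s, assumed value

def quarter_wave_resonances(length, count, c=SPEED_OF_SOUND):
    """Resonances (2n - 1) c / (4 L) of a tube closed at one end, open at the other."""
    return [(2 * n - 1) * c / (4.0 * length) for n in range(1, count + 1)]

def decoupled_two_tube_formants(back_length, front_length, count=3):
    """Merge the resonances of the two decoupled cavities, lowest first."""
    merged = (quarter_wave_resonances(back_length, count)
              + quarter_wave_resonances(front_length, count))
    return sorted(merged)[:count]

if __name__ == "__main__":
    # Cavity lengths from the /AA/, /AH/ slide, converted to meters.
    formants = decoupled_two_tube_formants(back_length=0.088, front_length=0.126)
    print([round(f) for f in formants])
```

With the slide's cavity lengths, the two lowest values land near 700 Hz (front cavity) and 1000 Hz (back cavity). The third value is the next front cavity resonance, which the slide does not list.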
Acoustic Impedance Z(x, jΩ)
Low-Frequency Approximations of Acoustic Impedance
Helmholtz Resonator: −Z1(x, jΩ) = Z2(x, jΩ)
The Vowel /i/
• Back Cavity = Pharynx. Resonances: 0 Hz, 2000 Hz, 4000 Hz
• Front Cavity = Palatal Constriction. Resonances: 0 Hz, 2500 Hz, 5000 Hz
• Back Cavity Volume = 70 cm³; Front Cavity Length/Area = 7 cm⁻¹; 1/(2π√(MC)) = 250 Hz
• Helmholtz Resonance replaces all 0 Hz partial-tube resonances.
• Resulting formants: 250 Hz, 2000 Hz, 2500 Hz
The Vowel /u/: A Two-Tube Model
• Back Cavity = Mouth + Pharynx. Resonances: 0 Hz, 1000 Hz, 2000 Hz
• Front Cavity = Lips. Resonances: 0 Hz, 18000 Hz, …
• Back Cavity Volume = 200 cm³; Front Cavity Length/Area = 2 cm⁻¹; 1/(2π√(MC)) = 250 Hz
• Helmholtz Resonance replaces all 0 Hz partial-tube resonances.
• Resulting formants: 250 Hz, 1000 Hz, 2000 Hz
The Vowel /u/: A Four-Tube Model
• Tubes: Pharynx, Velar Tongue Body Constriction, Mouth, Lips
• Two Helmholtz Resonators = Two Low-Frequency Formants!
• F1 = 250 Hz
• F2 = 500 Hz
• F3 = Pharynx resonance, c/(2L) = 2000 Hz
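The Helmholtz frequency 1/(2π√(MC)) can be evaluated from the two quantities given on the vowel slides. The sketch below assumes the usual low-frequency approximations, acoustic mass M = ρ·(length/area) for the constriction and acoustic compliance C = volume/(ρc²) for the back cavity, so the air density ρ cancels. It also assumes c = 35400 cm/s; the function name is hypothetical.

```python
import math

SPEED_OF_SOUND_CM = 35400.0  # cm/s, assumed value

def helmholtz_frequency(back_volume_cm3, length_over_area_per_cm, c=SPEED_OF_SOUND_CM):
    """Return 1 / (2 pi sqrt(M C)) in Hz.

    With M = rho * (L/A) and C = V / (rho c^2), the product is
    M C = (L/A) * V / c^2, and the density rho drops out.
    """
    mc = length_over_area_per_cm * back_volume_cm3 / (c * c)
    return 1.0 / (2.0 * math.pi * math.sqrt(mc))

if __name__ == "__main__":
    print(f"/i/: {helmholtz_frequency(70.0, 7.0):.0f} Hz")
    print(f"/u/: {helmholtz_frequency(200.0, 2.0):.0f} Hz")
```

Under these assumptions /i/ comes out near 255 Hz and /u/ near 280 Hz. Both are in the neighborhood of the rounded 250 Hz quoted on the slides, and the exact figure depends on the assumed speed of sound and cavity dimensions.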
4. Perturbation Theory
Perturbation Theory (Chiba and Kajiyama, The Vowel, 1940)
A(x) is constant everywhere, except for one small perturbation.
Method:
1. Compute formants of the “unperturbed” vocal tract.
2. Perturb the formant frequencies to match the area perturbation.
Conservation of Energy Under Perturbation
“Sensitivity” Functions
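Sensitivity Functions for the Quarter-Wave Resonator (Lips Open)
[Figure: sensitivity versus position x, from 0 to L; labeled sounds: /AA/, /ER/, /IY/, /W/]
Sensitivity Functions for the Half-Wave Resonator (Lips Rounded)
[Figure: sensitivity versus position x, from 0 to L; labeled sounds: /L, OW/, /UW/]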
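The sensitivity curves can be sketched for a uniform tube. The formula below is an assumption, since the slides' own equations did not survive extraction: it uses the standard perturbation-theory result that sensitivity is proportional to kinetic minus potential energy density, with volume velocity proportional to sin(kx), pressure proportional to cos(kx), and the glottis (closed end) at x = 0. The normalization to the range [-1, 1] and the function names are hypothetical.

```python
import math

def wavenumber(n, length, resonator):
    """Wavenumber of the n-th resonance (n = 1, 2, ...) of a uniform tube."""
    if resonator == "quarter":   # closed at x = 0, open at x = L
        return (2 * n - 1) * math.pi / (2.0 * length)
    if resonator == "half":      # closed at both ends
        return n * math.pi / length
    raise ValueError("resonator must be 'quarter' or 'half'")

def sensitivity(x, n, length, resonator="quarter"):
    """Normalized kinetic minus potential energy density at position x.

    Positive: a local constriction lowers formant n (widening raises it).
    Negative: a local constriction raises formant n.
    """
    k = wavenumber(n, length, resonator)
    return math.sin(k * x) ** 2 - math.cos(k * x) ** 2

if __name__ == "__main__":
    L = 0.17  # m, assumed tract length
    for n in (1, 2, 3):
        at_glottis = sensitivity(0.0, n, L)
        at_lips = sensitivity(L, n, L)
        print(f"F{n}: glottis {at_glottis:+.2f}, lips {at_lips:+.2f}")
```

In this sketch every quarter-wave formant has sensitivity +1 at the open lips, consistent with lip constriction lowering all formants. For the half-wave resonator, the first nonzero resonance has its positive peak at the middle of the tube.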
Formant Frequencies of Vowels From Peterson & Barney, 1952
Summary
• The acoustic wave equation is easiest to solve in the frequency domain, for example:
  – Solve two boundary condition equations for P+ and P−, or
  – Solve the two-tube model (four equations in four unknowns)
• Quarter-Wave Resonator: Open at one end, Closed at the other
  – Schwa or Invv (“a tug”)
  – Front cavity resonance of a fricative or stop
• Half-Wave Resonator: Closed at the glottis, Nearly closed at the lips
  – /uw/
• Two-Tube Models
  – Exact solution: use reflection coefficient
  – Approximate solution: decouple the tubes, solve separately
• Helmholtz Resonator
  – When the two-tube model seems to have resonances at 0 Hz, use, instead, the Helmholtz Resonance frequency, computed with low-frequency approximations of acoustic impedance
  – /iy/: F1 is a Helmholtz Resonance
  – /uw/ and /ow/: Both F1 and F2 are Helmholtz Resonances
• Perturbation Theory
  – Perturbed area → Perturbed formants
  – Sensitivity function explains most vowels and glides in one simple chart