PATTERN COMPARISON TECHNIQUES

Test Pattern, Reference Pattern

4.2 SPEECH (ENDPOINT) DETECTION

4.3 DISTORTION MEASURES: MATHEMATICAL CONSIDERATIONS

x and y: two feature vectors defined on a vector space X.
The properties of a metric or distance function d:
A distance function is called invariant if:
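For reference, the standard metric axioms and the usual translation-invariance condition can be written as follows. This is a sketch of the textbook definitions, assumed to be what the slide lists; the labels (a) to (c) are illustrative.

```latex
\begin{align*}
&\text{(a)}\quad 0 \le d(x, y) < \infty \ \text{ for } x, y \in X, \quad d(x, y) = 0 \iff x = y \\
&\text{(b)}\quad d(x, y) = d(y, x) \\
&\text{(c)}\quad d(x, y) \le d(x, z) + d(z, y) \ \text{ for } x, y, z \in X \\
&\text{invariance:}\quad d(x + z,\; y + z) = d(x, y)
\end{align*}
```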

PERCEPTUAL CONSIDERATIONS

Spectral changes that do not fundamentally change the perceived sound include:

Spectral changes that lead to phonetically different sounds include:

Just-discriminable change: known as JND (just-noticeable difference), DL (difference limen), or differential threshold.

4.4 DISTORTION MEASURES: PERCEPTUAL CONSIDERATIONS

Spectral Distortion Measures

Spectral density, Fourier coefficients of the spectral density, and the autocorrelation function.

Short-term autocorrelation. The corresponding spectrum is then an energy spectral density.
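A minimal sketch of a short-term autocorrelation, assuming the common definition r(i) = sum over m of x(m) x(m+i) on a finite frame with the signal taken as zero outside it. The function name and the sample frame are illustrative, not from the slides.

```python
def short_term_autocorrelation(frame, max_lag):
    """Return [r(0), ..., r(max_lag)] with r(i) = sum_{m=0}^{N-1-i} x(m) x(m+i)."""
    n = len(frame)
    return [
        sum(frame[m] * frame[m + i] for m in range(n - i))
        for i in range(max_lag + 1)
    ]


# Hypothetical 4-sample frame, lags 0..2.
frame = [1.0, 2.0, 3.0, 4.0]
print(short_term_autocorrelation(frame, 2))  # [30.0, 20.0, 11.0]
```

The lag-0 value is the frame energy, which is why the transform of this sequence is an energy (not power) spectral density.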

Autocorrelation matrices

If σ/A(z) is the all-pole model for the speech spectrum, the residual energy resulting from “inverse filtering” the input signal with an all-zero filter A(z) is:
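The residual energy can be illustrated numerically. The sketch below assumes the autocorrelation method (signal taken as zero outside the frame), in which case the energy of the inverse-filtered signal equals the quadratic form a^T R a, with R the Toeplitz matrix of short-term autocorrelations. The coefficient vector `a` and all function names are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Short-term autocorrelation r(0..max_lag) of a finite frame."""
    n = len(x)
    return [sum(x[m] * x[m + i] for m in range(n - i)) for i in range(max_lag + 1)]


def inverse_filter_energy(x, a):
    """Energy of e(n) = sum_k a[k] x(n-k), with x taken as zero outside the frame."""
    p = len(a) - 1
    energy = 0.0
    for n in range(len(x) + p):
        e = sum(a[k] * x[n - k] for k in range(p + 1) if 0 <= n - k < len(x))
        energy += e * e
    return energy


def quadratic_form_energy(r, a):
    """a^T R a, where R has entries r(|i - j|)."""
    p = len(a) - 1
    return sum(a[i] * a[j] * r[abs(i - j)] for i in range(p + 1) for j in range(p + 1))


x = [1.0, 2.0, 3.0, 4.0]
a = [1.0, -0.5]  # hypothetical A(z) = 1 - 0.5 z^-1
r = autocorrelation(x, len(a) - 1)
print(inverse_filter_energy(x, a), quadratic_form_energy(r, a))  # 17.5 17.5
```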

Important properties of all-pole modeling:
The recursive minimization relationship:

LOG SPECTRAL DISTANCE
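A minimal sketch of a log spectral distance, assuming the usual L_p definition: the p-th root of the average over frequency of |log S(ω) - log S'(ω)|^p, approximated here on a uniform frequency grid. Names and sample values are illustrative.

```python
import math


def log_spectral_distance(s1, s2, p=2):
    """L_p norm of log S - log S' averaged over a uniform frequency grid.

    s1 and s2 are power spectra (positive values) sampled at the same frequencies.
    """
    diffs = [abs(math.log(u) - math.log(v)) for u, v in zip(s1, s2)]
    return (sum(d ** p for d in diffs) / len(diffs)) ** (1.0 / p)


# A pure gain change of 2 gives a distance of log(2) for every p.
s = [1.0, 2.0, 4.0]
s_scaled = [2.0 * v for v in s]
print(log_spectral_distance(s, s_scaled, p=2))
```

With power spectra and natural logarithms, multiplying the result by 10 / ln(10) expresses it in dB.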

CEPSTRAL DISTANCES

The complex cepstrum of a signal is defined as the Fourier transform of the log of the signal spectrum.

Truncated cepstral distance
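A minimal sketch of a truncated cepstral distance, assuming the common form that sums squared differences of the first L cepstral coefficients and leaves out c_0 (the gain term). The list layout, with index 0 holding c_0, is an assumption made for this example.

```python
def truncated_cepstral_distance(c1, c2, num_terms):
    """Sum over n = 1..num_terms of (c_n - c'_n)^2; index 0 (c_0) is skipped."""
    return sum((c1[n] - c2[n]) ** 2 for n in range(1, num_terms + 1))


# Hypothetical cepstra; the large c_0 mismatch does not affect the result.
c_test = [0.5, 1.0, 0.5, 0.25]
c_ref = [9.0, 0.5, 0.5, 0.0]
print(truncated_cepstral_distance(c_test, c_ref, 3))  # 0.3125
```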

Weighted Cepstral Distances and Liftering

It can be shown that under certain regular conditions, the cepstral coefficients, except c_0, have:
1) zero means;
2) variances essentially inversely proportional to the square of the coefficient index.
If we normalize the cepstral distance by the inverse of the variance:
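Since the variance of c_n falls off roughly as 1/n^2, normalizing by the inverse variance amounts to weighting each coefficient difference by its index n. The sketch below assumes that form, sum of n^2 (c_n - c'_n)^2 over n = 1..L; names and values are illustrative.

```python
def index_weighted_cepstral_distance(c1, c2, num_terms):
    """Sum over n = 1..num_terms of (n * (c_n - c'_n))^2; index 0 (c_0) is skipped."""
    return sum((n * (c1[n] - c2[n])) ** 2 for n in range(1, num_terms + 1))


c_test = [0.5, 1.0, 0.5, 0.25]
c_ref = [9.0, 0.5, 0.5, 0.0]
print(index_weighted_cepstral_distance(c_test, c_ref, 3))  # 0.8125
```

Compared with the unweighted sum, higher-order coefficients contribute more, which matches the interpretation of this distance as a comparison of spectral slopes.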

Differentiating both sides of the Fourier series equation of the log spectrum:
This is an L2 distance based upon the differences between the spectral slopes.

Cepstral Weighting or Liftering Procedure

h is usually chosen as L/2, and L is typically 10 to 16.

A useful form of weighted cepstral distance:
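A minimal sketch of liftering followed by a weighted cepstral distance. It assumes the raised-sine lifter w(n) = 1 + h sin(nπ/L) for n = 1..L, which is the window usually paired with the choice h = L/2; whether the slides use exactly this window is an assumption, and all names are illustrative.

```python
import math


def raised_sine_lifter(num_terms, h=None):
    """Weights w(1..L) with w(n) = 1 + h * sin(n * pi / L); h defaults to L / 2."""
    if h is None:
        h = num_terms / 2.0
    return [1.0 + h * math.sin(math.pi * n / num_terms) for n in range(1, num_terms + 1)]


def liftered_cepstral_distance(c1, c2, num_terms, h=None):
    """Sum over n = 1..L of (w(n) * (c_n - c'_n))^2; index 0 (c_0) is skipped."""
    w = raised_sine_lifter(num_terms, h)
    return sum((w[n - 1] * (c1[n] - c2[n])) ** 2 for n in range(1, num_terms + 1))


weights = raised_sine_lifter(12)
print(max(weights))  # peak weight sits at n = L / 2
```

The window de-emphasizes both the lowest-order coefficients (sensitive to spectral tilt) and the highest-order ones (noisier), which is the usual motivation for a band-pass lifter.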

Likelihood Distortions

Previously defined: the Itakura-Saito distortion measure, expressed in terms of the one-step prediction errors of the two spectra being compared, as defined:
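A minimal numerical sketch of the Itakura-Saito distortion, assuming its standard spectral form: the average over frequency of S/S' - log(S/S') - 1, here approximated on a uniform grid. Names and sample spectra are illustrative.

```python
import math


def itakura_saito(s1, s2):
    """Average over the grid of S/S' - log(S/S') - 1 (s1 is S, s2 is S')."""
    total = 0.0
    for u, v in zip(s1, s2):
        ratio = u / v
        total += ratio - math.log(ratio) - 1.0
    return total / len(s1)


# The measure is not symmetric: swapping the arguments changes the value.
print(itakura_saito([2.0, 2.0], [1.0, 1.0]))
print(itakura_saito([1.0, 1.0], [2.0, 2.0]))
```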

The residual energy can be easily evaluated by:

By replacing the spectrum with its optimal p-th order LPC model spectrum:
If we set σ² to match the residual energy α:
The result is often referred to as the Itakura distortion measure.

Another way to write the Itakura distortion measure is:
Another gain-independent distortion measure is called the Likelihood Ratio distortion:
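Both gain-independent measures can be computed from autocorrelations and LPC coefficient vectors. The sketch below assumes the standard forms: the likelihood ratio distortion is a'^T R a' / (a^T R a) - 1 and the Itakura distortion is the log of the same ratio, where a is the optimal coefficient vector for R and a' is the vector being compared. Function names and the first-order example are hypothetical.

```python
import math


def residual_energy(r, a):
    """a^T R a, where R is the Toeplitz matrix with entries r(|i - j|)."""
    p = len(a) - 1
    return sum(a[i] * a[j] * r[abs(i - j)] for i in range(p + 1) for j in range(p + 1))


def likelihood_ratio_distortion(r, a_opt, a_other):
    """Ratio of residual energies minus one; a_opt must be optimal for r."""
    return residual_energy(r, a_other) / residual_energy(r, a_opt) - 1.0


def itakura_distortion(r, a_opt, a_other):
    """Log of the same residual-energy ratio."""
    return math.log(residual_energy(r, a_other) / residual_energy(r, a_opt))


r = [30.0, 20.0]                 # hypothetical autocorrelation, lags 0 and 1
a_opt = [1.0, -r[1] / r[0]]      # optimal first-order predictor for r
a_other = [1.0, -0.5]            # hypothetical model being compared
print(likelihood_ratio_distortion(r, a_opt, a_other))
print(itakura_distortion(r, a_opt, a_other))
```

Because a_opt minimizes the residual energy for r, the ratio is at least 1 and both distortions are non-negative.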

4.5.4 Likelihood Distortions

That is, when the distortion is small, the Itakura distortion measure is not very different from the LR distortion measure.
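One way to see this, assuming the standard definitions in which the LR distortion is a residual-energy ratio minus one and the Itakura distortion is the logarithm of the same ratio (the symbols d_I and d_LR are notation chosen here):

```latex
d_I = \log\left(1 + d_{LR}\right)
    = d_{LR} - \tfrac{1}{2}\, d_{LR}^{2} + \cdots
    \approx d_{LR} \qquad \text{for small } d_{LR}
```

For example, d_LR = 0.05 gives d_I = log(1.05), approximately 0.0488.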

Consider the Itakura-Saito distortion between the input and output of a linear system H(z).

4.5.5 Variations of Likelihood Distortions

Symmetric distortion measures:

COSH distortion
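A minimal sketch of the COSH distortion, assuming its standard definition as the average over frequency of cosh(V(ω)) - 1 with V = log S - log S'. Under that definition it equals half the sum of the two Itakura-Saito distortions taken in opposite directions, which the example checks numerically. All names and sample spectra are illustrative.

```python
import math


def itakura_saito(s1, s2):
    """Average over the grid of S/S' - log(S/S') - 1."""
    return sum(u / v - math.log(u / v) - 1.0 for u, v in zip(s1, s2)) / len(s1)


def cosh_distortion(s1, s2):
    """Average over the grid of cosh(log S - log S') - 1."""
    return sum(
        math.cosh(math.log(u) - math.log(v)) - 1.0 for u, v in zip(s1, s2)
    ) / len(s1)


s_a = [1.0, 2.0, 4.0]
s_b = [2.0, 2.0, 3.0]
symmetrized = 0.5 * (itakura_saito(s_a, s_b) + itakura_saito(s_b, s_a))
print(cosh_distortion(s_a, s_b), symmetrized)  # the two values agree
```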

4.5.6 Spectral Distortion Using a Warped Frequency Scale

Psychophysical studies have shown that human perception of the frequency content of sounds does not follow a linear scale. This research has led to the idea of defining the subjective pitch of pure tones. For each tone with an actual frequency f, measured in Hz, a subjective pitch is measured on a scale called the “mel” scale. As a reference point, the pitch of a 1 kHz tone, 40 dB above the perceptual hearing threshold, is defined as 1000 mels.
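The mel scale is defined empirically, so any closed-form mapping is an approximation. The sketch below uses one widely used fit, mel = 2595 log10(1 + f/700); this particular formula is an assumption brought in for illustration and is not taken from the slides. It maps 1 kHz to approximately 1000 mels, consistent with the reference point above.

```python
import math


def hz_to_mel(f_hz):
    """Approximate mel value for a frequency in Hz (one common fit, assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


for f in (100.0, 1000.0, 4000.0):
    print(f, round(hz_to_mel(f), 1))
```

The mapping is close to linear below about 1 kHz and close to logarithmic above it.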

Examples of critical bandwidth

Warped cepstral distance, where b is the frequency in Barks, S(θ(b)) is the spectrum on a Bark scale, and B is the Nyquist frequency in Barks.

where the warping function is defined by:

Mel-frequency cepstrum, computed from the output power of the triangular filters.
Mel-frequency cepstral distance.
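A minimal sketch of the mel-frequency cepstrum and the associated distance, assuming the usual cosine-transform form c_n = sum over k = 1..K of log(S_k) cos(n (k - 1/2) π / K), where S_k is the output power of the k-th triangular filter. The filter-bank outputs are taken as given; function names and sample values are hypothetical.

```python
import math


def mel_cepstrum(filter_powers, num_coeffs):
    """Coefficients c_1..c_L from K filter-bank output powers (all positive)."""
    k_total = len(filter_powers)
    logs = [math.log(s) for s in filter_powers]
    return [
        sum(
            logs[k - 1] * math.cos(n * (k - 0.5) * math.pi / k_total)
            for k in range(1, k_total + 1)
        )
        for n in range(1, num_coeffs + 1)
    ]


def mel_cepstral_distance(powers_a, powers_b, num_coeffs):
    """Squared Euclidean distance between the two mel-cepstral vectors."""
    c_a = mel_cepstrum(powers_a, num_coeffs)
    c_b = mel_cepstrum(powers_b, num_coeffs)
    return sum((x - y) ** 2 for x, y in zip(c_a, c_b))


# Hypothetical outputs of a 4-filter bank for two frames.
print(mel_cepstral_distance([1.0, 2.0, 3.0, 4.0], [1.0, 4.0, 2.0, 4.0], 3))
```

Because c_0 is left out, scaling every filter output by the same gain leaves the distance essentially unchanged.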

4.5.7 Alternative Spectral Representations and Distortion Measures

Summary of Spectral Distortion Measures

[Table columns: Distortion Measure, Notation, Expression, Computation]

4.6 INCORPORATION OF SPECTRAL DYNAMIC FEATURES INTO THE DISTORTION MEASURE

Fitting the cepstral trajectory by a second-order polynomial:
Choose h_1, h_2, h_3 such that E is minimized. Differentiating E with respect to h_1, h_2, and h_3 and setting the derivatives to zero results in 3 equations:

The solutions to these equations are:
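A minimal sketch of the least-squares fit, assuming the trajectory c(t) is sampled at t = -M..M and modeled as h_1 + h_2 t + h_3 t^2. Setting the three derivatives of the squared error to zero gives closed-form solutions because the odd moments of t vanish on a symmetric window. The function name and test trajectory are illustrative.

```python
def fit_quadratic(c, half_width):
    """Least-squares (h1, h2, h3) for c(t) ~ h1 + h2*t + h3*t^2, t = -M..M.

    c must hold 2*M + 1 values ordered from t = -M to t = M.
    """
    ts = range(-half_width, half_width + 1)
    n = 2 * half_width + 1
    t2 = sum(t * t for t in ts)          # sum of t^2
    t4 = sum(t ** 4 for t in ts)         # sum of t^4
    s0 = sum(c)                          # sum of c(t)
    s1 = sum(t * v for t, v in zip(ts, c))        # sum of t * c(t)
    s2 = sum(t * t * v for t, v in zip(ts, c))    # sum of t^2 * c(t)
    h2 = s1 / t2
    h3 = (t2 * s0 - n * s2) / (t2 * t2 - n * t4)
    h1 = (s0 - h3 * t2) / n
    return h1, h2, h3


# An exactly quadratic trajectory is recovered (up to rounding).
M = 3
trajectory = [2.0 + 3.0 * t - 0.5 * t * t for t in range(-M, M + 1)]
print(fit_quadratic(trajectory, M))
```

The slope h_2 is the familiar delta-cepstrum estimate, and 2 h_3 approximates the second derivative.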

A differential spectral distance:
A second differential spectral distance:
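A minimal sketch of a first-order differential cepstral distance, assuming the delta cepstrum is the regression slope of each coefficient over a window of 2M + 1 frames and the distance is the squared Euclidean distance between the two delta vectors. The frame layout (a list of per-frame coefficient lists) and all names are assumptions made for this example.

```python
def delta_cepstrum(frames, t, half_width):
    """Regression slope of each cepstral coefficient around frame t."""
    if t - half_width < 0 or t + half_width >= len(frames):
        raise ValueError("window extends beyond the available frames")
    ks = range(-half_width, half_width + 1)
    norm = sum(k * k for k in ks)
    dim = len(frames[t])
    return [sum(k * frames[t + k][n] for k in ks) / norm for n in range(dim)]


def differential_cepstral_distance(frames_a, frames_b, t, half_width):
    """Sum of squared differences between the two delta-cepstral vectors."""
    d_a = delta_cepstrum(frames_a, t, half_width)
    d_b = delta_cepstrum(frames_b, t, half_width)
    return sum((x - y) ** 2 for x, y in zip(d_a, d_b))


rising = [[0.0], [1.0], [2.0]]   # one coefficient, slope 1 per frame
steady = [[5.0], [5.0], [5.0]]   # same coefficient, no change over time
print(differential_cepstral_distance(rising, steady, 1, 1))  # 1.0
```

Two sequences with very different static cepstra can still be close under this measure if they change in the same way over time.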

Cepstral weighting or liftering by differentiating

A weighted differential cepstral distance:

Taking the L2 distance as an example: