Mel-spectrum to Mel-cepstrum Computation: A Speech Recognition presentation

October 1, 2003
Ji Gu
J.Gu@umail.LeidenUniv.nl

What we know so far:
  • The FFT processing step converts each frame of N samples from the time domain into the frequency domain.
  • The result of the Mel-spectrum computation is:

To compute the Mel-cepstrum:
  • We convert the log Mel-spectrum back to the time domain using the Discrete Cosine Transform (DCT). A DCT suffices because the Mel-spectrum coefficients and their logarithms are real numbers.
  • The result is called the Mel-Frequency Cepstral Coefficients (MFCC).

Therefore, a DCT is applied to the natural logarithm of the Mel-spectrum to obtain the Mel-cepstrum c[n]. C is the number of cepstral coefficients.
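One way to write c[n] explicitly is to read it off the SPHINX III routines quoted in this presentation. The symbols S[j] (Mel-spectrum value of filter j), L (number of Mel filters), and β_j are names assumed here for illustration. The 1/L scaling and the half weight on the first filter follow that code; a textbook DCT-II would omit both.

```latex
c[n] = \frac{1}{L} \sum_{j=0}^{L-1} \beta_j \, \ln\bigl(S[j]\bigr)
       \cos\!\left( \frac{\pi n}{L} \left( j + \tfrac{1}{2} \right) \right),
\qquad n = 0, 1, \dots, C-1,
\qquad
\beta_j =
\begin{cases}
  \tfrac{1}{2}, & j = 0 \\
  1,            & j \ge 1
\end{cases}
```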

In the SPHINX III Signal Processing Front End Specification:
  • First, the cosine section of c[n] is computed:

    int32 fe_compute_melcosine(melfb_t *MEL_FB)
    {
        float period, freq;
        int32 i, j;

        /* period = 2L, where L is the number of Mel filters */
        period = (float)2*MEL_FB->num_filters;

        if ((MEL_FB->mel_cosine = (float **) fe_create_2d(MEL_FB->num_cepstra,
                                                          MEL_FB->num_filters,
                                                          sizeof(float))) == NULL) {
            fprintf(stderr, "memory alloc failed in fe_compute_melcosine()\n...exiting\n");
            exit(0);
        }

        for (i = 0; i < MEL_FB->num_cepstra; i++) {
            /* freq = pi*i/L */
            freq = 2*(float)M_PI*(float)i/period;
            for (j = 0; j < MEL_FB->num_filters; j++)
                MEL_FB->mel_cosine[i][j] = (float)cos((double)(freq*(j+0.5)));
        }

        return(0);
    }

  • Second, a cosine transform of the logarithm of the Mel-spectrum is computed:

    void fe_mel_cep(fe_t *FE, double *mfspec, double *mfcep)
    {
        int32 i, j;
        /* static int first_run=1; */ /* unreferenced variable */
        int32 period;
        float beta;

        period = FE->MEL_FB->num_filters;

        /* Overwrite each Mel-spectrum value with its natural log;
           non-positive values get a large negative floor. */
        for (i = 0; i < FE->MEL_FB->num_filters; ++i) {
            if (mfspec[i] > 0)
                mfspec[i] = log(mfspec[i]);
            else
                mfspec[i] = -1.0e+5;
        }

        /* Weighted sum against the precomputed cosine table. */
        for (i = 0; i < FE->NUM_CEPSTRA; ++i) {
            mfcep[i] = 0;
            for (j = 0; j < FE->MEL_FB->num_filters; j++) {
                if (j == 0)
                    beta = 0.5;
                else
                    beta = 1.0;
                mfcep[i] += beta*mfspec[j]*FE->MEL_FB->mel_cosine[i][j];
            }
            mfcep[i] /= (float)period;
        }
        return;
    }

By applying the procedure described above:
  • For each speech frame, a set of Mel-frequency cepstral coefficients (MFCC) is computed.
  • This set of coefficients is called an acoustic vector. It represents the phonetically important characteristics of speech and is very useful for further analysis and processing in speech recognition.

End