Introduction to training of HTK CHEN TZAN HWEI

Introduction to training of HTK CHEN TZAN HWEI NTNU SPEECH LAB

Two types of training
o Training data with correct labels but no correct alignment.
o Training data with correct labels and alignment.
o The general steps of training:
    - Generate an initial model
    - Train the model (several methods)
10/3/2020 NTNU SPEECH LAB

Training without correct label (1/5)
o 1. HCompV:
    - Calculates the global mean and variance of the training data (there is only one Gaussian in every model or state).
    - Also sets the initial model.
o Common parameters:
    -C : the configuration file for the feature coefficients
    -S : the script file listing the coefficient files of the training data
    -M : the output directory
    -m : update the means as well as the variances (must be specified)
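The global statistics described on this slide can be illustrated with a minimal Python sketch. It is not HCompV itself: the function name, the toy feature values, and the dictionary layout of a "state" are assumptions made only for illustration.

```python
def global_mean_and_variance(utterances):
    """Return the per-dimension mean and variance over every frame of every utterance."""
    frames = [frame for utt in utterances for frame in utt]
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n for d in range(dim)]
    return mean, var


# Two toy "utterances" of 2-dimensional feature frames (made-up numbers).
utterances = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
mean, var = global_mean_and_variance(utterances)

# Flat start: every state begins as the same single Gaussian.
# The choice of three states is arbitrary here.
flat_start_states = [{"mean": list(mean), "var": list(var)} for _ in range(3)]
```

Because no alignment is used, every state receives identical parameters; the later re-estimation steps are what make the states differ.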

Training without correct label (2/5)
o HHEd:
    - Splits the single Gaussian of every HMM state into n Gaussians.
    - n must depend on the amount of training data for the model.
o Common parameters:
    -C : the same as HCompV
    -d : the directory of the initial HMMs
    -M : the same as HCompV
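One common way to split a Gaussian can be sketched in Python as follows. This is an illustration under stated assumptions, not HHEd's code: the heaviest component is split, its weight is halved, and the two copies have their means moved by plus and minus 0.2 standard deviations. The perturbation size, the function names, and the data layout are all assumptions of this sketch.

```python
import math


def split_heaviest_component(mixture, perturb=0.2):
    """Split the component with the largest weight into two.

    mixture: list of dicts with keys "weight", "mean", "var" (diagonal covariance).
    """
    idx = max(range(len(mixture)), key=lambda i: mixture[i]["weight"])
    comp = mixture[idx]
    # Offset each mean dimension by a fraction of its standard deviation.
    offset = [perturb * math.sqrt(v) for v in comp["var"]]
    low = {
        "weight": comp["weight"] / 2,
        "mean": [m - o for m, o in zip(comp["mean"], offset)],
        "var": list(comp["var"]),
    }
    high = {
        "weight": comp["weight"] / 2,
        "mean": [m + o for m, o in zip(comp["mean"], offset)],
        "var": list(comp["var"]),
    }
    return mixture[:idx] + [low, high] + mixture[idx + 1:]


def mix_up(mixture, target):
    """Keep splitting until the mixture has `target` components."""
    while len(mixture) < target:
        mixture = split_heaviest_component(mixture)
    return mixture


single = [{"weight": 1.0, "mean": [0.0], "var": [4.0]}]
mixed = mix_up(single, 2)
```

After a split the new components are nearly identical, so the model should be re-estimated before splitting further.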

Training without correct label (3/5)
o HERest:
    - Applies forward-backward training over the whole utterance.
    - No boundary information is needed.
o Common parameters:
    -C : the same as HCompV
    -d : the directory of the initial HMMs
    -M : the same as HCompV
    -S : the same as HCompV
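The forward-backward idea can be shown on a toy HMM. The Python sketch below computes the state occupancy, that is, the probability of being in each state at each frame given the whole observation sequence, with no boundaries supplied. It is a textbook version rather than HERest's implementation: it works with plain probabilities (no log arithmetic or pruning), allows the sequence to end in any state, and all numbers are made up.

```python
def state_occupancy(init, trans, emit):
    """Return gamma[t][j], the posterior probability of state j at frame t.

    init[j]     : probability of starting in state j
    trans[i][j] : probability of moving from state i to state j
    emit[t][j]  : likelihood of frame t under state j
    """
    T, N = len(emit), len(init)
    alpha = [[0.0] * N for _ in range(T)]
    beta = [[0.0] * N for _ in range(T)]
    for j in range(N):
        alpha[0][j] = init[j] * emit[0][j]
        beta[T - 1][j] = 1.0
    # Forward pass: probability of the frames so far, ending in state j.
    for t in range(1, T):
        for j in range(N):
            alpha[t][j] = emit[t][j] * sum(
                alpha[t - 1][i] * trans[i][j] for i in range(N)
            )
    # Backward pass: probability of the remaining frames, starting from state i.
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(
                trans[i][j] * emit[t + 1][j] * beta[t + 1][j] for j in range(N)
            )
    total = sum(alpha[T - 1])
    return [[alpha[t][j] * beta[t][j] / total for j in range(N)] for t in range(T)]


# Toy two-state left-to-right model and three frames of emission likelihoods.
init = [1.0, 0.0]
trans = [[0.5, 0.5], [0.0, 1.0]]
emit = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
gamma = state_occupancy(init, trans, emit)
```

Each row of `gamma` sums to one, so every frame is shared among the states instead of being assigned to exactly one of them.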

Training without correct label (4/5)
o HERest (cont.)
o Common parameters:
    -L : the directory of the transcriptions
    -X : the file name extension of the transcription files
    -s : output the statistics of every state to the named file
    -t : the pruning threshold
    -v : set the minimum variance to the given value

Training without correct label (5/5)
o HERest (cont.)
o Common parameters:
    -T : show trace information

Training with correct label (1)
o HInit: iteratively computes initial parameter values using the segmental k-means training procedure.
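The segmental k-means loop can be sketched in Python for a single labelled segment. This is a simplified illustration, not HInit itself: the frames are one-dimensional, the model is strictly left-to-right with no skips, only the state means are estimated, and squared error stands in for the Gaussian log likelihood. All function names and data are assumptions, and the segment must contain at least as many frames as states.

```python
def uniform_alignment(num_frames, num_states):
    """Assign frames to states in equal-sized consecutive blocks."""
    return [min(t * num_states // num_frames, num_states - 1) for t in range(num_frames)]


def state_means(frames, alignment, num_states):
    """Average the frames currently assigned to each state."""
    means = []
    for s in range(num_states):
        members = [x for x, a in zip(frames, alignment) if a == s]
        means.append(sum(members) / len(members))
    return means


def viterbi_alignment(frames, means):
    """Best monotonic left-to-right alignment (no skips) under squared-error cost."""
    T, N = len(frames), len(means)
    INF = float("inf")
    cost = [[INF] * N for _ in range(T)]
    back = [[0] * N for _ in range(T)]
    cost[0][0] = (frames[0] - means[0]) ** 2  # must start in the first state
    for t in range(1, T):
        for s in range(N):
            stay = cost[t - 1][s]
            move = cost[t - 1][s - 1] if s > 0 else INF
            if move < stay:
                best, back[t][s] = move, s - 1
            else:
                best, back[t][s] = stay, s
            cost[t][s] = best + (frames[t] - means[s]) ** 2
    path = [N - 1]  # must end in the last state
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def segmental_kmeans(frames, num_states, max_iterations=20):
    """Alternate between re-estimating means and re-aligning frames."""
    alignment = uniform_alignment(len(frames), num_states)
    for _ in range(max_iterations):
        means = state_means(frames, alignment, num_states)
        new_alignment = viterbi_alignment(frames, means)
        if new_alignment == alignment:
            break
        alignment = new_alignment
    return state_means(frames, alignment, num_states), alignment


frames = [0.1, 0.0, 0.2, 5.1, 4.9, 5.0, 5.2, 9.9, 10.1]
means, alignment = segmental_kmeans(frames, 3)
```

In this toy run the uniform start places the frame 5.2 in the last state, and the first Viterbi pass moves it back to the middle state, after which the alignment stops changing.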

Training with correct label (2)
o HInit (cont.)
o Common parameters:
    -C : the same as HCompV
    -L : the same as HERest
    -X : the same as HERest
    -v : the same as HERest
    -T : the same as HERest
    -M : the same as HERest
    -S : the same as HERest

Training with correct label (3)
o HInit (cont.)
o Common parameters:
    -m : sets the minimum number of training examples; if fewer than N examples are supplied, an error is reported
    -i : sets the maximum number of estimation cycles
    -l : HInit searches through all of the training files and cuts out all segments with the given label

Training with correct label (4)
o HInit (cont.)
o Common parameters:
    -o : the given string is used as the name of the output HMM in place of the source name

Training with correct label (5)
o HRest:
    - Used to further re-estimate the HMM parameters initially computed by HInit.
    - The Baum-Welch re-estimation procedure is used instead of the segmental k-means training procedure of HInit.
o Common parameters:
    - The same as HInit
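The difference between the two procedures can be shown by how a state mean is updated. In the Python sketch below, the segmental k-means style update gives each frame to exactly one state, while the Baum-Welch style update lets every frame contribute to every state in proportion to its occupancy. The frames, the occupancy values, and the function names are made up for illustration; in practice the occupancies come from a forward-backward pass.

```python
def hard_means(frames, alignment, num_states):
    """Segmental k-means style update: each frame belongs to exactly one state."""
    means = []
    for s in range(num_states):
        members = [x for x, a in zip(frames, alignment) if a == s]
        means.append(sum(members) / len(members))
    return means


def soft_means(frames, gamma):
    """Baum-Welch style update: frame t counts toward state j with weight gamma[t][j]."""
    num_states = len(gamma[0])
    means = []
    for j in range(num_states):
        occupancy = sum(g[j] for g in gamma)
        weighted_sum = sum(g[j] * x for g, x in zip(gamma, frames))
        means.append(weighted_sum / occupancy)
    return means


frames = [0.0, 1.0, 4.0, 5.0]
alignment = [0, 0, 1, 1]  # hypothetical Viterbi alignment
gamma = [[1.0, 0.0], [0.75, 0.25], [0.25, 0.75], [0.0, 1.0]]  # hypothetical occupancies

hard = hard_means(frames, alignment, 2)
soft = soft_means(frames, gamma)
```

With these numbers the hard update gives means of 0.5 and 4.5, while the soft update pulls them toward each other (0.875 and 4.125) because the two middle frames are shared between the states.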

Recognition
o HVite
o Common parameters:
    -S : the list of test files
    -l : the output directory
    -d : the directory of the HMMs