A Study of Single Channel Blind Source Separation and Recognition Based on Mixed-State Prediction
Reporter: Chia-Cheng Chen
Advisor: Wen-Ping Chen
Department of Electrical Engineering, National Kaohsiung University of Applied Sciences
Network Application Laboratory
Outline
• Introduction and Motivation
• Background
• Research Methods
• Experimental Results
• Conclusion and Future Works
• Research Results
Introduction
• Applications of voiceprint recognition systems
  - Call routing (1997)
  - Jupiter (1997)
  - Let's Go! (2002)
  - Siri (2010)
  - Skyvi (2011)
  - Vlingo (2011)
Introduction
• Current components of ecological surveys
  - Sensor networks
  - Wireless networks
  - Databases
  - Voiceprint recognition systems
• Advantages
  - Reduce the cost in human resources and time
  - Store and share the raw data conveniently
Introduction — Blind Source Separation
http://metadata.froghome.org/about.php (species description data for amphibians in Taiwan)
Introduction — Blind Source Separation (figure)
Introduction
• Voiceprint recognition
  - C. J. Huang, Y. J. Yang, D. X. Yang and Y. J. Chen, "Frog classification using machine learning techniques," Expert Systems with Applications, Vol. 36, No. 2, pp. 3737-3743, 2009. (SCI)
  - S. C. Hsieh, W. P. Chen, W. C. Lin, F. S. Chou, and J. R. Lai, "Endpoint detection of frog croak syllables with using average energy entropy method," Taiwan Journal of Forest Science, Vol. 27, No. 2, pp. 149-161, Jun. 2012. (EI)
  - W. P. Chen, S. S. Chen, C. C. Lin, Y. Z. Chen and W. C. Lin, "Automatic recognition of frog call using multi-stage average spectrum," Computers & Mathematics with Applications, Vol. 64, No. 5, pp. 1270-1281, Sep. 2012.
Introduction
• Single channel source separation
  - M. N. Schmidt and M. Mørup, "Nonnegative matrix factor 2-D deconvolution for blind single channel source separation," Proceedings of the International Conference on Independent Component Analysis and Blind Signal Separation, Vol. 3889, pp. 700-707, Mar. 2006. (SCI)
  - S. Kırbız and B. Gunsel, "Perceptually weighted non-negative matrix factorization for blind single-channel music source separation," 21st International Conference on Pattern Recognition, Nov. 2012. (EI)
Motivation
• Automatic frog species voiceprint recognition system
  - Predicting the number of mixed signals
  - Single channel blind source separation
  - Users: biologists and the general public
Outline
• Introduction and Motivation
• Background
• Research Methods
• Experimental Results
• Conclusion and Future Works
• Research Results
Background
• Signal Processing: pre-emphasis, framing, windowing, endpoint detection (time domain / frequency domain)
• Feature Extraction: Mel-frequency cepstrum coefficients
• Matching: adaptive multi-stage average spectrum
• Blind Source Separation: non-negative matrix factor 2-D deconvolution
Background
• Voiceprint recognition pipeline: Signal Processing → Syllable Segmentation → Feature Extraction → Matching
Signal Processing
• Frog signal → Resample to 44100 Hz → Pre-emphasis → Framing → Hamming window
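The front-end steps above can be sketched as follows; this is a minimal illustration, and the pre-emphasis coefficient (0.95) is an assumption, since the slides do not state its value:

```python
import numpy as np

def preprocess(signal, frame_len=512, overlap=0.5, alpha=0.95):
    """Pre-emphasize, split into 50%-overlapping frames, apply a Hamming window.

    alpha is an assumed pre-emphasis coefficient (not given on the slides)."""
    # Pre-emphasis: y[n] = x[n] - alpha * x[n-1] boosts high frequencies
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    hop = int(frame_len * (1 - overlap))
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([emphasized[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return frames

rng = np.random.default_rng(0)
x = rng.standard_normal(44100)   # stand-in for one second of a resampled frog call
frames = preprocess(x)
print(frames.shape)              # (171, 512)
```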
Syllable Segmentation
• Endpoint detection algorithms
  - Energy: time domain; simple (square or absolute value of the amplitude); vulnerable to noise
  - Entropy: frequency domain; more complex; robust to noise
Average Energy Entropy
• Signal transform: S(k) = Σ_{n=0}^{N-1} s(n) e^{-j2πkn/N}
  - s(n): windowed signal; N: frame size; k: frequency component
• Average energy: u = (1/N) Σ_{n=1}^{N} A(n)²
  - u: mean energy of the input signal; A(n): amplitude of the input signal; N: total number of input samples
Average Energy Entropy
• Probability density function over frequency bins: p(k) = |S(k)|² / Σ_k |S(k)|²
Average Energy Entropy
• Average energy entropy: H' = Σ_k p(k) log p(k)
  - H': the negative entropy for each frame
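As a sketch of the definitions above, the per-frame AEE can be computed from the energy spectrum; the exact formulation in the cited Hsieh et al. (2012) paper may differ in details:

```python
import numpy as np

def average_energy_entropy(frame):
    """Negative entropy H' of one windowed frame's energy distribution."""
    spectrum = np.abs(np.fft.fft(frame)) ** 2   # energy per frequency bin
    p = spectrum / np.sum(spectrum)             # probability density over bins
    p = p[p > 0]                                # avoid log(0)
    return np.sum(p * np.log(p))                # H' = -H (negative entropy)

# A pure tone concentrates energy in a few bins, so H' stays near 0;
# white noise spreads energy across all bins, so H' is strongly negative.
n = np.arange(512)
tone = np.hamming(512) * np.sin(2 * np.pi * 50 * n / 512)
noise = np.hamming(512) * np.random.default_rng(1).standard_normal(512)
print(average_energy_entropy(tone) > average_energy_entropy(noise))  # True
```

This is why an entropy-based detector separates voiced syllables (structured spectra) from background noise more reliably than raw energy.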
Endpoint Detection Algorithm
(figure: signal with AEE, absolute-energy, and square-energy curves)
Feature Extraction
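The MFCC features used throughout this system can be sketched for a single windowed frame as below; the 26-filter mel bank is an assumed size, since the slides specify only the 15 output dimensions:

```python
import numpy as np
from scipy.fftpack import dct

def mfcc_frame(frame, sr=44100, n_filters=26, n_coeffs=15):
    """Sketch of MFCC extraction for one windowed frame."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame)) ** 2
    # Triangular mel filter bank, equally spaced on the mel scale.
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, len(power)))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_energy = np.log(fbank @ power + 1e-10)
    # The DCT decorrelates log filter-bank energies; keep 15 coefficients.
    return dct(log_energy, norm='ortho')[:n_coeffs]

frame = np.hamming(512) * np.random.default_rng(2).standard_normal(512)
print(mfcc_frame(frame).shape)  # (15,)
```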
Adaptive Multi-stage Average Spectrum
• Adaptive clustering (figure: syllables grouped into Cluster A and Cluster B)
Adaptive Multi-stage Average Spectrum
• Adaptive clustering
Adaptive Multi-stage Average Spectrum
• Template training (figure: frames 1–7 of a syllable averaged into stages 1–3)
Adaptive Multi-stage Average Spectrum
• Template training
Adaptive Multi-stage Average Spectrum
• Template training: minimum cumulative difference
Adaptive Multi-stage Average Spectrum
• Template matching: minimum cumulative difference
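The stage-averaging and minimum-cumulative-difference matching described above can be sketched as follows; the three-stage split and the absolute-difference distance are assumptions for illustration, not the exact formulation of the cited AMSAS method:

```python
import numpy as np

def stage_template(frames_spectra, n_stages=3):
    """Average a syllable's frame spectra within each stage (one mean
    spectrum per stage; 3 stages is an assumed setting)."""
    stages = np.array_split(frames_spectra, n_stages)  # split frames along time
    return np.stack([s.mean(axis=0) for s in stages])

def cumulative_difference(test, template):
    """Sum of per-stage spectral differences; the syllable is assigned to
    the species template with the minimum cumulative difference."""
    return float(np.sum(np.abs(test - template)))

rng = np.random.default_rng(3)
syllable = np.abs(rng.standard_normal((7, 256)))       # 7 frames x 256 bins
tmpl_a = stage_template(syllable)                      # template trained on this syllable
tmpl_b = stage_template(np.abs(rng.standard_normal((7, 256))))
test = stage_template(syllable)
print(cumulative_difference(test, tmpl_a) < cumulative_difference(test, tmpl_b))  # True
```

Stage averaging makes the template robust to small timing differences between syllables, which is the motivation for using it instead of frame-by-frame DTW alignment.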
Blind Source Separation
• Non-negative Matrix Factor 2-D Deconvolution (NMF2D)
  - α: basis matrix; β: coefficient matrix
  - Captures the relations between time and pitch
  - Uses shift operators along both time and frequency
  - V: original signal (spectrogram); Λ: reconstructed signal, V ≈ Λ = Σ_{τ,φ} ↓φ(α^τ) →τ(β^φ)
Non-negative Matrix Factor 2-D Deconvolution (figure)
Non-negative Matrix Factor 2-D Deconvolution
• Cost function
  - Based on Euclidean distance: C = ‖V − Λ‖²
  - Based on Kullback-Leibler divergence: C = Σ_{ij} (V_{ij} log(V_{ij}/Λ_{ij}) − V_{ij} + Λ_{ij})
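As a minimal sketch of minimizing the KL-divergence cost, the multiplicative updates below implement plain NMF, which is the no-shift special case of NMF2D (τ = φ = 0); the full NMF2D updates additionally sum over shifted copies of the basis and coefficient matrices:

```python
import numpy as np

def nmf_kl(V, r=2, n_iter=200, seed=0):
    """KL-divergence NMF by multiplicative updates: V ≈ W @ H.

    Sketch only: this is the tau = phi = 0 special case of NMF2D."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + 0.1
    H = rng.random((r, n)) + 0.1
    for _ in range(n_iter):
        WH = W @ H + 1e-10
        H *= (W.T @ (V / WH)) / (W.T @ np.ones_like(V) + 1e-10)
        WH = W @ H + 1e-10
        W *= ((V / WH) @ H.T) / (np.ones_like(V) @ H.T + 1e-10)
    return W, H

def kl_div(V, L):
    """KL-divergence cost from the slide: sum(V log(V/L) - V + L)."""
    return float(np.sum(V * np.log((V + 1e-10) / (L + 1e-10)) - V + L))

rng = np.random.default_rng(4)
V = rng.random((32, 40))                 # stand-in for a magnitude spectrogram
W, H = nmf_kl(V, r=4)
W0, H0 = nmf_kl(V, r=4, n_iter=0)        # random initialization, no updates
print(kl_div(V, W @ H) < kl_div(V, W0 @ H0))  # True: updates reduce the cost
```

The multiplicative form keeps W and H non-negative at every step, which is why it is the standard optimizer for this family of cost functions.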
Outline
• Introduction and Motivation
• Background
• Research Methods
• Experimental Results
• Conclusion and Future Works
• Research Results
Research Methods
• Mixed-State Prediction voiceprint recognition method
  - Training: mixed-signal states
  - Testing: two-stage voiceprint recognition; mixed-state prediction
First Stage
• Independent signal (Latouche's frog): Signal Processing → Syllable Segmentation → MFCC Feature Extraction → Matching
• Mixed signal (Moltrecht's green tree frog + Latouche's frog): same pipeline with MFCC features
Mixed-Signal States
Mixed States
• Average energy of independent and mixed signals
  - E: the average energy of the frequency spectrum X(k)
  - N: the length of the syllable
Predicting the Number of Mixed Signals
• T: the separation threshold
• E: the mean spectral energy of the test syllable
• a: the mean energy of the training data
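A minimal sketch of the decision using the three quantities above; the relative-deviation form of the rule is an assumption, since the slides define only T, E, and a (with T = 0.3 in the experimental parameters):

```python
def predict_mixed(E, a, T=0.3):
    """Flag a test syllable as a mixed signal when its mean spectral energy E
    deviates from the training mean a by more than the separation threshold T.

    Assumption: the comparison uses relative deviation |E - a| / a; the
    slides do not spell out the exact form of the rule."""
    return abs(E - a) / a > T

print(predict_mixed(E=1.5, a=1.0))   # True: deviation 0.5 > 0.3, mixed
print(predict_mixed(E=1.1, a=1.0))   # False: deviation 0.1 <= 0.3, independent
```

A syllable flagged as mixed would then be passed to the NMF2D separation stage before recognition.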
Outline
• Introduction and Motivation
• Background
• Research Methods
• Experimental Results
• Conclusion and Future Works
• Research Results
Experimental Results
• Frame length: 512 samples
• Frame overlapping: 50%
• Window function: Hamming window
• Frequency bins: 512
• Feature parameters: Mel-frequency cepstral coefficients
• Feature dimensions: 15
• Separation threshold: 0.3
Experimental Results
• Recognition experiment — independent signals

  Method | Total Syllables | Error Mixed | Correct Syllables | Accuracy
  DTW    | 373             | 31          | 282               | 75.60%
  AMSAS  | 373             | 31          | 317               | 84.71%
Experimental Results
• Recognition experiment — mixed signals

  Total Syllables | Error Mixed | Correct Syllables | Accuracy
  167             | 36          | 131               | 78.44%

  Method | Total Syllables | Correct Syllables | Accuracy
  DTW    | 269             | 183               | 68.02%
  AMSAS  | 269             | 211               | 78.43%
Experimental Results (figure)
Conclusion and Future Works
• The proposed method
  - Improves the recognition rate for mixed signals
  - Predicts the number of mixed signals
Conclusion and Future Works
• Future works
  - Study de-noising methods
  - Collect more features that distinguish independent from mixed signals
  - Recognize mixed signals within the same species
  - Collect the calls of more species to improve system performance
  - Adopt Support Vector Machines (SVM), neural networks, …
Thank you for your attention!!