Presentation on Timbre Similarity Alexandre Savard March 2006
Template copyright 2005 MUMT 611: Music Information Acquisition, Preservation, and Retrieval www.brainybetty.com
Content • Introduction • Measurement of timbre • Measurement of similarity • Systems Evaluation • Recent developments • Conclusion
Introduction: Incomplete timbre definition – Timbre is a fundamental dimension of sound. – Timbre has too often been described as the dimension of sound that lets the listener distinguish between two sounds that have the same pitch and the same loudness.
Introduction: Incomplete timbre definition – An efficient operational definition of timbre has not yet been achieved. – Previous research demonstrated the multidimensional nature of timbre. – Existing timbre research has already compared the similarity of the timbre of single instrumental notes.
Introduction: Physical features of timbre – Attack transients – Spectral flux – Spectral gravity centre – Harmonicity ratio – Spectral/temporal envelope – Other factors: • Pitch • Loudness
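Two of the features listed above can be sketched in a few lines. This is a minimal illustration, not a definition taken from the slides: it assumes the spectral gravity centre is the magnitude-weighted mean frequency of a frame and that spectral flux is the Euclidean distance between the magnitude spectra of consecutive frames. The function names and the naive DFT are chosen here for illustration only.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short illustrative frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz (assumed definition)."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    bin_hz = sample_rate / len(frame)
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total


def spectral_flux(prev_frame, frame):
    """Euclidean distance between the magnitude spectra of two consecutive frames."""
    a, b = magnitude_spectrum(prev_frame), magnitude_spectrum(frame)
    return math.sqrt(sum((y - x) ** 2 for x, y in zip(a, b)))
```

For a pure tone whose frequency falls exactly on a DFT bin, the centroid is that frequency, and the flux between two identical frames is zero.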
Introduction: Global timbre – A local definition of timbre appears to be of little use for the development of electronic music distribution or music recommendation systems. – Researchers use the concept of “global” timbre, which attributes a single timbre quality to an entire piece. – This idea only makes sense if texture and instrumentation vary little over the piece.
Measurement of timbre: Mel-Frequency Cepstrum Coefficient – Mel-Frequency Cepstrum Coefficient (MFCC) • Spectral gravity centre • Spectral envelope • Spectral flux • Combines those measures in a “feature vector”
Measurement of timbre: Mel-Frequency Cepstrum Coefficient – It is a measure of the variations of the spectral envelope. – It consists of a mapping of the linear frequencies onto the psychoacoustically based Mel scale. – The result is an ordered sequence of coefficients. – Low-order coefficients describe slow variations of the spectral envelope along the frequency axis. – High-order coefficients describe fast variations.
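The Mel mapping and the ordered coefficient sequence can be illustrated with a short sketch. Several assumptions apply: the Hz-to-mel formula is one commonly used variant (the slides do not specify one), the FFT and triangular filterbank stages are left out, and `cepstral_coefficients` expects log band energies that have already been computed. All function names are hypothetical.

```python
import math


def hz_to_mel(freq_hz):
    """Map a linear frequency in Hz to mels (one common formula; an assumption here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_bands, f_min, f_max):
    """Edges, in Hz, of n_bands bands equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / n_bands
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 1)]


def cepstral_coefficients(log_band_energies, n_coeffs):
    """DCT-II of the log mel-band energies, truncated to the first n_coeffs terms.

    Coefficient 0 tracks the overall level; higher orders capture
    increasingly fine ripples of the envelope across the bands.
    """
    n = len(log_band_energies)
    return [
        sum(e * math.cos(math.pi * k * (i + 0.5) / n)
            for i, e in enumerate(log_band_energies))
        for k in range(n_coeffs)
    ]
```

A perfectly flat envelope puts all of its energy in coefficient 0, which is one way to see that the higher-order coefficients measure variation of the envelope rather than its level.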
Measurement of Similarity: Metric – Metrics are applied to calculate the distance between two representations and thereby determine the similarity of the music. – The metric should be related to the strategy humans use in similarity judgments of timbre.
Measurement of Similarity: Gaussian Mixture Model – MFCC extraction yields a large number of coefficients. – A more compact representation is needed to handle those results.
Measurement of Similarity: Gaussian Mixture Model – A GMM is composed of one or more Gaussian probability distributions (its components). – Distance between GMMs can be seen as a measurement of similarity. – Samples are drawn at random from the models of both songs to be compared. – The probability of each song’s samples is then computed under the other song’s model.
Measurement of Similarity: Gaussian Mixture Model – “Distance” between GMMs can be seen as a measurement of similarity. – “Distance” is the amount of change needed to obtain samples of the second song from the first one. – The higher those probabilities are, the higher the similarity.
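The sampling-based comparison can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: the models are one-dimensional with hand-set parameters (real systems fit multi-dimensional GMMs to MFCC frames), and the symmetric combination of log-likelihoods used in `sampled_distance` is one plausible form of such a distance. All names are hypothetical.

```python
import math
import random

# A model is a list of (weight, mean, variance) components; 1-D for brevity.


def gmm_pdf(model, x):
    """Probability density of x under the mixture."""
    return sum(
        w * math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        for w, mu, var in model
    )


def gmm_sample(model, n, rng):
    """Draw n samples: pick a component by weight, then sample its Gaussian."""
    weights = [w for w, _, _ in model]
    picks = rng.choices(model, weights=weights, k=n)
    return [rng.gauss(mu, math.sqrt(var)) for _, mu, var in picks]


def mean_log_likelihood(model, samples):
    """Average log-density of the samples; the tiny offset guards against log(0)."""
    return sum(math.log(gmm_pdf(model, x) + 1e-300) for x in samples) / len(samples)


def sampled_distance(model_a, model_b, n_samples=100, seed=0):
    """Symmetric sampled distance (assumed form): small when each model
    explains the other's samples almost as well as its own."""
    rng = random.Random(seed)
    s_a = gmm_sample(model_a, n_samples, rng)
    s_b = gmm_sample(model_b, n_samples, rng)
    return (
        mean_log_likelihood(model_a, s_a) + mean_log_likelihood(model_b, s_b)
        - mean_log_likelihood(model_b, s_a) - mean_log_likelihood(model_a, s_b)
    )
```

Higher cross-likelihoods lower the distance, which matches the statement that higher probabilities mean higher similarity. A model compared with itself gives a distance of zero up to rounding.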
Measurement of Similarity: Gaussian Mixture Model. J. Aucouturier et al., 2004, “The Way It Sounds”: Timbre Models for Analysis and Retrieval of Music Signals.
Measurement of Similarity: Different approaches – Neural Networks – Hidden Markov Model – Gaussian Mixture Models – Self-Organizing Map
Systems Evaluation: Criteria – Timbre similarity judgment is based on a set of objective and subjective perceptual, cognitive and cultural aspects. – Measures are highly dependent on the music present in the database.
Systems Evaluation: Objective evaluation – The objective evaluation of a timbral similarity measure is problematic. – The metadata of a given database includes descriptions of the artist and of the genre. However, timbre quality is not usually described in it.
Systems Evaluation: Subjective evaluation – Conducting a psychoacoustical survey. – Deciding whether two songs have similar timbre can be uncertain, as timbre is an ill-defined concept.
Recent Developments: Aucouturier and Pachet (2002) – Each song is segmented using fixed 50 ms windows. – Eight MFCCs are used to characterize each segment. – A Gaussian Mixture Model composed of three Gaussian probability distributions is used. – 100 random samples are taken for the similarity measurement.
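The parameter values on this slide can be collected in one place, and the segmentation step is simple enough to sketch. The dictionary keys and the function name are hypothetical, and non-overlapping windows are an assumption, since the slide does not state a hop size.

```python
# Values stated on the slide for the 2002 system; the key names are illustrative.
PARAMS_2002 = {
    "window_ms": 50.0,          # fixed analysis window
    "n_mfcc": 8,                # coefficients kept per window
    "n_components": 3,          # Gaussians in each song's mixture
    "n_distance_samples": 100,  # samples drawn for the similarity measurement
}


def frame_signal(samples, sample_rate, window_ms=PARAMS_2002["window_ms"]):
    """Cut a signal into consecutive fixed-length windows, dropping any partial tail.

    Non-overlapping windows are an assumption; the slide gives no hop size.
    """
    frame_len = int(sample_rate * window_ms / 1000.0)
    if frame_len <= 0:
        raise ValueError("window is shorter than one sample")
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]
```

At a sampling rate of 8000 Hz, for example, each 50 ms window holds 400 samples, so one second of audio yields 20 windows.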
Recent Developments: Aucouturier and Pachet (2002). J. Aucouturier et al., 2004, “The Way It Sounds”: Timbre Models for Analysis and Retrieval of Music Signals.
Recent Developments: Aucouturier and Pachet (2004) – Finding the best set of parameters • Sampling rate of the music signal • Number of MFCCs extracted from each frame of data • Number of components used in the GMM • Distance sample rate used to estimate the likelihood of one model given another • Window size
Recent Developments: Aucouturier and Pachet (2004). J. Aucouturier et al., 2004, “The Way It Sounds”: Timbre Models for Analysis and Retrieval of Music Signals.
Recent Developments: Aucouturier and Pachet (2004) – Alternative similarity measurements using Earth Mover’s Distance and Hidden Markov Models. – Those techniques did not improve performance. – This suggests that there could be a ceiling on the performance of techniques based on timbre similarity.
Recent Developments: Liu and Huang (2002) – Developed an algorithm for singing voice. – Used MFCC as well as GMM for their timbre representation. – The audio signal is segmented according to the phonemes in the singing.
Recent Developments: Logan and Salomon (2001) – Characterized timbre with MFCC. – Used K-means clustering instead of GMM. – Calculated the amount of similarity using Earth Mover’s Distance.
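The clustering-plus-EMD idea can be sketched in a simplified, one-dimensional form. This is not the published system: it assumes scalar features, caller-supplied initial centers, and absolute difference as the ground distance, whereas a real system clusters multi-dimensional MFCC vectors and needs a general transport solver. All function names are hypothetical.

```python
def assign(values, centers):
    """Group each value with its nearest center."""
    groups = [[] for _ in centers]
    for v in values:
        nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
        groups[nearest].append(v)
    return groups


def kmeans_signature(values, initial_centers, n_iter=20):
    """Plain Lloyd iterations; returns a signature of (center, weight) pairs."""
    centers = list(initial_centers)
    for _ in range(n_iter):
        groups = assign(values, centers)
        # An empty cluster keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    groups = assign(values, centers)
    return [(c, len(g) / len(values)) for c, g in zip(centers, groups)]


def emd_1d(sig_a, sig_b):
    """Earth Mover's Distance between two 1-D signatures of (position, weight).

    In one dimension the minimal work is the area between the two
    cumulative weight curves, so no optimization is needed.
    """
    total_a = sum(w for _, w in sig_a)
    total_b = sum(w for _, w in sig_b)
    events = sorted(
        [(p, w / total_a) for p, w in sig_a] + [(p, -w / total_b) for p, w in sig_b]
    )
    work, carried, prev = 0.0, 0.0, events[0][0]
    for pos, delta in events:
        work += abs(carried) * (pos - prev)  # mass carried across the gap
        carried += delta
        prev = pos
    return work
```

Two songs would each be reduced to a signature with `kmeans_signature`, and `emd_1d` would then measure how much weight has to be moved, and how far, to turn one signature into the other.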
Conclusion
Bibliography • J. Aucouturier, F. Pachet, and M. Sandler. 2004. “The way it sounds”: Timbre models for analysis and retrieval of music signals. IEEE Transactions on Multimedia. • J. Aucouturier and F. Pachet. 2004. Improving timbre similarity: How high’s the sky? Proceedings of the International Conference on Music Information Retrieval. • J. Aucouturier and F. Pachet. 2002. Music similarity measures: What’s the use? Proceedings of the International Conference on Music Information Retrieval. • C. Liu and C. Huang. 2002. A singer identification technique for content-based classification of MP3 music objects. Proceedings of the Conference on Information and Knowledge Management. • B. Logan and A. Salomon. 2001. A music similarity function based on signal analysis. Proceedings of the International Conference on Multimedia and Expo.