Audio Processing
Mitch Parry
Similar to Image Processing?
• For images, a pixel is the smallest unit.
• Each pixel contains R, G, and B values, corresponding to the three types of cones that perceive color.
• The color is a distribution over the spectrum of visible light.
• Video samples at ~30 frames per second.
• A frame is a picture from “one instant of time”.
[Figure: amplitude of the R, G, and B components of one color across the visible spectrum (blue, green, yellow, red). Image sources: http://www.chemistryland.com, www.jiscdigitalmedia.ac.uk]
Resource!
Chapter 2: Sound Waves
• Sound Waves and Harmonic Motion
• Properties of Sine Waves
• Resonance as Harmonic Frequencies
• Nonsinusoidal Waves
Chapter 5: Digitization
• Sampling and Aliasing
• Quantization
• Dynamic Range
• Nyquist and Aliasing
Spectral Domain
• One color in one pixel in one frame of video
  – Hundredths of a second
  – Power across the visible spectrum, 400 nm (blue) to 700 nm (red)
• One audio frame
  – Hundredths of a second
  – Power across the audible range, 20 Hz to 20 kHz
Audacity: Plot Spectrum
Audio Mixing
• Free Multitrack Downloads
• http://www.cambridge-mt.com/ms-mtk.htm
“Stop Messing with Me” by Sven Bornemark
• Steinberg Grand Piano
• Acoustic Guitar
• Bass
• Drums Overhead
• Electric Guitar
• Ambience
• Kick Drum
• Vocal
Audacity: Mixing Tutorial • Mixing Tutorial
Simple Unmixing
• Left: Drums + 0.5 * Vocal
• Right: Guitar + 0.5 * Vocal
• Remove vocals:
  – Karaoke track = Left - Right = Drums - Guitar
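The subtraction above can be checked on a toy signal. This is a minimal sketch with made-up three-sample tracks; `remove_center` and all sample values are hypothetical and only illustrate that a vocal mixed equally into both channels cancels when one channel is subtracted from the other.

```python
def remove_center(left, right):
    """Subtract the right channel from the left, sample by sample."""
    return [l - r for l, r in zip(left, right)]

# Toy three-sample tracks (hypothetical values, not from a real recording).
drums = [1.0, 0.0, -1.0]
guitar = [0.5, 0.5, 0.5]
vocal = [0.2, -0.4, 0.6]

# Mix as on the slide: the vocal sits equally in both channels.
left = [d + 0.5 * v for d, v in zip(drums, vocal)]
right = [g + 0.5 * v for g, v in zip(guitar, vocal)]

karaoke = remove_center(left, right)
expected = [d - g for d, g in zip(drums, guitar)]  # Drums - Guitar
```

The result is mono, and the guitar comes back with inverted polarity, which is usually inaudible on its own.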
Audacity: Let’s try it. Real example: Norah Jones
Removing Hiss_*.wav
Removing Clicks
Short-Time Fourier Transform
• Each frame contributes one column of the spectrogram
[Figure: signal split into frames, FFT of each frame, resulting spectrogram]
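A minimal sketch of the frame-to-column idea, using a naive DFT from the standard library instead of a real FFT so that it stays self-contained. The frame length, hop size, Hann window, and the function names are assumptions chosen for illustration, not values from the slides.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        mags.append(abs(total))
    return mags

def spectrogram(signal, frame_len, hop):
    """Return one magnitude spectrum (one spectrogram column) per frame."""
    columns = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # A Hann window reduces leakage caused by the frame edges.
        windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len))
                    for i, x in enumerate(frame)]
        columns.append(dft_magnitudes(windowed))
    return columns

# A sine with exactly 4 cycles per 32-sample frame.
signal = [math.sin(2 * math.pi * 4 * t / 32) for t in range(128)]
spec = spectrogram(signal, frame_len=32, hop=16)
```

Each entry of `spec` is one column; for this test tone every column peaks at bin 4. A practical implementation would replace `dft_magnitudes` with an FFT.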
Audacity: Let’s try it.
Changing Speed
• Downsample
  – Shortens the clip
  – Increases its pitch
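A sketch of speed change by resampling, assuming the result is played back at the original sample rate. `change_speed` is a hypothetical helper that uses linear interpolation; it omits the low-pass filter a careful downsampler applies first to avoid aliasing.

```python
import math

def change_speed(samples, factor):
    """Resample by linear interpolation; factor > 1 shortens the clip."""
    out_len = int(len(samples) / factor)
    out = []
    for i in range(out_len):
        pos = i * factor                      # position in the input clip
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append((1 - frac) * samples[lo] + frac * samples[hi])
    return out

# 0.1 s of a 440 Hz tone at an assumed 8000 Hz sample rate.
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(800)]
fast = change_speed(tone, 2)  # half as long; sounds like 880 Hz
```

Doubling the speed halves the length and doubles every frequency, which is why the pitch rises by an octave.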
Changing Tempo
• Change the length of the clip without changing pitch
• Split into frames, then repeat or remove frames
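A bare-bones sketch of frame repetition and removal. `change_tempo` and its `stretch` parameter are hypothetical; frames are simply concatenated here, whereas practical time-stretchers overlap and cross-fade frames to avoid clicks at the joins.

```python
def change_tempo(samples, frame_len, stretch):
    """Repeat (stretch > 1) or drop (stretch < 1) whole frames."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples), frame_len)]
    n_out = int(round(len(frames) * stretch))
    out = []
    for j in range(n_out):
        # Pick the input frame that corresponds to output frame j.
        src = min(int(j / stretch), len(frames) - 1)
        out.extend(frames[src])
    return out

clip = list(range(8))                 # stand-in for audio samples
slower = change_tempo(clip, 2, 2.0)   # every frame played twice
faster = change_tempo(clip, 2, 0.5)   # every other frame dropped
```

Because the samples inside each frame are untouched, the local waveform, and so the pitch, stays the same while the clip length changes.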
Changing Pitch
• Change pitch without changing length
  – Increase pitch: Repeat frames and downsample
  – Decrease pitch: Remove frames and upsample
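The "repeat frames and downsample" recipe can be sketched for the special case of raising pitch by one octave. Both helper names are hypothetical, the factor of 2 is fixed for brevity, and the decimation skips the anti-aliasing filter a real implementation would need.

```python
def repeat_frames(samples, frame_len):
    """Double the clip length by playing each frame twice."""
    out = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        out.extend(frame + frame)
    return out

def raise_pitch_octave(samples, frame_len):
    """Repeat frames (2x length), then keep every 2nd sample (1x length)."""
    stretched = repeat_frames(samples, frame_len)
    return stretched[::2]

clip = list(range(8))                  # stand-in for audio samples
shifted = raise_pitch_octave(clip, 4)  # same length, samples advance 2x faster
```

The two steps cancel in duration but not in pitch: only the downsampling changes how fast the waveform inside each frame evolves.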
Beats
• Amplitude Envelope
  – Filterbank
  – Full-wave rectify
  – Low-pass filter
  – Differentiate/Half-wave rectify
Scheirer. JASA 1998. Tzanetakis. AMTA 2001. IPEM Toolbox.
Beats
• Beat Envelope
  – Filterbank (Discrete Wavelet Transform)
  – Full-wave rectify
  – Low-pass filter
  – Differentiate/Half-wave rectify
  – Low-pass filter
  – Sum
• Peak detection
Scheirer. JASA 1998. Tzanetakis. AMTA 2001. IPEM Toolbox.
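A simplified, single-band sketch of the envelope chain above. It skips the filterbank and the final sum across bands, and uses a one-pole low-pass filter; the `smoothing` and `threshold` values and both function names are assumptions for illustration, not parameters from Scheirer or Tzanetakis.

```python
def onset_envelope(samples, smoothing=0.9):
    """Rectify, smooth, differentiate, and half-wave rectify one band."""
    rectified = [abs(x) for x in samples]          # full-wave rectify
    smoothed, state = [], 0.0
    for x in rectified:                            # one-pole low-pass filter
        state = smoothing * state + (1 - smoothing) * x
        smoothed.append(state)
    diff = [0.0] + [b - a for a, b in zip(smoothed, smoothed[1:])]
    return [max(d, 0.0) for d in diff]             # keep only increases

def pick_peaks(envelope, threshold):
    """Indices of local maxima that exceed the threshold."""
    return [i for i in range(1, len(envelope) - 1)
            if envelope[i] > threshold
            and envelope[i] >= envelope[i - 1]
            and envelope[i] > envelope[i + 1]]

# Synthetic "drum track": 10-sample bursts starting every 100 samples.
samples = [0.0] * 400
for start in (50, 150, 250, 350):
    for i in range(10):
        samples[start + i] = 1.0 if i % 2 == 0 else -1.0

peaks = pick_peaks(onset_envelope(samples), threshold=0.05)
```

On this synthetic input the detected peaks fall on the first sample of each burst, where the smoothed amplitude rises fastest.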
Audacity: Beat Detection • Drum track
Audacity • Audacity Manual • More Effects and Analyzers
Musical Features
• Visualizing Structure
• Rhythm/Tempo
• Melody
• Timbre
Visualizing Structure
• Compute any features
• Choose similarity metric
• Visualize self-similarity
Foote & Cooper. ICMC 2001.
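A minimal sketch of the three steps, assuming cosine similarity as the metric and tiny hand-written 2-D vectors as the per-frame features. Both are illustrative choices; the slide leaves the features and the metric open.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def self_similarity(features):
    """S[i][j] compares frame i with frame j of the same recording."""
    return [[cosine_similarity(a, b) for b in features] for a in features]

# Four hypothetical frames: two "verse-like", then two "chorus-like".
features = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
S = self_similarity(features)
```

Rendering `S` as an image gives the self-similarity plot: bright square blocks along the diagonal mark stretches of mutually similar frames.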
Visualizing Structure • High-level segmentation based on novelty score
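One common way to compute a novelty score is to slide a checkerboard kernel along the diagonal of the self-similarity matrix. The sketch below assumes that approach; the kernel half-width `half`, the unweighted kernel, and the function name are illustrative assumptions.

```python
def novelty_scores(S, half):
    """Checkerboard-kernel novelty at each candidate boundary c."""
    n = len(S)
    scores = [0.0] * n
    for c in range(half, n - half + 1):
        total = 0.0
        for i in range(-half, half):
            for j in range(-half, half):
                # +1 when both frames lie on the same side of the boundary,
                # -1 when they lie on opposite sides.
                sign = 1.0 if (i < 0) == (j < 0) else -1.0
                total += sign * S[c + i][c + j]
        scores[c] = total
    return scores

# Toy matrix: frames 0-3 are alike, frames 4-7 are alike.
S = [[1.0 if (i < 4) == (j < 4) else 0.0 for j in range(8)] for i in range(8)]
scores = novelty_scores(S, half=2)
boundary = scores.index(max(scores))
```

The score is high where frames before the boundary resemble each other, frames after it resemble each other, and the two groups differ; peaks in the score are candidate segment boundaries.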
Tempo
• Beat Spectrum
  – Diagonal sums
  – Autocorrelation
Foote & Cooper. ICMC 2001.
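A sketch of the diagonal-sum variant of the beat spectrum, computed from a self-similarity matrix. Dividing each sum by the diagonal's length is an assumption made here so that short diagonals stay comparable; the toy matrix is hypothetical.

```python
def beat_spectrum(S):
    """Average similarity along each diagonal, indexed by lag in frames."""
    n = len(S)
    return [sum(S[i][i + lag] for i in range(n - lag)) / (n - lag)
            for lag in range(n)]

# Toy matrix for a pattern that repeats every 4 frames.
n = 16
S = [[1.0 if (i - j) % 4 == 0 else 0.0 for j in range(n)] for i in range(n)]
B = beat_spectrum(S)
strong_lags = [lag for lag in range(1, n) if B[lag] > 0.5]
```

Peaks in `B` at lags 4, 8, and 12 reveal the repetition period; converting a lag to seconds requires the frame hop, which is not modeled here.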
Identifying Identical Audio
• Segmentation
  – 0.37-second frames
  – Overlapping by 31/32
• FFT
• Band Division
  – Energy computed for 33 non-overlapping, logarithmically spaced frequency bands (300-2000 Hz)
  – E(n, m) = energy of band m of frame n
Haitsma & Kalker. ISMIR 2002.
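Identifying Identical Audio 2
• A 32-bit sub-fingerprint records whether energy differences increase or decrease between neighboring frequency bands and neighboring frames
• F(n, m) = 1 if [E(n-1, m+1) + E(n, m)] - [E(n-1, m) + E(n, m+1)] > 0, and 0 otherwise
[Figure: grid of time frames (n-1, n, …, up to 257) against frequency bands (m, m+1, …, up to 33), with + and - signs on the four energies used for each bit]
Haitsma & Kalker. ISMIR 2002.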
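The bit rule F(n, m) can be written directly as code. This sketch takes the band energies of two consecutive frames as plain lists; `sub_fingerprint` and the toy energies are hypothetical, and computing real band energies from an FFT is left out.

```python
def sub_fingerprint(prev_energy, cur_energy):
    """One bit per pair of neighboring bands, from frames n-1 and n."""
    bits = []
    for m in range(len(cur_energy) - 1):
        diff = ((prev_energy[m + 1] + cur_energy[m])
                - (prev_energy[m] + cur_energy[m + 1]))
        bits.append(1 if diff > 0 else 0)
    return bits

# Toy example with 5 bands, which gives 4 bits.
prev = [1.0, 2.0, 3.0, 4.0, 5.0]
cur = [5.0, 4.0, 3.0, 2.0, 1.0]
changed = sub_fingerprint(prev, cur)     # spectral tilt flipped between frames
unchanged = sub_fingerprint(prev, prev)  # identical frames give all zeros
```

With the 33 bands from the slides, the same function returns the 32 bits of one sub-fingerprint.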
Identifying Identical Audio 3
• Similarity is the bit error rate (BER) between two fingerprints
• A fingerprint covers approximately 3 seconds of audio
• 256 × 32 bits = 1 KB per fingerprint
Haitsma & Kalker. ISMIR 2002.
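The numbers on these slides fit together, as this sketch shows: frames that overlap by 31/32 advance by 1/32 of a frame, so 256 sub-fingerprints span about 3 seconds. `bit_error_rate` is a hypothetical helper that treats each sub-fingerprint as a 32-bit integer.

```python
def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits between two lists of 32-bit integers."""
    errors = sum(bin(a ^ b).count("1") for a, b in zip(fp_a, fp_b))
    return errors / (32 * len(fp_a))

FRAME_SECONDS = 0.37
HOP_SECONDS = FRAME_SECONDS / 32          # 31/32 overlap leaves a 1/32 hop
SUB_FINGERPRINTS = 256

duration = SUB_FINGERPRINTS * HOP_SECONDS  # about 2.96 s
size_bytes = SUB_FINGERPRINTS * 32 // 8    # 1024 bytes = 1 KB

# Toy fingerprints: flip the lowest bit of every sub-fingerprint.
original = list(range(256))
noisy = [x ^ 1 for x in original]
ber = bit_error_rate(original, noisy)      # 1 wrong bit out of every 32
```

A BER near 0 indicates the same recording; unrelated audio behaves like random bits and gives a BER near 0.5.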
Timbre Similarity
• Timbre = “Color” of sound
• Timbre = Type of instrument, voice
• Similarity decreases in order:
  – Same recording
  – Same artist
  – Same genre
• Useful for finding different live performances of the same song by an artist
Aucouturier & Pachet. ISMIR 2002.
Timbre Similarity 2
• Timbre Features
  – Low-order MFCCs account for timbre.
  – High-order MFCCs account for pitch.
  – Only use the first 8 MFCCs (out of 13).
• Feature Extraction:
  – Segment the signal into 0.05-second non-overlapping frames.
  – Compute the first 8 MFCCs for each frame.
  – Yields ~3600 feature vectors (3600 × 8 = 28,800 scalars) per song, which corresponds to about 3 minutes of audio.
Aucouturier & Pachet. ISMIR 2002.
Timbre Similarity 3
• Gaussian Mixture Model (GMM)
  – Approximates the distribution of features as a weighted sum of M Gaussian distributions
  – M = 3
• Learn a timbre model for each song
• Timbre similarity between song A and song B is the likelihood that the model for song A generated the features in song B.
Aucouturier & Pachet. ISMIR 2002.
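A sketch of the likelihood step only, assuming a GMM with diagonal covariances whose parameters are already known. Fitting the model (typically with EM) is omitted, and the model parameters, the 2-D features, and the function names are all hypothetical stand-ins for real 8-dimensional MFCC data.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_log_likelihood(features, weights, means, variances):
    """Average log-likelihood of the feature vectors under the mixture."""
    total = 0.0
    for x in features:
        logs = [math.log(w) + log_gaussian(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        peak = max(logs)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(l - peak) for l in logs))
    return total / len(features)

# Hypothetical timbre model for song A with M = 3 components.
weights = [0.5, 0.3, 0.2]
means = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
variances = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]

song_b = [[0.1, -0.1], [1.2, 0.9]]   # features close to A's model
song_c = [[8.0, 8.0], [9.0, 7.0]]    # features far from A's model
score_b = gmm_log_likelihood(song_b, weights, means, variances)
score_c = gmm_log_likelihood(song_c, weights, means, variances)
```

A higher score means song A's model explains the other song's features better, so here song B would rank as more similar to A than song C.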
Timbre Similarity Examples
• http://www.csl.sony.fr/~jj/Timbre/timbre.html
Audio Textures
• Generate new audio given examples
• Analysis
  – Segment into frames
  – Extract MFCCs
  – Similarity: window-weighted cosine distance
  – Transition probabilities proportional to exponential similarity
  – Segment into sub-clips according to novelty score
Lu et al. ICASSP 2002.
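A sketch of turning a similarity matrix into transition probabilities that are proportional to an exponential of the similarity. The scaling parameter `sigma`, the row-wise normalization, and the toy matrix are assumptions for illustration; the exact mapping in Lu et al. may differ.

```python
import math

def transition_probabilities(S, sigma):
    """Row i holds the probabilities of jumping from frame i to each frame."""
    P = []
    for row in S:
        weights = [math.exp(s / sigma) for s in row]
        total = sum(weights)
        P.append([w / total for w in weights])
    return P

# Toy similarity matrix for three frames (hypothetical values).
S = [[1.0, 0.8, 0.1],
     [0.8, 1.0, 0.2],
     [0.1, 0.2, 1.0]]
P = transition_probabilities(S, sigma=0.2)
```

A small `sigma` concentrates probability on the most similar frames, which keeps generated audio smooth; a larger value allows more varied jumps.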
References
• Aucouturier, J.-J. and Pachet, F. (2002). "Music Similarity Measures: What's the Use?" Proc. of Int'l Conference on Music Information Retrieval, 3, pp. 157-163.
• Foote, J. and Cooper, M. (2001). "Visualizing Musical Structure and Rhythm via Self-Similarity." Proc. of Int'l Computer Music Conference, 27, pp. 419-422.
• Haitsma, J. and Kalker, T. (2002). "A Highly Robust Audio Fingerprinting System." Proc. of Int'l Conference on Music Information Retrieval, 3, pp. 107-115.
• Lu, L., Li, S., Liu, W., and Zhang, H. (2002). "Audio Textures." Proc. of IEEE Int'l Conference on Acoustics, Speech and Signal Processing.
• Paulus, J. and Klapuri, A. (2002). "Measuring the Similarity of Rhythmic Patterns." Proc. of Int'l Conference on Music Information Retrieval, 3, pp. 150-156. Paris: IRCAM Centre Pompidou.
• Scheirer, E. (1998). "Tempo and Beat Analysis of Acoustic Musical Signals." Journal of the Acoustical Society of America, 103(1), pp. 588-601.
• Tzanetakis, G., Essl, G., and Cook, P. (2001). "Audio Analysis using the Discrete Wavelet Transform." Proc. of WSES Int'l Conference on Acoustics and Music: Theory and Applications.
• Tzanetakis, G., Ermolinskiy, A., and Cook, P. (2002). "Pitch Histograms in Audio and Symbolic Music Information Retrieval." Proc. of Int'l Conference on Music Information Retrieval, 3, pp. 31-38.