Audio Compression Techniques MUMT 611 January 2005 Assignment
Audio Compression Techniques
MUMT 611, January 2005, Assignment 2
Paul Kolesnik

Introduction
- Digital audio compression
  - Removal of redundant or otherwise irrelevant information from an audio signal
  - Audio compression algorithms are often referred to as "audio encoders"
- Applications
  - Reduces required storage space
  - Reduces required transmission bandwidth

Audio Compression
- Audio signal overview
  - Sampling rate (number of samples per second)
  - Bit rate (number of bits per second); an uncompressed stereo 16-bit 44.1 kHz signal has a bit rate of about 1.4 Mbit/s
  - Number of channels (mono / stereo / multichannel)
- The bit rate can be reduced by lowering those values or by data compression / encoding
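
The bit rate figure above follows directly from the three signal parameters. A minimal sketch of the arithmetic (the function name is only illustrative):

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

# CD-quality audio: 44.1 kHz, 16 bits per sample, stereo.
cd_rate = pcm_bit_rate(44_100, 16, 2)
print(cd_rate)  # 1411200 bits per second, i.e. about 1.4 Mbit/s

# Halving any one parameter (e.g. going to mono) halves the bit rate.
mono_rate = pcm_bit_rate(44_100, 16, 1)
print(mono_rate)  # 705600
```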

Audio Data Compression
- Redundant information
  - Implicit in the remaining information
  - Example: an oversampled audio signal
- Irrelevant information
  - Perceptually insignificant
  - Cannot be recovered from the remaining information

Audio Data Compression
- Lossless audio compression
  - Removes redundant data
  - Resulting signal is the same as the original (perfect reconstruction)
- Lossy audio encoding
  - Removes irrelevant data
  - Resulting signal is similar to the original
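
The difference between the two families can be seen on a toy signal. The sketch below is only an illustration under stated assumptions: `zlib` stands in for a real lossless audio coder, and dropping the low 8 bits of each sample stands in for a real perceptual coder (it is not perceptually guided). The test tone and all variable names are hypothetical.

```python
import math
import struct
import zlib

# One second of a hypothetical 440 Hz test tone, 16-bit mono at 8 kHz.
samples = [round(12000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(8000)]
raw = struct.pack(f"<{len(samples)}h", *samples)

# Lossless: the decoded data is bit-for-bit identical to the input.
packed = zlib.compress(raw)
restored = list(struct.unpack(f"<{len(samples)}h", zlib.decompress(packed)))

# Lossy (crude stand-in): keep only the top 8 bits of every sample.
coarse = [s >> 8 for s in samples]
approx = [c << 8 for c in coarse]
max_error = max(abs(a - b) for a, b in zip(samples, approx))

print(restored == samples)     # True: perfect reconstruction
print(len(packed) < len(raw))  # True: redundancy was removed
print(max_error)               # nonzero but below 256: similar, not identical
```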

Audio Data Compression
- Audio vs. speech compression techniques
  - Speech compression uses a model of the human vocal tract to compress signals
  - Audio compression does not use this technique because of the larger variety of possible signals

Generic Audio Encoder
[Figure: block diagram of a generic audio encoder]

Generic Audio Encoder
- Psychoacoustic model
  - Psychoacoustics: the study of how sounds are perceived by humans
  - Uses perceptual coding: eliminates information from the audio signal that is inaudible to the ear
  - Detects conditions under which different audio signal components mask each other

Psychoacoustic Model
- Signal masking
  - Threshold cut-off
  - Spectral (frequency / simultaneous) masking
  - Temporal masking
- Threshold cut-off and spectral masking occur in the frequency domain; temporal masking occurs in the time domain

Signal Masking
- Threshold cut-off
  - The hearing threshold level is a function of frequency
  - Any frequency component below the threshold will not be perceived by the human ear
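
To make "a function of frequency" concrete, the sketch below uses a widely quoted closed-form approximation of the threshold in quiet (commonly attributed to Terhardt). This formula is an assumption brought in for illustration; it does not come from the slides, and real encoders may use tabulated thresholds instead. The function names are hypothetical.

```python
import math

def threshold_in_quiet_db(freq_hz):
    """Approximate hearing threshold in dB SPL (assumed Terhardt-style formula)."""
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def is_audible(freq_hz, level_db_spl):
    """A component is kept only if it lies above the threshold cut-off."""
    return level_db_spl > threshold_in_quiet_db(freq_hz)

# The ear is most sensitive around 3 to 4 kHz and much less so at the extremes.
for freq in (100, 1000, 3300, 16000):
    print(freq, round(threshold_in_quiet_db(freq), 1))

print(is_audible(1000, 10))  # True: 10 dB SPL at 1 kHz is above the threshold
print(is_audible(100, 10))   # False: the same level at 100 Hz is below it
```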

Signal Masking
- Spectral masking
  - A frequency component can be partly or fully masked by another component that is close to it in frequency
  - This shifts the hearing threshold

Signal Masking
- Temporal masking
  - A quieter sound can be masked by a louder sound if they are close together in time
  - Quiet sounds that occur shortly before or shortly after the louder sound can both be masked

Spectral Analysis
- Tasks of spectral analysis
  - To derive masking thresholds that determine which signal components can be eliminated
  - To generate a representation of the signal to which masking thresholds can be applied
- Spectral analysis is done through transforms or filter banks

Spectral Analysis
- Transforms
  - Fast Fourier Transform (FFT)
  - Discrete Cosine Transform (DCT): similar to the Fourier transform but uses only cosine basis functions
  - Modified Discrete Cosine Transform (MDCT): an overlapped and windowed version of the DCT [used by MPEG-1 Layer III, MPEG-2 AAC, Dolby AC-3]
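
The "overlapped and windowed" property of the MDCT can be sketched in a few lines. The code below is a direct, slow O(N^2) evaluation of the textbook MDCT definition with a sine window and 50% overlap; it is an illustration under those assumptions, not the implementation used by any particular codec, and all function names are hypothetical. Each block of 2N windowed samples yields N coefficients, and overlap-adding the inverse transforms cancels the aliasing.

```python
import math

def sine_window(n_half):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_half)) for n in range(2 * n_half)]

def _basis(n, k, n_half):
    return math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))

def mdct(block):
    """2N windowed samples -> N coefficients."""
    n_half = len(block) // 2
    return [sum(block[n] * _basis(n, k, n_half) for n in range(2 * n_half))
            for k in range(n_half)]

def imdct(coeffs):
    """N coefficients -> 2N time-aliased samples (2/N scaling for the windowed case)."""
    n_half = len(coeffs)
    return [(2.0 / n_half) * sum(coeffs[k] * _basis(n, k, n_half) for k in range(n_half))
            for n in range(2 * n_half)]

def analyze(signal, n_half):
    """Split into 50%-overlapping windowed blocks; len(signal) must be a multiple of N."""
    w = sine_window(n_half)
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    frames = []
    for start in range(0, len(padded) - n_half, n_half):
        block = [padded[start + n] * w[n] for n in range(2 * n_half)]
        frames.append(mdct(block))
    return frames

def synthesize(frames, n_half):
    """Window each inverse transform again and overlap-add."""
    w = sine_window(n_half)
    out = [0.0] * ((len(frames) + 1) * n_half)
    for i, frame in enumerate(frames):
        y = imdct(frame)
        for n in range(2 * n_half):
            out[i * n_half + n] += y[n] * w[n]
    return out[n_half:-n_half]  # drop the zero padding

signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(32)]
frames = analyze(signal, 8)
rebuilt = synthesize(frames, 8)
print(max(abs(a - b) for a, b in zip(signal, rebuilt)))  # tiny rounding error only
```

In a real encoder the coefficients in `frames` would be quantized before synthesis, which is where the loss is introduced.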

Spectral Analysis
- Filter banks
  - Blocks of time samples are passed through a set of bandpass filters
  - Masking thresholds are applied to the resulting frequency subband signals
  - Polyphase and wavelet banks are the most popular filter structures

Filter Bank Structures
- Polyphase filter bank [used in all of the MPEG-1 encoders]
  - The signal is separated into subbands of equal width over the entire frequency range
  - The resulting subband signals are downsampled to create shorter signals (which are later reconstructed during the decoding process)

Filter Bank Structures
- Wavelet filter bank [used by the Enhanced Perceptual Audio Coder (EPAC) by Lucent]
  - Unlike in the polyphase filter bank, the subbands do not have equal widths (they are narrower at low frequencies and wider at high frequencies)
  - This gives better time resolution at high frequencies (e.g. for short attacks), but at the expense of frequency resolution
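
A minimal sketch of a wavelet-style filter bank, assuming the simplest possible filter pair (the Haar wavelet) rather than the filters of any real coder. Each stage splits a signal into a low and a high half-band and downsamples both by two; only the low band is split again, which produces subbands of unequal width. Function names are hypothetical.

```python
import math

_S = math.sqrt(2.0)

def haar_split(x):
    """One filter-bank stage: low and high half-bands, each downsampled by 2."""
    low = [(x[2 * i] + x[2 * i + 1]) / _S for i in range(len(x) // 2)]
    high = [(x[2 * i] - x[2 * i + 1]) / _S for i in range(len(x) // 2)]
    return low, high

def haar_merge(low, high):
    """Inverse of haar_split: upsample and recombine the two bands."""
    out = []
    for l, h in zip(low, high):
        out.append((l + h) / _S)
        out.append((l - h) / _S)
    return out

def wavelet_analysis(x, levels):
    """Repeatedly split the low band; returns bands from highest to lowest frequency."""
    bands, low = [], list(x)
    for _ in range(levels):
        low, high = haar_split(low)
        bands.append(high)
    bands.append(low)
    return bands

def wavelet_synthesis(bands):
    low = bands[-1]
    for high in reversed(bands[:-1]):
        low = haar_merge(low, high)
    return low

signal = [math.sin(0.4 * n) for n in range(16)]
bands = wavelet_analysis(signal, 3)
print([len(b) for b in bands])  # [8, 4, 2, 2]: unequal subbands
rebuilt = wavelet_synthesis(bands)
print(max(abs(a - b) for a, b in zip(signal, rebuilt)))  # rounding error only
```

The first band covers the upper half of the spectrum with eight samples (fine time resolution), while the last two bands each cover one eighth of the spectrum with two samples (fine frequency resolution).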

Noise Allocation
- System task: derive the shifted hearing threshold and apply it to the input signal
  - Anything below the threshold does not need to be transmitted
  - Any noise below the threshold is irrelevant
- Frequency component quantization
  - Tradeoff between space and noise
  - The encoder saves space by using just enough bits for each frequency component to keep the noise under the threshold; this is known as noise allocation
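
The "just enough bits" idea can be sketched with the usual rule of thumb that each extra bit of a uniform quantizer lowers the quantization noise by about 6.02 dB. That rule, the per-band levels, and the function name below are assumptions for illustration; real encoders iterate over scale factors and a bit budget rather than using a single formula.

```python
import math

DB_PER_BIT = 6.02  # assumed noise reduction per additional quantizer bit

def bits_for_band(signal_db, mask_db):
    """Smallest bit count that pushes quantization noise below the masking threshold."""
    signal_to_mask = signal_db - mask_db
    if signal_to_mask <= 0:
        return 0  # the whole band is below the threshold: do not transmit it
    return math.ceil(signal_to_mask / DB_PER_BIT)

# Hypothetical (signal level, shifted threshold) pairs in dB for four subbands.
bands = [(60.0, 70.0), (72.0, 48.0), (55.0, 49.0), (30.0, 12.0)]
allocation = [bits_for_band(s, m) for s, m in bands]
print(allocation)  # [0, 4, 1, 3]
```

The first band is fully masked and costs nothing, while the second needs the most bits because its signal stands far above the threshold.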

Noise Allocation
- Pre-echo
  - When a single audio block contains silence followed by a loud attack, a pre-echo error occurs: after decoding, the quantization noise is spread over the whole block, so it becomes audible in the silent part
  - This is avoided by monitoring the audio data at the encoding stage and splitting the audio into shorter blocks wherever pre-echo may occur
  - This does not completely eliminate pre-echo, but it can make it short enough to be masked by the attack (temporal masking)
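
One way to sketch the block-switching decision is an energy-ratio attack detector: if the energy jumps sharply inside a long block, the block is split into short ones. The detector, the block sizes (1024 and 128 samples), and the ratio threshold are all hypothetical choices for illustration; actual encoders use their own transient measures and window sizes.

```python
def block_energy(samples):
    return sum(s * s for s in samples)

def has_attack(block, parts=4, ratio=16.0, floor=1e-9):
    """True if the energy of one sub-part exceeds `ratio` times the previous one."""
    size = len(block) // parts
    energies = [block_energy(block[i * size:(i + 1) * size]) for i in range(parts)]
    return any(cur > ratio * (prev + floor)
               for prev, cur in zip(energies, energies[1:]))

def choose_blocks(signal, long_size=1024, short_size=128):
    """Return (start, length) pairs, using short blocks where an attack is detected."""
    blocks = []
    for start in range(0, len(signal) - long_size + 1, long_size):
        block = signal[start:start + long_size]
        if has_attack(block):
            blocks.extend((start + off, short_size)
                          for off in range(0, long_size, short_size))
        else:
            blocks.append((start, long_size))
    return blocks

# Silence followed by a sudden loud segment inside the second long block.
signal = [0.0] * 1536 + [1.0] * 512
blocks = choose_blocks(signal)
print(blocks[0])    # (0, 1024): the silent block stays long
print(len(blocks))  # 9: the block containing the attack became 8 short blocks
```

With short blocks, the quantization noise caused by the attack is confined to at most 128 samples before it, which is far easier for temporal masking to hide.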

Pre-echo Effect
[Figure: illustration of the pre-echo effect]

Additional Encoding Techniques
- Other encoding techniques are available (as alternatives or in combination)
  - Predictive coding
  - Coupling / delta encoding
  - Huffman encoding

Additional Encoding Techniques
- Predictive coding
  - Often used in speech and image compression
  - The encoder estimates the expected value of each sample from previous sample values
  - It transmits or stores only the difference between the expected and the actual value
  - The decoder generates the same estimate for each sample and then corrects it by the stored difference
  - Used for additional compression in MPEG-2 AAC
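
The predict-and-correct loop described above can be sketched with the simplest possible predictor, one that expects each sample to equal the previous one. This first-order predictor is an assumption for illustration; MPEG-2 AAC uses more elaborate adaptive prediction. The function names are hypothetical.

```python
def predictive_encode(samples):
    """Store only the difference between each sample and its prediction."""
    residuals, prediction = [], 0
    for sample in samples:
        residuals.append(sample - prediction)
        prediction = sample  # first-order predictor: expect the same value again
    return residuals

def predictive_decode(residuals):
    """Rebuild each sample as prediction plus stored difference."""
    samples, prediction = [], 0
    for residual in residuals:
        sample = prediction + residual
        samples.append(sample)
        prediction = sample
    return samples

samples = [100, 102, 105, 107, 108, 108, 106]
residuals = predictive_encode(samples)
print(residuals)                               # [100, 2, 3, 2, 1, 0, -2]
print(predictive_decode(residuals) == samples)  # True
```

After the first value, the residuals are small numbers that need fewer bits than the original samples.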

Additional Encoding Techniques
- Coupling / delta encoding
  - Used when the audio signal consists of two or more channels (stereo or surround sound)
  - Similarities between channels are used for compression
  - The sum and the difference of two channels are derived; the difference is usually close to zero and therefore requires less space to encode
  - This is a lossless encoding process
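
A minimal sketch of sum/difference coding on integer samples, showing why the step is lossless. The sample values and function names are hypothetical.

```python
def to_sum_diff(left, right):
    """Replace the two channels by their sum and their difference."""
    total = [l + r for l, r in zip(left, right)]
    diff = [l - r for l, r in zip(left, right)]
    return total, diff

def from_sum_diff(total, diff):
    """Exact inverse: (sum + diff) is 2*left and (sum - diff) is 2*right."""
    left = [(s + d) // 2 for s, d in zip(total, diff)]
    right = [(s - d) // 2 for s, d in zip(total, diff)]
    return left, right

left = [1000, 1010, 990, -500]
right = [998, 1012, 991, -503]
total, diff = to_sum_diff(left, right)
print(diff)                                          # [2, -2, -1, 3]
print(from_sum_diff(total, diff) == (left, right))   # True
```

Because the two channels are similar, the difference channel stays near zero and can be coded with few bits.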

Additional Encoding Techniques
- Huffman coding
  - Information-theory-based technique
  - An element that recurs often in the signal is represented by a shorter code, and the mapping between codes and values is stored in a look-up table
  - Implemented using look-up tables in the encoder and in the decoder
  - Provides substantial lossless compression at the cost of additional computation
  - Used by MPEG-1 (Layer III) and MPEG-2 AAC
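
A minimal sketch of building and using a Huffman look-up table, using a table derived from the data itself. Standardized codecs typically ship predefined tables instead, so this is an illustration of the principle only; the input values and function names are hypothetical.

```python
import heapq
from collections import Counter

def huffman_table(symbols):
    """Build a symbol -> bit-string table; frequent symbols get shorter codes."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, partial table for that subtree).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

def huffman_encode(symbols, table):
    return "".join(table[s] for s in symbols)

def huffman_decode(bits, table):
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # codes are prefix-free, so the first match is right
            out.append(inverse[current])
            current = ""
    return out

# Hypothetical quantized spectral values: mostly zeros, a few small numbers.
values = [0, 0, 0, 0, 0, 0, 1, 1, -1, 0, 0, 2, 0, 0, 1, 0]
table = huffman_table(values)
bits = huffman_encode(values, table)
print(len(bits))                              # 23 bits instead of 32 with 2 bits per value
print(huffman_decode(bits, table) == values)  # True
```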

Encoding - Final Stages
- Audio data is packed into frames
- Frames are stored or transmitted

Conclusion
- HTML bibliography: http://www.music.mcgill.ca/~pkoles
- Questions