Audio

Digital Audio
§ Audio comes from different sources:
  – Speech.
  – Sounds of instruments, music.
  – Sounds of all other kinds (the sound of wind, trains and the ocean).
§ Audio needs new methods for coding and processing.
§ Audio processing is a key task in multimedia systems:
  – Audio coding (MPEG audio, MP3, AAC and others).
  – Authoring and representation (composition).
  – Analysis and searching (retrieval and databases).
  – 3D sound, etc.
§ We will focus on basic audio processing, MPEG audio and related topics.
CS 335 Principles of Multimedia Systems

Audio Processing
§ Audio authoring.
§ Audio file formats: waveform files and MIDI (Musical Instrument Digital Interface).
§ Instead of storing waveform samples, a MIDI file holds a sequence of commands that tell an audio device to generate a specified note with given properties.
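To make the "commands instead of samples" idea concrete, here is a minimal Python sketch (Python is used here only for illustration; the deck itself uses Matlab). It builds the three-byte MIDI Note On and Note Off channel messages and compares their size with raw waveform storage. The helper names `note_on`, `note_off` and `waveform_bytes` are hypothetical, and a complete MIDI file also carries timing and header data that this sketch leaves out.

```python
def note_on(channel, note, velocity):
    """Build a 3-byte MIDI Note On message.

    channel is 0-15; note and velocity are 0-127.
    """
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15, note and velocity 0-127")
    # Status byte: high nibble 0x9 means Note On, low nibble is the channel.
    return bytes([0x90 | channel, note, velocity])


def note_off(channel, note, velocity=0):
    """Build a 3-byte MIDI Note Off message (status nibble 0x8)."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15, note and velocity 0-127")
    return bytes([0x80 | channel, note, velocity])


def waveform_bytes(seconds, sample_rate=44100, bytes_per_sample=2, channels=1):
    """Size of the same duration stored as uncompressed PCM samples."""
    return int(seconds * sample_rate) * bytes_per_sample * channels


# One second of middle C (note number 60): six bytes of commands
# versus tens of kilobytes of samples.
commands = note_on(0, 60, 100) + note_off(0, 60)
print(len(commands), "bytes of MIDI vs", waveform_bytes(1.0), "bytes of PCM")
```

The comparison is only about storage: the MIDI bytes say which note to play, while the sound itself depends on the device that renders it.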

Audio Processing Using Matlab
§ To load a wave file in Windows: audat = wavread('filename.wav'); Or open the file directly and load a stream of "words" (2 bytes) or bytes, depending on the wav format.
§ To play a sound, use sound(audat, samplingrate).
§ To display the spectrogram, use specgram.
§ Audio analysis is done in frames 20 ms to 40 ms long.
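The load-then-frame workflow above can be sketched outside Matlab as well. The following Python example uses only the standard `wave` and `struct` modules; it assumes a 16-bit mono PCM file (the "words" case mentioned above), and the function names are hypothetical helpers rather than part of any course code.

```python
import struct
import wave


def write_wav_mono16(path, samples, rate):
    """Write integer samples (-32768..32767) as a 16-bit mono PCM WAV file."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample, i.e. one "word"
        wf.setframerate(rate)
        wf.writeframes(struct.pack("<%dh" % len(samples), *samples))


def read_wav_mono16(path):
    """Return (samples, sample_rate) for a 16-bit mono PCM WAV file."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch handles 16-bit mono PCM only")
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    # WAV stores samples as little-endian signed 16-bit integers.
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return list(samples), rate


def split_frames(samples, rate, frame_ms=20):
    """Cut the signal into non-overlapping frames of frame_ms milliseconds.

    A trailing partial frame is dropped.
    """
    n = int(rate * frame_ms / 1000)
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]
```

At 44.1 kHz a 20 ms frame holds 882 samples, so each analysis step works on a short, roughly stationary piece of the signal.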

Frequency Domain Analysis
§ The Fourier transform can be used to decompose any signal into a summation of sinusoidal waves.
§ In Matlab, we can use fft (Fast Fourier Transform) for frequency domain analysis.
[Figure: the time-domain waveform with period T (base frequency = 1/T) and its frequency-domain components.]
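As a rough illustration of the decomposition, the Python sketch below computes a direct discrete Fourier transform and reports the strongest frequency in a frame. It is a plain O(n^2) DFT rather than a fast algorithm like Matlab's fft, and `dft` and `dominant_frequency` are hypothetical names chosen for this example.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform (O(n^2); fine for short frames)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dominant_frequency(x, rate):
    """Frequency in Hz of the strongest non-DC component of a real signal."""
    spectrum = dft(x)
    half = len(x) // 2  # bins above n/2 mirror the lower ones for real input
    k = max(range(1, half), key=lambda i: abs(spectrum[i]))
    return k * rate / len(x)


# A 20 ms frame (160 samples at 8 kHz) of a 400 Hz sine wave.
tone = [math.sin(2 * math.pi * 400 * t / 8000) for t in range(160)]
print(dominant_frequency(tone, 8000))
```

Bin k corresponds to k * rate / n Hz, so the frame length sets the frequency resolution: a 20 ms frame resolves components 50 Hz apart.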

MP3 and Others
§ MPEG (Moving Picture Experts Group) and ISO (International Organization for Standardization) have published several standards for digital audio coding:
  – MPEG-1 Layers 1, 2 and 3 (MP3)
  – MPEG-2 AAC
  – MPEG-4 AAC and TwinVQ
§ Other standards:
  – Dolby AC-3
§ They are widely used in consumer electronics, digital audio broadcasting, DVD, movies, etc.

Perceptual Coding in MPEG Audio
[Figure: block diagram with blocks labeled audio, Encoder, FFT, Masking Threshold, Dynamic bit allocation, MUX and Bit stream.]

Simultaneous Masking
§ A strong audio component can mask its nearby frequency components.
[Figure: sound pressure level (dB) versus frequency, with the frequency axis marked at 20, 1000 and 20000 Hz, showing the masker, the masking threshold and the threshold in quiet.]

Masking and Quantization
[Figure: sound pressure level (dB) versus frequency (20 Hz to 20000 Hz), showing the masker, critical band A and a neighboring critical band, the minimum masking threshold for band A, the signal-to-mask ratio, and the SNR of an m-bit and an (m+1)-bit quantizer.]
§ A critical band defines the "resolution" of hearing at some frequency location.
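The relation between the signal-to-mask ratio and the quantizer SNR can be sketched numerically. The Python example below assumes the textbook rule of thumb that each extra bit of a uniform quantizer adds about 6.02 dB of SNR; it is not the bit allocation procedure of the MPEG standard, and both function names are hypothetical.

```python
import math

# Assumption: a uniform quantizer gains 20*log10(2) dB of SNR per bit.
DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB


def quantizer_snr_db(bits):
    """Approximate SNR of a uniform quantizer with the given bit depth."""
    return DB_PER_BIT * bits


def bits_to_hide_noise(smr_db):
    """Smallest bit depth whose SNR reaches the signal-to-mask ratio.

    Once the SNR is at least the SMR, the quantization noise sits at or
    below the masking threshold. A band whose signal is already below the
    threshold (SMR <= 0) needs no bits at all.
    """
    if smr_db <= 0:
        return 0
    return math.ceil(smr_db / DB_PER_BIT)


for smr in (-5.0, 10.0, 20.0):
    print(smr, "dB SMR ->", bits_to_hide_noise(smr), "bits")
```

This is why the figure compares an m-bit and an (m+1)-bit quantizer: the encoder wants the cheapest quantizer whose noise still falls under the minimum masking threshold of the band.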

Temporal Masking
[Figure: amplitude versus time, showing the pre-masking curve and the post-masking curve.]

MPEG Perceptual Model
§ A Matlab demo.

MPEG Audio Layer 1
§ MPEG (1 and 2) audio allows sampling rates of 44.1, 48, 32, 22.05, 24 and 16 kHz.
§ MPEG filters the input audio into 32 bands.
§ Processing chain: 384 audio samples are filtered and downsampled into 12 samples per band, each block of 12 samples is normalized by a scale factor, and the result is passed to the perceptual coder.
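The frame arithmetic and the normalization step can be sketched in a few lines of Python. One simplification is labeled here and in the code: this sketch uses the block's own peak magnitude as the scale factor, whereas a real encoder picks the scale factor from a fixed table. The names below are hypothetical.

```python
BANDS = 32
SAMPLES_PER_BAND = 12
FRAME_SAMPLES = BANDS * SAMPLES_PER_BAND  # 384 input samples per frame


def frame_duration_ms(sample_rate_hz):
    """Duration of one 384-sample Layer 1 frame in milliseconds."""
    return 1000.0 * FRAME_SAMPLES / sample_rate_hz


def normalize_block(block):
    """Normalize one block of subband samples to the range [-1, 1].

    Simplifying assumption: the scale factor is the block's peak magnitude,
    not an entry from the standard's scale factor table.
    """
    scale = max(abs(s) for s in block)
    if scale == 0:
        return 0.0, list(block)
    return scale, [s / scale for s in block]


print(frame_duration_ms(48000))  # 384 samples at 48 kHz last 8 ms
scale, normalized = normalize_block([0.5, -2.0, 1.0])
print(scale, normalized)
```

Storing one scale factor per block lets the quantizer spend its bits on the shape of the block rather than on its overall level.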

MPEG Audio Layer 2
§ Layer 2 is very similar to Layer 1, but groups three blocks of 12 samples together in coding.
§ It also improves the scale factor quantization and groups 3 audio samples together in bit assignment.
§ Processing chain: 3 x 384 audio samples are filtered and downsampled into 36 samples per band, normalized by scale factor, and passed to the perceptual coder.

Overlapped Transform and MDCT
§ In an overlapped transform, 2N samples are transformed to N elements.
[Figure: four overlapping windows (Window 1 to Window 4), each 2N samples long. In the reverse transform, the outputs of overlapping windows are added to give the reconstructed result.]
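The overlap-and-add reconstruction can be checked with a small Python sketch. It assumes a sine window applied before the forward transform and again after the inverse, a hop of N samples between windows, and an inverse transform scaled by 2/N so that the overlapped halves sum back to the input. N is assumed even, and all function names are hypothetical; this is a direct O(N^2) implementation, not an optimized one.

```python
import math


def sine_window(length):
    """Sine window; satisfies w[n]^2 + w[n + length/2]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block):
    """Forward MDCT: 2N input samples -> N coefficients."""
    n = len(block) // 2
    return [
        sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]


def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N time-aliased samples."""
    n = len(coeffs)
    return [
        2.0 / n * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                      for k in range(n))
        for t in range(2 * n)
    ]


def analysis_synthesis(signal, n):
    """Transform overlapping 2N windows (hop N), invert, and overlap-add."""
    if len(signal) % n != 0:
        raise ValueError("signal length must be a multiple of N in this sketch")
    w = sine_window(2 * n)
    # Pad with N zeros on each side so every real sample lies in two windows.
    padded = [0.0] * n + list(signal) + [0.0] * n
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n + 1, n):
        block = [padded[start + i] * w[i] for i in range(2 * n)]
        rec = imdct(mdct(block))
        for i in range(2 * n):
            out[start + i] += rec[i] * w[i]  # aliasing cancels in the sum
    return out[n:n + len(signal)]


signal = [math.sin(0.3 * i) + 0.1 * i for i in range(32)]
rebuilt = analysis_synthesis(signal, 8)
print(max(abs(a - b) for a, b in zip(signal, rebuilt)))
```

Each window alone cannot be inverted, since N coefficients cannot describe 2N samples; the reconstruction only works because the aliasing in one window is cancelled by its neighbor.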

Some Matlab Code
§ The program compares DCT and MDCT in audio processing.
§ Code is available on the course website as a tarball, mdct_and_dct.tar.

MP3
§ MP3 is another layer built on top of MPEG audio Layer 2.
§ MP3 further applies MDCT to each band and encodes the MDCT coefficients.
§ MP3 then uses Huffman coding to further compress the bit stream losslessly.
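The lossless stage can be illustrated with a generic Huffman coder in Python. This sketch builds a code from the symbol counts of its input, which is the classic construction; MP3 itself relies on predefined code tables, so treat this as an illustration of the principle only. The function names and the sample data are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return {symbol: bit string} built from the symbol frequencies."""
    counts = Counter(symbols)
    if not counts:
        raise ValueError("need at least one symbol")
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tiebreak, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def encode(symbols, code):
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:  # prefix-free, so the first match is correct
            out.append(inverse[current])
            current = ""
    return out


# Made-up quantized coefficients: mostly zeros, a few small values.
data = [0, 0, 0, 0, 0, 0, 1, 1, -1, 2]
code = huffman_code(data)
bits = encode(data, code)
print(len(bits), "bits instead of", 2 * len(data))
```

Quantized coefficients are dominated by small values, so giving the most frequent symbols the shortest codes shrinks the bit stream without losing information.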

File Format
§ MPEG audio puts a header in each frame, so that frames can be decoded separately.
§ Frame layout (the same for Frame 1, Frame 2, and so on): Header | CRC | Bit Allocation | Scale factors | Subband Data
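Because every frame starts with its own header, a decoder can pick up the stream at any frame boundary. The Python sketch below decodes a few fields of the 4-byte frame header using the commonly documented bit layout (an 11-bit sync pattern, then version, layer, CRC protection, bitrate index, sampling rate index and padding). The slides do not spell out this layout, so treat the field positions as an assumption of the sketch; `parse_header` is a hypothetical helper and it does not look up bitrates.

```python
MPEG_VERSIONS = {0b11: "MPEG-1", 0b10: "MPEG-2", 0b00: "MPEG-2.5"}
LAYERS = {0b11: 1, 0b10: 2, 0b01: 3}
SAMPLE_RATES = {
    "MPEG-1": {0b00: 44100, 0b01: 48000, 0b10: 32000},
    "MPEG-2": {0b00: 22050, 0b01: 24000, 0b10: 16000},
}


def parse_header(header):
    """Decode selected fields from a 4-byte MPEG audio frame header."""
    if len(header) != 4:
        raise ValueError("a frame header is exactly 4 bytes")
    word = int.from_bytes(header, "big")
    if (word >> 21) != 0x7FF:  # the top 11 bits must all be set
        raise ValueError("frame sync not found")
    version = MPEG_VERSIONS.get((word >> 19) & 0b11)  # None if reserved
    layer = LAYERS.get((word >> 17) & 0b11)           # None if reserved
    rate_index = (word >> 10) & 0b11
    return {
        "version": version,
        "layer": layer,
        # Protection bit 0 means a CRC follows the header.
        "has_crc": ((word >> 16) & 1) == 0,
        "bitrate_index": (word >> 12) & 0xF,
        "sample_rate_hz": SAMPLE_RATES.get(version, {}).get(rate_index),
        "padding": (word >> 9) & 1,
    }


print(parse_header(bytes([0xFF, 0xFB, 0x90, 0x00])))
```

Scanning a file for the sync pattern and validating the fields that follow is the usual way to locate frame boundaries after a seek or a corrupted region.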

Other Audio Coding Standards
§ MPEG-2 and MPEG-4 AAC (Advanced Audio Coding)
  – Not backward compatible.
  – Use MDCT without bandpass filtering.
§ Dolby AC-3
  – MDCT-based codec.
  – Similar to MPEG AAC but uses a different quantization and coding scheme.
  – A de facto standard for DVD and digital audio in movies.

Realtime Audio Systems
[Figure: an audio I/O process with a write pointer and a read pointer, connected to an audio processing unit through an audio input circular queue and an audio output circular queue.]
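A circular queue with separate read and write pointers can be sketched as follows in Python. This is a single-threaded illustration with a hypothetical `CircularQueue` class; a real system would also need synchronization between the audio I/O process and the processing unit, which is left out here.

```python
class CircularQueue:
    """Fixed-size sample queue with a read pointer and a write pointer."""

    def __init__(self, capacity):
        self._buf = [0] * capacity
        self._capacity = capacity
        self._read = 0    # index of the next sample to hand out
        self._write = 0   # index of the next free slot
        self._count = 0   # number of samples currently stored

    def write(self, samples):
        """Store as many samples as fit; return how many were written."""
        n = min(len(samples), self._capacity - self._count)
        for s in samples[:n]:
            self._buf[self._write] = s
            self._write = (self._write + 1) % self._capacity  # wrap around
        self._count += n
        return n

    def read(self, max_samples):
        """Remove and return up to max_samples samples in arrival order."""
        n = min(max_samples, self._count)
        out = []
        for _ in range(n):
            out.append(self._buf[self._read])
            self._read = (self._read + 1) % self._capacity  # wrap around
        self._count -= n
        return out


queue = CircularQueue(4)
queue.write([1, 2, 3])
print(queue.read(2))          # the two oldest samples
queue.write([4, 5, 6, 7])     # only three fit; the write pointer wraps
print(queue.read(10))
```

The input queue absorbs timing jitter between capture and processing, and the output queue does the same between processing and playback; returning the number of samples written lets the caller detect an overrun instead of silently overwriting unread audio.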