Development and Implementation of an MPEG-1 Layer III Decoder on x86 and TMS320C6711 Platforms
Braidotti Enrico (Farina Simone)

What is MPEG-1 Layer III?
• Frequently referred to as "MP3"
• Method to store compressed audio (lossy)
• Developed by the Moving Picture Experts Group (MPEG)
• Standard ISO/IEC 11172-3 (Audio, Part 3), 1991
• Compression ratio up to 12x without recognizable quality loss
• Last release of the MPEG-1 family:
– Highest complexity
– Provides best quality

Standard MPEG-1
• 3 possible compression types (increasing complexity):
– Layer I
– Layer II
– Layer III
• Sampling frequencies for Layer III:
– 32 kHz
– 44.1 kHz
– 48 kHz
• Bitrates:
– Min 32 kbit/s
– Max 320 kbit/s
Compact Disc: 1.41 Mbit/s

BITSTREAM FORMAT
• The whole bitstream is divided into frames of defined length:
framesize = 144 · bitrate / sampling frequency + padding (bytes)
• Frames are divided into 2 granules and are composed of different parts:
– Header
– CRC (optional)
– Side Information
– Main data
– Ancillary data (optional)
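The frame size formula above can be sketched in C as follows. The function name `mp3_frame_size` and its parameter names are illustrative, not taken from the slides; the sketch assumes the bitrate is given in bit/s, the sampling frequency in Hz, and that the division truncates.

```c
#include <stdint.h>

/* Frame length in bytes for MPEG-1 Layer III:
 * 144 * bitrate / sampling_frequency (truncated), plus one byte when the
 * padding bit is set. Name and signature are illustrative assumptions. */
static uint32_t mp3_frame_size(uint32_t bitrate_bps, uint32_t sample_rate_hz,
                               int padding)
{
    return 144u * bitrate_bps / sample_rate_hz + (padding ? 1u : 0u);
}
```

For a 128 kbit/s stream at 44.1 kHz, this gives 417 bytes without padding and 418 bytes with padding, which is how padding adjusts the effective bitrate of a CBR file.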

FRAME HEADER
• Syncword = 12 bits set to '1'
• ID = 1 for MPEG-1 Audio (2 bits used for MPEG-2 and 2.5)
• Padding = used to adjust the framesize (and the effective bitrate of CBR files)
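A minimal sketch of how these header fields might be tested in C. The helper names are hypothetical. The bit positions assume the standard 32-bit MPEG-1 audio header layout (12-bit syncword, then the ID bit, with the padding bit in the third byte), which is not spelled out on the slide.

```c
#include <stdint.h>

/* Nonzero when the first 12 bits of the 4-byte header are all ones. */
static int mp3_has_sync(const uint8_t h[4])
{
    return h[0] == 0xFF && (h[1] & 0xF0) == 0xF0;
}

/* ID bit, directly after the 12-bit syncword: 1 means MPEG-1 audio. */
static int mp3_id(const uint8_t h[4])
{
    return (h[1] >> 3) & 1;
}

/* Padding bit (assumed position: bit 1 of the third header byte). */
static int mp3_padding(const uint8_t h[4])
{
    return (h[2] >> 1) & 1;
}
```

A decoder would typically scan the input byte by byte until `mp3_has_sync` succeeds and only then interpret the remaining fields.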

SIDE INFORMATION
• Length depends on the number of channels:
– 17 bytes for single channel
– 32 bytes for others
• Contains all the information necessary for decoding the Main data section
• Main structure is:
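The side information lengths above translate into a small helper. As an illustration, the second function computes where the bytes after the side information begin inside a frame. It assumes a 4-byte header and a 2-byte CRC word, which follow from the MPEG-1 frame layout rather than from this slide. Both function names are hypothetical.

```c
/* Side information length in bytes: 17 for single channel, 32 otherwise. */
static int side_info_bytes(int channels)
{
    return channels == 1 ? 17 : 32;
}

/* Offset of the first byte after the side information, counted from the
 * start of the frame. Assumes a 4-byte header and an optional 2-byte CRC. */
static int after_side_info_offset(int channels, int crc_present)
{
    return 4 + (crc_present ? 2 : 0) + side_info_bytes(channels);
}
```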

BIT RESERVOIR
The bit reservoir is one of the most important features of the Layer III format. It works as follows (use of main_data_end):

MAIN DATA
• SCALEFACTORS
– information in the Side Information section
• HUFFMAN CODED DATA
– extraction of scaled frequency lines (not ordered in some cases)

DECODING PROCESS

DECODING STEPS
• SYNCHRONIZATION
• HEADER DECODING
• SKIPPING CRC (if present)
• SIDE INFO DECODING
• SCALEFACTORS DECODING

HUFFMAN DECODING
• Lossless coding / decoding
• Fixed-to-variable length coding
• Based on 18 Huffman tables (specific for MPEG-1)
• Codewords up to 19 bits long
• Tables with up to 256 values

HUFFMAN DECODING
• Big Values
– Region 0
– Region 1
– Region 2
• Count 1
• RZero

HUFFMAN DECODING
• Pairs of frequency lines (big-values)
• Quadruples of frequency lines (count1)

HUFFMAN DECODING
• CLUSTERED HUFFMAN DECODING (R. Hashemian)
• Compromise between binary-tree and direct look-up decoding
• Custom-made Huffman tables containing 16-bit words
• The structure of the words depends on HIT / MISS:

HUFFMAN DECODING
Example

Clustered Table 1
Address  HIT / MISS  New Bits  Address  x  y
0        MISS        1         10
1        HIT                            0  0
10       MISS        1         100
11       HIT                            1  0
100      HIT                            1  1
101      HIT                            0  1

Huffman Table 1
x  y  len  codeword
0  0  1    1
0  1  3    001
1  0  2    01
1  1  3    000
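The clustered look-up for Huffman Table 1 can be sketched in C as below. The slides do not give the 16-bit word layout, so the layout here is purely hypothetical: bit 15 flags a HIT; a HIT word stores x and y in its low byte; a MISS word stores the number of new bits to read and the base address they are added to. The bit reader and all names are illustrative as well.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-bit word layout (an assumption, not from the slides):
 *   HIT : bit 15 = 1, bits 7..4 = x, bits 3..0 = y
 *   MISS: bit 15 = 0, bits 14..11 = new bits to read, bits 10..0 = base address */
#define HIT(x, y)     ((uint16_t)(0x8000u | ((x) << 4) | (y)))
#define MISS(n, addr) ((uint16_t)(((n) << 11) | (addr)))

/* Clustered form of Huffman Table 1; addresses in binary in the comments. */
static const uint16_t cluster_tab1[6] = {
    MISS(1, 2), /* 0   : read 1 more bit, continue at 10 + bit  */
    HIT(0, 0),  /* 1   : codeword 1                             */
    MISS(1, 4), /* 10  : read 1 more bit, continue at 100 + bit */
    HIT(1, 0),  /* 11  : codeword 01                            */
    HIT(1, 1),  /* 100 : codeword 000                           */
    HIT(0, 1),  /* 101 : codeword 001                           */
};

typedef struct {
    const uint8_t *data;
    size_t bitpos; /* index of the next bit, MSB first */
} bit_reader;

/* Read n bits, most significant bit first. */
static unsigned get_bits(bit_reader *br, unsigned n)
{
    unsigned v = 0;
    while (n--) {
        unsigned bit = (br->data[br->bitpos >> 3] >> (7 - (br->bitpos & 7))) & 1u;
        v = (v << 1) | bit;
        br->bitpos++;
    }
    return v;
}

/* Decode one (x, y) pair: index the table with the first bits, then follow
 * MISS entries until a HIT entry is reached. */
static void decode_pair(bit_reader *br, const uint16_t *tab,
                        unsigned first_bits, int *x, int *y)
{
    unsigned addr = get_bits(br, first_bits);
    for (;;) {
        uint16_t w = tab[addr];
        if (w & 0x8000u) {
            *x = (w >> 4) & 0xF;
            *y = w & 0xF;
            return;
        }
        addr = (w & 0x7FFu) + get_bits(br, (w >> 11) & 0xFu);
    }
}
```

Each MISS costs one table access plus a short bit read, so the longest codeword of this table (3 bits) is resolved in three look-ups. This is the trade-off between a bit-by-bit binary tree walk and a single large direct look-up table.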

REQUANTIZATION (DESCALING)
The Huffman-decoded frequency lines are restored to their original values according to the following formulas:
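For reference, the requantization formulas of ISO/IEC 11172-3 have the following form. The symbol names follow the standard's bitstream fields rather than this deck; `scalefac_multiplier` is 0.5 when `scalefac_scale` is 0 and 1 otherwise, and `is_i` denotes the Huffman-decoded value of frequency line i.

```latex
% Long blocks
xr_i = \operatorname{sign}(is_i)\,\lvert is_i\rvert^{4/3}
       \cdot 2^{\frac{1}{4}(\mathrm{global\_gain} - 210)}
       \cdot 2^{-\mathrm{scalefac\_multiplier}\,\left(\mathrm{scalefac\_l}[sfb]
             + \mathrm{preflag}\cdot\mathrm{pretab}[sfb]\right)}

% Short blocks
xr_i = \operatorname{sign}(is_i)\,\lvert is_i\rvert^{4/3}
       \cdot 2^{\frac{1}{4}(\mathrm{global\_gain} - 210
             - 8\,\mathrm{subblock\_gain}[window])}
       \cdot 2^{-\mathrm{scalefac\_multiplier}\cdot\mathrm{scalefac\_s}[sfb][window]}
```

Both cases combine a 4/3 power of the decoded magnitude with a power of two whose exponent is a multiple of 1/4, which is what the table-based and shift-based optimizations target.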

REQUANTIZATION (DESCALING)
• Use of a large look-up table with all possible values of the modulus of Huffman-decoded data (0 → 15 + 2^13 - 1 = 8206)
– pros: speed, accuracy
– cons: memory requirements (32 KByte with float precision)
• Reduced look-up table
– pros: table is 87.5 % smaller (4 KByte with float precision)
– cons: speed (need to calculate is · 0.125), accuracy

REQUANTIZATION (DESCALING)
• Shift-based power computing (T. Uželac)
Requantization has to be done up to 2304 times per frame; direct computation of the powers in the formulas would require too many clock cycles.

REQUANTIZATION (DESCALING)
• Shift operations
• 2 small look-up tables (total of 32 bytes):
– tab contains the values [2^0, 2^(1/4), 2^(1/2), 2^(3/4)]
– tabi contains the values [2^0, 2^(-1/4), 2^(-1/2), 2^(-3/4)]

    /* a is the exponent of the gain factor in units of 1/4: y = 2^(a/4) */
    scale = scalefac_scale + 1;
    a = global_gain - 210 - (scalefac_long << scale);
    if (preflag) a -= (pretab << scale);

    if (a < -127)    y = 0;                                     /* too small: flush to zero */
    else if (a >= 0) y = tab[a & 3] * (1u << (a >> 2));         /* 2^(a/4), a >= 0 */
    else             y = tabi[(-a) & 3] / (1u << ((-a) >> 2));  /* 2^(a/4), a < 0 */
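A self-contained sketch of the shift-based gain computation, with the two tables filled in. The function names `pow2_quarter` and `long_block_gain` are illustrative. The sketch assumes that `a` never exceeds 45 (the largest value reachable with an 8-bit `global_gain`), so the shift stays well inside 32 bits.

```c
/* tab[k] = 2^(k/4) and tabi[k] = 2^(-k/4) for k = 0..3: 8 floats, 32 bytes. */
static const float tab[4]  = {1.0f, 1.18920712f, 1.41421356f, 1.68179283f};
static const float tabi[4] = {1.0f, 0.84089642f, 0.70710678f, 0.59460356f};

/* Returns 2^(a/4): the integer part of a/4 is handled by a shift, the
 * fractional part (a mod 4) by a 4-entry table. */
static float pow2_quarter(int a)
{
    if (a < -127)
        return 0.0f; /* below this the result is treated as zero */
    if (a >= 0)
        return tab[a & 3] * (float)(1u << (a >> 2));
    return tabi[(-a) & 3] / (float)(1u << ((-a) >> 2));
}

/* Gain factor of one long-block scalefactor band. */
static float long_block_gain(int global_gain, int scalefac_scale,
                             int scalefac_long, int preflag, int pretab)
{
    int scale = scalefac_scale + 1;
    int a = global_gain - 210 - (scalefac_long << scale);
    if (preflag)
        a -= pretab << scale;
    return pow2_quarter(a);
}
```

With `global_gain = 210` and all scalefactors at zero, the gain is exactly 1; each step of 4 in `a` doubles or halves it.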

STEREO PROCESSING
• INTENSITY STEREO
In the critical bands above 2 kHz, the sensation of stereo is given mainly by the envelope of the signal. The encoder codes only one sum-like signal, and the decoder extracts separate L and R signals by applying different scalefactors.
• MIDDLE/SIDE STEREO
The Middle (L+R) and Side (L-R) signals are encoded instead of L and R, to reduce redundancy.

STEREO PROCESSING
There are 4 different types of transmission for stereophonic signals (according to mode_extension, found in the header):
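The four combinations can be sketched as a decode of the 2-bit `mode_extension` field. The struct and function names are hypothetical. The bit assignment (low bit for intensity stereo, high bit for middle/side stereo) follows the MPEG-1 Layer III header definition and only applies when the header signals joint stereo mode.

```c
typedef struct {
    int ms_stereo;        /* nonzero: middle/side stereo is on */
    int intensity_stereo; /* nonzero: intensity stereo is on   */
} stereo_mode;

/* Split the 2-bit mode_extension field into its two flags:
 *   00 = both off, 01 = intensity only, 10 = M/S only, 11 = both on. */
static stereo_mode decode_mode_extension(unsigned mode_extension)
{
    stereo_mode m;
    m.intensity_stereo = (int)(mode_extension & 1u);
    m.ms_stereo = (int)((mode_extension >> 1) & 1u);
    return m;
}
```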

STEREO PROCESSING
• MIDDLE/SIDE STEREO
The Left and Right channels are simply reconstructed according to:
• INTENSITY STEREO
Values are read from the RZero part of the Left channel, and the intensity stereo positions is_pos(sfb) are read from the scalefactors of the Right channel:
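The middle/side reconstruction can be sketched as below, assuming the normalization used by ISO/IEC 11172-3, L = (M + S)/√2 and R = (M − S)/√2. The function name and the in-place buffer convention are illustrative choices, not taken from the slides.

```c
#include <stddef.h>

#define INV_SQRT2 0.70710678f /* 1 / sqrt(2) */

/* In-place M/S to L/R conversion: on entry ch0 holds Middle and ch1 holds
 * Side; on return ch0 holds Left and ch1 holds Right. */
static void ms_to_lr(float *ch0, float *ch1, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float m = ch0[i];
        float s = ch1[i];
        ch0[i] = (m + s) * INV_SQRT2;
        ch1[i] = (m - s) * INV_SQRT2;
    }
}
```

When Middle and Side are equal, the Right channel comes out as zero, which matches a signal present only on the left.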

REORDERING
Reordering is performed only when short blocks are used: this is due to the way the MDCT in the encoder arranges the output lines.
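A minimal sketch of the reordering for a single scalefactor band. It assumes that the decoded short-block data of a band arrives as three consecutive runs, one per window, and must be interleaved so that the three windows of each frequency line are adjacent. The function name and buffer layout are assumptions for illustration.

```c
#include <stddef.h>

/* Reorder one short-block scalefactor band of `width` lines per window.
 * Input order : in[window * width + freq]  (window-major)
 * Output order: out[3 * freq + window]     (frequency-major) */
static void reorder_band(const float *in, float *out, size_t width)
{
    for (size_t win = 0; win < 3; win++)
        for (size_t freq = 0; freq < width; freq++)
            out[3 * freq + win] = in[win * width + freq];
}
```

A full decoder would apply this band by band over the short-block part of the granule, using the band widths for the current sampling frequency.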