Development and Implementation of an MPEG-1 Layer III Decoder on x86 and TMS320C6711 Platforms
Braidotti Enrico (Farina Simone)

What is MPEG-1 Layer III?
• Frequently referred to as “MP3”
• Method to store compressed audio (lossy)
• Developed by the Moving Picture Experts Group (MPEG)
• Standard ISO/IEC 11172-3 (Audio, Part 3), 1991
• Compression ratio without recognizable quality loss: up to 12x
• Last of the three MPEG-1 audio layers:
  – Highest complexity
  – Provides best quality

Standard MPEG-1
• 3 possible compression types (increasing complexity):
  – Layer I
  – Layer II
  – Layer III
• Sampling frequencies for Layer III:
  – 32 kHz
  – 44.1 kHz
  – 48 kHz
• Bitrates:
  – Min 32 kbit/s
  – Max 320 kbit/s
• For comparison, Compact Disc: 1.41 Mbit/s

BITSTREAM FORMAT
• The whole bitstream is divided into frames of defined length (in bytes):
  framesize = 144 · bitrate / sampling frequency + padding
• Each frame is divided into 2 granules and is composed of different parts:
  – Header
  – CRC (optional)
  – Side Information
  – Main data
  – Ancillary data (optional)
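The framesize formula can be sketched in C as follows. The function name and parameter names are illustrative, and the integer division is assumed to truncate, so that the padding byte makes up the fractional part over time.

```c
#include <stdint.h>

/* Frame length in bytes for MPEG-1 Layer III (illustrative helper).
 * bitrate_bps    : bitrate in bit/s, e.g. 128000
 * sample_rate_hz : 32000, 44100 or 48000
 * padding        : value of the padding bit, 0 or 1
 * Assumption: the division truncates toward zero. */
uint32_t mp3_frame_size(uint32_t bitrate_bps, uint32_t sample_rate_hz,
                        uint32_t padding)
{
    return 144u * bitrate_bps / sample_rate_hz + padding;
}
```

For example, at 128 kbit/s and 44.1 kHz the quotient is not an integer, which is why some frames carry one padding byte and others do not.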

FRAME HEADER
• Syncword = 12 bits set to ‘1’
• ID = 1 for MPEG-1 Audio (2 bits are used for MPEG-2 and 2.5)
• Padding = used to adjust the framesize (and the effective bitrate of CBR files)
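A minimal sketch of header decoding in C is given below. The bit positions follow the usual 32-bit MPEG-1 audio header layout, which the slide does not reproduce, so they are an assumption here. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Fields of the 32-bit frame header (names are illustrative). */
typedef struct {
    unsigned id;             /* 1 = MPEG-1 */
    unsigned layer;          /* raw 2-bit field; 1 means Layer III */
    unsigned protection_bit; /* 0 = a CRC follows the header */
    unsigned bitrate_index;  /* 4-bit index into the bitrate table */
    unsigned sampling_index; /* 2-bit index into the sampling rate table */
    unsigned padding;        /* 1 = frame carries one extra byte */
    unsigned mode;           /* 3 = single channel */
    unsigned mode_extension; /* joint stereo flags */
} Mp3Header;

/* Returns 1 and fills *h when the 12-bit syncword is present, else 0.
 * 'word' is the header read MSB first. */
int mp3_parse_header(uint32_t word, Mp3Header *h)
{
    if ((word >> 20) != 0xFFFu)
        return 0;                       /* not synchronized */
    h->id             = (word >> 19) & 0x1u;
    h->layer          = (word >> 17) & 0x3u;
    h->protection_bit = (word >> 16) & 0x1u;
    h->bitrate_index  = (word >> 12) & 0xFu;
    h->sampling_index = (word >> 10) & 0x3u;
    h->padding        = (word >> 9)  & 0x1u;
    h->mode           = (word >> 6)  & 0x3u;
    h->mode_extension = (word >> 4)  & 0x3u;
    return 1;
}
```

A real decoder would also reject reserved index values and confirm the sync by checking that the next frame starts where the computed framesize says it should.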

SIDE INFORMATION
• Length depends on the number of channels:
  – 17 bytes for single channel
  – 32 bytes for the other modes
• Contains all the information needed to decode the Main data section
• Main structure is:
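The two Side Information lengths can be combined with the header and CRC sizes to find where the Side Information ends within a frame. The sketch below assumes a 4-byte header, a 2-byte CRC that is present when the protection bit is 0, and mode value 3 for single channel. These details come from the standard header layout, not from the slides, and the function names are hypothetical.

```c
/* Size of the Side Information section in bytes.
 * mode: 2-bit header field, 3 = single channel (assumed coding). */
unsigned side_info_size(unsigned mode)
{
    return (mode == 3u) ? 17u : 32u;
}

/* Offset, from the start of the frame, of the first byte that follows
 * the Side Information. Assumes a 4-byte header and a 2-byte CRC that
 * is present only when protection_bit == 0. */
unsigned offset_after_side_info(unsigned mode, unsigned protection_bit)
{
    unsigned crc_bytes = (protection_bit == 0u) ? 2u : 0u;
    return 4u + crc_bytes + side_info_size(mode);
}
```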

BIT RESERVOIR
It is one of the most important features of the Layer III format; it works as follows (use of main_data_end):

MAIN DATA
• SCALEFACTORS
  – decoded using information in the Side Information section
• HUFFMAN CODED DATA
  – extraction of scaled frequency lines (not ordered in some cases)

DECODING PROCESS

DECODING STEPS
• SYNCHRONIZATION
• HEADER DECODING
• SKIPPING CRC (if present)
• SIDE INFO DECODING
• SCALEFACTORS DECODING

HUFFMAN DECODING
• Lossless coding / decoding
• Fixed-to-variable length code
• Based on 18 Huffman tables (specific to MPEG-1)
• Codewords up to 19 bits long
• Tables with up to 256 values

HUFFMAN DECODING
• Big Values
  – Region 0
  – Region 1
  – Region 2
• Count 1
• RZero

HUFFMAN DECODING
• Pairs of frequency lines (Big Values)
• Quadruples of frequency lines (Count 1)

HUFFMAN DECODING
• CLUSTERED HUFFMAN DECODING (R. Hashemian)
• Compromise between binary-tree and direct look-up decoding
• Custom-made Huffman tables containing 16-bit words
• Structure of the words depends on HIT / MISS:

HUFFMAN DECODING
Example

Clustered Table 1
  Address  HIT / MISS  New Bits  Address  x  y
  0        MISS        1         10
  1        HIT                            0  0
  10       MISS        1         100
  11       HIT                            1  0
  100      HIT                            1  1
  101      HIT                            0  1

Huffman Table 1
  x  y  len  codeword
  0  0  1    1
  0  1  3    001
  1  0  2    01
  1  1  3    000
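The clustered look-up for Huffman Table 1 can be sketched in C as below. This is an illustration, not the decoder's actual table format: the real tables pack each entry into a 16-bit word whose layout is not shown here, so a plain struct is used instead. The bit reader, the struct and the function names are hypothetical. Binary addresses 0, 1, 10, 11, 100 and 101 map to array indices 0 to 5.

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal MSB-first bit reader (hypothetical helper). */
typedef struct {
    const uint8_t *data;
    size_t bitpos;
} BitReader;

static unsigned read_bits(BitReader *br, unsigned n)
{
    unsigned v = 0;
    while (n--) {
        unsigned byte = br->data[br->bitpos >> 3];
        unsigned bit = (byte >> (7u - (unsigned)(br->bitpos & 7u))) & 1u;
        v = (v << 1) | bit;
        br->bitpos++;
    }
    return v;
}

/* One clustered-table entry (unpacked for readability). */
typedef struct {
    uint8_t hit;      /* 1 = HIT: x and y are valid; 0 = MISS */
    uint8_t new_bits; /* MISS: number of bits to read next */
    uint8_t next;     /* MISS: base address of the next cluster */
    uint8_t x, y;     /* HIT: decoded pair of values */
} ClusterEntry;

static const ClusterEntry cluster_table1[6] = {
    /* 0   */ {0, 1, 2, 0, 0},
    /* 1   */ {1, 0, 0, 0, 0},
    /* 10  */ {0, 1, 4, 0, 0},
    /* 11  */ {1, 0, 0, 1, 0},
    /* 100 */ {1, 0, 0, 1, 1},
    /* 101 */ {1, 0, 0, 0, 1},
};

/* Decode one (x, y) pair coded with Huffman Table 1. */
void decode_pair_table1(BitReader *br, unsigned *x, unsigned *y)
{
    unsigned addr = read_bits(br, 1);
    while (!cluster_table1[addr].hit) {
        const ClusterEntry *e = &cluster_table1[addr];
        addr = e->next + read_bits(br, e->new_bits);
    }
    *x = cluster_table1[addr].x;
    *y = cluster_table1[addr].y;
}
```

With larger tables each cluster would consume several bits at once, which is where the speed advantage over a bit-by-bit tree walk comes from. The sign bits that follow non-zero values in a Layer III bitstream are left out of this sketch.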

REQUANTIZATION (DESCALING)
The Huffman-decoded frequency lines are restored to their original values according to the following formulas:
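For reference, the requantization of ISO/IEC 11172-3 is commonly written as below. The symbol names follow the side-information fields of the standard rather than the slides, so treat the notation as an assumption. Here is_i is the Huffman-decoded value, xr_i the requantized line, sfb the scalefactor band and w the short-block window.

```latex
% Long blocks
xr_i = \operatorname{sign}(is_i)\,\lvert is_i\rvert^{4/3}\cdot
       2^{\frac{1}{4}\left(\mathrm{global\_gain}-210\right)}\cdot
       2^{-m\left(\mathrm{scalefac\_l}[sfb]+\mathrm{preflag}\cdot\mathrm{pretab}[sfb]\right)}

% Short blocks
xr_i = \operatorname{sign}(is_i)\,\lvert is_i\rvert^{4/3}\cdot
       2^{\frac{1}{4}\left(\mathrm{global\_gain}-210-8\,\mathrm{subblock\_gain}[w]\right)}\cdot
       2^{-m\,\mathrm{scalefac\_s}[sfb][w]}

% Scalefactor multiplier
m = \begin{cases}
      1/2 & \text{if } \mathrm{scalefac\_scale}=0\\
      1   & \text{if } \mathrm{scalefac\_scale}=1
    \end{cases}
```

Both powers of two can be merged into a single exponent of the form a/4 with integer a, which is what makes a shift-based evaluation possible.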

REQUANTIZATION (DESCALING)
• Large look-up table with all possible values of the modulus of the Huffman-decoded data (0 → 15 + 2^13 − 1 = 8206)
  – pros: speed, accuracy
  – cons: memory requirements (32 KByte with float precision)
• Reduced look-up table
  – pros: table is 87.5 % smaller (4 KByte with float precision)
  – cons: speed (need to calculate is · 0.125), accuracy

REQUANTIZATION (DESCALING)
• Shift-based power computing (T. Uželac)
Requantization has to be done up to 2304 times each frame; direct computation of the powers of 2 in the requantization formulas would require too many clock cycles.

REQUANTIZATION (DESCALING)
• shift operations
• 2 small look-up tables (total of 32 Bytes)

  scale = scalefac_scale + 1;
  a = global_gain - 210 - (scalefac_long << scale);
  if (preflag) a -= (pretab << scale);
  if (a < -127) y = 0;
  else if (a >= 0) y = tab[a & 3] * (1u << (a >> 2));
  else y = tabi[(-a) & 3] / (1u << ((-a) >> 2));

tab contains the values: [2^0, 2^(1/4), 2^(1/2), 2^(3/4)]
tabi contains the values: [2^0, 2^(-1/4), 2^(-1/2), 2^(-3/4)]
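A self-contained C version of the shift-based computation for long blocks might look like the sketch below. The function name and its parameter list are illustrative. The three branches are chained with else so that the a < -127 case is not overwritten, and unsigned shifts are used so that a shift by 31 stays well defined.

```c
/* 2^(k/4) and 2^(-k/4) for k = 0..3: 8 floats, 32 bytes in total. */
static const float tab[4]  = {1.0f, 1.189207115f, 1.414213562f, 1.681792831f};
static const float tabi[4] = {1.0f, 0.840896415f, 0.707106781f, 0.594603558f};

/* Returns 2^(a/4) for a long-block scalefactor band, where
 * a = global_gain - 210 - ((scalefac_long + preflag*pretab) << scale).
 * Hypothetical helper; argument names follow the slide's variables. */
float requant_gain_long(int global_gain, int scalefac_scale,
                        int scalefac_long, int preflag, int pretab)
{
    int scale = scalefac_scale + 1;
    int a = global_gain - 210 - (scalefac_long << scale);
    if (preflag)
        a -= (pretab << scale);

    if (a < -127)
        return 0.0f;                      /* too small to matter */
    if (a >= 0)                           /* 2^(a/4) = tab[a%4] * 2^(a/4 int) */
        return tab[a & 3] * (float)(1u << (a >> 2));
    return tabi[(-a) & 3] / (float)(1u << ((-a) >> 2));
}
```

The result still has to be multiplied by sign(is)·|is|^(4/3), which is the part served by the look-up tables discussed earlier.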

STEREO PROCESSING
• INTENSITY STEREO
  In the critical bands above 2 kHz, the sensation of stereo is given mainly by the envelope of the signal. The encoder codes only one sum-like signal, and the decoder extracts separate L and R signals using different scalefactors.
• MIDDLE/SIDE STEREO
  The Middle (L+R) and Side (L-R) signals are encoded instead of L and R, to reduce the redundancy between the two channels.

STEREO PROCESSING
There are 4 different types of transmission for stereophonic signals (according to mode_extension, found in the header):
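The four cases correspond to the four values of the 2-bit mode_extension field. The sketch below assumes the Layer III bit assignment of ISO/IEC 11172-3, where the low bit switches intensity stereo and the high bit switches middle/side stereo. The type and function names are hypothetical.

```c
/* Joint stereo tools selected by mode_extension (illustrative type). */
typedef struct {
    int intensity_stereo; /* 1 = intensity stereo is on */
    int ms_stereo;        /* 1 = middle/side stereo is on */
} JointStereoFlags;

/* Assumed mapping: 0 = both off, 1 = intensity only,
 * 2 = middle/side only, 3 = both on. */
JointStereoFlags decode_mode_extension(unsigned mode_extension)
{
    JointStereoFlags f;
    f.intensity_stereo = (int)(mode_extension & 1u);
    f.ms_stereo        = (int)((mode_extension >> 1) & 1u);
    return f;
}
```

The field is only meaningful when the header's mode field signals joint stereo.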

STEREO PROCESSING
• MIDDLE/SIDE STEREO
  Left and Right channels are simply reconstructed according to:
• INTENSITY STEREO
  Values are read from the Rzero part of Left channel and IS positions is_pos(sfb) are read from the scalefactors of the right channel:
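A minimal C sketch of the middle/side reconstruction is given below. It assumes the normalization used in ISO/IEC 11172-3, L = (M + S)/√2 and R = (M − S)/√2, and works in place on the two channel buffers. The function name is hypothetical.

```c
#include <stddef.h>

/* In-place middle/side decoding.
 * On entry: left[] holds M, right[] holds S.
 * On exit:  left[] holds L, right[] holds R.
 * Assumes L = (M + S)/sqrt(2), R = (M - S)/sqrt(2). */
void ms_to_lr(float *left, float *right, size_t n)
{
    const float inv_sqrt2 = 0.70710678f;
    for (size_t i = 0; i < n; i++) {
        float m = left[i];
        float s = right[i];
        left[i]  = (m + s) * inv_sqrt2;
        right[i] = (m - s) * inv_sqrt2;
    }
}
```

Intensity stereo needs the is_pos values and the band limits of the right channel, so it is not covered by this sketch.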

REORDERING
It is performed only when short blocks are used, because of the way the MDCT in the encoder arranges the output lines.
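The reordering of one scalefactor band can be sketched in C as follows. The sketch assumes the usual short-block layout, in which the three windows of a band arrive one after another and must be interleaved line by line. The band width would come from the scalefactor band tables, which are not shown here, and the function name is hypothetical.

```c
#include <stddef.h>

/* Reorder one scalefactor band of a short block.
 * in : 3 windows stored back to back, 'width' lines each
 * out: same lines interleaved by window
 *      (line 0 of windows 0, 1, 2, then line 1, and so on)
 * in and out must not overlap and must hold 3 * width floats. */
void reorder_short_band(const float *in, float *out, size_t width)
{
    for (size_t win = 0; win < 3; win++)
        for (size_t i = 0; i < width; i++)
            out[3 * i + win] = in[win * width + i];
}
```

A full implementation would loop over all short-block scalefactor bands of the granule and apply this step to each of them.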