
CSc 461/561 Multimedia Systems Part B: 2. Lossy Compression

Summary
(1) Why is lossy compression possible?
(2) Distortion measure
(3) Quantization
(4) Transformation
(5) Introduction to JPEG-Part I
(6) Introduction to MPEG-Part I

1. Why is lossy compression possible?
– Some information matters more than the rest to human perception
– Keep the important information
[Figure: one image shown four times: the original, and at compression ratios 7.7, 12.3, and 33.9]

2. Distortion measure
• Rate
  – # of bits per source symbol
• Distortion
  – one measure: mean square error (MSE)
  – x: original value; y: reconstructed value
  – $\mathrm{MSE} = \frac{1}{N}\left[(x_1 - y_1)^2 + (x_2 - y_2)^2 + \dots + (x_N - y_N)^2\right]$
• Rate vs distortion
  – lower rate, higher distortion
[Figure: distortion plotted against rate, with two points A and B marked]
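The MSE definition above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `mean_square_error` and the sample values are assumptions made for the example, not part of the course material.

```python
def mean_square_error(original, reconstructed):
    """Return the mean square error between two equal-length sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("sequences must have the same length")
    if not original:
        raise ValueError("sequences must not be empty")
    # Sum the squared differences (x_i - y_i)^2, then divide by N.
    total = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    return total / len(original)


# Hypothetical sample values: x is the original, y the reconstruction.
x = [10, 20, 30, 40]
y = [11, 19, 30, 42]
print(mean_square_error(x, y))  # (1 + 1 + 0 + 4) / 4 = 1.5
```

A perfect reconstruction gives an MSE of 0; larger values mean more distortion.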

3. Quantization (1)
• Quantization (recall audio A/D)
  – use a discrete value to represent a value range
  – information loss!
• The smaller the range, the less distortion
  – granular distortion
• Quantization steps
  – uniform: all ranges have the same size
  – non-uniform: otherwise

3. Uniform quantization (2)
• Quantization step: uniform
• Two constructions: midrise, midtread
[Figure: reconstruction plotted against input for two staircase quantizers. Uniform Midrise Quantizer: reconstruction levels at ±0.5∆, ±1.5∆, ±2.5∆, ±3.5∆, with input boundaries at multiples of ∆. Uniform Midtread Quantizer: reconstruction levels at multiples of ∆ (including 0), with input boundaries at ±0.5∆, ±1.5∆, ±2.5∆]
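The two constructions can be sketched as follows. This is a minimal sketch using the common floor-based formulation; the function names `midrise` and `midtread` and the test values are assumptions made for the example.

```python
import math


def midrise(x, step):
    """Uniform midrise quantizer: outputs are odd multiples of step/2 (no zero level)."""
    return step * (math.floor(x / step) + 0.5)


def midtread(x, step):
    """Uniform midtread quantizer: outputs are integer multiples of step (zero included)."""
    return step * math.floor(x / step + 0.5)


step = 1.0  # hypothetical step size (delta)
for value in (-1.2, -0.2, 0.2, 0.6, 1.4):
    print(value, midrise(value, step), midtread(value, step))
```

The visible difference is around zero: a small input such as 0.2 maps to 0.5∆ with the midrise quantizer but to 0 with the midtread quantizer.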

3. Signal-to-quantization-noise ratio (3)
• Quantization
  – n bits; $2^n$ steps for $[-X_{max}, X_{max}]$
  – step size: $\Delta = 2X_{max}/2^n$
  – granular distortion (noise energy): $\Delta^2/12$
• SQNR in dB
  – $10\log_{10}\frac{\text{signal energy}}{\text{noise energy}} = 10\log_{10}\frac{(2X_{max})^2/12}{\Delta^2/12} = 20\,n\log_{10}2 \approx 6.02\,n$
• One more bit adds about 6 dB to SQNR
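The 6 dB per bit rule can be checked numerically. The sketch below assumes a signal uniformly distributed over [-Xmax, Xmax] and a midrise quantizer, which is the case the formula describes. The function names, the sample count, and the seed are assumptions made for the example.

```python
import math
import random


def theoretical_sqnr_db(n_bits):
    """SQNR predicted by 20 * n * log10(2)."""
    return 20 * n_bits * math.log10(2)


def measured_sqnr_db(n_bits, x_max=1.0, samples=100_000, seed=0):
    """Quantize uniformly distributed samples with 2^n steps and measure the SQNR."""
    rng = random.Random(seed)
    levels = 2 ** n_bits
    step = 2 * x_max / levels
    signal_energy = 0.0
    noise_energy = 0.0
    for _ in range(samples):
        x = rng.uniform(-x_max, x_max)
        # Index of the range containing x; clamp so x == x_max stays in the top range.
        index = min(int((x + x_max) / step), levels - 1)
        y = -x_max + (index + 0.5) * step  # midpoint of that range
        signal_energy += x * x
        noise_energy += (x - y) ** 2
    return 10 * math.log10(signal_energy / noise_energy)


for n in (4, 8):
    print(n, round(theoretical_sqnr_db(n), 2), round(measured_sqnr_db(n), 2))
```

For other signal distributions the measured value differs from the formula by a constant offset, but each extra bit still contributes roughly 6 dB.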

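3. Non-uniform quantization (4)
• Recall the μ-law or A-law voice compander
• How to choose quantization steps?
  – $\int_{x_i}^{x_{i+1}} f(x)\,dx = 1/2^n$, so each of the $2^n$ ranges carries the same probability
[Figure: two panels, Non-uniform and Uniform, each plotting f(x) against x with the range from x_i to x_{i+1} marked]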

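3. Non-uniform quantization: more (5)
• How to represent a range?
  – $\int_{x_i}^{y_i} f(x)\,dx = 1/2^{n+1}$, so the representative $y_i$ splits the probability of its range in half
  – when uniform: $y_i = (x_i + x_{i+1})/2$
[Figure: two panels, Non-uniform and Uniform, each plotting f(x) against x with x_i, y_i, and x_{i+1} marked]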
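The two integral conditions, for range boundaries and for representatives, can be sketched using an inverse cumulative distribution function. The exponential density used here is a hypothetical example chosen because its inverse CDF has a closed form. It is not taken from the slides, and the function names are assumptions as well.

```python
import math


def exponential_inverse_cdf(p, rate=1.0):
    """Inverse CDF of the example density f(x) = rate * exp(-rate * x), x >= 0."""
    return -math.log(1.0 - p) / rate


def equal_probability_quantizer(n_bits, inverse_cdf):
    """Return (lower boundaries x_i, representatives y_i) for 2^n equal-probability ranges."""
    levels = 2 ** n_bits
    # Lower boundary x_i: probability i / 2^n lies below it.
    boundaries = [inverse_cdf(i / levels) for i in range(levels)]
    # Representative y_i: probability 1 / 2^(n+1) lies between x_i and y_i.
    representatives = [inverse_cdf((2 * i + 1) / (2 * levels)) for i in range(levels)]
    return boundaries, representatives


# Uniform density on [0, 1]: the inverse CDF is the identity, so y_i is the midpoint.
print(equal_probability_quantizer(2, lambda p: p))

# Non-uniform (exponential) density: ranges are narrow where f(x) is large.
print(equal_probability_quantizer(2, exponential_inverse_cdf))
```

Only lower boundaries are returned because the upper boundary of the last range is unbounded for the exponential example.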

4. Transformation (1)
• Transformation
  – represent information in another space
    • identify and remove (hard-to-remove) correlation, i.e., redundancy, in the original space
    • information loss! (when coefficients in the new space are quantized or dropped)
  – e.g., time/space => frequency (FFT)
• Inverse transformation
  – represent the info back in the original space

4. Discrete Cosine Transform (2)
• Recall: a wave is a sum of many waves
• “Any signal can be expressed as a sum of multiple signals that are sine or cosine waveforms at various amplitudes and frequencies.”
• Cosine transform: using cosine waveforms
• DCT: integer indexes
  – widely used in image compression (e.g., JPEG)

4. DCT: more (3)
• 2-D DCT (8 x 8)
• Inverse 2-D DCT (IDCT)
  – C(x) = 1/sqrt(2) when x = 0; C(x) = 1 otherwise
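The transform equations themselves do not appear in the text of this slide. For reference, the 8 x 8 2-D DCT pair is commonly written as below, using the same C(x) factor. The symbol names (f for spatial values, F for coefficients, i, j, u, v for indexes) are assumptions, since the slide's own notation is not available.

```latex
F(u,v) = \frac{C(u)\,C(v)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7}
         f(i,j)\,\cos\frac{(2i+1)u\pi}{16}\,\cos\frac{(2j+1)v\pi}{16}

\tilde{f}(i,j) = \sum_{u=0}^{7} \sum_{v=0}^{7} \frac{C(u)\,C(v)}{4}\,
         F(u,v)\,\cos\frac{(2i+1)u\pi}{16}\,\cos\frac{(2j+1)v\pi}{16}

C(x) = \begin{cases} 1/\sqrt{2}, & x = 0 \\ 1, & \text{otherwise} \end{cases}
```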

4. DCT: examples (4)
[Figure: original values of an 8 x 8 block (in spatial domain) and the corresponding DCT coefficients (in frequency domain), with the DC component marked]
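Since the example values from the figure are not available in the text, the sketch below shows how such an example could be reproduced. It is a direct (slow) implementation of the common 8 x 8 DCT definition with C(x) = 1/sqrt(2) for x = 0 and 1 otherwise. The function names and the constant test block are assumptions made for the example.

```python
import math

N = 8  # block size


def c(x):
    """Normalization factor: 1/sqrt(2) for index 0, 1 otherwise."""
    return 1 / math.sqrt(2) if x == 0 else 1.0


def dct2(block):
    """Forward 8 x 8 2-D DCT of a list-of-lists block."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / 16)
                          * math.cos((2 * j + 1) * v * math.pi / 16))
            out[u][v] = c(u) * c(v) / 4 * s
    return out


def idct2(coeffs):
    """Inverse 8 x 8 2-D DCT, returning spatial-domain values."""
    out = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += (c(u) * c(v) * coeffs[u][v]
                          * math.cos((2 * i + 1) * u * math.pi / 16)
                          * math.cos((2 * j + 1) * v * math.pi / 16))
            out[i][j] = s / 4
    return out


# Hypothetical flat block: all energy ends up in the DC coefficient F(0, 0).
flat = [[100.0] * N for _ in range(N)]
coeffs = dct2(flat)
print(round(coeffs[0][0], 6))  # DC component: 8 * 100 = 800.0
```

A flat block has no spatial variation, so every AC coefficient is (numerically) zero and only the DC component survives.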

5. Introduction to JPEG-Part I (1)
• Joint Photographic Experts Group (JPEG)
  – ISO standard (1992)
  – widely used (.jpeg, .jpe, .jpg; C/R: 10~20)
• The family of JPEGs
  – lossless JPEG: prediction-based compression
  – lossy JPEG: DCT-based compression
  – M-JPEG: motion JPEG
  – JPEG 2000: discrete wavelet transform; new!

5. Introduction to JPEG-Part I (2)
JPEG compression guidelines
  – Brightness vs color sensitivity
    • RGB => YUV/YIQ
    • chroma subsampling (4:2:0)
  – Spatial correlation among nearby pixels
    • slice an image into 8 x 8 blocks (bad for text)
  – Remove redundancy in frequency domain
    • discrete cosine transform (DCT)
    • coarse quantization for high freq coefficients
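Three of these guidelines can be sketched in Python. This is a minimal sketch under stated assumptions. The colour conversion uses the common luma weights 0.299/0.587/0.114 with U and V scaled by 0.492 and 0.877. The 4:2:0 subsampling is done by averaging 2 x 2 neighbourhoods, which is one simple choice among several. The step-size rule in `hypothetical_step` is invented for illustration and is NOT the JPEG quantization table.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV (brightness plus two colour differences)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def subsample_420(plane):
    """Keep one chroma value per 2 x 2 pixels (even width and height assumed)."""
    return [[(plane[r][c] + plane[r][c + 1]
              + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
             for c in range(0, len(plane[0]), 2)]
            for r in range(0, len(plane), 2)]


def hypothetical_step(u, v, base=8, slope=4):
    """Illustrative step size that grows with frequency (not the JPEG table)."""
    return base + slope * (u + v)


def quantize(coeffs):
    """Divide each DCT coefficient by its step and round to an integer."""
    return [[round(coeffs[u][v] / hypothetical_step(u, v))
             for v in range(len(coeffs[0]))]
            for u in range(len(coeffs))]


print(rgb_to_yuv(255, 0, 0))
print(subsample_420([[1, 2], [3, 4]]))
```

With this step rule, a coefficient of the same magnitude survives at low frequency but rounds to zero at high frequency, which is where most of the compression comes from.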

5. Introduction to JPEG-Part I (3)
• Sequential mode
• Progressive mode
  – low quality first, then differential data added
    • DC first, then AC; or MSB first, then LSB
• Hierarchical mode
  – lowest resolution first and then higher resolutions
• Lossless mode
  – prediction and entropy encoding

5. Introduction to JPEG-Part I (4)
• We will revisit the topic later.

6. Introduction to MPEG-Part I (1)
• MPEG-1 (1991): VCD (VCR+CD quality)
  – 352 x 240, 1.2 Mbps video CBR, 256 Kbps audio
  – progressive scan only (1 x CD-ROM)
• MPEG-1 video compression
  – similar to H.261, with a few differences
    • more formats, flexible slices, quantization table
  – I-frame: JPEG-like compression
  – P-frame: prediction-based; B-frame

6. Introduction to MPEG-Part I (2) MPEG-1: more
• Bi-directional search
  – search both previous and next frames for similar macro-blocks
• MPEG-1 GOP
  – I-frame, P-frame, B-frame
    • display order: IBBPBBPBBI (M=3, N=9)
    • coding order: IPBBPBBIBB; timestamps
  – D-frame: for search through the video, DC only
[Figure: GOP frames in display order: 1 I, 2 B, 3 B, 4 P, 5 B, 6 B, 7 P, 8 B, 9 B]
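The reordering from display order to coding order can be sketched as follows. A B-frame needs both its previous and its next reference frame, so the next I- or P-frame is coded before the B-frames that precede it in display order. The function name `coding_order` is an assumption made for the example.

```python
def coding_order(display_order):
    """Reorder frames so each B-frame follows both reference frames it may use.

    display_order is a string of frame types; the result is a list of
    (display_number, frame_type) pairs in coding order.
    """
    coded = []
    pending_b = []
    for number, frame_type in enumerate(display_order, start=1):
        if frame_type == "B":
            # Hold B-frames until their next reference frame has been coded.
            pending_b.append((number, frame_type))
        else:
            # An I- or P-frame is coded first; the waiting B-frames follow it.
            coded.append((number, frame_type))
            coded.extend(pending_b)
            pending_b = []
    coded.extend(pending_b)  # trailing B-frames with no later reference in this input
    return coded


display = "IBBPBBPBBI"
print("".join(frame_type for _, frame_type in coding_order(display)))  # IPBBPBBIBB
```

The display numbers kept alongside each frame play the role of the timestamps: the decoder uses them to put the frames back into display order.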

6. Introduction to MPEG-Part I (3) MPEG-2
• MPEG-2 (1994): DVD, HDTV, etc.
  – also adopted as ITU-T H.262
  – many video formats and data rates; better audio
    • profiles: simple (4:2:0, I/P), main (+B), SNR (+variable quality), spatial (+variable resolution), high (+4:2:2)
    • levels: low (352 x 288), main (720 x 576), high 1440 (1440 x 1152), high (1920 x 1152)
  – support interlaced video (broadcasting!)

6. Introduction to MPEG-Part I (4) MPEG-2 scalability
• Layered encoding
  – base layer: independent for basic quality
  – enhancement layer: dependent on the base layer
• E.g., SNR scalability
  – base: low SQNR (coarse quantization)
  – enhance: high SQNR (fine quantization of the difference between the actual signal and the base)
• E.g., spatial scalability
  – base: low resolution; enhance: high resolution
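SNR scalability can be sketched on a single value: the base layer quantizes coarsely, and the enhancement layer finely quantizes what the base layer missed. This is a minimal sketch, not MPEG-2 syntax. The function names, step sizes, and sample value are assumptions made for the example.

```python
def quantize(value, step):
    """Round a value to the nearest multiple of step."""
    return step * round(value / step)


def encode_two_layers(value, coarse_step, fine_step):
    """Return (base, enhancement): coarse value plus finely quantized residual."""
    base = quantize(value, coarse_step)
    enhancement = quantize(value - base, fine_step)
    return base, enhancement


value = 37.3  # hypothetical coefficient value
base, enhancement = encode_two_layers(value, coarse_step=16, fine_step=2)

print("base only:          ", base, "error", abs(value - base))
print("base + enhancement: ", base + enhancement,
      "error", abs(value - (base + enhancement)))
```

A receiver that gets only the base layer still decodes a usable value; one that also gets the enhancement layer adds it to the base and ends up with a smaller error.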

6. Introduction to MPEG-Part I (5) MPEG-4
• MPEG-4 (1999): content-based, object-oriented
  – based on H.263, initially for low bit-rate apps
  – video sequence: a collection of media objects
    • objects: still image, moving object, audio, etc.
    • how to decompose is NOT specified (left to the encoder)
  – VOP: video object plane
    • GOV: I-VOP, P-VOP, B-VOP
    • VOP is divided into many macro-blocks
  – motion estimation: bounding box; padding

6. Introduction to MPEG-Part I (5): MPEG-4: object oriented

6. Introduction to MPEG-Part I (6) MPEG-4: more
• Fine granularity scalability
  – spatial scalability
  – temporal scalability
  – quality scalability
• MPEG-4 audio
  – general audio (2~64 Kbps)
  – speech (2~4 Kbps: HVXC; 4~24 Kbps: CELP)
  – synthesized (e.g., MIDI, TTS)

6. Introduction to MPEG-Part I (7)
• We will revisit the topic later.