Discrete Cosine Transform and Image Compression MTH/CSC 421

Discrete Cosine Transform and Image Compression
MTH/CSC 421
Thomas Buck, Regal Ferrulli, Jimmie Maggard, Zachary Hilty

Common Applications
• JPEG format
• MPEG-1 and MPEG-2
• MP3, Advanced Audio Coding, WMA
• What’s in common? All share, in some form or another, a DCT method for compression.

Overview
• One-dimensional DCT
• Least Squares Approximation
• Two-dimensional DCT
• Image Compression (grayscale, color)

One-dimensional DCT
Definition: Let n be a positive integer. The one-dimensional DCT of order n is defined by an n x n matrix C whose entries are
$$C_{ij} = \sqrt{\frac{2}{n}}\, a_i \cos\frac{i(2j+1)\pi}{2n}, \qquad i, j = 0, \ldots, n-1,$$
where a_0 = 1/sqrt(2) and a_i = 1 for i > 0.
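
The definition can be checked numerically. The following is a minimal sketch, assuming the standard orthonormal convention with a_0 = 1/sqrt(2) and a_i = 1 otherwise (the same constants the reconstruction slide states). The helper names `dct_matrix`, `matmul`, and `transpose` are illustrative and do not come from the slides.

```python
import math

def dct_matrix(n):
    """Return the n x n DCT matrix C as a list of rows (assumed orthonormal convention)."""
    rows = []
    for i in range(n):
        a_i = 1 / math.sqrt(2) if i == 0 else 1.0
        rows.append([math.sqrt(2 / n) * a_i * math.cos(i * (2 * j + 1) * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def matmul(A, B):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

C = dct_matrix(4)
CtC = matmul(transpose(C), C)  # numerically the 4 x 4 identity if C is orthogonal
```

Every entry of the first row of C equals 1/sqrt(n), so the first DCT coefficient is a scaled average of the data.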

The Advantage of Orthogonality
• C is orthogonal: C^T C = I
• This implies C^{-1} = C^T
• Makes solving matrix equations easy. Solve Y = C X C^T for X:
• C^T Y = C^T C X C^T = X C^T
• C^T Y C = X C^T C = X

One-dimensional DCT
The discrete cosine transform matrix, C, has one basic characteristic: it is a real orthogonal matrix.

One-dimensional DCT
Interpolation Theorem: Let x = [x_0, …, x_{n-1}]^T be a vector of n real numbers and let y = [y_0, …, y_{n-1}]^T = Cx. Then the real function
$$P_n(t) = \frac{1}{\sqrt{n}}\, y_0 + \sqrt{\frac{2}{n}} \sum_{k=1}^{n-1} y_k \cos\frac{k(2t+1)\pi}{2n}$$
satisfies P_n(j) = x_j for j = 0, …, n-1.
• C transforms the n data points into n interpolation coefficients.
• The DCT provides coefficients for the trigonometric interpolation function using only cosine terms.
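
The interpolation property can be tried on a small vector. This is a sketch only: it assumes the orthonormal DCT convention (a_0 = 1/sqrt(2)) and the usual cosine-sum form of P_n(t). The sample data and the names `dct` and `interpolant` are made up for illustration.

```python
import math

def dct(x):
    """Compute y = Cx for the orthonormal DCT matrix C (assumed convention)."""
    n = len(x)
    y = []
    for k in range(n):
        a_k = 1 / math.sqrt(2) if k == 0 else 1.0
        y.append(math.sqrt(2 / n) * a_k *
                 sum(x[j] * math.cos(k * (2 * j + 1) * math.pi / (2 * n)) for j in range(n)))
    return y

def interpolant(y, t):
    """Evaluate P_n(t) = y_0/sqrt(n) + sqrt(2/n) * sum_{k>=1} y_k cos(k(2t+1)pi/(2n))."""
    n = len(y)
    tail = sum(y[k] * math.cos(k * (2 * t + 1) * math.pi / (2 * n)) for k in range(1, n))
    return y[0] / math.sqrt(n) + math.sqrt(2 / n) * tail

x = [1.0, 0.0, -1.0, 2.0]          # made-up sample data
y = dct(x)
values_at_nodes = [interpolant(y, j) for j in range(len(x))]  # should reproduce x
```

Because `interpolant` accepts any real t, it can also be evaluated between the data points, for example at t = 0.5.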

One-dimensional DCT
Suppose we are given a vector x = [x_0, …, x_{n-1}]^T.
The Discrete Cosine Transform of x is the n-dimensional vector y = [y_0, …, y_{n-1}]^T = Cx,
where C is defined as
$$C_{ij} = \sqrt{\frac{2}{n}}\, a_i \cos\frac{i(2j+1)\pi}{2n}, \qquad a_0 = \frac{1}{\sqrt{2}},\quad a_i = 1 \text{ for } i > 0.$$

Interpolation with DCT
• Why interpolate with DCT?
• What about Lagrange or splines?
• DCT interpolation gives terms already arranged in order of importance to the human visual system!
• First terms are most important, last terms are least important.

One-dimensional DCT
• Least Squares Approximation Theorem
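
The theorem statement itself is not preserved on this slide. A common form of the result, assumed here rather than taken from the slides, is that keeping only the first m DCT coefficients gives the best least squares fit among cosine sums of that length. Because C is orthogonal, the squared error at the data points then equals the sum of squares of the dropped coefficients. The sketch below checks that identity on made-up data; all function names are illustrative.

```python
import math

def dct(x):
    """Compute y = Cx for the orthonormal DCT matrix C (assumed convention)."""
    n = len(x)
    y = []
    for k in range(n):
        a_k = 1 / math.sqrt(2) if k == 0 else 1.0
        y.append(math.sqrt(2 / n) * a_k *
                 sum(x[j] * math.cos(k * (2 * j + 1) * math.pi / (2 * n)) for j in range(n)))
    return y

def truncated_interpolant(y, m, t):
    """Evaluate the cosine sum that keeps only the first m coefficients y_0..y_{m-1}."""
    n = len(y)
    tail = sum(y[k] * math.cos(k * (2 * t + 1) * math.pi / (2 * n)) for k in range(1, m))
    return y[0] / math.sqrt(n) + math.sqrt(2 / n) * tail

x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]   # made-up sample data
y = dct(x)
m = 3                                            # number of coefficients kept
squared_error = sum((truncated_interpolant(y, m, j) - x[j]) ** 2 for j in range(len(x)))
dropped_energy = sum(c * c for c in y[m:])       # energy of the discarded terms
```

The two quantities agree up to rounding, which is why dropping small coefficients costs little accuracy.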

DCT Interpolation & Approximation

2-D DCT Interpolation
Given a matrix of 16 data points, we can plot the surface in 3-D space.
(Figure: 4 x 4 data matrix and surface plot; entries recovered from the slide, incomplete: 1 1 1 0 0 1 1 1)

2-D Least Squares
• Done in the same way as with 1-D
• Implement a low-pass filter (drop terms)
• Delete the “high-frequency” components

2-D Least Squares Approximation
(Figure: least squares surface; values recovered from the slide: 1.25, 0.75, 0.25, 0.75, 1.25)
Sizeable error due to the small number of points.

Two-Dimensional DCT
• Idea of the 2D-DCT: interpolate the data with a set of basis functions
• Organize information by order of importance to the human visual system
• Used to compress small blocks of an image (8 x 8 pixels in our case)

Two-Dimensional DCT
Use the one-dimensional DCT in both the horizontal and vertical directions.
• First direction: F = C X^T
• Second direction: G = C F^T
• We can say the 2D-DCT is the matrix Y = C(CX^T)^T
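
The two-pass construction can be sketched as follows. The code assumes the orthonormal DCT matrix (a_0 = 1/sqrt(2)) and a square block. The 4 x 4 block and the helper names are made up for illustration.

```python
import math

def dct_matrix(n):
    """n x n orthonormal DCT matrix (assumed convention)."""
    return [[math.sqrt(2 / n) * (1 / math.sqrt(2) if i == 0 else 1.0)
             * math.cos(i * (2 * j + 1) * math.pi / (2 * n))
             for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dct2(X):
    """2D-DCT of a square block: apply the 1-D DCT in one direction, then the other."""
    C = dct_matrix(len(X))
    F = matmul(C, transpose(X))        # first direction:  F = C X^T
    return matmul(C, transpose(F))     # second direction: Y = C F^T = C (C X^T)^T

X = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]                 # made-up 4 x 4 block
Y = dct2(X)

# The same result written as a single product, Y = C X C^T.
C = dct_matrix(4)
Y_direct = matmul(matmul(C, X), transpose(C))
```

The top-left entry of Y is the sum of all entries of X divided by n, here 136 / 4 = 34.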

Image Compression
• Image compression is a method that reduces the amount of memory it takes to store an image.
• We will exploit the fact that the DCT organizes image information by its importance to our visual system.
• This means we can delete the least significant values without our eyes noticing the difference.

Image Compression
• Now we have found the matrix Y = C(CX^T)^T.
• Using the DCT, the entries in Y will be organized based on the human visual system.
• The most important values to our eyes will be placed in the upper left corner of the matrix.
• The least important values will be mostly in the lower right corner of the matrix.
(Figure: matrix regions labeled Most Important, Semi-Important, Least Important)

Image Compression
(Figure: 8 x 8 pixel image block)

Image Compression
• Gray-scale example
• Value range: 0 (black) to 255 (white)
X =
   63   33   36   28   63   81   86   98
   27   18   17   11   22   48  104  108
   72   52   28   15   17   16   47   77
  132  100   56   19   10    9   21   55
  187  186  166   88   13   34   43   51
  184  203  199  177   82   44   97   73
  211  214  208  198  134   52   78   83
  211  210  203  191  133   79   74   86

Image Compression
• 2D-DCT of the matrix X
• The numbers are the coefficients of the interpolating function.
• The entries correspond to subtracting 128 from every pixel before the transform: the top-left entry is (5761 - 64 * 128) / 8 = -303.875, which rounds to -304, where 5761 is the sum of the entries of X.
Y =
  -304   210   104   -69    10    20   -12     7
  -327  -260    67    70   -10   -15    21     8
    93   -84   -66    16    24    -2    -5     9
    89    33   -19   -20   -26    21    -3     0
    -9    42    18    27    -7   -17    29    -7
    -5    15   -10    17    32   -15    -4     7
    10     3   -12    -1     2     3    -2    -3
    12    30     0    -3    -3    -6    12    -1
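
The transform of the gray-scale block can be reproduced with a short script. This is a sketch under two assumptions that the slide does not state explicitly: the orthonormal DCT matrix is used, and 128 is subtracted from every pixel first (without that shift the top-left coefficient would not come out near -304). The helper names are illustrative.

```python
import math

def dct_matrix(n):
    """n x n orthonormal DCT matrix (assumed convention)."""
    return [[math.sqrt(2 / n) * (1 / math.sqrt(2) if i == 0 else 1.0)
             * math.cos(i * (2 * j + 1) * math.pi / (2 * n))
             for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Gray-scale block from the example slide.
X = [[63, 33, 36, 28, 63, 81, 86, 98],
     [27, 18, 17, 11, 22, 48, 104, 108],
     [72, 52, 28, 15, 17, 16, 47, 77],
     [132, 100, 56, 19, 10, 9, 21, 55],
     [187, 186, 166, 88, 13, 34, 43, 51],
     [184, 203, 199, 177, 82, 44, 97, 73],
     [211, 214, 208, 198, 134, 52, 78, 83],
     [211, 210, 203, 191, 133, 79, 74, 86]]

C = dct_matrix(8)
X_centered = [[v - 128 for v in row] for row in X]   # assumed centering step
Y = matmul(matmul(C, X_centered), transpose(C))      # Y = C X C^T
Y_rounded = [[round(v) for v in row] for row in Y]
```

Printing `Y_rounded` row by row allows a direct comparison with the coefficients on the slide.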

Image Compression
• Cut the least significant components (the lower-right entries of Y are set to zero):
  -304   210   104   -69    10    20   -12     0
  -327  -260    67    70   -10   -15     0     0
    93   -84   -66    16    24     0     0     0
    89    33   -19   -20     0     0     0     0
    -9    42    18     0     0     0     0     0
    -5    15     0     0     0     0     0     0
    10     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
• As you can see, we save a little over half the original memory: 36 of the 64 entries are now zero.
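
The cut can be sketched as a mask on the coefficient indices. The rule used here, keep y_kl only when k + l <= 6 (0-based indices), is inferred from the pattern of zeros on the slide and is an assumption. The function name `low_pass` is illustrative.

```python
# DCT coefficients of the example block, rounded to integers.
Y = [[-304, 210, 104, -69, 10, 20, -12, 7],
     [-327, -260, 67, 70, -10, -15, 21, 8],
     [93, -84, -66, 16, 24, -2, -5, 9],
     [89, 33, -19, -20, -26, 21, -3, 0],
     [-9, 42, 18, 27, -7, -17, 29, -7],
     [-5, 15, -10, 17, 32, -15, -4, 7],
     [10, 3, -12, -1, 2, 3, -2, -3],
     [12, 30, 0, -3, -3, -6, 12, -1]]

def low_pass(Y, cutoff):
    """Zero every coefficient whose index sum k + l exceeds the cutoff."""
    n = len(Y)
    return [[Y[k][l] if k + l <= cutoff else 0 for l in range(n)] for k in range(n)]

Y_low = low_pass(Y, 6)
kept = sum(1 for row in Y_low for v in row if v != 0)      # coefficients still stored
zeroed = sum(1 for row in Y_low for v in row if v == 0)    # coefficients dropped
```

With this cutoff, 28 coefficients remain and 36 are zero, matching "a little over half".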

Inverse 2D-DCT
• The 2D-DCT gives us Y = C(CX^T)^T, which can be rewritten as Y = C X C^T.
• Since C is orthogonal, we can solve for X using the fact that C^{-1} = C^T.
• Therefore, X = C^T Y C.
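
A round trip shows the inverse at work. This is a minimal sketch, assuming the orthonormal DCT matrix. The block values and the names `dct2` and `idct2` are made up for illustration.

```python
import math

def dct_matrix(n):
    """n x n orthonormal DCT matrix (assumed convention)."""
    return [[math.sqrt(2 / n) * (1 / math.sqrt(2) if i == 0 else 1.0)
             * math.cos(i * (2 * j + 1) * math.pi / (2 * n))
             for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dct2(X):
    """Forward transform: Y = C X C^T."""
    C = dct_matrix(len(X))
    return matmul(matmul(C, X), transpose(C))

def idct2(Y):
    """Inverse transform: X = C^T Y C, valid because C^{-1} = C^T."""
    C = dct_matrix(len(Y))
    return matmul(matmul(transpose(C), Y), C)

X = [[10, 20, 30, 40],
     [15, 25, 35, 45],
     [0, 5, 250, 255],
     [128, 64, 32, 16]]                 # made-up 4 x 4 block
X_back = idct2(dct2(X))
max_error = max(abs(X_back[i][j] - X[i][j]) for i in range(4) for j in range(4))
```

No matrix inversion routine is needed. Only transposes and products appear.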

Reconstructing the Image
• In mathematical terms:
• Let X = (x_ij) be a matrix of n^2 real numbers, let Y = (y_kl) be the 2D-DCT of X, and let a_0 = 1/sqrt(2) and a_k = 1 for k > 0.
• Then:
$$P_n(s,t) = \frac{2}{n} \sum_{k=0}^{n-1} \sum_{l=0}^{n-1} y_{kl}\, a_k a_l \cos\frac{k(2s+1)\pi}{2n} \cos\frac{l(2t+1)\pi}{2n}$$
• satisfies P_n(i, j) = x_ij for i, j = 0, …, n-1

Reconstructing the Image
• New matrix and compressed image
   55   41   27   39   56   69   92  106
   35   22    7   16   35   59   88  101
   65   49   21    5    6   28   62   73
  130  114   75   28   -7   -1   33   46
  180  175  148   95   33   16   45   59
  200  206  203  165   92   55   71   82
  205  207  214  193  121   70   75   83
  214  205  209  196  129   75   78   85

Can You Tell the Difference?
(Figure: original image beside compressed image)

Image Compression
(Figure: original image beside compressed image)

Tan without Danger

Linear Quantization
• We will not zero the bottom half of the matrix.
• The idea is to assign fewer bits of memory to store information in the lower right corner of the DCT matrix.

Linear Quantization
Use the quantization matrix Q with entries q_kl = 8p(k + l + 1) for 0 <= k, l <= 7
Q = p *
    8   16   24   32   40   48   56   64
   16   24   32   40   48   56   64   72
   24   32   40   48   56   64   72   80
   32   40   48   56   64   72   80   88
   40   48   56   64   72   80   88   96
   48   56   64   72   80   88   96  104
   56   64   72   80   88   96  104  112
   64   72   80   88   96  104  112  120
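
The quantization matrix follows directly from the formula q_kl = 8p(k + l + 1). A minimal sketch, with the illustrative function name `linear_quant_matrix` and 0-based indices k, l = 0, ..., 7:

```python
def linear_quant_matrix(p, n=8):
    """Build Q with entries q_kl = 8 * p * (k + l + 1) for k, l = 0..n-1."""
    return [[8 * p * (k + l + 1) for l in range(n)] for k in range(n)]

Q1 = linear_quant_matrix(1)   # loss parameter p = 1
Q4 = linear_quant_matrix(4)   # loss parameter p = 4, every entry four times larger

for row in Q1:
    print(" ".join(f"{v:4d}" for v in row))
```

Entries grow toward the lower right, so high-frequency coefficients are divided by larger numbers.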

Linear Quantization
• p is called the loss parameter.
• It acts like a “knob” to control compression.
• The greater p is, the more you compress the image.

Linear Quantization
We divide each entry of the DCT matrix Y by the corresponding entry of the quantization matrix Q and round the result to the nearest integer.
Y =
  -304   210   104   -69    10    20   -12     7
  -327  -260    67    70   -10   -15    21     8
    93   -84   -66    16    24    -2    -5     9
    89    33   -19   -20   -26    21    -3     0
    -9    42    18    27    -7   -17    29    -7
    -5    15   -10    17    32   -15    -4     7
    10     3   -12    -1     2     3    -2    -3
    12    30     0    -3    -3    -6    12    -1
Q = p *
    8   16   24   32   40   48   56   64
   16   24   32   40   48   56   64   72
   24   32   40   48   56   64   72   80
   32   40   48   56   64   72   80   88
   40   48   56   64   72   80   88   96
   48   56   64   72   80   88   96  104
   56   64   72   80   88   96  104  112
   64   72   80   88   96  104  112  120

Linear Quantization
p = 1, new Y: 14 nonzero terms
  -38   13    4   -2    0    0    0    0
  -20  -11    2    2    0    0    0    0
    4   -3   -2    0    0    0    0    0
    3    1    0    0    0    0    0    0
    0    1    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
p = 4, new Y: 10 nonzero terms
   -9    3    1   -1    0    0    0    0
   -5   -3    1    0    0    0    0    0
    1   -1    0    0    0    0    0    0
    1    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
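
The quantization step can be sketched end to end for the example block. The code rests on assumptions that are not spelled out on the slides: the orthonormal DCT matrix is used, 128 is subtracted from each pixel before the transform, and quantization means dividing entrywise by Q and rounding to the nearest integer. All function names are illustrative.

```python
import math

def dct_matrix(n):
    """n x n orthonormal DCT matrix (assumed convention)."""
    return [[math.sqrt(2 / n) * (1 / math.sqrt(2) if i == 0 else 1.0)
             * math.cos(i * (2 * j + 1) * math.pi / (2 * n))
             for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def quantize(Y, p):
    """Divide y_kl by q_kl = 8p(k + l + 1) and round to the nearest integer."""
    n = len(Y)
    return [[round(Y[k][l] / (8 * p * (k + l + 1))) for l in range(n)] for k in range(n)]

def count_nonzero(M):
    return sum(1 for row in M for v in row if v != 0)

# Gray-scale block from the example slide.
X = [[63, 33, 36, 28, 63, 81, 86, 98],
     [27, 18, 17, 11, 22, 48, 104, 108],
     [72, 52, 28, 15, 17, 16, 47, 77],
     [132, 100, 56, 19, 10, 9, 21, 55],
     [187, 186, 166, 88, 13, 34, 43, 51],
     [184, 203, 199, 177, 82, 44, 97, 73],
     [211, 214, 208, 198, 134, 52, 78, 83],
     [211, 210, 203, 191, 133, 79, 74, 86]]

C = dct_matrix(8)
Y = matmul(matmul(C, [[v - 128 for v in row] for row in X]), transpose(C))
Yq1 = quantize(Y, 1)   # mild compression
Yq4 = quantize(Y, 4)   # stronger compression, fewer nonzero terms
```

To decompress, each quantized entry would be multiplied by its q_kl again before applying the inverse 2D-DCT. The rounding error is what makes the scheme lossy.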

Linear Quantization
(Figures: reconstructed images for p = 1 and p = 4)

Memory Storage
• The original image uses one byte (8 bits) for each pixel.
• Therefore, the amount of memory needed for each 8 x 8 block is 8 x 8^2 = 512 bits.

Is This Worth the Work?
• The question that arises is “How much memory does this save?”
Linear quantization:
  p              Total bits   Bits/pixel
  X (original)   512          8
  1              249          3.89
  2              191          2.98
  3              147          2.30
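
The bits-per-pixel column is the total bit count divided by the 64 pixels of a block. The sketch below redoes that arithmetic using the totals from the table. How those totals were obtained (the coding scheme for the quantized entries) is not described on the slide and is not modeled here. The names are illustrative.

```python
PIXELS_PER_BLOCK = 8 * 8
ORIGINAL_BITS = 8 * PIXELS_PER_BLOCK          # one byte per pixel -> 512 bits

def bits_per_pixel(total_bits):
    """Average storage cost per pixel for one 8 x 8 block."""
    return total_bits / PIXELS_PER_BLOCK

def savings(total_bits):
    """Fraction of the original 512 bits that is saved."""
    return 1 - total_bits / ORIGINAL_BITS

table_totals = {1: 249, 2: 191, 3: 147}       # p -> total bits, copied from the table
summary = {p: (round(bits_per_pixel(b), 2), round(100 * savings(b), 1))
           for p, b in table_totals.items()}
```

For the first row of the table this gives 3.89 bits per pixel, a saving of roughly half the original storage.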

JPEG Imaging
• It is fairly easy to extend this application to color images.
• These are expressed in the RGB color system.
• Each pixel is assigned three integers, one for each color intensity.

RGB Coordinates

The Approach
• There are a few ways to approach the image compression.
• Repeat the discussed process independently for each of the three colors and then reconstruct the image.
• Baseline JPEG uses a more delicate approach.
• Define the luminance coordinate to be: Y = 0.299R + 0.587G + 0.114B
• Define the color difference coordinates to be: U = B - Y and V = R - Y
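
The color transform above can be sketched per pixel. The function name `rgb_to_yuv` and the sample pixel are made up for illustration. No scaling or offset is applied to U and V beyond what the slide defines.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to luminance Y and color differences U = B - Y, V = R - Y."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = b - y
    v = r - y
    return y, u, v

white = rgb_to_yuv(255, 255, 255)   # a gray-level pixel: U and V are essentially 0
orange = rgb_to_yuv(230, 120, 20)   # made-up colored pixel: nonzero U and V
```

For any gray pixel (R = G = B) the weights sum to 1, so Y equals the gray level and both color differences vanish up to rounding.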

More on Baseline
• This transforms the RGB color data to the YUV system, which is easily reversible.
(Figure: color axes R, G, B and Y, U, V)
• It applies the DCT filtering independently to Y, U, and V, using the quantization matrix Q_Y for luminance and Q_C for the color differences.

JPEG Quantization
Luminance: Q_Y = p *
   16   11   10   16   24   40   51   61
   12   12   14   19   26   58   60   55
   14   13   16   24   40   57   69   56
   14   17   22   29   51   87   80   62
   18   22   37   56   68  109  103   77
   24   35   55   64   81  104  113   92
   49   64   78   87  103  121  120  101
   72   92   95   98  112  100  103   99

JPEG Quantization
Chrominance: Q_C =
   17   18   24   47   99   99   99   99
   18   21   26   66   99   99   99   99
   24   26   56   99   99   99   99   99
   47   66   99   99   99   99   99   99
   99   99   99   99   99   99   99   99
   99   99   99   99   99   99   99   99
   99   99   99   99   99   99   99   99
   99   99   99   99   99   99   99   99

Luminance and Chrominance
• The human eye is more sensitive to luminance (Y coordinate).
• It is less sensitive to color changes (U and V coordinates).
• So: compress U and V more!
• Consequence: color images are more compressible than grayscale ones.

Reconstitution
• After compression, Y, U, and V are recombined and converted back to RGB to form the compressed color image:
• B = U + Y
• R = V + Y
• G = (Y - 0.299R - 0.114B) / 0.587
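
The forward and inverse color transforms can be checked against each other. This is a minimal sketch using the formulas from the slides. The function names and the sample pixel are made up, and the compression step between the two transforms is left out.

```python
def rgb_to_yuv(r, g, b):
    """Forward transform: luminance Y and color differences U = B - Y, V = R - Y."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, b - y, r - y

def yuv_to_rgb(y, u, v):
    """Inverse transform: recover B and R first, then solve the luminance equation for G."""
    b = u + y
    r = v + y
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

pixel = (200.0, 120.0, 30.0)                  # made-up RGB pixel
restored = yuv_to_rgb(*rgb_to_yuv(*pixel))    # should match the input up to rounding
```

In a real pipeline the restored values would also be rounded and clipped to the range 0 to 255, since quantization can push them slightly outside it.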

Comparing Compression
(Figure panels: original, p = 1, p = 4, p = 8)

Up Close

The End Thanks for Coming!