Data Compression © Prof. Aiman Hanna Department of Computer Science Concordia University Montreal, Canada


Data Compression: Why needed?
• Size of applications is going from large to larger
• MP3, MPEG, TIFF, facsimile (fax), etc.
• Fax has about 4 million dots/page, which takes more than 1 minute over 56 Kbps
• TV / motion pictures use 30 pictures (frames) / second
  • 200,000 pixels / frame
  • Color pictures require 3 bytes for each pixel (RGB)
  • Each frame has 200,000 * 24 = 4.8 Mbits
  • A 2-hour movie requires 216,000 pictures
  • Total bits for such a movie = 216,000 * 4.8 Mbits = 1.0368 x 10^12
  • This is much higher than the capacity of DVDs
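The movie-size arithmetic above can be checked with a few lines of Python. This is only a sketch using the slide's figures; the DVD capacity (4.7 GB, single layer, decimal units) is an assumption added here for comparison and does not come from the slides.

```python
# Figures taken from the slide.
PIXELS_PER_FRAME = 200_000
BITS_PER_PIXEL = 24            # 3 bytes per pixel (RGB)
FRAMES_PER_SECOND = 30
MOVIE_SECONDS = 2 * 60 * 60    # 2-hour movie

# Assumption (not from the slide): single-layer DVD, 4.7 GB in decimal units.
DVD_BITS = 4.7e9 * 8

bits_per_frame = PIXELS_PER_FRAME * BITS_PER_PIXEL
frame_count = FRAMES_PER_SECOND * MOVIE_SECONDS
total_bits = bits_per_frame * frame_count

print(bits_per_frame)          # 4800000
print(frame_count)             # 216000
print(total_bits)              # 1036800000000
print(total_bits / DVD_BITS)   # roughly 27.6 such discs
```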

Data Compression: Simple Compression Example
• Assume only uppercase characters are to be sent
• ASCII can be used: 8 bits * number of characters to send
• Alternatively, a different 5-bit code can be used:
  A: 00000
  B: 00001
  C: 00010
  :
  Y: 11000
  Z: 11001
• A 37.5% reduction is achieved
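A minimal sketch of the 5-bit code, assuming the letters are simply numbered A = 0 through Z = 25 (which agrees with the Y and Z codes shown). The function name `encode_5bit` is made up for this illustration.

```python
def encode_5bit(text):
    """Encode uppercase A-Z as fixed 5-bit codes (A = 00000 ... Z = 11001)."""
    return "".join(format(ord(ch) - ord("A"), "05b") for ch in text)


message = "ABC"
encoded = encode_5bit(message)
print(encoded)                               # 000000000100010

# 5 bits per character instead of 8.
saving = 1 - len(encoded) / (8 * len(message))
print(saving)                                # 0.375
```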

Frequency-Dependent Codes
• Some characters are used more than others
• ASCII assigns the same number of bits to all characters
• Alternatively, assign shorter codes to the characters that are used more frequently
• Huffman code & arithmetic compression are examples of frequency-dependent codes

Huffman Code
• Assign a percentage of usage to each character
• Create a Huffman code based on that
• Example
  • For illustration, assume only 5 characters are used

Letter   Frequency
A        25%
B        15%
C        10%
D        20%
E        30%

Huffman Code
• Huffman code for the previous example

Letter   Code
A        01
B        110
C        111
D        10
E        00

• Now, assume the sent code is 11010010
• How can the receiver decode that sequence?

Huffman Code
• Huffman codes have the no-prefix property: no code is a prefix of another code
• Codes are obtained by creating Huffman trees and merging them
Figure 5.2 – Merging Huffman Trees
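The tree-merging idea can be sketched with a priority queue: repeatedly merge the two lightest trees until one remains, then read the codes off the branches. This is an illustrative sketch, not the slides' own procedure. The frequencies (A 25%, B 15%, C 10%, D 20%, E 30%) are assumed from the example. The exact bit patterns depend on how ties and left/right branches are chosen, so only the code lengths are expected to match the table in the example.

```python
import heapq
from itertools import count


def huffman_codes(freqs):
    """Build a prefix-free code by repeatedly merging the two lightest trees."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(weight, next(tiebreak), symbol) for symbol, weight in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: a single symbol
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes


codes = huffman_codes({"A": 25, "B": 15, "C": 10, "D": 20, "E": 30})
print({letter: len(code) for letter, code in sorted(codes.items())})
# {'A': 2, 'B': 3, 'C': 3, 'D': 2, 'E': 2}
```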

Huffman Code
• Using the no-prefix property enables decoding
Figure 5.1 – Receiving & Interpreting Huffman-Coded Message
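A sketch of how a receiver might use the no-prefix property: read bits one at a time and emit a letter as soon as the bits collected so far match a code. The table is the one from the example (A 01, B 110, C 111, D 10, E 00). The function name and the test bit string are made up for this illustration.

```python
CODES = {"A": "01", "B": "110", "C": "111", "D": "10", "E": "00"}


def huffman_decode(bits, codes=CODES):
    """Return (decoded text, leftover bits that did not complete a code)."""
    by_code = {code: letter for letter, code in codes.items()}
    decoded, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in by_code:        # safe: no code is a prefix of another
            decoded.append(by_code[buffer])
            buffer = ""
    return "".join(decoded), buffer


# "BADE" encodes as 110 01 10 00.
print(huffman_decode("110011000"))   # ('BADE', '')
```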

Arithmetic Compression
• Interprets a character string as a single real number
• Defines an association between a character string and a real number in [0..1]
• Example
  • For illustration, assume only 5 characters are used

Letter   Frequency   Subinterval [p, q]
A        25%         [0, 0.25]
B        15%         [0.25, 0.40]
C        10%         [0.4, 0.5]
D        20%         [0.5, 0.7]
E        30%         [0.7, 1]

Arithmetic Compression
• In a nutshell, the idea is:
  • Start with the entire range, that is [0..1]
  • Narrow this range down every time you move through the string
  • This narrowing-down operation depends on two factors:
    • The previous narrowed-down range
    • The frequency of the character
  • Once the last character in the string is reached, just pick any value from the final range

Arithmetic Compression
• Example:

Letter   Frequency   Subinterval [p, q]
A        25%         [0, 0.25]
B        15%         [0.25, 0.40]
C        10%         [0.4, 0.5]
D        20%         [0.5, 0.7]
E        30%         [0.7, 1]

• What is the code for CABACE?
• Start range is [0..1], width is 1
• C has subinterval [0.40..0.50]; we get 0.40 * 1 (which is 0.40) and 0.50 * 1 (which is 0.50) from the beginning of the previous range
• New range now is [0.40..0.50]; that is [0 + 0.40 .. 0 + 0.50]

Arithmetic Compression
• Example:

Letter   Frequency   Subinterval [p, q]
A        25%         [0, 0.25]
B        15%         [0.25, 0.40]
C        10%         [0.4, 0.5]
D        20%         [0.5, 0.7]
E        30%         [0.7, 1]

• What is the code for CABACE?
• Start range now is [0.40..0.50], width is 0.10
• A has subinterval [0..0.25]; we get 0 * 0.10 (which is 0) and 0.25 * 0.10 (which is 0.025) from the beginning of the previous range
• New range now is [0.40..0.425]; that is [0.40 + 0 .. 0.40 + 0.025]

Arithmetic Compression
• Example (continued): What is the representation of CABACE?

Letter   Frequency   Subinterval [p, q]
A        25%         [0, 0.25]
B        15%         [0.25, 0.40]
C        10%         [0.4, 0.5]
D        20%         [0.5, 0.7]
E        30%         [0.7, 1]

• From the previous page, the range after CA is coded is [0.4..0.425], a width of 0.025 (that is, 0.425 – 0.4)
• B has subinterval [0.25..0.40]; we get 0.25 * 0.025 (which is 0.00625) and 0.40 * 0.025 (which is 0.01) from the beginning of the previous range
• New range now is [0.40625..0.41]; that is [0.4 + 0.00625 .. 0.4 + 0.01]
• A will move the range to [0.40625..0.4071875]
• C will move the range to [0.406625..0.40671875]
• E will finally move the range to [0.406690625..0.40671875]
• A final code can then be 0.40671
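The narrowing steps can be reproduced with exact fractions, which avoids floating-point noise in the long decimals. This is a sketch using the subinterval table from the example; the function name `narrow` is made up here.

```python
from fractions import Fraction as F

# Subintervals [p, q] from the example table.
SUBINTERVALS = {
    "A": (F(0), F(25, 100)),
    "B": (F(25, 100), F(40, 100)),
    "C": (F(40, 100), F(50, 100)),
    "D": (F(50, 100), F(70, 100)),
    "E": (F(70, 100), F(1)),
}


def narrow(text):
    """Return the final (low, high) range after processing every character."""
    low, high = F(0), F(1)
    for ch in text:
        p, q = SUBINTERVALS[ch]
        width = high - low
        # Both new endpoints are measured from the start of the previous range.
        low, high = low + p * width, low + q * width
    return low, high


low, high = narrow("CABACE")
print(float(low), float(high))   # 0.406690625 0.40671875
```

Any value inside the final range, such as 0.40671, can serve as the code.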

Arithmetic Compression
• Decoding: real value to character string
• What is the string for 0.40671?

Step   Value (V)   Subinterval [p, q]   Width   Char   V - p    Divide by width
1      0.40671     [0.4..0.5]           0.10    C      0.0067   0.067
2      0.067       [0..0.25]            0.25    A      0.067    0.268
3      0.268       [0.25..0.40]         0.15    B      0.018    0.12
4      0.12        [0..0.25]            0.25    A      0.12     0.48
5      0.48        [0.4..0.5]           0.10    C      0.08     0.8
6      0.8         [0.7..1]             0.30    E      0.10     0.333
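The decoding table follows a simple loop: find the subinterval containing V, emit its letter, subtract p, and divide by the width. The sketch below makes two assumptions that are not stated on the slide: the receiver knows the message length (6), and each subinterval is treated as half-open [p, q) so that boundary values are unambiguous. It carries exact fractions, so intermediate values differ slightly from the rounded figures in the table.

```python
from fractions import Fraction as F

SUBINTERVALS = {
    "A": (F(0), F(25, 100)),
    "B": (F(25, 100), F(40, 100)),
    "C": (F(40, 100), F(50, 100)),
    "D": (F(50, 100), F(70, 100)),
    "E": (F(70, 100), F(1)),
}


def arithmetic_decode(value, length):
    """Recover `length` characters from a code value in [0, 1)."""
    letters = []
    for _ in range(length):
        for letter, (p, q) in SUBINTERVALS.items():
            if p <= value < q:                  # assumed half-open [p, q)
                letters.append(letter)
                value = (value - p) / (q - p)   # rescale back to [0, 1)
                break
        else:
            raise ValueError("value is outside [0, 1)")
    return "".join(letters)


print(arithmetic_decode(F("0.40671"), 6))   # CABACE
```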

Run-Length Encoding: Run of the Same Bit
• 0 represents a white spot, 1 a black spot
• Bits are grouped into long runs of 0s
• Instead of transmitting the bits, transmit the length of each run
• Use 4 bits for each run length (value ranges from 0 to 15), however:
  • The longest run a single 4-bit group can represent on its own is 14; 1111 means 15 zeros, with the run continuing in the next group
  • Run of 15 is represented as 1111 0000
  • Run of 30 is represented as 1111 1111 0000
  • Run of 31 is represented as 1111 1111 0001
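A sketch of the 4-bit run-length scheme, under the usual reading that the group 1111 means "15 more, keep reading" and any other group ends the run. The helper names are made up for this illustration.

```python
def encode_run(length):
    """Encode one run length as space-separated 4-bit groups."""
    groups = []
    while length >= 15:
        groups.append("1111")    # 15 counted so far, run continues
        length -= 15
    groups.append(format(length, "04b"))   # final group is 0..14
    return " ".join(groups)


def decode_run(groups):
    """Inverse of encode_run: add up the values of all groups."""
    return sum(int(group, 2) for group in groups.split())


for n in (14, 15, 30, 31):
    print(n, encode_run(n))
# 14 1110
# 15 1111 0000
# 30 1111 1111 0000
# 31 1111 1111 0001
```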

Run-Length Encoding: Run of the Same Bit
Figure 5.5 – Run-Length Encoding

Run-Length Encoding: Runs with Different Characters
• Needed if the transmitted stream is a character stream (not only 1s & 0s)
• A string of “HHHHUFFFFFFFGGG” would be coded as: 4H 1U 7F 3G
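Character runs can be sketched with `itertools.groupby`, which groups consecutive equal characters. The count-then-character output format follows the slide; the function name is made up.

```python
from itertools import groupby


def run_length_encode(text):
    """Encode each run of equal characters as <count><character>."""
    return " ".join(f"{len(list(run))}{ch}" for ch, run in groupby(text))


print(run_length_encode("HHHHUFFFFFFFGGG"))   # 4H 1U 7F 3G
```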

Relative Encoding
• Also referred to as differential encoding
• Huffman codes are good for data messages
• Run-length encoding is good for fax and voice
• Neither of them is that suitable for video
• Consecutive pictures may differ very little
• Send the full 1st frame, then send frames that contain only the differences
• Run-length encoding can be used for those differential frames

Relative Encoding
Figure 5.6 – Relative Encoding

Lempel-Ziv Compression
• Focuses on repetition of words or phrases
• For example: the, that, ing, in, of, etc.
• Look for often-repeated strings and store (code) them just once
• Reference those strings through their special code
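One well-known member of the Lempel-Ziv family is LZW, shown here as a sketch of the "store a repeated string once, then reference it by code" idea. The slides do not say which variant they have in mind, so LZW is a choice made here. The sketch assumes 8-bit characters, so the table starts with codes 0 to 255.

```python
def lzw_compress(text):
    """Return a list of integer codes; repeated strings reuse table entries."""
    table = {chr(i): i for i in range(256)}   # assumed 8-bit alphabet
    next_code = 256
    current = ""
    output = []
    for ch in text:
        candidate = current + ch
        if candidate in table:
            current = candidate               # keep extending the match
        else:
            output.append(table[current])     # emit code for longest match
            table[candidate] = next_code      # remember the new string
            next_code += 1
            current = ch
    if current:
        output.append(table[current])
    return output


print(lzw_compress("ABABABA"))   # [65, 66, 256, 258]
```

Seven characters are sent as four codes because "AB" and "ABA" are each stored once and then referenced.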

Image Compression: Image Representation
• Images are made up of very small dots (pixels)
• Is there really such a thing as black & white photos?

Image Compression
• Different video colors may be obtained using combinations of Red, Green & Blue (RGB)
• 8 bits can be used to represent each of the three colors
• A total of 24 bits gives 2^24 different combinations
• Since the human eye cannot distinguish this many colors, we think of it as true color

Image Compression
• An alternative to RGB is YIQ, which is based on RGB
• YIQ uses one 8-bit group for luminance (brightness) and two other 8-bit groups for chrominance (color)
• For example:
  Y = 0.30 R + 0.59 G + 0.11 B (luminance)
  I = 0.60 R - 0.28 G - 0.32 B (chrominance)
  Q = 0.21 R - 0.52 G + 0.31 B (chrominance)
• The human eye is more sensitive to luminance than to chrominance; that is, some loss of color during transmission may not be visually detectable
• That is useful information when it comes to image compression
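The three formulas translate directly into code. This sketch uses exactly the coefficients given on the slide; the function name is made up. Note that a gray pixel (R = G = B) carries no chrominance, because the I and Q coefficients each sum to zero.

```python
def rgb_to_yiq(r, g, b):
    """Convert one RGB pixel to (Y, I, Q) with the slide's coefficients."""
    y = 0.30 * r + 0.59 * g + 0.11 * b   # luminance
    i = 0.60 * r - 0.28 * g - 0.32 * b   # chrominance
    q = 0.21 * r - 0.52 * g + 0.31 * b   # chrominance
    return y, i, q


y, i, q = rgb_to_yiq(255, 255, 255)      # white
print(round(y, 6), round(i, 6), round(q, 6))
```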

Image Compression
• Regardless of using RGB or YIQ, there are three 8-bit groups per pixel
• To fill a 640 x 480 computer screen, we need: 640 * 480 * 24 = 7,372,800 bits
• Video usually uses 30 images/sec, and transfer may be done simultaneously to many users
• This may easily average to Gbps rates and higher, which is far more than what current technology (as of 2006, when this was written) can handle

JPEG Compression
• Joint Photographic Experts Group
• Formed by ISO, ITU & IEC
• Previous methods were examples of lossless compression
• JPEG, however, is lossy
• Considering the human visual system, that may still be acceptable

JPEG Compression
• JPEG is good for grayscale or photographic color images
• JPEG compression consists of three phases: Discrete Cosine Transform (DCT), Quantization, and Encoding
• DCT divides an image into a series of “blocks” (8 x 8 pixels each), e.g. 640 x 480 pixels = 80 x 60 blocks
Figure 5.8 – JPEG’s Three Phases

JPEG Compression
• For example, an 800 x 800 image would be divided into 100 x 100 blocks, each block having 8 x 8 pixels
Figure 5.9 – 800 x 800 VGA Screen Image Divided into 8 x 8 Pixel Blocks

JPEG Compression
• If grayscale, then each pixel is represented by an 8-bit number
  • Each 8 x 8 pixel block will be represented as a 2-D array with 8 rows & 8 columns
• If color, then each pixel is represented by a 24-bit number
  • Each 8 x 8 pixel block will be represented as three 2-D arrays with 8 rows & 8 columns each

JPEG Compression: DCT Phase
• Basically, it is a function that takes a 2-D array with 8 rows & 8 columns and produces another 2-D array
• P[i][j] is the input array, T[i][j] is the output array
• The values inside T[i][j] are called spatial frequencies
• Formula: page 240
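The formula itself is not reproduced in the slides. The sketch below uses the standard 8 x 8 forward DCT that JPEG is commonly described with, on the assumption that it matches the one cited: T[i][j] = 1/4 C(i) C(j) ΣΣ P[x][y] cos((2x+1)iπ/16) cos((2y+1)jπ/16), with C(0) = 1/√2 and C(k) = 1 otherwise. The function name `dct_8x8` is made up here.

```python
import math


def dct_8x8(p):
    """Forward 2-D DCT of an 8x8 block p; returns the 8x8 array T."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    t = [[0.0] * 8 for _ in range(8)]
    for i in range(8):
        for j in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (p[x][y]
                              * math.cos((2 * x + 1) * i * math.pi / 16)
                              * math.cos((2 * y + 1) * j * math.pi / 16))
            t[i][j] = 0.25 * c(i) * c(j) * total
    return t


# A perfectly flat block has no spatial variation, so only T[0][0] is non-zero.
flat = [[10] * 8 for _ in range(8)]
t = dct_8x8(flat)
print(round(t[0][0], 6))   # 80.0
```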

JPEG Compression: DCT Phase
Figure 5.10 – Discrete Cosine Transform Results on Two Different Arrays

JPEG Compression: Quantization Phase
• Provides a way of ignoring small differences that may not be perceptible
• It produces another 2-D array, call it Q for example
• Q[i][j] values are obtained by dividing T[i][j] values by some number and rounding to the nearest integer
• The resulting array will have fewer distinct numbers, so it is easier to compress

JPEG Compression: Quantization Phase
• Example: T values are divided by 10, then rounded
Step 5-3, page 242 – T Array
Step 5-4, page 242 – Q Array

JPEG Compression: Quantization Phase
• How can we obtain T (decompression) from Q then?
Step 5-5, page 242 – T Array After Decompression
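A sketch of quantization and its (lossy) reversal with a single divisor of 10, as in the example. The sample T values below are invented for illustration and are not the arrays from the cited pages. Note that Python's `round` sends exact .5 ties to the nearest even integer, which may differ from the rounding rule used in the source.

```python
def quantize(t, divisor=10):
    """Q[i][j] = T[i][j] / divisor, rounded to the nearest integer."""
    return [[round(value / divisor) for value in row] for row in t]


def dequantize(q, divisor=10):
    """Approximate T by multiplying back; the rounding loss is not recovered."""
    return [[value * divisor for value in row] for row in q]


# Hypothetical sample values, not taken from the slides.
T = [[134, -27],
     [12, 3]]
Q = quantize(T)
print(Q)               # [[13, -3], [1, 0]]
print(dequantize(Q))   # [[130, -30], [10, 0]]
```

The decompressed values are close to, but not equal to, the original T values, which is where JPEG loses information.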

JPEG Compression: Quantization Phase
• To preserve as much information as possible, define a quantization array, say U, and divide T values by the corresponding values of U
Step 5-3, page 242 – T Array
Step 5-6, page 243 – U Array

JPEG Compression: Quantization Phase
Step 5-7, page 243 – Q Array
Step 5-8, page 243 – T Array After Decompression

JPEG Compression: Encoding Phase
• So far, transformation & quantization were done, but nothing about compression!
• This phase finally does the compression
• The idea is to linearize the content of the Q array and compress it for transmission
• Run-length coding can then be used
Figure 5.11 – Order in Which Array Elements Are Transmitted
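The figure is not reproduced here. The sketch below assumes the conventional JPEG zigzag scan, which walks the anti-diagonals of the array in alternating directions so that the (mostly zero) high-frequency entries end up together at the end, which suits run-length coding. The function name is made up.

```python
def zigzag_indices(n=8):
    """Return (row, column) pairs of an n x n array in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):                # s = row + column of a diagonal
        diagonal = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diagonal.reverse()                # even diagonals run upward
        order.extend(diagonal)
    return order


def linearize(q):
    """Flatten a square array into a list following the zigzag order."""
    return [q[i][j] for i, j in zigzag_indices(len(q))]


print(zigzag_indices()[:6])
# [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```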

JPEG Compression
• JPEG can use Huffman encoding or arithmetic encoding for the non-zero values
• Since many 2-D arrays must be transmitted for the image, and many of them may not differ much, differential encoding can also be applied
• In general, the JPEG compression ratio is about 20:1; that is, the resulting file is 5% of the original
• Better ratios are possible, but loss of quality may become noticeable
• JPEG 2000 is the newest JPEG coding
• More details on JPEG can be found at www.JPEG.org

GIF Files
• Graphics Interchange Format
• The number of colors that GIF works with is only 256 (2^8), in contrast to 2^24 for JPEG
• Stores up to 256 colors in a table and attempts to cover the range of colors in an image as closely as possible
• The resulting bit values are subjected to some form of Lempel-Ziv compression
• Lossy if the number of colors in an image is more than 256; lossless otherwise
• Best suited for graphics that contain relatively few colors and sharply defined boundaries between the colors, such as cartoons, charts, etc.
• Not that suitable for images with lots of variation between colors, such as full-color photographic-quality images

Multimedia Compression
• Compression of video, motion pictures and audio
• MPEG: Moving Picture Experts Group
• MP3: file extension for MPEG audio Layer 3

MPEG
• Actually not a single standard:
  • MPEG-1: designed for video on CD-ROM
  • MPEG-2: designed for more demanding applications, such as multimedia entertainment and high-definition television (HDTV)
  • MPEG-4: designed for videoconferencing over low-bandwidth channels
  • MPEG-7 & MPEG-21 are also in progress

MPEG
• Video/motion is produced by displaying still pictures at a rate of some frames/second
• NTSC defined that as 30 frames/second
• A lower rate than that may still produce motion, but a jerky one, as in older movies
• MPEG compression could be obtained by using JPEG compression for each of these frames; however, this is not sufficient

MPEG
• On 640 x 480 VGA, one frame may contain 24 * 640 * 480 = 7,372,800 bits
• With 20:1 JPEG compression, the image is reduced to 368,640 bits
• At 30 images/second: 30 * 368,640 = 11,059,200 bps
• That is still too high, especially considering shared channels that may be used by multiple users accessing video

MPEG
• No matter how much action is in a video, the difference between consecutive frames is usually quite small
• MPEG compression takes advantage of this redundancy (temporal redundancy) in successive frames, however:
  • What happens if hidden objects in the old frame start to appear?
  • What happens if the scene completely changes?
• MPEG identifies 3 types of frames:
  • I Frame (Intrapicture Frame)
  • P Frame (Predicted Frame)
  • B Frame (Bidirectional Frame)

MPEG
Figure 5.12 – Typical MPEG Frame Sequence (Logical Sequence)
• P frames are coded using a method called motion-compensated prediction
• This method divides the image into a collection of macroblocks of 16 x 16 pixels (256 pixels in total)
• If each pixel has a color (1 luminance & 2 chrominance values), then three 2-D arrays are needed (each array of size 16 x 16)

MPEG
• To speed things up, the two chrominance arrays are reduced from 16 x 16 to 8 x 8
Figure 5.13 – Reduction of 16 x 16 Chrominance Arrays to 8 x 8 Chrominance Arrays

MPEG
• Prior to sending the P frame, an algorithm runs to find the best-matching macroblock in the I frame; this may not be at the same position in the I frame
• The algorithm then calculates the differences between the matching macroblocks in the I & P frames
• The algorithm also calculates a motion vector
• Both the differences and the motion vectors are then transmitted
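The slides do not specify how the best match is found. One common approach, used here purely as an assumed illustration, is an exhaustive search over a small window that minimizes the sum of absolute differences (SAD). The function name, the window size, and the tiny 6 x 6 frame with 2 x 2 blocks are all made up to keep the example readable.

```python
def best_match(reference, block, top, left, search=2):
    """Search `reference` around (top, left) for the block most like `block`.

    Returns (sad, (dy, dx)): the smallest sum of absolute differences found
    and the motion vector that leads to it.
    """
    size = len(block)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if (y < 0 or x < 0 or y + size > len(reference)
                    or x + size > len(reference[0])):
                continue                      # candidate falls off the frame
            sad = sum(abs(reference[y + r][x + c] - block[r][c])
                      for r in range(size) for c in range(size))
            if best is None or sad < best[0]:
                best = (sad, (dy, dx))
    return best


# Toy I frame; the P-frame block at (2, 2) has the content the I frame had at (1, 2).
i_frame = [[r * 6 + c for c in range(6)] for r in range(6)]
p_block = [row[2:4] for row in i_frame[1:3]]
print(best_match(i_frame, p_block, top=2, left=2))   # (0, (-1, 0))
```

A SAD of 0 means the block moved without changing, so only the motion vector would carry information; otherwise the per-pixel differences are sent as well.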

MPEG
• The B frame is similar to the P frame, except that the macroblocks are interpolated from matching macroblocks in prior and future frames
• Interpolation is a way of predicting a value based on two existing values
Figure 5.14 – Using Interpolation to Estimate a Value
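As a minimal sketch, linear interpolation estimates a value part of the way between two known values. Whether MPEG's B-frame interpolation is exactly linear is not stated in the slides, so treat this only as an illustration of the general idea; the names are made up.

```python
def interpolate(earlier, later, fraction=0.5):
    """Estimate a value `fraction` of the way from `earlier` to `later`."""
    return earlier + (later - earlier) * fraction


# A pixel that is 10 in the prior frame and 20 in the future frame
# is estimated as 15 in a frame halfway between them.
print(interpolate(10, 20))         # 15.0
print(interpolate(10, 20, 0.25))   # 12.5
```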

MPEG
• Notes:
  • MPEG has high computational and algorithmic complexity
  • However, in most applications video is recorded just once and stored on some medium
  • As a result, time spent on encoding (compression) is not a concern here
  • Yet, time spent on decompression (decoding) is very significant since it happens at run time

MP3
• MPEG Layer 3 for audio compression
• Pulse-Code Modulation (PCM) can be used to transform analog (audio) to digital
• The human auditory system can only hear frequencies between 20 Hz & 20 KHz
• According to the Nyquist theorem, a sampling frequency of about 40 KHz would be sufficient to reconstruct the audio signal within a human’s hearing range
• Consequently, common PCM techniques that produce CD-quality sound use 16-bit samples and a 44.1 KHz sampling frequency
• 1 second of PCM-coded music requires 16 * 44.1 * 1000 ≈ 700,000 bits
• 2-channel stereo would hence require about 1.4 Mbits per second
• A 2-minute recording would require about 168 Mbits
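The PCM figures can be checked directly. The slide rounds at each step, so the exact values below are slightly larger than the rounded ones (about 700,000 bits, 1.4 Mbits, 168 Mbits).

```python
BITS_PER_SAMPLE = 16
SAMPLES_PER_SECOND = 44_100    # 44.1 KHz
CHANNELS = 2                   # stereo
RECORDING_SECONDS = 2 * 60     # 2-minute recording

mono_bps = BITS_PER_SAMPLE * SAMPLES_PER_SECOND
stereo_bps = mono_bps * CHANNELS
recording_bits = stereo_bps * RECORDING_SECONDS

print(mono_bps)         # 705600
print(stereo_bps)       # 1411200
print(recording_bits)   # 169344000
```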

MP3
• Psychoacoustic model:
  • What can the human auditory system hear?
  • What can the human auditory system distinguish?
• Auditory masking:
  • If two sounds have similar frequencies, but one is weak and the other is loud, it is possible that a human cannot hear the weak sound
• The fundamental idea of MP3 is:
  • Capture an audio signal
  • Determine what cannot be heard and remove it, then
  • Digitize the rest


MP3
Figure 5.15 – MP3 Encoding