Data Representation :: Compression
jamie@drfrostmaths.com
www.drfrostmaths.com
@DrFrostMaths
Last modified: 19th July 2019

www.drfrostmaths.com

Registering on the DrFrostMaths platform allows you to save all your code and progress in the various Computer Science mini-tasks. It also gives you access to the maths platform, allowing you to practise GCSE and A Level questions from Edexcel, OCR and AQA. Everything is completely free. Why not register? Your code on any mini-tasks will be preserved.

Note: The Tiffin/DFM Computer Science course uses JavaScript as its core language. Most code examples are therefore in JavaScript.

Using these slides: Green question boxes can be clicked while in Presentation mode to reveal the answer. Slides are intentionally designed to double up as revision notes for students, while being optimised for classroom usage. The Mini-Tasks on the DFM platform are purposely ordered to correspond to these slides, giving you flexibility over your lesson structure.

Learning Objectives

Directly from the OCR specification:

Not in the syllabus: Compression algorithms, i.e. how data is compressed (although we will touch upon these just for your interest).

The need for compression

Compression is reducing the amount of data needed for a file/data stream. There are many reasons why we’d want to use compression:

1. Web pages load more quickly
I run some of the larger JavaScript files on DrFrostMaths through a tool at www.jscompress.com. This removes whitespace, renames variables to single letters and uses various clever programming syntax to reduce code length. The file size ends up more than 50% smaller. As a convention, we name ‘minified’ JavaScript files to end with .min.js
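To get a feel for why removing whitespace shrinks a file, here is a deliberately naive sketch. It is not how jscompress.com works: real minifiers parse the code properly and also rename variables. The function name `naiveMinify` and the sample code are made up for illustration, and the sketch would break code that contains `//` inside a string.

```javascript
// Naive illustration only: strips line comments, indentation and blank lines.
// Assumption: the source never contains "//" inside a string literal.
function naiveMinify(source) {
  return source
    .split("\n")
    .map((line) => line.replace(/\/\/.*$/, "").trim()) // remove comment and surrounding spaces
    .filter((line) => line.length > 0)                  // remove lines that are now empty
    .join("\n");
}

const original = `
// Add two numbers together
function add(firstNumber, secondNumber) {
    return firstNumber + secondNumber;
}
`;

const minified = naiveMinify(original);
console.log("Original length:", original.length);
console.log("Minified length:", minified.length);
```

The minified code still behaves identically, which makes minification a form of lossless compression from the browser's point of view.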

2. Files take up less storage space
‘zip’ files are a compressed collection of files. A zip file has the advantage of treating a directory as a single file (making it easier to send), and it also takes up less overall space.

3. Files/data take up less bandwidth
Bandwidth is the amount of data transferred in a fixed amount of time. Having to download less data may save on your mobile phone bill! Chrome on Android phones uses a ‘compression proxy’. All requests for web data go via Google’s servers, which compress the data before delivering it to your phone.
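Since bandwidth is data per unit of time, halving the amount of data halves the transfer time on the same connection. A minimal sketch of that arithmetic follows; the function name `transferSeconds` and the file sizes and bandwidth are made-up illustrative figures, not values from the slides.

```javascript
// Time to transfer a file = size in bits / bandwidth in bits per second.
function transferSeconds(fileSizeBytes, bandwidthBitsPerSecond) {
  return (fileSizeBytes * 8) / bandwidthBitsPerSecond;
}

const bandwidth = 8000000;        // assumed connection: 8 megabits per second
const uncompressedSize = 2000000; // assumed file: 2,000,000 bytes
const compressedSize = 1000000;   // the same file compressed to half the size

console.log(transferSeconds(uncompressedSize, bandwidth)); // 2
console.log(transferSeconds(compressedSize, bandwidth));   // 1
```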

4. Emails have limited attachment size
While email standards such as MIME don’t have a theoretical maximum file attachment size, in practice most email services have a limit.

Lossy vs Lossless Compression

For some data, any compression must allow the full original data to be reconstructed, e.g.
• Compressed code would not function correctly if we lost code.
• Compressed files similarly might be corrupted if we couldn’t recover some of the original data after uncompressing.

! Lossless compression allows the original data to be reconstructed in full. Data is only temporarily removed while the data is in compressed form.

However, for audio or image files, we sometimes tolerate losing some of the original data, at the expense of quality.

[Figure: reducing audio sample size; horizontal axis: time]

! Lossy compression permanently discards some of the data.
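The difference between the two definitions can be sketched in code. For the lossless half, the sketch uses run-length encoding, a simple technique chosen here for illustration and not named in these slides. For the lossy half, it reduces the sample size of some made-up 8-bit audio samples to 4 bits. All function names and data are assumptions for the example.

```javascript
// Lossless: run-length encoding stores each run of repeated characters once.
function rleEncode(text) {
  const runs = [];
  for (const ch of text) {
    const last = runs[runs.length - 1];
    if (last && last.char === ch) {
      last.count++;                      // extend the current run
    } else {
      runs.push({ char: ch, count: 1 }); // start a new run
    }
  }
  return runs;
}

function rleDecode(runs) {
  return runs.map((run) => run.char.repeat(run.count)).join("");
}

// Lossy: keep only the top 4 bits of each 8-bit sample.
function reduceSampleSize(samples) {
  return samples.map((sample) => sample >> 4);
}

// Scaling back up cannot recover the 4 bits that were thrown away.
function restoreSampleSize(reduced) {
  return reduced.map((sample) => sample << 4);
}

const text = "aaaabbbcc";
console.log(rleDecode(rleEncode(text)) === text); // true: original fully reconstructed

const samples = [200, 203, 77, 64];
console.log(restoreSampleSize(reduceSampleSize(samples))); // [ 192, 192, 64, 64 ]
```

Notice that 200 and 203 both come back as 192: once the detail has been discarded, there is no way to tell the two original values apart.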

Lossless
Advantages:
• No reduction in quality: image will look exactly the same/audio will sound exactly the same.
Disadvantages:
• Relatively small reduction in file size.
Example audio/visual file types: png (image), gif (image), wav (audio)

Lossy
Advantages:
• Larger reduction in file size/reduced bandwidth.
• Commonly used, therefore most software can read such data.
Disadvantages:
• Loses data, so can’t reconstruct original.
• Can’t be used on files which must preserve all data.
• Loss of quality may be noticeable if compression is high.
Example audio/visual file types: jpg (image), mp3 (audio)

JPEGs

We saw on the previous slide that JPGs result in a permanent loss in quality of the image. For images with blocks of colour, such as the example image on this slide, we tend to get quite a lot of ‘noise’, so PNGs/GIFs tend to be better for ‘graphic art’. JPEG compression tends to work much better on photos, and is the file format typically output by cameras.

[Figure: the same image at a decreasing compression rate]

We can customise the amount of JPEG compression. Higher compression reduces file size but also reduces quality, as the figure demonstrates.

For your interest :: Image Compression Algorithms (Not in the syllabus)

PNG compression

[Image source: Pink Kitty 111]

This image shows the ‘relative cost’ (in terms of number of bits) required for each pixel, with blue the fewest bits and red the most. As you can see, areas of the same colour take up less space. Repeating textures (e.g. the ends of the bananas) also take up less space, due to how the compression algorithm works.

PNG compression uses a two-stage process, which together makes up a compression algorithm known as DEFLATE:

#1 :: LZSS Compression
This identifies repeating sequences of characters. For images, this corresponds to efficiently compressing repeating segments within the image. In the slide’s example, LZSS (5, 3) means we’re using the word starting at position 5 (the ‘S’) and 3 characters long.
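The (position, length) idea can be sketched with a small decoder. This follows the slide’s notation, where the position counts from the start of the text (starting at 0); real LZSS implementations typically store a distance back from the current point instead. The function name `lzssDecode`, the token format and the sample sentence are all made up for this illustration.

```javascript
// Each token is either a literal string, or a [position, length] pair that
// says "copy `length` characters starting at `position` in the text so far".
function lzssDecode(tokens) {
  let output = "";
  for (const token of tokens) {
    if (typeof token === "string") {
      output += token; // literal text, copied as-is
    } else {
      const [position, length] = token;
      output += output.slice(position, position + length); // reuse earlier text
    }
  }
  return output;
}

// "SAM" starts at position 5 and "I AM" at position 0 of the text so far.
const tokens = ["I AM SAM. ", [5, 3], " ", [0, 4], "."];
console.log(lzssDecode(tokens)); // I AM SAM. SAM I AM.
```

The repeated words are never stored a second time; only a short reference to their first appearance is kept.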

#2 :: Huffman Coding
Typically we would use the same number of bits for each character, e.g. 8 bits. But it would be more space efficient to use a varying number of bits for each character, so that more common characters use fewer bits and less common characters use more bits.

Suppose we had just 4 letters used in our data: a, b, c, d.

[Figure: tree with leaves a, b, c, d] The decimals show the proportion of time each letter appears, e.g. ‘a’ 40% of the time (and thus it should have the fewest bits).

Symbol   Code
a        0
b        10
c        110
d        111

Because no code is a prefix of any other code (e.g. 10 doesn’t appear as the first two digits of any other code), there is no ambiguity in how the string is split up.

e.g. 0110101110
Split up: 0 110 10 111 0
Decoded: acbda
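The decoding process can be sketched as follows, using the code table from the slide. The function names `encode` and `decode` are just for this example, and the sketch takes the table as given rather than building it from letter frequencies.

```javascript
// Code table from the slide: more common letters get shorter codes.
const codes = { a: "0", b: "10", c: "110", d: "111" };

function encode(text) {
  return [...text].map((ch) => codes[ch]).join("");
}

function decode(bits) {
  // Reverse lookup: bit pattern -> symbol.
  const symbols = Object.fromEntries(
    Object.entries(codes).map(([symbol, code]) => [code, symbol])
  );
  let output = "";
  let current = "";
  for (const bit of bits) {
    current += bit;
    // No code is a prefix of another, so the first match is the only possible one.
    if (current in symbols) {
      output += symbols[current];
      current = "";
    }
  }
  return output;
}

console.log(decode("0110101110"));   // acbda
console.log(encode("acbda").length); // 10 bits, versus 5 × 8 = 40 bits at 8 bits per character
```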

JPEG compression

JPEG compression is considerably more complicated and uses a large amount of mathematics. But a summary:
• The colour model is converted from RGB (Red-Green-Blue) to Y’CrCb, where Y’ represents brightness and Cr and Cb are two colour components. Because the brightness is confined to a single value (rather than spread across R, G and B), and because human visual perception is dominated by brightness over colour, we can compress colour information more efficiently.
• We use bit encoding techniques similar to those for PNGs, e.g. Huffman encoding.

[Figure: a single 2D cosine wave]
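The colour model conversion in the first bullet can be sketched as below. The slides do not give the formula; the coefficients used here are the ones commonly used for JPEG files (the JFIF convention), so treat them as an assumption. The function name `rgbToYCbCr` is made up for this example.

```javascript
// Keep a value within the 0-255 range of one byte.
function clampToByte(value) {
  return Math.min(255, Math.max(0, Math.round(value)));
}

// Convert one pixel from RGB to brightness (y) plus two colour components.
// Coefficients assumed from the JFIF convention, not taken from the slides.
function rgbToYCbCr(r, g, b) {
  return {
    y: clampToByte(0.299 * r + 0.587 * g + 0.114 * b),
    cb: clampToByte(128 - 0.168736 * r - 0.331264 * g + 0.5 * b),
    cr: clampToByte(128 + 0.5 * r - 0.418688 * g - 0.081312 * b),
  };
}

console.log(rgbToYCbCr(255, 255, 255)); // { y: 255, cb: 128, cr: 128 }
console.log(rgbToYCbCr(0, 0, 0));       // { y: 0, cb: 128, cr: 128 }
```

White and black differ only in the brightness value; both colour components sit at the neutral midpoint of 128. This is what lets the colour components be compressed more heavily than the brightness.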

Exam Question

OCR Sample Question Paper

Notice that you need to give the practical implication of each benefit (even if it’s really obvious!)

Coding Mini-Tasks

Return to the DrFrostMaths site to complete the various tasks on compression.